Understanding the Dataset
The dataset for this project was obtained from the Washington Post GitHub repository. Each record includes the date of the shooting, the victim's age, gender, and race, the location where the shooting occurred, and some additional features.
The dataset contains 8,002 instances, many of them with missing values. White is the most common race in the dataset with 3,300 deaths, followed by Black with 1,766. The feature named “flee” records whether the victim was fleeing on foot, fleeing in a vehicle, or not fleeing. Surprisingly, the majority of records fall into the ‘not fleeing’ category.
Cleaning the data will make it easier to analyze, since it currently has thousands of missing values across various columns.
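The counts described above (rows, per-category totals, and missing values per column) could be reproduced with a short pandas summary along the following lines. This is a minimal sketch, not the exact code used for the project: the helper name `summarize` is hypothetical, the column name `race` is assumed (only “flee” is named in the text), and the small `toy` table holds made-up rows for illustration rather than real records.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Return basic overview statistics for a shootings table.

    Assumes the table has 'race' and 'flee' columns (assumed names).
    """
    return {
        # Total number of instances (rows)
        "n_rows": len(df),
        # Missing values per column, largest first
        "missing_per_column": df.isna().sum().sort_values(ascending=False),
        # Category counts, keeping missing values as their own bucket
        "race_counts": df["race"].value_counts(dropna=False),
        "flee_counts": df["flee"].value_counts(dropna=False),
    }


# Made-up rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame(
    {
        "age": [34, None, 51, 27],
        "race": ["White", "Black", None, "White"],
        "flee": ["not fleeing", "foot", "not fleeing", None],
    }
)

summary = summarize(toy)
print(summary["n_rows"])
print(summary["missing_per_column"])
print(summary["race_counts"])
print(summary["flee_counts"])
```

On the real data, the same function would be called on the table loaded from the downloaded file, for example with `pd.read_csv`, after adjusting the column names to match the file.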