October 16, 2023

Data cleaning 

The data has many missing values, and some of the features are incorrectly formatted. The ‘gender’ column has 31 missing values. Since the overwhelming majority of the instances are male, I imputed ‘Male’ for the missing values of gender. 

The column ‘flee’ has 966 missing values. Since ‘not fleeing’ is the most common value in that column, I imputed it in place of the missing values. Similarly, the missing values in the column ‘armed’ were imputed with the most common value.  
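
The imputation described above is mode imputation: each missing entry is replaced with the most frequent observed value in the same column. Below is a minimal sketch in plain Python, not the code actually used for this analysis. The helper name `impute_with_mode`, the toy records, and the choice to treat `None` and empty strings as missing are all assumptions made for illustration.

```python
from collections import Counter

MISSING = (None, "")  # assumption: how missing entries are encoded


def impute_with_mode(rows, column):
    """Return copies of `rows` with missing `column` values set to the column's mode."""
    observed = [r[column] for r in rows if r.get(column) not in MISSING]
    if not observed:
        # Nothing to learn the mode from; return unchanged copies.
        return [dict(r) for r in rows]
    mode = Counter(observed).most_common(1)[0][0]
    return [
        {**r, column: mode} if r.get(column) in MISSING else dict(r)
        for r in rows
    ]


# Toy records (hypothetical values, not taken from the real dataset).
records = [
    {"gender": "M", "flee": "Not fleeing"},
    {"gender": None, "flee": "Car"},
    {"gender": "M", "flee": None},
    {"gender": "F", "flee": "Not fleeing"},
]

cleaned = impute_with_mode(impute_with_mode(records, "gender"), "flee")
```

The function returns new dictionaries rather than modifying the input, so the original records stay available for comparing distributions before and after imputation.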

Since the date information is given, I performed a temporal analysis to show the number of shootings per year. There is a steady uptick in the number of fatal shootings since 2017.
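
A yearly count like this can be sketched by extracting the year from each date and tallying the results. The snippet below is illustrative only: it assumes the dates are ISO-formatted strings (`YYYY-MM-DD`), and the function name `count_per_year` and the sample dates are made up for the example.

```python
from collections import Counter
from datetime import date


def count_per_year(iso_dates):
    """Count records per calendar year, returned in chronological order."""
    years = Counter(date.fromisoformat(d).year for d in iso_dates)
    return dict(sorted(years.items()))


# Hypothetical sample dates, not taken from the real dataset.
sample_dates = [
    "2017-01-05",
    "2017-11-20",
    "2018-03-14",
    "2019-07-01",
    "2019-07-02",
    "2019-12-31",
]

per_year = count_per_year(sample_dates)
```

If the most recent year in the data is incomplete, its count will understate the true total, so it should be excluded or flagged before reading a trend from the yearly numbers.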

 

 
