October 27, 2023

The new city-level dataset, which gives the share of each race per city, needed cleaning. Its city names carried suffixes such as “city”, “CDP” and “town”. I removed these suffixes and joined the city and state columns into a single column, in the same format as the original dataset. This combined column acts as the key when merging the two datasets with a left join.
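As a rough sketch of this cleaning step, the snippet below strips a trailing suffix and builds the combined key in plain Python. The helper name `make_key`, the `"City, ST"` key format and the sample place names are my own illustrative assumptions, and the suffix list only covers the three examples mentioned above, so the real data may need more.

```python
import re

# Assumption: only the three suffixes named in the post; real data may have others.
# The match is case-sensitive so that a name like "Kansas City" keeps its "City".
_SUFFIX = re.compile(r"\s+(city|CDP|town)$")


def make_key(city: str, state: str) -> str:
    """Strip a trailing place-type suffix and append the state (hypothetical key format)."""
    name = _SUFFIX.sub("", city.strip())
    return f"{name}, {state.strip()}"


# Illustrative inputs, not rows taken from the actual dataset.
print(make_key("Shelton city", "WA"))
print(make_key("Aloha CDP", "OR"))
print(make_key("Kansas City city", "MO"))
```

Anchoring the pattern at the end of the string means only the final suffix is removed, which matters for places whose own name ends in "City".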

I used a left join because I want to preserve every row of the original shooting dataset, including rows for cities that have no match in the race data. In return, I am fine with dropping the race share information for cities that do not appear in the shooting data. The result is a single dataset with all the shooting-related information along with the race shares for each city. I will use this combined dataset for the analysis and the modelling.
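To make the left-join behaviour concrete, here is a minimal library-free sketch. The column names (`city_state`, `share_white`), the `left_join` helper and all values are made-up placeholders rather than the real data. In pandas the equivalent would be a `merge` call with `how="left"`.

```python
def left_join(left_rows, right_rows, key):
    """Keep every left row; attach right-hand fields when the key matches, else None."""
    lookup = {row[key]: row for row in right_rows}
    extra_cols = [c for c in (right_rows[0] if right_rows else {}) if c != key]
    merged = []
    for row in left_rows:
        match = lookup.get(row[key], {})
        merged.append({**row, **{c: match.get(c) for c in extra_cols}})
    return merged


# Placeholder rows: two incidents in one city, one in a city missing from the race data.
shootings = [
    {"id": 1, "city_state": "Shelton, WA"},
    {"id": 2, "city_state": "Shelton, WA"},
    {"id": 3, "city_state": "Nowhere, ZZ"},
]
race_share = [
    {"city_state": "Shelton, WA", "share_white": 70.0},
    {"city_state": "Aloha, OR", "share_white": 60.0},  # never appears in shootings
]

merged = left_join(shootings, race_share, key="city_state")
for row in merged:
    print(row)
```

Every shooting row survives, the unmatched city gets `None` for the race share, and the city that only exists in the race data is dropped.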

 
