November 1, 2023 – Devcharan Krishna Naik

Random Forest Classifier

I performed categorical encoding on the dataset to convert all string values into numerical values. This is needed because most machine learning models work only with numerical inputs.
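
As a rough illustration of this step, here is a minimal sketch using scikit-learn's OrdinalEncoder. The post does not say which encoding scheme was used, so the choice of encoder is an assumption, and the rows and category values below are made up for the example.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical rows of (gender, threat level); values are illustrative only
# and are not taken from the actual dataset.
rows = np.array(
    [
        ["M", "attack"],
        ["F", "other"],
        ["M", "other"],
        ["F", "attack"],
    ],
    dtype=object,
)

# OrdinalEncoder maps each distinct string in a column to an integer code,
# assigning codes in sorted order of the category values.
encoder = OrdinalEncoder()
encoded = encoder.fit_transform(rows)

print(encoder.categories_)  # the categories learned for each column
print(encoded)              # the same rows, now purely numerical
```

Integer codes imply an ordering between categories. Tree-based models such as random forests usually tolerate this, but one-hot encoding is a common alternative when that ordering is a concern.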

I split the dataset into training and test sets with a 70-30 split using the sklearn train_test_split function, then trained a random forest classifier on the training set. The features used were: age, manner of death, gender, signs of mental illness, threat level, body camera, latitude and longitude, and the share of each race in that city.
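
The split and the model fit can be sketched roughly as follows. This is not the original code: the data is synthetic (generated with make_classification as a stand-in for the encoded dataset), and the number of trees and the random seeds are assumptions chosen only so the example is reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded dataset: 200 rows, 8 numeric features,
# 3 target classes. None of this is the real data.
X, y = make_classification(
    n_samples=200,
    n_features=8,
    n_informative=4,
    n_classes=3,
    random_state=0,
)

# test_size=0.3 gives the 70-30 train/test split described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# n_estimators=50 is an arbitrary choice for this sketch.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print(X_train.shape, X_test.shape)
```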

The accuracy score on the test set when predicting race from the other variables was 0.618. I then ran grid search cross-validation to find the best hyperparameters for this model; however, the change in accuracy was minimal.
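
For reference, a grid search of this kind might look like the sketch below, using scikit-learn's GridSearchCV and accuracy_score. The parameter grid, the number of folds, and the synthetic data are all assumptions made for illustration; the post does not list the values that were actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data, not the real dataset.
X, y = make_classification(
    n_samples=200,
    n_features=8,
    n_informative=4,
    n_classes=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Hypothetical grid: every combination is scored with 3-fold cross-validation
# on the training set only, so the test set stays untouched during tuning.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [None, 5],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Accuracy is the fraction of test rows whose predicted class is correct.
test_accuracy = accuracy_score(y_test, search.best_estimator_.predict(X_test))
print(search.best_params_, test_accuracy)
```

Comparing this test accuracy with that of the untuned model shows how much the search actually helped.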
