ARIMA
To delve deeper into time series modelling, I considered the monthly count of each crime type, that is, the number of incidents of each type reported in each month from July 1, 2015 to October 31, 2023. For drug-related crimes, however, the dataset has no data after 2019.
From these monthly counts, I created a pandas Series whose index holds the timestamp and whose values hold the monthly count. This is my primary data for all temporal analysis and forecasting.
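As a rough sketch of how such a series could be built, the snippet below resamples incident records to month-start bins. The column names (OCCURRED_ON_DATE, OFFENSE), the helper monthly_counts and the handful of sample rows are assumptions made for illustration, not the actual dataset schema.

```python
import pandas as pd

# Hypothetical incident records; the column names and rows are assumptions.
incidents = pd.DataFrame({
    "OCCURRED_ON_DATE": pd.to_datetime([
        "2015-07-03", "2015-07-19", "2015-08-02",
        "2015-08-15", "2015-08-30", "2015-10-01",
    ]),
    "OFFENSE": ["Larceny", "Larceny", "Larceny",
                "Assault", "Larceny", "Larceny"],
})


def monthly_counts(df, offense, date_col="OCCURRED_ON_DATE", type_col="OFFENSE"):
    """Return a Series of incident counts per month for one offense type."""
    subset = df[df[type_col] == offense]
    # "MS" bins by month start; months with no incidents get a count of 0.
    counts = subset.set_index(date_col).resample("MS").size()
    return counts.rename(offense)


larceny = monthly_counts(incidents, "Larceny")
print(larceny)
```

Resampling rather than grouping by year and month keeps empty months in the index as zeros, which matters later because ARIMA expects a regularly spaced series.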
The first model I applied to the data was ARIMA (Autoregressive Integrated Moving Average).
The model did not forecast well: plain ARIMA has no seasonal terms and is linear, so it handles neither the seasonality nor the non-linear dependencies in the data well. Its predictions were not representative of the true values. I tried different combinations of the order parameters p, d and q, but the predictions from every combination were subpar.
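The ADF (Augmented Dickey-Fuller) statistic for the series was -0.20585. This is far above the usual 5% critical value of about -2.9, so the null hypothesis of a unit root cannot be rejected, which suggests that the data is non-stationary.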