This project uses demographic, socio-economic and crime rate data for the Greater London region, retrieved from the London Datastore. Three key analyses will be performed: hot spot analysis, clustering and forecasting.
Given limited police resources and the adverse impact of crime, crime analytics has been practised since as far back as the 1800s (Hunt, 2019). Crime occurrence has been found to follow spatial patterns, so predictive analytics should be possible. However, research into whether predictive policing results in lower crime rates has produced mixed results (Meijer & Wessels, 2019). It is therefore more beneficial to use analytics to identify areas with a higher risk of crime and to uncover the factors underlying that risk.
Traditionally, crime analysis has been done manually or in a spreadsheet program (RAND Corporation, 2013). This project gives users an easier way to carry out the analysis through a web application.
This project aims to deliver an interactive web application through which users can derive actionable insights from three key analyses:
Understanding hot spots of crime rate, with a visual map of Greater London
Clustering of areas based on different techniques
Forecasting possible hot spots based on different regression models
The project also aims to provide data-driven insights that inform preventative measures, such as warnings and the allocation of police resources, and that influence ward planning policies.
A borough is made up of wards, which are the primary units of English electoral geography for civil parishes and district councils. Greater London has 32 boroughs, excluding the City of London.
All datasets should share a consistent geographic level (ward vs. borough) and cover the same time period
Use the dplyr package for data operations and manipulation
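The harmonisation step described above can be sketched with dplyr roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the tables `ward_crime` and `borough_pop`, their column names and all the numbers are hypothetical, and the real London Datastore files will be structured differently. It aggregates ward-level counts up to borough level, then keeps only the years present in both tables.

```r
library(dplyr)

# Hypothetical ward-level crime counts (made-up numbers, for illustration only)
ward_crime <- tibble(
  borough = c("Camden", "Camden", "Camden", "Camden", "Hackney", "Hackney"),
  ward    = c("Bloomsbury", "Bloomsbury", "Holborn", "Holborn", "Dalston", "Dalston"),
  year    = c(2017, 2018, 2017, 2018, 2017, 2018),
  crimes  = c(120, 130, 80, 90, 200, 210)
)

# Hypothetical borough-level population, covering one extra year (2016)
borough_pop <- tibble(
  borough    = c("Camden", "Camden", "Camden", "Hackney", "Hackney"),
  year       = c(2016, 2017, 2018, 2017, 2018),
  population = c(245000, 250000, 255000, 275000, 280000)
)

# Bring the ward-level table up to the same depth as the borough-level one
borough_crime <- ward_crime %>%
  group_by(borough, year) %>%
  summarise(crimes = sum(crimes), .groups = "drop")

# inner_join keeps only borough-year pairs found in both tables,
# so the datasets end up covering the same period
combined <- borough_crime %>%
  inner_join(borough_pop, by = c("borough", "year")) %>%
  mutate(crime_rate = crimes / population * 1000)  # crimes per 1,000 residents
```

Aggregating to the coarser level is only one option; if every dataset is available at ward level, joining on a ward identifier would preserve more spatial detail.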
2. Choose the right R packages for visualisation:
Based on our project objectives, we came up with storyboards and evaluated different versions of our interactive visualisation application. With our shortlisted storyboard in mind, we explored the R packages required to build the visualisation.
3. Data visualisation and analysis
Exploratory Data Analysis (EDA)
Exploratory Spatial Data Analysis (ESDA)
Finding spatial hot spots, outliers and anomalies among wards with high crime rates
Time series of geospatial data
Understanding how crime rates have changed over the years, broken down by wards
Clustering of Local Authority Districts (LAD): finding similar LADs
Hierarchical Clustering (Hcluster)
Hierarchical Clustering with Spatial Constraints (GeoCluster)
Clustering of Spatio-Temporal Data (STC Model)
Regression: Forecasting of crime rate in each LAD
Geographically weighted regression (GWR)
Geographically And Temporally Weighted Regression (GTWR)
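As an illustration of the simplest clustering technique listed above, plain hierarchical clustering can be sketched with base R alone. The proposal does not say which functions will be used, so `hclust()` with Ward linkage is an assumption here, and the six areas and their indicator values are made up. The spatially constrained and spatio-temporal variants, as well as GWR and GTWR, need dedicated packages and are not shown.

```r
# Hypothetical indicators for six areas (made-up values, for illustration only)
areas <- data.frame(
  crime_rate   = c(95, 102, 40, 38, 70, 68),
  unemployment = c(8.1, 7.9, 3.2, 3.5, 5.6, 5.9),
  row.names    = c("A", "B", "C", "D", "E", "F")
)

# Standardise each variable so that neither dominates the distance measure
scaled <- scale(areas)

# Pairwise Euclidean distances between areas
d <- dist(scaled)

# Agglomerative clustering; Ward linkage is an assumed choice
tree <- hclust(d, method = "ward.D2")

# Cut the dendrogram into three groups of similar areas
clusters <- cutree(tree, k = 3)
print(clusters)
```

Calling `plot(tree)` draws the dendrogram, which can help when choosing the number of clusters instead of fixing `k` in advance.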
4. Building of Artifact - Web Application
R Markdown development
Functionality checks
The timeframe for this project is illustrated in the Gantt Chart below
Meijer, A., & Wessels, M. (2019, February 12). Predictive Policing: Review of Benefits and Drawbacks. International Journal of Public Administration, 42(12), 1031-1039. doi: 10.1080/01900692.2019.1575664