Parking Tickets Through Time and Space

Every year, the City of Toronto publishes its parking tickets data. This collection contains some interesting information, such as the type of parking infraction issued, the time of infraction and, of course, the location. I thought it would be interesting to do a quick analysis and visualization. I'll be using Excel, QGIS, Python and some of its libraries and modules for this project.



The CSV parking data can be accessed through the City's Open Data Portal. In this post, I'll be using the most recent publication (2015), which comes as 3 CSV files. Here's a preview of what it looks like.

Screenshot of parking tickets data from the City of Toronto.


After acquiring the data, the first step is preparation and formatting to create a workable file, a process also known as "cleaning the data". In our case, the CSV files didn't need much work: it was simply a matter of deleting unnecessary columns. I kept the following columns for this project: date_of_infraction, infraction_code and time_of_infraction. I also converted the time_of_infraction column into a string and transformed it into an hh:mm format (e.g. 1.0 is converted to 00:01). Once that is done, we can concatenate the 3 CSV files into one with pandas and start our project.
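The cleaning step above can be sketched in pandas. The inline sample below stands in for the downloaded CSV files (the real column set and filenames may differ slightly); with the actual data you would pass each filename to pd.read_csv and concatenate the results.

```python
import io
import pandas as pd

# Inline sample standing in for one of the three CSV downloads; with the
# real files you would pass the filenames to pd.read_csv instead.
sample = io.StringIO(
    "tag_number_masked,date_of_infraction,infraction_code,time_of_infraction\n"
    "***41871,20150101,3,1.0\n"
    "***41872,20150101,29,1435.0\n"
)

# Keep only the columns needed for the analysis.
keep = ["date_of_infraction", "infraction_code", "time_of_infraction"]
df = pd.read_csv(sample, usecols=keep)

# time_of_infraction arrives as a number: 1.0 means 00:01, 1435.0 means 14:35.
# Zero-pad to four digits, then split into hh:mm.
t = df["time_of_infraction"].fillna(0).astype(int).astype(str).str.zfill(4)
df["time_of_infraction"] = t.str[:2] + ":" + t.str[2:]

print(df["time_of_infraction"].tolist())  # ['00:01', '14:35']
```

With the three real files, repeating the read for each and finishing with `pd.concat(frames, ignore_index=True)` gives a single workable DataFrame.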


Top 3 parking offences

The City of Toronto issued over 2 million parking infractions last year. I was curious to see the top 3 parking offences. A quick calculation reveals that parking on private property, parking without a permit during prohibited times and "Park signed highway during prohibited -Times/Days" (not too sure what this one means) are the most frequent ones.
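The "quick calculation" is a one-liner in pandas with value_counts. The toy frame below uses made-up infraction codes purely for illustration; on the real data you would run the same line on the concatenated DataFrame.

```python
import pandas as pd

# Toy tickets frame with made-up codes; on the real data this would be
# the concatenated 2-million-row DataFrame.
tickets = pd.DataFrame({
    "infraction_code": [3, 3, 3, 29, 29, 210, 210, 210, 210, 5],
})

# Count tickets per infraction code and keep the three most frequent.
top3 = tickets["infraction_code"].value_counts().head(3)
print(top3)
```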


I also wanted to see how issued parking infractions are distributed by day of the week and hour. To do this, we can concatenate the date_of_infraction and time_of_infraction columns and pass the result to the function pd.to_datetime. Doing this lets us take advantage of pandas' datetime functionality: for example, we can extract the hour and day of the week for each row and group by them. Below are some visualizations (heatmaps and a simple plot) done with Matplotlib and Seaborn.
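Here is a minimal sketch of that date/time step, again on a few made-up rows. The resulting weekday-by-hour grid of counts is exactly the shape Seaborn's heatmap expects.

```python
import pandas as pd

# Small stand-in for the cleaned data.
tickets = pd.DataFrame({
    "date_of_infraction": [20150105, 20150105, 20150110],
    "time_of_infraction": ["09:15", "14:35", "00:01"],
})

# Combine date and time into a single datetime column.
ts = pd.to_datetime(
    tickets["date_of_infraction"].astype(str) + " " + tickets["time_of_infraction"],
    format="%Y%m%d %H:%M",
)
tickets["hour"] = ts.dt.hour
tickets["weekday"] = ts.dt.day_name()

# Pivot into a weekday x hour grid of counts -- the input for a heatmap,
# e.g. seaborn.heatmap(grid).
grid = tickets.groupby(["weekday", "hour"]).size().unstack(fill_value=0)
print(grid)
```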

Heatmap showing the count of all parking tickets based on the day of the week and hour. It makes sense that the highest counts fall on weekdays during business hours.


Heatmap showing the count of parking tickets "Signed Highway Prohibited Time/Days" based on the day of the week and hour. This offence is most frequent during business hours.

Heatmap showing the count of parking tickets "Park Prohibited Time No Permit" based on the day of the week and hour. Interesting to note how Saturday and Sunday stand out at midnight, most likely due to drivers leaving their cars overnight for the weekend.

Heatmap showing the count of parking tickets "Park on private property" based on the day of the week and hour. From this heatmap, it looks like parking ticket agents do their rounds during late mornings and between 2 and 4 AM.

A comparison of top 3 parking offences issued grouped by hour of the day.


Mapping Parking Tickets for Cars on Cycling Lanes

Since we have the location (street address) of each issued parking ticket, we can also plot them on a map. Plotting all 2 million parking infractions would be time-consuming and wouldn't give us much insight unless we categorized them by infraction code. Since I'm interested in sustainable transportation, I wanted to take a closer look at parking tickets related to cycling. While going over the infraction codes, I found out that codes 383, 384 and 387 were given to vehicles parked or stopped in bicycle lanes. Nothing is more upsetting and dangerous for cyclists than a car sitting in a cycling lane that was specifically designed and implemented to make the street safer and more efficient for everyone! So, I filtered out the other codes, which can easily be done with Excel (or Python), and kept only the relevant ones.
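Doing that filter in Python is a single isin call. The frame below is illustrative ("street_address" is a stand-in name for whatever column holds the location in the real export); codes 383, 384 and 387 are the bike-lane codes mentioned above.

```python
import pandas as pd

# Illustrative sample; "street_address" is a stand-in column name.
tickets = pd.DataFrame({
    "infraction_code": [3, 383, 384, 29, 387, 383],
    "street_address": ["1 YONGE ST", "2 BLOOR ST W", "3 QUEEN ST E",
                       "4 KING ST W", "5 COLLEGE ST", "6 DUNDAS ST W"],
})

# Keep only tickets for vehicles parked or stopped in bicycle lanes.
bike_lane_codes = [383, 384, 387]
bike = tickets[tickets["infraction_code"].isin(bike_lane_codes)]
print(len(bike))  # 4
```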

There are many ways to geocode a dataset. The easiest (and free) method is to import your CSV file into QGIS and use MMQGIS's "Geocode CSV with Google / OpenStreetMap" function. Since our dataset is quite large, I used the OpenStreetMap web service, which is as good as Google's (Google Maps has a daily API usage limit for geocoding). Once that's done, we can import our data into a web mapping platform like CARTO. The intensity map below doesn't represent all the problematic areas, since not every parked car will be reported, but it does give a general idea and might start a conversation on how and where the City of Toronto should intervene.
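One small preparation step worth sketching: geocoders resolve a bare street name much more reliably when it is qualified with the city and province, so it can help to build a full-address column before exporting the CSV for MMQGIS. Again, "street_address" is an assumed column name for illustration.

```python
import pandas as pd

# Illustrative frame; "street_address" stands in for whatever column
# holds the ticket location in the real export.
bike = pd.DataFrame({"street_address": ["2 BLOOR ST W", "5 COLLEGE ST"]})

# Qualify each address with city/province so the geocoder isn't left
# guessing which "Bloor St" in the world is meant.
bike["full_address"] = bike["street_address"] + ", Toronto, ON, Canada"
bike.to_csv("bike_lane_tickets.csv", index=False)
```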