We’ve likely all wondered, sometimes sourly as the officer is writing us up for yet another speeding violation, whether police officers have quotas for how many traffic tickets they issue. After all, tickets are revenue, and local governments are often cash-strapped.
“Big data” to the rescue.
Now that at least some cities have open public data, anyone with a little programming skill, the inclination to use it, and the burning desire to know, can check. For instance, if the distribution of when tickets are issued is heavy at the end of the month, that could be a sign of quotas that need to be filled.
That’s exactly what Robert Picard, a university student and intern at alternative search engine DuckDuckGo, did, grabbing a public dataset of tickets issued from 2009 to 2011 in Baltimore. I talked to him this afternoon.
“The original dataset is about two million tickets,” he told me. “I live in Jacksonville, but used data from Baltimore because it was the only place I found any.”
After removing traffic camera tickets (which wouldn’t be affected by quotas, theoretically) as well as correcting for dates that occur more often than others and for the fact that only seven months have 31 days, Picard graphed the remaining tickets in a normalized view. The normalized view is positive (above the line) when more tickets are issued than the expected average, and negative (below the line) when fewer tickets are issued than expected.
Above: Above the line is more tickets issued than a straight average, below the line is fewer tickets issued.
Image Credit: Robert Picard
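For readers who want to try a similar check on their own city's data, here is a minimal Python sketch of that kind of normalization. It is not Picard's code, and the function name `day_of_month_deviation` and its inputs are assumptions made for illustration. It counts tickets by day of the month, divides by how many times that day number actually occurs in the period (so the 31st is not penalized for appearing only seven times a year), and reports each day's deviation from the overall tickets-per-day average.

```python
from collections import Counter
from datetime import date, timedelta


def day_of_month_deviation(ticket_dates, start, end):
    """Map each day of the month to its relative deviation from the
    overall average number of tickets per calendar day.

    ticket_dates: iterable of datetime.date, one entry per ticket
    start, end:   inclusive date range covered by the dataset
    """
    # Count how often each day number (1..31) occurs in the range.
    occurrences = Counter()
    current = start
    while current <= end:
        occurrences[current.day] += 1
        current += timedelta(days=1)

    # Count tickets per day number, ignoring dates outside the range.
    tickets = Counter(d.day for d in ticket_dates if start <= d <= end)

    # Expected tickets on any single calendar day if issuance were flat.
    expected = sum(tickets.values()) / sum(occurrences.values())
    if expected == 0:
        return {day: 0.0 for day in sorted(occurrences)}

    # Tickets per occurrence of each day number, relative to the flat rate.
    return {
        day: (tickets[day] / occurrences[day]) / expected - 1.0
        for day in sorted(occurrences)
    }
```

A value above zero means that day of the month sees more tickets than a flat distribution would predict, and a value below zero means fewer. A bump like this would still only be a pattern to explain, not proof of a quota.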
The data is certainly lumpy. Tickets are not spread evenly across the days or weeks of the month. In fact, the data shows more tickets issued near the end of the month, and at the beginning of the month.
A possible explanation:
Departments have quotas, and officers do rush to fill them, and that enthusiasm or emphasis carries over into the first week or so of the next month, at which point officers forget about tickets for a while until they are reminded again in the last week of the month.
That’s conceivable, Picard told me, but it’s just one hypothesis. Based on the data alone, he couldn’t really say with certainty why more tickets were being issued on those dates.
Ultimately, I guess, big data can’t answer all questions, and a full explanation has to go beyond the data to the rationale behind behavior.
photo credit: Thomas Hawk via photopin cc