Thursday, 17 March 2016

A Little Conspiracy Theory

One of my interests in Taiwan is traffic, driving culture and road design. I recall, on the first day I arrived in Taiwan way back in 2005, being driven into Taipei city from the airport whilst asleep and waking up when the car finally stopped at a traffic light; I looked out of the window and was aghast at what I saw on the Taipei city streets. From that point on this was the focal point of my "culture shock".

So I always keep an eye out for news items about traffic governance policies and research, and there have been a couple of these recently. One is in today's Taipei Times and is entitled "Bureau to analyze data to predict traffic accidents". 

The headline strikes me as odd.

I would have thought that the National Freeway Bureau would already have extensive traffic accident records going back over the several decades since the freeways were first constructed, detailing where accidents occur, under what weather conditions, with what proximate causes and with what consequences (e.g. numbers of fatalities and injuries). Presumably that data, if it exists as surely it must, has already been used to confirm the most likely accident spots along the freeways and the conditions under which accidents are most likely to occur (e.g. under heavy rain, or during rush hours or national holidays).

So why is this headline in the news today? Well, the opening paragraph of the story states...
"The National Freeway Bureau yesterday said that it is soon to use data gathered through the electronic toll collection (ETC) system to help it forecast traffic on freeway sections prone to accidents and to enforce road maintenance work more efficiently."
OK, so the data they want to analyse is not accident data, but traffic data, for the purpose of getting road maintenance work done "more efficiently". Presumably they believe that this traffic data will allow them to predict when freeway traffic is at its lowest ebb, and schedule road maintenance accordingly.

But does that actually make any sense? We already know that traffic on the freeways is greater during rush hours, weekends and national holidays and conversely that there are fewer vehicles on the freeways at night and say, in the early hours of Sunday morning. So what sort of patterns are they looking for in the traffic data - differences in traffic volume between 10 a.m. to 11 a.m. on Tuesday mornings and 10 a.m. to 11 a.m. on Thursday mornings? If yes, then what reason could they possibly have to think these variations would be anything more than marginal?

Unless I am missing something, that is rather odd and requires explanation. But on with the rest of the story...
"Bureau Deputy Director-General Wu Mu-fu (吳木富) said that eTags, a device that allows drivers to pay their toll fees when driving on freeways, are now installed in more than 6 million vehicles nationwide, accounting for about 80 percent of all vehicles."
So the data will come from electronic tag surveillance.
"Wu said that data analysis would enable the bureau to better understand the driving habits of freeway users and the types of traffic violations they commit. “We would also look at weather information, the layout of routes and traffic reports. By combining such data, we would be able to forecast more accurately when and where traffic accidents might occur,” he said."
But as I said above, the Bureau can probably already do this using historical accident data. Perhaps the difference is scale; the existing data can tell them which, say, five hundred meter stretch of freeway is most likely to see an accident when it's raining during rush hour, but the electronic surveillance data will be able to narrow that down depending on how often signals are sent from the eTag devices to the data collection center.
"The National Highway Police Bureau often attributes the cause of an accident to a single factor, Wu said, adding that accidents could happen due to multiple reasons, including a vehicle’s condition, driving habits of freeway users and other external factors. Identifying potential risk factors would help law enforcement to take preventative measures to reduce the occurrence of traffic accidents, he said."
Again this is odd. Common sense already tells us what the "potential risk factors" are: bald tyres, snapped brake cables, slippery roads, rush hour traffic, driver fatigue, etc. Of course the biggest risk factor is the actual drivers themselves and the horrifying human errors that recur time after time. The police already know about these things, so it is difficult to see how the additional data is going to help them.


Instead of reading the quotes from Deputy Director-General Wu at face value and taking the Taipei Times seriously as a newspaper, we could switch on our intellects and actually think about it. Why might the police want data about traffic behaviour at a finer resolution than, say, five hundred meters?

Speed cameras. They have a limited range - perhaps ten to fifteen meters. Could it be that the National Freeway Bureau (along with the National Highway Police Bureau) want this additional data so that they can plan the best points along the freeways to site speed cameras? Whilst the ostensible motive for this is to prevent traffic accidents, this might well be a smokescreen, as they must surely already know from existing data which five hundred meter stretches of freeway see the highest numbers of traffic accidents; they could just stick speed cameras along these areas every twenty meters or so.

That would solve the ostensible problem, but what if the real problem they are trying to solve isn't reducing the numbers of accidents? What if the real problem they are trying to solve is budgetary resources? By placing cameras at spots where people are most likely to drive over the speed limit or commit other minor infractions, the two bureaus would stand to benefit from an additional or expanded revenue stream from the fines drivers are forced to pay.

It would be very interesting to see how the historical accident data would map onto the location data for new speed cameras and perhaps other surveillance cameras. The extent of overlap should be revealing.


(Added later...)

Since writing the above earlier this morning, I have found two problems with this theory.

The first is that the eTag system does not deliver a constant stream of data, but only interval data (the intervals being several kilometers, or even tens of kilometers, apart). That means that the eTag system could not be used to produce data at a fine resolution.

The second is that the eTag system does not deliver information on vehicle speed, only binary data on whether the vehicle is present or not at a given interval along the freeway. So that means the data cannot be used to identify areas where people are more or less likely to speed or to commit other infractions of the traffic laws.

However, that second point does not fit with one of the quotes from the Deputy Director-General mentioned above, that this data could be used to "better understand the driving habits of freeway users and the types of traffic violations they commit". So what is going on here?

Perhaps my theory is wrong, and this isn't a disguised attempt to find ways of maximizing revenue through the levering of fines. But if the use of the eTag data is a genuine attempt to improve traffic safety on the freeways, it isn't clear how this data would help the National Freeway Bureau and the National Highway Police Bureau to achieve that.

Perhaps data on traffic volume generated from the eTags could be compared with accident data and mapped onto the freeways to generate a range of probability values for whether accidents are more or less likely for a given section of the freeway under higher or lower traffic volumes. The Freeway Bureau could then alter the warning information on the overhead electronic panels that straddle the freeways.
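This sort of analysis can be sketched in a few lines. The section names, volume bands and records below are entirely invented for illustration; real eTag and accident data would of course look nothing like this, but the shape of the calculation - an empirical accident probability for each freeway section under each traffic volume band - would be the same.

```python
from collections import defaultdict

# Hypothetical records: (freeway section, traffic volume band, accident occurred?).
# All values here are invented purely to illustrate the calculation.
records = [
    ("km_10-15", "high", True),
    ("km_10-15", "high", False),
    ("km_10-15", "low", False),
    ("km_40-45", "high", False),
    ("km_40-45", "low", True),
    ("km_40-45", "low", False),
]

# Tally accidents and total observations per (section, volume band).
counts = defaultdict(lambda: [0, 0])  # [accidents, observations]
for section, band, accident in records:
    counts[(section, band)][1] += 1
    if accident:
        counts[(section, band)][0] += 1

# Empirical accident probability for each section under each volume band.
probabilities = {
    key: accidents / observations
    for key, (accidents, observations) in counts.items()
}

for (section, band), p in sorted(probabilities.items()):
    print(f"{section} under {band} volume: P(accident) = {p:.2f}")
```

The output of such a table is what the Freeway Bureau could, in principle, feed into the overhead warning panels.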

But surely they could do this anyway, and just put permanent warnings on those sections of the road with the highest accident rates.

Another possibility is that the National Highway Police Bureau could use probability values to better allocate their patrol cars to different sections of the freeways at the most dangerous times.

That makes sense, but the usefulness of the eTag data to this end really depends on how large the discrepancy is between the times that common sense would suggest are the most dangerous on the accident-prone sections of the freeway, and what the data-generated probability values actually are. If a large discrepancy is found then the use of the eTag data would be worthwhile, but if the discrepancy is small and the probability values basically just match up with what you would expect anyway on the basis of common sense, then the eTag data would seem to be superfluous.
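That discrepancy test is itself easy to state concretely. The figures below are made up for illustration: a "common sense" baseline of expected accident likelihoods per time period, set against what a data-driven analysis might produce. If the largest gap between the two is small, the eTag data adds little.

```python
# Hypothetical, invented figures for illustration only.
baseline = {  # what common sense would suggest anyway
    "rush_hour": 0.30,
    "rainy_night": 0.20,
    "midday": 0.05,
}
data_derived = {  # what an eTag-plus-accident-data analysis might produce
    "rush_hour": 0.32,
    "rainy_night": 0.21,
    "midday": 0.06,
}

# The largest gap between expectation and data decides whether the
# exercise was worthwhile.
max_discrepancy = max(abs(data_derived[k] - baseline[k]) for k in baseline)
informative = max_discrepancy > 0.10  # illustrative threshold, not an official one

print(f"Largest discrepancy: {max_discrepancy:.2f}")
print(f"eTag data informative beyond common sense: {informative}")
```

With these invented numbers the discrepancy is tiny, which is exactly the scenario in which the eTag analysis would turn out to be superfluous.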

It might be that the only way to properly understand this is to actually go to an office of the National Freeway Bureau or the National Highway Police Bureau and ask them what on earth they are actually trying to do. You know, kind of like those people we read about in fantasy detective fiction - what are they called, "journalists" or something?
