Sunday, August 28, 2011

Riots and the Big Data Problem

The recent riots in London
Back from vacation, I was catching up on a few recent issues of The Economist. The riots in England have obviously made headlines in the UK magazine, and one particular aspect - very much related to content management - caught my eye.

The rioters were acting in plain view of the cameras, and they were coordinating their actions using BBM (BlackBerry Messenger). The UK is one of the countries with the highest density of surveillance cameras in the world, so the police apparently have plenty of video material and BlackBerry traffic to analyze in order to identify and apprehend some of the troublemakers. It turns out the data is not just plentiful - there is too much of it.

Indeed, the data volumes are so huge that the police hardly stand a chance of ever analyzing them. Strapped by tight budgets and austerity measures, the UK police barely have the resources to prosecute the most severe crimes, and there are no resources left to dig through the gigabytes and gigabytes of surveillance data.

This is an interesting “big data” problem. A lot has been written about big data lately. The availability of detailed data tracking for every transaction and every move opens up new opportunities that were unthinkable just a few years ago. Analyzing and understanding the data patterns leads to new types of services and efficiencies that savvy companies have already begun to take advantage of. And more is to come.

That’s all great for structured data, which is relatively easy to mine and analyze using computer programs. The challenge comes when the data is unstructured, such as text messages or video feeds. Unstructured data is much harder to analyze programmatically with reliable outcomes and at speeds that can keep up with the torrential pace at which the data is being generated.

Sure, content analytics is already a well-established discipline, and many vendors from IBM to OpenText have content analytics offerings today. IBM even made a lot of headlines earlier this year with its Watson project - a supercomputer specialized in natural language analysis and reasoning... and in the TV game show Jeopardy. Watson was a unique system designed for a specific purpose, and even Watson would have had a hard time identifying the faces of perpetrators from hours of riot video footage.

That job is much harder, and the technology is not nearly as mature. Content analytics has a great future in light of the big data problem, and it will be fun to watch as the technology matures over the next few years. In the meantime, let's hope the UK police apprehend the key troublemakers from the recent riots by whatever means they have at their disposal.
