Data analytics is growing more sophisticated by the minute, monitoring your Internet behaviour in ways you haven’t even thought about. This is fascinating, terrifying – and might save your life.
On the island of forgotten toys that is the average Amazon cart, there are a few certainties. One: you’ll leave your mother’s birthday gift in there so long that you’ll end up having to pay for expedited shipping. Two: Amazon knows which items you’re likely to buy – and which ones you’ll jettison. A couple of years ago Amazon was granted a patent for Anticipatory Shipping, a program to predict what products people will purchase before they purchase them. It analyses shopping history, demographics and, ominously, “any other suitable source of information” to gain insight into your buying patterns, so Amazon can ship potential buys to a location near you.
Though Amazon hasn’t said how much of the technology outlined in the Anticipatory Shipping patent is already in use, the rapid rollout of superfast shipping options such as free same-day delivery and Prime Now one-hour service suggests the company has already honed its ability to know what you want before you want it. It’s just one example of the growing power of data analytics: companies, police and government agencies mining the ever-expanding digital corpus of information online for data that can, in essence, predict the future. Internet data analytics (see below) has the power to radically alter shopping, medicine, science and terrorism. It also makes a lot of people vastly uncomfortable.
Kira Radinsky, a computer scientist from Israel who collaborated with Microsoft Research while earning her PhD, does research that demonstrates how advantageous data analytics can be: she and her team fed years of articles from The New York Times, Wikipedia entries and other streams of Web data into a computer, after which the computer tried to predict global events like riots and epidemics. In a recent test, news articles from as far back as the 1970s indicated that, in countries with high population density and low gross domestic product, drought years followed by floods could trigger outbreaks of cholera. These criteria allowed Radinsky's computer to predict a cholera outbreak in Cuba in 2012 – the first in more than 100 years.

The part of data science that unnerves people is not work like Radinsky's – the more help we can get to combat the caprice of truly irrational actors, such as diseases, terrorists and weather, the better. It's that computers don't understand people. Humans can consider the impact of free will and emotion with a degree of subtlety that computers are not yet capable of. As capabilities outpace regulations, there are sure to be moments in which the thud-headedness of data algorithms leads to problems.
Recently, Radinsky and her fellow researchers fed their computer news items that they thought would almost certainly trigger follow-up events, to see whether the computer could predict them. One such item was the 2010 murder of an Iranian professor in Tehran. Radinsky's staff, acknowledging the complicated politics of the Middle East, suggested that the professor's death might trigger protests at his university. The computer's prediction: there would be a funeral.

Certainly, these shortcomings are not insurmountable. Data science will continue to improve and, in the meantime, there are efforts to create hybrid systems that combine the capabilities of both big data and human intuition. In the US state of Colorado in September, in preparation for a campus visit by the Dalai Lama, law-enforcement officers from the University of Colorado and Boulder County formed a cyber team to monitor social media for tweets, blog posts, or online manifestos that might predict a shooting or other disaster at large campus events.
The group’s first attempts at an algorithmic approach were ineffective. Early on, they tried to isolate suspicious individuals by searching for tweets containing the word bomb. They uncovered an enormous number of tweets using innocuous slang like “photobomb.” Once they refined their search techniques, they still had to figure out how to differentiate a legitimate bomb threat from a disgruntled football fan casually suggesting that a bomb in the stadium might be the only way to keep his team from losing the next game. To improve the system’s sophistication, they brought in humans. After identifying a suspect based on social media usage, the team treats each case like a police investigation.
“We’ll look into their other social media,” says Sara Pierce, a detective from the Longmont Police Department. “We’ll see what other profiles they have, if this is a common topic.” It’s traditional detective work, using a computer, yes, but also using years of human police experience. As data science races towards a future in which Amazon will just buy things for us, it will be fascinating to watch computers leap this last hurdle on their own. And they’ll have to: human vicissitudes are what make people prone to crime, rioting and fickle changes of mind in the first place, and these are the things that we most need computers to help predict. We built computers so that they would understand us, not the other way around. They’re getting better at it.
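The cyber team's early refinement – telling a genuine mention of "bomb" apart from innocuous slang like "photobomb" – amounts to moving from a substring match to a word-boundary match. A toy Python sketch (the posts here are invented for illustration):

```python
import re

NAIVE = "bomb"
REFINED = re.compile(r"\bbomb\b", re.IGNORECASE)

def naive_flag(post: str) -> bool:
    """Early approach: bare substring match, full of false positives."""
    return NAIVE in post.lower()

def refined_flag(post: str) -> bool:
    """Refined approach: match 'bomb' only as a standalone word."""
    return bool(REFINED.search(post))

posts = [
    "Great photobomb at the game last night!",   # innocuous slang
    "There is a bomb in the stadium",            # worth a closer look
]
for post in posts:
    print(naive_flag(post), refined_flag(post))  # True False / True True
```

Even the refined filter still flags the disgruntled fan's sarcastic tweet, which is exactly why the team falls back on human investigation for every hit.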
Interesting Internet things you didn’t think anyone knew
Booz Allen Hamilton – What’s going to happen next in this football game
Tech: BlitzD, an app that uses game tape to predict play calls in the USA’s NFL.
How it works: Ninety columns’ worth of play data, including down, distance, location on field, score and time, is fed into Microsoft’s Azure machine-learning engine. The engine figures out tendencies, then gives a breakdown of the percentage chance that the next play will be a run or a pass, to the left, right or centre of the field.
Status: BlitzD was 78 % accurate at predicting run versus pass during a demonstration using the 2015 Super Bowl, and accuracy is expected to improve with more, higher-quality data. Meanwhile, the NFL is evaluating the fairness of advanced technology on the sidelines.
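The percentage breakdown BlitzD reports can be illustrated with a toy frequency model – counting historical plays by game situation. The situation keys and play data below are invented for illustration; the real app feeds ninety columns of data into Microsoft's Azure machine-learning engine:

```python
from collections import Counter, defaultdict

# Invented historical plays: (down, distance bucket, play type).
history = [
    (3, "long", "pass"), (3, "long", "pass"), (3, "long", "run"),
    (1, "short", "run"), (1, "short", "run"), (1, "short", "pass"),
]

# Tally play calls per game situation.
tendencies = defaultdict(Counter)
for down, dist, play in history:
    tendencies[(down, dist)][play] += 1

def breakdown(down, dist):
    """Percentage chance of each play type in this situation."""
    counts = tendencies[(down, dist)]
    total = sum(counts.values())
    if not total:
        return {}
    return {play: round(100 * n / total) for play, n in counts.items()}

print(breakdown(3, "long"))  # {'pass': 67, 'run': 33}
```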
UnitesUs – Which workplace will make you happy
Tech: An algorithmic approach that matches job seekers to employers based on more than just skills and previous employment.
How it works: Job seekers upload a résumé, then fill out a personality survey or give the software access to online profiles. IBM Watson’s Personality Insights tool uses text from the profiles to develop a personality assessment. If the seeker’s skill set and personality match an employer’s opening with more than 50 % certainty, UnitesUs introduces them.
Status: Up and running since April 2015, it has achieved a 7.5 % callback rate, compared with less than 5 % for standard job-search sites.
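One simple way to picture the "more than 50 % certainty" threshold is cosine similarity between trait-score vectors. The trait names and scores below are invented; UnitesUs actually builds its assessment with IBM Watson's Personality Insights:

```python
import math

def similarity(a, b):
    """Cosine similarity between two trait-score dicts, in 0..1."""
    traits = set(a) | set(b)
    dot = sum(a.get(t, 0) * b.get(t, 0) for t in traits)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# Invented profiles for a seeker and an employer's opening.
seeker  = {"openness": 0.9, "teamwork": 0.7, "python": 1.0}
opening = {"openness": 0.8, "teamwork": 0.9, "python": 1.0}

match = similarity(seeker, opening)
if match > 0.5:  # the article's 50 % certainty threshold
    print(f"Introduce: {match:.0%} match")
```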
Workday – That you’re probably going to quit your job
Tech: Software that can tell who is planning to leave a job, and recommends ways to retain unhappy workers.
How it works: An algorithm uses more than 100 variables to assign a worker a retention risk rating. The software then points out the biggest issues: whether the worker has been in a single position too long, commutes too far, etc. Based on the employer’s historical data, the software then suggests the best thing to do to retain the person.
Status: Retention recommendations are available as of September 2015.
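A retention risk rating of this kind can be sketched as a weighted score over a handful of the "more than 100 variables" – the variable names and weights below are invented for illustration, not Workday's actual model:

```python
# Invented weights: each variable nudges the risk score upward.
WEIGHTS = {
    "years_in_role": 0.04,     # risk grows with time in one position
    "commute_km": 0.005,       # long commutes increase risk
    "below_market_pay": 0.30,  # flag: paid below market rate
}

def retention_risk(worker):
    """Weighted risk score, capped at 1.0 (= highest risk)."""
    score = sum(WEIGHTS[k] * worker.get(k, 0) for k in WEIGHTS)
    return min(score, 1.0)

worker = {"years_in_role": 6, "commute_km": 40, "below_market_pay": 1}
print(f"risk: {retention_risk(worker):.2f}")  # risk: 0.74
```

The "biggest issues" the software points out would then simply be the variables contributing the most to the score.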
Transportation Security Administration – Whether you are likely to be a terrorist
Tech: Risk-based security, making the airport-security experience easier for low-risk people.
How it works: TSA has developed a list of types of people who don’t need to be vetted because of their low probability of being a security risk. This includes elderly people, very young people, members of the military, and those who have been qualified ahead of time through the PreCheck program.
Status: As of June 2015, nearly one million daily travellers have an easier trip through security – but TSA recently failed a variety of security tests and is overhauling its processes.
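At its core, the low-risk list is a set of rules. A toy sketch – the age cut-offs and group names below are a simplified illustration, not the TSA's actual criteria:

```python
# Invented simplification of the low-risk categories.
LOW_RISK_GROUPS = {"precheck", "military"}

def expedited(traveller):
    """True if the traveller skips full vetting under these toy rules."""
    if traveller["age"] >= 75 or traveller["age"] <= 12:
        return True  # elderly and very young travellers
    return bool(LOW_RISK_GROUPS & traveller["groups"])

print(expedited({"age": 34, "groups": {"precheck"}}))  # True
```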
Police departments around the country – If you are a criminal or a victim
Tech: Predictive policing, the use of algorithms to develop crime hot-spot maps and intercede before crimes happen.
How it works: There are two kinds of predictive policing: hot-spot policing involves focusing police in areas where there is a high risk of crime. Cops used to make the maps themselves, but now they can be created algorithmically. At the individual level, police use previous arrests, social networks, and drug use to generate a list of at-risk people.
Status: According to John Hollywood of the Rand Corp, studies have shown a 10 to 20 % improvement in crime prevention by focusing police on algorithm-derived hot spots. Individual interventions are considered promising, but it is difficult to prove their effectiveness.
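The algorithmic hot-spot maps can be pictured as binning past incident coordinates into a grid and ranking the cells. A minimal sketch, with invented coordinates:

```python
import math
from collections import Counter

CELL = 0.01  # grid cell size in degrees, roughly neighbourhood scale

# Past incident locations (invented for illustration).
incidents = [
    (40.712, -74.006), (40.713, -74.005), (40.7131, -74.0052),
    (40.780, -74.100),
]

def cell_of(lat, lon):
    """Map a coordinate to its grid cell."""
    return (math.floor(lat / CELL), math.floor(lon / CELL))

counts = Counter(cell_of(lat, lon) for lat, lon in incidents)
hot_cell, n = counts.most_common(1)[0]
print(f"hottest cell {hot_cell} with {n} past incidents")
```

Real predictive-policing systems layer far more on top – time of day, crime type, recency weighting – but the output is the same idea: a ranked map telling patrols where risk concentrates.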
This article was originally published in the April 2016 issue of Popular Mechanics magazine.