Author: James Hendler
Tetherless World Professor of Computer, Web, and Cognitive Sciences at Rensselaer Polytechnic Institute
When it comes to making big and difficult decisions, artificial intelligence (AI) is often seen as a magic bullet. So it is no wonder that, in recent years, we have seen government officials grappling with how to use AI to make their jobs easier. But those of us who understand these technologies know that such magic doesn't exist. Our future will thus depend on our ability to educate government decision-makers so that they understand how AI actually works.
Without that understanding, there will be many cases of governments attempting to use emerging AI technologies inappropriately. For example, last year the U.S. Department of Homeland Security (DHS) proposed using AI as part of a developing "extreme vetting initiative" (EVI). Put simply, the EVI proposal suggested that it is possible to make "determinations via automation" about whether an individual seeking asylum would become a "positively contributing member of society" or was more likely to be a terrorist threat.
On November 17, 2017, 54 experts on AI, machine learning, and related technologies (including myself, I’m proud to say) sent a letter expressing grave concerns about the proposed plan to the Acting Secretary of DHS, Elaine Duke. “Simply put,” the letter said, “no computational methods can provide reliable or objective assessments of the traits that [DHS] seeks to measure.”
It is important to realize that in this case a policy decision, perhaps well intended, was being made by people who did not understand the details of the technology they were proposing to use. To see why this problem could not be appropriately addressed with AI, one needs to know a bit about how AI-based machine-learning algorithms work.
One major use of these programs is learning to predict outcomes from a large set of example data. The goal is to identify which features of the data best indicate the category each example will fall into. If some of the examples fall into one category and the rest fall into another, the algorithm can be trained to figure out which factors in the data correlate most strongly with each category.
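For readers who want to see the idea in code, here is a minimal sketch, using the scikit-learn library on entirely synthetic data with made-up features, of training such a classifier and then inspecting which features it learned to rely on:

```python
# Minimal sketch of supervised classification (synthetic data, illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# 1,000 synthetic examples with 3 numeric features; the category label
# depends mostly on feature 0, so that is what the model should pick up.
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)

print("held-out accuracy:", model.score(X_test, y_test))
print("learned feature weights:", model.coef_)  # weight on feature 0 dominates
```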
For example, suppose we have a lot of data about patients coming to the emergency room of a hospital. Some of these patients are sent home and some are admitted. But some of those who are sent home end up coming back to the ER within a short period of time; these are known as "revisits." Reducing revisit rates, by admitting the right patients instead of sending them home after the first visit, improves patient care and also saves the hospital considerable costs. In fact, reducing revisit rates is considered a critical part of improving health care in America.
In work led by Professor Kristin Bennett at Rensselaer Polytechnic Institute, machine learning systems were shown to help doctors make better decisions on this problem. The hospital's electronic health records were used to algorithmically compare the people who were admitted and stayed, the people who were released and didn't come back, and the revisitors, who returned to the ER within some set amount of time. The hospital's data included many thousands of potential features for each patient (when they came, where they live, what time they were released, how old they are, what treatments they were given, and many more). Using machine learning, we were able to identify smaller sets of features that the doctors could use to better predict whether someone was likely to return, and to take that into account in their decision-making.
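As a rough illustration only (this is not Professor Bennett's actual method, and the patient features below are hypothetical stand-ins applied to synthetic data), here is how a learned model can surface a small set of predictive features from a larger pool of candidates:

```python
# Sketch of ranking candidate features by predictive value (synthetic data;
# the column names are hypothetical and do not reflect the real study).
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 5000

df = pd.DataFrame({
    "age": rng.integers(18, 95, n),
    "arrival_hour": rng.integers(0, 24, n),
    "num_prior_visits": rng.poisson(1.0, n),
    "length_of_stay_hrs": rng.exponential(4.0, n),
})
# Synthetic label: older patients with prior visits are more likely to revisit.
p = 1 / (1 + np.exp(-(0.03 * (df["age"] - 60) + 0.8 * df["num_prior_visits"] - 1.5)))
revisit = rng.random(n) < p

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(df, revisit)

# Rank the candidate features by how much each contributes to the prediction.
for name, score in sorted(zip(df.columns, model.feature_importances_),
                          key=lambda t: -t[1]):
    print(f"{name:20s} {score:.3f}")
```

The point of such a ranking is not to automate the decision, but to hand doctors a short, interpretable list of factors worth weighing.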
So what is wrong with using similar technology for a system like the EVI? To start with, even the best of these learning systems still operates at significantly less than 100 percent accuracy. In the case of the hospital, while it would be nice to catch every patient who might come back, it is considered enough to help doctors do better: if the revisit rate is cut down, that helps many people, and that is a great outcome even if it isn't perfect. In the case of the EVI, people seeking asylum in the U.S. may be in extreme danger if they aren't admitted. What error rate is considered acceptable then?
That's not the only problem. The real challenge for these machine learning algorithms is what is called "data skew": the more lopsided the split between categories, the larger the error rate is likely to be. If one in 100 patients is a revisitor, the system will do better than if that number is one in 1,000. If it's one in 1,000,000, the system can degrade to essentially random guessing; there is simply not enough data to go on. Consider what would happen if the hospital had only ever had one revisitor, and his name was Fred. Would we want the doctors to assume that everyone named Fred should be admitted?
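To make this concrete, here is a small sketch (again in Python with scikit-learn, on purely synthetic data that reflects neither the hospital's nor DHS's records) showing how a standard classifier trained on a roughly one-in-a-thousand split can post impressive-looking accuracy while missing most of the rare cases:

```python
# Illustration of the "data skew" problem with a ~1-in-1,000 positive class.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=100_000, n_features=20,
                           weights=[0.999, 0.001],  # ~1 in 1,000 positives
                           random_state=0)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)

print("accuracy:", accuracy_score(y_te, pred))            # looks impressive
print("recall on rare class:", recall_score(y_te, pred))  # far lower; most rare cases are missed
```

The overall accuracy is high only because predicting "not rare" is almost always right; the measure that matters, how often the rare cases are actually caught, collapses as the skew grows.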
The EVI would have been looking at data on the very large numbers of people who have been admitted to the U.S., with few (if any) examples of people who actually committed terrorist acts once here. The situation would be similar to the patient named Fred: lots of data, very few instances of the thing being predicted. So, technically, the skew would defeat the purpose, and the system would be largely random.
There were other technical issues outlined in the letter to DHS, but this problem of learning in the presence of skew was one of the key ones. As the letter explained, "on the scale of the American population and immigration rates, criminal acts are relatively rare, and terrorist acts are extremely rare." In other words, the chances of a learning algorithm performing well were very, very low. It was an inappropriate use of AI technology, and it would likely have negatively impacted many people without any significant benefit to public safety.
I am glad to say that the arguments by AI experts proved convincing. In May 2018, DHS announced that it would not be trying to use machine learning for this project. The letter had also noted: "Data mining is a powerful tool. Appropriately harnessed, it can do great good for American industry, medicine, and society." In short, we were not arguing against the use of AI in general, but rather trying to educate the decision-makers about what was not appropriate in this particular case.
This is why it is crucial that those of us in universities who understand these technologies learn to communicate our findings clearly and proactively. Fostering a deeper understanding of this field is necessary to help governments and policy-makers make better-informed choices about when to, and perhaps more importantly when not to, deploy powerful new technologies like AI.