For NEIL, seeing can mean comprehension

CMU’s Never-Ending Image Learner is scanning the Web to build its own database of facts and connections

ConceptNet is the most advanced semantic network ever to come out of the Massachusetts Institute of Technology. It contains more than one million facts shoveled into it by thousands of online contributors who, since 1999, have built up the system en masse, Wikipedia-style.

But last year, when researchers tested it using questions from the Wechsler Preschool and Primary Scale of Intelligence Test (questions such as, "Why is ice cream kept in the freezer?"), ConceptNet turned out to be about as smart as a four-year-old child.

“Better algorithms could get the results to (the) average for a five- or six-year-old, but something new would be needed to answer comprehension questions with the skill of a child of eight,” the University of Illinois team concluded.

Despite sci-fi predictions about artificial intelligences becoming dangerously smarter than humans, researchers have so far found few ways to teach computers those millions of little bits of information that make up common sense—aside from just feeding them facts, one by one, a process slower and less comprehensive than the way a young human learns.

That’s led to computers that can beat chess champions, but which don’t understand bits of common sense that can’t be deduced by algorithms and mathematics, such as the fact that pencils are used for writing, or that cups hold liquids.

Tom Mitchell, Fredkin University Professor of Computer Science and head of CMU's Machine Learning Department, attempted to find a better way. Since 2010, his Never-Ending Language Learner, or NELL, has been scanning the Internet 24 hours a day, seven days a week, trying to deduce relationships between nouns extracted from a collection of more than 1 billion Web pages. Some of NELL's recent epiphanies: "newborn photography is a form of visual art," "cilantro citrus chicken is a food," and "Ernie Hudson is a male." (You can follow along as NELL learns at twitter.com/cmunell.)
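To give a flavor of what "deducing relationships between nouns" from text can look like, here is a toy sketch of pattern-based fact extraction in the spirit of "X is a kind of Y" templates. This is not NELL's actual pipeline (NELL combines many extractors and learned patterns); the single regular expression and the sample sentences below are illustrative assumptions only.

```python
# Toy pattern-based fact extraction (illustrative, not NELL's real code).
import re

# One simple "X is a (form of|kind of) Y" pattern; real systems use many
# such patterns, plus patterns learned automatically from the Web.
PATTERN = re.compile(r"(\w[\w ]*?) is a (?:form of |kind of )?(\w[\w ]*)")

sentences = [
    "newborn photography is a form of visual art",
    "cilantro citrus chicken is a food",
]

facts = []
for s in sentences:
    m = PATTERN.match(s)
    if m:
        # Store the extracted (noun, category) pair.
        facts.append((m.group(1).strip(), m.group(2).strip()))

print(facts)
```

Run on the two example sentences from the article, this yields the pairs ("newborn photography", "visual art") and ("cilantro citrus chicken", "food"), mirroring the kinds of facts NELL tweets.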

But we don’t learn everything—or even most things—about the world around us from written words, says Abhinav Gupta, assistant research professor at the Robotics Institute. We learn many things visually, and some of the things we learn are so obvious to us, we take them for granted.

“Some things are so basic they aren’t put into words,” he says. “No one is going to say ‘the white sheep’ because we all understand that most sheep are white. In fact, there are more references to ‘black sheep’ out there, because that is a phrase” that plays on their uniqueness.

To study how computers can learn from visual information as well as language, Gupta, Abhinav Shrivastava, a Ph.D. student in artificial intelligence and robotics, and Xinlei Chen, a Ph.D. student in the Language Technologies Institute, have created the Never-Ending Image Learner, or NEIL, whose virtual kindergarten is the sea of pictures available online. Since July 2013, NEIL has been searching the Web for images related to nouns—provided either by NELL or by Gupta and his team—and is using the pictures to infer relationships.

The program understands five broad kinds of relationships: "can have a part," "can be/can have," "can be found in," "can be a kind of/look similar to," "can be an attribute of." For example, when NEIL was presented with the word "tires," the program searched for images marked with the term. After scanning countless images of tires, it concluded that "wheel can be a kind of, or look similar to, tire," "tire can be, or can have, black," and "tire can be, or can have, round shape."
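Facts of this shape are naturally represented as (subject, relation, object) triples. The sketch below shows one minimal way such a knowledge base could be stored and queried; the relation names come from the article, but the class and everything else is a hypothetical illustration, not NEIL's implementation.

```python
# A minimal sketch of storing NEIL-style visual facts as triples.
# The five relation names are from the article; the rest is hypothetical.
from collections import defaultdict

RELATIONS = {
    "can have a part",
    "can be/can have",
    "can be found in",
    "can be a kind of/look similar to",
    "can be an attribute of",
}

class KnowledgeBase:
    def __init__(self):
        # Maps each subject noun to a set of (relation, object) pairs.
        self.facts = defaultdict(set)

    def add(self, subject, relation, obj):
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.facts[subject].add((relation, obj))

    def query(self, subject):
        # Return the subject's facts in a stable, sorted order.
        return sorted(self.facts[subject])

kb = KnowledgeBase()
kb.add("wheel", "can be a kind of/look similar to", "tire")
kb.add("tire", "can be/can have", "black")
kb.add("tire", "can be/can have", "round shape")

print(kb.query("tire"))
```

Querying "tire" returns both of the facts NEIL inferred about it, which hints at how thousands of such triples accumulate into a browsable knowledge base.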

With an ever-evolving base of knowledge, NEIL is always discovering relationships between new nouns and those already familiar to it. It has so far deduced 3,000 relationships between 2,500 concepts.

Gupta says the image-scanning NEIL is able to grasp things lost on the language-absorbing NELL: “NEIL knows what ‘white’ looks like, and recognizes that whiteness is an aspect of most sheep.”

Like NELL, NEIL is not always correct. Misidentifying a yellow billiard ball, the program concluded that a set of pool balls has a part that is "lemon," and, bafflingly, thinks a "bridge can be a part of (an) eggplant."

Also, some concepts just stump it. “I don’t think NEIL will ever understand what cheese is,” says Gupta. “Cheese comes in too many colors and shapes for NEIL to comprehend it.”

As for homonyms, NEIL is able to cluster images into different sets if it senses the pictures related to the word come in more than one distinct shape. Under its entry for “bass,” a few clusters are of fish and a few are of musical instruments.
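The homonym behavior can be illustrated with a toy clustering example. The code below is not NEIL's actual method: each "image" is a made-up two-dimensional feature vector, where a real system would use learned visual features, and the clustering is a bare-bones k-means with k = 2.

```python
# Toy illustration of splitting one word's images into visual clusters.
# Feature vectors and the two-means routine are hypothetical stand-ins.

def two_means(points, iters=10):
    # Initialize the two cluster centers on the first and last points.
    c0, c1 = points[0], points[-1]
    g0, g1 = [], []
    for _ in range(iters):
        g0, g1 = [], []
        for p in points:
            # Assign each point to its nearer center (squared distance).
            d0 = sum((a - b) ** 2 for a, b in zip(p, c0))
            d1 = sum((a - b) ** 2 for a, b in zip(p, c1))
            (g0 if d0 <= d1 else g1).append(p)
        # Recompute each center as the mean of its assigned points.
        c0 = tuple(sum(v) / len(g0) for v in zip(*g0))
        c1 = tuple(sum(v) / len(g1) for v in zip(*g1))
    return g0, g1

# Hypothetical feature vectors for images tagged "bass":
# three fish-like shapes and three instrument-like shapes.
bass_images = [(1.0, 0.2), (1.1, 0.1), (0.9, 0.3),
               (0.1, 1.0), (0.2, 1.1), (0.3, 0.9)]

fish, instruments = two_means(bass_images)
print(len(fish), len(instruments))
```

On these made-up vectors the six "bass" images split cleanly into two groups of three, loosely mirroring how NEIL separates fish from musical instruments under one word.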

The point of the project is to study how a computer can learn independently, a process that's crucial to someday developing an artificial intelligence smarter than the average kindergartener. "We are learning how a computer (could) answer questions," says Chen, who spent much of last summer writing code for NEIL.

A browse through the nouns being studied by NEIL reveals some of the outside interests of Gupta, Chen and Shrivastava. Terms they entered into NEIL “just for fun” include Batman, Yoda, Iron Man, Xindi (a race of aliens from “Star Trek: Enterprise”) and a variety of specific car models (new and vintage).

Sci-fi-style computers that can learn independently are probably decades, if not centuries, away. But Mitchell says the research being done with NEIL and NELL has the potential to change technology already in use, starting with Web search engines.

“Right now, a search engine reading the Web is like you or me reading a book in Swahili,” Mitchell says. “We could find a certain phrase if we looked over every bit of text, but we would have no idea what it meant.” A more advanced search engine would understand something about the terms for which it was searching. A researcher in an obscure field might type her terms into a search engine, and be able to find everyone working on similar projects—even if they weren’t described in exactly the same words.

“It would reinvent search engines as we know them,” Mitchell says, “if search engines not only found the words you were looking for, but understood the text, and therefore the answers to your queries.”

Nick Keppler is a Pittsburgh-based freelance writer whose work has appeared across the country in alternative newsweeklies such as St. Louis’ Riverfront Times and the Village Voice. | thelink@cs.cmu.edu