
Automating big-data analysis: MIT Research

System that replaces human intuition with algorithms outperforms 615 of 906 human teams.

By Larry Hardesty


Big-data analysis consists of searching for buried patterns that have some kind of predictive power. But choosing which “features” of the data to analyze usually requires some human intuition. In a database containing, say, the beginning and end dates of various sales promotions and weekly profits, the crucial data may not be the dates themselves but the spans between them, or not the total profits but the averages across those spans.
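The promotions example can be made concrete with a small sketch. It derives the span between a promotion's start and end dates and the average weekly profit across that span. The record layout, field names, and figures are invented for illustration and do not come from the researchers' system.

```python
from datetime import date

# Hypothetical promotion records (all names and numbers are illustrative).
promotions = [
    {"name": "spring_sale", "start": date(2015, 3, 2), "end": date(2015, 3, 23)},
    {"name": "summer_sale", "start": date(2015, 6, 1), "end": date(2015, 6, 15)},
]

# Hypothetical weekly profits, keyed by the Monday that starts each week.
weekly_profit = {
    date(2015, 3, 2): 1200.0,
    date(2015, 3, 9): 1500.0,
    date(2015, 3, 16): 900.0,
    date(2015, 6, 1): 2000.0,
    date(2015, 6, 8): 1000.0,
}


def derive_features(promo, profits):
    """Return the promotion's length in days and its average weekly profit."""
    span_days = (promo["end"] - promo["start"]).days
    # Keep only the weeks that start inside the promotion window.
    in_span = [profit for week, profit in profits.items()
               if promo["start"] <= week < promo["end"]]
    avg_profit = sum(in_span) / len(in_span) if in_span else None
    return {"span_days": span_days, "avg_weekly_profit": avg_profit}


features = {p["name"]: derive_features(p, weekly_profit) for p in promotions}
```

Neither derived value is stored in the raw data; both have to be computed from it, which is the step that normally depends on an analyst's intuition.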

MIT researchers aim to take the human element out of big-data analysis, with a new system that not only searches for patterns but designs the feature set, too. To test the first prototype of their system, they enrolled it in three data science competitions, in which it competed against human teams to find predictive patterns in unfamiliar data sets. Of the 906 teams participating in the three competitions, the researchers’ “Data Science Machine” finished ahead of 615.

In two of the three competitions, the predictions made by the Data Science Machine were 94 percent and 96 percent as accurate as the winning submissions. In the third, the figure was a more modest 87 percent. But where the teams of humans typically labored over their prediction algorithms for months, the Data Science Machine took somewhere between two and 12 hours to produce each of its entries.

“We view the Data Science Machine as a natural complement to human intelligence,” says Max Kanter, whose MIT master’s thesis in computer science is the basis of the Data Science Machine. “There’s so much data out there to be analyzed. And right now it’s just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving.”

Between the lines

Kanter and his thesis advisor, Kalyan Veeramachaneni, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), describe the Data Science Machine in a paper that Kanter will present next week at the IEEE International Conference on Data Science and Advanced Analytics.

Veeramachaneni co-leads the Anyscale Learning for All group at CSAIL, which applies machine-learning techniques to practical problems in big-data analysis, such as determining the power-generation capacity of wind-farm sites or predicting which students are at risk for dropping out of online courses.

“What we observed from our experience solving a number of data science problems for industry is that one of the very critical steps is called feature engineering,” Veeramachaneni says. “The first thing you have to do is identify what variables to extract from the database or compose, and for that, you have to come up with a lot of ideas.”

In predicting dropout, for instance, two crucial indicators proved to be how long before a deadline a student begins working on a problem set and how much time the student spends on the course website relative to his or her classmates. MIT’s online-learning platform MITx doesn’t record either of those statistics, but it does collect data from which they can be inferred.
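As a loose sketch of how such indicators might be inferred, the code below computes both from a toy activity log. The log format, the student names, and the numbers are all hypothetical; the article does not describe what MITx actually records.

```python
from datetime import datetime

# Hypothetical click-log rows: (student, timestamp, kind of activity).
events = [
    ("ana", datetime(2015, 10, 1, 9, 0), "problem_set"),
    ("ana", datetime(2015, 10, 1, 10, 0), "video"),
    ("ben", datetime(2015, 10, 4, 20, 0), "problem_set"),
]
# Hypothetical total minutes each student spent on the course website.
minutes_on_site = {"ana": 300.0, "ben": 100.0}
# Hypothetical problem-set deadline.
deadline = datetime(2015, 10, 5, 0, 0)


def hours_before_deadline(student):
    """Hours between the student's first problem-set event and the deadline."""
    starts = [when for who, when, kind in events
              if who == student and kind == "problem_set"]
    if not starts:
        return None  # the student never opened the problem set
    return (deadline - min(starts)).total_seconds() / 3600


def relative_time_on_site(student):
    """The student's time on the site divided by the class average."""
    class_avg = sum(minutes_on_site.values()) / len(minutes_on_site)
    return minutes_on_site[student] / class_avg
```

In this toy log, "ana" starts 87 hours ahead of the deadline and spends 1.5 times the class average on the site, while "ben" starts 4 hours ahead and spends half the average.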

Featured composition

Kanter and Veeramachaneni use a couple of tricks to manufacture candidate features for data analyses. One is to exploit structural relationships inherent in database design. Databases typically store different types of data in different tables, indicating the correlations between them using numerical identifiers. The Data Science Machine tracks these correlations, using them as a cue to feature construction.

For instance, one table might list retail items and their costs; another might list items included in individual customers’ purchases. The Data Science Machine would begin by importing costs from the first table into the second. Then, taking its cue from the association of several different items in the second table with the same purchase number, it would execute a suite of operations to generate candidate features: total cost per order, average cost per order, minimum cost per order, and so on. As numerical identifiers proliferate across tables, the Data Science Machine layers operations on top of each other, finding minima of averages, averages of sums, and so on.
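The retail example can be sketched in a few lines. This is an illustration of the idea under invented table contents and helper names, not the Data Science Machine's own code: costs are imported into the purchases table, aggregated per order, and then aggregated again per customer to produce layered features such as a mean of sums.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tables (names and values are illustrative).
item_cost = {"pen": 2.0, "notebook": 5.0, "bag": 30.0}
purchases = [  # (customer, order number, item)
    ("c1", 101, "pen"), ("c1", 101, "notebook"),
    ("c1", 102, "bag"),
    ("c2", 103, "pen"), ("c2", 103, "pen"),
]

AGGREGATES = {"sum": sum, "mean": mean, "min": min}


def group_aggregate(pairs, aggregates):
    """Group (key, value) pairs by key and apply every aggregate to each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: {name: fn(values) for name, fn in aggregates.items()}
            for key, values in groups.items()}


# Step 1: import each item's cost into the purchases table.
order_costs = [(order, item_cost[item]) for _, order, item in purchases]

# Step 2: first-level candidate features, one set per order.
per_order = group_aggregate(order_costs, AGGREGATES)

# Step 3: layer a second round of aggregates on top, one set per customer.
order_owner = {order: customer for customer, order, _ in purchases}
per_customer = {}
for inner in AGGREGATES:
    pairs = [(order_owner[order], feats[inner])
             for order, feats in per_order.items()]
    for customer, stacked in group_aggregate(pairs, AGGREGATES).items():
        for outer, value in stacked.items():
            per_customer.setdefault(customer, {})[f"{outer}_of_{inner}"] = value
```

With three aggregates and two levels, each customer already gets nine candidate features (`sum_of_sum`, `mean_of_sum`, `min_of_mean`, and so on), which suggests how quickly the candidate set grows as more tables are linked.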

It also looks for so-called categorical data, which appear to be restricted to a limited range of values, such as days of the week or brand names. It then generates further feature candidates by dividing up existing features across categories.
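A minimal sketch of that step follows. The detection rule used here (a column counts as categorical when its distinct values number at most half the rows) is an assumed heuristic chosen for the toy data, not the system's documented rule, and the rows and column names are invented.

```python
from collections import defaultdict

# Hypothetical purchase rows (values are illustrative).
rows = [
    {"order": 101, "weekday": "Mon", "cost": 2.0},
    {"order": 101, "weekday": "Mon", "cost": 5.0},
    {"order": 102, "weekday": "Sat", "cost": 30.0},
    {"order": 103, "weekday": "Sat", "cost": 2.0},
    {"order": 104, "weekday": "Mon", "cost": 4.0},
    {"order": 105, "weekday": "Sat", "cost": 8.0},
]


def looks_categorical(rows, column, max_ratio=0.5):
    """Assumed heuristic: few distinct values relative to the number of rows."""
    distinct = {row[column] for row in rows}
    return len(distinct) <= max_ratio * len(rows)


def split_by_category(rows, value_column, category_column):
    """Total an existing numeric feature separately for each category value."""
    totals = defaultdict(float)
    for row in rows:
        label = f"{value_column}_sum_{category_column}={row[category_column]}"
        totals[label] += row[value_column]
    return dict(totals)


categorical = [column for column in rows[0] if looks_categorical(rows, column)]
split_features = {column: split_by_category(rows, "cost", column)
                  for column in categorical}
```

Here only `weekday` passes the heuristic, so the single `cost` feature is divided into one total for Monday and one for Saturday.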

Once it’s produced an array of candidates, it reduces their number by identifying those whose values seem to be correlated. Then it starts testing its reduced set of features on sample data, recombining them in different ways to optimize the accuracy of the predictions they yield.
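One simple way to thin out correlated candidates is sketched below. It is an assumption for illustration, since the article does not say how the system measures correlation or which of two correlated features it keeps: the sketch uses the Pearson coefficient, a 0.95 cutoff, and a greedy first-come rule, all applied to made-up feature values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.95):
    """Greedily keep a feature only if it is not strongly correlated
    with any feature that has already been kept."""
    kept = {}
    for name, values in features.items():
        if all(abs(pearson(values, other)) < threshold
               for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical candidate features, one value per order.
candidates = {
    "total_cost": [7.0, 30.0, 4.0, 12.0],
    "total_cost_cents": [700.0, 3000.0, 400.0, 1200.0],  # same signal, rescaled
    "item_count": [2.0, 1.0, 2.0, 3.0],
}
reduced = drop_correlated(candidates)
```

The rescaled copy of `total_cost` carries no new information and is dropped, while `item_count` survives because its correlation with `total_cost` is well under the cutoff.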

“The Data Science Machine is one of those unbelievable projects where applying cutting-edge research to solve practical problems opens an entirely new way of looking at the problem,” says Margo Seltzer, a professor of computer science at Harvard University who was not involved in the work. “I think what they’ve done is going to become the standard quickly — very quickly.”

Source: MIT News Office

 

The rise and fall of cognitive skills: Neuroscientists find that different parts of the brain work best at different ages.

By Anne Trafton


CAMBRIDGE, Mass. – Scientists have long known that our ability to think quickly and recall information, also known as fluid intelligence, peaks around age 20 and then begins a slow decline. However, more recent findings, including a new study from neuroscientists at MIT and Massachusetts General Hospital (MGH), suggest that the real picture is much more complex.

The study, which appears in the XX issue of the journal Psychological Science, finds that different components of fluid intelligence peak at different ages, some as late as age 40.

“At any given age, you’re getting better at some things, you’re getting worse at some other things, and you’re at a plateau at some other things. There’s probably not one age at which you’re peak on most things, much less all of them,” says Joshua Hartshorne, a postdoc in MIT’s Department of Brain and Cognitive Sciences and one of the paper’s authors.

“It paints a different picture of the way we change over the lifespan than psychology and neuroscience have traditionally painted,” adds Laura Germine, a postdoc in psychiatric and neurodevelopmental genetics at MGH and the paper’s other author.

Measuring peaks

Until now, it has been difficult to study how cognitive skills change over time because of the challenge of getting large numbers of people older than college students and younger than 65 to come to a psychology laboratory to participate in experiments. Hartshorne and Germine were able to take a broader look at aging and cognition because they have been running large-scale experiments on the Internet, where people of any age can become research subjects.

Their websites, gameswithwords.org and testmybrain.org, feature cognitive tests designed to be completed in just a few minutes. Through these sites, the researchers have accumulated data from nearly 3 million people in the past several years.

In 2011, Germine published a study showing that the ability to recognize faces improves until the early 30s before gradually starting to decline. This finding did not fit into the theory that fluid intelligence peaks in late adolescence. Around the same time, Hartshorne found that subjects’ performance on a visual short-term memory task also peaked in the early 30s.

Intrigued by these results, the researchers, then graduate students at Harvard University, decided that they needed to explore a different source of data, in case some aspect of collecting data on the Internet was skewing the results. They dug out sets of data, collected decades ago, on adult performance at different ages on the Wechsler Adult Intelligence Scale, which is used to measure IQ, and the Wechsler Memory Scale. Together, these tests measure about 30 different subsets of intelligence, such as digit memorization, visual search, and assembling puzzles.

Hartshorne and Germine developed a new way to analyze the data that allowed them to compare the age peaks for each task. “We were mapping when these cognitive abilities were peaking, and we saw there was no single peak for all abilities. The peaks were all over the place,” Hartshorne says. “This was the smoking gun.”

However, the dataset was not as large as the researchers would have liked, so they decided to test several of the same cognitive skills with their larger pools of Internet study participants. For the Internet study, the researchers chose four tasks that peaked at different ages, based on the data from the Wechsler tests. They also included a test of the ability to perceive others’ emotional state, which is not measured by the Wechsler tests.

The researchers gathered data from nearly 50,000 subjects and found a very clear picture showing that each cognitive skill they were testing peaked at a different age. For example, raw speed in processing information appears to peak around age 18 or 19, then immediately starts to decline. Meanwhile, short-term memory continues to improve until around age 25, when it levels off and then begins to drop around age 35.

For the ability to evaluate other people’s emotional states, the peak occurred much later, in the 40s or 50s.

More work will be needed to reveal why each of these skills peaks at different times, the researchers say. However, previous studies have hinted that genetic changes or changes in brain structure may play a role.

“If you go into the data on gene expression or brain structure at different ages, you see these lifespan patterns that we don’t know what to make of. The brain seems to continue to change in dynamic ways through early adulthood and middle age,” Germine says. “The question is: What does it mean? How does it map onto the way you function in the world, or the way you think, or the way you change as you age?”

Accumulated intelligence

The researchers also included a vocabulary test, which serves as a measure of what is known as crystallized intelligence — the accumulation of facts and knowledge. These results confirmed that crystallized intelligence peaks later in life, as previously believed, but the researchers also found something unexpected: While data from the Wechsler IQ tests suggested that vocabulary peaks in the late 40s, the new data showed a later peak, in the late 60s or early 70s.

The researchers believe this may be a result of better education, more people having jobs that require a lot of reading, and more opportunities for intellectual stimulation for older people.

Hartshorne and Germine are now gathering more data from their websites and have added new cognitive tasks designed to evaluate social and emotional intelligence, language skills, and executive function. They are also working on making their data public so that other researchers can access it and perform other types of studies and analyses.

“We took the existing theories that were out there and showed that they’re all wrong. The question now is: What is the right one? To get to that answer, we’re going to need to run a lot more studies and collect a lot more data,” Hartshorne says.

The research was funded by the National Institutes of Health, the National Science Foundation, and a National Defense Science and Engineering Graduate Fellowship.

Source: MIT News Office