Competition Breeds The Best in Analytics

Published by: Crowdsourcing.org on Oct 23, 2011
Copyright: Attribution Non-commercial




Competition breeds the best in analytics - SAS Voices


SAS Voices

Anna Brown | OCTOBER 21, 2011



I attended the Predictive Analytics World conference in NYC this week and found Kaggle, a platform that hosts data prediction competitions, fascinating. It accepts the broadest range of data mining, forecasting and bioinformatics problems and runs worldwide competitions open not only to true data scientists but also to electrical engineers, statisticians and really ANYONE who can solve the problem. Certainly a productive use of crowdsourcing, this is a game for the smartest, most determined problem solvers in the world, set on a formalized, global stage.

Prediction challenges have covered a variety of topics, such as improving HIV research, forecasting tourism, predicting beer sales, estimating which Wikipedia editors are most likely to resign and even answering “dark matter” questions that NASA has been pondering for decades. Prizes for solving these complex problems range from as little as $500 to $3 million, the latter offered by the Heritage Provider Network in the biggest data mining competition ever, which aims to help reduce the number of hospital visits in a given year.

“The award amount usually isn’t that large,” says Kaggle CEO Anthony Goldbloom, “yet we consistently receive hundreds of entries for each project.” When it comes to predictive analytics problems, passionate statistical modelers just want to come up with the best answer. The prize, it seems, is icing on the cake.

Who exactly wins the competitions? Who comes up with the best predictive score? Surprisingly, electrical engineers and physicists are the most successful. “Computer scientists and statisticians don’t do as well, perhaps because they are so tied to specific algorithms in the field,” Goldbloom says. You might also be surprised to learn that a PhD student in glaciology won the NASA dark matter challenge. “The contest allows people to look at a problem that they wouldn’t otherwise know existed,” says Goldbloom.
So it’s not only a win for NASA, which now has the “magic” formula for its dark matter work, but also a mega opportunity for the student: visibility, job prospects and, not least, the satisfaction of beating his peers.

This particular contest, according to Goldbloom, was fierce. And that’s not unusual. Participants can see the scores of other contenders. So when one participant overtakes another, the former leader is motivated to keep working on the model to jump ahead again. Then the other participant works harder to regain the lead, and so on. Goldbloom sees this leapfrogging effect quite often. A competitive dynamic kicks in, and the drive to win, to reach the best possible solution, is sky high.

The solutions come in quickly, too. The dark matter problem was solved within a week. Once Kaggle made data available for the tourism forecasting competition, forecasting errors dropped dramatically in just two weeks. Within a month, however, the results had flattened. “The results start to level out after a point because participants have squeezed the most out of the given data for the best models,” explains Goldbloom.

He talked about how data scientists, be they students or practitioners, are hungry for real-world datasets like this and enjoy the challenge. Meanwhile, according to a McKinsey report published earlier this year, companies face a shortage of analytical talent. It doesn’t add up. But Kaggle is providing a forum to unite the two groups, making data science a sport.


The blog content appearing on this site does not necessarily represent the opinions of SAS. Your use of this blog is governed by the Terms of Use and the SAS Privacy Statement.

Copyright © SAS Institute Inc. All Rights Reserved
