Lois Mermelstein The Law Office of Lois D. Mermelstein email@example.com 512-222-8589
Cyberspace Law Committee Meeting, August 3, 2012
Ted Claypoole Womble Carlyle firstname.lastname@example.org 704-331-4910
What Is Big Data?
✤ Data that exceeds the processing capacity of conventional database systems.
✤ Too much data
✤ It moves too fast
✤ It’s too diverse
How’d we get here?
✤ Storage, processing speed, and bandwidth are becoming exponentially faster
✤ Networking is expanding exponentially
✤ And you can buy all the pieces: data, processing, infrastructure
source: http://radar.oreilly.com/2011/08/building-data-startups.html
Crunching Big Data - Volume
✤ Turn 12 terabytes of tweets/day into improved product sentiment analysis
✤ Convert 350 billion annual meter readings to better predict power consumption
✤ Crunching Facebook recommendations based on your friends’ interests
Crunching Big Data - Velocity
✤ Time-sensitive analysis and decision-making, to catch important events as they happen
✤ When there’s too much input data (so toss some) or immediate decisions must be made
✤ Examples:
  ✤ Scrutinize 5 million trade events/day to identify potential fraud
  ✤ Analyze 500 million daily call detail records in real-time to predict customer churn faster
Crunching Big Data - Variety
✤ Not just names/addresses in a customer database
✤ Want to analyze text, audio, video, click streams, log files, sensor data, location data, and anything else that’s available
✤ Principle: when you can, keep everything - there might be something useful in what you throw away
Unexpected Consequences
✤ Anonymous AOL searcher isn’t (NYT, 8/9/2006)
✤ Anonymous Netflix users aren’t, when compared with IMDb database (Wired, 12/13/2007)
✤ For many, browsing history is unique and repeatable (8/1/2012)
✤ Target knows when you’re pregnant (NYT, 2/19/2012)
Lessons to (Re)learn
✤ Correlation isn't causation
✤ But correlation may be all you need
✤ You can't hide in the crowd
Personally Identifiable Information
•PII as a mathematical function
•How many points of data do you need?
•Pineda v. Williams-Sonoma Stores, Inc. (Cal. Feb. 10, 2011)
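One way to read "PII as a mathematical function" is to count bits of identifying information: singling out one person in a population of N takes about log2(N) bits, and each attribute shared by a fraction p of people contributes about -log2(p) bits. The sketch below is a rough illustration only. The population figure and attribute frequencies are assumed round numbers, not figures from the slides, and adding the bits together assumes the attributes are statistically independent, which real data often is not.

```python
import math

def bits_needed(population: int) -> float:
    """Bits of information required to single out one person in a population."""
    return math.log2(population)

def bits_revealed(matching_fraction: float) -> float:
    """Bits revealed by an attribute shared by this fraction of the population."""
    return -math.log2(matching_fraction)

# Assumed, illustrative figures (not from the presentation).
WORLD_POPULATION = 7_000_000_000
attribute_fractions = {
    "birthday (day and month)": 1 / 365,
    "gender": 1 / 2,
    "ZIP code shared by about 25,000 people": 25_000 / WORLD_POPULATION,
}

needed = bits_needed(WORLD_POPULATION)
# Summing assumes the attributes are independent of one another.
revealed = sum(bits_revealed(f) for f in attribute_fractions.values())

print(f"Bits needed to identify one person: {needed:.1f}")
for name, fraction in attribute_fractions.items():
    print(f"  {name}: {bits_revealed(fraction):.1f} bits")
print(f"Bits revealed by these attributes together: {revealed:.1f}")
```

Under these assumptions, a handful of ordinary data points already covers most of the roughly 33 bits needed, which is why the question "how many points of data do you need?" tends to have a small answer.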
HIPAA De-Identified Data
•Re-Identifying De-Identified Data
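Re-identification commonly works by linking records that have had names removed to a second, named dataset on the attributes the two share. The sketch below shows that linkage on toy data. Every record, field name, and the `reidentify` helper is hypothetical, and the toy records are not meant to represent what HIPAA's de-identification standards actually permit a dataset to retain.

```python
from collections import defaultdict

# Hypothetical quasi-identifiers present in both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

def reidentify(deidentified, public):
    """Map the index of each nameless record to a name when exactly one
    person in the public dataset shares its quasi-identifiers."""
    index = defaultdict(list)
    for person in public:
        key = tuple(person[q] for q in QUASI_IDENTIFIERS)
        index[key].append(person["name"])

    matches = {}
    for i, record in enumerate(deidentified):
        key = tuple(record[q] for q in QUASI_IDENTIFIERS)
        candidates = index.get(key, [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches[i] = candidates[0]
    return matches

# Toy data, invented for illustration.
deidentified = [
    {"zip": "78701", "birth_date": "1970-01-15", "sex": "F", "diagnosis": "asthma"},
    {"zip": "78702", "birth_date": "1985-06-30", "sex": "M", "diagnosis": "diabetes"},
]
public = [
    {"name": "Alice Example", "zip": "78701", "birth_date": "1970-01-15", "sex": "F"},
    {"name": "Bob Example", "zip": "78702", "birth_date": "1985-06-30", "sex": "M"},
    {"name": "Carl Example", "zip": "78702", "birth_date": "1985-06-30", "sex": "M"},
]

matches = reidentify(deidentified, public)
print(matches)
```

In this toy run the first record links to exactly one named person, while the second stays ambiguous because two people share its attributes. The fewer people who share a combination of attributes, the weaker the protection that removing names provides.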
Escaping Regulatory Requirements
•Privacy
•Fair Credit Reporting
•Redlining
•Employment Discrimination
Single Transaction Owned By:
•Retailer
•Wholesale vendor
•Manufacturer
•Shipping Company
•Customer’s Bank
•Customer’s ISP
•Retailer’s Bank
•Merchant Card Processor
•Phone company/Hardware/Software
Government Using Big Data
•Law Enforcement
Copyright Issues
•Who owns the data?
•Who owns the derivative works?
•Combined data?