8 Things You Can’t Afford to Ignore About eDiscovery

AIIM 8 Things Series

John Wang, CCP Product Manager and eDiscovery Specialist jwang@zlti.com February 25, 2010

Brought to you by:

About ZL Technologies
• Experts in Total Information Governance
– Unstructured Content Archiving
– eDiscovery
– Compliance
– Secure Email
– Scalability & Low TCO via Private Clouds

• Select Customers

About John Wang
• Experience / Roles
– 15+ years in Technology
– Product Manager
– Solutions Architect
– Developer
• Degrees
– M&T MBA
– Computer Science
– Finance
• Industry Participation
– EDRM: Project Leadership, Search Guide Co-author
– AIIM: Research proposal, execution, and presentation
– LexisNexis: Certified Concordance Professional

Agenda
1. Early Case Assessment
2. Data Mapping
3. Investigative eDiscovery
4. Concept Search
5. Non-Linear Review
6. Parallel Search
7. End-to-End eDiscovery
8. Cloud Computing

Overview

Did you know?

5 Year Enterprise Data Growth Estimate

85% will be Unstructured!
Sources: Gartner

Overview
• ESI (electronically stored information) is discoverable
• ESI volume is growing at 55+% annually*
• Litigation is increasing
– 42% of US organizations expect more litigation (up from 34%)**
– 83% of US organizations were litigated against in 2008**
• Timelines have been shortened
• How do we handle this in an affordable way?
• Can we move from a reactive, bottom-up approach to a strategic, top-down approach?
• This presentation covers 8 technologies to do just that!
Sources: * ESG ** Fulbright & Jaworski
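
To put the 55% annual growth figure in perspective, the compounding can be worked out directly. The sketch below assumes the rate stays constant for five years, which the slide does not claim; the function name is illustrative only.

```python
def growth_factor(annual_rate: float, years: int) -> float:
    """Return the multiple by which volume grows at a constant annual rate."""
    return (1.0 + annual_rate) ** years


if __name__ == "__main__":
    # Assumption: the 55% annual ESI growth rate holds steady for 5 years.
    factor = growth_factor(0.55, 5)
    print(f"Volume after 5 years: {factor:.2f}x today's volume")
```

Under that assumption, ESI volume grows roughly ninefold in five years.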

Early Case Assessment

Did you know?
In-house eDiscovery

Payback Period
Sources: Gartner, Merrill Lynch

Early Case Assessment
3 Questions
– Does the complaint have merit?
– How much will this cost us?
– What has the org learned?

Item                 Achievement
Payback Period       3-6 months, or 1 large IP case
Litigation Success   76%**
Cost Reduction       50%**

Overview
– Estimate risk to prosecute or defend a case
– Formulate resolution in first 90-120 days
– Examine key facts, allegations, applicable laws and venues
– Analyze and assess potential trial themes for both sides
– Pursue the best course
Sources: ** Cogent Research

Early Case Assessment Results
[Chart: Cost of E-Discovery and Litigation Success Rate (0% to 100%), Without ECA vs. With ECA]

Early Case Assessment
Traditional Post-Collection ECA
[Diagram: EDRM stages (Information Management, Identification, Preservation, Collection, Processing, Review, Analysis, Production, Presentation) with VOLUME and RELEVANCE axes. Electronic Discovery Reference Model / © 2009 / v2.0 / edrm.net]

Assess ESI after Collection, Preservation, Processing and Analysis

Early Case Assessment
ECA “Now”
[Diagram: EDRM stages with VOLUME and RELEVANCE axes. Electronic Discovery Reference Model / © 2009 / v2.0 / edrm.net]

Compress timeline and assess before collection, reducing processing, analysis and review

Early Case Assessment
Deployment
– In-house eDiscovery
– Allows faster and iterative searching, “going back to the well”

Process
– Analysis
– Visualization

How does it affect you?
– Resolve cases faster
– Resolve cases more favorably
– Reduce costs

Action Plan
– Evaluate solutions
– Try solutions on known cases and case data
– Evaluate results

Data Mapping

Did you know?
Fortune 1000 Data per Firm

In potentially 100s of Repositories!
Sources: Industry Sources

Data Mapping
Required by Rule 26(a)(1)(B)
• “… a copy of, or a description by category and location of, all documents, electronically stored information, and tangible things”

Requirements
– Repositories
– Types of ESI per repository
– Custodians
– Retention policy
– Preservation & disposition
– Legal hold enforcement
– Collection method
– Accessibility
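
One way to picture the requirements above is as a record kept per repository. The following is a minimal sketch, not a prescribed schema: the class name, field names, and the sample repository are all hypothetical, chosen only to mirror the list of requirements.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DataMapEntry:
    """One repository in a data map; field names are illustrative."""
    repository: str
    esi_types: List[str] = field(default_factory=list)
    custodians: List[str] = field(default_factory=list)
    retention_policy: str = ""
    preservation_and_disposition: str = ""
    legal_hold_enforcement: str = ""
    collection_method: str = ""
    accessibility: str = ""


def missing_fields(entry: DataMapEntry) -> List[str]:
    """List the requirement fields that are still empty for this repository."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name)]


if __name__ == "__main__":
    entry = DataMapEntry(
        repository="Exchange mail server",  # hypothetical repository
        esi_types=["email", "attachments"],
        custodians=["jdoe"],
        retention_policy="7 years",
    )
    # Shows which parts of the data map still need to be documented.
    print(missing_fields(entry))
```

A gap report like this is one simple way to see where a data map is incomplete before a discovery request arrives.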

Take Advantage of Rule 37(f)
• Provides defense against sanctions for “routine, good-faith operation of an electronic information system.”

The Three Ss of eDiscovery
– Spoliation
– “I’m Sorry”
– Sanctions

Data Mapping
Integrated Data Mapping
[Diagram: Data Mapping integrated with Legal Hold Notification, Culling, Collection, and Legal Hold]

How does it affect you?
– Reduce sanction risk
– Reduce overhead from 10 hrs to 30 min / week
– Reduce costs
– Automate collections and legal holds
– Work with BCP/DR and InfoSec/DLP

Action Plan
– Evaluate current solution and available solutions
– Analyze options if there is a gap

Investigative eDiscovery
Exclusionary ED
• Approach
– Cull by Custodian
– Cull by Date
– Cull by File type
• Limitations
– Blunt tool
– De-selects on secondary characteristics
– Find relevance late in process
– May need to go back to the source late in the process
– More false negatives as the collection grows

Investigative ED
• Approach
– Cull by Matter
– Roots in Forensics
• Benefits
– Finding highly relevant information early in the process
– Finds information not necessarily tied to custodians, e.g. file server data
– Supports ECA

Investigative eDiscovery
Investigative eDiscovery is based on the science of forensics, an older and more complete approach than traditional eDiscovery. New technologies make Investigative eDiscovery a reality again.
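
The difference between the two approaches can be illustrated with a toy filter. This is a sketch under stated assumptions: the `Doc` fields, the sample documents, and the naive substring match standing in for "cull by matter" are all hypothetical, and real tools use full search engines rather than list comprehensions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Set


@dataclass
class Doc:
    doc_id: str
    custodian: str
    created: date
    file_type: str
    text: str


def exclusionary_cull(docs: Iterable[Doc], custodians: Set[str],
                      start: date, end: date, file_types: Set[str]) -> List[Doc]:
    """Keep a document only if custodian, date, and file type all match."""
    return [
        d for d in docs
        if d.custodian in custodians
        and start <= d.created <= end
        and d.file_type in file_types
    ]


def matter_cull(docs: Iterable[Doc], matter_terms: Iterable[str]) -> List[Doc]:
    """Keep documents whose text mentions any matter term (naive substring match)."""
    terms = [t.lower() for t in matter_terms]
    return [d for d in docs if any(t in d.text.lower() for t in terms)]


if __name__ == "__main__":
    docs = [
        Doc("d1", "alice", date(2009, 3, 1), "msg", "Lunch on Friday?"),
        Doc("d2", "alice", date(2009, 4, 10), "msg", "The pricing agreement draft is attached."),
        Doc("d3", "fileserver", date(2009, 4, 12), "doc", "Final pricing agreement terms."),
    ]
    by_attributes = exclusionary_cull(
        docs, {"alice"}, date(2009, 1, 1), date(2009, 12, 31), {"msg"})
    by_matter = matter_cull(docs, ["pricing agreement"])
    print([d.doc_id for d in by_attributes])  # d1 and d2: d3 is missed
    print([d.doc_id for d in by_matter])      # d2 and d3: d1 is dropped
```

In this toy data set the exclusionary filter keeps an irrelevant email and misses the file server document, while the matter-based filter finds the file server document regardless of custodian.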

Key Technologies
– Billion document search engines
– Index in-place
– Cloud / GRID scalability

How does it affect you?
– Higher Success Rates
– Lower Information Risk via Wider Safe Harbor
– Better results
– Successful ECA

Action Plan
– Evaluate past performance with respect to initially missed relevant email
– Calculate cost
– Investigate options

Concept Search

Did you know?
Keyword Search

Missed Relevant Documents
Sources: Blair & Maron

Concept Search
• Attorneys and paralegals are not familiar with the terms in use
– Many words can be used to mean the same thing
– Organizations often create special “code words”

Example: Subway Accident
– Subway Company: “unfortunate incident”
– Victims: “Disaster”
– Others: “event,” “incident,” “situation,” “problem,” “difficulty”
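
The subway example can be sketched in a few lines to show why a single keyword misses relevant documents. Note the assumptions: this uses a hand-built synonym map (`CONCEPTS`, hypothetical) and naive substring matching, which is query expansion rather than a true concept search engine; real products infer such associations statistically, and the sample documents are invented.

```python
from typing import Dict, List, Set

# Hypothetical synonym map built from the subway example; a real concept
# search engine would learn such associations rather than hard-code them.
CONCEPTS: Dict[str, Set[str]] = {
    "accident": {"accident", "unfortunate incident", "disaster", "event",
                 "incident", "situation", "problem", "difficulty"},
}


def keyword_search(docs: List[str], term: str) -> List[int]:
    """Return indexes of documents containing the literal term."""
    term = term.lower()
    return [i for i, text in enumerate(docs) if term in text.lower()]


def concept_search(docs: List[str], concept: str) -> List[int]:
    """Return indexes of documents containing any term mapped to the concept."""
    terms = CONCEPTS.get(concept.lower(), {concept.lower()})
    return [i for i, text in enumerate(docs)
            if any(t in text.lower() for t in terms)]


if __name__ == "__main__":
    docs = [
        "Report on the subway accident of May 3.",
        "We regret the unfortunate incident on the platform.",
        "Victims described the disaster in detail.",
        "Quarterly ridership figures.",
    ]
    print(keyword_search(docs, "accident"))
    print(concept_search(docs, "accident"))
```

Substring matching is deliberately crude here (for instance, "event" would also match "prevent"), which is one reason production systems rely on statistical techniques instead.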

Concept Search
Actively Researched and Developed Technology
Year   Technique
1763   Bayes Theory (Bayesian Inference)
1948   Shannon Entropy (Shannon Information Theory)
1951   K-Nearest Neighbors
1988   Latent Semantic Indexing (LSI)
1999   Probabilistic LSI
2003   Latent Dirichlet Allocation

How does it affect you?
– Find more relevant documents
– Discover case facts faster
– Recommended by courts and the Sedona Conference

Action Plan
– Evaluate test cases
– Get review teams involved for real world analysis

Non-Linear Review

Did you know?
Legal Review Productivity

Increased Productivity from Non-Linear Review
Sources: Deloitte, Industry Sources

Non-Linear Review
Traditional Linear eDiscovery
– Grouped by source, custodian, date, etc.
– Like documents are scattered
– 10,000s of docs / case

Non-Linear Review
– Grouped by concept, near-duplication
– Easy navigation via visualization
– Less context switching
– Better sampling
– 1,000,000s of docs / case

Technologies
– Clustering
– Auto-Classification
– Concept Search
– Visualization
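
Grouping by near-duplication can be sketched with word shingles and Jaccard similarity. This is one common textbook technique, not necessarily what any particular review product uses; the threshold, shingle size, and greedy grouping strategy are assumptions made for illustration.

```python
from typing import List, Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Greedy grouping: each document joins the first group whose first
    member is similar enough, otherwise it starts a new group."""
    signatures = [shingles(d) for d in docs]
    groups: List[List[int]] = []
    for i, sig in enumerate(signatures):
        for group in groups:
            if jaccard(sig, signatures[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


if __name__ == "__main__":
    docs = [
        "please review the attached contract before friday",
        "please review the attached contract before monday",
        "lunch menu for the cafeteria this week",
    ]
    print(group_near_duplicates(docs))
```

With like documents grouped this way, a reviewer can code one group at a time instead of meeting the same text scattered across custodians and dates.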

Non-Linear Review
Key Statistics
• 72% of attorneys say review is the most expensive part of ED
• Review is up to 80% of ED costs
• Can save $187,500 on a 1.5 M doc case
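
The savings figure above is easier to compare across cases when expressed per document. The short calculation below only divides the two numbers on the slide; treating the result as a rate that carries over to other case sizes would be an additional assumption the slide does not make.

```python
def savings_per_document(total_savings: float, doc_count: int) -> float:
    """Average savings per document for a given case."""
    return total_savings / doc_count


if __name__ == "__main__":
    per_doc = savings_per_document(187_500, 1_500_000)
    print(f"${per_doc:.3f} saved per document")
```

That works out to 12.5 cents per document reviewed.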
[Chart: eDiscovery Review Productivity, Non-Linear Review vs. Traditional Linear Review, axis from 0 to 15,000]

How does it affect you?
• Faster review drives
– Lower costs
– Faster results
– Better results
– Successful ECA

Action Plan
– Evaluate current process and costs
– Justify investigation
– Review options

Parallel Search

Did you know?

Keyword Search is still advancing?

Term searches – in seconds to minutes
Source: Gartner

Parallel Search
Search 100,000 terms across billions of documents in seconds to minutes…
• Keywords
• User names
• Email addresses
• Patent numbers
• SSNs
• etc…

How does it affect you?
– Take the guesswork out of choosing keywords
– Run queries as simulations
– Supports wildcard search, proximity search, etc.

Action Plan
– Review complex searches
– See if parallel search can provide new insights that could not be economically performed before.
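
The idea of running many terms at once and comparing hit counts ("queries as simulations") can be sketched with a small inverted index. This is a single-machine illustration under stated assumptions: the function names and sample documents are hypothetical, and it says nothing about how a production engine distributes the work across billions of documents.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def build_index(docs: List[str]) -> Dict[str, Set[int]]:
    """Map each lower-cased token to the set of document ids containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in text.lower().split():
            # Strip surrounding punctuation so "late." matches "late".
            index[token.strip(".,;:!?\"'()")].add(doc_id)
    return index


def batch_search(index: Dict[str, Set[int]], terms: Iterable[str]) -> Dict[str, int]:
    """Hit count per term, so many candidate terms can be compared at once."""
    return {t: len(index.get(t.lower(), ())) for t in terms}


if __name__ == "__main__":
    docs = [
        "Contact jdoe@example.com about patent 1234567.",
        "The patent filing is late.",
        "Lunch at noon.",
    ]
    index = build_index(docs)
    print(batch_search(index, ["patent", "jdoe@example.com", "merger"]))
```

Because each lookup is a dictionary access, adding more candidate terms costs little once the index exists, which is what makes trying a long term list practical.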

End-to-End eDiscovery

Did you know?
eDiscovery Vendors

Offering Products and Services
Sources: Socha-Gelbmann 2009 E-Discovery Survey

End-to-End eDiscovery
Single / Multi-Vendor
[Diagram: EDRM stages with VOLUME and RELEVANCE axes, overlaid with separate tools: Typical Archive, Initial Search, Case Analytics, Review tool. Electronic Discovery Reference Model / © 2009 / v2.0 / edrm.net]

Timings shown:
– 3.5 days to index 30TB
– 3 days to index 1.1TB
– 4 days to export 2M docs

• 25% of vendors (150+) will disappear by 2011
• More vendors are entering eDiscovery than leaving

End-to-End eDiscovery
Single Platform
[Diagram: EDRM stages with VOLUME and RELEVANCE axes, covered by one platform. Electronic Discovery Reference Model / © 2009 / v2.0 / edrm.net]

• No data transfer between initial collection, review, and production
• No incompatibilities or inter-stage processing time delays

End-to-End eDiscovery
• True End-to-End eDiscovery is:
– Single platform
• Benefits
– Integrated Data Map & Legal Hold
– Single Collection
– Enterprise-wide search in review platform
– No intermediate Productions
• Bottom Line
– Cost and Time Savings

How does it affect you?
– Faster
– More Reliable
– Lower Cost
– Institutional Memory

Action Plan
– Evaluate current process and costs
– Justify investigation
– Review options

Cloud Computing

Did you know?
Cloud Computing

Market Forecast by 2011 & 2013!
Sources: Gartner, Merrill Lynch

Cloud Computing

Industry hype?
• Today:
– $56 billion
– 3% of enterprises using cloud
• By 2013:
– $150 billion market?
– 50+% of email archiving in the cloud?

Sources: Gartner, Forrester

The Good, The Bad, and The Solution …

Cloud Computing
The Good
1. Lower Cost
– Only pay for what you use

2. Scalability
– GRID / MapReduce

3. Increased Storage
– Virtualized file system

4. Flexibility
– Deploy new capability quickly

5. Automation
– Less manpower requirement
– Inside and outside counsel

6. More mobility

Cloud Computing
The Bad
1. Guaranteed service levels
– Some have no guarantees
– Data not under your control

2. Security & shared tenancy
– Provider capabilities vary
– Also may have no guarantees

3. Chain of custody
– Forensic examination?

4. Lock-in and pricing
– Ability to get data out?

5. Current adoption
– Only 3% of business users!

Cloud Computing
The Solution: Private Cloud Computing
• What is it?
– Cloud infrastructure deployed in-house
• Added Benefits
– Secure
– QoS / SLA

“IT Organizations Will Spend More Money on Private Cloud Computing Investments Than on Offerings From Public Cloud Providers Through 2012” (Gartner)

How does it affect you?
• Faster review drives
– Lower costs
– Better resource utilization
– Scales for one-time projects

Action Plan
– Check internal cloud strategy
– Run savings figures

8 Things You Can’t Afford to Ignore with eDiscovery
1. Early Case Assessment
2. Data Mapping
3. Investigative eDiscovery
4. Concept Search
5. Non-Linear Review
6. Parallel Search
7. End-to-End eDiscovery
8. Cloud Computing

ZL Technologies
• Experts in Total Information Governance
– Unstructured Content Archiving
– eDiscovery
– Compliance
– Secure Email
– Scalability & Low TCO via Private Clouds

More Information
• http://aiim.typepad.com/
• http://www.zlti.com/

Thank You
John Wang
jwang@zlti.com