
Week 1 Unit 1: Background

Background
Course overview: Text Analytics with SAP HANA Platform
Course content:
Week 1: Overview: Text on SAP HANA Platform
Week 2: Text Analysis: Entity Extraction
Week 3: Text Analysis: Relationship Extraction
Week 4: Text Mining
Week 5: Final Exam

System exercises (optional):
Some course units have hands-on system exercises
Amazon Web Services (AWS): Create an AWS account
SAP Cloud Appliance Library (CAL): Create an SAP CAL account
© 2015 SAP SE or an SAP affiliate company. All rights reserved.

Public

Background
SAP's text analysis technology

1997: Inxight spun off from PARC, a Xerox company
Finite-state technology for modeling natural language

2007: Inxight acquired by Business Objects
Integration of text analysis technology into BI applications

2008: Business Objects acquired by SAP
Text analysis technology continues to focus on BI applications

2012: First integration into SAP HANA
Foundation for full-text search, BI, and sentiment analysis applications

Today: Text analysis in SAP HANA
Foundation for virtually any type of unstructured textual data processing on the platform

Background
Why does SAP HANA provide text analysis functionality? (1/2)
Massive amounts of unstructured data are being captured in operational, CRM, maintenance, engineering, R&D, and call center systems, as well as social media, blogs, forums, e-mails, documents, ...

Enterprise Challenges

Companies are struggling to:
Search on unstructured text-related content
Extract meaningful, structured information from unstructured text
Combine unstructured with structured data
Leverage data in real time to gauge and guide their business strategy and solve critical problems

Background
Why does SAP HANA provide text analysis functionality? (2/2)
Capabilities
Native full-text and fuzzy search
In-database text analytics
Graphical modeling of search models
Info Access HTML5 UI toolkit and API for JavaScript

Benefits
Less data duplication and movement: leverage one infrastructure for analytical and search workloads
Extract salient information from unstructured textual data
Easy-to-use modeling tools: SAP HANA studio
Build search applications quickly: Info Access

Background
Which types of text processing capabilities are supported? (1/4)

Search
In addition to string matching, SAP HANA features full-text search, which works on content stored in tables or exposed via views. Just like searching on the Internet, full-text search finds terms irrespective of the sequence of characters and words.

Text analysis
Capabilities range from basic tokenization and stemming to more complex semantic analysis in the form of entity and fact extraction. Text analysis applies within individual documents and is the foundation for both full-text search and text mining.

Text mining
Text mining makes semantic determinations about the overall content of documents relative to other documents. Capabilities include key term identification and document categorization. Text mining is complementary to text analysis.
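
To make the link between text analysis and full-text search concrete, here is a minimal Python sketch of an order-independent search over tokenized and stemmed text. It is an illustration only and not how SAP HANA implements search: the `tokenize`, `stem`, `build_index`, and `search` helpers, the toy suffix rules, and the sample documents are all assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Toy suffix stripper; a stand-in for real linguistic stemming."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def build_index(docs):
    """Map each stemmed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[stem(token)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term, in any order."""
    terms = [stem(t) for t in tokenize(query)]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents.
docs = {
    1: "Customers reported delayed shipments in the call center.",
    2: "The shipment was delayed by the engineering review.",
    3: "Maintenance logs mention no delays.",
}
index = build_index(docs)
print(search(index, "shipments delayed"))
```

Because the query and the documents pass through the same tokenizer and stemmer, the word order and the exact word form in the query do not matter, which is the behavior the Search paragraph describes.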


Background
Which types of text processing capabilities are supported? (3/4)
Example: entities extracted by text analysis

Nicole Kidman, Aaron Eckhart and Rabbit Hole
By MEKADO MURPHY
Photo: Dan Steinberg/Associated Press. Aaron Eckhart and Nicole Kidman at the Toronto International Film Festival

TORONTO - Nicole Kidman returns to Toronto, this time in the role of both actor and producer for her latest project, Rabbit Hole. The film, in which she co-stars with Aaron Eckhart, looks at a suburban married couple who experience a tremendous loss.
Rabbit Hole is based on the play by David Lindsay-Abaire, who also adapted it for the screen. The play received a positive review when it premiered at Manhattan Theater Club in 2006 and caught the attention of Ms. Kidman and her producing partner, Per Saari, who decided to option it.
Ms. Kidman and Mr. Eckhart shared some thoughts about the new film and the process of working with their director, John Cameron Mitchell.

Extracted entities (in order of appearance):
Nicole Kidman            PERSON
Aaron Eckhart            PERSON
MEKADO MURPHY            PERSON
Dan Steinberg            PERSON
Associated Press         ORGANIZATION
TORONTO                  CITY
Nicole Kidman            PERSON
Toronto                  CITY
David Lindsay-Abaire     PERSON
Manhattan Theater Club   PLACE
2006                     YEAR
Ms. Kidman               PERSON
Per Saari                PERSON
Ms. Kidman               PERSON
Mr. Eckhart              PERSON
John Cameron Mitchell    PERSON
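
The kind of output shown in the entity example above can be approximated, very roughly, with a dictionary lookup plus a few pattern rules. The sketch below is an assumption-laden toy in Python, not SAP's extraction technology: the `GAZETTEER` entries, the `PATTERNS` rules, and the `extract_entities` function are hypothetical, and only the type labels (PERSON, CITY, PLACE, YEAR) are taken from the slide.

```python
import re

# Hypothetical gazetteer: surface form -> entity type (labels as on the slide).
GAZETTEER = {
    "Nicole Kidman": "PERSON",
    "Aaron Eckhart": "PERSON",
    "Toronto": "CITY",
    "Manhattan Theater Club": "PLACE",
}

# Hypothetical pattern rules: (compiled regex, entity type).
PATTERNS = [
    (re.compile(r"\b(?:Ms|Mr)\. [A-Z][a-z]+"), "PERSON"),
    (re.compile(r"\b(?:19|20)\d{2}\b"), "YEAR"),
]


def extract_entities(text):
    """Return (start, entity_text, entity_type) tuples sorted by position."""
    found = []
    for name, etype in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append((match.start(), match.group(), etype))
    for pattern, etype in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(), etype))
    return sorted(found)


sentence = (
    "The play premiered at Manhattan Theater Club in 2006 and caught "
    "the attention of Ms. Kidman."
)
entities = [(text, etype) for _, text, etype in extract_entities(sentence)]
for text, etype in entities:
    print(f"{text:<24} {etype}")
```

A production extractor relies on linguistic analysis rather than a fixed word list, which is why it can also label names it has never seen, such as Per Saari in the example.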


Background
Which types of text processing capabilities are supported? (4/4)
Example: categories and key terms identified by text mining

At Dresden Semperoper, a New Take on Tristan and Isolde
By ROSLYN SULCAS, February 17, 2015
DRESDEN, Germany - David Dawson's new Tristan and Isolde for the Dresden Semperoper Ballett raises interesting questions about the full-length story ballet, a genre much-loved by audiences and seldom tackled by choreographers today.
It's surprising that the Tristan and Isolde story, a medieval Celtic tale that has long figured in literature, film and in Wagner's opera of the same name, has been so infrequently used by ballet. Like Romeo and Juliet, it has instant attraction and union between lovers from opposing camps, with society and history against them, and tragic death at its end. You can imagine what John Cranko or Kenneth MacMillan, who brought the big, all-guns-blazing story ballets like Manon and Eugene Onegin to the world in the 1960s and 1970s (ballet box offices are still thanking them), might have done with it.
Category: Classical_Music
Key terms: Semperoper, Wagner, ballet, John Cranko, Royal Ballet School

Vodafone Turns Focus to Broadband, Seeking to Catch Up to Rivals
By MARK SCOTT, February 16, 2015
As consumers change the way they use their smartphones, surf the web, and watch television, Vodafone is finding itself in need of a face-lift. After years of focusing heavily on its cellphone business, Vodafone, based in Britain and the world's second-largest mobile operator behind China Mobile based on subscribers, is concentrating on high-speed broadband.
Once, Europeans were happy to pay for separate cellphone, cable and pay-TV services. Now, they prefer them bundled into a single package that streams content to any device - a smartphone, tablet or Internet-connected television.
Regional rivals like Orange of France and Deutsche Telekom of Germany have moved quickly to offer ...
Category: Telecommunications
Key terms: Vodafone, broadband, cellphone business, Orange, Deutsche Telekom, ...
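
Key term identification and document categorization both depend on comparing a document with other documents. One common way to sketch this, assumed here purely for illustration and not stated in the course material as SAP HANA's method, is TF-IDF weighting combined with nearest-neighbor categorization by cosine similarity. All function names and the two labelled reference texts below are hypothetical; only the category names come from the slide.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequencies(docs):
    """Count in how many documents each term occurs."""
    return Counter(term for doc in docs for term in set(tokenize(doc)))


def tfidf(text, df, n_docs):
    """Weight each term by its count times a smoothed inverse document frequency."""
    counts = Counter(tokenize(text))
    return {
        term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
        for term, count in counts.items()
    }


def key_terms(text, df, n_docs, k=3):
    """Return the k highest-weighted terms (ties broken alphabetically)."""
    weights = tfidf(text, df, n_docs)
    return sorted(weights, key=lambda term: (-weights[term], term))[:k]


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(text, labelled_docs, df, n_docs):
    """Return the category of the most similar labelled document."""
    query = tfidf(text, df, n_docs)
    best = max(
        labelled_docs,
        key=lambda item: cosine(query, tfidf(item[0], df, n_docs)),
    )
    return best[1]


# Hypothetical labelled reference documents.
labelled = [
    ("the ballet company staged a new opera and ballet production",
     "Classical_Music"),
    ("the mobile operator expanded its broadband and cellphone network",
     "Telecommunications"),
]
reference_texts = [text for text, _ in labelled]
df = document_frequencies(reference_texts)
n_docs = len(reference_texts)

category = categorize(
    "rivals offer broadband and cellphone bundles", labelled, df, n_docs
)
top_term = key_terms(reference_texts[0], df, n_docs, k=1)
print(category, top_term)
```

Terms that occur in every reference document (such as "the" and "and") receive the lowest weight, so the terms that distinguish a document from the rest of the collection dominate both the key term ranking and the similarity score.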


Background
SAP HANA architecture
[Architecture diagram. Recoverable labels:]
SAP HANA Apps: applications running natively on / against SAP HANA database
Apps on SAP HANA: applications on any platform using SQL via ODBC/JDBC
Interfaces: SQL, TA & TM APIs, OData
Extended Application Services (XS)
Studio: modeler, dev. workbench
Engines: Search, Text Mining
Preprocessor: linguistic processing, entity & fact extraction
Other labels: Model, Tables, Metadata, Store
SAP HANA

Thank you

Contact information:
open@sap.com

2015 SAP SE or an SAP affiliate company. All rights reserved.


No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate
company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices.
Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.
National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its
affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and
services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as
constituting an additional warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop
or release any functionality mentioned therein. This document, or any related presentation, and SAP SE's or its affiliated companies' strategy and possible future
developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time
for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place
undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
