When SQL Is Not Enough - There Comes Elasticsearch
Agenda
What
Why
Jump start
Analysis in depth
Side by side with SQL
Demo
What is ES
Powerful real-time search and analytics engine
Scaling
Cluster; Node; Shard (Primary/ Replica)
RESTful APIs
Send/Receive JSON
Basic queries via query string
http://localhost:9200/{indexName}/{type}/_search?q=searchstr&size=100
http://localhost:9200/{index1,index2}/{type}/_search?q=createdby:ivo
http://localhost:9200/_search?q=tag:spam
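The URL patterns above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name `search_url` and the type name `post` are illustrative assumptions, not part of any Elasticsearch client.

```python
from urllib.parse import quote, urlencode


def search_url(query, indices=(), doc_type=None, size=None,
               base="http://localhost:9200"):
    """Build a query-string search URL (hypothetical helper for illustration)."""
    parts = [base]
    if indices:
        # Several indices are joined with commas: /index1,index2/...
        parts.append(",".join(quote(name, safe="") for name in indices))
    elif doc_type:
        # A type without an explicit index is addressed through _all.
        parts.append("_all")
    if doc_type:
        parts.append(quote(doc_type, safe=""))
    parts.append("_search")

    params = {"q": query}
    if size is not None:
        params["size"] = size
    # urlencode percent-encodes reserved characters such as ':'.
    return "/".join(parts) + "?" + urlencode(params)


print(search_url("tag:spam"))
print(search_url("createdby:ivo", indices=["index1", "index2"], doc_type="post"))
print(search_url("searchstr", indices=["index1"], doc_type="post", size=100))
```

The colon in `field:value` is percent-encoded as `%3A`; the server decodes it back before parsing the query string.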
Query DSL
fuzzy query
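A Query DSL request is a JSON body sent to the `_search` endpoint. As a rough sketch, a fuzzy query body could be built like this; the helper name `fuzzy_query` is an assumption, and the field name `createdby` is borrowed from the query-string examples.

```python
import json


def fuzzy_query(field, value, fuzziness="AUTO", size=10):
    """Return a Query DSL body for a fuzzy match on one field (illustrative helper)."""
    return {
        "size": size,
        "query": {
            "fuzzy": {
                field: {"value": value, "fuzziness": fuzziness}
            }
        },
    }


# Tolerate one edit (insert, delete, substitute, transpose) in the search term.
body = fuzzy_query("createdby", "ivo", fuzziness=1)
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the request body to `http://localhost:9200/{indexName}/{type}/_search`.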
Any index-based search solution is far better than "LIKE"
How does SQL Full-text Index Work
Column-level language
Used by stemmers and tokenizers
Different columns for different languages
Language tags are respected (XML, binary)
Stop words
ALTER FULLTEXT STOPLIST ProductSL
ADD 'blah' LANGUAGE 1033;
Thesaurus files
(e.g. "song" -> "tune")
Inverted Index
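An inverted index maps each term to the set of documents that contain it, so a lookup avoids scanning every row the way "LIKE" does. The toy sketch below illustrates the idea in plain Python; it is not how SQL Server or Elasticsearch implement their indexes, and the sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search_all(index, query):
    """Return ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


docs = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "quick dog",
}
index = build_inverted_index(docs)
print(sorted(search_all(index, "brown the")))
print(sorted(search_all(index, "quick dog")))
```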
ES Analysis Process
Character filters
Simplify data (“&” -> “and”, “ü” -> “u”)
Tokenizers
Split data into words (terms, tokens)
Token filters
Lowercase
Remove words w/o relevance impact (“a”, “the”)
Synonyms added
Stemming
Reduce to root form (“dogs” -> “dog”)
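The stages above can be sketched as a small pipeline: character filters first, then a tokenizer, then token filters in the listed order. This is a simplified illustration in plain Python; the mappings, the stop list, and the plural-stripping "stemmer" are assumptions chosen to match the slide examples, not the actual Elasticsearch implementations.

```python
import re

CHAR_MAP = {"&": " and ", "ü": "u"}   # character filter mappings
STOP_WORDS = {"a", "the"}             # words with little relevance impact
SYNONYMS = {"song": ["tune"]}         # synonyms added next to the original term


def char_filter(text):
    """Simplify the raw text before it is tokenized."""
    for src, dst in CHAR_MAP.items():
        text = text.replace(src, dst)
    return text


def tokenize(text):
    """Split the text into word tokens."""
    return re.findall(r"\w+", text)


def stem(token):
    """Crude stemmer: strip a plural 's' (real stemmers are far more careful)."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text):
    """Run char filters, the tokenizer, then the token filters in order."""
    tokens = [t.lower() for t in tokenize(char_filter(text))]   # lowercase
    tokens = [t for t in tokens if t not in STOP_WORDS]         # stop words
    terms = []
    for token in tokens:
        expanded = [token] + SYNONYMS.get(token, [])            # synonyms
        terms.extend(stem(t) for t in expanded)                 # stemming
    return terms


print(analyze("The Dogs & a Song by Müller"))
```

Because synonyms are applied before stemming here, a plural such as "songs" would not pick up the synonym; filter order matters in a real analyzer as well.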
Analyzers
Full-text fields are analyzed into terms to build the inverted index
Configured when index is created
"Set the shape to semi-transparent by calling set_trans(5)"
Analyzer Type        Example
Whitespace           Set, the, shape, to, semi-transparent, by, calling, set_trans(5)
Standard (default)   set, the, shape, to, semi, transparent, by, calling, set_trans, 5
Simple               set, the, shape, to, semi, transparent, by, calling, set, trans
Stop                 set, shape, semi, transparent, calling, set, trans
Language (EN)        set, shape, semi, transpar, call, set_tran, 5
Pattern              "nonword": { "type": "pattern", "pattern": "[^\\w]+" }
Custom               Combines character filters [0:N], one tokenizer [1:1], and token filters [0:N]
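Since analyzers are configured when the index is created, a custom analyzer is usually declared in the index settings. The sketch below builds such a settings body as a Python dictionary. The analyzer name `my_analyzer` and the helper function are hypothetical; `html_strip`, `standard`, `lowercase`, and `stop` are built-in component names.

```python
import json


def custom_analyzer_settings(name, tokenizer, token_filters=(), char_filters=()):
    """Build index settings declaring one custom analyzer (illustrative helper)."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    name: {
                        "type": "custom",
                        "char_filter": list(char_filters),  # zero or more
                        "tokenizer": tokenizer,             # exactly one
                        "filter": list(token_filters),      # zero or more
                    }
                }
            }
        }
    }


body = custom_analyzer_settings(
    "my_analyzer",
    tokenizer="standard",
    token_filters=["lowercase", "stop"],
    char_filters=["html_strip"],
)
print(json.dumps(body, indent=2))
```

The JSON would be sent as the body of a `PUT http://localhost:9200/{indexName}` request when creating the index.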
Security Remarks
RAM is Important
Data structures reside in-memory
Performance and reliability depend on it
Be Aware
No built-in authentication!
Protecting private data is up to you
Prevent expensive requests (DoS)
Restrict access to http://localhost:9200
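One way to act on these remarks is to keep port 9200 private and let the application validate search parameters before forwarding them. The following is only a sketch of such a guard; the size cap and the rule against leading wildcards are assumed policies for illustration, not Elasticsearch features.

```python
MAX_SIZE = 100  # assumed upper bound on hits per request


def sanitize_search_params(params, max_size=MAX_SIZE):
    """Validate user-supplied search parameters before they reach Elasticsearch."""
    query = str(params.get("q", "")).strip()
    if not query:
        raise ValueError("empty query")

    # Leading wildcards force a scan over many terms, so reject them.
    for term in query.split():
        value = term.split(":", 1)[-1]
        if value.startswith(("*", "?")):
            raise ValueError("leading wildcard not allowed")

    try:
        size = int(params.get("size", 10))
    except (TypeError, ValueError):
        size = 10
    # Clamp the page size so a single request cannot ask for everything.
    size = max(0, min(size, max_size))
    return {"q": query, "size": size}


print(sanitize_search_params({"q": "tag:spam", "size": "100000"}))
```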
Side by Side
                     Elasticsearch    SQL Full-text Search
Performance          RAM mainly       Disk I/O mainly
Licensing            Open Source      Commercial
Platform             Any (Java)       Windows Only
Wildcards            Yes              Partly
FTS Syntax           Rich             Basic
Extensibility        Plugins          CLR or custom code
Scale Out            Yes              No
Relational Integrity No               Yes
Security             No               Yes
FT Search Setup      Manual           Wizard
Index Update         Manual           Auto
From SQL to Elasticsearch
Rivers (deprecated)
Logstash
Open source log management tool
Client libraries
.NET
Elasticsearch.Net
Nest
Also Java, JS, Perl, Python, Ruby, PHP
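Whichever tool moves the data, the rows end up as JSON documents. As a rough sketch, relational rows can be turned into a body for the bulk API, where each document is preceded by an action line. The example uses an in-memory SQLite table as a stand-in for the SQL source; the table, the index name `blog`, and the type name `post` are assumptions for illustration.

```python
import json
import sqlite3


def rows_to_bulk(rows, index, doc_type, id_field="id"):
    """Convert SQL rows to a newline-delimited bulk request body."""
    lines = []
    for row in rows:
        doc = dict(row)
        action = {"index": {"_index": index, "_type": doc_type,
                            "_id": str(doc.pop(id_field))}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(doc))     # document source line
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, createdby TEXT, tag TEXT)")
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                 [(1, "ivo", "spam"), (2, "ana", "news")])
rows = conn.execute("SELECT id, createdby, tag FROM posts ORDER BY id").fetchall()

bulk_body = rows_to_bulk(rows, index="blog", doc_type="post")
print(bulk_body, end="")
```

The body would be posted to `http://localhost:9200/_bulk`; the `_type` entry mirrors the `{type}` segment used in the search URLs earlier in the deck.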
Summary
Install Java
Download ES zip
Install
[ESHome]/bin> service install
Set ES service to start automatically
[ESHome]/bin> service manager
Open in browser http://localhost:9200/
Plugin Install
[ESHome]/bin> plugin -i elasticsearch/marvel/latest
Restart ES
Takeaways
Tools
Kopf: https://github.com/lmenezes/elasticsearch-kopf
Marvel: https://www.elastic.co/products/marvel
Curl: http://curl.haxx.se/download.html
JDBC Driver: http://www.java2s.com/Code/Jar/s/Downloadsqljdbc430jar.htm
Community
https://discuss.elastic.co
Getting Started
http://joelabrahamsson.com/elasticsearch-101/