Introduction to ElasticSearch
Ashish (@ashish_fagna)
Elasticsearch
real time, search and analytics engine
Lucene based
distributed
open-source
scales massively
RESTful API
JSON over HTTP
high availability
schema free
multi tenancy
> ./bin/elasticsearch
Usage
> curl -XGET localhost:9200/?pretty
verb: GET
node: localhost
HTTP port: 9200
path: /
query string: ?pretty
GET /
{
  "name" : "Exploding Man",
  "tagline" : "You Know, for Search",
  "ok" : true,
  "status" : 200,
  "version" : {
    "number" : "0.90.1",
    "snapshot_build" : false
  }
}
Where do we start?
With data
{
  "tweet": "I think #elasticsearch is AWESOME",
  "nick": "@ashish",
  "name": "Ashish K",
  "date": "2016-01-01",
  "rt": 5,
  "loc": {
    "lat": 13.4,
    "lon": 52.5
  }
}
Saving in ElasticSearch
PUT /index/type/id
where? PUT /myapp/type/id
what? PUT /myapp/tweet/id
which? PUT /myapp/tweet/1
PUT /myapp/tweet/1 -d '
{
  "tweet": "I think #elasticsearch is AWESOME",
  "nick": "@ashish",
  "name": "Ashish Kumar",
  "date": "2013-06-03",
  "rt": 5,
  "loc": {
    "lat": 77,
    "lon": 13.5
  }
}
'
{
  "_index": "myapp",
  "_type": "tweet",
  "_id": "1",
  "_version": 1,
  "ok": true
}
# 201 CREATED
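The same request can be put together from a script. The following is a minimal sketch, not a definitive client: it assumes a node at localhost:9200 (as in the curl example earlier) and the /index/type/id layout shown on these slides; the helper name build_index_request is made up for illustration. It only builds the request, so it runs without a server.

```python
import json
import urllib.request

# Assumed node address, taken from the curl example in the slides.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT /index/type/id request (hypothetical helper)."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


tweet = {
    "tweet": "I think #elasticsearch is AWESOME",
    "nick": "@ashish",
    "name": "Ashish Kumar",
    "date": "2013-06-03",
    "rt": 5,
    "loc": {"lat": 77, "lon": 13.5},
}

request = build_index_request("myapp", "tweet", 1, tweet)
print(request.get_method(), request.full_url)

# To actually send it to a running node:
# with urllib.request.urlopen(request) as response:
#     print(response.status, json.load(response))
```

Sending the same request a second time overwrites the document and increments _version.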
Get
GET /myapp/tweet/1
{
  "_index": "myapp",
  "_type": "tweet",
  "_id": "1",
  "_version": 1,
  "exists": true,
  "_source": { ...OUR TWEET... }
}
# 200 OK
Update
PUT /myapp/tweet/1 -d '
{
  "tweet": "I know #elasticsearch is AWESOME",
  "nick": "@ashish",
  "name": "Ashish Kumar",
  "date": "2016-01-01",
  "rt": 5,
  "loc": {
    "lat": 77,
    "lon": 13.5
  }
}
'
{
  "_index": "myapp",
  "_type": "tweet",
  "_id": "1",
  "_version": 2,
  "ok": true
}
# 200 OK
Delete
DELETE /myapp/tweet/1
{ "_index": "myapp", "_type": "tweet", "_id": "1", "_version": 3, "ok": true, "found": true } # 200 OK
Cheaper in bulk
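Many documents can go into one request through the _bulk endpoint, whose body is newline-delimited JSON: an action line followed by the document itself, with a final newline. The sketch below only builds such a body. The helper name build_bulk_body and the second tweet are invented for illustration, and the _type field in the action line assumes the typed layout used throughout these slides.

```python
import json


def build_bulk_body(index, doc_type, documents):
    """Return a newline-delimited _bulk body that indexes each document.

    `documents` maps a document id to its source (hypothetical helper).
    """
    lines = []
    for doc_id, document in documents.items():
        # Action line: what to do and where.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(document))
    # The bulk format expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


documents = {
    "1": {"tweet": "I think #elasticsearch is AWESOME", "nick": "@ashish"},
    "2": {"tweet": "Bulk indexing saves round trips", "nick": "@ashish"},  # made-up example
}

body = build_bulk_body("myapp", "tweet", documents)
print(body)
# The body would be sent with: POST /_bulk
```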
[Diagram: Client, any datastore, Elasticsearch. Elasticsearch mirrors an external DB]
[Diagram: Client, any datastore, Elasticsearch. Elasticsearch used standalone]
"Empty" Search
GET /_search
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 10,
    "successful" : 10,
    "failed" : 0
  },
  "hits" : {
    "total" : 14,
    "max_score" : 1.0,
    "hits" : [ { ... } ]
  }
}

"hits" : [
  {
    "_index" : "de",
    "_type" : "tweet",
    "_id" : "4",
    "_source" : { ... },
    "_score" : 1.0
  },
  ...
]
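A client usually cares about a few fields of this response: whether the search timed out, how many documents matched, and the hits themselves. The sketch below reads them from a response that has already been parsed into a dict. The sample is trimmed to a single hit and its _source content is made up; the function name summarize is hypothetical.

```python
def summarize(response):
    """Pull the commonly used fields out of a parsed search response."""
    hits = response["hits"]
    return {
        "timed_out": response["timed_out"],
        "total": hits["total"],
        "ids": [(hit["_index"], hit["_type"], hit["_id"]) for hit in hits["hits"]],
    }


# Sample shaped like the response above, trimmed to one hit for brevity.
sample_response = {
    "took": 2,
    "timed_out": False,
    "_shards": {"total": 10, "successful": 10, "failed": 0},
    "hits": {
        "total": 14,
        "max_score": 1.0,
        "hits": [
            {
                "_index": "de",
                "_type": "tweet",
                "_id": "4",
                "_source": {"tweet": "made-up example"},
                "_score": 1.0,
            }
        ],
    },
}

summary = summarize(sample_response)
print(summary)
```

Note that "total" counts every matching document, while the "hits" array holds only the current page of results.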
Multi-index Multi-type
GET /index/_search
GET /index1,index2/_search
GET /ind*/_search
GET /index/type/_search
GET /index/type1,type2/_search
GET /index/type*/_search
GET /_all/type*/_search
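The URL patterns above follow one rule: an optional comma-separated index list, an optional comma-separated type list, then _search, with _all standing in for the index when only types are given. A small sketch of that rule follows; the helper name search_path is made up for illustration.

```python
def search_path(indices=None, types=None):
    """Build a search path from optional index and type lists (hypothetical helper)."""
    parts = []
    if indices:
        parts.append(",".join(indices))
    elif types:
        # Types without an explicit index: search them across all indices.
        parts.append("_all")
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path())                                # /_search
print(search_path(["index1", "index2"]))            # /index1,index2/_search
print(search_path(["index"], ["type1", "type2"]))   # /index/type1,type2/_search
print(search_path(types=["type*"]))                 # /_all/type*/_search
```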
Pagination
size = number of results
from = results to skip
GET /_search?size=5&from=0
GET /_search?size=5&from=5
GET /_search?size=5&from=10
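The three requests above are pages 1, 2 and 3 at five results per page: from is simply (page - 1) * size. A minimal sketch of that arithmetic, with a made-up helper name page_query and pages numbered from 1 by assumption:

```python
from urllib.parse import urlencode


def page_query(page, size=5):
    """Return the query string for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return urlencode({"size": size, "from": (page - 1) * size})


for page in (1, 2, 3):
    print(f"GET /_search?{page_query(page)}")
```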
Search Lite
GET /_search?q=name:john
+tweet:foo +name:john +date:>2013-05-01
→ percent encoding →
?q=%2Btweet%3Afoo+%2Bname%3Ajohn+%2Bdate%3A%3E2013-05-01
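The encoding step does not have to be done by hand. As a small sketch, Python's standard library reproduces the encoded query string shown above; quote_plus turns spaces into + and escapes the literal +, : and > characters.

```python
from urllib.parse import quote_plus

query = "+tweet:foo +name:john +date:>2013-05-01"

# Spaces become "+", while the literal "+", ":" and ">" are percent-encoded.
encoded = quote_plus(query)
print(f"GET /_search?q={encoded}")
```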
GET /_search?q=mary
→ user named "Mary"
→ tweets by "Mary"
→ tweet mentioning "@mary"
GET /_search?q=_all:mary (the same query, with the default _all field made explicit)
_all field: the string values from all other fields
GET /_search?q=2016
GET /_search?q=2016-01-03
GET /_search?q=date:2016-01-03
GET /_search?q=date:2016
datatype differences?
check "mapping" (field definitions)
GET /myapp/tweet/_mapping
{
  "tweet" : {
    "properties" : {
      "tweet" : { "type" : "string" },
      "name" : { "type" : "string" },
      "nick" : { "type" : "string" },
      "date" : { "type" : "date" },
      "rt" : { "type" : "long" },
      "loc" : {
        "type" : "object",
        "properties" : {
          "lat" : { "type" : "double" },
          "lon" : { "type" : "double" }
        }
      }
    }
  }
}
date = type:date
_all = type:string
normalize terms
"Analysis" = tokenization + normalization
"Analyzers" = tokenizer + token filters
standard analyzer
"The Quick Brown Fox jumped over the Lazy Dog!"
→ standard tokenizer: The,Quick,Brown,Fox,jumped,over,the,Lazy,Dog
→ lowercase filter: the,quick,brown,fox,jumped,over,the,lazy,dog
→ stopwords filter: quick,brown,fox,jumped,over,lazy,dog
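The chain above (tokenizer, then lowercase filter, then stopwords filter) can be imitated in a few lines to see what each stage does. This is only an illustrative sketch, not how Elasticsearch or Lucene implements analysis: the regex tokenizer and the tiny stopword list are stand-ins chosen for this example.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "and", "is", "of", "the", "to"}


def tokenize(text):
    """Split text into word tokens, dropping punctuation (stand-in tokenizer)."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token filter: normalize every token to lower case."""
    return [token.lower() for token in tokens]


def remove_stopwords(tokens):
    """Token filter: drop tokens that appear in the stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


def analyze(text):
    """Run the tokenizer followed by both token filters, in order."""
    return remove_stopwords(lowercase(tokenize(text)))


print(analyze("The Quick Brown Fox jumped over the Lazy Dog!"))
# ['quick', 'brown', 'fox', 'jumped', 'over', 'lazy', 'dog']
```

The order of the filters matters here: lowercasing first is what lets the stopword filter catch the capitalized "The".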
english analyzer
→ standard tokenizer
→ lowercase filter: the,quick,brown,fox,jumped,over,the,lazy,dog
Analyzer
{
  "tweet" : {
    "properties" : {
      "tweet" : { "type" : "string", "analyzer" : "english" },
      "name" : { "type" : "string" },
      "nick" : { "type" : "string", "index" : "not_analyzed" },
      "date" : { "type" : "date" },
      "rt" : { "type" : "long" },
      "loc" : { "type" : "geo_point" }
    }
  }
}
Thank You
