Great Architect & Artist

43 posts tagged 'Elasticsearch'

2015/04/22 용비 18. Exploring Your Data - Executing Searches
2015/04/22 용비 17. Exploring Your Data - Introducing the Query Language
2015/04/22 용비 16. Exploring Your Data - The Search API
2015/04/22 용비 15. Exploring Your Data - Loading the Sample Dataset
2015/04/22 용비 14. Exploring Your Data - Sample Dataset

18. Exploring Your Data - Executing Searches

Elastic Search/01. Getting Started 2015/04/22 17:06 용비

[Executing Searches]

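So far we have looked at a few basic search parameters. Now let's dig deeper into the Query DSL. First, let's look at the document fields that are returned. By default, the full JSON document is returned as part of every search. This is referred to as the source (the _source field in the search hits). If you don't want the whole source document returned, you can request that only a few fields from within the source be returned.

The following example shows how to return only two fields from _source, account_number and balance, from the search.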

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"_source": ["account_number", "balance"]
}'

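Note that the example above simply reduces the _source field. The search still returns a single _source field, but it contains only account_number and balance.

If you have a SQL background, this is similar in concept to SQL's SELECT [field list] FROM….

Now let's move on to the query part. Earlier we saw the match_all query, which matches every document. Let's now introduce a new query called the match query, which can be thought of as a basic fielded search query (a search against a specific field or set of fields).

The following example returns the account whose account_number is 20.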
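To make the SELECT analogy concrete, here is a minimal Python sketch that mimics what _source filtering does to a single hit. It runs locally on a plain dict and does not talk to Elasticsearch. filter_source is a hypothetical helper name, and the sample values come from the bank dataset used in this series.

```python
def filter_source(source, fields):
    """Keep only the requested top-level fields, like "_source": [...]."""
    return {name: source[name] for name in fields if name in source}


# One account document, shortened to a few fields for readability.
account = {
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
}

filtered = filter_source(account, ["account_number", "balance"])
print(filtered)
```

The hit still carries one _source object. Only its contents shrink.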

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match": { "account_number": 20 } }
}'

The following example returns all accounts whose address contains the term "mill".

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match": { "address": "mill" } }
}'

The following example returns all accounts whose address contains the term "mill" or "lane".

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match": { "address": "mill lane" } }
}'

The following example returns all accounts whose address contains the phrase "mill lane".

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_phrase": { "address": "mill lane" } }
}'
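The difference between match and match_phrase is easy to miss, so here is a rough local sketch of the two behaviours in Python. It is not how Elasticsearch is implemented: tokens() just lowercases and splits on whitespace, whereas a real analyzer does more, and scoring is ignored. The two "Mill" addresses are made up for illustration.

```python
def tokens(text):
    # Simplification: lowercase and split on whitespace.
    return text.lower().split()


def match(field_value, query):
    """True if the field contains at least one of the query terms."""
    field_terms = set(tokens(field_value))
    return any(term in field_terms for term in tokens(query))


def match_phrase(field_value, query):
    """True if the query terms appear next to each other, in order."""
    field_terms = tokens(field_value)
    wanted = tokens(query)
    n = len(wanted)
    return any(
        field_terms[i:i + n] == wanted
        for i in range(len(field_terms) - n + 1)
    )


# "880 Holmes Lane" is from the sample data; the other two are invented.
addresses = ["880 Holmes Lane", "990 Mill Road", "198 Mill Lane"]

match_hits = [a for a in addresses if match(a, "mill lane")]
phrase_hits = [a for a in addresses if match_phrase(a, "mill lane")]
print(match_hits)
print(phrase_hits)
```

With these three addresses, match keeps all of them because each contains "mill" or "lane", while match_phrase keeps only the one where the two words are adjacent.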

Now let's look at the bool query. The bool query lets us compose smaller queries into bigger queries using boolean logic.

The following example composes two match queries and returns all accounts whose address contains both "mill" and "lane".

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
   "bool": {
   "must": [
   { "match": { "address": "mill" } },
   { "match": { "address": "lane" } }
   ]
   }
}
}'

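In the example above, the bool must clause lists the queries that must all be true for a document to be considered a match.

In contrast, the following example composes two match queries and returns all accounts whose address contains "mill" or "lane"; only one of the two match clauses needs to be true.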

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
   "bool": {
   "should": [
   { "match": { "address": "mill" } },
   { "match": { "address": "lane" } }
   ]
   }
}
}'

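In the example above, the bool should clause lists queries of which at least one must be true for a document to be considered a match.

(The result is the union of the documents matched by either clause.)

The following example composes two match queries and returns all accounts whose address contains neither "mill" nor "lane".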

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
   "bool": {
   "must_not": [
   { "match": { "address": "mill" } },
   { "match": { "address": "lane" } }
   ]
   }
}
}'

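In the example above, the bool must_not clause lists queries none of which may be true for a document to be considered a match.

The must, should, and must_not clauses can be combined within a single bool query.

Furthermore, bool queries can be nested inside any of these bool clauses to express complex multi-level boolean logic.

The following example returns all accounts of anyone who is 40 years old but does not live in ID (Idaho).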

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
   "bool": {
   "must": [
   { "match": { "age": "40" } }
   ],
   "must_not": [
   { "match": { "state": "ID" } }
   ]
   }
}
}'
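The four bool examples above can be replayed on a handful of local documents. The Python below is only a sketch of the boolean logic, not of Elasticsearch itself: term_match is a crude whole-word comparison, scoring is ignored, and we assume that when there is no must clause at least one should clause has to match. bool_match and term_match are hypothetical names, and the first document is invented so that every case has a hit.

```python
def term_match(doc, field, value):
    # Simplification: case-insensitive whole-word match on the field's text.
    return str(value).lower() in str(doc.get(field, "")).lower().split()


def bool_match(doc, must=(), should=(), must_not=()):
    """Evaluate (field, value) clauses the way the text describes bool."""
    if not all(term_match(doc, f, v) for f, v in must):
        return False
    if any(term_match(doc, f, v) for f, v in must_not):
        return False
    # Assumption: with no must clause, at least one should clause must match.
    if should and not must:
        return any(term_match(doc, f, v) for f, v in should)
    return True


docs = [
    {"age": 40, "state": "ID", "address": "12 Mill Lane"},  # invented
    {"age": 40, "state": "TN", "address": "671 Bristol Street"},
    {"age": 32, "state": "IL", "address": "880 Holmes Lane"},
]
mill_lane = [("address", "mill"), ("address", "lane")]

both = [d for d in docs if bool_match(d, must=mill_lane)]
either = [d for d in docs if bool_match(d, should=mill_lane)]
neither = [d for d in docs if bool_match(d, must_not=mill_lane)]
forty_not_id = [
    d for d in docs
    if bool_match(d, must=[("age", "40")], must_not=[("state", "ID")])
]
print(len(both), len(either), len(neither), len(forty_not_id))
```

Reading the four list comprehensions side by side shows how must acts as AND, should as OR, and must_not as NOT over the same pair of match clauses.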

TAG Elasticsearch, open source


17. Exploring Your Data - Introducing the Query Language

Elastic Search/01. Getting Started 2015/04/22 17:05 용비

[Introducing the Query Language]

Elasticsearch provides a JSON-style domain-specific language that you can use to execute queries. This is referred to as the Query DSL (http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html). The query language is quite comprehensive and can be intimidating at first glance, but the best way to actually learn it is to start with a few basic examples.

Going back to our last example, we executed this query:

{
"query": { "match_all": {} }
}

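Dissecting the query above, the query part tells us what our query definition is, and match_all is the type of query we want to run. The match_all query is simply a search for all documents in the specified index.

In addition to the query parameter, we can pass other parameters that influence the search results. For example, the following performs a match_all and returns only the first document.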

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"size": 1
}'

If size is not specified, it defaults to 10.

The following example performs a match_all and returns documents 11 through 20.

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"from": 10,
"size": 10
}'

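The from parameter (0-based) specifies which document index to start from, and the size parameter specifies how many documents to return starting at from. This is useful when implementing paging of search results. If from is not specified, it defaults to 0.

The following example performs a match_all, sorts the results by account balance in descending order, and returns the top 10 (default size) documents.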
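Because from is 0-based, turning a page number into a request body is a small piece of arithmetic. The sketch below shows one way to do it in Python. page_body is a hypothetical helper, and counting pages from 1 is our own convention, not something Elasticsearch defines.

```python
import json


def page_body(page, size=10):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # from is 0-based
        "size": size,
    }


# Page 2 with the default size covers documents 11 through 20.
print(json.dumps(page_body(2)))
```

The printed JSON can be passed to curl with -d exactly like the request bodies in the examples above.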

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"sort": { "balance": { "order": "desc" } }
}'
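For intuition, the effect of sorting by balance in descending order and keeping the first few results can be imitated on a local list. This is only a Python sketch of the outcome, not of how Elasticsearch sorts across shards. top_by_balance is a hypothetical name, and the three balances are taken from the sample documents in this series.

```python
def top_by_balance(accounts, size=10):
    """Sort by balance, highest first, and keep the first `size` results."""
    ordered = sorted(accounts, key=lambda a: a["balance"], reverse=True)
    return ordered[:size]


accounts = [
    {"account_number": 0, "balance": 16623},
    {"account_number": 1, "balance": 39225},
    {"account_number": 6, "balance": 5686},
]

top_two = [a["account_number"] for a in top_by_balance(accounts, size=2)]
print(top_two)
```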

TAG Elasticsearch, open source


16. Exploring Your Data - The Search API

Elastic Search/01. Getting Started 2015/04/22 17:04 용비

[The Search API]

Now let's start with some simple searches. There are two basic ways to run searches: one is by sending search parameters through the REST request URI, and the other is by sending them through the REST request body. The request body method is more expressive and lets you define your searches in a more readable JSON format. We'll try one example of the request URI method, but for the remainder of this tutorial we will use the request body method exclusively.

The REST API for search is accessible from the _search endpoint. The following example returns all documents in the bank index.

curl 'localhost:9200/bank/_search?q=*&pretty'

Let's first dissect the search call. We are searching in the bank index through the _search endpoint, and the q=* parameter instructs Elasticsearch to match all documents in the index. The pretty parameter, again, just tells Elasticsearch to return pretty-printed JSON results.

Part of the response looks like this:

curl 'localhost:9200/bank/_search?q=*&pretty'
{
"took" : 63,
"timed_out" : false,
"_shards" : {
   "total" : 5,
   "successful" : 5,
   "failed" : 0
},
"hits" : {
   "total" : 1000,
   "max_score" : 1.0,
   "hits" : [ {
   "_index" : "bank",
   "_type" : "account",
   "_id" : "1",
   "_score" : 1.0, "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
   }, {
   "_index" : "bank",
   "_type" : "account",
   "_id" : "6",
   "_score" : 1.0, "_source" : {"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
   }, {
   "_index" : "bank",
   "_type" : "account",

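In the response, we can see the following parts.

took : time in milliseconds that Elasticsearch took to execute the search

timed_out : whether the search timed out

_shards : how many shards were searched, with counts of successful and failed shards

hits : the search results

hits.total : total number of documents matching the search criteria

hits.hits : the actual array of search results (the first 10 documents by default)

_score, max_score : ignore these fields for now

Here is the same search using the request body method: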
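To see where each of these fields lives, here is a short Python sketch that parses a trimmed copy of the response and pulls the parts out by name. It assumes the response shape shown in this post (newer Elasticsearch versions report hits.total as an object rather than a plain number), and the single hit is shortened to two _source fields.

```python
import json

# Trimmed copy of the response above, with a single shortened hit.
raw = """
{
  "took": 63,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 1000,
    "max_score": 1.0,
    "hits": [
      {"_index": "bank", "_type": "account", "_id": "1", "_score": 1.0,
       "_source": {"account_number": 1, "balance": 39225}}
    ]
  }
}
"""

response = json.loads(raw)
summary = {
    "took_ms": response["took"],
    "timed_out": response["timed_out"],
    "failed_shards": response["_shards"]["failed"],
    "total_hits": response["hits"]["total"],
    "returned": len(response["hits"]["hits"]),
}
first_source = response["hits"]["hits"][0]["_source"]
print(summary)
print(first_source["balance"])
```

Note that total_hits counts every matching document, while returned only counts the hits actually included in this response.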

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} }
}'

The difference here is that instead of passing q=* in the URI, we POST a JSON-style query in the request body to the _search API. We'll discuss this JSON query in the next section.

The response looks like this:
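The two request styles can also be sketched in Python. This is only an illustration using the standard library's urllib.request, and nothing is sent unless you call urlopen yourself. The host and index come from the curl examples above. The Content-Type header is our addition: the original curl commands do not set it, but newer Elasticsearch releases expect it on requests with a body.

```python
import json
import urllib.request

BASE = "http://localhost:9200/bank/_search"

# Request URI form: the query travels in the q parameter.
uri_request = urllib.request.Request(BASE + "?q=*&pretty")

# Request body form: the query travels as JSON in a POST body.
body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
body_request = urllib.request.Request(
    BASE + "?pretty",
    data=body,
    headers={"Content-Type": "application/json"},  # not in the original curl
    method="POST",
)

print(uri_request.get_method(), uri_request.full_url)
print(body_request.get_method(), body_request.full_url)
# To actually send one, call urllib.request.urlopen(body_request)
# against a running node.
```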

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} }
}'
{
"took" : 26,
"timed_out" : false,
"_shards" : {
   "total" : 5,
   "successful" : 5,
   "failed" : 0
},
"hits" : {
   "total" : 1000,
   "max_score" : 1.0,
   "hits" : [ {
   "_index" : "bank",
   "_type" : "account",
   "_id" : "1",
   "_score" : 1.0, "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
   }, {
   "_index" : "bank",
   "_type" : "account",
   "_id" : "6",
   "_score" : 1.0, "_source" : {"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
   }, {
   "_index" : "bank",
   "_type" : "account",
   "_id" : "13",

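It is important to understand that once you get your search results back, Elasticsearch is completely done with the request: it does not maintain any server-side resources and does not keep a cursor open into your results. This is in stark contrast to many other platforms such as SQL, where you may initially get a partial subset of your query results and then have to keep going back to the server to fetch the rest through some kind of stateful server-side cursor.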

TAG Elasticsearch, open source


15. Exploring Your Data - Loading the Sample Dataset

Elastic Search/01. Getting Started 2015/04/22 17:03 용비

[Loading the Sample Dataset]

You can download the sample dataset (accounts.json) from https://github.com/bly2k/files/blob/master/accounts.zip?raw=true . Extract it into the current directory and load it into the cluster as follows.

curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @accounts.json
curl 'localhost:9200/_cat/indices?v'

The response looks like this:

curl 'localhost:9200/_cat/indices?v'
health index pri rep docs.count docs.deleted store.size pri.store.size
yellow bank 5 1 1000 0 424.4kb 424.4kb

This means that we have just successfully bulk indexed 1000 documents into the bank index (under the account type).
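If you want to check the document count from a script rather than by eye, the _cat output can be split into columns. The Python sketch below works on the exact text shown above and assumes every column has a value, since it splits on whitespace. parse_cat is a hypothetical helper name.

```python
cat_output = """\
health index pri rep docs.count docs.deleted store.size pri.store.size
yellow bank 5 1 1000 0 424.4kb 424.4kb
"""


def parse_cat(text):
    """Turn whitespace-separated _cat output into one dict per row."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


indices = parse_cat(cat_output)
bank = indices[0]
print(bank["index"], bank["health"], int(bank["docs.count"]))
```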

TAG Elasticsearch, open source


14. Exploring Your Data - Sample Dataset

Elastic Search/01. Getting Started 2015/04/22 17:02 용비

[Sample Dataset]

So far we have had a glimpse of the basics. Now let's work with a more realistic dataset. We have prepared a sample of fictitious JSON documents containing customer bank account information. Each document has the following schema:

{
   "account_number": 0,
   "balance": 16623,
   "firstname": "Bradshaw",
   "lastname": "Mckenzie",
   "age": 29,
   "gender": "F",
   "address": "244 Columbus Place",
   "employer": "Euron",
   "email": "bradshawmckenzie@euron.com",
   "city": "Hobucken",
   "state": "CO"
}

This data was generated using www.json-generator.com, so please ignore the actual values and semantics of the data, as they are all randomly generated.
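If you plan to process these documents in code, it can help to give the schema a typed shape. The following is one possible Python sketch using a dataclass. The Account class is our own illustration and not part of Elasticsearch or the dataset, with field types inferred from the sample document above.

```python
import json
from dataclasses import dataclass


@dataclass
class Account:
    account_number: int
    balance: int
    firstname: str
    lastname: str
    age: int
    gender: str
    address: str
    employer: str
    email: str
    city: str
    state: str


# The sample document shown above.
raw = """
{
  "account_number": 0, "balance": 16623, "firstname": "Bradshaw",
  "lastname": "Mckenzie", "age": 29, "gender": "F",
  "address": "244 Columbus Place", "employer": "Euron",
  "email": "bradshawmckenzie@euron.com", "city": "Hobucken", "state": "CO"
}
"""

account = Account(**json.loads(raw))
print(account.firstname, account.balance)
```

Constructing the dataclass this way fails with a TypeError if a document has a missing or unexpected field, which is a cheap sanity check on the data.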

TAG Elasticsearch, open source
