Commit adf6e79

Docs: Rewrote the filtered query docs to be clearer
Closes elastic#1688
1 parent 8ccfca3 commit adf6e79

1 file changed: 134 additions and 29 deletions

@@ -1,54 +1,159 @@
 [[query-dsl-filtered-query]]
 === Filtered Query
 
-A query that applies a filter to the results of another query. This
-query maps to Lucene `FilteredQuery`.
+The `filtered` query is used to combine another query with any
+<<query-dsl-filters,filter>>. Filters are usually faster than queries because:
+
+* they don't have to calculate the relevance `_score` for each document --
+  the answer is just a boolean ``Yes, the document matches the filter'' or
+  ``No, the document does not match the filter''.
+* the results from most filters can be cached in memory, making subsequent
+  executions faster.
+
+TIP: Exclude as many documents as you can with a filter, then query just the
+documents that remain.
 
 [source,js]
 --------------------------------------------------
 {
-    "filtered" : {
-        "query" : {
-            "term" : { "tag" : "wow" }
-        },
-        "filter" : {
-            "range" : {
-                "age" : { "from" : 10, "to" : 20 }
-            }
-        }
+  "filtered": {
+    "query": {
+      "match": { "tweet": "full text search" }
+    },
+    "filter": {
+      "range": { "created": { "gte": "now-1d/d" }}
+    }
+  }
+}
+--------------------------------------------------
+
+The `filtered` query can be used wherever a `query` is expected. For instance,
+to use the above example in a search request:
+
+[source,js]
+--------------------------------------------------
+curl -XGET localhost:9200/_search -d '
+{
+  "query": {
+    "filtered": { <1>
+      "query": {
+        "match": { "tweet": "full text search" }
+      },
+      "filter": {
+        "range": { "created": { "gte": "now-1d/d" }}
+      }
     }
+  }
 }
+'
 --------------------------------------------------
+<1> The `filtered` query is passed as the value of the `query`
+parameter in the search request.
 
-The filter object can hold only filter elements, not queries. Filters
-can be much faster compared to queries since they don't perform any
-scoring, especially when they are cached.
+==== Filtering without a query
+
+If a `query` is not specified, it defaults to the
+<<query-dsl-match-all-query,`match_all` query>>. This means that the
+`filtered` query can be used to wrap just a filter, so that it can be used
+wherever a query is expected.
+
+[source,js]
+--------------------------------------------------
+curl -XGET localhost:9200/_search -d '
+{
+  "query": {
+    "filtered": { <1>
+      "filter": {
+        "range": { "created": { "gte": "now-1d/d" }}
+      }
+    }
+  }
+}
+'
+--------------------------------------------------
+<1> No `query` has been specified, so this request applies just the filter,
+returning all documents created since yesterday.
+
+==== Multiple filters
+
+Multiple filters can be applied by wrapping them in a
+<<query-dsl-bool-filter,`bool` filter>>, for example:
+
+[source,js]
+--------------------------------------------------
+{
+  "filtered": {
+    "query": { "match": { "tweet": "full text search" }},
+    "filter": {
+      "bool": {
+        "must": { "range": { "created": { "gte": "now-1d/d" }}},
+        "should": [
+          { "term": { "featured": true }},
+          { "term": { "starred": true }}
+        ],
+        "must_not": { "term": { "deleted": false }}
+      }
+    }
+  }
+}
+--------------------------------------------------
+
+Similarly, multiple queries can be combined with a
+<<query-dsl-bool-query,`bool` query>>.
 
 ==== Filter strategy
 
-The filtered query allows to configure how to intersect the filter with the query:
+You can control how the filter and query are executed with the `strategy`
+parameter:
 
 [source,js]
 --------------------------------------------------
 {
     "filtered" : {
-        "query" : {
-            // query definition
-        },
-        "filter" : {
-            // filter definition
-        },
+        "query" : { ... },
+        "filter" : { ... },
         "strategy": "leap_frog"
     }
 }
 --------------------------------------------------
 
+IMPORTANT: This is an _expert-level_ setting. Most users can simply ignore it.
+
+The `strategy` parameter accepts the following options:
+
 [horizontal]
-`leap_frog_query_first`:: Look for the first document matching the query, and then alternatively advance the query and the filter to find common matches.
-`leap_frog_filter_first`:: Look for the first document matching the filter, and then alternatively advance the query and the filter to find common matches.
-`leap_frog`:: Same as `leap_frog_query_first`.
-`query_first`:: If the filter supports random access, then search for documents using the query, and then consult the filter to check whether there is a match. Otherwise fall back to `leap_frog_query_first`.
-`random_access_${threshold}`:: If the filter supports random access and if there is at least one matching document among the first `threshold` ones, then apply the filter first. Otherwise fall back to `leap_frog_query_first`. `${threshold}` must be greater than or equal to `1`.
-`random_access_always`:: Apply the filter first if it supports random access. Otherwise fall back to `leap_frog_query_first`.
-
-The default strategy is to use `query_first` on filters that are not advanceable such as geo filters and script filters, and `random_access_100` on other filters.
+`leap_frog_query_first`::
+
+Look for the first document matching the query, and then alternately
+advance the query and the filter to find common matches.
+
+`leap_frog_filter_first`::
+
+Look for the first document matching the filter, and then alternately
+advance the query and the filter to find common matches.
+
+`leap_frog`::
+
+Same as `leap_frog_query_first`.
+
+`query_first`::
+
+If the filter supports random access, then search for documents using the
+query, and then consult the filter to check whether there is a match.
+Otherwise fall back to `leap_frog_query_first`.
+
+`random_access_${threshold}`::
+
+If the filter supports random access and if there is at least one matching
+document among the first `threshold` ones, then apply the filter first.
+Otherwise fall back to `leap_frog_query_first`. `${threshold}` must be
+greater than or equal to `1`.
+
+`random_access_always`::
+
+Apply the filter first if it supports random access. Otherwise fall back
+to `leap_frog_query_first`.
+
+The default strategy is to use `query_first` on filters that are not
+advanceable, such as geo filters and script filters, and `random_access_100` on
+other filters.
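
To illustrate the leap-frog strategies described in the new text, the following Python sketch intersects two sorted lists of document IDs by alternately advancing whichever side is behind. It is a simplified model under stated assumptions, not Lucene's implementation: plain sorted lists stand in for the query and filter iterators, and the names `advance` and `leap_frog` are hypothetical.

```python
from bisect import bisect_left


def advance(doc_ids, target, start):
    """Return the index of the first doc ID >= target, searching from `start`."""
    return bisect_left(doc_ids, target, start)


def leap_frog(query_docs, filter_docs):
    """Intersect two sorted doc-ID lists by leap-frogging between them.

    Assumption: both inputs are sorted and free of duplicates, standing in
    for the doc-ID iterators of the query and the filter.
    """
    matches = []
    qi = fi = 0
    while qi < len(query_docs) and fi < len(filter_docs):
        q, f = query_docs[qi], filter_docs[fi]
        if q == f:
            # Both sides agree on this document: it is a common match.
            matches.append(q)
            qi += 1
            fi += 1
        elif q < f:
            # The query is behind: jump it forward to the filter's position.
            qi = advance(query_docs, f, qi)
        else:
            # The filter is behind: jump it forward to the query's position.
            fi = advance(filter_docs, q, fi)
    return matches


print(leap_frog([1, 3, 5, 7, 9, 12], [2, 3, 4, 9, 10, 12]))
```

In this model the query-first and filter-first variants return the same matches; they differ only in which side is positioned first, which can affect how much work each side does.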

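The request bodies shown in the diff can also be assembled programmatically. The sketch below builds them as plain Python dictionaries; `filtered_query` and `bool_filter` are hypothetical helpers, not part of any Elasticsearch client, and the date-math string is written without whitespace (`now-1d/d`) on the assumption that this is the form the parser accepts.

```python
import json


def bool_filter(must=None, should=None, must_not=None):
    """Wrap several filters in a `bool` filter, omitting unused clauses."""
    clauses = {"must": must, "should": should, "must_not": must_not}
    return {"bool": {name: value for name, value in clauses.items() if value}}


def filtered_query(filter_clause, query=None):
    """Build a `filtered` query body.

    When `query` is not given, the key is left out entirely, which the
    documentation above says makes Elasticsearch default to `match_all`.
    """
    body = {"filter": filter_clause}
    if query is not None:
        body["query"] = query
    return {"filtered": body}


recent = {"range": {"created": {"gte": "now-1d/d"}}}
match = {"match": {"tweet": "full text search"}}

# Query plus filter, wrapped as a search request body.
search_body = {"query": filtered_query(recent, query=match)}

# Filter only: no `query` key is emitted.
filter_only = {"query": filtered_query(recent)}

# Several filters combined through a `bool` filter.
combined = filtered_query(
    bool_filter(
        must=recent,
        should=[{"term": {"featured": True}}, {"term": {"starred": True}}],
    ),
    query=match,
)

print(json.dumps(combined, indent=2))
```

Each resulting dictionary serializes to JSON of the same shape as the corresponding example in the diff, so it could be passed as the body of a `_search` request.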