This repository was archived by the owner on Dec 13, 2023. It is now read-only.
Merged
6 changes: 6 additions & 0 deletions 3.7/aql/operations-search.md
@@ -305,6 +305,12 @@ The `SEARCH` operation accepts an options object with the following attributes:

- `collections` (array, _optional_): array of strings with collection names to
  restrict the search to certain source collections
- `conditionOptimization` (string, _optional_): controls how search criteria
  get optimized. Possible values:
  - `"auto"` (default): convert conditions to disjunctive normal form (DNF) and
    apply optimizations. Removes redundant or overlapping conditions, but can
    take quite some time even for a low number of nested conditions.
  - `"none"`: search the index without optimizing the conditions.

**Examples**

39 changes: 33 additions & 6 deletions 3.7/arangosearch-views.md
@@ -221,6 +221,9 @@ Note that the `primarySort` option is immutable: it can not be changed after
View creation. It is therefore not possible to configure it through the Web UI.
The View needs to be created via the HTTP or JavaScript API (arangosh) to set it.

The primary sort data is LZ4 compressed by default (`primarySortCompression` is
`"lz4"`). Set it to `"none"` on View creation to trade space for speed.
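
As a minimal sketch of setting this option through the JavaScript API, the properties could be assembled and passed to `db._createView()` in arangosh. The View name `articlesView` and the attribute `publishedAt` are hypothetical names chosen for illustration. The guard around the call only exists so that the snippet does nothing outside of arangosh.

```javascript
// View properties with uncompressed primary sort data.
// "articlesView" and "publishedAt" are hypothetical names for illustration.
const viewName = "articlesView";
const viewProperties = {
  primarySort: [
    { field: "publishedAt", direction: "desc" }
  ],
  // Immutable, so it has to be set on View creation ("lz4" is the default).
  primarySortCompression: "none"
};

// `db` only exists in arangosh; skip the call elsewhere (e.g. in Node.js).
if (typeof db !== "undefined") {
  db._createView(viewName, "arangosearch", viewProperties);
}
```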

View Definition/Modification
----------------------------

@@ -245,7 +248,7 @@ During view modification the following directives apply:
### Link Properties

- **analyzers** (_optional_; type: `array`; subtype: `string`; default: `[
'identity' ]`)
"identity" ]`)

A list of Analyzers, by name as defined via the [Analyzers](arangosearch-analyzers.html),
that should be applied to values of processed document attributes.
@@ -271,11 +274,11 @@ During view modification the following directives apply:
- **trackListPositions** (_optional_; type: `boolean`; default: `false`)

If set to `true`, then for array values track the value position in arrays.
E.g., when querying for the input `{ attr: [ 'valueX', 'valueY', 'valueZ' ]
}`, the user must specify: `doc.attr[1] == 'valueY'`. Otherwise, all values in
E.g., when querying for the input `{ attr: [ "valueX", "valueY", "valueZ" ] }`,
the user must specify: `doc.attr[1] == "valueY"`. Otherwise, all values in
an array are treated as equal alternatives. E.g., when querying for the input
`{ attr: [ 'valueX', 'valueY', 'valueZ' ] }`, the user must specify: `doc.attr
== 'valueY'`.
`{ attr: [ "valueX", "valueY", "valueZ" ] }`, the user must specify:
`doc.attr == "valueY"`.

- **storeValues** (_optional_; type: `string`; default: `"none"`)

@@ -294,7 +297,31 @@ During view modification the following directives apply:
iterates over all documents of a View, wants to sort them by attribute values
and the (left-most) fields to sort by as well as their sorting direction match
with the *primarySort* definition, then the `SORT` operation is optimized away.
Also see [Primary Sort Order](arangosearch-views.html#primary-sort-order)
Also see [Primary Sort Order](#primary-sort-order)

- **primarySortCompression** (_optional_; type: `string`; default: `"lz4"`; _immutable_)

Defines how to compress the primary sort data (introduced in v3.7.0).
ArangoDB v3.5 and v3.6 always compress the index using LZ4.

- `"lz4"` (default): use LZ4 fast compression.
- `"none"`: disable compression to trade space for speed.

- **storedValues** (_optional_; type: `array`; default: `[]`; _immutable_)

An array of objects to describe which document attributes to store in the
View index. It can then cover search queries, which means the data can be
taken from the index directly and accessing the storage engine can be avoided.

Each object is expected in the form
`{ field: [ "attr1", "attr2", ... "attrN" ], compression: "none" }`,
where the required `field` attribute is an array of strings with one or more
document attribute paths. The specified attributes are placed into a single
column of the index. A column with all fields that are involved in common
search queries is ideal for performance. However, the column should not
include too many unneeded fields. The optional `compression` attribute defines
the compression type used for the internal column-store, which can be `"lz4"`
(LZ4 fast compression, default) or `"none"` (no compression).
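
The shape described above can be sketched as a small check in JavaScript. `checkStoredValues()` is a hypothetical helper written for illustration only. It mirrors the rules stated in this section (a required `field` array of strings, an optional `compression` of `"lz4"` or `"none"`) and is not the validation the server performs.

```javascript
// Allowed values for the optional `compression` attribute, as described above.
const COMPRESSION_TYPES = ["lz4", "none"];

// Hypothetical helper: returns a list of problems found in a `storedValues`
// array. An empty list means the definition matches the described shape.
function checkStoredValues(storedValues) {
  if (!Array.isArray(storedValues)) {
    return ["storedValues must be an array"];
  }
  const problems = [];
  storedValues.forEach((column, i) => {
    if (column === null || typeof column !== "object") {
      problems.push(`entry ${i}: must be an object`);
      return;
    }
    const paths = column.field;
    const validPaths = Array.isArray(paths) && paths.length > 0 &&
      paths.every((p) => typeof p === "string" && p.length > 0);
    if (!validPaths) {
      problems.push(`entry ${i}: "field" must be a non-empty array of strings`);
    }
    if (column.compression !== undefined &&
        !COMPRESSION_TYPES.includes(column.compression)) {
      problems.push(`entry ${i}: "compression" must be "lz4" or "none"`);
    }
  });
  return problems;
}

// Two columns: one with the default compression, one uncompressed.
// The attribute names are illustrative.
const storedValues = [
  { field: ["title", "categories"] },
  { field: ["publishedAt"], compression: "none" }
];
console.log(checkStoredValues(storedValues)); // empty list: no problems found
```

Grouping `title` and `categories` into one column follows the advice above to keep attributes that are commonly queried together in the same column.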

An inverted index is the heart of ArangoSearch Views.
The index consists of several independent segments and the index **segment**
114 changes: 114 additions & 0 deletions 3.7/release-notes-new-features37.md
@@ -30,6 +30,77 @@ FOR doc IN viewName

See [ArangoSearch functions](aql/functions-arangosearch.html#like)

### Covering Indexes

It is possible to directly store the values of document attributes in View
indexes now via a new View property `storedValues` (not to be confused with
the existing `storeValues`).

View indexes may fully cover `SEARCH` queries for improved performance.
While late document materialization reduces the number of fetched documents,
this new optimization can avoid accessing the storage engine entirely.

```json
{
  "links": {
    "articles": {
      "fields": {
        "categories": {}
      }
    }
  },
  "primarySort": [
    { "field": "publishedAt", "direction": "desc" }
  ],
  "storedValues": [
    { "field": [ "title", "categories" ] }
  ],
  ...
}
```

In the above View definition, the document attribute *categories* is indexed for
searching, *publishedAt* is used as primary sort order and *title* as well as
*categories* are stored in the View using the new `storedValues` property.

```js
FOR doc IN articlesView
  SEARCH doc.categories == "recipes"
  SORT doc.publishedAt DESC
  RETURN {
    title: doc.title,
    date: doc.publishedAt,
    tags: doc.categories
  }
```

The query searches for articles which contain a certain tag in the *categories*
array and returns title, date and tags. All three values are stored in the View
(`publishedAt` via `primarySort` and the other two via `storedValues`), thus
no documents need to be fetched from the storage engine to answer the query.
This is shown in the execution plan as a comment to the *EnumerateViewNode*:
`/* view query without materialization */`

```js
Execution plan:
 Id   NodeType            Est.   Comment
  1   SingletonNode          1   * ROOT
  2   EnumerateViewNode      1     - FOR doc IN articlesView SEARCH (doc.`categories` == "recipes") SORT doc.`publishedAt` DESC LET #1 = doc.`publishedAt` LET #7 = doc.`categories` LET #5 = doc.`title`   /* view query without materialization */
  5   CalculationNode        1       - LET #3 = { "title" : #5, "date" : #1, "tags" : #7 }   /* simple expression */
  6   ReturnNode             1       - RETURN #3

Indexes used:
 none

Optimization rules applied:
 Id   RuleName
  1   move-calculations-up
  2   move-calculations-up-2
  3   handle-arangosearch-views
```
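
The reasoning behind this can be sketched in a few lines of JavaScript: a query needs no document lookups when every attribute it accesses is stored in the View. The helpers `storedAttributes()` and `missingAttributes()` are hypothetical and deliberately simplified. The actual optimizer rule may take further conditions into account.

```javascript
// View definition from the example above (only the relevant properties).
const viewDefinition = {
  primarySort: [{ field: "publishedAt", direction: "desc" }],
  storedValues: [{ field: ["title", "categories"] }]
};

// Hypothetical helper: collects every attribute path the View stores,
// either through `primarySort` or through a `storedValues` column.
function storedAttributes(view) {
  const stored = new Set();
  for (const sort of view.primarySort || []) {
    stored.add(sort.field);
  }
  for (const column of view.storedValues || []) {
    for (const path of column.field) {
      stored.add(path);
    }
  }
  return stored;
}

// Returns the accessed attributes that the View does not store.
// An empty result corresponds to the "without materialization" case.
function missingAttributes(view, accessed) {
  const stored = storedAttributes(view);
  return accessed.filter((path) => !stored.has(path));
}

// The example query accesses three attributes, all of which are stored.
console.log(missingAttributes(viewDefinition, ["title", "publishedAt", "categories"]));
// A query that also returned a (hypothetical) "author" attribute would
// still have to fetch documents, because "author" is not stored.
console.log(missingAttributes(viewDefinition, ["title", "author"]));
```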

See [ArangoSearch Views](arangosearch-views.html#view-properties).

### Stemming support for more languages

The Snowball library was updated to the latest version 2, adding stemming
@@ -71,6 +142,49 @@ db._query(`RETURN TOKENS("αυτοκινητουσ πρωταγωνιστούσ

Also see [Analyzers: Supported Languages](arangosearch-analyzers.html#supported-languages)

### Condition Optimization Option

The `SEARCH` operation in AQL accepts a new option `conditionOptimization` to
give users control over the search criteria optimization:

```js
FOR doc IN myView
  SEARCH doc.val > 10 AND doc.val > 5 /* more conditions */
  OPTIONS { conditionOptimization: "none" }
  RETURN doc
```

By default, all conditions get converted into disjunctive normal form (DNF).
Numerous optimizations can be applied, like removing redundant or overlapping
conditions (such as `doc.val > 5`, which is already implied by `doc.val > 10`).
However, converting to DNF and optimizing the conditions can take quite some
time even for a low number of nested conditions which produce dozens of
conjunctions / disjunctions. It can be faster to just search the index without
optimizations.
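
To see why the conversion can become expensive, consider a condition made of several `OR` groups joined by `AND`. The following sketch is a toy model and not ArangoDB's implementation. It distributes `AND` over `OR` and counts the resulting conjunctions. The function names and the string representation of conditions are assumptions made for illustration.

```javascript
// A condition is modeled as a conjunction (AND) of groups, where each group
// is a disjunction (OR) of atomic conditions given as strings.
// Example: (a OR b) AND (c OR d)  ->  [["a", "b"], ["c", "d"]]

// Distributes AND over OR, returning the DNF as a list of conjunctions.
function toDnf(groups) {
  return groups.reduce(
    (conjunctions, group) =>
      conjunctions.flatMap((conj) => group.map((atom) => [...conj, atom])),
    [[]]
  );
}

// Builds n groups of two alternatives each and counts the DNF conjunctions.
function dnfSize(n) {
  const groups = Array.from({ length: n }, (_, i) => [`a${i}`, `b${i}`]);
  return toDnf(groups).length;
}

for (const n of [2, 4, 6, 10]) {
  console.log(`${n} groups -> ${dnfSize(n)} conjunctions`);
}
// 2 groups -> 4 conjunctions
// 4 groups -> 16 conjunctions
// 6 groups -> 64 conjunctions
// 10 groups -> 1024 conjunctions
```

In this model the number of conjunctions doubles with every additional two-way `OR` group, which is why skipping the conversion with `conditionOptimization: "none"` can pay off for deeply nested conditions.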

See [SEARCH operation](aql/operations-search.html#search-options).

### Primary Sort Compression Option

There is a new option `primarySortCompression` which can be set on View
creation to disable the compression of the primary sort data:

```json
{
  "primarySort": [
    { "field": "date", "direction": "desc" },
    { "field": "title", "direction": "asc" }
  ],
  "primarySortCompression": "none",
  ...
}
```

It defaults to LZ4 compression (`"lz4"`), which was already used in ArangoDB
v3.5 and v3.6.

See [ArangoSearch Views](arangosearch-views.html#view-properties).

SatelliteGraphs
---------------
