@@ -145,7 +145,9 @@ The following Analyzer types are available:
145145- [`nearest_neighbors`](#nearest_neighbors): finds the nearest neighbors of the
146146 input text using a word embedding model (Enterprise Edition only)
147147- [`geojson`](#geojson): breaks up a GeoJSON object into a set of indexable tokens
148- - [`geopoint`](#geopoint): breaks up a JSON object describing a coordinate into
148+ - [`geo_s2`](#geo_s2): like `geojson` but offers more efficient formats for
149+ indexing geo-spatial data (Enterprise Edition only)
150+ - [`geopoint`](#geopoint): breaks up JSON data describing a coordinate pair into
149151 a set of indexable tokens
150152
151153The following table compares the Analyzers for ** text processing** :
@@ -878,7 +880,7 @@ attributes:
878880 ` type ` and ` properties ` attributes
879881
880882{% hint 'info' %}
881- - You cannot use Analyzers of the types ` geopoint ` and ` geojson ` in pipelines.
883+ - You cannot use Analyzers of the types ` geopoint ` , ` geojson ` , and ` geo_s2 ` in pipelines.
882884 These Analyzers require additional postprocessing and can only be applied to
883885 document fields directly.
884886- The output data type of an Analyzer needs to be compatible with the input
@@ -1231,18 +1233,26 @@ db._query(`LET str = "salt, oil"
12311233
12321234<small >Introduced in: v3.8.0</small >
12331235
1234- An Analyzer capable of breaking up a GeoJSON object into a set of
1235- indexable tokens for further usage with
1236- [ ArangoSearch Geo functions] ( aql/functions-arangosearch.html#geo-functions ) .
1236+ An Analyzer capable of breaking up a GeoJSON object or coordinate array in
1237+ ` [longitude, latitude] ` order into a set of indexable tokens for further usage
1238+ with [ ArangoSearch Geo functions] ( aql/functions-arangosearch.html#geo-functions ) .
12371239
1238- GeoJSON object example :
1240+ The Analyzer can be used for two different coordinate representations :
12391241
1240- ``` js
1241- {
1242- " type" : " Point" ,
1243- " coordinates" : [ - 73.97 , 40.78 ] // [ longitude, latitude ]
1244- }
1245- ```
1242+ - a GeoJSON feature like a Point or Polygon, using a JSON object like the following:
1243+
1244+ ``` js
1245+ {
1246+ "type": "Point",
1247+ "coordinates": [-73.97, 40.78] // [longitude, latitude]
1248+ }
1249+ ```
1250+
1251+ - a coordinate array with two numbers as elements in the following format:
1252+
1253+ ``` js
1254+ [-73.97, 40.78] // [longitude, latitude]
1255+ ```
12461256
12471257The * properties* allowed for this Analyzer are an object with the following
12481258attributes:
@@ -1264,7 +1274,7 @@ Create a collection with GeoJSON Points stored in an attribute `location`, a
12641274` geojson ` Analyzer with default properties, and a View using the Analyzer.
12651275Then query for locations that are within a 3 kilometer radius of a given point
12661276and return the matched documents, including the calculated distance in meters.
1267- The stored coordinates and the ` GEO_POINT() ` arguments are expected in
1277+ The stored coordinate pairs and the ` GEO_POINT() ` arguments are expected in
12681278longitude, latitude order:
12691279
12701280{% arangoshexample examplevar="examplevar" script="script" result="result" %}
@@ -1302,19 +1312,143 @@ longitude, latitude order:
13021312{% endarangoshexample %}
13031313{% include arangoshexample.html id=examplevar script=script result=result %}
13041314
1315+ ### ` geo_s2 `
1316+
1317+ <small >Introduced in: v3.10.5</small >
1318+
1319+ {% include hint-ee-arangograph.md feature="The ` geo_s2 ` Analyzer" %}
1320+
1321+ An Analyzer capable of breaking up a GeoJSON object or coordinate array in
1322+ ` [longitude, latitude] ` order into a set of indexable tokens for further usage
1323+ with [ ArangoSearch Geo functions] ( aql/functions-arangosearch.html#geo-functions ) .
1324+
1325+ The Analyzer is similar to the ` geojson ` Analyzer, but it internally uses a
1326+ format for storing the geo-spatial data that is more efficient. You can choose
1327+ between different formats to make a tradeoff between the size on disk, the
1328+ precision, and query performance.
1329+
1330+ The Analyzer can be used for two different coordinate representations:
1331+
1332+ - a GeoJSON feature like a Point or Polygon, using a JSON object like the following:
1333+
1334+ ``` js
1335+ {
1336+ "type": "Point",
1337+ "coordinates": [-73.97, 40.78] // [longitude, latitude]
1338+ }
1339+ ```
1340+
1341+ - a coordinate array with two numbers as elements in the following format:
1342+
1343+ ``` js
1344+ [-73.97, 40.78] // [longitude, latitude]
1345+ ```
1346+
1347+ The * properties* allowed for this Analyzer are an object with the following
1348+ attributes:
1349+
1350+ - ` format ` (string, _ optional_ ): the internal binary representation to use for
1351+ storing the geo-spatial data in an index
1352+ - ` "latLngDouble" ` (default): store each latitude and longitude value as an
1353+ 8-byte floating-point value (16 bytes per coordinate pair). This format preserves
1354+ numeric values exactly and is more compact than the VelocyPack format used
1355+ by the ` geojson ` Analyzer.
1356+ - `"latLngInt"`: store each latitude and longitude value as a 4-byte integer
1357+ value (8 bytes per coordinate pair). This is the most compact format but the
1358+ precision is limited to approximately 1 to 10 centimeters.
1359+ - ` "s2Point" ` : store each longitude-latitude pair in the native format of
1360+ Google S2 which is used for geo-spatial calculations (24 bytes per coordinate pair).
1361+ This is not a particularly compact format, but it reduces the number of
1362+ computations necessary when you execute geo-spatial queries.
1363+ This format preserves numeric values exactly.
1364+ - ` type ` (string, _ optional_ ):
1365+ - `"shape"` (default): index all GeoJSON geometry types (Point, Polygon, etc.)
1366+ - ` "centroid" ` : compute and only index the centroid of the input geometry
1367+ - ` "point" ` : only index GeoJSON objects of type Point, ignore all other
1368+ geometry types
1369+ - ` options ` (object, _ optional_ ): options for fine-tuning geo queries.
1370+ These options should generally remain unchanged
1371+ - ` maxCells ` (number, _ optional_ ): maximum number of S2 cells (default: 20)
1372+ - ` minLevel ` (number, _ optional_ ): the least precise S2 level (default: 4)
1373+ - ` maxLevel ` (number, _ optional_ ): the most precise S2 level (default: 23)
1374+
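To make the size tradeoff between the formats concrete, the following is a minimal sketch in plain JavaScript that multiplies the per-pair sizes stated above by a number of coordinate pairs. The helper `estimateCoordinateBytes()` and the `exampleProperties` variable are hypothetical names used only for illustration. They are not part of ArangoDB, and the estimate ignores any additional index overhead.

```js
// Bytes per coordinate pair, taken from the format descriptions above.
const BYTES_PER_PAIR = {
  latLngDouble: 16, // two 8-byte floating-point values
  latLngInt: 8,     // two 4-byte integer values
  s2Point: 24       // native Google S2 point format
};

// Hypothetical helper (not an ArangoDB API): estimates the raw size of the
// stored coordinate data only, without any other index overhead.
function estimateCoordinateBytes(format, pairCount) {
  if (!Object.prototype.hasOwnProperty.call(BYTES_PER_PAIR, format)) {
    throw new Error("Unknown format: " + format);
  }
  return BYTES_PER_PAIR[format] * pairCount;
}

// Example properties object that only uses attributes and values listed above.
const exampleProperties = {
  format: "latLngInt",
  type: "point",
  options: { maxCells: 20, minLevel: 4, maxLevel: 23 }
};

for (const format of Object.keys(BYTES_PER_PAIR)) {
  console.log(format, estimateCoordinateBytes(format, 1000000), "bytes");
}
```

For one million coordinate pairs, the three formats differ by a factor of up to three in raw coordinate data, which is the tradeoff to weigh against precision and query performance.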
1375+ ** Examples**
1376+
1377+ Create a collection with GeoJSON Points stored in an attribute ` location ` , a
1378+ ` geo_s2 ` Analyzer with the ` latLngInt ` format, and a View using the Analyzer.
1379+ Then query for locations that are within a 2 kilometer radius of a given point
1380+ and return the matched documents, including the calculated distance in meters.
1381+ The stored coordinate pairs and the ` GEO_POINT() ` arguments are expected in
1382+ longitude, latitude order:
1383+
1384+ {% arangoshexample examplevar="examplevar" script="script" result="result" %}
1385+ @startDocuBlockInline analyzerGeoS2
1386+ @EXAMPLE_ARANGOSH_OUTPUT{analyzerGeoS2}
1387+ var analyzers = require("@arangodb/analyzers");
1388+ var a = analyzers.save("geo_efficient", "geo_s2", { format: "latLngInt" }, ["frequency", "norm", "position"]);
1389+ db._create("geo");
1390+ | db.geo.save([
1391+ |   { location: { type: "Point", coordinates: [6.937, 50.932] } },
1392+ |   { location: { type: "Point", coordinates: [6.956, 50.941] } },
1393+ |   { location: { type: "Point", coordinates: [6.962, 50.932] } },
1394+ ]);
1395+ | db._createView("geo_view", "arangosearch", {
1396+ |   links: {
1397+ |     geo: {
1398+ |       fields: {
1399+ |         location: {
1400+ |           analyzers: ["geo_efficient"]
1401+ |         }
1402+ |       }
1403+ |     }
1404+ |   }
1405+ });
1406+ ~ assert(db._query(`FOR d IN geo_view COLLECT WITH COUNT INTO c RETURN c`).toArray()[0] === 3);
1407+ | db._query(`LET point = GEO_POINT(6.93, 50.94)
1408+ |   FOR doc IN geo_view
1409+ |     SEARCH ANALYZER(GEO_DISTANCE(doc.location, point) < 2000, "geo_efficient")
1410+     RETURN MERGE(doc, { distance: GEO_DISTANCE(doc.location, point) })`).toArray();
1411+ ~ db._dropView("geo_view");
1412+ ~ analyzers.remove("geo_efficient", true);
1413+ ~ db._drop("geo");
1414+ @END_EXAMPLE_ARANGOSH_OUTPUT
1415+ @endDocuBlock analyzerGeoS2
1416+ {% endarangoshexample %}
1417+ {% include arangoshexample.html id=examplevar script=script result=result %}
1418+
1419+ The calculated distance between the reference point and the point stored in
1420+ the second document is ` 1825.1307… ` . If you change the search condition to
1421+ ` < 1825.1303 ` , then the document is still returned despite the distance being
1422+ higher than this value. This is due to the precision limitations of the
1423+ ` latLngInt ` format. The returned distance is unaffected because it is calculated
1424+ independently of the Analyzer. If you use either of the other two formats, which
1425+ preserve the exact coordinate values, then the document is filtered out as
1426+ expected.
1427+
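The effect of limited coordinate precision on a distance threshold can be sketched in plain JavaScript. This sketch rests on assumptions: the actual `latLngInt` encoding is not described here, so the linear mapping of each coordinate range onto 32-bit integer steps in `quantize()` is hypothetical. The haversine formula with a mean Earth radius is also only an approximation and may differ slightly from what `GEO_DISTANCE()` computes.

```js
// Mean Earth radius in meters (an assumption of this sketch).
const EARTH_RADIUS_M = 6371008.8;

function toRadians(degrees) {
  return degrees * Math.PI / 180;
}

// Haversine great-circle distance between two [longitude, latitude] pairs.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) *
    Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Hypothetical fixed-point encoding: map [min, max] linearly onto the
// steps of a 32-bit integer and decode the rounded value again.
const STEPS = 2 ** 32 - 1;
function quantize(value, min, max) {
  const q = Math.round((value - min) / (max - min) * STEPS);
  return min + q / STEPS * (max - min);
}

function quantizePair([lon, lat]) {
  return [quantize(lon, -180, 180), quantize(lat, -90, 90)];
}

const reference = [6.93, 50.94];  // point used in the query above
const stored = [6.956, 50.941];   // second document in the example

const exact = haversineMeters(reference, stored);
const approximate = haversineMeters(reference, quantizePair(stored));

// The two distances differ by a small amount, so a threshold that lies
// between them gives a different result for the quantized coordinates.
console.log(exact, approximate, Math.abs(exact - approximate));
```

Under these assumptions, the rounding error stays well below the 1 to 10 centimeter range stated for `latLngInt`, but it is enough to change the outcome of a comparison when the threshold is within a fraction of a millimeter of the true distance.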
13051428### ` geopoint `
13061429
13071430<small >Introduced in: v3.8.0</small >
13081431
1309- An Analyzer capable of breaking up JSON object describing a coordinate into a
1310- set of indexable tokens for further usage with
1432+ An Analyzer capable of breaking up a coordinate array in ` [latitude, longitude] `
1433+ order or a JSON object describing a coordinate pair using two separate attributes
1434+ into a set of indexable tokens for further usage with
13111435[ ArangoSearch Geo functions] ( aql/functions-arangosearch.html#geo-functions ) .
13121436
13131437The Analyzer can be used for two different coordinate representations:
1314- - an array with two numbers as elements in the format
1315- ` [<latitude>, <longitude>] ` , e.g. ` [40.78, -73.97] ` .
1316- - two separate number attributes, one for latitude and one for
1317- longitude, e.g. ` { location: { lat: 40.78, lon: -73.97 } } ` .
1438+
1439+ - an array with two numbers as elements in the following format:
1440+
1441+ ``` js
1442+ [40.78, -73.97] // [latitude, longitude]
1443+ ```
1444+
1445+ - two separate numeric attributes, one for latitude and one for longitude, as
1446+ shown below:
1447+
1448+ ``` js
1449+ { "location": { "lat": 40.78, "lon": -73.97 } }
1450+ ```
1451+
13181452 The attributes cannot be at the top level of the document, but must be nested
13191453 like in the example, so that the Analyzer can be defined for the field
13201454 ` location ` with the Analyzer properties
@@ -1337,10 +1471,10 @@ attributes:
13371471
13381472** Examples**
13391473
1340- Create a collection with coordinates pairs stored in an attribute ` location ` ,
1474+ Create a collection with coordinate pairs stored in an attribute ` location ` ,
13411475a ` geopoint ` Analyzer with default properties, and a View using the Analyzer.
13421476Then query for locations that are within a 3 kilometer radius of a given point.
1343- The stored coordinates are in latitude, longitude order, but ` GEO_POINT() ` and
1477+ The stored coordinate pairs are in latitude, longitude order, but ` GEO_POINT() ` and
13441478` GEO_DISTANCE() ` expect longitude, latitude order:
13451479
13461480{% arangoshexample examplevar="examplevar" script="script" result="result" %}
@@ -1378,7 +1512,7 @@ The stored coordinates are in latitude, longitude order, but `GEO_POINT()` and
13781512{% endarangoshexample %}
13791513{% include arangoshexample.html id=examplevar script=script result=result %}
13801514
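Because the `geopoint` Analyzer expects latitude, longitude order while `GEO_POINT()` and `GEO_DISTANCE()` expect longitude, latitude order, it can help to normalize values in application code before building a query. The following is a minimal sketch in plain JavaScript. `toLonLat()` is a hypothetical helper, not an ArangoDB function, and the default attribute names `lat` and `lon` are assumptions taken from the object example shown earlier.

```js
// Hypothetical helper: converts either representation accepted by the
// geopoint Analyzer into a [longitude, latitude] array.
function toLonLat(location, latKey = "lat", lonKey = "lon") {
  if (Array.isArray(location)) {
    // Array form is stored as [latitude, longitude], so swap the elements.
    const [lat, lon] = location;
    return [lon, lat];
  }
  // Object form with two separate numeric attributes.
  return [location[lonKey], location[latKey]];
}

console.log(toLonLat([40.78, -73.97]));
console.log(toLonLat({ lat: 40.78, lon: -73.97 }));
// Custom attribute names, for example lat/lng, can be passed explicitly.
console.log(toLonLat({ lat: 40.78, lng: -73.97 }, "lat", "lng"));
```

All three calls yield the same pair in longitude, latitude order, which is the order the geo functions expect for their arguments.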
1381- Create a collection with coordinates stored in an attribute ` location ` as
1515+ Create a collection with coordinate pairs stored in an attribute ` location ` as
13821516separate nested attributes ` lat ` and ` lng ` , a ` geopoint ` Analyzer that
13831517specifies the attribute paths to the latitude and longitude attributes
13841518(relative to ` location ` attribute), and a View using the Analyzer.