Elasticsearch
GitLab Enterprise Edition has Elasticsearch integration. In this document we explain how to set this up in your development environment.
Installation
Enable Elasticsearch in the GDK
The default version of Elasticsearch is automatically downloaded into your GDK root under /elasticsearch.
Elasticsearch is deployed with a minimum (Xms) and maximum (Xmx) JVM heap size of 2 GB. The options are defined in <gdk root directory>/elasticsearch/config/jvm.options.d/custom.options.
To enable the service and make it run as part of gdk start:
- Run gdk config set elasticsearch.enabled true
- Run gdk reconfigure
Using other search engines
Other versions of Elasticsearch
The default Elasticsearch version is defined in lib/gdk/config.rb.
For example, to use 7.5.2:
Add the version and [linux|mac]_checksum keys to your gdk.yml:
elasticsearch:
  enabled: true
  version: 7.5.2
Install the selected version:
make elasticsearch-setup
OpenSearch
While GDK does not support installing OpenSearch, it can be easily run with Docker:
docker run --rm --name opensearch1.2.4 -p 9201:9200 -e "plugins.security.disabled=true" -e "discovery.type=single-node" opensearchproject/opensearch:1.2.4
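A freshly started container can take a little while before it accepts requests. The following is a minimal sketch of a polling helper, not part of GDK: the function name wait_for_search and its default attempt count and delay are assumptions chosen for illustration. It relies only on curl and on port 9201, which the docker command above maps to the container.

```shell
# Poll a search endpoint until it answers or the attempts run out.
# Hypothetical helper: the name and defaults are illustrative, not part of GDK.
# Arguments: base URL, max attempts (default 30), seconds between attempts (default 2).
wait_for_search() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failure; -s: silent; -o /dev/null: discard the body.
    if curl -fs -o /dev/null "$url"; then
      echo "search service is up at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "search service did not respond at $url" >&2
  return 1
}
```

For the container above, this would be called as wait_for_search http://localhost:9201. The same helper should work for the GDK-managed Elasticsearch on http://localhost:9200.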
Setup
Warning
Indexing the instance will not run if SaaS mode is enabled.
Go to Admin Area > Subscription and ensure you have a license installed as this is required for Elasticsearch.
Start Elasticsearch by either running elasticsearch in a new terminal, or by starting the GDK service: gdk start elasticsearch
Go to Admin Area > Settings > Search > Advanced Search.
To enable indexing, select the Turn on indexing for advanced search checkbox.
To start indexing, select the Index the instance checkbox. You can also use the gitlab:elastic:index Rake task: cd gitlab && bundle exec rake gitlab:elastic:index
To check the indexing progress, go to Admin Area > Settings > Search > Advanced search indexing status. You can also use the gitlab:elastic:info Rake task: cd gitlab && bundle exec rake gitlab:elastic:info
Go to Admin Area > Settings > Search > Advanced Search and check the Search with advanced search checkbox. You can do this before indexing is complete, or wait until indexing is complete. Search results are incomplete until initial indexing is finished.
Info
The time it takes to finish indexing the instance depends on the amount of project data and total size of the repository data. The indexing process ensures data integrity by pausing indexing before deleting and recreating all indices. Indexing is un-paused before queueing all data in the instance for indexing.
Tips and Tricks
Indexing logs
All logs related to advanced search indexing are found in log/elasticsearch.log. You can use jq to format the logs for readability:
tail -f log/elasticsearch.log | jq
Query log
To log every query against Elasticsearch, you can lower the slow log thresholds so that every query counts as slow. To do this, send a request to Elasticsearch that changes the settings for the index you want to monitor. The following example turns on the slow log for the gitlab-development index:
curl -H 'Content-Type: application/json' -XPUT "http://localhost:9200/gitlab-development/_settings" -d '{ "index.indexing.slowlog.threshold.index.debug" : "0s", "index.search.slowlog.threshold.fetch.debug" : "0s", "index.search.slowlog.threshold.query.debug" : "0s" }'
After this, you can see every query by tailing the logs from your GDK root:
tail -f elasticsearch/logs/elasticsearch_index_search_slowlog.log
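Logging every query makes the slow log grow quickly, so it can be useful to switch it off again when you are done. The following is a minimal sketch with two hypothetical helpers (slowlog_settings and set_slowlog are illustrative names, not part of GDK or Elasticsearch). It builds the same request body as the example above for any threshold, and assumes that a threshold of -1 disables the corresponding slow log, which is the Elasticsearch default.

```shell
# Print the JSON body that sets all three slow log thresholds to one value.
# "0s" logs every operation; "-1" is assumed to turn the threshold off again.
slowlog_settings() {
  local threshold="$1"
  printf '{ "index.indexing.slowlog.threshold.index.debug" : "%s", "index.search.slowlog.threshold.fetch.debug" : "%s", "index.search.slowlog.threshold.query.debug" : "%s" }\n' \
    "$threshold" "$threshold" "$threshold"
}

# Apply a threshold to an index. The index name and base URL default to the
# values used in the example above; both helpers are hypothetical.
set_slowlog() {
  local threshold="$1"
  local index="${2:-gitlab-development}"
  local base_url="${3:-http://localhost:9200}"
  curl -H 'Content-Type: application/json' -XPUT "$base_url/$index/_settings" \
    -d "$(slowlog_settings "$threshold")"
}
```

With these defined, set_slowlog 0s turns query logging on for gitlab-development, and set_slowlog -1 turns it off.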
To get a list of indices available in Elasticsearch:
curl "http://localhost:9200/_cat/indices"
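The plain listing has no header row and includes every index on the node. The following sketch narrows it down using the v and h parameters of the _cat API. The helper name list_indices and the default gitlab-* pattern are assumptions for illustration; adjust the pattern if your index names differ.

```shell
# List indices matching a pattern, with a header row and a few chosen columns.
# Hypothetical helper: the name and the default pattern are assumptions.
list_indices() {
  local pattern="${1:-gitlab-*}"
  local base_url="${2:-http://localhost:9200}"
  # v adds a header row; h selects the columns to print.
  curl -s "$base_url/_cat/indices/$pattern?v&h=index,health,docs.count,store.size"
}
```

Running list_indices repeatedly while indexing is in progress gives a rough view of how the document counts are growing.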
Rate limiting
The search endpoints are rate-limited and you might receive the following message:
This endpoint has been requested too many times. Try again later.
To raise the rate limit for search requests, modify the search rate limit in the Admin Area settings:
- Go to Admin Area > Settings > Network.
- Expand Search Rate Limit.
- Increase the Maximum number of requests per minute for an authenticated user.
- Select Save.