
Commit 3af5fd0

Update colab urls
1 parent 4a925ec commit 3af5fd0

7 files changed: +408 -382 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ This repository contains executable Python notebooks, sample apps, and resources
 
 # Python notebooks 📒
 
-The [`notebooks` folder](notebooks/README.md) contains a range of executable Python notebooks, so you can test these features out for yourself. Colab provides an easy-to-use Python virtual environment in the browser.
+The [`notebooks`](notebooks/README.md) folder contains a range of executable Python notebooks, so you can test these features out for yourself. Colab provides an easy-to-use Python virtual environment in the browser.
 
 # Example apps 💻
 
example-apps/relevance-workbench/README.md

Lines changed: 28 additions & 28 deletions
@@ -1,24 +1,24 @@
 # Relevance workbench
 
-This application allows you to compare results ranking between the ELSER model and BM25.
+This application allows you to compare results ranking between the [Elastic Learned Sparse EncodeR](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html) model and BM25.
 
 ![relevance-workbench](./images/relevance-workbench.png)
 
-You can also try with your own data by forking this repo and plugging the application to your own Elasticsearch deployment.
+You can also try with your own data by forking this repo and plugging the application into your own Elasticsearch deployment.
 
 # Deploy with default dataset
 
 ## Pre-requisites
 
-To run this demo successfully, you will need an Elasticsearch deployment (> 8.8) with the ELSER model deployed. The easiest way for this is to use Elastic Cloud as described in the next part but you can also deploy Elasticsearch locally.
+To run this demo successfully, you will need an Elasticsearch deployment (> 8.8) with the ELSER model deployed. The easiest way to do this is to use Elastic Cloud, as described in the next section, but you can also deploy Elasticsearch locally.
 
 ## Deploy Elasticsearch in Elastic Cloud
 
 If you don't have an Elastic Cloud account, you can start by signing up for a [free Elastic Cloud trial](https://cloud.elastic.co/registration). After creating an account, you’ll have an active subscription, and you’ll be prompted to create your first deployment.
 
 Follow the steps to Create a new deployment. For more details, refer to [Create a deployment](https://www.elastic.co/guide/en/cloud/current/ec-create-deployment.html) in the Elastic Cloud documentation.
 
-For that demo, you will need to have an Elasticsearch deployment with Enterprise Search enabled and a ML node with at least 4Gb of memory.
+For this demo, you will need an Elasticsearch deployment with Enterprise Search enabled and an ML node with at least 4 GB of memory.
 
 Once created, don't forget to save the default admin password in a safe place.
 
@@ -32,23 +32,23 @@ You can follow this [documentation](https://www.elastic.co/guide/en/machine-lear
 
 The best approach is to use Enterprise Search to create a new index and configure the ingest pipeline to enrich the data.
 
-From the landing page in Kibana, navigate to Enterprise Search.
+From the landing page in Kibana, navigate to Enterprise Search.
 
 ![ent-search-landing](./images/ent-search-landing.png)
 
-### Create the index
+### Create the index
 
-Here, click on Create an Elasticsearch index and choose the API method.
+Here, click on Create an Elasticsearch index and choose the API method.
 
-Name the index `search-movies` (notice the existing prefix `search-`), and click on Create index.
+Name the index `search-movies` (notice the existing prefix `search-`), and click on Create index.
 
 ### Configure the ingest pipeline
 
 On the index configuration screen, navigate to the Pipelines tab and click on Copy and customize.
 
 ![copy-customize](./images/copy-customize.png)
 
-Now click on Add Inference pipeline to configure the inference pipeline.
+Now click on Add Inference pipeline to configure the inference pipeline.
 
 Name the inference pipeline `ml-inference-movies` (notice the existing prefix `ml-inference-`) and select the ELSER model. If it's not available, you can deploy it on the previous screen. Click Continue.
 
@@ -58,15 +58,15 @@ On the next screen, add the fields `overview` and `title` as custom options.
 
 ![fields-mapping](./images/fields-mapping.png)
 
-Then click on Continue.
+Then click on Continue.
 
 ![add-inference-fields](./images/add-inference-fields.png)
 
-Click Continue to review the changes and then Create pipeline.
+Click Continue to review the changes and then Create pipeline.
 
 ### Run the script to ingest data
 
-Go to the folder `data` and run the python script `index-data.py` to ingest the movies dataset.
+Go to the `data` folder and run the Python script `index-data.py` to ingest the movies dataset.
 
 To connect to the correct Elastic Cloud instance, you will need the default admin password you saved after creating the deployment and the Cloud ID for your deployment.
 
@@ -87,33 +87,33 @@ python3 index-data.py --es_password=<ELASTICSEARCH_PASSWORD> --cloud_id=<CLOUD_I
 - ELASTICSEARCH_PASSWORD: Use the default admin password previously saved
 - CLOUD_ID: See instructions above to retrieve it
 
-Note that by default, only subset of the dataset (100 movies) is indexed. If you're interested in indexing the whole data (7918 movies), you can select the `movies.json.gz` file by adding the option `--gzip_file=movies.json.gz` to the command line. Note that it might take up to 1 hour to index the full dataset.
+Note that by default, only a subset of the dataset (100 movies) is indexed. If you're interested in indexing the whole dataset (7918 movies), you can select the `movies.json.gz` file by adding the option `--gzip_file=movies.json.gz` to the command line. Indexing the full dataset might take up to 1 hour.
 
 ## Run the application
 
-Once the data have been succesfully indexed, you can run the application to start comparing relevance models.
+Once the data has been successfully indexed, you can run the application to start comparing relevance models.
 
-The application is composed of a backend Python API and a React frontend. You can run the whole application locally using Docker compose.
+The application is composed of a backend Python API and a React frontend. You can run the whole application locally using Docker Compose.
 
-Edit the `docker-compose.yml` file to replace values for. Reuse the same information that you use for loading the data.
+Edit the `docker-compose.yml` file to replace the values below, reusing the same information that you used for loading the data.
 - CLOUD_ID=<CLOUD_ID>
 - ELASTICSEARCH_PASSWORD=<ELASTICSEARCH_PASSWORD>
 
-Run `docker-compose up` to start the application.
+Run `docker-compose up` to start the application.
 
 Open [localhost:3000](http://localhost:3000) to access the application.
 
 # Use your own dataset
 
-To use your own dataset, you first need to ingest it and then configure the backend API to use it.
+To use your own dataset, you first need to ingest it and then configure the backend API to use it.
 
 ## Load your own data
 
-The first part described in the chapter below can be used similarly to load your own data.
+The process described in the first part above can be used similarly to load your own data.
 
-Use Enterprise Search to create a new index and configure an ML Inference pipeline. In this case, you'll need to choose yourself the fields to generate text expansion, note that ELSER inference works across text fields, and best on shorter spans of text. Those are the fields that the relevance workbench will query.
+Use Enterprise Search to create a new index and configure an ML inference pipeline. In this case, you'll need to choose the fields to generate text expansion for yourself; note that ELSER inference works across text fields and performs best on shorter spans of text. Those are the fields that the relevance workbench will query.
 
-Once the index is ready, use the same Python script to ingest the data, with additional options.
+Once the index is ready, use the same Python script to ingest the data, with additional options.
 
 ```
 python3 index-data.py --es_password=<ELASTICSEARCH_PASSWORD> --cloud_id=<CLOUD_ID> --index_name=<INDEX_NAME> --gzip_file=<GZIP_FILE_NAME>
@@ -122,13 +122,13 @@ python3 index-data.py --es_password=<ELASTICSEARCH_PASSWORD> --cloud_id=<CLOUD_I
 - ELASTICSEARCH_PASSWORD: Use the default admin password previously saved
 - CLOUD_ID: You can find this information in your Elastic Cloud admin console
 - INDEX_NAME: Your own custom index
-- GZIP_FILE_NAME: The name of the GZIP JSON file containing your dataset. To be placed under `data` folder.
+- GZIP_FILE_NAME: The name of the gzipped JSON file containing your dataset. To be placed under the `data` folder.
 
 ## Configure the backend API
 
-At the beginning of the file `app-api/app.py`, you can find an object that configure the datasets to use.
+At the beginning of the file `app-api/app.py`, you can find an object that configures the datasets to use.
 
-By default, it looks like this:
+By default, it looks like this:
 
 ```
 datasets = {
@@ -144,7 +144,7 @@ datasets = {
 }
 ```
 
-To add a new dataset, simply adds a new entry in the datasets object.
+To add a new dataset, simply add a new entry to the datasets object.
 
 ```
 datasets = {
@@ -169,15 +169,15 @@ datasets = {
 }
 ```
 
-In the configuration of the new dataset, provides the following informations:
+In the configuration of the new dataset, provide the following information:
 - index: Name of the index
 - search_fields: Fields to query for BM25
 - elser_search_fields: Fields to query for ELSER
 - result_fields: Fields to return
 - mapping_fields: Mapping between returned fields and fields expected by the front-end
 
-Then save the file `app.py` and run the application using docker-compose.
+Then save the file `app.py` and run the application using `docker-compose`.
 
-## Credits
+## Credits
 
 <img src="./images/tmdb-logo.svg" width="40"> This product uses the TMDB API but is not endorsed or certified by TMDB.
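
As an aside on the configuration step patched above: a minimal, hypothetical sketch of what a new entry in the `datasets` object in `app-api/app.py` could look like, following the field list in this README. Every index and field name below is an illustrative placeholder, not a value from the repository.

```python
# Hypothetical datasets entry for the relevance workbench backend API.
# All index and field names are placeholders; substitute the fields of
# your own Enterprise Search index.
datasets = {
    "articles": {
        "index": "search-articles",          # name of the index
        "search_fields": ["title", "body"],  # fields queried with BM25
        "elser_search_fields": [
            # assumed ML inference output fields holding the text expansions
            "ml.inference.title_expanded.predicted_value",
            "ml.inference.body_expanded.predicted_value",
        ],                                   # fields queried with ELSER
        "result_fields": ["title", "body"],  # fields returned to the client
        "mapping_fields": {                  # map returned fields to what
            "title": "title",                # the front-end expects
            "text": "body",
        },
    }
}
```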

notebooks/generative-ai/question-answering.ipynb

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 "source": [
 "# Question Answering with Langchain and OpenAI\n",
 "\n",
-"<a target=\"_blank\" href=\"https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/colab-notebooks-examples/generative-ai/question-answering.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
+"<a target=\"_blank\" href=\"https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/generative-ai/question-answering.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
 "\n",
 "This interactive notebook uses Langchain to split fictional workplace documents into passages and uses OpenAI to transform these passages into embeddings and store them into Elasticsearch.\n",
 "\n",

notebooks/search/00-quick-start.ipynb

Lines changed: 3 additions & 3 deletions
@@ -9,7 +9,7 @@
 "source": [
 "# Elasticsearch quick start: embeddings, semantic search, and hybrid search\n",
 "\n",
-"<a target=\"_blank\" href=\"https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/colab-notebooks-examples/search/00-quick-start.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
+"<a target=\"_blank\" href=\"https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/00-quick-start.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
 "\n",
 "This interactive notebook will introduce you to some basic operations with Elasticsearch, using the official [Elasticsearch Python client](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html).\n",
 "You'll perform semantic search using [Sentence Transformers](https://www.sbert.net) for text embedding. Learn how to integrate traditional text-based search with semantic search, for a hybrid search system."
@@ -239,7 +239,7 @@
 "source": [
 "### Index test data\n",
 "\n",
-"Run the following command to upload some test data, containing information about 10 popular programming books from this [dataset](https://raw.githubusercontent.com/elastic/elasticsearch-labs/notebooks-guides/colab-notebooks-examples/search/data.json).\n",
+"Run the following command to upload some test data, containing information about 10 popular programming books from this [dataset](https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/notebooks/search/data.json).\n",
 "`model.encode` will encode the text into a vector on the fly, using the model we initialized earlier."
 ]
 },
@@ -255,7 +255,7 @@
 "import json\n",
 "from urllib.request import urlopen\n",
 "\n",
-"url = \"https://raw.githubusercontent.com/elastic/elasticsearch-labs/notebooks-guides/colab-notebooks-examples/search/data.json\"\n",
+"url = \"https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/notebooks/search/data.json\"\n",
 "response = urlopen(url)\n",
 "books = json.loads(response.read())\n",
 "\n",
