Add mappings and bulk to quickstart page #2417
Changes from 2 commits
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| | @@ -41,6 +41,11 @@ You can generate an API key on the **Management** page under Security. | |||||
| | ||||||
| .. image:: ../guide/images/create-api-key.png | ||||||
| | ||||||
| Confirm that the connection was successful. | ||||||
| | ||||||
| .. code-block:: python | ||||||
| | ||||||
| print(client.info()) | ||||||
| | ||||||
| Using the client | ||||||
| ---------------- | ||||||
| | @@ -49,6 +54,29 @@ Time to use Elasticsearch! This section walks you through the most important | |||||
| operations of Elasticsearch. The following examples assume that the Python | ||||||
| client was instantiated as above. | ||||||
| | ||||||
| Create a mapping for your index | ||||||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||||||
| | ||||||
| Set the expected types of your features. | ||||||
| | ||||||
| .. code-block:: python | ||||||
| | ||||||
| mappings = { | ||||||
| "properties": { | ||||||
| "foo": { | ||||||
| "type" : "text" | ||||||
| }, | ||||||
| "bar" : { | ||||||
| "type" : "text", | ||||||
| "fields" : { | ||||||
| "keyword" : { | ||||||
| "type" : "keyword", | ||||||
| "ignore_above" : 256 | ||||||
| } | ||||||
| } | ||||||
| } | ||||||
| } | ||||||
| } | ||||||
| | ||||||
| Creating an index | ||||||
| ^^^^^^^^^^^^^^^^^ | ||||||
| | @@ -57,7 +85,7 @@ This is how you create the `my_index` index: | |||||
| | ||||||
| .. code-block:: python | ||||||
| | ||||||
| client.indices.create(index="my_index") | ||||||
| client.indices.create(index="my_index", mappings = mappings) | ||||||
| ||||||
| client.indices.create(index="my_index", mappings = mappings) | |
| client.indices.create(index="my_index", mappings=mappings) |
Outdated
This is equivalent to `yield {"_index": index_name, "_id": f"{i}", "_source": document}`. Is there a specific reason to prefer using `dict`?
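To make the equivalence concrete, here is a minimal sketch. It assumes the code under review builds the action with a `dict(...)` call using keyword arguments; `index_name`, `document`, and `i` are placeholder values made up for illustration.

```python
# Placeholder values, assumed for illustration only.
index_name = "my_index"
document = {"foo": "foo 0", "bar": "bar"}
i = 0

# Keyword-argument form (assumed to be what the reviewed code used).
with_call = dict(_index=index_name, _id=f"{i}", _source=document)

# Dict-literal form suggested in the comment above.
with_literal = {"_index": index_name, "_id": f"{i}", "_source": document}

# Both expressions build the same mapping.
assert with_call == with_literal
```

The literal form keeps the keys visibly quoted, which matches how the other examples on the quickstart page are written.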
Outdated
As with the mapping, this code is not runnable. Maybe generate_docs should be defined like that?

    def generate_docs():
        for i in range(10):
            yield {
                "_index": "my_index",
                "foo": f"foo {i}",
                "bar": "bar",
            }

    helpers.bulk(client, generate_docs())

The advantages of this version:
- It can be copy/pasted directly
- It reuses the index created above
- It's easy to adapt to generate many more documents
- It does not specify a doc id, which is better for performance
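
The last two points can be checked without a running cluster. The sketch below is a hypothetical variant of the generator: the `count` and `index` parameters are assumptions added for illustration and are not part of the suggestion. It only inspects the generated actions; sending them would still go through `helpers.bulk(client, ...)`.

```python
def generate_docs(count=10, index="my_index"):
    """Yield one bulk action per document (hypothetical parameterized variant)."""
    for i in range(count):
        # No "_id" key is set, so Elasticsearch assigns document ids itself.
        yield {
            "_index": index,
            "foo": f"foo {i}",
            "bar": "bar",
        }


# Materialize a few actions locally to inspect their shape.
actions = list(generate_docs(count=3))
assert all("_id" not in action for action in actions)
assert [action["foo"] for action in actions] == ["foo 0", "foo 1", "foo 2"]

# With a connected client this would be sent as:
#     helpers.bulk(client, generate_docs(count=100_000))
```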
Outdated
Unlike Markdown, you need two backticks in reStructuredText to format as code: single backticks only add emphasis. What do you think of this wording?
| These helpers are the recommended simple and streamlined way to abstract otherwise complicated and verbose functions such as `client.bulk`. | |
| These helpers are the recommended way to perform bulk ingestion. While it is also possible to perform bulk ingestion using ``client.bulk`` directly, the helpers handle retries, ingesting chunk by chunk and more. See the :ref:`helpers` page for more details. |
I've considered linking to the client.bulk() docs using :meth:`~elasticsearch.Elasticsearch.bulk` but I did not find it clearer as it's only rendered as "bulk()" with a link to the docs.
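
For readers wondering what "chunk by chunk" means in the suggested wording, here is a rough, library-free sketch of the idea. The `chunked` function is a made-up name for illustration and is not how the helpers are implemented; it only shows a stream of actions being cut into fixed-size batches, each of which would become one bulk request.

```python
from itertools import islice


def chunked(actions, chunk_size):
    """Yield lists of at most chunk_size items (illustrative helper only)."""
    iterator = iter(actions)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Ten actions in batches of four: two full batches, then the remainder.
sizes = [len(chunk) for chunk in chunked(range(10), 4)]
assert sizes == [4, 4, 2]
```

Doing this by hand with ``client.bulk`` would also mean handling failures for each batch, which is the part the helpers take care of.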
Can you please format the code using https://github.com/psf/black to fix the indentation? You may need a comma after 256 to get the results you want.
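
For reference, a sketch of the mapping from the diff laid out in Black's style, with the trailing comma after `256` added. This is written by hand and has not been run through the formatter, so the exact output may differ slightly; the trailing comma is what should keep the innermost dict expanded over several lines.

```python
mappings = {
    "properties": {
        "foo": {"type": "text"},
        "bar": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256,
                }
            },
        },
    }
}
```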