- Notifications
You must be signed in to change notification settings - Fork 114
[Transform] Specify use_point_in_time in settings #5322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Following you can find the validation changes against the target branch for the APIs. No changes detected. You can validate these APIs yourself by using the |
max_page_search_size?: integer | ||
/** | ||
* Specifies whether the transform checkpoint will use the Point In Time API while searching over the source index. | ||
* @ext_doc_id point-in-time-api |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elastic/developer-docs Is this a correct usage of ext_doc_id
?
This Transform API can be configured to use the PIT API, and here point-in-time-api
links to the point-in-time API docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @pquentin, the @ext_doc_id
looks fine here. I see it’s already in the table.csv
file as well, so it should be good. What we usually do to make sure the links appear correctly is test them locally by running these (you should have the bump CLI installed):
make generate make transform-to-openapi bump preview output/openapi/elasticsearch-openapi.json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to have worked as intended:
And it correctly links to https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-open-point-in-time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI we could add a description field to the link definition in
point-in-time-api,https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-open-point-in-time,https://www.elastic.co/guide/en/elasticsearch/reference/8.18/point-in-time-api.html, |
and this would be render nicer link text than just External documentation
💅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an example of that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be as simple as Point in Time API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it would be nice to name this "Open a point in time API" instead of "External documentation."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*/ | ||
max_page_search_size?: integer | ||
/** | ||
* Specifies whether the transform checkpoint will use the Point In Time API while searching over the source index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if you set it to false? It uses the scroll API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah it just uses a search request without PIT: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/transform/src/main/java/org/elasticsearch/xpack/transform/transforms/TransformIndexer.java#L1154
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ideally we'd explain the pros/cons differences of each approach, or at least outline why you'd want to use the PIT, rather than just the how :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should that go on the PIT page? in either case, we'd need someone familiar with PIT to go into that detail, I'm just trying to show what users can configure rather than keeping the setting hidden
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what I missed is that transforms use composite aggregations, which offer pagination without using PIT. However, transforms use PIT by default, as it returns more correct results. That said, as it can be expensive if many PITs are open at any given time, this can be disabled. Is my understanding correct?
This is not about explaining what PIT does (especially given that we have a link), but making it clear what the setting does. For example, max_page_search_size
is clear here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not about explaining what PIT does (especially given that we have a link), but making it clear what the setting does.
right Quentin said it much clearer than me 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM! Sorry for the delay.
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub git fetch # Create a new working tree git worktree add .worktrees/backport-8.19 8.19 # Navigate to the new working tree cd .worktrees/backport-8.19 # Create a new branch git switch --create backport-5322-to-8.19 # Cherry-pick the merged commit of this pull request and resolve the conflicts git cherry-pick -x --mainline 1 9fe24d15ae7841b87f6e2d2c745374b08d2e55c4 # Push it to GitHub git push --set-upstream origin backport-5322-to-8.19 # Go back to the original working tree cd ../.. # Delete the working tree git worktree remove .worktrees/backport-8.19 Then, create a pull request where the |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub git fetch # Create a new working tree git worktree add .worktrees/backport-8.18 8.18 # Navigate to the new working tree cd .worktrees/backport-8.18 # Create a new branch git switch --create backport-5322-to-8.18 # Cherry-pick the merged commit of this pull request and resolve the conflicts git cherry-pick -x --mainline 1 9fe24d15ae7841b87f6e2d2c745374b08d2e55c4 # Push it to GitHub git push --set-upstream origin backport-5322-to-8.18 # Go back to the original working tree cd ../.. # Delete the working tree git worktree remove .worktrees/backport-8.18 Then, create a pull request where the |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub git fetch # Create a new working tree git worktree add .worktrees/backport-9.0 9.0 # Navigate to the new working tree cd .worktrees/backport-9.0 # Create a new branch git switch --create backport-5322-to-9.0 # Cherry-pick the merged commit of this pull request and resolve the conflicts git cherry-pick -x --mainline 1 9fe24d15ae7841b87f6e2d2c745374b08d2e55c4 # Push it to GitHub git push --set-upstream origin backport-5322-to-9.0 # Go back to the original working tree cd ../.. # Delete the working tree git worktree remove .worktrees/backport-9.0 Then, create a pull request where the |
use_point_in_time has been in the docs since before 8.18 Resolve elastic/elasticsearch#134841 (cherry picked from commit 9fe24d1)
use_point_in_time has been in the docs since before 8.18 Resolve elastic/elasticsearch#134841 (cherry picked from commit 9fe24d1)
use_point_in_time has been in the docs since before 8.18 Resolve elastic/elasticsearch#134841 (cherry picked from commit 9fe24d1) # Conflicts: # specification/_doc_ids/table.csv
use_point_in_time has been in the docs since before 8.18 Resolve elastic/elasticsearch#134841 (cherry picked from commit 9fe24d1) # Conflicts: # specification/_doc_ids/table.csv
use_point_in_time
has been in the docs since before 8.18Resolve elastic/elasticsearch#134841