Closed
Labels
- :Core/Infra/Core: Core issues without another label
- >bug
- Supportability: Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better.
- Team:Core/Infra: Meta label for core/infra team
Description
Elasticsearch Version
8.3.3
Installed Plugins
No response
Java Version
bundled
OS Version
Deployment in ESS
Problem Description
The .fleet-actions-results data stream cannot be restored via the fleet feature state.
Consider the following scenario (observed in the field in ESS):
- Due to an unforeseen situation, the cluster becomes red, with the following red indices:

health status index                                        uuid                   pri rep docs.count docs.deleted store.size pri.store.size sth
green  open   .ds-.fleet-actions-results-2022.05.04-000002 eZO3mXu3RYOZpygHvC2dgQ   1   1          0            0       450b           225b false
red    open   .ds-.fleet-actions-results-2022.06.03-000003 iBbSWmHaQbqJFn_aBVqaYg   1   1                                                   false
red    open   .ds-.fleet-actions-results-2022.07.03-000004 sF3-S4uoQkybpm7ujaZBVg   1   1                                                   false
red    open   .ds-.fleet-actions-results-2022.08.02-000006 t-U-Wrd_RpqZUqSS2a3TqA   1   1                                                   false
red    open   .fleet-actions-7                             8zgOKVzdQIeS_YGq_JX--w   1   1                                                   false
red    open   .fleet-agents-7                              p7sWhvhPRaWQ_unOHIJQTQ   1   1                                                   false
red    open   .fleet-artifacts-7                           iingfeghRJ2bfqLAGFt0Aw   1   1                                                   false
red    open   .fleet-enrollment-api-keys-7                 8J1tyEuJSfyhMxf5HsfU2A   1   1                                                   false
red    open   .fleet-policies-7                            HufDBhgBQraUYlNosY1ysg   1   1                                                   false
red    open   .fleet-policies-leader-7                     jpqhCaF9SL-S0AjlWqa6xg   1   1                                                   false
red    open   .fleet-servers-7                             5xdgNy-kSXSdsWZbM8mRHw   1   1                                                   false

- The user attempts to restore the fleet feature state using the following restore snapshot API call:
POST _snapshot/found-snapshots/cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/_restore?wait_for_completion=false
{
  "indices": "-*",
  "ignore_unavailable": "true",
  "include_global_state": "false",
  "include_aliases": "false",
  "feature_states": [ "fleet" ]
}

- The above API call fails with the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "snapshot_restore_exception",
        "reason": "[found-snapshots:cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/H3i28HlrSiKyrLaiDCE6uA] cannot restore index [.ds-.fleet-actions-results-2022.06.03-000003] because an open index with same name already exists in the cluster. Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name"
      }
    ],
    "type": "snapshot_restore_exception",
    "reason": "[found-snapshots:cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/H3i28HlrSiKyrLaiDCE6uA] cannot restore index [.ds-.fleet-actions-results-2022.06.03-000003] because an open index with same name already exists in the cluster. Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name"
  },
  "status": 500
}

- Checking the fleet feature state, it seems that the SystemIndexDescriptor (cf. the code) does contain the .fleet-actions-results-* pattern. A couple of guesses about the reported problem:
- The implementation only considers regular indices and not data streams?
- The implementation considers the data stream but fails to close the backing indices before restoring them?
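When triaging a failure like the one above, it can help to pull the conflicting index name out of the error response. The following is a minimal sketch, not part of Elasticsearch or any client library: the function name conflicting_index and the regular expression are assumptions derived only from the wording of the error message quoted in this report.

```python
import json
import re
from typing import Optional

# Assumed pattern, based solely on the error text quoted in this issue.
_CONFLICT_RE = re.compile(
    r"cannot restore index \[(?P<index>[^\]]+)\] because an open index "
    r"with same name already exists"
)


def conflicting_index(error_response: str) -> Optional[str]:
    """Return the index named in a snapshot_restore_exception, or None."""
    payload = json.loads(error_response)
    error = payload.get("error", {})
    if error.get("type") != "snapshot_restore_exception":
        return None
    match = _CONFLICT_RE.search(error.get("reason", ""))
    return match.group("index") if match else None
```

Applied to the response in this report, the helper would return the backing index .ds-.fleet-actions-results-2022.06.03-000003, which suggests that the restore stops at the first open backing index it encounters.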
Steps to Reproduce
- Create a cluster version 8.3.3 and deploy an Elastic Agent with the Osquery Manager integration.
- Run a new live Osquery.
- Observe that the .fleet-actions-results data stream is created with the respective backing indices.
- Restore the fleet feature state using the restore snapshot API and observe the same error as above.
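For repeated reproduction attempts, the restore request from the problem description can be assembled programmatically. This is only a sketch: the helper name feature_state_restore_request is hypothetical, it performs no network call, and the body simply mirrors the request shown in this report (including its string-valued "true"/"false" settings).

```python
import json
from typing import List, Tuple


def feature_state_restore_request(
    repository: str, snapshot: str, features: List[str]
) -> Tuple[str, str, str]:
    """Build (method, path, body) for a feature-state-only restore request."""
    path = f"_snapshot/{repository}/{snapshot}/_restore?wait_for_completion=false"
    body = {
        "indices": "-*",  # as in the report: select no regular indices
        "ignore_unavailable": "true",  # string values mirror the report
        "include_global_state": "false",
        "include_aliases": "false",
        "feature_states": list(features),
    }
    return "POST", path, json.dumps(body)
```

Calling it with "found-snapshots", the snapshot name from the report, and ["fleet"] yields the same path and body as the request that fails above.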
Workaround
- Create the fleet_superuser role:

POST _security/role/fleet_superuser
{
  "indices": [
    {
      "names": [ ".fleet*" ],
      "privileges": [ "all" ],
      "allow_restricted_indices": true
    }
  ]
}

- Create the temp_user user with the superuser and fleet_superuser roles:

POST _security/user/temp_user
{
  "password": "temp_password",
  "roles": [ "superuser", "fleet_superuser" ]
}

- Close the .fleet-actions-results backing indices using the cURL command below:

curl -k -XPOST --user temp_user:temp_password -H 'x-elastic-product-origin:fleet' https://$CLUSTER_ADDRESS/.ds-.fleet-actions-results-2022.05.04-000002,.ds-.fleet-actions-results-2022.06.03-000003,.ds-.fleet-actions-results-2022.07.03-000004,.ds-.fleet-actions-results-2022.08.02-000006/_close

Note: users running the cURL command on Windows should use double quotes for the header instead: "x-elastic-product-origin:fleet"

- Restore the fleet feature state:

POST _snapshot/found-snapshots/cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/_restore?wait_for_completion=false
{
  "indices": "-*",
  "ignore_unavailable": "true",
  "include_global_state": "false",
  "include_aliases": "false",
  "feature_states": [ "fleet" ]
}

- Delete the temp_user user:

DELETE _security/user/temp_user

- Delete the fleet_superuser role:

DELETE _security/role/fleet_superuser

Logs (if relevant)
No response
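Regarding the close step of the workaround above: the cURL command lists every backing index by hand, which is easy to get wrong when a data stream has many generations. The sketch below shows one way the path for that request could be derived from a list of index names. The names close_path and BACKING_PREFIX are hypothetical, and the code only builds a string; it does not contact a cluster.

```python
from typing import Iterable

# Prefix of the backing indices named in this report.
BACKING_PREFIX = ".ds-.fleet-actions-results-"


def close_path(index_names: Iterable[str], prefix: str = BACKING_PREFIX) -> str:
    """Return the '<index>,<index>,.../_close' path for matching indices."""
    selected = sorted(name for name in index_names if name.startswith(prefix))
    if not selected:
        raise ValueError(f"no index names start with {prefix!r}")
    return ",".join(selected) + "/_close"
```

The resulting path would be appended to https://$CLUSTER_ADDRESS/ in the cURL command, with the same user and x-elastic-product-origin header as in the workaround.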