ES CLI elasticsearch-shard throws exception on startup  #97350

@pgomulka

Description

Elasticsearch Version

8.7, 8.8, 8.9

Installed Plugins

No response

Java Version

bundled

OS Version

n/a

Problem Description

When running the CLI command bin/elasticsearch-shard remove-corrupted-data --index .ds-my-data-stream-2023.07.04-000001 --shard-id 0, an exception is thrown on startup:

Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.elasticsearch.cluster.metadata.DataStreamMetadata.lambda$static$2(DataStreamMetadata.java:71)
	at org.elasticsearch.xcontent.ObjectParser.lambda$declareField$10(ObjectParser.java:431)
	at org.elasticsearch.xcontent.ObjectParser.parseValue(ObjectParser.java:609)
	at org.elasticsearch.xcontent.ObjectParser.parseSub(ObjectParser.java:629)
	at org.elasticsearch.xcontent.ObjectParser.parse(ObjectParser.java:315)
	at org.elasticsearch.xcontent.ConstructingObjectParser.parse(ConstructingObjectParser.java:166)
	at org.elasticsearch.cluster.metadata.DataStreamMetadata.fromXContent(DataStreamMetadata.java:229)
	at org.elasticsearch.xcontent.NamedXContentRegistry$Entry.lambda$new$0(NamedXContentRegistry.java:54)
	at org.elasticsearch.xcontent.NamedXContentRegistry.parseNamedObject(NamedXContentRegistry.java:147)
	at org.elasticsearch.cluster.coordination.ElasticsearchNodeCommand$1.parseNamedObject(ElasticsearchNodeCommand.java:77)
	at org.elasticsearch.xcontent.support.AbstractXContentParser.namedObject(AbstractXContentParser.java:414)
	at org.elasticsearch.cluster.metadata.Metadata$Builder.fromXContent(Metadata.java:2663)
	at org.elasticsearch.gateway.PersistedClusterStateService.readXContent(PersistedClusterStateService.java:671)
	at org.elasticsearch.gateway.PersistedClusterStateService.lambda$loadOnDiskState$6(PersistedClusterStateService.java:571)
	at org.elasticsearch.gateway.PersistedClusterStateService.consumeFromType(PersistedClusterStateService.java:715)
	at org.elasticsearch.gateway.PersistedClusterStateService.loadOnDiskState(PersistedClusterStateService.java:570)
	at org.elasticsearch.gateway.PersistedClusterStateService.loadBestOnDiskState(PersistedClusterStateService.java:494)
	at org.elasticsearch.gateway.PersistedClusterStateService.loadBestOnDiskState(PersistedClusterStateService.java:409)
	at org.elasticsearch.cluster.coordination.ElasticsearchNodeCommand.loadTermAndClusterState(ElasticsearchNodeCommand.java:131)
	at org.elasticsearch.index.shard.RemoveCorruptedShardDataCommand.processDataPaths(RemoveCorruptedShardDataCommand.java:240)
	at org.elasticsearch.cluster.coordination.ElasticsearchNodeCommand.processDataPaths(ElasticsearchNodeCommand.java:145)
	at org.elasticsearch.cluster.coordination.ElasticsearchNodeCommand.execute(ElasticsearchNodeCommand.java:163)
	at org.elasticsearch.common.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:54)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:85)
	at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:94)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:85)
	at org.elasticsearch.cli.Command.main(Command.java:50)
	at org.elasticsearch.launcher.CliToolLauncher.main(CliToolLauncher.java:64)
Caused by: java.lang.NullPointerException: Cannot invoke "org.elasticsearch.logging.internal.spi.LoggerFactory.getLogger(java.lang.Class)" because the return value of "org.elasticsearch.logging.internal.spi.LoggerFactory.provider()" is null
	at org.elasticsearch.logging.LogManager.getLogger(LogManager.java:35)
	at org.elasticsearch.cluster.metadata.DataStreamAlias.<clinit>(DataStreamAlias.java:50)
	... 28 more

This happens because ES logging is not initialised on CLI tool startup.

This affects ES 8.7, 8.8, and 8.9 (to be released).
The DataStreamAlias class (https://github.com/elastic/elasticsearch/pull/92692/files#diff-63c928f0bd8f043eb462351edf8e20cfedb81bed39aa0d153057697985ab4e33R19) uses ES logging in its static initializer. That class is loaded when the elasticsearch-shard CLI is run to fix a data stream that has an alias.
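The failure mode can be illustrated with a minimal, self-contained sketch. All class names below are stand-ins, not the real ES classes: a class fetches a logger in its static initializer, and because nothing installed a logger provider (which ES startup would normally do, but the CLI does not), the first use of that class throws ExceptionInInitializerError wrapping the NullPointerException.

```java
public class ClinitLoggingDemo {
    // Stand-in for org.elasticsearch.logging.internal.spi.LoggerFactory:
    // provider() returns null until startup code installs an implementation.
    static class LoggerFactory {
        static LoggerFactory installed; // the CLI never sets this
        static LoggerFactory provider() { return installed; }
        Object getLogger(Class<?> clazz) { return new Object(); }
    }

    // Stand-in for DataStreamAlias: grabs a logger during class initialization.
    static class AliasLikeClass {
        static final Object LOGGER =
            LoggerFactory.provider().getLogger(AliasLikeClass.class);
    }

    // Touch the class for the first time and return the underlying cause.
    public static Throwable trigger() {
        try {
            new AliasLikeClass(); // forces <clinit>, which NPEs
            return null;
        } catch (ExceptionInInitializerError e) {
            return e.getCause(); // the NullPointerException from provider() == null
        }
    }

    public static void main(String[] args) {
        System.out.println("cause: " + trigger());
    }
}
```

This mirrors the "Caused by" frame in the stack trace above: the NPE happens inside DataStreamAlias.&lt;clinit&gt;, so the JVM reports it as ExceptionInInitializerError at the first point the class is touched.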

Steps to Reproduce

create a datastream
index documents
create alias
index some docs
break the index (with CorruptionUtils)
run the tool

curl --request PUT \
  --url https://localhost:9200/_ilm/policy/my-lifecycle-policy \
  --header 'Authorization: Basic ZWxhc3RpYzpYdVp0aWUzZzctOXY3cG1tbEtiRg==' \
  --header 'Content-Type: application/json' \
  --data '{
    "policy": {
      "phases": {
        "hot": {
          "actions": {
            "rollover": { "max_primary_shard_size": "50gb" }
          }
        },
        "warm": {
          "min_age": "30d",
          "actions": {
            "shrink": { "number_of_shards": 1 },
            "forcemerge": { "max_num_segments": 1 }
          }
        },
        "delete": {
          "min_age": "735d",
          "actions": { "delete": {} }
        }
      }
    }
  }'

curl --request PUT \
  --url https://localhost:9200/_component_template/my-mappings \
  --header 'Authorization: Basic ZWxhc3RpYzpYdVp0aWUzZzctOXY3cG1tbEtiRg==' \
  --header 'Content-Type: application/json' \
  --data '{
    "template": {
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date", "format": "date_optional_time||epoch_millis" },
          "message": { "type": "wildcard" }
        }
      }
    },
    "_meta": {
      "description": "Mappings for @timestamp and message fields",
      "my-custom-meta-field": "More arbitrary metadata"
    }
  }'

curl --request PUT \
  --url https://localhost:9200/_component_template/my-settings \
  --header 'Authorization: Basic ZWxhc3RpYzpYdVp0aWUzZzctOXY3cG1tbEtiRg==' \
  --header 'Content-Type: application/json' \
  --data '{
    "template": {
      "settings": { "index.lifecycle.name": "my-lifecycle-policy" }
    },
    "_meta": {
      "description": "Settings for ILM",
      "my-custom-meta-field": "More arbitrary metadata"
    }
  }'

curl --request PUT \
  --url https://localhost:9200/_index_template/my-index-template \
  --header 'Authorization: Basic ZWxhc3RpYzpYdVp0aWUzZzctOXY3cG1tbEtiRg==' \
  --header 'Content-Type: application/json' \
  --data '{
    "index_patterns": ["my-data-stream*"],
    "data_stream": { },
    "composed_of": [ "my-mappings", "my-settings" ],
    "priority": 500,
    "_meta": {
      "description": "Template for my time series data",
      "my-custom-meta-field": "More arbitrary metadata"
    }
  }'

curl --request PUT \
  --url https://localhost:9200/my-data-stream/_bulk \
  --header 'Authorization: Basic ZWxhc3RpYzpYdVp0aWUzZzctOXY3cG1tbEtiRg==' \
  --header 'Content-Type: application/json' \
  --data '
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:21:15.000Z", "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736" }
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:25:42.000Z", "message": "192.0.2.255 - - [06/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" }
'

curl --request POST \
  --url https://localhost:9200/_aliases \
  --header 'Authorization: Basic ZWxhc3RpYzpYdVp0aWUzZzctOXY3cG1tbEtiRg==' \
  --header 'Content-Type: application/json' \
  --data '{
    "actions": [
      { "add": { "index": "my-data-stream", "alias": "aliasxx" } }
    ]
  }'

Run the corruption util tool. (To find out which path to corrupt, run the elasticsearch-shard tool on the healthy index first.)

bin/elasticsearch-shard remove-corrupted-data --index .ds-my-data-stream-2023.07.04-000001 --shard-id 0

Its output will include a message like:

Opening Lucene index at /Users/yourname/scratch/elasticsearch-8.8.2/data/indices/ZYxSAeoxQg-cSPKnwUJ1kA/0/index 

Use that path to corrupt that index

CorruptionUtils.corruptIndex(new Random(), Path.of("/Users/yourname/scratch/elasticsearch-8.8.2/data/indices/ZYxSAeoxQg-cSPKnwUJ1kA/0/index"), false);
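If you don't have the ES test framework's CorruptionUtils on hand, a hypothetical stand-in (the class and method names below are invented for illustration, not the real utility) can achieve the same effect for reproduction purposes: flip one byte in a file under the shard's index directory so Lucene's checksum verification fails.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the test-only CorruptionUtils: picks a random
// regular file in the given directory and flips every bit of one byte in it.
public class CorruptOneByte {
    public static Path corrupt(Path dir, Random random) throws IOException {
        List<Path> candidates;
        try (var files = Files.list(dir)) {
            candidates = files.filter(Files::isRegularFile).toList();
        }
        Path target = candidates.get(random.nextInt(candidates.size()));
        byte[] bytes = Files.readAllBytes(target);
        int pos = random.nextInt(bytes.length);
        bytes[pos] ^= 0xFF; // flip one byte so the file's checksum no longer matches
        Files.write(target, bytes);
        return target;
    }
}
```

Corrupting any of the segment files this way is enough to make the subsequent elasticsearch-shard run hit the data-stream metadata parsing path.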

Run the tool again:
bin/elasticsearch-shard remove-corrupted-data --index .ds-my-data-stream-2023.07.04-000001 --shard-id 0
An exception is thrown.

Logs (if relevant)

n/a
