Skip to content

Conversation

prwhelan
Copy link
Member

Unified Chat Completion error responses now forward code, type, and param to in the response payload. reason has been renamed to message.

Notes:

  • XContentFormattedException is a ChunkedToXContent so that the REST listener can call toXContentChunked to format the output structure. By default, the structure forwards to our existing ES exception structure.
  • UnifiedChatCompletionException will override the structure to match the new unified format.
  • The Rest, Transport, and Stream handlers all check the exception to verify it is a UnifiedChatCompletionException.
  • OpenAI response handler now reads all the fields in the error message and forwards them to the user.
  • In the event that a Throwable is a Error, we rethrow it on another thread so the JVM can catch and handle it. We also stop surfacing the JVM details to the user in the error message (but it's still logged for debugging purposes).
Unified Chat Completion error responses now forward code, type, and param to in the response payload. `reason` has been renamed to `message`.
@prwhelan prwhelan added >bug :ml Machine learning Team:ML Meta label for the ML team auto-backport Automatically create backport pull requests when merged v9.0.0 v8.18.0 labels Jan 31, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @prwhelan, I've created a changelog YAML for you.

@prwhelan prwhelan marked this pull request as ready for review January 31, 2025 16:49
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

Copy link
Contributor

@jonathan-buttner jonathan-buttner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great changes Pat!

I think we'll also want to change this for EIS since it leverages the unified format as well:

https://github.com/elastic/elasticsearch/blob/18345c41ab707f2cdfcfe2fd3d942ae811f14803/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/http/sender/ElasticInferenceServiceUnifiedCompletionRequestManager.java

Let me know if we already have coverage for this but could we add a new integration test that spins up a mock web server to mock an openai error response? That way we can make a request to a live ES node and ensure that the response all the way back to the rest client is what we're expecting.

I'm thinking something like this: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/MockElasticInferenceServiceAuthorizationServer.java#L21

Which is used here: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/BaseMockEISAuthServerTest.java#L33


public UnifiedChatCompletionException(RestStatus status, String message, String type, @Nullable String code, @Nullable String param) {
super(message, status);
this.message = message;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add some Objects.requireNonNull for the values that are required?

public static UnifiedChatCompletionException fromThrowable(Throwable t) {
if (t instanceof UnifiedChatCompletionException e) {
return e;
} else if (unwrapCause(t) instanceof UnifiedChatCompletionException e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below we implement the unwrapCause(). It doesn't look like UnifiedChatCompletionException implements ElasticsearchWrapperException in the inheritance chain. Could we use ExceptionHelper.unwrapCause() and kind of like this:

public static UnifiedChatCompletionException fromThrowable2(Throwable t) { var unwrappedCause = ExceptionsHelper.unwrapCause(t); if (unwrappedCause instanceof UnifiedChatCompletionException e) { return e; } else { return maybeError(t).map(error -> { ... }); } } 
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think this method was a holdover from when UnifiedChatCompletionException was a ElasticsearchWrapperException. We can remove it

@jonathan-buttner jonathan-buttner changed the title [ML] Change format for Unified Chat [ML] Change format for Unified Chat error responses Jan 31, 2025
Copy link
Contributor

@jonathan-buttner jonathan-buttner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes!

@prwhelan prwhelan enabled auto-merge (squash) February 5, 2025 13:53
@prwhelan prwhelan merged commit ad00113 into elastic:main Feb 5, 2025
17 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
9.0 Commit could not be cherrypicked due to conflicts
8.18 Commit could not be cherrypicked due to conflicts
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 121396

prwhelan added a commit to prwhelan/elasticsearch that referenced this pull request Feb 5, 2025
Unified Chat Completion error responses now forward code, type, and param to in the response payload. `reason` has been renamed to `message`. Notes: - `XContentFormattedException` is a `ChunkedToXContent` so that the REST listener can call `toXContentChunked` to format the output structure. By default, the structure forwards to our existing ES exception structure. - `UnifiedChatCompletionException` will override the structure to match the new unified format. - The Rest, Transport, and Stream handlers all check the exception to verify it is a UnifiedChatCompletionException. - OpenAI response handler now reads all the fields in the error message and forwards them to the user. - In the event that a `Throwable` is a `Error`, we rethrow it on another thread so the JVM can catch and handle it. We also stop surfacing the JVM details to the user in the error message (but it's still logged for debugging purposes).
prwhelan added a commit to prwhelan/elasticsearch that referenced this pull request Feb 5, 2025
Unified Chat Completion error responses now forward code, type, and param to in the response payload. `reason` has been renamed to `message`. Notes: - `XContentFormattedException` is a `ChunkedToXContent` so that the REST listener can call `toXContentChunked` to format the output structure. By default, the structure forwards to our existing ES exception structure. - `UnifiedChatCompletionException` will override the structure to match the new unified format. - The Rest, Transport, and Stream handlers all check the exception to verify it is a UnifiedChatCompletionException. - OpenAI response handler now reads all the fields in the error message and forwards them to the user. - In the event that a `Throwable` is a `Error`, we rethrow it on another thread so the JVM can catch and handle it. We also stop surfacing the JVM details to the user in the error message (but it's still logged for debugging purposes).
elasticsearchmachine pushed a commit that referenced this pull request Feb 5, 2025
Unified Chat Completion error responses now forward code, type, and param to in the response payload. `reason` has been renamed to `message`. Notes: - `XContentFormattedException` is a `ChunkedToXContent` so that the REST listener can call `toXContentChunked` to format the output structure. By default, the structure forwards to our existing ES exception structure. - `UnifiedChatCompletionException` will override the structure to match the new unified format. - The Rest, Transport, and Stream handlers all check the exception to verify it is a UnifiedChatCompletionException. - OpenAI response handler now reads all the fields in the error message and forwards them to the user. - In the event that a `Throwable` is a `Error`, we rethrow it on another thread so the JVM can catch and handle it. We also stop surfacing the JVM details to the user in the error message (but it's still logged for debugging purposes).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged backport pending >bug :ml Machine learning Team:ML Meta label for the ML team v8.18.0 v8.19.0 v9.0.0 v9.1.0

3 participants