Jan-Kazlouski-elastic
Contributor

@Jan-Kazlouski-elastic Jan-Kazlouski-elastic commented Sep 3, 2025

Updates the existing Google Vertex AI inference provider integration to support completion (both streaming and non-streaming) and chat_completion (streaming only) for Anthropic models hosted in Google Model Garden.

Changes were tested locally against the following Anthropic models:

  • claude-3-5-haiku
Create Completion Endpoint

Success:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" } } RS { "inference_id": "google-model-garden-anthropic-completion", "task_type": "completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 

With max_tokens in task settings:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 10 } } RS { "inference_id": "google-model-garden-anthropic-completion", "task_type": "completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 

Unknown Provider:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "unknown", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" } } RS { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: [service_settings] Invalid value [unknown] received. [provider] must be one of [anthropic];" } ], "type": "validation_exception", "reason": "Validation Failed: 1: [service_settings] Invalid value [unknown] received. [provider] must be one of [anthropic];" }, "status": 400 } 

No Provider + No Google Vertex AI parameters:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" } } RS { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=null, url=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_url=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict.;" } ], "type": "validation_exception", "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with url and/or streaming_url. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=null, url=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_url=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict.;" }, "status": 400 } 

No URL + No Streaming URL + No Google Vertex AI parameters:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}} } } RS { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=anthropic, url=null, streaming_url=null.;" } ], "type": "validation_exception", "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=anthropic, url=null, streaming_url=null.;" }, "status": 400 } 
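
The three validation failures above suggest a rule along the following lines. This is a minimal sketch inferred from the error messages only, not the code in this PR: the class name `ModelGardenSettingsCheck`, the method `validate`, and the case-insensitive provider match are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the service settings validation; names are illustrative only.
final class ModelGardenSettingsCheck {

    /** Returns an empty list when the settings are acceptable, otherwise the validation errors. */
    static List<String> validate(String provider, String url, String streamingUrl,
                                 String location, String projectId, String model) {
        List<String> errors = new ArrayList<>();

        // Assumption: "anthropic" is the only accepted provider and matching ignores case.
        if (provider != null && !provider.equalsIgnoreCase("anthropic")) {
            errors.add("Invalid value [" + provider + "] received. [provider] must be one of [anthropic]");
        }

        // Model Garden path: a provider plus at least one of url / streaming_url.
        boolean modelGarden = provider != null && (url != null || streamingUrl != null);
        // Vertex AI path: location, project_id and model must all be present.
        boolean vertexAi = location != null && projectId != null && model != null;

        if (!modelGarden && !vertexAi) {
            errors.add("For Google Model Garden, you must provide provider with url and/or streaming_url. "
                + "For Google Vertex AI models, you must provide location, project_id, and model.");
        }
        return errors;
    }
}
```

Under these assumptions, a request with a provider but no URLs, or with URLs but no provider, yields exactly one error, which matches the responses shown above.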

URL + No Streaming URL (URL is default for both streaming/non-streaming):

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict" } } RS { "inference_id": "google-model-garden-anthropic-completion", "task_type": "completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 

No URL + Streaming URL (Streaming URL is default for both streaming/non-streaming):

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" } } RS { "inference_id": "google-model-garden-anthropic-completion", "task_type": "completion", "service": "googlevertexai", "service_settings": { "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 
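
The fallback behaviour in the two cases above (whichever URL is provided serves both streaming and non-streaming requests) could be sketched as follows. The class `ModelGardenUrls` and the method `resolve` are hypothetical names used for illustration; the actual implementation in this PR may be structured differently.

```java
import java.util.Objects;

// Hypothetical helper illustrating the url / streaming_url fallback; not the PR's actual code.
final class ModelGardenUrls {

    /**
     * Picks the URL for a request: the one matching the request kind if present,
     * otherwise the other one. Fails if neither was configured.
     */
    static String resolve(String url, String streamingUrl, boolean streaming) {
        String preferred = streaming ? streamingUrl : url;
        String fallback = streaming ? url : streamingUrl;
        if (preferred != null) {
            return preferred;
        }
        return Objects.requireNonNull(fallback, "either url or streaming_url must be set");
    }
}
```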

Not Found:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict2", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2" } } RS { "error": { "root_cause": [ { "type": "status_exception", "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-completion] status [404]" } ], "type": "status_exception", "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.", "caused_by": { "type": "status_exception", "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-completion] status [404]" } }, "status": 400 } 
Perform Completion

Success Non Streaming:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "input": "The sky above the port was the color of television tuned to a dead channel." } RS { "completion": [ { "result": "That's the opening line from William Gibson's groun" } ] } 

Success Streaming:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion/_stream RQ { "input": "The sky above the port was the color of television tuned to a dead channel." } RS event: message data: {"completion":[{"delta":"This"},{"delta":" is"},{"delta":" the"},{"delta":" opening"}]} event: message data: {"completion":[{"delta":" line of William"},{"delta":" Gibson's sem"}]} event: message data: [DONE] 
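
For readers consuming this stream, the `delta` fragments can be concatenated until the `[DONE]` marker arrives. The sketch below is a client-side illustration only; `CompletionStreamSketch` and `collect` are hypothetical names, and the regex extraction is a simplification that does not decode JSON escape sequences. A real client would use a JSON parser.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical client-side helper; assumes each element is the text after "data: " of one event.
final class CompletionStreamSketch {

    // Matches "delta":"..." and captures the raw (still escaped) string contents.
    private static final Pattern DELTA = Pattern.compile("\"delta\":\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Concatenates all delta fragments, stopping at the [DONE] marker. */
    static String collect(List<String> dataPayloads) {
        StringBuilder text = new StringBuilder();
        for (String payload : dataPayloads) {
            if ("[DONE]".equals(payload.trim())) {
                break;
            }
            Matcher m = DELTA.matcher(payload);
            while (m.find()) {
                text.append(m.group(1));
            }
        }
        return text.toString();
    }
}
```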

Success Non Streaming with task_settings max_tokens:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "input": "The sky above the port was the color of television tuned to a dead channel.", "task_settings": { "max_tokens": 20 } } RS { "completion": [ { "result": "That's the opening line from William Gibson's groun" } ] } 

Success Streaming with task_settings max_tokens:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion/_stream RQ { "input": "The sky above the port was the color of television tuned to a dead channel.", "task_settings": { "max_tokens": 20 } } RS event: message data: {"completion":[{"delta":"This"},{"delta":" is"},{"delta":" the"},{"delta":" opening"}]} event: message data: {"completion":[{"delta":" line of William"},{"delta":" Gibson's sem"}]} event: message data: [DONE] 
Create Chat Completion Endpoint

Success:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ: { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" } } RS: { "inference_id": "google-model-garden-anthropic-chat-completion", "task_type": "chat_completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 

Success with task_settings max_tokens:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ: { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 5 } } RS: { "inference_id": "google-model-garden-anthropic-chat-completion", "task_type": "chat_completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 

Unknown Provider:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ: { "service": "googlevertexai", "service_settings": { "provider": "unknown", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" } } RS: { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: [service_settings] Invalid value [unknown] received. [provider] must be one of [anthropic];" } ], "type": "validation_exception", "reason": "Validation Failed: 1: [service_settings] Invalid value [unknown] received. [provider] must be one of [anthropic];" }, "status": 400 } 

No url/streaming_url:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ: { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}} } } RS: { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=anthropic, url=null, streaming_url=null.;" } ], "type": "validation_exception", "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=anthropic, url=null, streaming_url=null.;" }, "status": 400 } 

Not found:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ: { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict2", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2" } } RS: { "error": { "root_cause": [ { "type": "unified_chat_completion_exception", "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-chat-completion] status [404]. Error message: [<!DOCTYPE html>\n<html lang=en>\n <meta charset=utf-8>\n <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n <title>Error 404 (Not Found)!!1</title>\n <style>\n *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and 
(-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n </style>\n <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n <p><b>404.</b> <ins>That’s an error.</ins>\n <p>The requested URL <code>/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2</code> was not found on this server. <ins>That’s all we know.</ins>\n]" } ], "type": "status_exception", "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.", "caused_by": { "type": "unified_chat_completion_exception", "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-chat-completion] status [404]. Error message: [<!DOCTYPE html>\n<html lang=en>\n <meta charset=utf-8>\n <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n <title>Error 404 (Not Found)!!1</title>\n <style>\n *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 
100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n </style>\n <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n <p><b>404.</b> <ins>That’s an error.</ins>\n <p>The requested URL <code>/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2</code> was not found on this server. <ins>That’s all we know.</ins>\n]" } }, "status": 400 } 

No streaming_url (url is default for both streaming/non-streaming):

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ: { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict" } } RS: { "inference_id": "google-model-garden-anthropic-chat-completion", "task_type": "chat_completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 

No url (streaming_url is default for both streaming/non-streaming):

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ: { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" } } RS: { "inference_id": "google-model-garden-anthropic-chat-completion", "task_type": "chat_completion", "service": "googlevertexai", "service_settings": { "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 
Perform Chat Completion

Basic:

POST {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion/_stream RQ { "messages": [ { "role": "user", "content": "What is deep learning?" } ], "max_completion_tokens": 100 } RS event: message data: {"id":"msg_vrtx_01X7rphiRiVbTsKBiEJ2ejjR","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of machine learning an"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"d artificial intelligence that uses"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" artificial neural networks with"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" multiple layers ("},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"hence \""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" higher-level features from"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" raw input"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". 
Here"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" are key characteristics"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":\n\n1"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". Core Components"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Neural networks"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" with multiple hidden"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Ability"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" to learn complex patterns from"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" large datasets"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Inspired by the"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" structure of"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" human brain"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neurons\n\n2. 
Key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Techniques:\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Convolutional Neural"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Networks (CNNs"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":")\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}} event: message data: [DONE] 

Complex

POST {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion/_stream RQ { "messages": [ { "role": "user", "content": [ { "type": "text", "text": "What is the price of scarf?" } ] } ], "max_completion_tokens": 100, "temperature": 0.2, "top_p": 0.2, "tools": [ { "type": "auto", "function": { "name": "get_current_price", "description": "Get the current price of a item", "parameters": { "type": "object", "properties": { "item": { "id": "123" } } } } } ], "tool_choice": { "type": "auto", "function": { "name": "get_current_price" } } } RS event: message data: {"id":"msg_vrtx_01C6WmQ5QCsxv9WZvSwniHiw","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":2,"prompt_tokens":330,"total_tokens":332}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"I'll"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" help"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" you find the current price"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of a scarf."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" I"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"'ll"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" use the get_current"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"_price function"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" to"},"index":0}],"model":null,"object":null} event: message data: 
{"id":null,"choices":[{"delta":{"content":" retrieve"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" this"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" information."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"id":"toolu_vrtx_01Dug9z7HJSRPfAExfssoQ1z","function":{"arguments":"{}","name":"get_current_price"},"type":null}]},"index":1}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":""},"type":null}]},"index":1}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\""},"type":null}]},"index":1}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"item\": \""},"type":null}]},"index":1}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"scarf\"}"},"type":null}]},"index":1}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"tool_use","index":0}],"model":null,"object":null,"usage":{"completion_tokens":85,"prompt_tokens":0,"total_tokens":85}} event: message data: [DONE] 
@elasticsearchmachine elasticsearchmachine added the v9.2.0 and external-contributor labels Sep 3, 2025
…ntegration # Conflicts: #	server/src/main/java/org/elasticsearch/TransportVersions.java #	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/googlevertexai/action/GoogleVertexAiActionCreator.java #	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/googlevertexai/request/completion/GoogleVertexAiUnifiedChatCompletionRequest.java #	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googlevertexai/action/GoogleVertexAiUnifiedChatCompletionActionTests.java #	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googlevertexai/completion/GoogleVertexAiChatCompletionModelTests.java #	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googlevertexai/request/completion/GoogleVertexAiUnifiedChatCompletionRequestTests.java
…ntegration # Conflicts: #	server/src/main/java/org/elasticsearch/TransportVersions.java
…ntegration # Conflicts: #	server/src/main/java/org/elasticsearch/TransportVersions.java
… to support new content block types and improve parsing logic
… parser and add unit tests for response validation
…ity to validate serialization of user fields
@Jan-Kazlouski-elastic Jan-Kazlouski-elastic marked this pull request as ready for review September 11, 2025 14:00
@elasticsearchmachine elasticsearchmachine added the needs:triage label Sep 11, 2025
@Jan-Kazlouski-elastic
Contributor Author

Hello @jonathan-buttner @dan-rubinstein
Could you please take a look at this PR? It is out of the draft state.

public static final TransportVersion ESQL_DOCUMENTS_FOUND_AND_VALUES_LOADED_8_19 = def(8_841_0_61);
public static final TransportVersion ESQL_PROFILE_INCLUDE_PLAN_8_19 = def(8_841_0_62);
public static final TransportVersion INITIAL_ELASTICSEARCH_8_19_4 = def(8_841_0_68);
public static final TransportVersion ML_INFERENCE_GOOGLE_MODEL_GARDEN_ADDED_8_19 = def(8_841_0_69);
Contributor Author

@Jan-Kazlouski-elastic Jan-Kazlouski-elastic Sep 11, 2025

Let me know if this needs to be removed. I haven't seen backports in a while, but Google Vertex AI has been there for quite some time, so we'd probably require one.

Contributor

Let's remove this, we won't be backporting the changes.

Contributor Author

Removed.

…ntegration # Conflicts: #	server/src/main/java/org/elasticsearch/TransportVersions.java
…ntegration
# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/googlevertexai/completion/GoogleVertexAiChatCompletionServiceSettings.java
@Jan-Kazlouski-elastic
Contributor Author

@jonathan-buttner your comments are addressed. Could you please take a look at the PR once more?

Contributor

@jonathan-buttner jonathan-buttner left a comment

Looking good, just a couple more changes.

}

public GoogleVertexAiChatCompletionTaskSettings(StreamInput in) throws IOException {
thinkingConfig = new ThinkingConfig(in);
TransportVersion version = in.getTransportVersion();
if (GoogleVertexAiUtils.supportsModelGarden(version)) {
maxTokens = Objects.requireNonNullElse(in.readOptionalInt(), DEFAULT_MAX_TOKENS);
maxTokens = in.readOptionalInt();
Contributor

Can we use readOptionalVInt?

Contributor Author

Good thinking. Done.

@@ -124,7 +124,9 @@ public TransportVersion getMinimalSupportedVersion() {
@Override
public void writeTo(StreamOutput out) throws IOException {
thinkingConfig.writeTo(out);
out.writeOptionalInt(maxTokens);
if (GoogleVertexAiUtils.supportsModelGarden(out.getTransportVersion())) {
out.writeOptionalInt(maxTokens);
Contributor

Let's use writeOptionalVInt

Contributor Author

Done.
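
For readers following the `readOptionalVInt` / `writeOptionalVInt` suggestion, here is a minimal, self-contained sketch of the idea: a presence byte followed by a variable-length integer, written only when the peer's version supports the field. It uses plain `java.io` streams as a stand-in; the class `OptionalVIntSketch`, the `peerSupportsField` flag, and the choice to decode to `null` for older peers are assumptions for illustration, not the Elasticsearch `StreamInput`/`StreamOutput` implementation or the code in this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative stand-in for a stream codec; not the Elasticsearch stream classes. */
final class OptionalVIntSketch {

    /** Writes a presence byte, then the value 7 bits at a time, low bits first. */
    static void writeOptionalVInt(DataOutputStream out, Integer value) throws IOException {
        if (value == null) {
            out.writeBoolean(false);
            return;
        }
        out.writeBoolean(true);
        int v = value;
        while ((v & ~0x7F) != 0) {
            out.writeByte((v & 0x7F) | 0x80); // high bit set: more bytes follow
            v >>>= 7;
        }
        out.writeByte(v);
    }

    /** Reads the presence byte and, if set, the variable-length value. */
    static Integer readOptionalVInt(DataInputStream in) throws IOException {
        if (in.readBoolean() == false) {
            return null;
        }
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.readUnsignedByte();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("malformed variable-length int");
    }

    /** Hypothetical version gate: older peers never see the field on the wire. */
    static byte[] encode(Integer maxTokens, boolean peerSupportsField) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (peerSupportsField) {
                writeOptionalVInt(out, maxTokens);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Mirror of encode: the field is read only when the peer is known to have written it. */
    static Integer decode(byte[] data, boolean peerSupportsField) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return peerSupportsField ? readOptionalVInt(in) : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode(10, true).length);
        System.out.println(decode(encode(300, true), true));
    }
}
```

In this sketch a small value such as `max_tokens = 10` takes two bytes (presence byte plus one value byte), whereas a fixed-width optional int would take five. The read and write sides must apply the same version check, otherwise the two ends disagree about the stream layout.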

Comment on lines 189 to 204
delta = new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta(
    null,
    null,
    null,
    List.of(
        new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta.ToolCall(
            0,
            id,
            new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta.ToolCall.Function(
                input != null ? input.toString() : null,
                name
            ),
            null
        )
    )
);
Contributor

For readability, this might be better as:

var function = new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta.ToolCall.Function(
    input != null ? input.toString() : null,
    name
);
var toolCall = new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta.ToolCall(0, id, function, null);
delta = new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta(null, null, null, List.of(toolCall));

Similar changes can be made in the parseContentBlockDelta() method.

Contributor Author

Thanks. Done!

…ntegration # Conflicts: #	server/src/main/resources/transport/upper_bounds/9.2.csv
…ntegration # Conflicts: #	server/src/main/resources/transport/upper_bounds/9.2.csv
@Jan-Kazlouski-elastic
Contributor Author

@jonathan-buttner @DonalEvans

Your comments are addressed. Could you please review the fixes?

Contributor

@jonathan-buttner jonathan-buttner left a comment

Thanks for the changes!

@Jan-Kazlouski-elastic
Contributor Author

Create Completion Endpoint

No Provider No URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 10 } } RS { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;" } ], "type": "validation_exception", "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;" }, "status": 400 } 

Google Provider With URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "google", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 10 } } RS { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;" } ], "type": "validation_exception", "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;" }, "status": 400 } 

Google Provider No URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
  "service": "googlevertexai",
  "service_settings": {
    "provider": "google",
    "service_account_json": {{service_account_config}}
  },
  "task_settings": {
    "max_tokens": 10
  }
}
RS
{
  "error": {
    "root_cause": [
      {
        "type": "validation_exception",
        "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=null, project_id=null, model_id=null;"
      }
    ],
    "type": "validation_exception",
    "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=null, project_id=null, model_id=null;"
  },
  "status": 400
}

No URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
  "service": "googlevertexai",
  "service_settings": {
    "provider": "anthropic",
    "service_account_json": {{service_account_config}}
  },
  "task_settings": {
    "max_tokens": 10
  }
}
RS
{
  "error": {
    "root_cause": [
      {
        "type": "validation_exception",
        "reason": "Validation Failed: 1: Google Model Garden provider=anthropic selected. Either 'uri' or 'streaming_uri' must be provided;"
      }
    ],
    "type": "validation_exception",
    "reason": "Validation Failed: 1: Google Model Garden provider=anthropic selected. Either 'uri' or 'streaming_uri' must be provided;"
  },
  "status": 400
}

Both URLs:

 PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-1 RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 10 } } RS { "inference_id": "google-model-garden-anthropic-completion-1", "task_type": "completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } }, "task_settings": { "max_tokens": 10 } } 

Only Non-Streaming URL:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-2 RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict" }, "task_settings": { "max_tokens": 10 } } RS { "inference_id": "google-model-garden-anthropic-completion-2", "task_type": "completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } }, "task_settings": { "max_tokens": 10 } } 

Only Streaming URL:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-3 RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 10 } } RS { "inference_id": "google-model-garden-anthropic-completion-3", "task_type": "completion", "service": "googlevertexai", "service_settings": { "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } }, "task_settings": { "max_tokens": 10 } } 

No Task Parameters:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-4 RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" } } RS { "inference_id": "google-model-garden-anthropic-completion-4", "task_type": "completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } } } 

Not Found:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-5 RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict2", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2" }, "task_settings": { "max_tokens": 10 } } RS { "error": { "root_cause": [ { "type": "status_exception", "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-completion-5] status [404]" } ], "type": "status_exception", "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.", "caused_by": { "type": "status_exception", "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-completion-5] status [404]" } }, "status": 400 } 
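
The creation cases above imply a small set of rules for how `provider`, `url`, and `streaming_url` may be combined. The following is a minimal sketch of those rules as inferred from the responses only; the class `ProviderSettingsValidationSketch`, its method signature, and the message wording are hypothetical and are not the validation code in this PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical restatement of the rules visible in the responses above. */
final class ProviderSettingsValidationSketch {

    enum Provider { GOOGLE, ANTHROPIC }

    static List<String> validate(
        Provider provider,
        String url,
        String streamingUrl,
        String location,
        String projectId,
        String modelId
    ) {
        List<String> errors = new ArrayList<>();
        boolean hasAnyUrl = url != null || streamingUrl != null;

        if (provider == null || provider == Provider.GOOGLE) {
            // Google Vertex AI models are addressed by location, project and model, never by URL.
            if (hasAnyUrl) {
                errors.add("'url' and 'streaming_url' must not be provided for Google Vertex AI models");
            } else if (location == null || projectId == null || modelId == null) {
                errors.add("'location', 'project_id', and 'model_id' must be provided for Google Vertex AI models");
            }
        } else if (hasAnyUrl == false) {
            // Model Garden providers need at least one of the two URLs.
            errors.add("either 'url' or 'streaming_url' must be provided for Model Garden provider " + provider);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate(Provider.ANTHROPIC, null, null, null, null, null));
        System.out.println(validate(Provider.ANTHROPIC, "https://example.invalid/model:rawPredict", null, null, null, null));
    }
}
```

The `else if` in the Google branch mirrors the transcripts, where a request with URLs and no `location` reports only the URL error. The "Not Found" case is a different kind of failure: the settings pass validation and the error comes from the validation call to the remote service.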
Perform Non-Streaming Completion

Non-Streaming Both URLs

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-1
RQ
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "task_settings": {
    "max_tokens": 20
  }
}
RS
{
  "completion": [
    {
      "result": "This is the famous opening line from William Gibson's groundbreaking cyberpunk novel \"Neu"
    }
  ]
}

Non-Streaming Only Non-Streaming URL

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-2
RQ
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "task_settings": {
    "max_tokens": 20
  }
}
RS
{
  "completion": [
    {
      "result": "This is the famous opening line from William Gibson's seminal cyberpunk novel \"Neurom"
    }
  ]
}

Non-Streaming Only Streaming URL

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-3
RQ
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "task_settings": {
    "max_tokens": 20
  }
}
RS
{
  "completion": [
    {
      "result": "This is the famous opening line from William Gibson's groundbreaking cyberpunk novel \"Neu"
    }
  ]
}

Non-Streaming Without Task Settings

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-4
RQ
{
  "input": "The sky above the port was the color of television tuned to a dead channel."
}
RS
{
  "completion": [
    {
      "result": "This is the famous opening line from William Gibson's groundbreaking cyberpunk novel \"Neuromancer,\" published in 1984. The line is notable for its poetic description of the sky using a technological metaphor, which was characteristic of Gibson's innovative writing style.\n\nIn the era when the book was written, a dead television channel would typically display static - a gray, fuzzy, slightly shifting monochromatic screen. So the line suggests a sky that is gray, bleak, and somewhat undefined, creating an immediate atmosphere of technological melancholy.\n\nThis sentence has become one of the most famous opening lines in science fiction literature, often cited as an example of the cyberpunk genre's aesthetic: a gritty, technological world where technology and human experience are deeply intertwined.\n\nWould you like me to discuss more about the novel, its context, or cyberpunk as a literary genre?"
    }
  ]
}
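
The non-streaming calls above succeed whether the endpoint was created with both URLs, only `url`, or only `streaming_url`. One way that behaviour could arise is a simple fallback: prefer the URL that matches the request mode and otherwise use whichever URL is configured. The sketch below illustrates that idea only; `ModelGardenUrlSelectionSketch` and `selectUrl` are hypothetical names, and the fallback rule is an assumption inferred from the test results rather than a description of the code in this PR.

```java
import java.util.Optional;

/** Hypothetical URL selection with fallback between the two configured URLs. */
final class ModelGardenUrlSelectionSketch {

    static Optional<String> selectUrl(String url, String streamingUrl, boolean streaming) {
        String preferred = streaming ? streamingUrl : url;
        String fallback = streaming ? url : streamingUrl;
        // Empty only when neither URL is configured, which endpoint creation rejects.
        return Optional.ofNullable(preferred != null ? preferred : fallback);
    }

    public static void main(String[] args) {
        // Non-streaming request against an endpoint that only has a streaming URL.
        System.out.println(selectUrl(null, "https://example.invalid/model:streamRawPredict", false));
    }
}
```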
Perform Streaming Completion

Streaming Both URLs

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-1/_stream
RQ
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "task_settings": {
    "max_tokens": 20
  }
}
RS
event: message
data: {"completion":[{"delta":"This"}]}
event: message
data: {"completion":[{"delta":" is the"}]}
event: message
data: {"completion":[{"delta":" opening"}]}
event: message
data: {"completion":[{"delta":" line of William"},{"delta":" Gibson"}]}
event: message
data: {"completion":[{"delta":"'s groun"}]}
event: message
data: {"completion":[{"delta":"dbreaking cyberpunk"}]}
event: message
data: {"completion":[{"delta":" novel \""}]}
event: message
data: {"completion":[{"delta":"Neurom"}]}
event: message
data: [DONE]

Streaming Only Non-Streaming URL

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-2/_stream
RQ
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "task_settings": {
    "max_tokens": 20
  }
}
RS
event: message
data: {"completion":[{"delta":"This is the"}]}
event: message
data: {"completion":[{"delta":" famous"},{"delta":" opening"},{"delta":" line from William"},{"delta":" Gibson"},{"delta":"'s sem"},{"delta":"inal cyb"},{"delta":"erpunk novel \""},{"delta":"Neurom"}]}
event: message
data: [DONE]

Streaming Only Streaming URL

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-3/_stream
RQ
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "task_settings": {
    "max_tokens": 20
  }
}
RS
event: message
data: {"completion":[{"delta":"This is the"}]}
event: message
data: {"completion":[{"delta":" famous"},{"delta":" opening"},{"delta":" line from William"},{"delta":" Gibson"},{"delta":"'s sem"},{"delta":"inal cyb"},{"delta":"erpunk novel \""},{"delta":"Neurom"}]}
event: message
data: [DONE]

Streaming Without Task Settings

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-4/_stream RQ { "input": "The sky above the port was the color of television tuned to a dead channel." } RS event: message data: {"completion":[{"delta":"This"},{"delta":" is"},{"delta":" the"}]} event: message data: {"completion":[{"delta":" opening"}]} event: message data: {"completion":[{"delta":" line of William"}]} event: message data: {"completion":[{"delta":" Gibson"},{"delta":"'s sem"}]} event: message data: {"completion":[{"delta":"inal cyb"},{"delta":"erpunk novel \""}]} event: message data: {"completion":[{"delta":"Neuromancer,\""}]} event: message data: {"completion":[{"delta":" published in 1984"},{"delta":". It's"}]} event: message data: {"completion":[{"delta":" a"},{"delta":" famous"}]} event: message data: {"completion":[{"delta":" an"}]} event: message data: {"completion":[{"delta":"d much"},{"delta":"-discusse"}]} event: message data: {"completion":[{"delta":"d first"}]} event: message data: {"completion":[{"delta":" sentence that creates"}]} event: message data: {"completion":[{"delta":" an"}]} event: message data: {"completion":[{"delta":" ev"},{"delta":"ocative, mo"}]} event: message data: {"completion":[{"delta":"ody image"},{"delta":"."}]} event: message data: {"completion":[{"delta":" \n\nIn"}]} event: message data: {"completion":[{"delta":" the"}]} event: message data: {"completion":[{"delta":" early"}]} event: message data: {"completion":[{"delta":" 1980s,"}]} event: message data: {"completion":[{"delta":" when"},{"delta":" the book was written,"},{"delta":" a"},{"delta":" television"},{"delta":" tu"}]} event: message data: {"completion":[{"delta":"ned to a dead channel"}]} event: message data: {"completion":[{"delta":" woul"}]} event: message data: {"completion":[{"delta":"d display"}]} event: message data: {"completion":[{"delta":" static"}]} event: message data: {"completion":[{"delta":" -"}]} event: message data: {"completion":[{"delta":" a gray"}]} event: 
message data: {"completion":[{"delta":", fuz"}]} event: message data: {"completion":[{"delta":"zy,"}]} event: message data: {"completion":[{"delta":" slightly"}]} event: message data: {"completion":[{"delta":" blu"}]} event: message data: {"completion":[{"delta":"ish-"},{"delta":"white noise"}]} event: message data: {"completion":[{"delta":". So"},{"delta":" the line"}]} event: message data: {"completion":[{"delta":" suggests"}]} event: message data: {"completion":[{"delta":" a"},{"delta":" sky"}]} event: message data: {"completion":[{"delta":" that"}]} event: message data: {"completion":[{"delta":" is"}]} event: message data: {"completion":[{"delta":" blank"}]} event: message data: {"completion":[{"delta":", neutral"}]} event: message data: {"completion":[{"delta":", somewhat"}]} event: message data: {"completion":[{"delta":" bl"},{"delta":"eak and technological"}]} event: message data: {"completion":[{"delta":" -"}]} event: message data: {"completion":[{"delta":" which"}]} event: message data: {"completion":[{"delta":" perfectly"},{"delta":" sets"}]} event: message data: {"completion":[{"delta":" the tone"}]} event: message data: {"completion":[{"delta":" for the cyb"},{"delta":"erpunk genre"}]} event: message data: {"completion":[{"delta":" Gibson"}]} event: message data: {"completion":[{"delta":" helpe"},{"delta":"d create"}]} event: message data: {"completion":[{"delta":"."}]} event: message data: {"completion":[{"delta":"\n\nThe"}]} event: message data: {"completion":[{"delta":" line"}]} event: message data: {"completion":[{"delta":" is"},{"delta":" considere"}]} event: message data: {"completion":[{"delta":"d a masterful"}]} event: message data: {"completion":[{"delta":" piece"},{"delta":" of descript"},{"delta":"ive writing"}]} event: message data: {"completion":[{"delta":","}]} event: message data: {"completion":[{"delta":" using"},{"delta":" a"}]} event: message data: {"completion":[{"delta":" then"}]} event: message data: 
{"completion":[{"delta":"-contemporary"},{"delta":" technological"}]} event: message data: {"completion":[{"delta":" reference"}]} event: message data: {"completion":[{"delta":" to create a po"}]} event: message data: {"completion":[{"delta":"etic an"}]} event: message data: {"completion":[{"delta":"d atmospheric"}]} event: message data: {"completion":[{"delta":" description"}]} event: message data: {"completion":[{"delta":" of the"}]} event: message data: {"completion":[{"delta":" sky"}]} event: message data: {"completion":[{"delta":". It immediately"}]} event: message data: {"completion":[{"delta":" establishes the novel"}]} event: message data: {"completion":[{"delta":"'s blen"},{"delta":"d of high"}]} event: message data: {"completion":[{"delta":"-tech imagery"}]} event: message data: {"completion":[{"delta":" and g"}]} event: message data: {"completion":[{"delta":"ritty, dystop"}]} event: message data: {"completion":[{"delta":"ian moo"}]} event: message data: {"completion":[{"delta":"d.\n\nWoul"}]} event: message data: {"completion":[{"delta":"d you like me to discuss"}]} event: message data: {"completion":[{"delta":" the"}]} event: message data: {"completion":[{"delta":" novel"},{"delta":","}]} event: message data: {"completion":[{"delta":" the"}]} event: message data: {"completion":[{"delta":" line"}]} event: message data: {"completion":[{"delta":", or cyb"}]} event: message data: {"completion":[{"delta":"erpunk literature"},{"delta":" further"}]} event: message data: {"completion":[{"delta":"?"}]} event: message data: [DONE] 
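
The streaming responses above arrive as `event: message` / `data: {...}` pairs, where each `data` payload carries one or more `delta` fragments and the stream ends with `data: [DONE]`. A client that wants the full text has to concatenate the fragments in order. Below is a minimal sketch of that step, assuming one field per line; `CompletionStreamAssemblerSketch` is a hypothetical helper, and it extracts the fragments with a regular expression only to stay dependency-free. A real client would more likely use a JSON parser.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that joins the "delta" fragments of a completion stream. */
final class CompletionStreamAssemblerSketch {

    // Matches each "delta":"..." string value, allowing backslash escapes inside the value.
    private static final Pattern DELTA = Pattern.compile("\"delta\":\"((?:[^\"\\\\]|\\\\.)*)\"");

    static String assemble(List<String> sseLines) {
        StringBuilder text = new StringBuilder();
        for (String line : sseLines) {
            if (line.startsWith("data:") == false) {
                continue; // skip "event: message" lines and blank separators
            }
            String payload = line.substring("data:".length()).trim();
            if (payload.equals("[DONE]")) {
                break; // end-of-stream marker
            }
            Matcher matcher = DELTA.matcher(payload);
            while (matcher.find()) {
                text.append(unescape(matcher.group(1)));
            }
        }
        return text.toString();
    }

    // Turns \n into a newline and drops the backslash from any other escaped character
    // (covers \" and \\); unicode escapes are deliberately not handled in this sketch.
    static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i);
                out.append(next == 'n' ? '\n' : next);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "event: message",
            "data: {\"completion\":[{\"delta\":\"This\"}]}",
            "event: message",
            "data: {\"completion\":[{\"delta\":\" is the\"}]}",
            "event: message",
            "data: [DONE]"
        );
        System.out.println(assemble(lines));
    }
}
```

Note that a single event may carry several fragments, as in the transcripts above, so the inner loop appends every match rather than only the first.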
@Jan-Kazlouski-elastic
Contributor Author

Jan-Kazlouski-elastic commented Sep 29, 2025

Create Chat Completion Endpoint

No Provider No URLs:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ { "service": "googlevertexai", "service_settings": { "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 5 } } RS { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;" } ], "type": "validation_exception", "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;" }, "status": 400 } 

Google Provider With URLs:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion RQ { "service": "googlevertexai", "service_settings": { "provider": "google", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 5 } } RS { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;" } ], "type": "validation_exception", "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;" }, "status": 400 } 

Google Provider No URLs:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ
{
  "service": "googlevertexai",
  "service_settings": {
    "provider": "google",
    "service_account_json": {{service_account_config}}
  },
  "task_settings": {
    "max_tokens": 5
  }
}
RS
{
  "error": {
    "root_cause": [
      {
        "type": "validation_exception",
        "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=null, project_id=null, model_id=null;"
      }
    ],
    "type": "validation_exception",
    "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=null, project_id=null, model_id=null;"
  },
  "status": 400
}

No URLs:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ
{
  "service": "googlevertexai",
  "service_settings": {
    "provider": "anthropic",
    "service_account_json": {{service_account_config}}
  },
  "task_settings": {
    "max_tokens": 5
  }
}
RS
{
  "error": {
    "root_cause": [
      {
        "type": "validation_exception",
        "reason": "Validation Failed: 1: Google Model Garden provider=anthropic selected. Either 'uri' or 'streaming_uri' must be provided;"
      }
    ],
    "type": "validation_exception",
    "reason": "Validation Failed: 1: Google Model Garden provider=anthropic selected. Either 'uri' or 'streaming_uri' must be provided;"
  },
  "status": 400
}

Both URLs:

 PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-1 RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict" }, "task_settings": { "max_tokens": 5 } } RS { "inference_id": "google-model-garden-anthropic-chat-completion-1", "task_type": "completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } }, "task_settings": { "max_tokens": 10 } } 

Only Non-Streaming URL:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-2 RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict" }, "task_settings": { "max_tokens": 5 } } RS { "inference_id": "google-model-garden-anthropic-chat-completion-2", "task_type": "chat_completion", "service": "googlevertexai", "service_settings": { "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict", "provider": "ANTHROPIC", "rate_limit": { "requests_per_minute": 1000 } }, "task_settings": { "max_tokens": 5 } } 

Only Streaming URL:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-3
RQ
{
  "service": "googlevertexai",
  "service_settings": {
    "provider": "anthropic",
    "service_account_json": {{service_account_config}},
    "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
  },
  "task_settings": {
    "max_tokens": 5
  }
}
RS
{
  "inference_id": "google-model-garden-anthropic-chat-completion-3",
  "task_type": "chat_completion",
  "service": "googlevertexai",
  "service_settings": {
    "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
    "provider": "ANTHROPIC",
    "rate_limit": {
      "requests_per_minute": 1000
    }
  },
  "task_settings": {
    "max_tokens": 5
  }
}

No Task Parameters:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-4
RQ
{
  "service": "googlevertexai",
  "service_settings": {
    "provider": "anthropic",
    "service_account_json": {{service_account_config}},
    "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
    "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
  }
}
RS
{
  "inference_id": "google-model-garden-anthropic-chat-completion-4",
  "task_type": "chat_completion",
  "service": "googlevertexai",
  "service_settings": {
    "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
    "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
    "provider": "ANTHROPIC",
    "rate_limit": {
      "requests_per_minute": 1000
    }
  }
}
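
The creation requests above differ only in which of `url`, `streaming_url`, and `task_settings.max_tokens` are present. A minimal sketch of assembling those bodies is shown here. The helper `build_endpoint_body` is hypothetical (it is not part of this PR or of Elasticsearch), and the service account value is a placeholder:

```python
import json

# Model path copied from the requests above; the suffix selects the
# non-streaming (:rawPredict) or streaming (:streamRawPredict) method.
MODEL = (
    "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev"
    "/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku"
)


def build_endpoint_body(service_account_json, url=None, streaming_url=None, max_tokens=None):
    """Hypothetical helper: build the PUT body for one test variation."""
    service_settings = {
        "provider": "anthropic",
        "service_account_json": service_account_json,
    }
    # Include only the URLs that the variation under test supplies.
    if url is not None:
        service_settings["url"] = url
    if streaming_url is not None:
        service_settings["streaming_url"] = streaming_url

    body = {"service": "googlevertexai", "service_settings": service_settings}
    # task_settings is omitted entirely for the "No Task Parameters" case.
    if max_tokens is not None:
        body["task_settings"] = {"max_tokens": max_tokens}
    return body


if __name__ == "__main__":
    only_streaming = build_endpoint_body(
        "<service account JSON>",  # placeholder, not a real credential
        streaming_url=MODEL + ":streamRawPredict",
        max_tokens=5,
    )
    print(json.dumps(only_streaming, indent=2))
```

Building the body from optional arguments keeps each variation (both URLs, one URL, no task settings) a one-line call, which makes it easier to see what a given test case actually sent.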

Not Found:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-5 RQ { "service": "googlevertexai", "service_settings": { "provider": "anthropic", "service_account_json": {{service_account_config}}, "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict2", "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2" }, "task_settings": { "max_tokens": 5 } } RS { "error": { "root_cause": [ { "type": "unified_chat_completion_exception", "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-chat-completion] status [404]. Error message: [<!DOCTYPE html>\n<html lang=en>\n <meta charset=utf-8>\n <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n <title>Error 404 (Not Found)!!1</title>\n <style>\n *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and 
(-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n </style>\n <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n <p><b>404.</b> <ins>That’s an error.</ins>\n <p>The requested URL <code>/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2</code> was not found on this server. <ins>That’s all we know.</ins>\n]" } ], "type": "status_exception", "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.", "caused_by": { "type": "unified_chat_completion_exception", "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-chat-completion] status [404]. Error message: [<!DOCTYPE html>\n<html lang=en>\n <meta charset=utf-8>\n <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n <title>Error 404 (Not Found)!!1</title>\n <style>\n *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 
100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n </style>\n <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n <p><b>404.</b> <ins>That’s an error.</ins>\n <p>The requested URL <code>/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2</code> was not found on this server. <ins>That’s all we know.</ins>\n]" } }, "status": 400 } 
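
In the response above, the top-level `status` is 400, while the 404 returned by the upstream service appears only inside the `reason` text. A small sketch of pulling that upstream code out of the error body follows. The function name `upstream_status` is hypothetical, the sample body is shortened by hand (the HTML error page is cut), and the regular expression assumes the `status [NNN]` wording seen in this log:

```python
import re

# Matches the "status [404]" fragment seen in the reason string above.
STATUS_PATTERN = re.compile(r"status \[(\d{3})\]")


def upstream_status(error_body):
    """Return the upstream HTTP status embedded in the reason text, or None."""
    reason = error_body.get("error", {}).get("caused_by", {}).get("reason", "")
    match = STATUS_PATTERN.search(reason)
    return int(match.group(1)) if match else None


# Hand-shortened copy of the response above; the HTML payload is cut.
sample = {
    "error": {
        "type": "status_exception",
        "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.",
        "caused_by": {
            "type": "unified_chat_completion_exception",
            "reason": (
                "Received an unsuccessful status code for request from inference entity id "
                "[google-model-garden-anthropic-chat-completion] status [404]. "
                "Error message: [<!DOCTYPE html>...]"
            ),
        },
    },
    "status": 400,
}

if __name__ == "__main__":
    print(sample["status"], upstream_status(sample))
```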

Streaming chat completion was tested and confirmed to work.

@Jan-Kazlouski-elastic

Perform Chat Completion

Both URLs

event: message data: {"id":"msg_vrtx_01DtbJbxQxXDC2NM98ex3eUY","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":5,"prompt_tokens":0,"total_tokens":5}} event: message data: [DONE] 
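
The stream above is a sequence of server-sent events whose `data:` payloads carry incremental `delta.content` fragments, ending with a `finish_reason` chunk and a `[DONE]` sentinel. A minimal sketch of reassembling the text from such a stream is shown here. It assumes each `event:` and `data:` field arrives on its own line (the single-line form in this log looks like a rendering artifact), and the function names are hypothetical:

```python
import json


def iter_sse_data(lines):
    """Yield the payload of every `data:` line, ignoring other SSE fields."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield line[len("data:"):].strip()


def collect_chat_completion(lines):
    """Join delta.content fragments; return (text, finish_reason, last usage)."""
    parts = []
    finish_reason = None
    usage = None
    for payload in iter_sse_data(lines):
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
        if chunk.get("usage"):
            usage = chunk["usage"]  # keep the most recent usage block
    return "".join(parts), finish_reason, usage


# The "Both URLs" stream from above, split into one SSE field per line.
SAMPLE = [
    'event: message',
    'data: {"id":"msg_vrtx_01DtbJbxQxXDC2NM98ex3eUY","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}',
    'event: message',
    'data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}',
    'event: message',
    'data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}',
    'event: message',
    'data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":5,"prompt_tokens":0,"total_tokens":5}}',
    'event: message',
    'data: [DONE]',
]

if __name__ == "__main__":
    text, reason, usage = collect_chat_completion(SAMPLE)
    print(repr(text), reason, usage)
```

Running the same function over the longer streams below should give the full generated text and make it easy to compare `finish_reason` and `usage` across the test cases.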

Both URLs With Max Tokens in RQ

event: message data: {"id":"msg_vrtx_01H9UMg6c8ey3rhtAsN6uomT","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" artificial"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neural networks with multiple layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" ("},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"hence"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" \""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" higher"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"-level features"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" from"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" raw"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" 
input"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Here"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" are key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" points about deep learning:"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\n1"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". Core"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Characteristics"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Uses"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" artificial"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neural networks with many"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Can"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" automatically"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" learn representations"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" from data\n-"},"index":0}],"model":null,"object":null} event: message data: 
{"id":null,"choices":[{"delta":{"content":" Mim"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ics the way"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" human brain"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" processes information"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Capable"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of learning complex patterns an"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"d features\n\n2. Key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Components"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":\n- Neural"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" networks"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" with multiple"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" hidden"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}} event: message data: [DONE] 

Only Non-Streaming URL

event: message data: {"id":"msg_vrtx_018Ci31jn9VXVh3F8SGuePA5","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":5,"prompt_tokens":0,"total_tokens":5}} event: message data: [DONE] 

Only Non-Streaming URL With Max Tokens in RQ

event: message data: {"id":"msg_vrtx_01KiN7pMLqYfXWGvsR7wfkzg","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" artificial neural networks with"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" multiple layers ("},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"hence"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" \""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ively extract higher"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"-level features from raw"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" input"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Here"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" are key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" 
points about deep learning:"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\n1"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". Core"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Characteristics"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Uses"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neural networks with many"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" hidden"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Can"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" automatically"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" learn features"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" from data\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Capable of handling"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" complex"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":", non-linear relationships"},"index":0}],"model":null,"object":null} event: message data: 
{"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Requires"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" large amounts of data"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" for"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" training\n\n2. Neural"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Network"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Structure"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":\n- Input"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" layer"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}} event: message data: [DONE] 

Only Streaming URL

event: message data: {"id":"msg_vrtx_01AwvZxPifsMPLhWyDCy92Zo","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":5,"prompt_tokens":0,"total_tokens":5}} event: message data: [DONE] 

Only Streaming URL With Max Tokens in RQ

event: message data: {"id":"msg_vrtx_01MGYZfdQf2LpreTA6xgJexk","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" artificial neural networks with multiple"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" ("},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"hence \"deep\") to"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" progress"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" higher-level features from"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" raw"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" input."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Here"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" are key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" 
characteristics"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\n1"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". Core"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Components"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Artificial"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neural networks"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Multiple hidden"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Ability"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" to learn from large"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" amounts"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of data"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Mim"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ics human"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" 
brain"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"'s"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neural"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" processing"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\n2. Key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Techniques"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":\n- Con"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"volutional Neural"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Networks (CNNs"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":")"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Rec"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}} event: message data: [DONE] 

Both URLs, No Task Settings on Creation

event: message data: {"id":"msg_vrtx_014AvDDsgg3BhcwxoQxRwuDR","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" artificial neural networks with"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" multiple layers ("},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"hence"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" \""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" higher-level features from"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" raw"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" input"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" 
characteristics"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" include"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":\n\n1"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". Neural"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Network"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Architecture"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Consists"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of interconn"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ected layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of artificial"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neurons"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Multiple"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" hidden layers between"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" input and output layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Capable"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of learning complex"},"index":0}],"model":null,"object":null} event: message data: 
{"id":null,"choices":[{"delta":{"content":","},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" non-linear relationships"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\n2. Key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Features"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Automatic"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" feature extraction"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Can"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" handle"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" un"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"structured data (images"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":", text, audio)"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Self"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"-learning and adaptive"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" capabilities"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\n3. 
Common"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Applications\n- Image"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" recognition"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Speech"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" recognition\n- Natural"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" language processing\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Autonomous"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" vehicles\n- Medical"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" diagnosis"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Predict"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ive analytics\n\n4"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". 
Learning"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Techniques"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Supervised learning"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Unsuperv"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ised learning\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Reinforcement learning\n\n5"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". Popular"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Architectures\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Convolutional Neural"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Networks (CNNs"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":")\n- Rec"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"urrent Neural Networks (R"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"NNs)"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Generative"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Adversarial Networks ("},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"GANs)"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- 
Transform"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ers\n\n6"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Advantages\n- High"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" accuracy"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Can"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" handle complex, large"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"-scale datasets\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Reduces"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" nee"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"d for manual feature engineering"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\n7"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". 
Challenges\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Requires large"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" amounts"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of training"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" data\n- Comput"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ationally intensive\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Can"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" be difficult"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" to interpret"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\nDeep learning has"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" revolutionized artificial"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" intelligence by"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" enabling"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" more"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" sophisticate"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"d an"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"d nu"},"index":0}],"model":null,"object":null} event: message data: 
{"id":null,"choices":[{"delta":{"content":"anced machine"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" learning approaches"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"end_turn","index":0}],"model":null,"object":null,"usage":{"completion_tokens":300,"prompt_tokens":0,"total_tokens":300}} event: message data: [DONE] 

Both URLs, no task settings on creation, with max tokens in RQ:

event: message data: {"id":"msg_vrtx_018RE7gV36k9m8KGnKTeMnGa","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}} event: message data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" artificial"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neural networks with multiple layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" ("},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"hence"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" \""},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" higher"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"-level features from"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" raw"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" input"},"index":0}],"model":null,"object":null} event: message data: 
{"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Here"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" are key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" characteristics"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n\n1"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". Core"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Components"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Artificial"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" neural networks"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Multiple hidden layers"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Complex"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" pattern"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" recognition"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Ability"},"index":0}],"model":null,"object":null} event: message data: 
{"id":null,"choices":[{"delta":{"content":" to learn from large"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" amounts of data\n\n2"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":". Key"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Techniques"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":":\n- Con"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"volutional Neural Networks ("},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"CNNs)"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":"\n- Recurrent"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{"content":" Neural Networks (R"},"index":0}],"model":null,"object":null} event: message data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}} event: message data: [DONE] 
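For anyone consuming these streams on the client side, here is a minimal Python sketch of how the chat_completion chunks above could be reassembled. It assumes the wire format is standard SSE (one `event:` line and one `data:` line per event, separated by blank lines), which the flattened logs suggest but do not show directly. The helper name `collect_chat_stream` is hypothetical and is not part of this PR; the sample chunks are copied from the log above.

```python
import json


def collect_chat_stream(sse_text):
    """Reassemble streamed chat_completion chunks into (text, finish_reason, usage)."""
    parts = []
    finish_reason = None
    usage = None
    for line in sse_text.splitlines():
        if not line.startswith("data:"):
            continue  # skip "event: message" lines and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel seen in the logs
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
            # keep the last non-empty finish_reason
            finish_reason = choice.get("finish_reason") or finish_reason
        # a later usage object replaces an earlier one
        usage = chunk.get("usage") or usage
    return "".join(parts), finish_reason, usage


# A few chunks taken from the log above, laid out as SSE lines.
sample = "\n".join([
    'event: message',
    'data: {"id":"msg_vrtx_018RE7gV36k9m8KGnKTeMnGa","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}',
    '',
    'event: message',
    'data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}',
    '',
    'event: message',
    'data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null}',
    '',
    'event: message',
    'data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}}',
    '',
    'event: message',
    'data: [DONE]',
])

text, finish_reason, usage = collect_chat_stream(sample)
print(text)
print(finish_reason, usage)
```

Keeping the last `usage` object is a design choice of this sketch: in the logs the first chunk and the final chunk both carry `usage`, and the final one reports the completion total.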
@Jan-Kazlouski-elastic
Contributor Author

Regression Tests for Google Vertex AI.

Create Completion endpoint

Success

 RQ { "service": "googlevertexai", "service_settings": { "service_account_json": {{service_account_config}}, "model_id": "gemini-2.5-pro", "location": "us-central1", "project_id": "project_id" } } RS { "inference_id": "google-vertex-ai-completion", "task_type": "completion", "service": "googlevertexai", "service_settings": { "project_id": "project_id", "location": "us-central1", "model_id": "gemini-2.5-pro", "provider": "GOOGLE", "rate_limit": { "requests_per_minute": 1000 } } } 

No model_id

 RQ { "service": "googlevertexai", "service_settings": { "service_account_json": {{service_account_config}}, "location": "us-central1", "project_id": "project_id" } } RS { "error": { "root_cause": [ { "type": "validation_exception", "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=us-central1, project_id=1014491842772, model_id=null;" } ], "type": "validation_exception", "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=us-central1, project_id=1014491842772, model_id=null;" }, "status": 400 } 
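The validation message above names three required fields for Google-provider endpoints. As a rough illustration only, the same rule could be pre-checked on the client before sending the PUT. This Python sketch is not the server-side implementation from this PR: the helper `missing_google_fields` is hypothetical, and it covers only the `location` / `project_id` / `model_id` case quoted in the error, not the Anthropic `url` / `streaming_url` configuration.

```python
# Field names taken from the validation error message above.
REQUIRED_GOOGLE_FIELDS = ("location", "project_id", "model_id")


def missing_google_fields(service_settings):
    """Return the required Google Vertex AI fields that are absent or empty."""
    return [name for name in REQUIRED_GOOGLE_FIELDS if not service_settings.get(name)]


# Mirrors the "No model_id" request body (service account JSON redacted).
settings = {
    "service_account_json": "<redacted>",
    "location": "us-central1",
    "project_id": "project_id",
}

missing = missing_google_fields(settings)
print(missing)
```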
Perform Non-Streaming Completion
 RQ { "input": "The sky above the port was the color of television tuned to a dead channel." } RS { "completion": [ { "result": "That is the iconic opening line of William Gibson's 1984 debut novel, **Neuromancer**.\n\nIt is widely regarded as one of the most brilliant and effective opening sentences in modern literature, especially within science fiction. Here’s a breakdown of why it's so powerful:\n\n### 1. Instant World-Building\nIn a single sentence, Gibson establishes the entire mood and setting of his novel. This isn't a world of blue skies and natural beauty. It's a world where the man-made has superseded the natural to such an extent that the sky itself is described in terms of a technological failure. It immediately signals a gritty, polluted, and dystopian future.\n\n### 2. Subverting Poetic Language\nThe sentence structure is traditionally poetic (\"The sky above the port...\"), but the simile at the end is jarringly modern, ugly, and technical. This clash between a classic literary form and a piece of low-tech jargon perfectly encapsulates the \"high tech, low life\" ethos of the cyberpunk genre that Gibson was pioneering.\n\n### 3. Establishing Tone\nThe image is bleak and unsettling. A \"dead channel\" implies a lack of signal, an absence of information, a void. It's not a peaceful, uniform gray, but a staticky, lifeless, and oppressive color. This sets a noir-ish, melancholic, and anxious tone that persists throughout the novel.\n\n### 4. A Generational Image\nThe meaning of the line has evolved with technology, which is a fascinating aspect of its legacy:\n* **In 1984:** A \"dead channel\" on an analog CRT television was a screen of flickering, staticky, light-gray noise. This is the image Gibson intended—a dynamic, unpleasant, and textured sky.\n* **Today:** For younger readers, a \"dead channel\" might be a solid blue or black screen from a modern digital TV or monitor. 
While different from the original intent, this new interpretation—a flat, empty, digital void—still powerfully conveys a sense of technological alienation.\n\nWilliam Gibson himself has commented on this generational shift, acknowledging that he was thinking of the specific static of an old black-and-white TV, but finds the modern \"blue screen of death\" interpretation equally valid and interesting.\n\nIn short, that one sentence is a masterclass in economical writing. It establishes the setting, tone, and central themes of *Neuromancer* before the story has even begun." } ] } 
Perform Streaming Completion
 RQ { "input": "The sky above the port was the color of television tuned to a dead channel." } RS event: message data: {"completion":[{"delta":"That's the legendary opening line of William Gibson's 1984 novel, ***Neuromancer***.\n\nIt is widely considered one of the most effective and iconic opening sentences in modern literature, especially within science fiction. It does a phenomenal amount of work in just 14 words.\n\nHere's"}]} event: message data: {"completion":[{"delta":" a breakdown of why it's so brilliant:\n\n### 1. It Establishes the Genre (Cyberpunk)\nIn a single stroke, the line marries the natural world with defunct technology.\n* **The Natural:** \"The sky above the port\" is a classic, almost poetic, scenic description"}]} event: message data: {"completion":[{"delta":".\n* **The Artificial & Decayed:** \"...the color of television tuned to a dead channel\" is jarring, modern, and specific. It's not the grey of a storm cloud; it's the specific, staticky, lifeless grey of technological failure.\n\nThis fusion of the natural world with technology ("}]} event: message data: {"completion":[{"delta":"often in a state of decay) is the absolute heart of the cyberpunk genre. Gibson didn't just describe a scene; he announced a new literary sensibility.\n\n### 2. It Sets the Mood (Tone)\nThe image is incredibly bleak. A \"dead channel\" evokes feelings of:\n* **Em"}]} event: message data: {"completion":[{"delta":"ptiness:** A lack of signal, no information, a void.\n* **Alienation:** It's an unnatural, disorienting sight.\n* **Dystopia:** This is not a beautiful, clear blue sky. It's polluted, oppressive, and grim. The world the characters inhabit is broken"}]} event: message data: {"completion":[{"delta":", much like the television.\n\nThis is the \"low life\" part of cyberpunk's \"high tech, low life\" motto.\n\n### 3. It's a Masterclass in Imagery and Simile\nThe simile is potent and multi-sensory. 
When reading it, especially for the first time in"}]} event: message data: {"completion":[{"delta":" the 1980s, you don't just *see* the fuzzy, colorless static; you can almost *hear* the white-noise hiss that accompanies it. It's an image that's both visual and auditory, creating a powerful and immersive experience for the reader from the very first sentence"}]} event: message data: {"completion":[{"delta":".\n\n### The Evolution of its Meaning\nOne of the most fascinating aspects of this line is how its meaning has changed with technology.\n\n* **In 1984:** An analog television tuned to a dead channel displayed a screen of flickering black-and-white static. This is the image Gibson intended"}]} event: message data: {"completion":[{"delta":".\n* **Today:** A modern television or monitor \"tuned to a dead channel\" often displays a solid blue or black screen, perhaps with a \"No Signal\" message.\n\nWilliam Gibson himself has commented on this. He's joked that if he were writing it today, he might have to say the"}]} event: message data: {"completion":[{"delta":" sky was \"the color of a blue screen of death.\"\n\nThis technological shift doesn't diminish the line's power; it anchors it in a specific historical and technological moment. It makes the novel a kind of artifact of the very era it was critiquing, turning the sentence itself into a piece of retro"}]} event: message data: {"completion":[{"delta":"-tech. Even if a younger reader has to look up what \"television static\" looks like, the core idea—describing the natural world in terms of technological failure—remains as powerful as ever."}]} event: message data: [DONE] 
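The streaming completion events above use a different shape from the chat_completion chunks: each event carries `{"completion":[{"delta":"..."}]}`. A minimal Python sketch of joining those deltas follows, assuming the events arrive as standard SSE lines. The generator name `iter_completion_deltas` is hypothetical, and the sample deltas are shortened placeholders rather than the text from the log.

```python
import json


def iter_completion_deltas(lines):
    """Yield each text delta from an iterable of SSE lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore "event: message" lines and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        event = json.loads(payload)
        for item in event.get("completion", []):
            delta = item.get("delta")
            if delta:
                yield delta


# Shortened placeholder events in the same shape as the log above.
sample_lines = [
    'event: message',
    'data: {"completion":[{"delta":"That is the opening line"}]}',
    '',
    'event: message',
    'data: {"completion":[{"delta":" of Neuromancer."}]}',
    '',
    'event: message',
    'data: [DONE]',
]

full_text = "".join(iter_completion_deltas(sample_lines))
print(full_text)
```

A generator is used here so a caller could render each delta as it arrives instead of waiting for the whole response.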
Create Chat Completion endpoint
RQ { "service": "googlevertexai", "service_settings": { "service_account_json": {{service_account_config}}, "model_id": "gemini-2.5-pro", "location": "us-central1", "project_id": "project_id" } } RS { "inference_id": "google-vertex-ai-chat-completion", "task_type": "chat_completion", "service": "googlevertexai", "service_settings": { "project_id": "project_id", "location": "us-central1", "model_id": "gemini-2.5-pro", "provider": "GOOGLE", "rate_limit": { "requests_per_minute": 1000 } } } 
Perform Chat Completion
 RQ { "messages": [ { "role": "user", "content": "What is deep learning?" } ], "max_completion_tokens": 100 } RS event: message data: {"id":"54zaaLahA-OcmecPmbjUCA","choices":[{"delta":{"content":"Of course! Here","role":"model"},"finish_reason":"MAX_TOKENS","index":0}],"model":"gemini-2.5-pro","object":"chat.completion.chunk","usage":{"completion_tokens":4,"prompt_tokens":5,"total_tokens":103}} event: message data: [DONE] 
@Jan-Kazlouski-elastic
Contributor Author

@jonathan-buttner
Testing is finished. All good. We're ready to merge.

@jonathan-buttner jonathan-buttner enabled auto-merge (squash) September 29, 2025 14:06
@jonathan-buttner jonathan-buttner merged commit d18da3c into elastic:main Sep 29, 2025
36 checks passed
Labels
>enhancement, external-contributor (Pull request authored by a developer outside the Elasticsearch team), :ml (Machine learning), Team:ML (Meta label for the ML team), v9.2.0
5 participants