ListInferenceRecommendationsJobSteps
Returns a list of the subtasks for an Inference Recommender job.
The supported subtasks are benchmarks, which evaluate the performance of your model on different instance types.
Request Syntax
{ "JobName": "string
", "MaxResults": number
, "NextToken": "string
", "Status": "string
", "StepType": "string
" }
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- JobName
The name for the Inference Recommender job.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern: [a-zA-Z0-9](-*[a-zA-Z0-9]){0,63}
Required: Yes
- MaxResults
The maximum number of results to return.
Type: Integer
Valid Range: Minimum value of 1. Maximum value of 100.
Required: No
- NextToken
A token that you can specify to return more results from the list. Specify this field if you have a token that was returned from a previous request.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 8192.
Pattern: .*
Required: No
- Status
A filter to return benchmarks of a specified status. If this field is left empty, then all benchmarks are returned.
Type: String
Valid Values: PENDING | IN_PROGRESS | COMPLETED | FAILED | STOPPING | STOPPED | DELETING | DELETED
Required: No
- StepType
A filter to return details about the specified type of subtask.
BENCHMARK: Evaluate the performance of your model on different instance types.
Type: String
Valid Values: BENCHMARK
Required: No
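As a sketch of how the request and NextToken contract described above fit together, the helper below collects every subtask for a job by issuing the call repeatedly until no token is returned. It takes any client object exposing a `list_inference_recommendations_job_steps` method (with boto3, that is the SageMaker client); the function name `list_job_steps` is an illustrative choice, not part of the API.

```python
def list_job_steps(client, job_name, step_type="BENCHMARK", status=None):
    """Collect all subtasks for an Inference Recommender job, following NextToken.

    `client` is assumed to expose list_inference_recommendations_job_steps,
    as the boto3 SageMaker client does.
    """
    params = {"JobName": job_name, "StepType": step_type, "MaxResults": 100}
    if status:
        params["Status"] = status  # e.g. "COMPLETED"; omit to return all statuses
    steps = []
    while True:
        resp = client.list_inference_recommendations_job_steps(**params)
        steps.extend(resp.get("Steps", []))
        token = resp.get("NextToken")
        if not token:
            return steps
        # Pass the returned token back in to fetch the next page of results.
        params["NextToken"] = token
```

With boto3 and configured AWS credentials, this could be invoked as `list_job_steps(boto3.client("sagemaker"), "my-recommender-job")`.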
Response Syntax
{ "NextToken": "string", "Steps": [ { "InferenceBenchmark": { "EndpointConfiguration": { "EndpointName": "string", "InitialInstanceCount": number, "InstanceType": "string", "ServerlessConfig": { "MaxConcurrency": number, "MemorySizeInMB": number, "ProvisionedConcurrency": number }, "VariantName": "string" }, "EndpointMetrics": { "MaxInvocations": number, "ModelLatency": number }, "FailureReason": "string", "InvocationEndTime": number, "InvocationStartTime": number, "Metrics": { "CostPerHour": number, "CostPerInference": number, "CpuUtilization": number, "MaxInvocations": number, "MemoryUtilization": number, "ModelLatency": number, "ModelSetupTime": number }, "ModelConfiguration": { "CompilationJobName": "string", "EnvironmentParameters": [ { "Key": "string", "Value": "string", "ValueType": "string" } ], "InferenceSpecificationName": "string" } }, "JobName": "string", "Status": "string", "StepType": "string" } ] }
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- NextToken
A token that you can specify in your next request to return more results from the list.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 8192.
Pattern: .*
- Steps
A list of all subtask details in Inference Recommender.
Type: Array of InferenceRecommendationsJobStep objects
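To illustrate working with the `Steps` array in the response shape shown above, the sketch below pulls a few fields of interest (instance type, cost, and latency metrics) out of each BENCHMARK step. The function name `summarize_benchmarks` and the choice of fields are illustrative assumptions; the key names follow the Response Syntax.

```python
def summarize_benchmarks(steps):
    """Extract key benchmark metrics from Inference Recommender job steps.

    `steps` is the "Steps" array from a ListInferenceRecommendationsJobSteps
    response; returns one summary dict per BENCHMARK step.
    """
    rows = []
    for step in steps:
        if step.get("StepType") != "BENCHMARK":
            continue
        bench = step.get("InferenceBenchmark", {})
        cfg = bench.get("EndpointConfiguration", {})
        metrics = bench.get("Metrics", {})
        rows.append({
            "InstanceType": cfg.get("InstanceType"),
            "CostPerInference": metrics.get("CostPerInference"),
            "ModelLatency": metrics.get("ModelLatency"),
            "Status": step.get("Status"),
        })
    return rows
```

Sorting the returned rows by `CostPerInference` or `ModelLatency` is one way to compare candidate instance types from a completed job.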
Errors
For information about the errors that are common to all actions, see Common Errors.
- ResourceNotFound
The resource being accessed was not found.
HTTP Status Code: 400
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: