ListInferenceRecommendationsJobSteps
Returns a list of the subtasks for an Inference Recommender job.
The supported subtasks are benchmarks, which evaluate the performance of your model on different instance types.
Request Syntax
{ "JobName": "string
", "MaxResults": number
, "NextToken": "string
", "Status": "string
", "StepType": "string
" }
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- JobName
The name for the Inference Recommender job.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern: [a-zA-Z0-9](-*[a-zA-Z0-9]){0,63}
Required: Yes
- MaxResults
The maximum number of results to return.
Type: Integer
Valid Range: Minimum value of 1. Maximum value of 100.
Required: No
- NextToken
A token that you can specify to return more results from the list. Specify this field if you have a token that was returned from a previous request.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 8192.
Pattern: .*
Required: No
- Status
A filter to return benchmarks of a specified status. If this field is left empty, then all benchmarks are returned.
Type: String
Valid Values: PENDING | IN_PROGRESS | COMPLETED | FAILED | STOPPING | STOPPED | DELETING | DELETED
Required: No
- StepType
A filter to return details about the specified type of subtask.
BENCHMARK: Evaluate the performance of your model on different instance types.
Type: String
Valid Values: BENCHMARK
Required: No
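As a sketch of how the request and NextToken contract described above fit together, the helper below collects every subtask for a job by issuing the call repeatedly until no token is returned. It takes any client object exposing a `list_inference_recommendations_job_steps` method (with boto3, that is the SageMaker client); the function name `list_job_steps` is an illustrative choice, not part of the API.

```python
def list_job_steps(client, job_name, step_type="BENCHMARK", status=None):
    """Collect all subtasks for an Inference Recommender job, following NextToken.

    `client` is assumed to expose list_inference_recommendations_job_steps,
    as the boto3 SageMaker client does.
    """
    params = {"JobName": job_name, "StepType": step_type, "MaxResults": 100}
    if status:
        params["Status"] = status  # e.g. "COMPLETED"; omit to return all statuses
    steps = []
    while True:
        resp = client.list_inference_recommendations_job_steps(**params)
        steps.extend(resp.get("Steps", []))
        token = resp.get("NextToken")
        if not token:
            return steps
        # Pass the returned token back in to fetch the next page of results.
        params["NextToken"] = token
```

With boto3 and configured AWS credentials, this could be invoked as `list_job_steps(boto3.client("sagemaker"), "my-recommender-job")`.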
Response Syntax
{ "NextToken": "string", "Steps": [ { "InferenceBenchmark": { "EndpointConfiguration": { "EndpointName": "string", "InitialInstanceCount": number, "InstanceType": "string", "ServerlessConfig": { "MaxConcurrency": number, "MemorySizeInMB": number, "ProvisionedConcurrency": number }, "VariantName": "string" }, "EndpointMetrics": { "MaxInvocations": number, "ModelLatency": number }, "FailureReason": "string", "InvocationEndTime": number, "InvocationStartTime": number, "Metrics": { "CostPerHour": number, "CostPerInference": number, "CpuUtilization": number, "MaxInvocations": number, "MemoryUtilization": number, "ModelLatency": number, "ModelSetupTime": number }, "ModelConfiguration": { "CompilationJobName": "string", "EnvironmentParameters": [ { "Key": "string", "Value": "string", "ValueType": "string" } ], "InferenceSpecificationName": "string" } }, "JobName": "string", "Status": "string", "StepType": "string" } ] }
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- NextToken
A token that you can specify in your next request to return more results from the list.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 8192.
Pattern: .*
- Steps
A list of all subtask details in Inference Recommender.
Type: Array of InferenceRecommendationsJobStep objects
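To illustrate working with the `Steps` array in the response shape shown above, the sketch below pulls a few fields of interest (instance type, cost, and latency metrics) out of each BENCHMARK step. The function name `summarize_benchmarks` and the choice of fields are illustrative assumptions; the key names follow the Response Syntax.

```python
def summarize_benchmarks(steps):
    """Extract key benchmark metrics from Inference Recommender job steps.

    `steps` is the "Steps" array from a ListInferenceRecommendationsJobSteps
    response; returns one summary dict per BENCHMARK step.
    """
    rows = []
    for step in steps:
        if step.get("StepType") != "BENCHMARK":
            continue
        bench = step.get("InferenceBenchmark", {})
        cfg = bench.get("EndpointConfiguration", {})
        metrics = bench.get("Metrics", {})
        rows.append({
            "InstanceType": cfg.get("InstanceType"),
            "CostPerInference": metrics.get("CostPerInference"),
            "ModelLatency": metrics.get("ModelLatency"),
            "Status": step.get("Status"),
        })
    return rows
```

Sorting the returned rows by `CostPerInference` or `ModelLatency` is one way to compare candidate instance types from a completed job.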
Errors
For information about the errors that are common to all actions, see Common Errors.
- ResourceNotFound
The resource being accessed was not found.
HTTP Status Code: 400
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: