
"Failed to convert to numpy array for model '_cortex_default'" when making a prediction on an ONNX model #1186

@RobertLucian

Version

Version 0.18.0

Description

When an ONNX model's input shape contains string dimensions (indicating that the axes are dynamic), making a prediction fails with an error of this kind:

```
cortex.lib.exceptions.UserException: error: key 'input_ids' for model '_cortex_default': failed to convert to numpy array for model '_cortex_default': cannot reshape array of size 6 into shape (1,1)
```

Here's an example of a model's input shapes:

| model input    | type  | shape             |
| -------------- | ----- | ----------------- |
| attention_mask | int64 | (batch, sequence) |
| input_ids      | int64 | (batch, sequence) |
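The failure can be reproduced in isolation with plain NumPy: if each string dimension (`batch`, `sequence`) is coerced to `1`, reshaping a 6-token input to `(1, 1)` fails exactly as in the traceback above. A minimal sketch (the token count of 6 matches the error message; the coercion behavior is an assumption about what the conversion code does):

```python
import numpy as np

# Six token IDs, as in "cannot reshape array of size 6".
flat_input = np.arange(6, dtype=np.int64)

# If each string dimension ('batch', 'sequence') is coerced to 1,
# the target shape becomes (1, 1) and the reshape fails.
try:
    flat_input.reshape((1, 1))
except ValueError as e:
    print(e)  # cannot reshape array of size 6 into shape (1,1)
```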

Steps to reproduce

Run all of the following steps from within the same directory.

Creating environment/model

Create a virtual environment for Python 3.6.9 and install the following pip dependencies:

```
onnxruntime==1.3.0
torch==1.5.0
transformers==3.0.0
scipy==1.4.1
```

Within that environment, run the following instructions to export the XLM-Roberta model in ONNX format:

```python
from transformers.convert_graph_to_onnx import convert

convert(framework="pt", model="xlm-roberta-base", output="./output/xlm-roberta-base.onnx", opset=11)
```

Now, run the following to optimize the exported model and convert it to float16:

```shell
python -m onnxruntime_tools.optimizer_cli --input ./output/xlm-roberta-base.onnx --output ./output/xlm-roberta-base.onnx --model_type bert --float16
```

Creating the Cortex deployment

Create a cortex.yaml config file with the following content:

```yaml
# cortex.yaml

- name: api
  predictor:
    type: onnx
    model_path: ./output/xlm-roberta-base.onnx
    path: predictor.py
    image: cortexlabs/onnx-predictor-cpu:0.18.0
```

Create a predictor.py script with the following content:

```python
# predictor.py
from transformers import XLMRobertaTokenizer
from scipy.special import softmax
import time


class ONNXPredictor:
    def __init__(self, onnx_client, config):
        self.client = onnx_client
        self.tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")

    def predict(self, payload):
        start = time.time()
        model_inputs = self.tokenizer.encode_plus(
            payload["text"], max_length=512, return_tensors="pt", truncation=True
        )
        inputs_onnx = {k: v.cpu().detach().numpy() for k, v in model_inputs.items()}
        print(self.client._signatures)
        output = self.client.predict(inputs_onnx)
        output = softmax(output[0], axis=1)[0].tolist()
        end = time.time()
        return {"output": output, "time": end - start}
```
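The `inputs_onnx` dict built by the predictor holds `(batch, sequence)`-shaped int64 arrays whose sequence length varies per request. A minimal sketch of its expected structure (the token IDs below are made-up placeholders, not real XLM-Roberta IDs):

```python
import numpy as np

# Illustrative stand-in for the tokenizer output of a short sentence;
# the IDs are invented, but the dtype and (batch, sequence) shape match.
inputs_onnx = {
    "input_ids": np.array([[0, 713, 83, 10, 2579, 2]], dtype=np.int64),
    "attention_mask": np.ones((1, 6), dtype=np.int64),
}

for name, arr in inputs_onnx.items():
    print(name, arr.shape, arr.dtype)  # both (1, 6) int64; the sequence axis is dynamic
```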

Copy the pip dependencies listed above into a requirements.txt file. Then, from the same directory as the cortex.yaml config file, run `cortex deploy -e local`. Wait for the API to be live and then run:

```shell
curl http://localhost:8888 -X POST -H "Content-Type: application/json" -d '{"text": "That is a nice"}'
```

Error

The above command will return a non-200 response code. Inspect the logs with cortex get api. The expected error is:

```
cortex.lib.exceptions.UserException: error: key 'input_ids' for model '_cortex_default': failed to convert to numpy array for model '_cortex_default': cannot reshape array of size 6 into shape (1,1)
```
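The error suggests the string dimensions are being coerced to `1`, producing a `(1, 1)` target shape. One plausible workaround, sketched here with a hypothetical `to_numpy` helper (not Cortex's actual code), is to skip the reshape when any reported dimension is dynamic and trust the payload's own shape:

```python
import numpy as np

def to_numpy(value, shape, dtype=np.int64):
    """Convert a payload value, tolerating dynamic (string) dimensions."""
    arr = np.asarray(value, dtype=dtype)
    if any(isinstance(dim, str) for dim in shape):
        return arr  # dynamic axes: keep the payload's own shape
    return arr.reshape(shape)

# Shape as reported by onnxruntime for this model's inputs:
arr = to_numpy([[0, 713, 83, 10, 2579, 2]], ("batch", "sequence"))
print(arr.shape)  # (1, 6)
```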

Labels: bug