Cannot Import ONNX Model

Question/Issue:
Cannot import an ONNX model exported from PyTorch. I tried both the Edge Impulse Python SDK and the Studio to import the model. The upload itself completes, but when I click "Save Model" in the Studio, or deploy through the SDK, an error that says "socket hang up" pops up.
I think something in my network architecture is triggering this, but I don't know exactly what. I have tried the following combinations:

  1. No reshape and transpose, just the ResNet. EI Studio is then able to save the model.
  2. Just the reshape and transpose, no ResNet (remove layer1, layer2, …, keeping only the first conv; sketched below). EI Studio is then also able to save the model.

It seems that the combination of the reshape/transpose operations with the ResNet shortcut connections somehow prevents EI from saving the model. I have also tried various ONNX opset versions (default, 12, 11).
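
To make combination 2 concrete, it is roughly the following (a minimal sketch; the ReshapeOnly class and the output file name are just illustrative):

import torch
import torch.nn as nn

class ReshapeOnly(nn.Module):
    # Only the input reshape/transpose plus the first conv; all residual blocks removed.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv1d(6, 32, 7, stride=2, padding=3, bias=False)

    def forward(self, x):
        x = x.reshape(-1, 300, 6)  # (B, 1800) -> (B, 300, 6)
        x = x.transpose(1, 2)      # (B, 300, 6) -> (B, 6, 300)
        return self.conv1(x)

torch.onnx.export(ReshapeOnly().eval(), torch.ones(1, 1800), 'reshape_only.onnx',
                  opset_version=11, input_names=['input'], output_names=['output'])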

Project ID: 379038

Context/Use case:
Here is the code that produces the ONNX file:

import torch
import torch.nn as nn
import onnx
import onnxruntime

class SimplifiedResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, dropout_rate=0.0):
        super(SimplifiedResidualBlock, self).__init__()
        self.conv1 = nn.Conv1d(in_channels, out_channels, kernel_size, stride=stride, padding=kernel_size//2, bias=False)
        self.bn1 = nn.BatchNorm1d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        self.dropout = nn.Dropout(dropout_rate)
        self.conv2 = nn.Conv1d(out_channels, out_channels, kernel_size, padding=kernel_size//2, bias=False)
        self.bn2 = nn.BatchNorm1d(out_channels)
        
        # 1x1 projection on the shortcut when the stride or channel count
        # changes, so the residual addition shapes match.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv1d(in_channels, out_channels, 1, stride=stride, bias=False),
                nn.BatchNorm1d(out_channels)
            )

    def forward(self, x):
        shortcut = self.shortcut(x)
        
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.dropout(x)
        
        x = self.conv2(x)
        x = self.bn2(x)
        x += shortcut
        x = self.relu(x)
        
        return x
    
class SmallResNet1DWithReshape(nn.Module):
    def __init__(self, num_classes=10, dropout_rate=0.0):
        super(SmallResNet1DWithReshape, self).__init__()
        self.conv1 = nn.Conv1d(6, 32, 7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm1d(32)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool1d(3, stride=2, padding=1)
        
        self.layer1 = SimplifiedResidualBlock(32, 32, dropout_rate=dropout_rate)
        self.layer2 = SimplifiedResidualBlock(32, 64, stride=2, dropout_rate=dropout_rate)
        self.layer3 = SimplifiedResidualBlock(64, 128, stride=2, dropout_rate=dropout_rate)
        self.layer4 = SimplifiedResidualBlock(128, 256, stride=2, dropout_rate=dropout_rate)
        
        self.avgpool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        # Reshape the flat (B, 1800) input to (B, 300, 6), then transpose to
        # (B, 6, 300) so channels come first for Conv1d.
        x = x.reshape(-1, 300, 6)
        x = x.transpose(1, 2)
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        
        return x
    
model = SmallResNet1DWithReshape(num_classes=10, dropout_rate=0.0)
model.eval()

# Export the model to ONNX, fixing the batch size to 1 (no dynamic axes)
dummy_input = torch.ones(1, 1800)
print(f'model inference result: {model(dummy_input)}')

torch.onnx.export(model, dummy_input,
                  'model.onnx',
                  opset_version=11,
                  input_names=['input'],
                  output_names=['output'])

# load the model and run inference
onnx_model = onnx.load('model.onnx')
onnx.checker.check_model(onnx_model)
ort_session = onnxruntime.InferenceSession('model.onnx')
ort_inputs = {ort_session.get_inputs()[0].name: dummy_input.numpy()}
ort_outs = ort_session.run(None, ort_inputs)

print(f'onnx inference result: {ort_outs[0]}')
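
To double-check that the exported graph matches PyTorch numerically (rather than just eyeballing the two printed results), the outputs can also be compared element-wise; a minimal check that could be appended to the script above (the tolerances are arbitrary):

import numpy as np

# Compare the PyTorch forward pass with the ONNX Runtime output element-wise.
torch_out = model(dummy_input).detach().numpy()
np.testing.assert_allclose(torch_out, ort_outs[0], rtol=1e-4, atol=1e-5)
print('PyTorch and ONNX Runtime outputs match')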


Hi @peteop1

Welcome to the forum!

Your model seems to upload OK for me via the Edge Impulse Python SDK. Did you go and configure your model settings?

Hi Eoin,

Thank you for your reply. I tried again using the Python SDK, but it is still not working. I am getting the following error:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 resp_deploy = ei.model.deploy(model='model.onnx',
      2                               model_input_type=ei.model.input_type.OtherInput(),
      3                               model_output_type=ei.model.output_type.Classification(),
      4                               representative_data_for_quantization='validation_data.npy',
      5                               output_directory='.')

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/edgeimpulse/model/_functions/deploy.py:250, in deploy(model, model_output_type, model_input_type, representative_data_for_quantization, deploy_model_type, engine, deploy_target, output_directory, api_key, timeout_sec)
    248 except Exception as e:
    249     logging.debug(f"Exception calling save_pretrained_model_parameters [{str(e)}]")
--> 250     raise e
    252 target_names = get_project_deploy_targets(client, project_id=project_id)
    253 if deploy_target not in target_names:

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/edgeimpulse/model/_functions/deploy.py:244, in deploy(model, model_output_type, model_input_type, representative_data_for_quantization, deploy_model_type, engine, deploy_target, output_directory, api_key, timeout_sec)
    240 try:
    241     r = SavePretrainedModelRequest.from_dict(
    242         {"input": model_input_type, "model": model_output_type}
    243     )
--> 244     response = learn.save_pretrained_model_parameters(
    245         project_id=project_id, save_pretrained_model_request=r
    246     )
    247     check_response_errors(response)
    248 except Exception as e:

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/pydantic/decorator.py:40, in pydantic.decorator.validate_arguments.validate.wrapper_function()

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/pydantic/decorator.py:134, in pydantic.decorator.ValidatedFunction.call()

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/pydantic/decorator.py:206, in pydantic.decorator.ValidatedFunction.execute()

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/edgeimpulse_api/api/learn_api.py:2487, in LearnApi.save_pretrained_model_parameters(self, project_id, save_pretrained_model_request, **kwargs)
   2463 """Save parameters for pretrained model
   2464 
   2465 Save input / model configuration for a pretrained model. This overrides the current impulse. If you want to deploy a pretrained model from the API, see `startDeployPretrainedModelJob`.
   (...)
   2484 :rtype: GenericApiResponse
   2485 """
   2486 kwargs['_return_http_data_only'] = True
-> 2487 return self._save_pretrained_model_parameters_with_http_info(project_id, save_pretrained_model_request, **kwargs)

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/pydantic/decorator.py:40, in pydantic.decorator.validate_arguments.validate.wrapper_function()

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/pydantic/decorator.py:134, in pydantic.decorator.ValidatedFunction.call()

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/pydantic/decorator.py:206, in pydantic.decorator.ValidatedFunction.execute()

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/edgeimpulse_api/api/learn_api.py:2591, in LearnApi._save_pretrained_model_parameters_with_http_info(self, project_id, save_pretrained_model_request, **kwargs)
   2585 _auth_settings = ['ApiKeyAuthentication', 'JWTAuthentication', 'JWTHttpHeaderAuthentication']  # noqa: E501
   2587 _response_types_map = {
   2588     '200': "GenericApiResponse",
   2589 }
-> 2591 return self.api_client.call_api(
   2592     '/api/{projectId}/pretrained-model/save', 'POST',
   2593     _path_params,
   2594     _query_params,
   2595     _header_params,
   2596     body=_body_params,
   2597     post_params=_form_params,
   2598     files=_files,
   2599     response_types_map=_response_types_map,
   2600     auth_settings=_auth_settings,
   2601     async_req=_params.get('async_req'),
   2602     _return_http_data_only=_params.get('_return_http_data_only'),  # noqa: E501
   2603     _preload_content=_params.get('_preload_content', True),
   2604     _request_timeout=_params.get('_request_timeout'),
   2605     collection_formats=_collection_formats,
   2606     _request_auth=_params.get('_request_auth'))

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/edgeimpulse_api/api_client.py:400, in ApiClient.call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_types_map, auth_settings, async_req, _return_http_data_only, collection_formats, _preload_content, _request_timeout, _host, _request_auth)
    359 """Makes the HTTP request (synchronous) and returns deserialized data.
    360 
    361 To make an async_req request, set the async_req parameter.
   (...)
    397     then the method will return the response directly.
    398 """
    399 if not async_req:
--> 400     return self.__call_api(resource_path, method,
    401                            path_params, query_params, header_params,
    402                            body, post_params, files,
    403                            response_types_map, auth_settings,
    404                            _return_http_data_only, collection_formats,
    405                            _preload_content, _request_timeout, _host,
    406                            _request_auth)
    408 return self.pool.apply_async(self.__call_api, (resource_path,
    409                                                method, path_params,
    410                                                query_params,
   (...)
    418                                                _request_timeout,
    419                                                _host, _request_auth))

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/edgeimpulse_api/api_client.py:237, in ApiClient.__call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_types_map, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout, _host, _request_auth)
    234 # deserialize response data
    236 if response_type:
--> 237     return_data = self.deserialize(response_data, response_type)
    238 else:
    239     return_data = None

File ~/miniconda3/envs/torch_env/lib/python3.10/site-packages/edgeimpulse_api/api_client.py:306, in ApiClient.deserialize(self, response, response_type)
    303     data = json.loads(response.data)
    305     if not data["success"]:
--> 306         raise Exception(data["error"])
    308 except ValueError:
    309     data = response.data

Exception: socket hang up

The command I am using is:

resp_deploy = ei.model.deploy(model='model.onnx',
                              model_input_type=ei.model.input_type.OtherInput(),
                              model_output_type=ei.model.output_type.Classification(),
                              representative_data_for_quantization='validation_data.npy',
                              output_directory='.')

I am also seeing this error in the EI Studio:

Failed to calculate performance

Creating job… OK (ID: 18372417) OK (ID: 18372418)
Scheduling job in cluster…
Scheduling job in cluster…
Container image pulled!
Container image pulled!
Job started
Job started
Scheduling job in cluster…
Scheduling job in cluster…
Container image pulled!
Job started
Container image pulled!
Job started
Profiling model_quantized_int8_io.tflite…
Profiling model_quantized_int8_io.tflite…
Error converting model for EON RAM optimized mode:
Calculating performance metrics…
Calculating inferencing time…
Aborted (core dumped)
Error while calculating inferencing time: substring not found
Traceback (most recent call last):
  File "/app/./resources/libraries/ei_tensorflow/profiling.py", line 1159, in profile_tflite_file
    metadata['performance'] = json.loads(a[a.index('{'):a.index('}')+1])
ValueError: substring not found
Determining whether this model runs on MCU…
Determining whether this model runs on MCU OK
Profiling model_quantized_int8_io.tflite OK
Profiling model.tflite…
Error converting model for EON RAM optimized mode:
Calculating performance metrics…
Calculating inferencing time…
Aborted (core dumped)
Error while calculating inferencing time: substring not found
Traceback (most recent call last):
  File "/app/./resources/libraries/ei_tensorflow/profiling.py", line 1159, in profile_tflite_file
    metadata['performance'] = json.loads(a[a.index('{'):a.index('}')+1])
ValueError: substring not found
Determining whether this model runs on MCU…
Determining whether this model runs on MCU OK
Profiling model_quantized_int8_io.tflite OK
Profiling model.tflite…
Error converting model for EON RAM optimized mode:
Calculating performance metrics…
Calculating inferencing time…
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Error converting model for EON RAM optimized mode:
Calculating performance metrics…
Calculating inferencing time…
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Calculating inferencing time OK
Determining whether this model runs on MCU…
Determining whether this model runs on MCU OK
Profiling float32 model (TensorFlow Lite Micro)…
Calculating inferencing time OK
Determining whether this model runs on MCU…
Determining whether this model runs on MCU OK
Profiling float32 model (TensorFlow Lite Micro)…
Profiling float32 model (TensorFlow Lite Micro, HW optimizations disabled)…
Profiling float32 model (TensorFlow Lite Micro, HW optimizations disabled)…
Profiling float32 model (EON)…
Profiling float32 model (EON)…
Profiling float32 model (EON, HW optimizations disabled)…
Profiling float32 model (EON, HW optimizations disabled)…
Profiling model.tflite OK
Profiling model.tflite OK
socket hang up
Job failed (see above)

Can you verify that we are uploading the same model? This is the graph generated by Netron:

I also tried a simple conv model without the skip connections, and that uploads fine. However, there are some issues with automatic quantization: the quantized model has extremely low performance (accuracy dropped from 78% to 10%). My model takes an input of size (1800,), so I created the .npy file as a 2D array of shape (B, 1800), where B is the validation dataset size. Is this the correct way of providing the representative data?
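
For reference, this is roughly how I create validation_data.npy (a sketch with placeholder data; val_windows stands in for my actual validation windows, each flattened to 1800 values):

import numpy as np

# Placeholder for the real validation set: one row per window,
# already flattened to 1800 values (300 samples x 6 channels).
val_windows = np.random.rand(500, 1800).astype(np.float32)

# Saved as a 2D array of shape (B, 1800) to match the model's input size.
np.save('validation_data.npy', val_windows)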