Hello,
It seems that TensorFlow Lite for Microcontrollers does not support the LOG_SOFTMAX opcode (LogSoftmax from PyTorch).
Is there a way to define a custom function to replace it?
Thanks
Hello @adamsantamaria,
I can see it here: tflite-micro/log_softmax.h at main · tensorflow/tflite-micro · GitHub
Could you provide some more info on how you’re converting your PyTorch model to TFLite?
In parallel, I’m checking with our ML team to see if we need to enable something custom on our end.
Best,
Louis
Hello @louis,
Thanks for your answer.
Glad to hear the function exists.
I am exporting my PyTorch model (named model) to ONNX format this way:
# exporting the model to ONNX format
print("Exporting the model to ONNX format...")
model.eval()
dummy_input = torch.randn(1, 3, 15000)
input_names = ["actual_input"]
output_names = ["output"]
torch.onnx.export(
    model,
    dummy_input,
    f"{MODELS_DIR}/activity.onnx",
    verbose=False,
    input_names=input_names,
    output_names=output_names,
    export_params=True,
)
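For reference, the exported file can be sanity-checked and run once like this (a rough sketch, using the onnx and onnxruntime packages listed in the pip freeze below):
import numpy as np
import onnx
import onnxruntime as ort

# confirm the exported graph is structurally valid
onnx_model = onnx.load(f"{MODELS_DIR}/activity.onnx")
onnx.checker.check_model(onnx_model)

# run it once to make sure it produces an output of the expected shape
sess = ort.InferenceSession(f"{MODELS_DIR}/activity.onnx")
out = sess.run(["output"], {"actual_input": np.random.randn(1, 3, 15000).astype(np.float32)})
print(out[0].shape)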
Then, I use the generated .onnx file as a parameter for the model.profile and model.deploy functions:
try:
    profile = ei.model.profile(
        model=f"{MODELS_DIR}/activity.onnx", device='cortex-m4f-80mhz'
    )
    print(profile.summary())
except Exception as e:
    print(f"Could not profile: {e}")

# will hold the deployment artifact if the call succeeds
deploy_bytes = None
try:
    deploy_bytes = ei.model.deploy(
        model=f"{MODELS_DIR}/activity.onnx",
        model_output_type=model_output_type,
        deploy_target="zip",
        engine="tflite"
    )
except Exception as e:
    print(f"Could not deploy: {e}")
pip freeze outputs:
tensorflow==2.12.0
edgeimpulse==1.0.4
edgeimpulse-api==1.23.6
torch==1.8.1+cpu
torch-summary==1.4.5
onnx==1.14.0
onnxruntime==1.15.0
Thanks
Hello @adamsantamaria,
Just had a deeper look this morning. Could you try setting opset_version=9 (or higher, I think the latest is 18) in your torch.onnx.export call?
It seems that the op is supported starting from that version: pytorch/symbolic_opset9.py at main · pytorch/pytorch · GitHub
This GitHub issue pointed me to it: RuntimeError: ONNX export failed: Couldn't export operator aten::softmax · Issue #20643 · pytorch/pytorch · GitHub.
I have not had time to reproduce myself though.
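Something like this (an untested sketch, just your existing export call with the parameter added):
torch.onnx.export(
    model,
    dummy_input,
    f"{MODELS_DIR}/activity.onnx",
    verbose=False,
    input_names=input_names,
    output_names=output_names,
    export_params=True,
    opset_version=9,  # or any higher opset supported by your torch version
)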
Best,
Louis
Hi Louis,
Changing the opset_version did not solve my problem (the default value is 14). When I enable verbose mode in torch.onnx.export,
torch.onnx.export(
    model,
    dummy_input,
    f"{MODELS_DIR}/activity.onnx",
    verbose=True,
    input_names=input_names,
    output_names=output_names,
    export_params=True,
    opset_version=12,
)
I don’t get any message about the log_softmax opcode not being found:
%output : Float(1, 6, strides=[6, 1], requires_grad=1, device=cpu) = onnx::LogSoftmax[axis=1](%19) # /home/asantamaria/.pyenv/versions/activity-3.8.16/lib/python3.8/site-packages/torch/nn/functional.py:1672:0
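As another check, the node types in the exported graph can be listed to confirm the LogSoftmax op is present (a quick sketch with the onnx package):
import onnx

# list the op types in the exported graph; LogSoftmax should appear among them
graph = onnx.load(f"{MODELS_DIR}/activity.onnx").graph
print([node.op_type for node in graph.node])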
The problem does not seem to come from ONNX.
Hi Adam,
Could you share your model and/or your source code with some data samples so I can try to replicate your issue, please?
Also, I’ve seen this workaround: could you replace nn.LogSoftmax(dim=1) with the following code and retrain the network?
import torch.nn.functional as F
F.log_softmax(input,1)
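i.e. something like this inside your model’s forward (a rough sketch; the class and layer names are just placeholders):
import torch.nn as nn
import torch.nn.functional as F

class ActivityNet(nn.Module):
    def __init__(self):
        super().__init__()
        # ... your existing layers ...
        self.fc = nn.Linear(64, 6)

    def forward(self, x):
        # ... your existing forward pass ...
        x = self.fc(x)
        # functional call instead of an nn.LogSoftmax module
        return F.log_softmax(x, dim=1)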
In parallel, I’ll check whether the issue comes from how we convert the ONNX model to TFLite when you import it.
Best,
Louis
Thanks Louis for your help,
I can’t share my full model/code with you for now, but I have set up a mock script to replicate the issue:
import edgeimpulse as ei
import torch
import torch.nn as nn
# define a model featuring only one logsoftmax layer (input_size=2*3)
input = torch.randn(2, 3)
m = nn.LogSoftmax(dim=1)
# check input/output relation
print(input)
print(m(input))
# exporting the model to ONNX format
print("Exporting the model to ONNX format...")
m.eval()
dummy_input = torch.randn(1, 2, 3)
input_names = ["actual_input"]
output_names = ["output"]
torch.onnx.export(
    m,
    dummy_input,
    "activity.onnx",
    verbose=True,
    input_names=input_names,
    output_names=output_names,
    export_params=True,
    opset_version=11,
)
# edgeimpulse API key
ei.API_KEY = "XXX"
# estimate the RAM, ROM, and inference time for our model on the target hardware family
try:
    profile = ei.model.profile(
        model="activity.onnx", device='cortex-m4f-80mhz'
    )
    print(profile.summary())
except Exception as e:
    print(f"Could not profile: {e}")
You can try it with your own API key.
On my side I still get the message Unsupported ops: LOG_SOFTMAX in the model.profile output:
Target results for float32:
===========================
{
    "device": "cortex-m4f-80mhz",
    "tfliteFileSizeBytes": 1112,
    "isSupportedOnMcu": false,
    "timePerInferenceMs": 1,
    "mcuSupportError": "Unsupported ops: LOG_SOFTMAX."
}
Also, I can’t use F.log_softmax(input, 1) as a model.
Thanks!
I observe the same thing with a Keras model.
Thus the problem should not come from ONNX.
import edgeimpulse as ei
import torch
import torch.nn as nn
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# define a model featuring only one logsoftmax layer (input_size=2*3)
pt_tensor = torch.randn(2, 3)
np_tensor = pt_tensor.numpy()
tf_tensor = tf.convert_to_tensor(np_tensor)
pt_model = nn.LogSoftmax(dim=1)
tf_model = keras.Sequential([layers.Activation('log_softmax')])
# check input/output relation
print(pt_tensor)
print(pt_model(pt_tensor))
print(tf_tensor)
print(tf_model(tf_tensor))
# exporting the model to ONNX format
print("Exporting the model to ONNX format...")
pt_model.eval()
dummy_input = torch.randn(1, 2, 3)
input_names = ["actual_input"]
output_names = ["output"]
torch.onnx.export(
    pt_model,
    dummy_input,
    "activity_pt.onnx",
    verbose=True,
    input_names=input_names,
    output_names=output_names,
    export_params=True,
    opset_version=11,
)
# edgeimpulse API key
ei.API_KEY = "XXX"
# estimate the RAM, ROM, and inference time for our model on the target hardware family
try:
    profile = ei.model.profile(
        model="activity_pt.onnx", device='cortex-m4f-80mhz'
    )
    print(profile.summary())
except Exception as e:
    print(f"Could not profile: {e}")

try:
    profile = ei.model.profile(
        model=tf_model, device='cortex-m4f-80mhz'
    )
    print(profile.summary())
except Exception as e:
    print(f"Could not profile: {e}")
Hi @adamsantamaria,
Indeed the op has not been ported to our internal codebase yet.
It will be added by the end of this week; we’ll keep you posted!
Aurelien
Hello @adamsantamaria,
@janjongboom added the log softmax op and it passed all the tests; it should be merged today.
I’ll test with your code sample once it’s ready.
Best,
Louis
Hello @adamsantamaria,
The Log Softmax op is now available.
I just managed to profile your “activity.onnx” test model:
Target results for float32:
===========================
{
    "device": "cortex-m4f-80mhz",
    "tfliteFileSizeBytes": 1112,
    "isSupportedOnMcu": true,
    "memory": {
        "tflite": {
            "ram": 2132,
            "rom": 24360,
            "arenaSize": 1964
        },
        "eon": {
            "ram": 760,
            "rom": 11824
        }
    },
    "timePerInferenceMs": 1
}
Performance on device types:
============================
{
    "variant": "float32",
    "lowEndMcu": {
        "description": "Estimate for a Cortex-M0+ or similar, running at 40MHz",
        "timePerInferenceMs": 24,
        "memory": {
            "tflite": {
                "ram": 2132,
                "rom": 24360
            },
            "eon": {
                "ram": 760,
                "rom": 11824
            }
        },
        "supported": true
    },
    "highEndMcu": {
        "description": "Estimate for a Cortex-M7 or other high-end MCU/DSP, running at 240MHz",
        "timePerInferenceMs": 2,
        "memory": {
            "tflite": {
                "ram": 2132,
                "rom": 24360
            },
            "eon": {
                "ram": 760,
                "rom": 11824
            }
        },
        "supported": true
    },
    "highEndMcuPlusAccelerator": {
        "description": "Most accelerators only accelerate quantized models.",
        "timePerInferenceMs": 2,
        "memory": {
            "tflite": {
                "ram": 2132,
                "rom": 24360
            },
            "eon": {
                "ram": 760,
                "rom": 11824
            }
        },
        "supported": true
    },
    "mpu": {
        "description": "Estimate for a Cortex-A72, x86 or other mid-range microprocessor running at 1.5GHz",
        "timePerInferenceMs": 1,
        "rom": 1112.0,
        "supported": true
    },
    "gpuOrMpuAccelerator": {
        "description": "Estimate for a GPU or high-end neural network accelerator",
        "timePerInferenceMs": 1,
        "rom": 1112.0,
        "supported": true
    }
}
None
Best,
Louis