Arduino Portenta MBED Core version 2.5.2 released

Arduino today released Mbed core version 2.5.2: https://github.com/arduino/ArduinoCore-mbed/releases/tag/2.5.2

This is very relevant to Edge Impulse, as the camera now supports 320x320 resolution and the IDE can split the memory between the cores in three ways: 50:50, 75:25, or 100:0. (With zero memory, the M4 core needs to load from the SD card on the Vision Shields or the breakout board.)

Does anyone know the code to show the memory information relevant to Edge Impulse? I am trying to load a model that will not run on the M7 core with the regular 50:50 memory split, but will run on the M7 if we switch to the 75:25 split.

I was thinking a statement like this would at least say how big my vision frame buffer is:

    ei_printf("sizeof(frame_buffer): %u\n", (unsigned int) sizeof(frame_buffer));

I am also wondering how to show how big the Edge Impulse model is. Does anyone know any tricks for showing memory information in a readable format?
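Something along these lines is the kind of readable printout I have in mind. It assumes the frame_buffer declared in the vision example and the EI_CLASSIFIER_* macros from model_metadata.h in the exported Arduino library; the library header name below is hypothetical, so double-check the macro and header names against your own export.

    #include <My_Project_inferencing.h>   // hypothetical name; use your exported Edge Impulse library header

    // camera buffer as declared in the Portenta vision example later in this thread
    uint8_t frame_buffer[320 * 240] __attribute__((aligned(32)));

    void print_impulse_info() {
        ei_printf("Frame buffer:  %u bytes\n", (unsigned int) sizeof(frame_buffer));
        ei_printf("Impulse input: %dx%d\n", EI_CLASSIFIER_INPUT_WIDTH, EI_CLASSIFIER_INPUT_HEIGHT);
        ei_printf("Labels:        %d\n", EI_CLASSIFIER_LABEL_COUNT);
    }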

@janjongboom @dansitu I am getting TFLite memory arena issues when using the expanded Portenta memory (from ~1 MB to 2 MB). Is it possible that for Arduino the TFLite arena is set in the Edge Impulse code and is not flexible? Can someone show me where to find the code that sets the model size? I know where it is done using TensorFlow Lite, just not using Edge Impulse.

3 minutes later!

… Not sure why I have to post a question on the forum before I think of a way to solve the issue.

In the file trained_model_compiled.cpp there is this variable. Where does that maximum size come from? I will probably play around with changing it, but is this something Edge Impulse could set based on whether the Arduino has more memory? Or is this not the issue?

constexpr int kTensorArenaSize = 286432;
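For context, in plain TensorFlow Lite Micro terms (the route I already know), this constant is the size of the scratch arena handed to the interpreter: every tensor and activation buffer is carved out of it, and AllocateTensors() fails if it is too small. A rough sketch only; the include paths and interpreter constructor vary between TFLM versions, and model_data is a stand-in for the model flatbuffer:

    #include "tensorflow/lite/micro/all_ops_resolver.h"
    #include "tensorflow/lite/micro/micro_error_reporter.h"
    #include "tensorflow/lite/micro/micro_interpreter.h"
    #include "tensorflow/lite/schema/schema_generated.h"

    extern const unsigned char model_data[];             // hypothetical model flatbuffer

    constexpr int kTensorArenaSize = 286432;             // the value from trained_model_compiled.cpp
    static uint8_t tensor_arena[kTensorArenaSize] __attribute__((aligned(16)));

    void run_model_once() {
        static tflite::MicroErrorReporter error_reporter;
        static tflite::AllOpsResolver resolver;

        const tflite::Model* model = tflite::GetModel(model_data);
        static tflite::MicroInterpreter interpreter(
            model, resolver, tensor_arena, kTensorArenaSize, &error_reporter);

        // If kTensorArenaSize is smaller than the model needs, this is where it fails.
        if (interpreter.AllocateTensors() != kTfLiteOk) {
            printf("AllocateTensors() failed - is the arena too small?\n");
            return;
        }
        // ... fill interpreter.input(0), call interpreter.Invoke(), read interpreter.output(0)
    }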

… a few minutes later

So this is what I am seeing, using a model that is slightly bigger than what will fit on the regular M7 core.

I am trying to figure out whether this is an Arduino issue or an Edge Impulse issue. It looks like the software is not allocating the arena, which might be an Arduino issue. Does anyone have any suggestions?


Actually, the above issue might have more to do with me trying the camera at 320x320, so I am testing again using the regular 320x240 camera setting.

@Rocksetta this number is calculated by the EON Compiler and covers all allocations that need to be made by the TFLite kernels. I’m not sure how the memory is managed on the Portenta, but:

  1. Default allocation is on the heap; maybe the heap is not expanded to the full memory, or there is already a lot allocated? There’s https://os.mbed.com/blog/entry/Tracking-memory-usage-with-Mbed-OS/ but I’m not sure if this is enabled for the Portenta.
  2. I'm not sure if everything is in one bank, or how the memory is split up. You can set EI_CLASSIFIER_ALLOCATION_STATIC to allocate statically (not on the heap) and then you have the freedom to put this into a separate section of RAM (a small sketch of the idea is below).

@rjames might know more about the memory layout?
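A small sketch of what point 2 could look like in practice. The section name .tensor_arena_ram is hypothetical and would need a matching region in the Portenta linker script, which I have not verified; the generated trained_model_compiled.cpp quoted in the next post uses the same pattern in its HIMAX_GNU branch.

    // Build with EI_CLASSIFIER_ALLOCATION_STATIC defined so the SDK skips the heap path,
    // then place the big buffer in a dedicated RAM section via a GCC section attribute.
    // ".tensor_arena_ram" is a made-up section name; the linker script decides where it lands.
    __attribute__((section(".tensor_arena_ram"), aligned(16)))
    static uint8_t tensor_arena_static[286432];   // size taken from kTensorArenaSize above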

Thanks @janjongboom, trying that now. I know in the vision code we set
uint8_t frame_buffer[320*240] __attribute__((aligned(32)));
but I am not sure if that is at all relevant.

Here are the relevant parts of the trained_model_compiled.cpp file. What are the other options?


#if defined(EI_CLASSIFIER_ALLOCATION_STATIC)
uint8_t tensor_arena[kTensorArenaSize] ALIGN(16);
#elif defined(EI_CLASSIFIER_ALLOCATION_STATIC_HIMAX)
#pragma Bss(".tensor_arena")
uint8_t tensor_arena[kTensorArenaSize] ALIGN(16);
#pragma Bss()
#elif defined(EI_CLASSIFIER_ALLOCATION_STATIC_HIMAX_GNU)
uint8_t tensor_arena[kTensorArenaSize] ALIGN(16) __attribute__((section(".tensor_arena")));
#else
#define EI_CLASSIFIER_ALLOCATION_HEAP 1
uint8_t* tensor_arena = NULL;
#endif


… a few minutes later

So I added

#define EI_CLASSIFIER_ALLOCATION_STATIC

to my sketch, but it does not seem to have any positive effect. Should I try the other options? I probably will if there is no reply soon. :grinning:
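One thing worth checking, offered as a guess rather than a confirmed cause: a #define in the .ino only exists in the sketch’s own translation unit, while the #if shown above lives in trained_model_compiled.cpp, which is compiled separately and therefore never sees a sketch-level define. A minimal illustration of that, with the fix being to make the define visible where that .cpp file is compiled (for example as a global compiler flag, or in a header that file already includes):

    // sketch.ino -- this define is only seen while compiling the sketch itself
    #define EI_CLASSIFIER_ALLOCATION_STATIC 1

    // trained_model_compiled.cpp -- a separate translation unit; the define above
    // is not visible here, so the #else (heap) branch is still taken
    #if defined(EI_CLASSIFIER_ALLOCATION_STATIC)
    // static arena ...
    #else
    #define EI_CLASSIFIER_ALLOCATION_HEAP 1
    #endif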

Hi @Rocksetta,

You can use the Mbed URL that @janjongboom linked to get memory information just before classification. The core has it enabled, so following the steps in the URL and adding print_memory_info() could prove useful for debugging this.
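For reference, the helper from that Mbed OS blog post looks roughly like this (adapted slightly so the heap line also shows the max value, matching the output below; it relies on the heap and stack statistics being enabled in the build, as the blog describes):

    #include "mbed.h"
    #include "mbed_stats.h"

    void print_memory_info() {
        // allocate enough room for every thread's stack statistics
        int cnt = osThreadGetCount();
        mbed_stats_stack_t *stats = (mbed_stats_stack_t*) malloc(cnt * sizeof(mbed_stats_stack_t));

        cnt = mbed_stats_stack_get_each(stats, cnt);
        for (int i = 0; i < cnt; i++) {
            printf("Thread: 0x%lX, Stack size: %lu / %lu\r\n",
                   stats[i].thread_id, stats[i].max_size, stats[i].reserved_size);
        }
        free(stats);

        // grab the heap statistics
        mbed_stats_heap_t heap_stats;
        mbed_stats_heap_get(&heap_stats);
        printf("Heap size: %lu / %lu bytes (max: %lu)\r\n",
               heap_stats.current_size, heap_stats.reserved_size, heap_stats.max_size);
    }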

I’ve been dabbling around with core 2.5.2 as well and ran into a similar issue with not enough RAM. I managed using MobileNetV1 0.25, which produces a model of 108.5 kB (in my case) and allowed me to allocate enough to run an impulse. This was my output and my rationale for selecting that model size:

Thread: 0x24059408, Stack size: 2232 / 8192
Thread: 0x2404A848, Stack size: 328 / 896
Thread: 0x2404A804, Stack size: 104 / 768
Thread: 0x240590C8, Stack size: 128 / 256
Thread: 0x24052F9C, Stack size: 1792 / 32768
Heap size: 18113 / 166128 bytes (max: 40385)
Predictions (DSP: 0 ms., Classification: 54 ms., Anomaly: 0 ms.):
    lamp:       0.457031
    plant:      0.019531
    unknown:    0.523438
Starting inferencing in 2 seconds..

As for printing the frame buffer size: we always capture at full frame resolution, so printing that static information won’t be very useful.


This heap size looks waaaaay too small though, there’s 2 MB of RAM, right? Do we have the .map file?

Thanks @janjongboom and @rjames for the print_memory_info() example. Not sure about the map file.

If anyone can word the issue intelligently, can you put it on the ArduinoCore-mbed GitHub at https://github.com/arduino/ArduinoCore-mbed/issues?
Make sure to add me (@hpssjellis) and @facchinm, as he seems to be the lead on these issues. If you want to describe it here, I will put in the issue.

I think I will work on something else but this is what I got today:

- M7 core working with 50:50 memory split, small vision model
- M7 core working with 100:0 memory split, small vision model
- M7 core crashed with 100:0 memory split, slightly larger vision model
- Another M7 core crash with 100:0 memory split, slightly different vision model

If you can word what you think the problem is, I will put it on the GitHub issues.

Hi,

No, there’s 1 MB of RAM in total, which is split into segments. See the image below.

The heap is placed in the 512 KB (RAM) segment. It is important to note that the flash split option does not affect the heap size; the heap size is determined by how much (static) RAM is used. For example, the ~242 KB heap in the output below is roughly what remains of that 512 KB segment after the application's static data.

With regard to the map file, you can find it (on Linux) at /tmp/arduino-sketch-<some-number>/*.map.
See below for the map file, linker script, and the output of print_memory_info for a (WIP) firmware application with the 100:0 flash split.

Inferencing settings:
	Image resolution: 96x96
	Frame size: 9216
	No. of classes: 3
Thread: 0x24045408, Stack size: 2096 / 8192
Thread: 0x24036848, Stack size: 328 / 896
Thread: 0x24036804, Stack size: 104 / 768
Thread: 0x240450C8, Stack size: 128 / 256
Thread: 0x2403EF9C, Stack size: 1792 / 32768
Heap size: 17849 / 248048 bytes (max: 17949)
Starting inferencing in 2 seconds...
Taking photo...
Predictions (DSP: 1 ms., Classification: 54 ms., Anomaly: 0 ms.):
    lamp: 	0.312500
    plant: 	0.027344
    unknown: 	0.660156

Since my last print_memory_info I’ve done some memory optimizations. You can see now that the heap is 0x3c8f0 (248048 bytes; ~242 KB). In our case the heap is smaller than in yours because we also have audio support, which adds the .pdm_section and takes from the RAM segment. We also allocate out_buf, which holds the output 96x96 image, on the heap.

There are also RAM_D2 and RAM_D3 sections, which look dedicated to lwip and openamp, respectively; RAM_D2 is 288 KB and RAM_D3 is 64 KB.

I’m currently investigating whether these sections can be freed if they are not used by the application.

P.S.: At the bottom of your Arduino IDE screenshots, after a build you can see how much RAM is available to your application; in your case this was 355936 bytes. I didn’t know this either until I spotted it :wink:

// Raul


This conversation continues on the Arduino MBED core issues at

Not sure how I missed knowing about the 8 MB of extra memory. :confounded:

Quote from Martino Facchin

> …the Portenta has an external 8MB RAM module (albeit slower), you could have a huge heap by just using Portenta_SDRAM library and replacing the calls to malloc and free with ea_malloc and ea_free
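A sketch of what the quoted suggestion could look like for the tensor arena. The header name SDRAM.h and the SDRAM.begin() call are assumptions about the Portenta_SDRAM library; ea_malloc/ea_free come straight from the quote, and the size is the kTensorArenaSize value from earlier in the thread.

    #include <SDRAM.h>   // assumption: header provided by the Portenta_SDRAM library

    static uint8_t* tensor_arena = nullptr;

    void setup() {
        SDRAM.begin();                                 // assumption: initialises the external 8 MB SDRAM
        tensor_arena = (uint8_t*) ea_malloc(286432);   // per the quote: ea_malloc instead of malloc
        // ... point whatever expects the arena at this buffer, and release it with ea_free()
    }

    void loop() {}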