Processed features not matching

paulphilip · September 4, 2020, 9:31am

Hi,

I am trying to run the inference on my embedded target.

These are the raw features that I fed to the system:

7, 989, 147, 10, 1004, 86, -28, 999, 71, -39, 985, 122, -7, 986, 151, 18, 1003, 91, -13, 1002, 61, -42, 990, 98, -20, 984, 148, 16, 1000, 109, 2, 1003, 69, -38, 992, 94, -33, 981, 151, 8, 995, 133, 13, 1004, 81, -31, 993, 82, -42, 985, 130, -6, 991, 141, 20, 1004, 84, -13, 997, 68, -43, 989, 107, -21, 988, 142, 17, 1002, 99, 2, 999, 74, -39, 989, 104, -33, 984, 149, 7, 997, 120, 14, 1001, 81, -30, 992, 94, -42, 984, 133, -4, 994, 127, 20, 1003, 77, -16, 995, 78, -37, 989, 115, -20, 990, 135, 14, 1002, 92, -1, 998, 79, -34, 988, 113, -34, 984, 148, 3, 1000, 109, 11, 1001, 82, -25, 990, 104, -38, 984, 135, -11, 995, 114, 17, 1004, 74, -9, 995, 87, -35, 985, 123, -25, 991, 126, 13, 1005, 81, 2, 998, 87, -28, 987, 122, -38, 983, 144, -2, 999, 99, 13, 1002, 82, -16, 991, 113, -43, 981, 137, -15, 997, 103, 16, 1006, 71, -2, 994, 98, -37, 982, 130, -30, 993, 116, 12, 1008, 75, 11, 996, 96, -26, 984, 134, -44, 985, 137, -2, 1005, 87, 17, 1001, 86, -14, 986, 125, -45, 984, 137, -18, 1001, 89, 20, 1005, 72, 3, 990, 108, -41, 982, 132, -33, 997, 104, 12, 1008, 71, 12, 992, 108, -28, 982, 142, -43, 991, 127, -1, 1006, 76, 18, 997, 94, -14, 984, 136, -48, 986, 131, -16, 1004, 75, 19, 1003, 72, 3, 988, 121, -41, 985, 133, -32, 999, 95, 8, 1004, 67, 14, 990, 117, -28, 984, 147, -41, 994, 117, -2, 1006, 72, 15, 993, 103, -12, 984, 141, -44, 990, 123, -17, 1006, 67, 15, 999, 78, 5, 987, 130, -41, 986, 132, -31, 1001, 83, 7, 1003, 70, 13, 988, 126, -27, 983, 148, -42, 995, 110, -6, 1006, 70, 17, 992, 110, -9, 983, 144, -46, 991, 116, -18, 1007, 61, 14, 998, 87, 6, 985, 136, -40, 987, 127, -34, 1004, 74, 6, 1003, 73, 21, 987, 137, -31, 981, 149, -46, 996, 97, -4, 1008, 65, 24, 991, 120, -10, 981, 149, -50, 993, 109, -21, 1010, 53, 23, 995, 93, 8, 983, 144, -46, 988, 122, -37, 1006, 65, 11, 1003, 79, 23, 985, 148, -32, 982, 147, -50, 1000, 85, -3, 1008, 64, 30, 988, 132, -16, 979, 153, -54, 995, 95, -18, 1012, 48, 24, 993, 102, 5, 979, 152, -47, 990, 115, -34, 1009, 57, 15, 999, 87, 
22, 982, 157, -37, 984, 141, -47, 1002, 76, -3, 1006, 68, 28, 985, 142, -18, 980, 153, -50, 999, 84, -18, 1010, 46, 24, 990, 114, 2, 980, 157, -46, 994, 106, -34, 1009, 47, 12, 996, 96, 19, 982, 166, -30, 990, 135, -48, 1003, 66, -7, 1001, 71, 27, 985, 151, -14, 984, 149, -50, 1000, 74, -22, 1008, 48, 22, 989, 125, 6, 983, 156, -44, 996, 96, -37, 1008, 46, 12, 994, 105, 20, 982, 171, -30, 991, 130, -47, 1004, 59, -7, 999, 78, 25, 984, 157, -12, 985, 144, -50, 1000, 68, -24, 1006, 49, 25, 989, 133, 8, 984, 155, -44, 996, 89, -39, 1008, 44, 12, 992, 117, 20, 980, 172, -30, 991, 123, -48, 1007, 53, -6, 999, 83, 26, 982, 164, -15, 986, 141, -49, 1004, 56, -22, 1006, 51, 27, 985, 141, 5, 983, 156, -45, 1000, 82, -37, 1007, 44, 15, 989, 124, 19, 978, 173, -34, 993, 114, -46, 1008, 50, -5, 997, 93, 25, 978, 170, -15, 989, 132, -47, 1008, 47, -21, 1005, 59, 23, 980, 151, 1, 984, 152, -41, 1004, 71, -33, 1008, 45, 12, 984, 136, 14, 979, 175, -29, 996, 103, -44, 1007, 47, -5, 994, 103, 22, 978, 171, -14, 991, 123, -43, 1008, 43, -22, 1001, 65, 18, 979, 159, 3, 988, 144, -37, 1006, 62, -38, 1005, 51, 12, 985, 147, 16, 982, 174

and these are the processed features generated in the Edge Impulse Model Testing block:

24.1301, 25.5512, 28.6131, 27.2835, 3.6525, 21.6535, 1.0428, 0.0007, 0.0040, 0.0092, 0.0050, 8.9582, 25.9843, 9.6373, 27.7165, 1.6693, 22.5197, 0.5812, 0.0000, 0.0008, 0.0009, 0.0010, 34.1839, 25.9843, 37.5552, 27.7165, 4.8158, 8.6614, 4.6296, 0.0040, 0.0121, 0.0111, 0.0103

and these are the processed features generated by the embedded target:

24.130144 18.471909 0.671408 15.894431 0.537333 9.450743 0.470300 0.000677 0.003979 0.009198 0.005002 8.958190 8.591585 0.560658 15.894431 0.263774 7.302847 0.254865 0.000016 0.000823 0.000923 0.000965 34.183895 25.774754 37.844513 8.591585 4.665249 21.908543 1.929370 0.004009 0.012078 0.011126 0.010309

Because of this, the predicted outcome on the device is wrong compared to what is seen in Model Testing.

Please advise

/cc @Hardik

janjongboom · September 4, 2020, 1:08pm

Hi @paulphilip, thanks a lot for the report, here’s a patch:

diff --git a/edge-impulse-sdk/dsp/spectral/feature.hpp b/edge-impulse-sdk/dsp/spectral/feature.hpp
index 03c1b9f4..7916245e 100644
--- a/edge-impulse-sdk/dsp/spectral/feature.hpp
+++ b/edge-impulse-sdk/dsp/spectral/feature.hpp
@@ -134,11 +134,12 @@ public:
             }
 
             // multiply by 2/N
-            numpy::scale(&fft_matrix, (2.0f / static_cast<float>(fft_matrix.cols)));
+            numpy::scale(&fft_matrix, (2.0f / static_cast<float>(fft_length)));
 
             // we're now using the FFT matrix to calculate peaks etc.
             EI_DSP_MATRIX(peaks_matrix, fft_peaks, 2);
-            ret = spectral::processing::find_fft_peaks(&fft_matrix, &peaks_matrix, sampling_freq, fft_peaks_threshold);
+            ret = spectral::processing::find_fft_peaks(&fft_matrix, &peaks_matrix,
+                sampling_freq, fft_peaks_threshold, fft_length);
             if (ret != EIDSP_OK) {
                 EIDSP_ERR(EIDSP_MATRIX_SIZE_MISMATCH);
             }
diff --git a/edge-impulse-sdk/dsp/spectral/processing.hpp b/edge-impulse-sdk/dsp/spectral/processing.hpp
index 67fe43ad..7d03255e 100644
--- a/edge-impulse-sdk/dsp/spectral/processing.hpp
+++ b/edge-impulse-sdk/dsp/spectral/processing.hpp
@@ -209,12 +209,6 @@ namespace processing {
             prev = in[ix];
         }
 
-        // printf("find_peak_indexes returned: ");
-        // for (size_t ix = 0; ix < out_ix; ix++) {
-        //     printf("%d ", out[ix]);
-        // }
-        // printf("\n");
-
         *peaks_found = out_ix;
 
         return EIDSP_OK;
@@ -232,7 +226,8 @@ namespace processing {
         matrix_t *fft_matrix,
         matrix_t *output_matrix,
         float sampling_freq,
-        float threshold)
+        float threshold,
+        uint16_t fft_length)
     {
         if (fft_matrix->rows != 1) {
             EIDSP_ERR(EIDSP_MATRIX_SIZE_MISMATCH);
@@ -244,16 +239,16 @@ namespace processing {
 
         int ret;
 
-        int N = static_cast<int>(fft_matrix->cols);
+        int N = static_cast<int>(fft_length);
         float T = 1.0f / sampling_freq;
 
-        EI_DSP_MATRIX(freq_space, 1, N);
+        EI_DSP_MATRIX(freq_space, 1, fft_matrix->cols);
         ret = numpy::linspace(0.0f, 1.0f / (2.0f * T), floor(N / 2), freq_space.buffer);
         if (ret != EIDSP_OK) {
             EIDSP_ERR(ret);
         }
 
-        EI_DSP_MATRIX(peaks_matrix, output_matrix->rows * 4, 1);
+        EI_DSP_MATRIX(peaks_matrix, output_matrix->rows * 10, 1);
 
         uint16_t peak_count;
         ret = find_peak_indexes(fft_matrix, &peaks_matrix, 0.0f, &peak_count);
@@ -265,10 +260,9 @@ namespace processing {
         std::vector<freq_peak_t> peaks;
         for (uint8_t ix = 0; ix < peak_count; ix++) {
             freq_peak_t d;
-            // @todo: something somewhere does not go OK... and these numbers are dependent on
-            // the FFT length I think... But they are an OK approximation for now.
-            d.freq = freq_space.buffer[static_cast<uint32_t>(peaks_matrix.buffer[ix])] / 2.032258f;
-            d.amplitude = fft_matrix->buffer[static_cast<uint32_t>(peaks_matrix.buffer[ix])] / 1.969326f;
+
+            d.freq = freq_space.buffer[static_cast<uint32_t>(peaks_matrix.buffer[ix])];
+            d.amplitude = fft_matrix->buffer[static_cast<uint32_t>(peaks_matrix.buffer[ix])];
             if (d.amplitude < threshold) {
                 d.freq = 0.0f;
                 d.amplitude = 0.0f;

This uncovered two issues: 1) We were calculating 2/N with the wrong N (the number of columns in the FFT matrix instead of the FFT length). This factor scales the FFT output that we then search for peaks. I’d seen this bug before, but we only compensated for it with correction factors (see the removed comment). 2) The scratch buffer we use to find peaks in the FFT is normally large enough, but not for your signal, so the DSP code dropped peaks that occur later in your signal. We’ll put a proper patch out for the SDK early next week, but you can apply the patch above if you need it earlier.
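
The two divisors removed by the patch (2.032258 and 1.969326) are close to what you get when N is taken from the width of the spectrum matrix rather than from the FFT length. A minimal sketch of that arithmetic follows. It assumes an FFT length of 128, a spectrum matrix of 128 / 2 + 1 = 65 columns, and a linspace that includes its endpoint the way NumPy's default does. None of these values are stated in the thread, and the helper names are hypothetical:

```cpp
#include <cstddef>

// Spacing between neighbouring points of an endpoint-inclusive
// linspace(0, stop, num), as in the patched call
// linspace(0, 1 / (2T), floor(N / 2), ...).
double linspace_step(double stop, std::size_t num) {
    return stop / static_cast<double>(num - 1);
}

// Amplitude scale factor 2/N applied to the FFT output.
double amplitude_scale(std::size_t n) {
    return 2.0 / static_cast<double>(n);
}

// How much coarser the frequency axis becomes when N is the column count
// (assumed 65) instead of the FFT length (assumed 128).
double freq_ratio(std::size_t cols, std::size_t fft_length, double nyquist_hz) {
    return linspace_step(nyquist_hz, cols / 2) /
           linspace_step(nyquist_hz, fft_length / 2);
}

// How much larger the amplitudes become under the same mix-up.
double amplitude_ratio(std::size_t cols, std::size_t fft_length) {
    return amplitude_scale(cols) / amplitude_scale(fft_length);
}
```

Under these assumptions, freq_ratio(65, 128, 55.0) is 63/31 ≈ 2.032258 and amplitude_ratio(65, 128) is 128/65 ≈ 1.96923, which would explain why the old correction factors were only "an OK approximation".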

We’ll also release a fix on Monday for the issue you had yesterday with the mismatching number of samples. Some unfortunate luck with hitting bugs, my apologies!

paulphilip · September 4, 2020, 3:13pm

Hi,

Thanks for the fix patch. I am now getting the Processed features accurately.

24.130144 25.551182 28.613092 27.283464 3.652510 21.653543 1.042838 0.000677 0.003979 0.009198 0.005002 8.958190 25.984253 9.637259 27.716536 1.669298 22.519686 0.581195 0.000016 0.000823 0.000923 0.000965 34.183895 25.984253 37.555218 27.716536 4.815798 8.661417 4.629587 0.004009 0.012078 0.011126 0.010309

Please notify when this patch goes online
Thanks
/cc @Hardik

paulphilip · September 8, 2020, 11:16am

Hi @janjongboom,

Thanks for notifying. When I check the sliding window size in the Model Testing window, I find it to be 63, whereas per the project it should be EI_CLASSIFIER_FREQUENCY x EI_CLASSIFIER_RAW_SAMPLES_PER_FRAME x (200ms) = 66. Please share how to configure the sliding window size from the parameters in model_metadata.h.

/cc @Hardik

janjongboom · September 8, 2020, 12:21pm

@paulphilip The SDK does not know about the sliding window size; it’s used during training only. In your firmware, just construct one window of 660 features (110Hz x 2 seconds x 3 axes) and call run_classifier.
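
A minimal sketch of keeping such a window up to date on the device, so that the buffer always holds the most recent 660 values before it is handed to run_classifier. The SDK call itself is left out, and kWindowValues and slide_window are hypothetical names used only for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// 110 Hz x 2 seconds x 3 axes, the numbers quoted above.
constexpr std::size_t kWindowValues = 110 * 2 * 3;

// Drops the oldest values and appends the newest ones, keeping the
// window length unchanged.
void slide_window(std::vector<float>& window, const std::vector<float>& fresh) {
    if (fresh.size() >= window.size()) {
        // More new data than the window holds: keep only the newest part.
        const auto skip = static_cast<std::ptrdiff_t>(fresh.size() - window.size());
        std::copy(fresh.begin() + skip, fresh.end(), window.begin());
        return;
    }
    const auto n = static_cast<std::ptrdiff_t>(fresh.size());
    // Move the oldest n values to the back, then overwrite them.
    std::rotate(window.begin(), window.begin() + n, window.end());
    std::copy(fresh.begin(), fresh.end(), window.end() - n);
}
```

The firmware would call slide_window with each batch of new sensor values (for example 66 values per 200 ms step) and then classify the full buffer.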

On the live classification: if the sliding window increase is not perfectly divisible by the frequency, we round the increase down. Because of some rounding weirdness in the way we process the frequency here (in this flow 1000 / 110 is stored as an intermediate result, rounded to 9.090909, and then multiplied by the expected window count (22), which gives 199.99998; that is below 200 and is then rounded down), we set this increase to 190.9 ms instead. I think a workaround would be to set the increase to 199 ms.

edit: Actually the issue is between using JavaScript on server side, and C++ on the embedded side:

C++

floor(200.0f / (1000.0f / 110.0f)) * 3.0f; // => 66

JS

Math.floor(200.0 / (1000.0 / 110.0)) * 3.0; // => 63
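
The difference comes from precision rather than from the languages themselves: the C++ line uses 32-bit floats, while JavaScript numbers are 64-bit doubles. A sketch that reproduces both results in C++, assuming IEEE 754 arithmetic and a compiler that evaluates float expressions in single precision (the function names are hypothetical):

```cpp
#include <cmath>

// Values per window increase with every intermediate kept in 32-bit float.
float increase_values_f32(float increase_ms, float freq_hz, float axes) {
    float interval_ms = 1000.0f / freq_hz;   // rounds slightly below 100/11
    return std::floor(increase_ms / interval_ms) * axes;
}

// The same calculation in 64-bit double, which is what JavaScript uses.
double increase_values_f64(double increase_ms, double freq_hz, double axes) {
    double interval_ms = 1000.0 / freq_hz;   // rounds slightly above 100/11
    return std::floor(increase_ms / interval_ms) * axes;
}
```

For 200 ms, 110 Hz and 3 axes, the float version yields 66 (the quotient rounds to exactly 22) and the double version yields 63 (the quotient lands just below 22 and is floored to 21).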

paulphilip · September 8, 2020, 1:09pm

Hi @janjongboom,

How can we represent it as a #define so that I am not bothered by the rounding error?
Currently I am using

#define SENSOR_OVERLAP_WINDOW (((int32_t)(EI_CLASSIFIER_FREQUENCY*0.199))*EI_CLASSIFIER_RAW_SAMPLES_PER_FRAME)

/cc @Hardik

janjongboom · September 8, 2020, 1:26pm

@paulphilip Try this:

floor(200 / (1000 / (EI_CLASSIFIER_FREQUENCY - 0.01f))) * EI_CLASSIFIER_RAW_SAMPLES_PER_FRAME

This gives me 63, but it’s dependent on the frequency. For e.g. 100Hz you don’t want the rounding.

In all fairness I’d not try to round this down, but rather just do the sliding window calculation the correct way on device.
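
One way to keep floating-point rounding out of the on-device calculation is to do it in integer arithmetic, multiplying before dividing. This is only a sketch: it assumes the frequency is a whole number of Hz and the increase a whole number of milliseconds, and kFrequencyHz, kAxes and window_increase_values are hypothetical stand-ins for the values in model_metadata.h:

```cpp
#include <cstdint>

// Stand-ins for EI_CLASSIFIER_FREQUENCY and
// EI_CLASSIFIER_RAW_SAMPLES_PER_FRAME in this thread's project.
constexpr std::int32_t kFrequencyHz = 110;
constexpr std::int32_t kAxes = 3;

// Whole samples that fit in increase_ms, times the values per sample.
// Multiplying first keeps the division exact when the result is integral.
constexpr std::int32_t window_increase_values(std::int32_t increase_ms,
                                              std::int32_t freq_hz,
                                              std::int32_t axes) {
    return (increase_ms * freq_hz / 1000) * axes;
}

static_assert(window_increase_values(200, kFrequencyHz, kAxes) == 66,
              "200 ms at 110 Hz is exactly 22 samples");
```

With a 199 ms increase the same function gives 63, matching the workaround mentioned earlier in the thread.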

paulphilip · September 8, 2020, 1:36pm

Hi @janjongboom,

I see that the code patch you have mentioned above has not gone online. Please confirm

/cc @Hardik

janjongboom · September 8, 2020, 1:36pm

@paulphilip Correct, this will be released with the SDK update later this week.

janjongboom · September 9, 2020, 1:53pm

@paulphilip @Hardik This is now online!

paulphilip · September 11, 2020, 4:20am

Hi @janjongboom,

I am facing a segmentation fault in the new release.

++ MbedOS Fault Handler ++

FaultType: HardFault

Context:
R0   : 00000021
R1   : 00000000
R2   : 00000000
R3   : 00000000
R4   : 200057C0
R5   : 200057C4
R6   : 00000014
R7   : 00000000
R8   : 20005850
R9   : 08042492
R10  : 08043012
R11  : 20005850
R12  : 00000000
SP   : 20005498
LR   : 00000000
PC   : 080059E0
xPSR : 210D0000
PSP  : 20005430
MSP  : 2002FFC0
CPUID: 410FC241
HFSR : 40000000
MMFSR: 00000000
BFSR : 00000000
UFSR : 00000100
DFSR : 00000008
AFSR : 00000000
Mode : Thread
Priv : Privileged
Stack: PSP

-- MbedOS Fault Handler --

When I compare the previous SDK release with the current SDK release, along with the patch you gave above, I can see a change in edge-impulse-sdk\classifier\ei_run_dsp.h.

72,74c72,89
<     // the spectral edges that we want to calculate (@todo, take this from the config)
<     float edges[] = { 0.1, 0.5, 1.0, 2.0, 5.0 };
<     matrix_t edges_matrix_in(sizeof(edges) / sizeof(edges[0]), 1, edges);
---
>     // the spectral edges that we want to calculate
>     matrix_t edges_matrix_in(64, 1);
>     size_t edge_matrix_ix = 0;
> 
>     char spectral_str[128] = { 0 };
>     if (strlen(config.spectral_power_edges) > sizeof(spectral_str) - 1) {
>         EIDSP_ERR(EIDSP_PARAMETER_INVALID);
>     }
>     memcpy(spectral_str, config.spectral_power_edges, strlen(config.spectral_power_edges));
> 
> 	char spectral_delim[] = ",";
> 	char *spectral_ptr = strtok(spectral_str, spectral_delim);
> 	while (spectral_ptr != NULL) {
>         edges_matrix_in.buffer[edge_matrix_ix] = atof(spectral_ptr);
>         edge_matrix_ix++;
> 		spectral_ptr = strtok(NULL, spectral_delim);
> 	}
>     edges_matrix_in.rows = edge_matrix_ix;

If I retain the old version of edge-impulse-sdk\classifier\ei_run_dsp.h, my code is running properly.
Kindly advise.
/cc @Hardik

janjongboom · September 11, 2020, 8:06am

@paulphilip very interesting crash, it’s actually an unaligned access in TensorFlow Lite which throws the hardfault (see below):

Crash Info:
	Crash location = tflite::ops::micro::fully_connected::Eval(TfLiteContext*, TfLiteNode*) [0x080038F0] (based on PC value)
	Caller location = __FRAME_END__ [0x10001680] (based on LR value)
	Stack Pointer at the time of crash = [20003430]
	Target and Fault Info:
		Processor Arch: ARM-V7M or above
		Processor Variant: C24
		Forced exception, a fault with configurable priority has been escalated to HardFault
		Unaligned access error has occurred

What’s weird though is that our own examples (which use the same power edges as your application) run without an issue both on the ST IoT Discovery Kit (int8 and float32 models) as well as under Valgrind on MacOS. The applications all pass CI too (where they run on additional targets as well).

Will look at it more in depth with a debugger after the weekend.
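
The "forced exception" and "unaligned access" lines in the crash info correspond to bits in the HFSR (40000000) and UFSR (00000100) values from the fault dump. A small sketch of that decoding, using the ARMv7-M bit positions for HFSR.FORCED (bit 30), UFSR.UNALIGNED (bit 8) and UFSR.DIVBYZERO (bit 9); describe_fault is a hypothetical helper and covers only these three bits:

```cpp
#include <cstdint>
#include <string>

constexpr std::uint32_t kHfsrForced = 1u << 30;    // fault escalated to HardFault
constexpr std::uint32_t kUfsrUnaligned = 1u << 8;  // unaligned memory access
constexpr std::uint32_t kUfsrDivByZero = 1u << 9;  // integer divide by zero

// Builds a short description from the decoded status bits.
std::string describe_fault(std::uint32_t hfsr, std::uint32_t ufsr) {
    std::string out;
    auto append = [&out](const char* text) {
        if (!out.empty()) out += "; ";
        out += text;
    };
    if (hfsr & kHfsrForced) append("forced escalation to HardFault");
    if (ufsr & kUfsrUnaligned) append("unaligned access");
    if (ufsr & kUfsrDivByZero) append("divide by zero");
    if (out.empty()) out = "no decoded bits set";
    return out;
}
```

Calling describe_fault(0x40000000, 0x00000100) with the values from the dump reports the forced escalation and the unaligned access.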

janjongboom · September 16, 2020, 1:35pm

@paulphilip I’ve been debugging on our end, but no luck. Even when skipping the DSP code altogether, this still fails during initialization. Would it be OK to share your finalized tflite model with the TensorFlow team?

janjongboom · September 25, 2020, 3:04pm

@paulphilip your model runs fine under EON by the way!