iPython notebook - can't unpack .npy data

Hi everyone,

I’m watching a great Hackaday Remoticon TinyML talk and of course trying to replicate it/experiment myself. I noticed that there is an “edit as an iPython notebook” option, so I downloaded the notebook, and here’s the problem: when running this cell

with open('x_train.npy', 'wb') as file:
with open('y_train.npy', 'wb') as file:
X = np.load('x_train.npy')
Y = np.load('y_train.npy')[:,0]

I’m getting a ValueError: "Cannot load file containing pickled data when allow_pickle=False".
I thought something had changed in NumPy’s np.load() function, so I tried forcing allow_pickle=True, but that doesn’t work either:
OSError: Failed to interpret file 'x_train.npy' as a pickle

The problem seems to be with the X values: y_train.npy gets processed without any problem. The y_train.npy file is almost 12 megabytes, while x_train.npy is only 158 kilobytes, which makes me wonder whether it gets created/downloaded properly.

Any idea what it could be?
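
One quick way to test the "not created/downloaded properly" suspicion is to look at the file itself before handing it to NumPy. A valid .npy file starts with the six bytes \x93NUMPY, and np.load() falls back to treating a file as a pickle when it finds neither that signature nor a zip signature, which would be consistent with both error messages above. The sketch below uses only the standard library; the helper name inspect_npy_header is made up for illustration and is not part of the exported notebook.

```python
import os

NPY_MAGIC = b"\x93NUMPY"  # first six bytes of every valid .npy file


def inspect_npy_header(path):
    """Return (size_in_bytes, has_npy_magic) for the file at `path`.

    Hypothetical helper for illustration; it only reads the first bytes
    and never unpickles anything.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(len(NPY_MAGIC))
    return size, head == NPY_MAGIC


if __name__ == "__main__":
    # File names taken from the notebook cell quoted above.
    for name in ("x_train.npy", "y_train.npy"):
        if os.path.exists(name):
            size, ok = inspect_npy_header(name)
            print(f"{name}: {size} bytes, NPY signature present: {ok}")
        else:
            print(f"{name}: not found")
```

If the signature is missing, the file is probably not NumPy data at all (for example a truncated download or an error page saved under that name), and allow_pickle=True will not help.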

Hi @nebelgrau77, I’m the person that gave that workshop on Hackaday (I’m glad you’re enjoying it!). Can you link to where you got that Notebook file? I don’t recognize that code as coming from one of the workshop’s scripts.

The GitHub repo that we used in the workshop can be found here: https://github.com/ShawnHymel/ei-keyword-spotting. In it, there should be two options: one for running a Jupyter Notebook on Colab (remotely) or running the Python curation script (locally).

Hi Shawn,

It’s not from your scripts (which work great BTW!), it’s an EdgeImpulse dashboard feature. In Impulse design/NN Classifier there are three little dots at the top, with two options: Keras (expert) mode, where you can fine-tune the model’s code in Python, and edit as iPython notebook. The notebook gets exported without any problem, but something must be going wrong when the .npy files are generated.


@Nebelgrau, which project is this for? I see you have multiple.

Hello Jan, it’s the speech_recog project, it’s the only one where I tried this option.

I’ll look into it in detail tomorrow, but for now you can grab the labels NPY file from Dashboard.

I think there is just some problem with the first file: I can open three of them with a simple np.load(), but the first one keeps giving me problems, as if it was somehow malformed (I just downloaded them all from the dashboard).

Edit: Yep, confirmed: a malformed file, the X training data. But only in the speech_recog_2 project; speech_recog and speech_recog_2_v2 are OK, and the downloaded notebook unpacks their .npy files fine!

@nebelgrau77 The plot thickens… I’ve just downloaded the X and Y .npy files from both the Dashboard and the iPython notebook export (shapes: X (4799, 637), Y (4799, 4)), and they import fine for me in Python 3.9.0. This was for project ID 11649 (speech_recog_2).
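
For anyone who wants to repeat that check, here is a rough sketch of how the shapes can be verified. The helper names describe_npy and rows_match are invented for this example, and allow_pickle=False is passed explicitly so that a malformed file fails loudly instead of being unpickled.

```python
import numpy as np


def describe_npy(path):
    """Load a .npy file without pickle support and return (shape, dtype).

    Hypothetical helper; np.load raises if the file is not a plain array.
    """
    arr = np.load(path, allow_pickle=False)
    return arr.shape, arr.dtype


def rows_match(x_path, y_path):
    """Return True when X and Y contain the same number of samples."""
    x_shape, _ = describe_npy(x_path)
    y_shape, _ = describe_npy(y_path)
    return x_shape[0] == y_shape[0]


# Example usage with the file names from the notebook cell:
#   print(describe_npy("x_train.npy"))
#   print(describe_npy("y_train.npy"))
#   print(rows_match("x_train.npy", "y_train.npy"))
```

With the files described above, X and Y should both report 4799 rows.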

Did you retrain this model since your last message by any chance?

Nope, but it’s working fine for me, too, just testing it as I’m typing. Mine is Python 3.6.10, miniconda installation. Must’ve been some momentary glitch, thanks for checking!

And the feature itself is great, makes it easy to add some graphs to see how the loss/val loss behave over time, all the model history stuff and such.

Hmm… If you encounter this again, would you mind versioning your project (Versioning tab in the Studio)? That caches all intermediate state.

And yeah, we want to add those things to the Studio at some point, but great to use it like that in the meantime!

Sure, will do! I’ll keep an eye on it :)