Bug with data ingestion API

I am using the data ingestion API to upload samples. Per the docs, I have added the x-label header but the data ingested into the platform has no label. Equally, the platform does not infer the label from the file name (file names include cats and dogs).

{'x-label': 'cats', 'x-api-key': 'KEY'}
{'x-label': 'dogs', 'x-api-key': 'KEY'}
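
For readers following along, here is a minimal sketch of the kind of request being described, assuming the `requests` library and the `/api/training/files` ingestion URL quoted later in this thread. The names `build_headers` and `upload_one` are hypothetical, and `KEY` stands in for a real project API key.

```python
import os

# URL as quoted later in this thread; swap "training" for "testing" as needed.
INGESTION_URL = "https://ingestion.edgeimpulse.com/api/training/files"


def build_headers(label, api_key):
    """Return the two headers printed in the post above."""
    return {"x-label": label, "x-api-key": api_key}


def upload_one(path, label, api_key, url=INGESTION_URL):
    """POST a single JPEG and return the decoded JSON body (hypothetical helper)."""
    import requests  # third-party; imported lazily so build_headers works without it

    with open(path, "rb") as handle:
        files = [("data", (os.path.basename(path), handle, "image/jpeg"))]
        response = requests.post(
            url, headers=build_headers(label, api_key), files=files, timeout=60
        )
    response.raise_for_status()
    return response.json()
```

Calling `upload_one("cats/cat_1.jpg", "cats", "KEY")` would send the same two headers shown above along with one file.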

Hello @AdamMiltonBarker,

I just tested with a new project and it seems to work as expected on my side.

I used the Python snippet from the Ingestion API page of the Edge Impulse API documentation.

Can you share your code so I can have a deeper look?

Best,

Louis

Hi, thanks for the reply. This is the code. The printouts above are the headers, printed just before the request is sent:

    def data_prepare(self):
        if (
            self.helper.confs["data"]["type"] == "local"
            and self.helper.confs["data"]["train_data"] == False
            and self.helper.confs["data"]["test_data"] == False
        ):
            print("** Make sure your data is in the data/train and data/test folders.")
            print("** Directory names in these folder will be used as the labels.")
            train_data = self.data.process_data(
                self.helper.confs["data"]["train_dir"],
                self.helper.confs["data"]["file_type"],
            )
            if not len(train_data):
                self.data_prepare()
            test_data = self.data.process_data(
                self.helper.confs["data"]["test_dir"],
                self.helper.confs["data"]["file_type"],
            )
            if not len(test_data):
                self.data_prepare()
                
            data_types = [
                "train",
                "test"
            ]
            
            for data_type in data_types:
                if data_type == "train":
                    current = train_data.items()
                else:
                    current = test_data.items()
                for label, data in current:
                    headers  = {"x-label": label, "x-api-key": self.project.get_project_key()}
                    print(headers)
                    resp_code, resp = self.helper.http_request(
                        "POST",
                        self.helper.confs["urls"][data_type+"_data_ingestion"],
                        headers,
                        False,
                        (
                            ("data", (os.path.basename(i), open(i, "rb"), "image/jpeg"))
                            for i in data[0]
                        ),
                    )
                    if resp_code == 200 and resp["success"] is not False:
                        if data_type == "train":
                            self.helper.set_train_data_complete()
                        else:
                            self.helper.set_test_data_complete()
                    else:
                        print("** There was an error with your request, please try again!")
                        self.data_prepare()
            return
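
A side note on the snippet above: it opens every image inside a generator expression and never closes the handles. The following is a minimal sketch of one way to build the same `("data", (name, handle, mime))` tuples while guaranteeing the files are closed afterwards. The helper name `multipart_files` is hypothetical and not part of the original code.

```python
import os
from contextlib import ExitStack, contextmanager


@contextmanager
def multipart_files(paths, mime_type="image/jpeg"):
    """Yield a list of multipart tuples; every file handle is closed on exit."""
    with ExitStack() as stack:
        yield [
            (
                "data",
                (os.path.basename(str(p)), stack.enter_context(open(p, "rb")), mime_type),
            )
            for p in paths
        ]
```

Assuming the same `requests`-based helper, the call site would look like `with multipart_files(data[0]) as files: requests.post(url, headers=headers, files=files)`.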

If the headers weren’t reaching the API, the request would be rejected, since the API key is sent in the same headers.

The http_request code:



    def http_request(self, method, address, headers, jsond = False, files = False):
        
        if method == "POST":
            if jsond:
                response = requests.post(
                    address, json=jsond, headers=headers)
            if files:
                response = requests.post(
                    address, files=files, headers=headers)
        
        if method == "GET":
            response = requests.get(
                address, headers=headers)
            
        response_code = response.status_code
        response_json = json.loads(response.text)

        return response_code, response_json

Just in case you would like to see the data processing code:


        
    def process_data(self, dir, file_type):

        data_list = {}
        for path in pathlib.Path(dir).iterdir():
            if path.is_dir():
                data_list[os.path.basename(path)]=[]
                data_list[os.path.basename(path)].append(list(
                    path.glob('*' + file_type)))
        return data_list
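
Note that `process_data` wraps each file list in another list, which is why the upload code indexes `data[0]`. As a sketch only, a flatter variant that maps each directory name directly to its files could look like this. The function name `collect_labelled_files` is hypothetical.

```python
import pathlib


def collect_labelled_files(root, file_type):
    """Map each sub-directory name (used as the label) to a sorted list of matching files."""
    labelled = {}
    for path in sorted(pathlib.Path(root).iterdir()):
        if path.is_dir():
            # path.name is the last path component, e.g. "cats" or "dogs".
            labelled[path.name] = sorted(path.glob("*" + file_type))
    return labelled
```

With this shape, the loop `for label, data in current` would receive the file list itself, so `data[0]` would become `data`.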

I have tested this a number of times now, running the code above with 2 training dirs (cats, dogs) and 2 testing dirs (cats, dogs). This is the output of the headers:

TRAINING
{'x-label': 'cats', 'x-api-key': 'KEY'}
{'x-label': 'dogs', 'x-api-key': 'KEY'}

TESTING
{'x-label': 'cats', 'x-api-key': 'KEY'}
{'x-label': 'dogs', 'x-api-key': 'KEY'}

Data is imported but labels are still missing.

As an additional update, I left out the x-label header completely and the label is still not determined from the file name.

Hello @AdamMiltonBarker,

Can you share which endpoint you are using in your URL? I don’t think I can see it in the code you shared.
Make sure to use /files rather than /data, which is legacy and might not support the x-label header.

Best,

Louis

Hi, the URL is determined here:

self.helper.confs["urls"][data_type+"_data_ingestion"],

self.helper.confs:

    "urls": {
        "device_connect_doc": "https://docs.edgeimpulse.com/docs/development-platforms/officially-supported-cpu-gpu-targets/nvidia-jetson-nano",
        "get_device": "https://studio.edgeimpulse.com/v1/api/{projectId}/device/{deviceId}",
        "get_devices": "https://studio.edgeimpulse.com/v1/api/{projectId}/devices",
        "login": "https://studio.edgeimpulse.com/v1/api-login",
        "project_create": "https://studio.edgeimpulse.com/v1/api/projects/create",
        "project_key_create": "https://studio.edgeimpulse.com/v1/api/{projectId}/apikeys",
        "test_data_ingestion": "https://ingestion.edgeimpulse.com/api/testing/files",
        "train_data_ingestion": "https://ingestion.edgeimpulse.com/api/training/files"
    }

In the case of data_type == "train":
https://ingestion.edgeimpulse.com/api/training/files

In the case of data_type == "test":
https://ingestion.edgeimpulse.com/api/testing/files

Responses are also coming back fine. This is for the testing "dogs" request:

Header:

{'x-label': 'dog', 'x-api-key': 'KEY'}

Response:

{'success': True, 'files': [{'success': True, 'projectId': 224721, 'sampleId': 271909476, 'fileName': 'dog_155.jpg.404kvpmc.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909477, 'fileName': 'dog_75.jpg.404kvpql.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909478, 'fileName': 'dog_415.jpg.404kvpva.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909479, 'fileName': 'dog_89.jpg.404kvq2c.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909480, 'fileName': 'dog_534.jpg.404kvq6p.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909481, 'fileName': 'dog_59.jpg.404kvq9b.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909482, 'fileName': 'dog_563.jpg.404kvqdu.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909483, 'fileName': 'dog_227.jpg.404kvqk7.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909485, 'fileName': 'dog_518.jpg.404kvqnl.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909486, 'fileName': 'dog_68.jpg.404kvqri.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909487, 'fileName': 'dog_173.jpg.404kvqvg.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909488, 'fileName': 'dog_43.jpg.404kvr37.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909489, 'fileName': 'dog_168.jpg.404kvr9h.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909490, 'fileName': 'dog_258.jpg.404kvrdj.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909491, 'fileName': 'dog_521.jpg.404kvrgm.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909492, 'fileName': 'dog_377.jpg.404kvrk7.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909493, 'fileName': 'dog_124.jpg.404kvrpq.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909494, 'fileName': 'dog_236.jpg.404kvrsr.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909495, 'fileName': 'dog_461.jpg.404kvrvp.json'}, 
{'success': True, 'projectId': 224721, 'sampleId': 271909496, 'fileName': 'dog_130.jpg.404kvs4k.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909497, 'fileName': 'dog_302.jpg.404kvs7g.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909498, 'fileName': 'dog_421.jpg.404kvscd.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909499, 'fileName': 'dog_159.jpg.404kvsj2.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909500, 'fileName': 'dog_442.jpg.404kvsm1.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909501, 'fileName': 'dog_150.jpg.404kvspb.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909502, 'fileName': 'dog_476.jpg.404kvst5.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909503, 'fileName': 'dog_219.jpg.404kvt0b.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909504, 'fileName': 'dog_181.jpg.404kvt3s.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909505, 'fileName': 'dog_360.jpg.404kvt8p.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909506, 'fileName': 'dog_364.jpg.404kvtce.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909507, 'fileName': 'dog_191.jpg.404kvtha.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909508, 'fileName': 'dog_244.jpg.404kvtpj.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909509, 'fileName': 'dog_344.jpg.404kvttv.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909510, 'fileName': 'dog_313.jpg.404kvu1d.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909511, 'fileName': 'dog_194.jpg.404kvu4t.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909512, 'fileName': 'dog_237.jpg.404kvu7t.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909513, 'fileName': 'dog_380.jpg.404kvud0.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909514, 'fileName': 'dog_142.jpg.404kvuh1.json'}, {'success': True, 
'projectId': 224721, 'sampleId': 271909515, 'fileName': 'dog_44.jpg.404kvuki.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909516, 'fileName': 'dog_196.jpg.404kvuq9.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909517, 'fileName': 'dog_141.jpg.404kvv03.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909518, 'fileName': 'dog_327.jpg.404kvvbn.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909519, 'fileName': 'dog_551.jpg.404kvvfj.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909520, 'fileName': 'dog_369.jpg.404kvvke.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909521, 'fileName': 'dog_462.jpg.404kvvoe.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909522, 'fileName': 'dog_354.jpg.404kvvss.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909523, 'fileName': 'dog_464.jpg.404kvvvu.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909524, 'fileName': 'dog_472.jpg.404l003t.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909525, 'fileName': 'dog_517.jpg.404l0080.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909526, 'fileName': 'dog_536.jpg.404l00cq.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909527, 'fileName': 'dog_197.jpg.404l00hs.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909528, 'fileName': 'dog_213.jpg.404l00lc.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909529, 'fileName': 'dog_355.jpg.404l00pg.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909530, 'fileName': 'dog_177.jpg.404l00so.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909531, 'fileName': 'dog_28.jpg.404l011c.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909533, 'fileName': 'dog_229.jpg.404l0169.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909534, 'fileName': 'dog_528.jpg.404l01bb.json'}, {'success': True, 'projectId': 224721, 
'sampleId': 271909535, 'fileName': 'dog_283.jpg.404l01gu.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909536, 'fileName': 'dog_482.jpg.404l01k0.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909537, 'fileName': 'dog_522.jpg.404l01nh.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909538, 'fileName': 'dog_443.jpg.404l01s4.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909539, 'fileName': 'dog_240.jpg.404l01v3.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909540, 'fileName': 'dog_519.jpg.404l024i.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909541, 'fileName': 'dog_114.jpg.404l027r.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909542, 'fileName': 'dog_303.jpg.404l02ed.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909543, 'fileName': 'dog_398.jpg.404l02hl.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909544, 'fileName': 'dog_520.jpg.404l02n9.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909545, 'fileName': 'dog_211.jpg.404l02rf.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909546, 'fileName': 'dog_147.jpg.404l030o.json'}, {'success': True, 'projectId': 224721, 'sampleId': 271909547, 'fileName': 'dog_123.jpg.404l0352.json'}]}
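
As the rest of this thread shows, a successful response like the one above only confirms that the files were stored. It says nothing about whether a label was applied. A minimal sketch for condensing such a response into a short summary, using only the field names visible in the output above (`summarize_ingestion` is a hypothetical helper name):

```python
def summarize_ingestion(resp):
    """Condense an ingestion response into counts plus the names of failed files."""
    files = resp.get("files", [])
    failed = [f.get("fileName") for f in files if not f.get("success")]
    return {
        "ok": bool(resp.get("success")) and not failed,
        "uploaded": len(files) - len(failed),
        "failed": failed,
    }
```

Printing this summary instead of the raw response keeps the log readable when a batch contains dozens of files.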

Oh I see where the issue is:

Your project is set to the labeling method Bounding Boxes (see the Dashboard).

You can change this by setting this parameter to one label per data item.

Best,

Louis

UGH, I did not notice that, thanks. I did not set that: this project was created via the API and the labeling method was not passed in the request.

Thanks for that, I will look into how to set the labeling method via the API.

@louis thanks buddy, was driving me nuts :smiley:

@louis Please could you tell me another thing? For create impulse, where can we find documentation for the details needed to create other types of impulses, for instance an image classifier?

Edit: I found that you first have to call get impulse blocks.

This still does not provide the expected parameters you have to send to the API to create an impulse.

Hi @AdamMiltonBarker,

You actually have very good questions here.
All this will be integrated in a future version of the Python SDK (we’re working on it but it probably won’t happen before Q3).

Would you like to schedule a call sometime in the coming weeks? We’d love to hear your feedback.
If so, can you please send me an email at louis@edgeimpulse.com?

And to answer your question, it is indeed not super clear in our documentation.
What I suggest is that you have a look at the EON Search Spaces.

You can have a look at the different templates that we provide (we also generate templates based on your impulses).

I hope that helps. I’ll create an internal ticket to improve the sections on our Create Impulse and Get Impulse blocks.

Best,

Louis

I forgot to mention that we also have a Python API bindings package, in case you’re not already using it: edgeimpulse_api - Edge Impulse API

Best,

Louis

Hi, thanks for the reply.

This is a project I am doing for you guys through my role as Edge Impulse Expert. I am not using the SDK but the API directly, per my approved project specs. We can definitely schedule a call, but first I need to complete this project and move on to the next :smiley:

Thanks for the links. I think I found my answer there. I will continue with the project and let you know either way :slight_smile:

Thank you for the help today.

Sadly no,

{'inputBlocks': [{'id': 0, 'type': 'image', 'name': 'Image', 'title': 'Images', 'imageWidth': 96, 'imageHeight': 96, 'resizeMode': 'fit-short', 'resizeMethod': 'fit-short'}], 'dspBlocks': [{'id': 0, 'input': 1, 'type': 'image', 'name': 'Image', 'title': 'Image'}], 'learnBlocks': [{'id': 0, 'type': 'keras-transfer-image', 'dsp': [1], 'model': 'transfer_mobilenetv2_a35'}]}

Response

{'success': False, 'error': 'invalid input syntax for integer: "NaN"'}

Honestly, I have no idea what values need to be sent to the API :smiley: Please could someone let me know ASAP, as this puts the project in deadlock. The following is what I worked out from what is available, but the error response tells me nothing:

{
    "inputBlocks": [
        {
            "id": 0,
            "type": "image",
            "name": "Image",
            "title": "Images",
            "imageWidth": 96,
            "imageHeight": 96,
            "resizeMode": "fit-short",
            "resizeMethod": "fit-short"
        }
    ],
    "dspBlocks": [
        {
            "id": 0,
            "input": 1,
            "type": "image",
            "name": "Image",
            "title": "Image"
        }
    ],
    "learnBlocks": [
        {
            "id": 0,
            "type": "keras-transfer-image",
            "dsp": [1],
            "model": "transfer_mobilenetv2_a35"
        }
    ]
}

The goal here is to create an Image impulse with an image processing block and a transfer learning learn block.

Done :slight_smile:

If anyone needs the info, here is what I used to create an image classifier:

{
    "inputBlocks": [
        {
            "id": 1,
            "type": "image",
            "name": "Image",
            "title": "Images",
            "imageWidth": 96,
            "imageHeight": 96,
            "resizeMode": "fit-short",
            "resizeMethod": "fit-short"
        }
    ],
    "dspBlocks": [
        {
            "id": 2,
            "type": "image",
            "name": "Image",
            "title": "Image",
            "axes": ["image"],
            "input": 0
        }
    ],
    "learnBlocks": [
        {
            "id": 3,
            "type": "keras-transfer-image",
            "name": "Transfer learning",
            "title": "Transfer Learning (Images)",
            "dsp": [2],
            "primaryVersion": true
        }
    ]
}

Hope it helps
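
Two visible differences between the rejected payload and the working one are that the working one uses a distinct `id` for every block, and its learn block's `dsp` list points at an `id` that actually exists in `dspBlocks`. Whether either of these was the exact cause of the "NaN" error is not confirmed in this thread. The sketch below is a hypothetical local check (the name `check_impulse_wiring` is an assumption, not part of any Edge Impulse package) that flags both conditions before a payload is sent.

```python
def check_impulse_wiring(impulse):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    blocks = (
        impulse.get("inputBlocks", [])
        + impulse.get("dspBlocks", [])
        + impulse.get("learnBlocks", [])
    )
    ids = [block["id"] for block in blocks]
    if len(ids) != len(set(ids)):
        problems.append("block ids are not unique")

    # Every entry in a learn block's "dsp" list should name an existing dsp block.
    dsp_ids = {block["id"] for block in impulse.get("dspBlocks", [])}
    for learn in impulse.get("learnBlocks", []):
        for ref in learn.get("dsp", []):
            if ref not in dsp_ids:
                problems.append(
                    f"learn block {learn['id']} references unknown dsp id {ref}"
                )
    return problems
```

Run against the two payloads in this thread, the check reports two problems for the rejected one and none for the working one.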

I found another issue, or rather a restriction that should perhaps be added to the notes. I am uploading a dataset of 4000 images per class, and the server responds with “Too many files” after about 5 minutes of processing.

I changed it to 1000 per training class and 500 per testing class, and this time it timed out:

<html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
</body>
</html>

I will keep track of all issues I find in this post if it helps?

Hello @AdamMiltonBarker,

Yes please, if you can keep track of your issues in this post, that would be awesome. I’ll share them with our Core Engineering team and keep a close eye on them.

Best,

Louis

Happy to report that, aside from the above and a temporary issue I had with the socket server, all went well. I do have some feedback about the API and documentation, but the project is now complete :slight_smile:

Regarding the error with uploads, I worked out that it is safe to upload around 500 images per class in a single request via the API. Larger datasets can be chunked into smaller groups. This would be good to add to the API documentation.
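
The chunking idea can be sketched as follows. The figure of 500 is the value reported in this post, not a documented limit, and `chunked` is a hypothetical helper name.

```python
def chunked(items, size=500):
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each chunk would then be sent as its own POST, for example `for batch in chunked(paths): upload(batch)`, where `upload` stands for whatever function performs the request.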

I found the response from the socket messages a bit unintuitive but nothing that caused an issue.

I definitely think the documentation needs improving. The documentation UI also appears to be very buggy and jumps around a lot, especially the search.

Aside from that it has been a great experience learning about the API, the project is working very well and produces a good model.

Thanks for your help.