Thanks for the feedback!
We don’t have any post-processing filtering of the type you describe, and ideally we’d like to fix the training for the single-cell example. If you can share more detail on your specific case, we may be able to fix it directly with some tuning and/or data prep.
Re: 7 vs 8: we don’t explicitly make the output stride 8 pixels; it’s a side effect of the network doing a 1/8 spatial reduction. E.g. a (96, 96) input is compressed down to a (12, 12) grid, so each of those output cells represents an 8x8 block of input pixels.
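To make the cell-to-pixel mapping concrete, here’s a minimal sketch (the helper name and signature are mine, not part of our SDK) that maps an output-grid cell back to the block of input pixels it represents:

```python
def cell_to_pixel_box(row, col, stride=8):
    """Return (top, left, bottom, right) input-pixel coords covered by
    one output-grid cell, assuming an exact 1/`stride` reduction."""
    return (row * stride, col * stride,
            (row + 1) * stride, (col + 1) * stride)

print(cell_to_pixel_box(0, 0))    # (0, 0, 8, 8)
print(cell_to_pixel_box(11, 11))  # (88, 88, 96, 96) — last cell of the 12x12 grid
```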
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ input_layer_7 │ (None, 96, 96, 3) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ Conv1 (Conv2D) │ (None, 48, 48, 8) │ 216 │ input_layer_7[0]… │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
....
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ block_6_expand_relu │ (None, 12, 12, │ 0 │ block_6_expand_B… │
│ (ReLU) │ 48) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ head (Conv2D) │ (None, 12, 12, │ 1,568 │ block_6_expand_r… │
│ │ 32) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ logits (Conv2D) │ (None, 12, 12, 5) │ 165 │ head[0][0] │
└─────────────────────┴───────────────────┴────────────┴───────────────────┘
This reduction is exactly 1/8 whenever the input size is a multiple of 8, but for a (150, 150) input we end up with a (19, 19) grid (due to the padding config in the MobileNet backbone), so each of these output cells is actually representing 150/19 ≈ 7.9 pixels.
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ input_layer_8 │ (None, 150, 150, │ 0 │ - │
│ (InputLayer) │ 3) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ Conv1 (Conv2D) │ (None, 75, 75, 8) │ 216 │ input_layer_8[0]… │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
...
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ block_6_expand_relu │ (None, 19, 19, │ 0 │ block_6_expand_B… │
│ (ReLU) │ 48) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ head (Conv2D) │ (None, 19, 19, │ 1,568 │ block_6_expand_r… │
│ │ 32) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ logits (Conv2D) │ (None, 19, 19, 5) │ 165 │ head[0][0] │
└─────────────────────┴───────────────────┴────────────┴───────────────────┘
(Note: you can see all this info by switching to “Expert mode” and dropping a print(model.summary()) in the code after the call to build_model().)
If you switch to (152, 152), these line up exactly (152/19 = 8), and that’s your quickest “fix”.
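You can reproduce these grid sizes without building the model. With “same” padding, each stride-2 layer rounds up, and the 1/8 reduction is three such halvings; the sketch below (my own helper, not SDK code) shows why 150 and 152 both land on a 19-cell grid but only 152 divides evenly:

```python
import math

def grid_size(n, halvings=3):
    """Output grid size after `halvings` stride-2 'same'-padded layers,
    which round up (ceil) at each step."""
    for _ in range(halvings):
        n = math.ceil(n / 2)
    return n

print(grid_size(96))   # 12 -> 96/12  = 8.0 pixels per cell
print(grid_size(150))  # 19 -> 150/19 ≈ 7.9 pixels per cell
print(grid_size(152))  # 19 -> 152/19 = 8.0 pixels per cell
```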
But if you’re seeing 7x7 in the SDK and 8x8 in Studio, that’s a bug on our side that I can log.
Cheers,
Mat