With the Object Detection feature, you can identify objects of interest in an image or each frame of live video. Each prediction returns a set of objects, each with a label, bounding box, and confidence score.
If you just need to know the contents of an image – not the location of the objects – consider using Image Labeling instead.
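Conceptually, a prediction is just a list of detection records. The sketch below models that shape in Python; the type and field names are illustrative assumptions, not the SDK's actual API:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class DetectedObject:
    """Illustrative record for one detected object (not the SDK's real type)."""
    label: str                               # class label, e.g. "person"
    confidence: float                        # confidence score in [0, 1]
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max), normalized

# One prediction = a set of detected objects for a single image or video frame.
prediction: List[DetectedObject] = [
    DetectedObject(label="person", confidence=0.92, box=(0.10, 0.15, 0.45, 0.90)),
    DetectedObject(label="dog", confidence=0.81, box=(0.50, 0.60, 0.80, 0.95)),
]
```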
Models Compatible with the API
- Your model must be a single-shot multibox detector (SSD) with boxes matching the default configuration found here.
- Your model must be in the TensorFlow Lite (.tflite) or Core ML (.mlmodel) formats.
- iOS Only The input layer must be named `Preprocessor/sub:0` and the 2 output layers `boxPredictions` and `classPredictions`.
- Android Only The 1 input layer (`Preprocessor/sub`) and 4 output layers (`outputLocations`, `outputClasses`, `outputScores`, `numDetections`) should be defined in the TensorFlow Lite conversion tool; a conversion sketch follows this list.
- The input should have the following dimensions:
1 x 300 x 300 x 3 (batch_size x height x width x num_channels). Height and width are configurable.
- iOS Only The output should have the following dimensions:
4 (box points) x num_anchor_boxes x 1 for `boxPredictions` and
num_classes x 1 for `classPredictions`.
- Android Only The output should have the following dimensions:
1 x num_anchor_boxes x 4 (box points) for `outputLocations`,
num_classes x 1 for `outputClasses` & `outputScores`, and
1 for `numDetections`. A sketch for verifying these shapes follows the table below.
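For the Android requirements above, the input and output layers are specified when the model is converted. Here is a minimal sketch using the TensorFlow 1.x converter, assuming a frozen SSD graph at the hypothetical path frozen_inference_graph.pb:

```python
import tensorflow as tf  # assumes TensorFlow 1.x, where from_frozen_graph is available

# Name the 1 input layer and 4 output layers required by the API.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="frozen_inference_graph.pb",  # hypothetical path to your frozen graph
    input_arrays=["Preprocessor/sub"],
    output_arrays=["outputLocations", "outputClasses", "outputScores", "numDetections"],
    input_shapes={"Preprocessor/sub": [1, 300, 300, 3]},  # batch x height x width x channels
)
tflite_model = converter.convert()
with open("object_detector.tflite", "wb") as f:
    f.write(tflite_model)
```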
| Architecture | Format(s) | Model Size | Input | Output | Benchmarks |
| --- | --- | --- | --- | --- | --- |
| SSDLite + MobileNet V2 variant | Core ML (iOS), TensorFlow Lite (Android) | ~17 MB | 300x300-pixel image | Offsets for >2,000 candidate bounding boxes; class labels for each box; confidence scores for each box | 18 FPS on iPhone X, 8 FPS on Pixel 2 |
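To confirm that a converted model reports the input and output dimensions listed above, you can inspect its tensors with the TensorFlow Lite Interpreter (object_detector.tflite is the hypothetical file from the conversion sketch):

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="object_detector.tflite")
interpreter.allocate_tensors()

# The input should report a shape of [1, 300, 300, 3].
for detail in interpreter.get_input_details():
    print("input:", detail["name"], detail["shape"])

# Each output should match the dimensions in the requirements above.
for detail in interpreter.get_output_details():
    print("output:", detail["name"], detail["shape"])
```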
- The object detection model supports 90 labels from the COCO dataset.
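Since `outputClasses` yields numeric class indices, turning them into readable labels is a simple lookup into the COCO label list. A minimal sketch, assuming a hypothetical coco_labels.txt file with one label per line in the model's class order:

```python
# Load the 90 COCO labels, one per line (file name and format are assumptions).
with open("coco_labels.txt") as f:
    labels = [line.strip() for line in f]

def label_for(class_index: float) -> str:
    """Map a class index from outputClasses to its human-readable label."""
    return labels[int(class_index)]

print(label_for(0))  # prints the first label in the file
```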
Customizing Models for Object Detection
If you have your own dataset and would like to train a custom model that is compatible with the Object Detection API, sign up for the Standard Plan on Fritz to access training notebooks.