Image Labeling

Image Labeling In Action

With the Image Labeling feature, you can identify the contents of an image or each frame of live video. Each prediction returns a set of labels as well as a confidence score for each label. Image Labeling can recognize people, places, and things. The underlying ML model was trained on millions of images and hundreds of labels.
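As a rough illustration of working with a set of labels and confidence scores, the sketch below keeps only the labels above a chosen threshold. The `LabelPrediction` type, the `filter_labels` helper, the 0.0 to 1.0 confidence scale, and the sample values are all assumptions made for this example and are not part of the Image Labeling API.

```python
from typing import List, NamedTuple


class LabelPrediction(NamedTuple):
    """Hypothetical container for one predicted label (not an SDK type)."""
    label: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def filter_labels(predictions: List[LabelPrediction],
                  threshold: float = 0.5) -> List[LabelPrediction]:
    """Keep labels at or above the threshold, most confident first."""
    kept = [p for p in predictions if p.confidence >= threshold]
    return sorted(kept, key=lambda p: p.confidence, reverse=True)


# Made-up values for illustration only.
frame_predictions = [
    LabelPrediction("grass", 0.64),
    LabelPrediction("dog", 0.91),
    LabelPrediction("frisbee", 0.12),
]

for p in filter_labels(frame_predictions, threshold=0.5):
    print(f"{p.label}: {p.confidence:.0%}")
```

A higher threshold trades recall for precision; for live video, a stricter value may help reduce labels that flicker between frames.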

If you need to know what objects are in an image, and where they are, consider using Object Detection instead.

Get Started

Models Compatible with the API

Compatibility Checklist

  1. Your model must be in TensorFlow Lite (.tflite) or Core ML (.mlmodel) format.
  2. iOS only: The input layer must be named image and the output layer must be named confidence.
  3. Android only: The input layer (image) and output layer (confidence) should be defined in the TensorFlow Lite conversion tool.
  4. The input should have the following dimensions: 1 (batch_size) x 224 (height) x 224 (width) x 3 (num_channels). Height and width are configurable.
  5. The output should have the following dimensions: 1 x number_of_labels.
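The dimension requirements in the checklist can be sketched as a small validation helper. This is a minimal example, not an official compatibility test: `check_shapes` is a hypothetical function, and it assumes you have already read the input and output shapes from your model with whatever inspection tool you use (for TensorFlow Lite models, for instance, the interpreter's `get_input_details()` and `get_output_details()` report a `shape` entry).

```python
from typing import List, Sequence


def check_shapes(input_shape: Sequence[int],
                 output_shape: Sequence[int],
                 number_of_labels: int) -> List[str]:
    """Return a list of problems; an empty list means the shapes match the checklist."""
    problems = []

    # Input: 1 (batch_size) x height x width x 3 (num_channels).
    if len(input_shape) != 4:
        problems.append(f"input must have 4 dimensions, got {len(input_shape)}")
    else:
        batch, height, width, channels = input_shape
        if batch != 1:
            problems.append(f"batch_size must be 1, got {batch}")
        # Height and width are configurable, so only require positive values.
        if height <= 0 or width <= 0:
            problems.append(f"height and width must be positive, got {height}x{width}")
        if channels != 3:
            problems.append(f"num_channels must be 3, got {channels}")

    # Output: 1 x number_of_labels.
    if tuple(output_shape) != (1, number_of_labels):
        problems.append(
            f"output must be 1 x {number_of_labels}, got {tuple(output_shape)}"
        )
    return problems


# Example: a model with 224x224 RGB input and 1000 labels (illustrative numbers).
print(check_shapes((1, 224, 224, 3), (1, 1000), number_of_labels=1000))
```

The helper does not cover layer names or file format, which still need to be checked separately.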

Technical Specifications

Architecture: MobileNet V2 variant
Format(s): Core ML (iOS), TensorFlow Lite (Android)
Model Size: ~13 MB
Input: 224x224-pixel image
Output: Label + confidence score (0-100%)
Benchmarks: 38 FPS on iPhone X, 10 FPS on Pixel 2
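To show how a 1 x number_of_labels output could be turned into label and percentage pairs, here is a minimal sketch. It assumes the raw output values are scores between 0 and 1 in the same order as the label list; the `top_labels` helper, the label names, and the scores are made up for illustration and do not come from the pre-trained model.

```python
from typing import List, Sequence, Tuple


def top_labels(output: Sequence[Sequence[float]],
               labels: Sequence[str],
               k: int = 3) -> List[Tuple[str, float]]:
    """Map a 1 x number_of_labels output to the k best (label, percent) pairs."""
    if len(output) != 1 or len(output[0]) != len(labels):
        raise ValueError("expected output of shape 1 x number_of_labels")
    scores = output[0]
    # Pair each score with its label, then sort by score, highest first.
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    # Convert the assumed 0-1 scores to percentages (0-100%).
    return [(label, round(score * 100, 1)) for label, score in ranked[:k]]


# Hypothetical labels and scores.
labels = ["cat", "dog", "tree", "car"]
output = [[0.05, 0.80, 0.10, 0.05]]
print(top_labels(output, labels, k=2))
```

If your model emits unnormalized scores instead, they would need to be normalized before being reported as percentages.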

Pre-trained Model

Customizing Models for Image Labeling

If you have your own dataset and would like to train a custom model that is compatible with the Image Labeling API, sign up for the Standard Plan to access training notebooks.

If you’ve already created a model with tools like TuriCreate, Create ML, or TensorFlow, contact us to see if the model is compatible with the API.