Image Labeling

Note: If you haven't set up the SDK yet, follow the iOS setup or Android setup directions first. You'll need to add the Core library to your app before using a feature API or a custom model.

Image Labeling In Action

With an Image Labeling model, you can identify the contents of an image or each frame of live video. Each prediction returns a set of labels as well as a confidence score for each label. Image Labeling can recognize people, places, and things.

If you need to know what objects are in an image, and where they are, consider using Object Detection instead.

Custom Training Models for Image Labeling

You can train a custom model that is compatible with the Image Labeling API by using Fritz AI Studio.

If you have a custom model that was trained outside of Fritz AI, follow this checklist to make sure it will be compatible with the Image Labeling API.

  1. Your model must be in the TensorFlow Lite (.tflite) or Core ML (.mlmodel) formats.
  2. iOS only: The input layer must be named image and the output layer must be named confidence.
  3. Android only: The input layer (image) and output layer (confidence) should be defined in the TensorFlow Lite conversion tool.
  4. The input should have the following dimensions: 1 (batch_size) x 224 (height) x 224 (width) x 3 (num_channels). Height and width are configurable.
  5. The output should have the following dimensions: 1 x number_of_labels.
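
The shape requirements in items 4 and 5 can be checked mechanically before you ship a custom model. The sketch below is plain Java with no Fritz or TensorFlow Lite dependency. The class ModelShapeCheck and its methods are hypothetical names used for illustration, and it assumes you have already read the input and output shapes from your model as int arrays.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the Fritz SDK: validates tensor shapes
// against the compatibility checklist above.
class ModelShapeCheck {

    // Input must be 1 x height x width x 3. Height and width are configurable,
    // so this only requires them to be positive.
    static boolean isValidInputShape(int[] shape) {
        return shape.length == 4
                && shape[0] == 1
                && shape[1] > 0
                && shape[2] > 0
                && shape[3] == 3;
    }

    // Output must be 1 x number_of_labels.
    static boolean isValidOutputShape(int[] shape, int numberOfLabels) {
        return shape.length == 2
                && shape[0] == 1
                && shape[1] == numberOfLabels;
    }

    public static void main(String[] args) {
        int[] input = {1, 224, 224, 3};
        int[] output = {1, 681};
        System.out.println(Arrays.toString(input) + " valid input: " + isValidInputShape(input));
        System.out.println(Arrays.toString(output) + " valid output: " + isValidOutputShape(output, 681));
    }
}
```

A channels-first shape such as 1 x 3 x 224 x 224 fails the input check, which is a common mismatch when a model is exported from a framework with a different default layout.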

Pre-trained Image Labeling

The pre-trained Image Labeling model supports 681 labels. View the full label list.

Technical Specifications

| Architecture | Format(s) | Model Size | Input | Output | Benchmarks |
| --- | --- | --- | --- | --- | --- |
| MobileNet V2 variant | Core ML (iOS), TensorFlow Lite (Android) | ~13MB | 224x224-pixel image | Label + confidence score (0-100%) | 38 FPS on iPhone X, 10 FPS on Pixel 2 |
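
To relate the benchmark figures to live video, it can help to convert frames per second into a per-frame time budget. The sketch below is simple arithmetic in plain Java. The class name FrameBudgetSketch and the 30 FPS camera rate are assumptions for illustration, not values from the SDK.

```java
// Hypothetical arithmetic helper, not part of the Fritz SDK.
class FrameBudgetSketch {

    // Time spent on one prediction, in milliseconds, at a given throughput.
    static double millisPerFrame(double modelFps) {
        return 1000.0 / modelFps;
    }

    // If the camera delivers frames faster than the model can label them,
    // process only every n-th frame. Returns n (at least 1).
    static int frameStride(double cameraFps, double modelFps) {
        return (int) Math.max(1, Math.ceil(cameraFps / modelFps));
    }

    public static void main(String[] args) {
        double cameraFps = 30.0; // assumed camera frame rate
        System.out.println("Pixel 2 ms per frame: " + millisPerFrame(10.0));
        System.out.println("iPhone X ms per frame: " + millisPerFrame(38.0));
        System.out.println("Pixel 2 stride: " + frameStride(cameraFps, 10.0));
        System.out.println("iPhone X stride: " + frameStride(cameraFps, 38.0));
    }
}
```

Under these assumptions, a device running at 10 FPS would label roughly every third camera frame, while a device running at 38 FPS could keep up with every frame.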

iOS

You can use the FritzVisionLabelModel to label the contents of images. Fritz provides a variety of options to configure predictions.

1. Build the FritzVisionLabelModel

To create the label model, you can either include the model in your bundle or download it over the air once the user installs your app.

Include the model in your application bundle

Add the model to your Podfile

Include Fritz/VisionLabelModel/Fast in your Podfile. This will include the model file in your app bundle.

pod 'Fritz/VisionLabelModel/Fast'

Install the newly added pod:

pod install

Note: If you built the app with just the core Fritz pod and then add a new submodule for the model, you may encounter the error "Cannot invoke initializer for type". To resolve it, run pod update and clean your Xcode build.

Define FritzVisionLabelModelFast

Define the instance of the FritzVisionLabelModelFast in your code. There should only be one instance that is reused for each prediction.

import Fritz

let labelModel = FritzVisionLabelModelFast()

Model initialization

It's important to initialize a single instance of the model so that you are not loading the entire model into memory on every prediction. Usually this instance is a property on a ViewController. When loading the model in a ViewController, the following approaches are recommended:

Lazy-load the model

By lazy-loading the model, you won't load it until the first prediction. This avoids loading the model prematurely, but it may make the first prediction take slightly longer.

class MyViewController: UIViewController {
    // Created on first access, then reused for every prediction.
    lazy var model = FritzVisionLabelModelFast()
}
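
The load-once-on-first-use idea is independent of the platform. As a rough illustration, here is a platform-neutral sketch in plain Java. LazyModelHolder is a hypothetical class, not part of the Fritz SDK, and the supplier stands in for whatever expensive model constructor you use.

```java
import java.util.function.Supplier;

// Hypothetical sketch of lazy, single-instance loading; not a Fritz class.
class LazyModelHolder<T> {
    private final Supplier<T> loader;
    private T instance;     // stays null until the first call to get()
    private int loadCount;  // how many times the loader has actually run

    LazyModelHolder(Supplier<T> loader) {
        this.loader = loader;
    }

    // Loads the model on first use and returns the same instance afterwards.
    synchronized T get() {
        if (instance == null) {
            instance = loader.get();
            loadCount++;
        }
        return instance;
    }

    synchronized int getLoadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LazyModelHolder<String> holder = new LazyModelHolder<>(() -> "label-model");
        System.out.println("loads before first use: " + holder.getLoadCount());
        holder.get();
        holder.get();
        System.out.println("loads after two predictions: " + holder.getLoadCount());
    }
}
```

The trade-off matches the one described above: nothing is loaded until it is needed, and the cost of loading is paid by the first caller.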

Load model in viewDidLoad

By loading the model in viewDidLoad, you'll ensure that you're not loading the model before the view controller is loaded. The model will be ready to go for the first prediction.

class MyViewController: UIViewController {
    // Declared as var so it can be assigned once the view has loaded.
    var model: FritzVisionLabelModelFast!

    override func viewDidLoad() {
        super.viewDidLoad()
        model = FritzVisionLabelModelFast()
    }
}

Alternatively, you can initialize the model property directly. However, if the ViewController is instantiated by a Storyboard and is the Initial View Controller, its properties are initialized before the AppDelegate launch function is called. The app can then crash, because the model is loaded before FritzCore.configure() is called.

Download the model over the air

Only available on Growth plans

For more information on plans and pricing, visit our website.

Add FritzVision to your Podfile

Include Fritz/Vision in your Podfile.

pod 'Fritz/Vision'

Make sure to run a pod install with the latest changes.

pod install

Download Model

import Fritz

var labelModel: FritzVisionLabelModelFast?

FritzVisionLabelModelFast.fetchModel { model, error in
    // Keep the downloaded model only if the fetch succeeded.
    guard let downloadedModel = model, error == nil else { return }
    labelModel = downloadedModel
}

2. Create FritzVisionImage

FritzVisionImage supports different image formats.

Using a CMSampleBuffer

If you are using a CMSampleBuffer from the built-in camera, first create the FritzVisionImage instance:

Swift:

let image = FritzVisionImage(buffer: sampleBuffer)

Objective-C:

FritzVisionImage *visionImage = [[FritzVisionImage alloc] initWithBuffer: sampleBuffer];
// or
FritzVisionImage *visionImage = [[FritzVisionImage alloc] initWithImage: uiImage];

The image orientation data needs to be properly set for predictions to work. Use FritzVisionImageMetadata to customize the orientation for an image. By default, if you specify FritzVisionImageMetadata, the orientation will be .right:

Swift:

image.metadata = FritzVisionImageMetadata()
image.metadata?.orientation = .left

Objective-C:

// Add metadata
visionImage.metadata = [FritzVisionImageMetadata new];
visionImage.metadata.orientation = FritzImageOrientationLeft;

Setting the Orientation from the Camera

Data passed in from the camera will generally need the orientation set. When using a CMSampleBuffer to create a FritzVisionImage the orientation will change depending on which camera and device orientation you are using.

When using the back camera in the portrait Device Orientation, the orientation should be .right (the default if you specify FritzVisionImageMetadata on the image). When using the front facing camera in portrait Device Orientation, the orientation should be .left.

You can initialize the FritzImageOrientation with the AVCaptureConnection to infer orientation (if the Device Orientation is portrait):

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    let image = FritzVisionImage(sampleBuffer: sampleBuffer, connection: connection)
    ...
}

Using a UIImage

If you are using a UIImage, create the FritzVisionImage instance:

let image = FritzVisionImage(image: uiImage)

The image orientation data needs to be properly set for predictions to work. Use FritzVisionImageMetadata to customize the orientation for an image:

image.metadata = FritzVisionImageMetadata()
image.metadata?.orientation = .right

Set the image orientation

A UIImage can have associated UIImageOrientation data (for example, when capturing a photo from the camera). To make sure the model handles the orientation correctly, initialize the FritzImageOrientation with the UIImage's imageOrientation:

image.metadata?.orientation = FritzImageOrientation(uiImage.imageOrientation)

3. Run image labeling

Use the labelModel instance you created earlier to run predictions:

guard let results = try? labelModel.predict(image) else { return }

Configure Label Prediction

Before running image labeling, you can configure the prediction with a FritzVisionLabelModelOptions object.

| Setting | Default | Description |
| --- | --- | --- |
| imageCropAndScaleOption | .scaleFit | Crop and scale option for how to resize and crop the image for the model. |
| threshold | 0.6 | Confidence threshold for prediction results, in the range [0, 1]. |
| numResults | 15 | Maximum number of results to return from a prediction. |

For example, to build a more lenient FritzVisionLabelModelOptions object:

let options = FritzVisionLabelModelOptions()
options.threshold = 0.3
options.numResults = 2
guard let results = try? labelModel.predict(image, options: options) else { return }
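
To make the effect of threshold and numResults concrete, here is a platform-neutral sketch in plain Java of how such options typically narrow a list of predictions. It is an illustration only: LabelFilterSketch and its Label type are hypothetical, the SDK's internal logic may differ, and treating the threshold as inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of threshold + numResults filtering.
// Not part of the Fritz SDK.
class LabelFilterSketch {

    // Minimal stand-in for a label with a confidence in [0, 1].
    static final class Label {
        final String text;
        final double confidence;

        Label(String text, double confidence) {
            this.text = text;
            this.confidence = confidence;
        }
    }

    // Keeps labels at or above the threshold (assumed inclusive), sorts them
    // by descending confidence, and returns at most numResults of them.
    static List<Label> filter(List<Label> labels, double threshold, int numResults) {
        List<Label> kept = new ArrayList<>();
        for (Label label : labels) {
            if (label.confidence >= threshold) {
                kept.add(label);
            }
        }
        kept.sort(Comparator.comparingDouble((Label l) -> l.confidence).reversed());
        return new ArrayList<>(kept.subList(0, Math.min(numResults, kept.size())));
    }

    public static void main(String[] args) {
        List<Label> raw = new ArrayList<>();
        raw.add(new Label("sofa", 0.41));
        raw.add(new Label("cat", 0.72));
        raw.add(new Label("lamp", 0.35));
        raw.add(new Label("dog", 0.10));
        // Lenient options, matching the values in the example above.
        for (Label label : filter(raw, 0.3, 2)) {
            System.out.println(label.text + ": " + label.confidence);
        }
    }
}
```

With the lenient options, this sample input keeps cat and sofa. With the default threshold of 0.6 and numResults of 15, the same input would keep only cat.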

4. Get labels in image

Once you have an array of FritzVisionLabel objects, you can use it to access the image classifications.

// `results` is the [FritzVisionLabel] array returned by labelModel.predict(image).
// Print the highest-confidence result, if there is one.
if let topLabel = results.first {
    print(topLabel.label)
    print(topLabel.confidence)
}

5. Use the record method on the predictor to collect data

The FritzVisionLabelModel used to make predictions has a record method that lets you send an image, a model-predicted annotation, and a user-generated annotation back to your Fritz AI account.

guard let results = try? labelModel.predict(image, options: options) else { return }

// Implement your own custom UX for users to label an image and store
// the result as a list of [FritzVisionLabel], here called modifiedLabels.
labelModel.record(image, predicted: results, modified: modifiedLabels)

Android

1. Add the dependencies via Gradle

Add our repository in order to download the Vision API:

repositories {
    maven { url "https://fritz.mycloudrepo.io/public/repositories/android" }
}

Add RenderScript support and include the vision dependency in app/build.gradle. RenderScript is used to improve image processing performance. You'll also need to specify aaptOptions to prevent TensorFlow Lite models from being compressed.

android {
    defaultConfig {
        renderscriptTargetApi 21
        renderscriptSupportModeEnabled true
    }

    // Don't compress included TensorFlow Lite models on build.
    aaptOptions {
        noCompress "tflite"
    }
}

dependencies {
    implementation 'ai.fritz:vision:+'
}

(Optional) Include the model in your app. To bundle the Image Labeling model with your build, add the dependency shown below. Note: This ships the model with your app when you publish it to the Play Store and will increase your app size.

dependencies {
    implementation 'ai.fritz:vision-labeling-model-fast:{ANDROID_MODEL_VERSION}'
}

Now you're ready to classify images with the Image Labeling API.

2. Get a FritzVisionLabelPredictor

In order to use the predictor, the on-device model must first be loaded. If you followed the optional step above and included the ai.fritz:vision-labeling-model-fast dependency, you can get a predictor to use immediately:

LabelingOnDeviceModel imageLabelOnDeviceModel = FritzVisionModels.getImageLabelingOnDeviceModel();
FritzVisionLabelPredictor predictor = FritzVision.ImageLabeling.getPredictor(imageLabelOnDeviceModel);

If you did not include the on-device model, you'll have to load the model before you can get a predictor. To do that, get a LabelingManagedModel object and call FritzVision.ImageLabeling.loadPredictor to start the model download.

// Declare predictor as a field so the listener can assign it.
FritzVisionLabelPredictor predictor;

LabelingManagedModel managedModel = FritzVisionModels.getImageLabelingManagedModel();
FritzVision.ImageLabeling.loadPredictor(managedModel, new PredictorStatusListener<FritzVisionLabelPredictor>() {
    @Override
    public void onPredictorReady(FritzVisionLabelPredictor imageLabelingPredictor) {
        Log.d(TAG, "Image Labeling predictor is ready");
        predictor = imageLabelingPredictor;
    }
});
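
Because the predictor arrives asynchronously, any code that runs per camera frame needs to cope with the period before onPredictorReady fires. One possible pattern is sketched below in plain Java. PredictorGate is a hypothetical helper, not part of the Fritz SDK, and a java.util.function.Function stands in for the real predictor.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical sketch: hold an asynchronously delivered predictor and skip
// work until it arrives. None of these names come from the Fritz SDK.
class PredictorGate<I, O> {
    private final AtomicReference<Function<I, O>> predictor = new AtomicReference<>();

    // Call this from the "predictor ready" callback.
    void onReady(Function<I, O> readyPredictor) {
        predictor.set(readyPredictor);
    }

    boolean isReady() {
        return predictor.get() != null;
    }

    // Returns null while the predictor is still loading, so callers can drop
    // the frame instead of failing on a missing predictor.
    O predictIfReady(I input) {
        Function<I, O> current = predictor.get();
        return current == null ? null : current.apply(input);
    }

    public static void main(String[] args) {
        PredictorGate<String, String> gate = new PredictorGate<>();
        System.out.println(gate.predictIfReady("frame-1")); // still loading
        gate.onReady(frame -> "labels for " + frame);
        System.out.println(gate.predictIfReady("frame-2"));
    }
}
```

Dropping frames while the model downloads is usually acceptable for live video. For a single still image, you would instead queue the image and run it once the predictor is ready.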

3. Create a FritzVisionImage from an image or a video stream

To create a FritzVisionImage from a Bitmap:

FritzVisionImage visionImage = FritzVisionImage.fromBitmap(bitmap);

To create a FritzVisionImage from an android.media.Image object captured from a camera, first determine the orientation of the image. The resulting rotation accounts for both the device rotation and the orientation of the camera sensor.

// Get the system service for the camera manager
final CameraManager manager = (CameraManager) getSystemService(Context.CAMERA_SERVICE);
// Get the first camera id. getCameraIdList() returns a String[] and can throw CameraAccessException.
String cameraId = manager.getCameraIdList()[0];
// Determine the rotation for the FritzVisionImage from the camera orientation and the device rotation.
// "this" refers to the calling Context (Application, Activity, etc.)
ImageRotation imageRotationFromCamera = FritzVisionOrientation.getImageRotationFromCamera(this, cameraId);

Finally, create the FritzVisionImage object with the rotation:

FritzVisionImage visionImage = FritzVisionImage.fromMediaImage(image, imageRotationFromCamera);
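
FritzVisionOrientation.getImageRotationFromCamera handles the rotation for you. For intuition only, the sketch below shows one common way that Android camera code combines the sensor orientation with the device rotation. It is not the SDK's implementation, and RotationSketch and its method are hypothetical names.

```java
// Hypothetical sketch of a common rotation calculation. Not Fritz SDK code.
class RotationSketch {

    // sensorOrientation: clockwise angle, in degrees, by which the sensor image
    //   must be rotated to be upright in the device's natural orientation
    //   (0, 90, 180 or 270).
    // deviceRotationDegrees: current display rotation in degrees
    //   (0, 90, 180 or 270).
    // Returns the rotation, in degrees, to apply to the captured image.
    static int imageRotationDegrees(int sensorOrientation, int deviceRotationDegrees, boolean frontFacing) {
        if (frontFacing) {
            return (sensorOrientation + deviceRotationDegrees) % 360;
        }
        return (sensorOrientation - deviceRotationDegrees + 360) % 360;
    }

    public static void main(String[] args) {
        // Typical back camera (sensor at 90 degrees), device held in portrait.
        System.out.println(imageRotationDegrees(90, 0, false));
        // Typical front camera (sensor at 270 degrees), device held in portrait.
        System.out.println(imageRotationDegrees(270, 0, true));
    }
}
```

The sensor orientation values in main are typical for phones but vary by device, which is why the camera id is passed to the SDK helper rather than hard-coding an angle.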

4. Run prediction - Label your image

Next, pass the FritzVisionImage into the predictor in order to evaluate the image labels:

FritzVisionLabelResult labelResult = predictor.predict(visionImage);

predictor.predict() returns a FritzVisionLabelResult, which provides several methods for accessing the labels.

FritzVisionLabelResult methods

| Method | Description |
| --- | --- |
| List<FritzVisionLabel> getVisionLabels() | Gets a list of FritzVisionLabel objects. Each label object has getText() and getConfidence() methods. |
| String getResultString() | Creates a string with the labels and the confidence scores (e.g., cat: .35). |
| void logResult() | Uses Log.d to print out the label predictions. |
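
If you need a display string in your own format rather than the one getResultString() produces, you can build it from the label text and confidence values. The sketch below is plain Java. ResultStringSketch is a hypothetical helper, a Map stands in for the list of FritzVisionLabel objects, and the output format is an assumption rather than the SDK's exact format.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical formatter, not part of the Fritz SDK.
class ResultStringSketch {

    // Joins label/confidence pairs as "text: 0.00", separated by commas,
    // in the iteration order of the map.
    static String format(Map<String, Double> labels) {
        StringBuilder builder = new StringBuilder();
        for (Map.Entry<String, Double> entry : labels.entrySet()) {
            if (builder.length() > 0) {
                builder.append(", ");
            }
            builder.append(entry.getKey())
                    .append(": ")
                    .append(String.format(Locale.US, "%.2f", entry.getValue()));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        Map<String, Double> labels = new LinkedHashMap<>();
        labels.put("cat", 0.35);
        labels.put("dog", 0.5);
        System.out.println(format(labels));
    }
}
```

Locale.US is passed explicitly so the decimal separator does not change with the device's locale.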

5. Use the record method on the predictor to collect data

The FritzVisionLabelPredictor used to make predictions has a record method allowing you to send an image, a model-predicted annotation, and a user-generated annotation back to your Fritz AI account.

FritzVisionLabelResult predictedResults = predictor.predict(visionImage);
// Implement your own custom UX for users to label an image and store
// the result as a FritzVisionLabelResult, here called modifiedResults.
predictor.record(visionImage, predictedResults.toAnnotations(), modifiedResults.toAnnotations());

Advanced Options

Configuring the Predictor

You can configure the predictor with FritzVisionLabelPredictorOptions to return specific results that match the options given:

FritzVisionLabelPredictorOptions methods

| Option | Default | Description |
| --- | --- | --- |
| confidenceThreshold | .3 | Return labels above the confidence threshold. |
| labels | MobileNet labels | The set of labels that the model will use. |

To tune model performance for different devices, you can also set the underlying TensorFlow Lite Interpreter options.

FritzVisionPredictorOptions methods

| Option | Default | Description |
| --- | --- | --- |
| useGPU | false | Uses the GPU for running model inference. Please note, this is an experimental option and should not be used in production apps. |
| useNNAPI | false | Uses the NNAPI for running model inference. Please note, this is an experimental option and should not be used in production apps. |
| numThreads | 2 | For CPU only: run model inference using the specified number of threads. |

For more details, please visit the Official TensorFlow Lite documentation.

Example:

// Create predictor options
FritzVisionLabelPredictorOptions options = new FritzVisionLabelPredictorOptions();
options.confidenceThreshold = 0.7f;
options.numThreads = 2;
// Pass in the options when initializing the predictor
FritzVisionLabelPredictor predictor = FritzVision.ImageLabeling.getPredictor(imageLabelOnDeviceModel, options);
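
When choosing a value for numThreads, one simple heuristic is to cap it by the number of cores the device reports. The sketch below is plain Java. ThreadCountSketch is a hypothetical helper and the cap of 4 is an arbitrary assumption. Whether more threads actually help depends on the device and model, so measure before settling on a value.

```java
// Hypothetical heuristic for picking a thread count. Not part of the Fritz SDK.
class ThreadCountSketch {

    // Uses as many threads as there are processors, but never fewer than 1
    // and never more than maxThreads.
    static int chooseNumThreads(int availableProcessors, int maxThreads) {
        return Math.max(1, Math.min(availableProcessors, maxThreads));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int numThreads = chooseNumThreads(cores, 4); // cap of 4 is an assumption
        System.out.println("cores=" + cores + ", numThreads=" + numThreads);
    }
}
```

The resulting value could then be assigned to options.numThreads in the example above.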