iOS

You can use the FritzVisionLabelModel to label the contents of images. Fritz provides a variety of options to configure predictions.

Note

If you haven’t set up the SDK yet, make sure to go through those directions first. You’ll need to add the Core library to the app before using the specific feature or custom model libraries.

1. Add the model to your project

Include the FritzVisionLabelModel in your Podfile. This will include the on-device model in your Fritz bundle.

pod 'Fritz/VisionLabelModel'

Then install the newly added dependency.

pod install

2. Define FritzVisionLabelModel

Define an instance of FritzVisionLabelModel in your class. Create only one instance and reuse it for each prediction.

Swift

import Fritz

lazy var visionModel = FritzVisionLabelModel()

Objective-C

@import Fritz;

FritzVisionLabelModel *visionModel = [FritzVisionLabelModel new];
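For example, a view controller might hold the model as a lazy property so it is created once and shared across predictions. This is a minimal sketch; the class name LabelViewController is illustrative, and only the FritzVisionLabelModel initializer comes from the snippet above:

```swift
import UIKit
import Fritz

class LabelViewController: UIViewController {

    // Created lazily on first use and reused for every
    // prediction, as recommended above.
    lazy var visionModel = FritzVisionLabelModel()
}
```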

3. Create FritzVisionImage

FritzVisionImage supports different image formats.

  • Using a CMSampleBuffer

    If you are using a CMSampleBuffer from the built-in camera, first create a FritzVisionImage instance:

    Swift

    let image = FritzVisionImage(buffer: sampleBuffer)

    Objective-C

    FritzVisionImage *visionImage = [[FritzVisionImage alloc] initWithBuffer: sampleBuffer];
    // or
    FritzVisionImage *visionImage = [[FritzVisionImage alloc] initWithImage: uiImage];


    The image orientation data needs to be properly set for predictions to work. Use FritzVisionImageMetadata to customize the orientation of an image. By default, if you specify FritzVisionImageMetadata the orientation will be .right:

    Swift

    image.metadata = FritzVisionImageMetadata()
    image.metadata?.orientation = .left

    Objective-C

    // Add metadata
    visionImage.metadata = [FritzVisionImageMetadata new];
    visionImage.metadata.orientation = FritzImageOrientationLeft;


    Note

    Data passed in from the camera will generally need the orientation set. When creating a FritzVisionImage from a CMSampleBuffer, the correct orientation depends on which camera and device orientation you are using.

    When using the back camera in the portrait Device Orientation, the orientation should be .right (the default if you specify FritzVisionImageMetadata on the image). When using the front facing camera in portrait Device Orientation, the orientation should be .left.

    You can initialize the FritzImageOrientation with the AVCaptureConnection to infer orientation (if the Device Orientation is portrait):

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        ...
        image.metadata = FritzVisionImageMetadata()
        image.metadata?.orientation = FritzImageOrientation(connection)
        ...
    }
    
  • Using a UIImage

    If you are using a UIImage, create a FritzVisionImage instance:

    let image = FritzVisionImage(image: uiImage)
    

    The image orientation data needs to be properly set for predictions to work. Use FritzVisionImageMetadata to customize the orientation of an image:

    image.metadata = FritzVisionImageMetadata()
    image.metadata?.orientation = .right
    

    Note

    UIImage can have associated UIImageOrientation data (for example when capturing a photo from the camera). To make sure the model is correctly handling the orientation data, initialize the FritzImageOrientation with the image’s image orientation:

    image.metadata?.orientation = FritzImageOrientation(image.imageOrientation)
    

4. Run predictions

Now, use the visionModel instance you created earlier to run predictions:

Swift

visionModel.predict(image) { labels, error in
  guard error == nil, let labels = labels else { return }

  // Code to work with labels here!
}

Objective-C

[visionModel predict:visionImage options:nil completion:^(NSArray *objects, NSError *error) {
  // Use results
}];
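As a sketch of what working with the results might look like, the callback below logs each prediction. It assumes each result is a FritzVisionLabel exposing label and confidence properties; adjust the property names to match the actual API:

```swift
visionModel.predict(image) { labels, error in
    guard error == nil, let labels = labels else { return }

    // Assumes FritzVisionLabel exposes `label` and `confidence`.
    for result in labels {
        print("\(result.label): \(result.confidence)")
    }
}
```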

By default, visionModel returns at most 15 labels, each with a confidence score of 0.6 or greater. To override these defaults, define a FritzVisionLabelModelOptions and pass it into the predict method:

let options = FritzVisionLabelModelOptions(
    threshold: 0.2,
    numResults: 20
)

visionModel.predict(image, options: options) { labels, error in
  // ...
}
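Putting the steps together, here is a sketch of a helper that labels a UIImage with custom options. The Fritz calls are taken from the snippets above; the function name and the completion signature ([FritzVisionLabel]?, Error?) are assumptions for illustration:

```swift
import UIKit
import Fritz

func labelImage(_ uiImage: UIImage,
                with model: FritzVisionLabelModel,
                completion: @escaping ([FritzVisionLabel]?, Error?) -> Void) {
    // Build the FritzVisionImage and carry over the UIImage's
    // own orientation so predictions are run upright.
    let image = FritzVisionImage(image: uiImage)
    image.metadata = FritzVisionImageMetadata()
    image.metadata?.orientation = FritzImageOrientation(uiImage.imageOrientation)

    // Lower the confidence threshold and allow more results
    // than the defaults (0.6 and 15).
    let options = FritzVisionLabelModelOptions(
        threshold: 0.2,
        numResults: 20
    )

    model.predict(image, options: options, completion: completion)
}
```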