iOS

You can use a Fritz Image Segmentation Model to partition an image into segments that correspond to everyday objects. These instructions will help you get image segmentation running in your app in no time.

Note

If you haven’t set up the SDK yet, make sure to go through those directions first. You’ll need to add the Core library to the app before using the specific feature or custom model libraries.
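
If you have already completed the setup, your AppDelegate typically configures the SDK at launch. A minimal sketch, assuming the FritzCore.configure() call described in the SDK setup instructions:

import UIKit
import Fritz

@UIApplicationMain
class AppDelegate: UIResponder, UIApplicationDelegate {

    func application(_ application: UIApplication,
                     didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
        // Configure the Fritz SDK before using any feature or custom model libraries.
        FritzCore.configure()
        return true
    }
}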

1. Add the model to your project

Include the segmentation model in your Podfile. You can choose between the following models:

To identify people (segments that represent people are marked in black):

pod 'Fritz/VisionSegmentationModel/People'

Make sure to install the added pod:

pod install

To identify the following objects in your living room (with the colors that represent each segment in the final result):

  • Chair (Sandy Brown)
  • Wall (White)
  • Coffee Table (Brown)
  • Ceiling (Light Gray)
  • Floor (Dark Gray)
  • Bed (Light Blue)
  • Lamp (Yellow)
  • Sofa (Red)
  • Window (Cyan)
  • Pillow (Beige)

pod 'Fritz/VisionSegmentationModel/Indoor'

Make sure to install the added pod:

pod install

2. Define the segmentation model

Choose which segmentation model you want to use. Create a single instance of the model and share it across all predictions. Here is an example using the FritzVisionPeopleSegmentationModel:

import Fritz
lazy var peopleModel = FritzVisionPeopleSegmentationModel()
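
If you run predictions from a view controller, holding that single shared instance might look like this (a minimal sketch; the controller name is hypothetical):

import UIKit
import Fritz

class CameraViewController: UIViewController {

    // One shared model instance, created lazily and reused for every prediction.
    lazy var peopleModel = FritzVisionPeopleSegmentationModel()
}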

3. Create FritzVisionImage

FritzVisionImage supports different image formats.

  • Using a CMSampleBuffer

    If you are using a CMSampleBuffer from the built-in camera, first create the FritzVisionImage instance:

    // Swift
    let image = FritzVisionImage(buffer: sampleBuffer)

    // Objective-C
    FritzVisionImage *visionImage = [[FritzVisionImage alloc] initWithBuffer: sampleBuffer];
    // or
    FritzVisionImage *visionImage = [[FritzVisionImage alloc] initWithImage: uiImage];
    

    The image orientation data needs to be properly set for predictions to work. Use FritzVisionImageMetadata to customize the orientation for an image. By default, if you specify FritzVisionImageMetadata, the orientation will be .right:

    // Swift
    image.metadata = FritzVisionImageMetadata()
    image.metadata?.orientation = .left

    // Objective-C: add metadata
    visionImage.metadata = [FritzVisionImageMetadata new];
    visionImage.metadata.orientation = FritzImageOrientationLeft;
    

    Note

    Data passed in from the camera will generally need the orientation set. When using a CMSampleBuffer to create a FritzVisionImage, the orientation will change depending on which camera and device orientation you are using.

    When using the back camera in portrait device orientation, the orientation should be .right (the default when you specify FritzVisionImageMetadata on the image). When using the front-facing camera in portrait device orientation, the orientation should be .left.

    You can initialize the FritzImageOrientation with the AVCaptureConnection to infer the orientation (if the device orientation is portrait):

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        ...
        image.metadata = FritzVisionImageMetadata()
        image.metadata?.orientation = FritzImageOrientation(connection)
        ...
    }
    
  • Using a UIImage

    If you are using a UIImage, create the FritzVisionImage instance:

    let image = FritzVisionImage(image: uiImage)
    

    The image orientation data needs to be properly set for predictions to work. Use FritzVisionImageMetadata to customize the orientation for an image:

    image.metadata = FritzVisionImageMetadata()
    image.metadata?.orientation = .right
    

    Note

    UIImage can have associated UIImageOrientation data (for example, when capturing a photo from the camera). To make sure the model correctly handles the orientation data, initialize the FritzImageOrientation with the image’s imageOrientation (see the sketch after this list):

    image.metadata?.orientation = FritzImageOrientation(image.imageOrientation)
    
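Putting the UIImage pieces together, here is a minimal sketch of a helper that creates a FritzVisionImage and carries the UIImage’s orientation over (the function name makeFritzImage(from:) is hypothetical):

import Fritz
import UIKit

// Hypothetical helper: wraps a UIImage in a FritzVisionImage and preserves
// the UIImage's own orientation so predictions run on an upright image.
func makeFritzImage(from uiImage: UIImage) -> FritzVisionImage {
    let image = FritzVisionImage(image: uiImage)
    image.metadata = FritzVisionImageMetadata()
    image.metadata?.orientation = FritzImageOrientation(uiImage.imageOrientation)
    return image
}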

4. Segment your image

Now, use the peopleModel instance you created earlier to segment your image. The segmentation model returns a FritzVisionSegmentationResult object.

You can use this result to create a mask for a specific class, optionally specifying a confidence threshold:

peopleModel.predict(image) { result, error in
  guard error == nil, let result = result else { return }

  let peopleMask = result.toImageMask(classIndex: FritzVisionPeopleClass.person.rawValue, threshold: 0.3)
  // If you're not sure how to use the output result, check out the public Heartbeat
  // project for reference.
}

Or, calling toImageMask() with no arguments will choose the most probable class at each pixel:

peopleModel.predict(image) { result, error in
  guard error == nil, let result = result else { return }

  let peopleMask = result.toImageMask()
}
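
For example, you could display the mask over your camera preview. A minimal sketch, assuming toImageMask() returns an optional UIImage and that maskView is a hypothetical UIImageView layered above the preview:

peopleModel.predict(image) { result, error in
  guard error == nil, let result = result else { return }

  let peopleMask = result.toImageMask()
  DispatchQueue.main.async {
    // maskView is assumed to be a UIImageView overlaid on the camera preview.
    self.maskView.image = peopleMask
  }
}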

Note

By setting the imageCropAndScaleOption on FritzVisionSegmentationModelOptions, you can control how the image is preprocessed before prediction.

There are two crop and scale options available:

  • centerCrop - The default option. Crops the image to the model’s aspect ratio by fitting the short side and center-cropping the long side.
  • scaleFit - Fits the image to the model’s aspect ratio by resizing it to match that aspect ratio.

When using scaleFit, the image produced by result.toImageMask will match the size of the input image. When using centerCrop, the mask produced will be square, and you will need to align it with the original image appropriately.

Example usage:

let options = FritzVisionSegmentationModelOptions()
options.imageCropAndScaleOption = .scaleFit

peopleModel.predict(image, options: options) { result, error in
  guard error == nil, let result = result else { return }

  let peopleMask = result.toImageMask()
}