Object Detection on iOS

Note

If you haven’t set up the SDK yet, make sure to go through the setup directions first. You’ll need to add the Core library to the app before using a specific feature API or custom model. Follow the iOS setup or Android setup directions.

You can use the FritzVisionObjectModel to detect objects in images. Fritz provides a variety of options to configure predictions.

1. Build the FritzVisionObjectModel

To create the object model, you can either include the model in your bundle or download it over the air once the user installs your app.

Include the model in your application bundle

Add the model to your Podfile

Include Fritz/VisionObjectModel in your Podfile. This will include the model file in your app bundle.

pod 'Fritz/VisionObjectModel'

Make sure to run a pod install with the latest changes.

pod install

Define FritzVisionObjectModel

Define an instance of FritzVisionObjectModel in your code. There should only be one instance, reused for each prediction.

import Fritz

let objectModel = FritzVisionObjectModel()
@import Fritz;

FritzVisionObjectModel *objectModel = [[FritzVisionObjectModel alloc] initWithOptionalModel:nil];

Note

Model initialization

It’s important to initialize one instance of the model so you are not loading the entire model into memory on each model execution. Usually this is a property on a ViewController. When loading the model in a ViewController, the following approaches are recommended:

Lazy-load the model

By lazy-loading the model, you won’t load the model until the first prediction. This has the benefit of not prematurely loading the model, but it may make the first prediction take slightly longer.

class MyViewController: UIViewController {
  lazy var model = FritzVisionObjectModel()
}

Load model in viewDidLoad

By loading the model in viewDidLoad, you’ll ensure that you’re not loading the model before the view controller is loaded. The model will be ready to go for the first prediction.

class MyViewController: UIViewController {
  var model: FritzVisionObjectModel!

  override func viewDidLoad() {
    super.viewDidLoad()
    model = FritzVisionObjectModel()
  }
}

Alternatively, you can initialize the model property directly. However, if the ViewController is instantiated by a Storyboard and is the Initial View Controller, its properties will be initialized before the AppDelegate’s launch method is called. This can cause the app to crash if the model is loaded before FritzCore.configure() is called.
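
To be safe, call FritzCore.configure() as early as possible in the app lifecycle. Here is a minimal sketch of that setup (FritzCore.configure() is the SDK call mentioned above; the rest is standard UIKit boilerplate):

import UIKit
import Fritz

@UIApplicationMain
class AppDelegate: UIResponder, UIApplicationDelegate {

  var window: UIWindow?

  func application(_ application: UIApplication,
                   didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
    // Configure Fritz before any ViewController creates a model.
    FritzCore.configure()
    return true
  }
}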

Download the model over the air

Add FritzVision to your Podfile

Include Fritz/Vision in your Podfile.

pod 'Fritz/Vision'

Make sure to run a pod install with the latest changes.

pod install

Download Model

import Fritz

var objectModel: FritzVisionObjectModel?

FritzVisionObjectModel.fetchModel { model, error in
   guard let downloadedModel = model, error == nil else { return }

   objectModel = downloadedModel
}
@import Fritz;

[FritzVisionObjectModel fetchModelWithCompletionHandler:^(FritzVisionObjectModel * _Nullable model, NSError * _Nullable error) {
    // Use the downloaded object model here.
}];
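
Because the model is fetched asynchronously, guard predictions until the download has completed. A minimal sketch using the objectModel property declared above (and assuming the synchronous predict API shown later in this guide):

import Fritz

func detectObjects(in image: FritzVisionImage) {
  // Skip prediction until the over-the-air download has finished.
  guard let model = objectModel else { return }
  guard let results = try? model.predict(image) else { return }
  // Work with the detected objects here.
  print(results)
}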

2. Create FritzVisionImage

FritzVisionImage supports different image formats.

  • Using a CMSampleBuffer

If you are using a CMSampleBuffer from the built-in camera, first create the FritzVisionImage instance:

    let image = FritzVisionImage(buffer: sampleBuffer)
    
    FritzVisionImage *visionImage = [[FritzVisionImage alloc] initWithBuffer: sampleBuffer];
    // or
    FritzVisionImage *visionImage = [[FritzVisionImage alloc] initWithImage: uiImage];
    

The image orientation data needs to be properly set for predictions to work. Use FritzVisionImageMetadata to customize the orientation for an image. By default, if you specify FritzVisionImageMetadata, the orientation will be .right:

    image.metadata = FritzVisionImageMetadata()
    image.metadata?.orientation = .left
    
// Add metadata
    visionImage.metadata = [FritzVisionImageMetadata new];
    visionImage.metadata.orientation = FritzImageOrientationLeft;
    

    Note

Data passed in from the camera will generally need the orientation set. When using a CMSampleBuffer to create a FritzVisionImage, the orientation will change depending on which camera and device orientation you are using.

    When using the back camera in the portrait Device Orientation, the orientation should be .right (the default if you specify FritzVisionImageMetadata on the image). When using the front facing camera in portrait Device Orientation, the orientation should be .left.

    You can initialize the FritzImageOrientation with the AVCaptureConnection to infer orientation (if the Device Orientation is portrait):

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        ...
        image.metadata = FritzVisionImageMetadata()
        image.metadata?.orientation = FritzImageOrientation(connection)
        ...
    }
    
  • Using a UIImage

If you are using a UIImage, create the FritzVisionImage instance:

    let image = FritzVisionImage(image: uiImage)
    

The image orientation data needs to be properly set for predictions to work. Use FritzVisionImageMetadata to customize the orientation for an image:

    image.metadata = FritzVisionImageMetadata()
    image.metadata?.orientation = .right
    

    Note

    UIImage can have associated UIImageOrientation data (for example when capturing a photo from the camera). To make sure the model is correctly handling the orientation data, initialize the FritzImageOrientation with the image’s image orientation:

image.metadata?.orientation = FritzImageOrientation(uiImage.imageOrientation)
    

3. Run object detection

Run Object Detection Model

Use the objectModel instance you created earlier to run predictions:

guard let results = try? objectModel.predict(image) else { return }
FritzVisionObjectModelOptions* options = [FritzVisionObjectModelOptions new];
[objectModel predictWithImage:image options:options completion:^(NSArray<FritzVisionObject* > * _Nullable result, NSError *error) {
  // Code to work with objects here!
}];

Configure Object Prediction

Before running object detection, you can configure the prediction with a FritzVisionObjectModelOptions object.

Settings

imageCropAndScaleOption
    Default: .scaleFit
    Crop and scale option controlling how the image is resized and cropped for the model.

threshold
    Default: 0.6
    Confidence threshold for prediction results, in the range [0, 1].

numResults
    Default: 15
    Maximum number of results to return from a prediction.

iouThreshold
    Default: 0.25
    Threshold for overlap of items within a single class, in the range [0, 1]. Lower values are more strict.

For example, to build a more lenient FritzVisionObjectModelOptions object:

let options = FritzVisionObjectModelOptions()
options.threshold = 0.3
options.numResults = 2

guard let results = try? objectModel.predict(image, options: options) else { return }
FritzVisionObjectModelOptions* options = [FritzVisionObjectModelOptions new];
options.threshold = 0.3;
options.numResults = 2;

[objectModel predictWithImage:image options:options completion:^(NSArray<FritzVisionObject* > * _Nullable result, NSError *error) {
  // Code to work with objects here!
}];

4. Get detected objects in the image

List Detected Objects

The prediction result contains a [FritzVisionObject] array of detected objects. Each object has a label, a confidence score, and a bounding box describing where the object is located.

 // Created from model prediction.
let objects: [FritzVisionObject]

for object in objects {
    object.label
    object.confidence
    object.boundingBox
}
NSArray<FritzVisionObject*> * objects;

for (FritzVisionObject* object in objects) {
    object.label;
    object.confidence;
    object.boundingBox;
}
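
For example, you might keep only high-confidence detections and count them per label, working directly with these properties (a sketch; the 0.8 cutoff is an arbitrary choice, and label is assumed to be a String):

// Keep only detections above an arbitrary confidence cutoff.
let confidentObjects = objects.filter { $0.confidence > 0.8 }

// Count detections per label.
var counts: [String: Int] = [:]
for object in confidentObjects {
  counts[object.label, default: 0] += 1
}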

Draw Bounding Boxes

Use FritzVisionObject to draw bounding boxes around detected objects. The BoundingBox object on every FritzVisionObject has logic to convert the detected bounding box into a CGRect for your given coordinate space.

// Image used in model.
let image = FritzVisionImage(...)

 // Object created from model prediction.
let boundingBox = object.boundingBox

guard let size = image.size else { return }
let boundingBoxCGRect = boundingBox.toCGRect(imgHeight: size.height, imgWidth: size.width)
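
With the CGRect in hand, you can overlay a box on a view whose coordinate space matches the image size. A minimal sketch using a standard CAShapeLayer (part of UIKit/Core Animation, not this SDK; scaling the rect to your actual view is up to you):

import UIKit

func draw(_ rect: CGRect, on view: UIView) {
  // Outline the detected object with a simple shape layer.
  let boxLayer = CAShapeLayer()
  boxLayer.path = UIBezierPath(rect: rect).cgPath
  boxLayer.strokeColor = UIColor.red.cgColor
  boxLayer.fillColor = UIColor.clear.cgColor
  boxLayer.lineWidth = 2
  view.layer.addSublayer(boxLayer)
}

// Usage with the rect computed above:
// draw(boundingBoxCGRect, on: imageView)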

Custom Object Detection Model

You can use a model that has been trained with the TensorFlow Object Detection API. The model must take an image input of size 300x300. If you are looking to train a custom object detection model, contact us.

If you have trained your own object detection model, you can use it with FritzVisionObjectModel.

1. Create a custom model for your trained model in the webapp and add it to your Xcode project.

For instructions on how to do this, see Integrating a Custom Core ML Model.

2. Conform your model

Conform the model

Create a new file called CustomObjectDetectionModel+Fritz.swift and conform your class like so:

import Fritz

extension CustomObjectDetectionModel: SwiftIdentifiedModel {

    static let modelIdentifier = "model-id-abcde"

    static let packagedModelVersion = 1
}

3. Define the Custom Object Detection Model

let objectDetectionModel = FritzVisionObjectModel(
  model: CustomObjectDetectionModel(),
  classNames: [1: "Custom object 1"]
)
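
Once defined, the custom model runs through the same prediction API shown earlier. A sketch (uiImage stands in for whatever image you already have; class IDs and names depend on how your model was trained):

let image = FritzVisionImage(image: uiImage)

guard let results = try? objectDetectionModel.predict(image) else { return }
for object in results {
  print(object.label, object.confidence)
}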

Other Code Examples

Not sure how to get started? Check out these resources for more examples using Object Detection: