Android

Note

If you haven’t set up the SDK yet, make sure to go through those directions first. You’ll need to add the Core library to the app before using a specific feature API or a custom model. Follow the iOS setup or Android setup directions.

Fritz provides an Android API that partitions an image into segments corresponding to everyday objects. Follow these instructions to bring image segmentation to your app in no time.

1. Add the dependencies via Gradle

Add our repository in order to download the Vision API:

repositories {
    maven { url "https://raw.github.com/fritzlabs/fritz-repository/master" }
}

Include the dependencies in app/build.gradle:

dependencies {
    implementation 'ai.fritz:core:3+'
    implementation 'ai.fritz:vision:3+'
}

(Optional: include the model in your app) To ship an Image Segmentation model with your build, add the dependency as shown below. Note: this bundles the model with your app when you publish it to the Play Store and will increase your app size.

To identify people (segments that represent people are marked in Cyan):

Note

Behind the scenes, People Segmentation uses a TensorFlow Lite model. In order to include this with your app, you’ll need to make sure that the model is not compressed in the APK by setting aaptOptions.

Fast Inference Time / Lower Resolution - Processes images of size 384x384.

android {
  aaptOptions {
    noCompress "tflite"
  }
}

dependencies {
    implementation 'ai.fritz:vision-people-segmentation-model:3+'
}

Slower Inference Time / Higher Resolution - Processes images of size 768x768.

android {
  aaptOptions {
    noCompress "tflite"
  }
}

dependencies {
    implementation 'ai.fritz:vision-people-segmentation-medium-model:3+'
}

To identify pets:

Note

Behind the scenes, Pet Segmentation uses a TensorFlow Lite model. In order to include this with your app, you’ll need to make sure that the model is not compressed in the APK by setting aaptOptions.

android {
  aaptOptions {
    noCompress "tflite"
  }
}

dependencies {
    implementation 'ai.fritz:vision-pet-segmentation-model:3+'
}

To identify the sky (marked in red):

Note

Behind the scenes, Sky Segmentation uses a TensorFlow Lite model. In order to include this with your app, you’ll need to make sure that the model is not compressed in the APK by setting aaptOptions.

android {
  aaptOptions {
    noCompress "tflite"
  }
}

dependencies {
    implementation 'ai.fritz:vision-sky-segmentation-model:3+'
}

To identify the following objects in your living room (with the colors that represent each segment in the final result):

  • Chair (Sandy Brown)
  • Wall (White)
  • Coffee Table (Brown)
  • Ceiling (Light Gray)
  • Floor (Dark Gray)
  • Bed (Light Blue)
  • Lamp (Yellow)
  • Sofa (Red)
  • Window (Cyan)
  • Pillow (Beige)

Note

Behind the scenes, Living Room Segmentation uses a TensorFlow Lite model. In order to include this with your app, you’ll need to make sure that the model is not compressed in the APK by setting aaptOptions.

android {
  aaptOptions {
    noCompress "tflite"
  }
}

dependencies {
    implementation 'ai.fritz:vision-living-room-segmentation-model:3+'
}

To identify the following objects outside (with the colors that represent each segment in the final result):

  • Building / Edifice (Gray)
  • Sky (Very Light Blue)
  • Tree (Green)
  • Sidewalk / Pavement (Dark Gray)
  • Earth / Ground (Dark Green)
  • Car (Light Orange)
  • Water (Blue)
  • House (Purple)
  • Fence, Fencing (White)
  • Signboard, Sign (Pink)
  • Skyscraper (Light Gray)
  • Bridge, Span (Orange)
  • River (Light Blue)
  • Bus (Dark Orange)
  • Truck / Motortruck (Dark Brown)
  • Van (Light Orange)
  • Minibike / Motorbike (Black)
  • Bicycle (Dark Blue)
  • Traffic Light (Yellow)
  • Person (Cyan)

Note

Behind the scenes, Outdoor Segmentation uses a TensorFlow Lite model. In order to include this with your app, you’ll need to make sure that the model is not compressed in the APK by setting aaptOptions.

android {
  aaptOptions {
    noCompress "tflite"
  }
}

dependencies {
    implementation 'ai.fritz:vision-outdoor-segmentation-model:3+'
}

To identify hair in order to apply effects (coloring):

  • None
  • Hair (Red)

Note

Behind the scenes, Hair Segmentation uses a TensorFlow Lite model. In order to include this with your app, you’ll need to make sure that the model is not compressed in the APK by setting aaptOptions.

android {
  aaptOptions {
    noCompress "tflite"
  }
}

dependencies {
    implementation 'ai.fritz:vision-hair-segmentation-model:3+'
}

Now you’re ready to segment images with the Image Segmentation API.

2. Get a Segmentation Predictor

In order to use the predictor, the on-device model must first be loaded.

If you followed the optional step above and included the model, you can get a predictor to use immediately (Java shown first, then Kotlin, for each model):

Fast Inference Time / Lower Resolution - Processes images of size 384x384.

SegmentOnDeviceModel peopleSegmentOnDeviceModel = new PeopleSegmentOnDeviceModel();
FritzVisionSegmentPredictor predictor = FritzVision.ImageSegmentation.getPredictor(peopleSegmentOnDeviceModel);

val peopleSegmentOnDeviceModel: SegmentOnDeviceModel = PeopleSegmentOnDeviceModel()
val predictor = FritzVision.ImageSegmentation.getPredictor(peopleSegmentOnDeviceModel)

Slower Inference Time / Higher Resolution - Processes images of size 768x768.

SegmentOnDeviceModel peopleSegmentOnDeviceModel = new PeopleSegmentMediumOnDeviceModel();
FritzVisionSegmentPredictor predictor = FritzVision.ImageSegmentation.getPredictor(peopleSegmentOnDeviceModel);

val peopleSegmentOnDeviceModel: SegmentOnDeviceModel = PeopleSegmentMediumOnDeviceModel()
val predictor = FritzVision.ImageSegmentation.getPredictor(peopleSegmentOnDeviceModel)

Pet Segmentation:

SegmentOnDeviceModel petSegmentOnDeviceModel = new PetSegmentationOnDeviceModel();
FritzVisionSegmentPredictor predictor = FritzVision.ImageSegmentation.getPredictor(petSegmentOnDeviceModel);

val petSegmentOnDeviceModel: SegmentOnDeviceModel = PetSegmentationOnDeviceModel()
val predictor = FritzVision.ImageSegmentation.getPredictor(petSegmentOnDeviceModel)

Sky Segmentation:

SegmentOnDeviceModel skySegmentOnDeviceModel = new SkySegmentationOnDeviceModel();
FritzVisionSegmentPredictor predictor = FritzVision.ImageSegmentation.getPredictor(skySegmentOnDeviceModel);

val skySegmentOnDeviceModel: SegmentOnDeviceModel = SkySegmentationOnDeviceModel()
val predictor = FritzVision.ImageSegmentation.getPredictor(skySegmentOnDeviceModel)

Living Room Segmentation:

SegmentOnDeviceModel livingRoomSegmentOnDeviceModel = new LivingRoomSegmentOnDeviceModel();
FritzVisionSegmentPredictor predictor = FritzVision.ImageSegmentation.getPredictor(livingRoomSegmentOnDeviceModel);

val livingRoomSegmentOnDeviceModel: SegmentOnDeviceModel = LivingRoomSegmentOnDeviceModel()
val predictor = FritzVision.ImageSegmentation.getPredictor(livingRoomSegmentOnDeviceModel)

Outdoor Segmentation:

SegmentOnDeviceModel outdoorSegmentOnDeviceModel = new OutdoorSegmentOnDeviceModel();
FritzVisionSegmentPredictor predictor = FritzVision.ImageSegmentation.getPredictor(outdoorSegmentOnDeviceModel);

val outdoorSegmentOnDeviceModel: SegmentOnDeviceModel = OutdoorSegmentOnDeviceModel()
val predictor = FritzVision.ImageSegmentation.getPredictor(outdoorSegmentOnDeviceModel)

Hair Segmentation:

SegmentOnDeviceModel hairOnDeviceModel = new HairSegmentationOnDeviceModel();
FritzVisionSegmentPredictor predictor = FritzVision.ImageSegmentation.getPredictor(hairOnDeviceModel);

val hairOnDeviceModel: SegmentOnDeviceModel = HairSegmentationOnDeviceModel()
val predictor = FritzVision.ImageSegmentation.getPredictor(hairOnDeviceModel)

If you did not include the on-device model, you’ll have to load the model before you can get a predictor. To do that, call FritzVision.ImageSegmentation.loadPredictor to start the model download (Java shown first, then Kotlin, for each model):

Fast Inference Time / Lower Resolution - Processes images of size 384x384.

// Declare as a member field so the callback below can assign it
FritzVisionSegmentPredictor predictor;

SegmentManagedModel managedModel = new PeopleSegmentManagedModel();
FritzVision.ImageSegmentation.loadPredictor(managedModel, new PredictorStatusListener<FritzVisionSegmentPredictor>() {
    @Override
    public void onPredictorReady(FritzVisionSegmentPredictor segmentPredictor) {
        Log.d(TAG, "Segment predictor is ready");
        predictor = segmentPredictor;
    }
});

var predictor: FritzVisionSegmentPredictor? = null

val managedModel = PeopleSegmentManagedModel()
FritzVision.ImageSegmentation.loadPredictor(managedModel, object : PredictorStatusListener<FritzVisionSegmentPredictor> {
    override fun onPredictorReady(segmentPredictor: FritzVisionSegmentPredictor) {
        Log.d(TAG, "Segment predictor is ready")
        predictor = segmentPredictor
    }
})

Slower Inference Time / Higher Resolution - Processes images of size 768x768.

FritzVisionSegmentPredictor predictor;

SegmentManagedModel managedModel = new PeopleSegmentMediumManagedModel();
FritzVision.ImageSegmentation.loadPredictor(managedModel, new PredictorStatusListener<FritzVisionSegmentPredictor>() {
    @Override
    public void onPredictorReady(FritzVisionSegmentPredictor segmentPredictor) {
        Log.d(TAG, "Segment predictor is ready");
        predictor = segmentPredictor;
    }
});

var predictor: FritzVisionSegmentPredictor? = null

val managedModel = PeopleSegmentMediumManagedModel()
FritzVision.ImageSegmentation.loadPredictor(managedModel, object : PredictorStatusListener<FritzVisionSegmentPredictor> {
    override fun onPredictorReady(segmentPredictor: FritzVisionSegmentPredictor) {
        Log.d(TAG, "Segment predictor is ready")
        predictor = segmentPredictor
    }
})

Pet Segmentation:

FritzVisionSegmentPredictor predictor;

SegmentManagedModel managedModel = new PetSegmentManagedModel();
FritzVision.ImageSegmentation.loadPredictor(managedModel, new PredictorStatusListener<FritzVisionSegmentPredictor>() {
    @Override
    public void onPredictorReady(FritzVisionSegmentPredictor segmentPredictor) {
        Log.d(TAG, "Segment predictor is ready");
        predictor = segmentPredictor;
    }
});

var predictor: FritzVisionSegmentPredictor? = null

val managedModel = PetSegmentManagedModel()
FritzVision.ImageSegmentation.loadPredictor(managedModel, object : PredictorStatusListener<FritzVisionSegmentPredictor> {
    override fun onPredictorReady(segmentPredictor: FritzVisionSegmentPredictor) {
        Log.d(TAG, "Segment predictor is ready")
        predictor = segmentPredictor
    }
})

Sky Segmentation:

FritzVisionSegmentPredictor predictor;

SegmentManagedModel managedModel = new SkySegmentManagedModel();
FritzVision.ImageSegmentation.loadPredictor(managedModel, new PredictorStatusListener<FritzVisionSegmentPredictor>() {
    @Override
    public void onPredictorReady(FritzVisionSegmentPredictor segmentPredictor) {
        Log.d(TAG, "Segment predictor is ready");
        predictor = segmentPredictor;
    }
});

var predictor: FritzVisionSegmentPredictor? = null

val managedModel = SkySegmentManagedModel()
FritzVision.ImageSegmentation.loadPredictor(managedModel, object : PredictorStatusListener<FritzVisionSegmentPredictor> {
    override fun onPredictorReady(segmentPredictor: FritzVisionSegmentPredictor) {
        Log.d(TAG, "Segment predictor is ready")
        predictor = segmentPredictor
    }
})

Living Room Segmentation:

FritzVisionSegmentPredictor predictor;

SegmentManagedModel managedModel = new LivingRoomSegmentManagedModel();
FritzVision.ImageSegmentation.loadPredictor(managedModel, new PredictorStatusListener<FritzVisionSegmentPredictor>() {
    @Override
    public void onPredictorReady(FritzVisionSegmentPredictor segmentPredictor) {
        Log.d(TAG, "Segment predictor is ready");
        predictor = segmentPredictor;
    }
});

var predictor: FritzVisionSegmentPredictor? = null

val managedModel = LivingRoomSegmentManagedModel()
FritzVision.ImageSegmentation.loadPredictor(managedModel, object : PredictorStatusListener<FritzVisionSegmentPredictor> {
    override fun onPredictorReady(segmentPredictor: FritzVisionSegmentPredictor) {
        Log.d(TAG, "Segment predictor is ready")
        predictor = segmentPredictor
    }
})

Outdoor Segmentation:

FritzVisionSegmentPredictor predictor;

SegmentManagedModel managedModel = new OutdoorSegmentManagedModel();
FritzVision.ImageSegmentation.loadPredictor(managedModel, new PredictorStatusListener<FritzVisionSegmentPredictor>() {
    @Override
    public void onPredictorReady(FritzVisionSegmentPredictor segmentPredictor) {
        Log.d(TAG, "Segment predictor is ready");
        predictor = segmentPredictor;
    }
});

var predictor: FritzVisionSegmentPredictor? = null

val managedModel = OutdoorSegmentManagedModel()
FritzVision.ImageSegmentation.loadPredictor(managedModel, object : PredictorStatusListener<FritzVisionSegmentPredictor> {
    override fun onPredictorReady(segmentPredictor: FritzVisionSegmentPredictor) {
        Log.d(TAG, "Segment predictor is ready")
        predictor = segmentPredictor
    }
})

Hair Segmentation:

FritzVisionSegmentPredictor predictor;

SegmentManagedModel managedModel = new HairSegmentManagedModel();
FritzVision.ImageSegmentation.loadPredictor(managedModel, new PredictorStatusListener<FritzVisionSegmentPredictor>() {
    @Override
    public void onPredictorReady(FritzVisionSegmentPredictor segmentPredictor) {
        Log.d(TAG, "Segment predictor is ready");
        predictor = segmentPredictor;
    }
});

var predictor: FritzVisionSegmentPredictor? = null

val managedModel = HairSegmentManagedModel()
FritzVision.ImageSegmentation.loadPredictor(managedModel, object : PredictorStatusListener<FritzVisionSegmentPredictor> {
    override fun onPredictorReady(segmentPredictor: FritzVisionSegmentPredictor) {
        Log.d(TAG, "Segment predictor is ready")
        predictor = segmentPredictor
    }
})

3. Create a FritzVisionImage from an image or a video stream

To create a FritzVisionImage from a Bitmap:

FritzVisionImage visionImage = FritzVisionImage.fromBitmap(bitmap);
var visionImage = FritzVisionImage.fromBitmap(bitmap)

To create a FritzVisionImage from a media.Image object when capturing the result from a camera, first determine the orientation of the image. This will rotate the image to account for device rotation and the orientation of the camera sensor.

// Get the system service for the camera manager
final CameraManager manager = (CameraManager) getSystemService(Context.CAMERA_SERVICE);

// Get the first camera id
String cameraId = manager.getCameraIdList()[0];

// Determine the rotation for the FritzVisionImage from the camera orientation and the device rotation.
// "this" refers to the calling Context (Application, Activity, etc.)
int imageRotationFromCamera = FritzVisionOrientation.getImageRotationFromCamera(this, cameraId);

// Get the system service for the camera manager
val manager = getSystemService(Context.CAMERA_SERVICE) as CameraManager

// Get the first camera id
val cameraId = manager.cameraIdList[0]

// Determine the rotation for the FritzVisionImage from the camera orientation and the device rotation.
// "this" refers to the calling Context (Application, Activity, etc.)
val imageRotationFromCamera = FritzVisionOrientation.getImageRotationFromCamera(this, cameraId)

Finally, create the FritzVisionImage object with the rotation:

FritzVisionImage visionImage = FritzVisionImage.fromMediaImage(image, imageRotationFromCamera);
val visionImage = FritzVisionImage.fromMediaImage(image, imageRotationFromCamera)
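
As a minimal sketch (not part of the original steps, and assuming fromMediaImage copies the frame data), this is how the pieces fit together inside a camera2 ImageReader callback:

ImageReader.OnImageAvailableListener listener = reader -> {
    Image image = reader.acquireLatestImage();
    if (image == null) {
        return;
    }
    FritzVisionImage visionImage = FritzVisionImage.fromMediaImage(image, imageRotationFromCamera);
    // Release the camera buffer once the frame has been converted
    image.close();
    // ... pass visionImage to the predictor (see the next step)
};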

4. Run prediction on FritzVisionImage

Pass your FritzVisionImage into the predictor to create masks on the original image.

FritzVisionSegmentResult segmentResult = predictor.predict(visionImage);
val segmentResult = predictor.predict(visionImage)

Running predict on the image returns a FritzVisionSegmentResult object with the following methods.

FritzVisionSegmentResult methods:

  • float[][] getConfidenceScores() - Gets the raw confidence scores of the output. The matrix is the same size as the model output.
  • MaskType[][] getMaskClassifications() - Gets a grid of MaskType objects representing the model's output classification (e.g. MaskType.PERSON, MaskType.BUS, MaskType.NONE). The matrix is the same size as the model output.
  • Bitmap buildMultiClassMask() - Creates a mask of the detected classes. The resulting bitmap is the same size as the model output.
  • Bitmap buildMultiClassMask(int maxAlpha, float clippingScoresAbove, float zeroingScoresBelow) - Creates a mask of the detected classes using the given options.
  • Bitmap buildSingleClassMask(MaskType maskType) - Creates a mask for a given MaskType (e.g. MaskType.PERSON).
  • Bitmap buildSingleClassMask(MaskType maskType, int maxAlpha, float clippingScoresAbove, float zeroingScoresBelow) - Creates a mask for a given MaskType using the given options.

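For example, a quick sketch (assumed usage, built only from the methods listed above) that counts how many cells of the output grid were classified as a person:

MaskType[][] grid = segmentResult.getMaskClassifications();
int personCells = 0;
for (MaskType[] row : grid) {
    for (MaskType maskType : row) {
        if (maskType == MaskType.PERSON) {
            personCells++;
        }
    }
}
Log.d(TAG, "Cells classified as PERSON: " + personCells);
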
Calling buildSingleClassMask with the clippingScoresAbove and zeroingScoresBelow arguments helps deal with the uncertainty of the model.

  • When clippingScoresAbove is set, any pixels with a confidence score above that threshold will have an alpha value of 255 (completely opaque).
  • When zeroingScoresBelow is set, any confidence scores below will not appear in the mask.
  • Any scores between zeroingScoresBelow and clippingScoresAbove will have an alpha value of classProbability * 255. This is useful for creating a soft edge around predictions that may still contain the desired class.
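
Conceptually, the per-pixel alpha follows this mapping (an illustrative sketch of the rules above, not the library's actual source):

// score: the model's confidence for a pixel; maxAlpha: e.g. 255
int alphaForPixel(float score, int maxAlpha, float clippingScoresAbove, float zeroingScoresBelow) {
    if (score >= clippingScoresAbove) return maxAlpha; // fully opaque
    if (score < zeroingScoresBelow) return 0;          // excluded from the mask
    return (int) (score * maxAlpha);                   // proportional alpha in between
}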

5. Displaying the result

Overlay the mask on top of the original image

To view the mask result, use the overlay method on the visionImage that was passed into the predict method.

[Image: Overlay the mask. Left: Original Image | Middle: People Segmentation mask | Right: Mask overlay]

// Create a mask
Bitmap personMask = segmentResult.buildSingleClassMask(MaskType.PERSON);
Bitmap imageWithMask = visionImage.overlay(personMask);

val personMask = segmentResult.buildSingleClassMask(MaskType.PERSON)
val imageWithMask = visionImage.overlay(personMask)

Cut out the mask from the original image

To create a cut out from the original image using the mask, use the mask method on the visionImage passed into the predict method.

[Image: Create a cut out mask. Left: Original Image | Middle: Pet Segmentation mask | Right: Mask cut out]

// Create a mask (max alpha value is 255, clippingScoresAbove .5f, zeroingScoresBelow .5f)
Bitmap personMask = segmentResult.buildSingleClassMask(MaskType.PERSON, 255, .5f, .5f);

// This image will have the same dimensions as visionImage
Bitmap imageWithMask = visionImage.mask(personMask);

// To trim the transparent pixels, set the optional trim parameter to true
Bitmap trimmedImageWithMask = visionImage.mask(personMask, true);

// Create a mask (max alpha value is 255, clippingScoresAbove .5f, zeroingScoresBelow .5f)
val personMask = segmentResult.buildSingleClassMask(MaskType.PERSON, 255, .5f, .5f)

// This image will have the same dimensions as visionImage
val imageWithMask = visionImage.mask(personMask)

// To trim the transparent pixels, set the optional trim parameter to true
val trimmedImageWithMask = visionImage.mask(personMask, true)

Blend the mask color with the original image:

For use cases such as hair coloring with Hair Segmentation, you can blend the mask color with the original image. Choose one of the following BlendModes; you may specify an alpha value (0-255) to apply to the mask before blending.

[Image: Blend the mask colors. Left: Original Image | Middle: Hair Segmentation mask | Right: Blended bitmap with the mask]

// Hue Blend
BlendMode hueBlend = BlendModeType.HUE.create();
BlendMode hueBlendWithAlpha = BlendModeType.HUE.createWithAlpha(50);

// Color Blend
BlendMode colorBlend = BlendModeType.COLOR.create();
BlendMode colorBlendWithAlpha = BlendModeType.COLOR.createWithAlpha(50);

// Soft Light Blend
BlendMode softLightBlend = BlendModeType.SOFT_LIGHT.create();
BlendMode softLightBlendWithAlpha = BlendModeType.SOFT_LIGHT.createWithAlpha(50);

// Hue Blend
val hueBlend = BlendModeType.HUE.create()
val hueBlendWithAlpha = BlendModeType.HUE.createWithAlpha(50)

// Color Blend
val colorBlend = BlendModeType.COLOR.create()
val colorBlendWithAlpha = BlendModeType.COLOR.createWithAlpha(50)

// Soft Light Blend
val softLightBlend = BlendModeType.SOFT_LIGHT.create()
val softLightBlendWithAlpha = BlendModeType.SOFT_LIGHT.createWithAlpha(50)

Set the color for the mask and then create a blended bitmap from the result.

// blendMode is one of the BlendMode objects created above
MaskType.HAIR.color = Color.BLUE;
FritzVisionSegmentResult hairResult = hairPredictor.predict(fritzVisionImage);

FritzVisionImage originalImage = hairResult.getOriginalImage();
Bitmap maskBitmap = hairResult.buildMultiClassMask(blendMode.getAlpha());
Bitmap blendedBitmap = originalImage.blend(maskBitmap, blendMode);

// blendMode is one of the BlendMode objects created above
MaskType.HAIR.color = Color.BLUE
val hairResult = hairPredictor.predict(fritzVisionImage)

val originalImage = hairResult.getOriginalImage()
val maskBitmap = hairResult.buildMultiClassMask(blendMode.getAlpha())
val blendedBitmap = originalImage.blend(maskBitmap, blendMode)

The blendedBitmap object will have the same size as the original image passed into the predictor.
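
For example, to display the result (a trivial sketch; R.id.previewImageView is a hypothetical view id):

// Show the blended result in an ImageView
ImageView preview = findViewById(R.id.previewImageView);
preview.setImageBitmap(blendedBitmap);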


Advanced Options

For targeting specific classes:

To target only specific classes (e.g. walls and windows), create a FritzVisionSegmentPredictorOptions object to pass into the predictor.

Initialize the options when getting the segmentation predictor:

// List the segments to target
List<MaskType> targetMasks = new ArrayList<>();
targetMasks.add(MaskType.WALL);
targetMasks.add(MaskType.WINDOW);

// Create predictor options with a confidence threshold.
// If a score is below the confidence threshold, the segment will be
// marked as MaskType.NONE.
FritzVisionSegmentPredictorOptions options = new FritzVisionSegmentPredictorOptions.Builder()
        .targetSegmentClasses(targetMasks)
        .targetConfidenceThreshold(.3f)
        .build();

predictor = FritzVision.ImageSegmentation.getPredictor(onDeviceModel, options);

// List the segments to target
val targetMasks = ArrayList<MaskType>()
targetMasks.add(MaskType.WALL)
targetMasks.add(MaskType.WINDOW)

// Create predictor options with a confidence threshold.
// If a score is below the confidence threshold, the segment will be
// marked as MaskType.NONE.
val options = FritzVisionSegmentPredictorOptions.Builder()
  .targetSegmentClasses(targetMasks)
  .targetConfidenceThreshold(.3f)
  .build()

predictor = FritzVision.ImageSegmentation.getPredictor(onDeviceModel, options)

The resulting list of segments will contain 3 classes: Wall, Window, and None.
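
For example (a sketch reusing the options above; visionImage is assumed to be a FritzVisionImage as in step 3), the resulting multi-class mask will only contain the targeted classes:

FritzVisionSegmentResult segmentResult = predictor.predict(visionImage);

// Only WALL and WINDOW will appear in the mask; everything else is NONE
Bitmap wallsAndWindows = segmentResult.buildMultiClassMask();
Bitmap overlaid = visionImage.overlay(wallsAndWindows);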

Controlling Image Preprocessing

By setting FritzVisionCropAndScale on FritzVisionSegmentPredictorOptions, you can control preprocessing on the image.

There are 2 crop and scale options available:

  • FritzVisionCropAndScale.SCALE_TO_FIT (default) - Resizes the original image to fit the model input size while maintaining the aspect ratio. After inference, the output is scaled back to the original aspect ratio (e.g. for a model input size of 384x384, a 600x400 image is scaled down to fit the model size; after inference, the output is scaled back to match the original 600x400 aspect ratio).
  • FritzVisionCropAndScale.CENTER_CROP - Runs prediction on a center-cropped section of the original image passed into the predict method (e.g. if the original image is 600x400, the predictor runs on a 400x400 center crop).

Example usage:

FritzVisionSegmentPredictorOptions options = new FritzVisionSegmentPredictorOptions.Builder()
  .cropAndScaleOption(FritzVisionCropAndScale.SCALE_TO_FIT)
  .build();
predictor = FritzVision.ImageSegmentation.getPredictor(onDeviceModel, options);

val options = FritzVisionSegmentPredictorOptions.Builder()
  .cropAndScaleOption(FritzVisionCropAndScale.SCALE_TO_FIT)
  .build()
predictor = FritzVision.ImageSegmentation.getPredictor(onDeviceModel, options)