Our end-to-end machine learning platform empowers your dev team to tackle each challenge in the mobile ML lifecycle: generate and collect labeled datasets, train optimized models without code, deploy and manage models on any mobile platform, and improve models and app UX based on real-world data.
In this quickstart guide, you’ll work through the entire mobile ML lifecycle: building an initial dataset, training and deploying your first model, then collecting more data and retraining to improve it over time.
1. Defining the problem
Before you get started, you'll need to clearly define the problem you're trying to solve. Fritz AI Studio currently supports four types of computer vision problems:
- Image Labeling: predict a single label for an entire image (e.g. beach or forest).
- Object Detection: identify objects in an image along with the location of a bounding box that encloses each one (e.g. draw a box around every cat).
- Image Segmentation: predict which pixels in an image belong to a type of object (e.g. separate people from the background).
- Pose Estimation: predict the location of individual keypoints on an object in an image (e.g. the (x, y) coordinates of a person's eyes and nose).
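To make the four task types concrete, here is a sketch of what a single annotated image might look like for each. The field names are hypothetical and chosen for illustration only; they are not Fritz AI's actual annotation schema.

```python
# Illustrative annotation shapes for the four computer vision tasks.
# Field names are hypothetical, not Fritz AI's actual schema.

# Image Labeling: exactly one label for the whole image.
image_labeling = {"label": "beach"}

# Object Detection: a label plus a bounding box per object.
object_detection = {
    "objects": [
        # box as (x_min, y_min, x_max, y_max) in pixels
        {"label": "cat", "box": (34, 50, 210, 180)},
    ]
}

# Image Segmentation: one class id per pixel (0 = background, 1 = person).
image_segmentation = {
    "mask": [[0, 0, 1],
             [0, 1, 1],
             [0, 0, 0]]
}

# Pose Estimation: named keypoints with (x, y) coordinates per object.
pose_estimation = {
    "objects": [
        {"label": "person",
         "keypoints": {"left_eye": (120, 80),
                       "right_eye": (150, 80),
                       "nose": (135, 95)}}
    ]
}
```

Note how the tasks build in richness: a single label, then boxes, then per-pixel masks, then named coordinates. The task you pick determines the annotations you will create in step two.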
2. Building an initial dataset
Once you understand the problem you’re trying to solve, you’ll need some data. When starting a project from scratch, getting an initial dataset can be a daunting task. But, fear not. Rather than collecting and labeling thousands of images by hand, you can use our Dataset Generator to create high-quality annotated datasets from just a handful of seed images.
To generate a dataset, navigate to the datasets tab on the left hand menu and click the
ADD IMAGE COLLECTION button.
Image collections contain images and corresponding annotations that will eventually be used to train a model. The Dataset Generator requires a special type of collection called a “Seed” image collection as input. Select this option on the image collection creation form.
After creating your image collection, it’s time to upload some seed images. Seed images are transparent PNGs that will be used like stickers and pasted onto random backgrounds to generate a large, diverse set of images for model training. It is recommended you start with 10-20 seed images for each type of object you want to predict. For example, if you wanted to train a pose estimation model to find the location of fingertips, you would need to upload 10-20 images of hands with the background removed. The greater the variety in seed images, the more accurate your model will be in the real world.
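The "sticker" idea above can be sketched in a few lines: each generated image pairs one seed with a random background and random placement parameters. This is a conceptual illustration only, with made-up file names and transforms; Fritz's actual generator is more sophisticated and also transforms your annotations to match.

```python
# Conceptual sketch of seed-based dataset generation: pair each seed
# "sticker" with random backgrounds and placement parameters.
# File names and transform ranges are illustrative.
import random

def generate_placements(seeds, backgrounds, per_seed=100, rng=None):
    """Produce placement specs: per_seed generated images per seed image."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    samples = []
    for seed in seeds:
        for _ in range(per_seed):
            samples.append({
                "seed": seed,
                "background": rng.choice(backgrounds),
                "scale": rng.uniform(0.5, 1.5),      # resize the sticker
                "rotation": rng.uniform(-30, 30),    # degrees
                "x": rng.random(),                   # relative position
                "y": rng.random(),
            })
    return samples

samples = generate_placements(["hand_01.png", "hand_02.png"],
                              ["kitchen.jpg", "park.jpg"])
print(len(samples))  # 2 seeds x 100 images each = 200
```

This also shows why seed variety matters: every generated image inherits the pose and lighting baked into its seed, so 20 varied seeds cover far more of the real world than 20 near-duplicates.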
Tools like remove.bg, Photoshop, and even Preview on macOS can be used to create transparent PNGs. You may also find high-quality transparent images via tools like Google Images, but be sure any photos have appropriate licensing.
Once your seed images are uploaded, it’s time to start annotating. The first step to annotating a collection is creating an annotation configuration. This configuration defines the types of objects your model will detect. Again, using the hand pose estimation model as an example, your annotation configuration would contain a single object called “hand” and five keypoints, one for each fingertip. If you want to train an object detection model, make sure the “include bounding box” option is enabled. You can edit your annotation configuration at any time to add or remove an object.
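As a rough mental model, the hand pose configuration described above amounts to something like the following. The keys here are illustrative; Fritz defines the actual schema through the webapp form.

```python
# Illustrative annotation configuration for the hand pose example.
# Keys are hypothetical; the real configuration is built in the webapp.
annotation_config = {
    "objects": [
        {
            "name": "hand",
            # one keypoint per fingertip
            "keypoints": ["thumb_tip", "index_tip", "middle_tip",
                          "ring_tip", "pinky_tip"],
            # enable this so the same collection can also train
            # an object detection model
            "include_bounding_box": True,
        }
    ]
}
```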
With your annotation configuration set, you can begin annotating your seed images. For each image, you’ll need to select the object you want to annotate, then select the individual annotation to place (e.g. a thumb keypoint or a bounding box). When you’re happy with your annotation, save it and move on to the next image. To speed up your annotation workflow, you can make use of the keyboard shortcuts.
After all seed images have been annotated, a dataset can be generated. Click the
CREATE SNAPSHOT button, provide a name, and choose the number of images you want to generate. The recommended number is 100 images per seed image. It will take a few minutes to generate your dataset. The dataset details page will show you the status and progress of your job. When completed, you will receive an email and be able to view preview images.
While image collections can be changed over time (e.g. adding new images or modifying annotations), snapshots are fixed. Once you've created a snapshot, it will never change. If you make a change to an image collection, you will need to create a new snapshot in order for it to reflect those changes. This way, snapshots provide an audit trail, making sure the exact data used to train a model is always available.
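One way to think about this immutability guarantee is content-addressing: a snapshot behaves like a fingerprint of the collection's exact contents at one moment, so any edit to the collection necessarily yields a different snapshot. The sketch below is a conceptual illustration only, not how Fritz stores snapshots internally.

```python
# Conceptual sketch: a snapshot as a content hash of a collection.
# Any change to images or annotations produces a different snapshot id,
# which is what makes snapshots a reliable audit trail.
import hashlib
import json

def snapshot_id(collection):
    payload = json.dumps(collection, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:12]

v1 = {"images": ["hand_01.png"],
      "annotations": {"hand_01.png": "hand"}}
# Adding one image changes the collection, so a new snapshot is needed.
v2 = {"images": ["hand_01.png", "hand_02.png"],
      "annotations": {"hand_01.png": "hand", "hand_02.png": "hand"}}

print(snapshot_id(v1) != snapshot_id(v2))  # True
```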
3. Training a model
Now that you have an initial dataset, you can train your first model. From either the snapshot details page or the training tab, click the
TRAIN NEW MODEL button.
First, you’ll be asked to name your training job and select a model type. The model type you select should match the task you defined in step one and the annotations you provided in step two. If you labeled keypoints on your seed images, you’ll want to train a pose estimation model; bounding boxes are used to train object detection models.
Next, select the snapshot you generated in the previous step to train your model. In the future, you can select multiple snapshots to train models, as long as their annotation configurations match. This allows you to iteratively collect and generate new data as you identify scenarios where you want to improve model performance.
When training a model, you must choose one of three variants. Each variant optimizes training for a specific characteristic: size, speed, or accuracy. Small models keep your application bundle size low, fast models are meant for use with live video, and accurate models produce the highest quality predictions.
Provide a model output filename and a training time budget and you’re ready to train. In general, the longer you train your model, the more accurate it will be. If your model has stopped improving during training, we will stop it automatically and you won't be charged for any remaining budget.
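The automatic-stop behavior described above is commonly implemented as patience-based early stopping: training halts once the validation metric has failed to improve for several evaluations in a row. A simplified sketch of the technique (Fritz's actual stopping criteria aren't documented here):

```python
# Simplified patience-based early stopping, the common technique behind
# "stop when the model is no longer improving". Illustrative only.
def train_with_early_stopping(eval_steps, budget, patience=3):
    """eval_steps yields one validation accuracy per training step."""
    best = float("-inf")
    since_improvement = 0
    step = -1
    for step, accuracy in enumerate(eval_steps):
        if step >= budget:
            return step, best              # budget exhausted
        if accuracy > best:
            best, since_improvement = accuracy, 0
        else:
            since_improvement += 1
            if since_improvement >= patience:
                return step + 1, best      # stopped early; budget unused
    return step + 1, best

# Accuracy plateaus after step 3, so training stops well before the
# budget of 100 steps is spent.
steps_used, best_acc = train_with_early_stopping(
    [0.50, 0.62, 0.70, 0.71, 0.71, 0.71, 0.71], budget=100)
print(steps_used, best_acc)  # 7 0.71
```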
There are two optional configuration settings. The first allows you to start training from an existing model checkpoint. This is useful if you want to retrain a model for longer or with new data.
The second allows you to specify the location of output checkpoints. By default, trained model checkpoints are uploaded to Fritz as brand new models. However, if you are training a new version of an existing model, you can specify that here so that all of your model checkpoints stay organized in one place.
You will receive an email when your model has finished training. You can check the status of a training job on the training job details page. Here, you can also cancel a training job at any time.
4. Testing a model in an app
While your model is training, it’s a good time to set up a demo application for testing. You can find example applications for every model type in our fritz-examples GitHub repository.
For example, if you wanted to test a hand pose estimation model on iOS, clone the repository above, navigate to the iOS/FritzPoseEstimationDemo project, and run pod install to install the Fritz SDK via CocoaPods. Open the .xcworkspace and verify the project builds. If you are building for Android, you will follow a similar process, but using Android build tools.
By default, most example projects use a pre-trained model included in the Fritz SDK. You’ll need to replace this model with the custom model you’re training. Navigate back to the webapp. At the bottom of the page, you will find a list of output model checkpoints that have been pre-populated at the start of your training job. These checkpoints represent untrained models that will not produce accurate predictions, but will allow you to get started with app integration.
Select the model checkpoint format for your desired platform (Core ML for iOS or TensorFlow Lite for Android). You will be taken to a model details page. From this page, download the saved model file and drag it into your mobile app IDE.
Next, you’ll need to register your model with the Fritz SDK. Back on the model details page, select
SDK INSTRUCTIONS to get a code snippet for integrating your specific model. The first block of code contains a unique model identifier and a version number that tells the Fritz SDK which version of your model will be bundled with the app and which identifier to use when checking for new versions.
The second block of code demonstrates how to use your custom model to create a predictor that can be used to make predictions in your application. Copy this code and replace the pre-trained predictor in the demo application.
Build the demo project and verify that the correct model version is loaded. Note that you will likely not see accurate predictions as the model is not trained.
When your training job is finished, a new version will automatically be placed in Fritz AI. You can navigate to the model details page to see all available model versions. To test a new version, download the model file, replace the existing bundled version, and rebuild your app. You can also implement over-the-air downloads in your demo application, making it possible to push new active versions of your model directly to your app without needing to recompile.
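The over-the-air flow boils down to a version check at launch: use the downloaded model when the server's active version is newer than the one bundled with the app, otherwise fall back to the bundled model. The function and names below are hypothetical, purely to illustrate the control flow; the Fritz SDK handles this for you on-device.

```python
# Conceptual sketch of over-the-air model resolution. All names are
# hypothetical; the Fritz SDK implements this flow internally.
def resolve_model(bundled_version, fetch_active_version, download_model):
    """Return the newest available model, preferring OTA over bundled."""
    active = fetch_active_version()       # e.g. an API call to Fritz AI
    if active is not None and active > bundled_version:
        return download_model(active)     # newer active version: use OTA
    return f"bundled-v{bundled_version}"  # fall back to the shipped model

model = resolve_model(
    bundled_version=1,
    fetch_active_version=lambda: 3,                  # server says v3 is active
    download_model=lambda v: f"downloaded-v{v}",
)
print(model)  # downloaded-v3
```

The bundled fallback is the important design point: the app always has a working model even when the device is offline or no newer version exists.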
With a trained model in your app, you can now begin to assess model performance in real-world scenarios.
5. Collecting more data
Now that you have the first version of your model working inside a demo application, it's time to do some more testing and collect additional data to retrain and improve performance. One option is to identify scenarios where the model does not perform well, create additional seed images to cover those scenarios, and generate another snapshot.
Another option is to collect data directly from your application using the Dataset Collection System. To set up the data collection system, navigate to the Datasets tab in the webapp and click
ADD IMAGE COLLECTION. This time, create a Model-based image collection. Model-based image collections are tied directly to a model stored in Fritz AI so that model predictions can be compared to manual annotations. Once you’ve created the collection, you’ll need to update your mobile app to make use of the predictor.record method, which sends an image, a model prediction, and an optional user-generated annotation back to the collection you just created.
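Conceptually, each call to predictor.record ships a payload like the one sketched below: the captured image, what the model predicted, and (optionally) what the user says the right answer was. Field names are illustrative, not the SDK's actual wire format.

```python
# Illustrative payload for a model-based collection record: the image,
# the model's prediction, and an optional user-corrected annotation.
# Field names are hypothetical, not the Fritz SDK's wire format.
def build_record(image_bytes, prediction, user_annotation=None):
    record = {"image": image_bytes, "predicted": prediction}
    if user_annotation is not None:
        # ground truth to compare the prediction against in the webapp
        record["annotation"] = user_annotation
    return record

record = build_record(
    image_bytes=b"...",  # the captured camera frame
    prediction={"hand": {"thumb_tip": (0.41, 0.77)}},
    user_annotation={"hand": {"thumb_tip": (0.43, 0.75)}},
)
print(sorted(record))  # ['annotation', 'image', 'predicted']
```

Pairing predictions with user annotations is what makes these collections useful: the gap between the two fields points directly at the examples your next training run needs.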
Again using the hand pose estimation model as an example, a model-based image collection could be used to gather a few hundred images of hands in real-world use. These images can then be annotated within the Fritz AI webapp and used to retrain a model.
6. Improving the model
Once you have created a new dataset or two, it’s time to retrain your model. From the training screen, select
TRAIN NEW MODEL. Provide a name, choose the same model type as before, and this time, select both your original and new snapshots for training. Select the same variant as last time, and provide a new training time budget.
This time, set the optional configuration to start training from an existing model checkpoint. Select the most recent version of the model you trained last time.
Also, set the optional form items to save retraining checkpoints as new versions of the model trained earlier. This will keep all versions of your model neatly organized in one place.
Start retraining and wait to receive an email when the job has finished.
7. Deploying new versions
When your retrained models are ready, download the new files or, if you’re using over-the-air updates, make the most recent checkpoints the active version, and test them out in your demo application.
Machine learning requires constant testing and iteration. As you discover new edge cases or areas of improvement, continue to add more data and retrain models.
Experiment with different combinations of datasets and different model variants until you find the best experience for your users.