Uploading Annotations
In this tutorial, we explore different ways to upload annotations to Remo from code. In particular, we can:
- add annotations from a file in a format supported by Remo
- add annotations from code, which enables uploading annotations or model predictions from any input format
We start by creating a dataset and populating it with some images:
%load_ext autoreload
%autoreload 2
import remo
import os
import pandas as pd
Remo server is running: v0.3.14
urls = ['https://remo-scripts.s3-eu-west-1.amazonaws.com/open_images_sample_dataset.zip']
my_dataset = remo.create_dataset(name='D1', urls=urls)
my_dataset.view()
Open http://localhost:8123/datasets/16
Add annotations from a file supported by Remo
To add annotations from a supported file format, we can pass the file via dataset.add_data.
Remo automatically parses annotation files in a variety of formats (such as Pascal VOC XML, COCO JSON, Open Images CSV, etc.). You can read more about the file formats supported by Remo in our documentation.
As an example, let's add some annotations for an Object Detection task from a CSV file with encoded classes.
In this case, the CSV file is already in a format supported by Remo, and the class labels are encoded as Google Knowledge Graph identifiers (the GoogleKnowledgeGraph encoding). Remo automatically detects the class encoding and translates it into the corresponding labels.
annotation_files = [os.path.join(os.getcwd(), 'assets', 'open_sample.csv')]
df = pd.read_csv(annotation_files[0])
df.columns
Index(['ImageID', 'Source', 'LabelName', 'Confidence', 'XMin', 'XMax', 'YMin', 'YMax', 'IsOccluded', 'IsTruncated', 'IsGroupOf', 'IsDepiction', 'IsInside'], dtype='object')
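Before uploading a file like this, it can help to sanity-check its contents. The following is a minimal sketch that uses only the standard library and a made-up three-row sample with the same columns as listed above. The image IDs, label codes and coordinates are hypothetical, and the check assumes the file follows the Open Images convention of storing box coordinates as fractions of the image width and height.

```python
import csv
import io
from collections import Counter

# Hypothetical sample with the same columns as open_sample.csv.
# Image IDs, label codes and coordinates are made up for illustration.
SAMPLE_CSV = """ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside
img_a,xclick,/m/aaa,1,0.10,0.50,0.20,0.60,0,0,0,0,0
img_a,xclick,/m/bbb,1,0.25,0.75,0.00,1.00,0,0,0,0,0
img_b,xclick,/m/aaa,1,0.00,0.40,0.10,0.30,1,0,0,0,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# How many boxes each image and each encoded label contributes
boxes_per_image = Counter(row['ImageID'] for row in rows)
boxes_per_label = Counter(row['LabelName'] for row in rows)

# Assumption: coordinates are normalized to the [0, 1] range
coordinate_columns = ('XMin', 'XMax', 'YMin', 'YMax')
coords_normalized = all(
    0.0 <= float(row[column]) <= 1.0
    for row in rows
    for column in coordinate_columns
)

print(dict(boxes_per_image))
print(dict(boxes_per_label))
print(coords_normalized)
```

To run the same check on the real file, the in-memory sample can be swapped for `open(annotation_files[0], newline='')`.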
my_dataset.add_data(local_files=annotation_files, annotation_task='Object detection')
{'files_link_result': {'files uploaded': 0, 'annotations': 9, 'errors': []}}
We can now inspect the annotation statistics and explore the dataset in Remo:
my_dataset.get_annotation_statistics()
[{'AnnotationSet ID': 12, 'AnnotationSet name': 'Object detection', 'n_images': 9, 'n_classes': 15, 'n_objects': 84, 'top_3_classes': [{'name': 'Fruit', 'count': 27}, {'name': 'Sports equipment', 'count': 12}, {'name': 'Human arm', 'count': 7}], 'creation_date': None, 'last_modified_date': '2020-03-25T17:50:36.055324Z'}]
remo.set_viewer('jupyter')
my_dataset.view()
Add annotations from code
We can also easily add annotations from code via the Annotation object.
This can be useful to:
- visualize model predictions
- upload annotations from any custom file format
- create an active learning workflow
As an example, let's see how we can add annotations to a specific image using the add_annotations() method of the dataset class:
image_name = '000a1249af2bc5f0.jpg'
annotations = []

# First bounding box: a 'Human hand' on the chosen image
annotation = remo.Annotation()
annotation.img_filename = image_name
annotation.classes = 'Human hand'
annotation.bbox = [227, 284, 678, 674]
annotations.append(annotation)

# Second bounding box: a 'Fashion accessory' on the same image
annotation = remo.Annotation()
annotation.img_filename = image_name
annotation.classes = 'Fashion accessory'
annotation.bbox = [496, 322, 544, 370]
annotations.append(annotation)

# Upload both annotations to the dataset
my_dataset.add_annotations(annotations)
Progress 100% - 1/1 - elapsed 0:00:02.001000 - speed: 0.50 img / s, ETA: 0:00:00
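When annotations come from a model or a custom file format, the same pattern can be wrapped in a small helper. The sketch below is one possible way to do it, not part of the Remo API: `build_annotations` and the `records` layout are hypothetical names introduced here, and `SimpleNamespace` stands in for `remo.Annotation` so the snippet runs without a Remo server.

```python
from types import SimpleNamespace


def build_annotations(records, annotation_factory):
    """Turn plain dicts into annotation objects.

    `records` is a hypothetical intermediate format: each dict has
    'img_filename', 'classes' and 'bbox' keys, mirroring the attributes
    set on remo.Annotation in the example above.
    """
    annotations = []
    for record in records:
        annotation = annotation_factory()
        annotation.img_filename = record['img_filename']
        annotation.classes = record['classes']
        annotation.bbox = record['bbox']
        annotations.append(annotation)
    return annotations


# The same two boxes as in the example above, expressed as plain data
records = [
    {'img_filename': '000a1249af2bc5f0.jpg', 'classes': 'Human hand',
     'bbox': [227, 284, 678, 674]},
    {'img_filename': '000a1249af2bc5f0.jpg', 'classes': 'Fashion accessory',
     'bbox': [496, 322, 544, 370]},
]

# SimpleNamespace is a stand-in so the sketch runs without a server.
# With Remo running, the assumed usage would be:
#   my_dataset.add_annotations(build_annotations(records, remo.Annotation))
built = build_annotations(records, SimpleNamespace)
print(len(built), built[0].classes, built[1].bbox)
```

Keeping the parsing step (producing `records`) separate from the upload step makes it easier to reuse the helper for different input formats.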
We can now retrieve the picture and visualise it:
my_image = my_dataset.image(image_name)
my_image.view()
Annotation sets
Behind the scenes, Remo organises annotations in annotation sets. An annotation set is simply a collection of annotations for a dataset.
The advantage of grouping annotations in an annotation set is that it allows high-level operations on all the annotations at once, such as:
- grouping classes together
- deleting objects of specific classes
- comparing different annotations (such as ground truth vs. predictions, or annotations from different annotators)
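To make the idea of comparing annotations concrete, here is a minimal sketch of intersection over union (IoU), a common overlap score between a ground-truth box and a predicted box. This is a generic metric written in plain Python, not Remo's own comparison logic, and it assumes boxes are given as [xmin, ymin, xmax, ymax] in pixels. The function name and the example boxes are hypothetical.

```python
def bbox_iou(box_a, box_b):
    """Intersection over union of two boxes.

    Assumption: each box is [xmin, ymin, xmax, ymax].
    """
    # Width and height of the overlapping region (zero if the boxes are disjoint)
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Hypothetical ground-truth and predicted boxes for the same object
ground_truth = [0, 0, 10, 10]
prediction = [5, 0, 15, 10]

overlap_score = bbox_iou(ground_truth, prediction)
disjoint_score = bbox_iou([0, 0, 1, 1], [2, 2, 3, 3])
print(overlap_score, disjoint_score)
```

A score close to 1 means the two boxes nearly coincide, while 0 means they do not overlap at all.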
In the upload examples earlier in this tutorial, Remo automatically created an annotation set and set it as the default. For more control, it is also possible to manipulate annotation set objects explicitly.
To read more about annotation sets, see the Remo documentation or the annotation set tutorial.