1 Resources

Awesome data labeling
GitHub Topics
- annotation-tool
- video-annotation

2 Ideal Labeler Qualities

The labeler needs to have the following attributes:

Ability to handle both images and video
Import/export annotations
Some sort of layered annotations for comparisons

The labeler will preferably have the following:

Import/export multiple types of annotations (e.g., .csv,.json)
Add meta-data
Toggle number of videos present

3 Potential Labelers

3.1 Microsoft Visual Object Tagging Tool

Visual Object Tagging Tool: An electron app for building end to end Object Detection Models from Images and Videos.

Can run locally or in a web app
- Web app cannot access local file system
- webapp Docker image
No way to import external data

3.2 Labelbox

Labelbox is the fastest way to annotate data to build and ship computer vision applications.

Written in javascript
Easily create custom interfaces
Video analysis seems to be proprietary (https://www.labelbox.com/pricing)

3.3 UltimateLabeling

A multi-purpose Video Labeling GUI in Python with integrated SOTA detector and tracker.

Ability to import pre-existing files:

Import labels: To import existing .CSV labels, hit Cmd+I (or Ctrl+I). UltimateLabeling expects to read one .CSV file per frame, in the format: “class_id”, “xc”, “yc”, “w”, “h”.

csv file import/export
Can connect to remote server via ssh
YOLO integration
Seems to be under active development (i.e., missing some features)

3.4 cvat

Powerful and efficient Computer Vision Annotation Tool (CVAT)

XML file output
Can only be run in Google Chrome
Run via docker compose (i.e., multiple containers), which is incompatible with singularity (as far as I know)

3.5 ViPER

Written in Java (cross platform)
Loads annotations via XML
Relatively old (2015)

3.7 sloth

Sloth is a tool for labeling image and video data for computer vision research.

Installation guide

3.8 labelme

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Written in python
Pre-made Docker image

3.9 SimpleVideoAnnotation

A simple video annotation made with python + OpenCV for detection in YoloV2 format

3.10 video_annotation

Annotation tool for videos

Written in python
Run as a server

3.11 label-v

semi-automatic video annotation tool

Written in python
Runs in browser
Outputs json

3.12 scaleabel

Quantify computer vision performance in human terms

Run in browser
Can upload image lists and annotations as json
- Example file

3.13 vactic.js

vatic.js A pure Javascript video annotation tool

Run in browser
Outputs XML

4 Best leads

Based on the initial skimming of annotation software, the following appear to be the most viable for the current project. None are perfectly suitable for the project purpose, but may be useful nonetheless.

UltimateLabeling

Runs in a dedicated window (as opposed to through a web browser); not ideal for running on ACI or in a container. Doable, but can’t get it to work on my local machine yet
A little buggy, but in active development
Can’t get labels other than bounding boxes to work

~~scalable~~

Packaged Dockerfile runs on Ubuntu18.04 (not compatible with ACI)
Complicated label file structure

labelme

Doesn’t run in browser
Uncomplicated JSON label file
Support for multiple label types

~~label-v~~

Bounding box labels only

cvat
- Stretch goal. Very pretty and feature-rich, but I don’t think I can get it working on ACI since it relies on docker-compose
- Complicated project setup

5 Conclusions and Thoughts

There are a lot of labeling tools, but all seem to have been created either as a project starting point, or to address a specific issue that the creator was facing (which probably explains why there are so many!).

None of the labelers reviewed here are perfect, or fit all of the ideal label qualities. The best lead is labelme. Not only does it support multiple label types, but also can also import modified JSON files. It doesn’t run in a browser, but can be containerized and run on any system with X11 support.

5.1 Labeler Considerations

No labeler that I have come across does online video transformation; all expect images (frames) to be pre-processed
Because frames are pre-processed, we need to consider the framerate of the original video, the sample rate of the classification model, and the frame extraction rate for the labelers (a few seem to default to extracting 5 fps)

BehAV: Labeling Tools

Daniel N. Albohn

2019-07-30 21:54:43