Step 2: Data Pre-processing with CVAT
Overview
To automatically label behavior in the animal videos, we must first create a mini-scene for each individual animal captured in the frame, as illustrated below.
Figure: A mini-scene is a sub-image cropped from the drone video footage centered on and surrounding a single animal. Mini-scenes simulate the camera as well-aligned with each animal in the frame, compensating for the drone's movement by focusing on just the animal and its immediate surroundings. The KABR dataset consists of mini-scenes and their frame-by-frame behavior annotation.
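The cropping idea described above can be sketched in a few lines of Python. This is only an illustration, not the code used by kabr_tools: the function name `crop_mini_scene`, the default window size, and the zero-padding at the frame edges are all assumptions made for the example.

```python
import numpy as np


def crop_mini_scene(frame, box, size=(400, 300)):
    """Crop a fixed-size window centered on a bounding box (hypothetical helper).

    frame: H x W x C image array.
    box: (xtl, ytl, xbr, ybr) in pixels.
    size: (width, height) of the output window; an illustrative default.
    """
    out_w, out_h = size
    frame_h, frame_w = frame.shape[:2]

    # Center of the animal's bounding box.
    cx = int(round((box[0] + box[2]) / 2))
    cy = int(round((box[1] + box[3]) / 2))

    # Window corners in frame coordinates; they may extend past the frame.
    x0, y0 = cx - out_w // 2, cy - out_h // 2
    x1, y1 = x0 + out_w, y0 + out_h

    # Start from a black canvas so out-of-frame regions are zero-padded.
    scene = np.zeros((out_h, out_w) + frame.shape[2:], dtype=frame.dtype)

    # Copy only the part of the window that overlaps the frame.
    sx0, sy0 = max(x0, 0), max(y0, 0)
    sx1, sy1 = min(x1, frame_w), min(y1, frame_h)
    if sx0 < sx1 and sy0 < sy1:
        scene[sy0 - y0:sy1 - y0, sx0 - x0:sx1 - x0] = frame[sy0:sy1, sx0:sx1]
    return scene
```

Applying this to every frame of a track, with the box for that frame, yields a sequence of equally sized crops in which the animal stays centered even as the drone moves.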
Resources
- See the CVAT User Guide and Data Management Tips for detailed instructions and recommendations.
- View example mini-scenes at data/mini_scenes on Hugging Face.
Step 2A: Perform Detections to Create Tracks
To create mini-scenes, we must first perform the detection step by drawing bounding boxes around each animal in the frame.
Option 1: Manual Detections in CVAT
Figure: Simplified CVAT annotation tool interface
Upload your raw videos to CVAT and perform the detections by drawing bounding boxes manually. This can be quite time-consuming, but it has the advantage of generating highly accurate tracks.
Video Size Considerations
Depending on the resolution of your raw video, you may encounter out-of-space issues with CVAT. You can use helper_scripts/downgrade.sh to reduce the size of your videos.
Option 2: Automatic Detections with YOLO
You may use YOLO to automatically perform detection on your videos. Use the script below to convert YOLO detections to CVAT format.
detector2cvat: Detect objects with Ultralytics YOLO, apply SORT tracking, and convert the tracks to CVAT format.
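To give a sense of what the conversion step involves, here is a minimal sketch that turns per-track YOLO boxes into CVAT-for-video-style XML. It is not the detector2cvat implementation: the function names, the `animal` label, and the input layout (normalized center-format boxes keyed by track and frame) are assumptions, and the exact attributes written by the real tool may differ.

```python
import xml.etree.ElementTree as ET


def yolo_to_corners(cx, cy, w, h, frame_w, frame_h):
    """Convert a normalized YOLO box (center x/y, width, height) to pixel corners."""
    xtl = (cx - w / 2) * frame_w
    ytl = (cy - h / 2) * frame_h
    xbr = (cx + w / 2) * frame_w
    ybr = (cy + h / 2) * frame_h
    return xtl, ytl, xbr, ybr


def tracks_to_cvat_xml(tracks, frame_w, frame_h, label="animal"):
    """Build a CVAT-style <annotations> tree (hypothetical helper).

    tracks: {track_id: {frame_index: (cx, cy, w, h)}} with normalized YOLO boxes.
    """
    root = ET.Element("annotations")
    for track_id, boxes in sorted(tracks.items()):
        track = ET.SubElement(root, "track", id=str(track_id), label=label)
        for frame_idx, yolo_box in sorted(boxes.items()):
            xtl, ytl, xbr, ybr = yolo_to_corners(*yolo_box, frame_w, frame_h)
            # One <box> per frame; every detection is treated as a keyframe.
            ET.SubElement(
                track, "box",
                frame=str(frame_idx), outside="0", occluded="0", keyframe="1",
                xtl=f"{xtl:.2f}", ytl=f"{ytl:.2f}",
                xbr=f"{xbr:.2f}", ybr=f"{ybr:.2f}",
            )
    return ET.ElementTree(root)


# Example: one track with a single detection in a 1920x1080 video.
tree = tracks_to_cvat_xml({0: {0: (0.5, 0.5, 0.1, 0.2)}}, 1920, 1080)
xml_text = ET.tostring(tree.getroot(), encoding="unicode")
```

The key step is the coordinate change: YOLO commonly reports a box by its normalized center and size, while CVAT stores top-left and bottom-right corners in pixels.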
Step 2B: Create Mini-scenes from Tracks
Once you have your tracks generated, use them to create mini-scenes from your raw footage.
tracks_extractor: Extract mini-scenes from CVAT tracks.
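As a rough illustration of the first half of this step, the sketch below reads per-frame bounding boxes out of CVAT-style track XML. It is not the tracks_extractor code: the function name `read_cvat_tracks` and the choice to skip boxes flagged `outside="1"` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def read_cvat_tracks(xml_text):
    """Return {track_id: {frame_index: (xtl, ytl, xbr, ybr)}} (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    tracks = {}
    for track in root.iter("track"):
        boxes = {}
        for box in track.iter("box"):
            # Skip frames where the object is marked as out of view.
            if box.get("outside", "0") == "1":
                continue
            boxes[int(box.get("frame"))] = tuple(
                float(box.get(key)) for key in ("xtl", "ytl", "xbr", "ybr")
            )
        tracks[int(track.get("id"))] = boxes
    return tracks


# Example: one track with two visible frames and one frame marked as outside.
sample = """
<annotations>
  <track id="3" label="animal">
    <box frame="0" outside="0" xtl="10.0" ytl="20.0" xbr="110.0" ybr="220.0"/>
    <box frame="1" outside="0" xtl="12.0" ytl="21.0" xbr="112.0" ybr="221.0"/>
    <box frame="2" outside="1" xtl="0.0" ytl="0.0" xbr="0.0" ybr="0.0"/>
  </track>
</annotations>
"""
tracks = read_cvat_tracks(sample)
```

Each entry in the resulting dictionary gives the box to crop for one animal in one frame, which is the information needed to cut a mini-scene out of the raw footage.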
Tool Reference
detector2cvat
Source: src/kabr_tools/detector2cvat.py
Detect objects with Ultralytics YOLO, apply SORT tracking, and convert the tracks to CVAT format.
Usage:
tracks_extractor
Source: src/kabr_tools/tracks_extractor.py
Extract mini-scenes from CVAT tracks.
Usage:
Next Steps
Once you have created your mini-scenes, proceed to Step 3: Behavior Labeling to classify behaviors using machine learning models.