Example photo from the MMLA dataset with labels generated by the model. The image shows a group of zebras and giraffes at the Mpala Research Centre in Kenya.
The MMLA (Multi-Environment, Multi-Species, Low-Altitude Aerial Footage) dataset is a collection of 155,074 frames of low-altitude aerial footage of various species in different environments. The dataset is designed to help researchers and practitioners develop and evaluate object detection models for wildlife monitoring and conservation.
This project provides a fine-tuned YOLO11m model trained on the MMLA dataset, enabling more accurate detection of wildlife in aerial imagery across diverse environments.
MMLA includes footage from three distinct locations: Ol Pejeta Conservancy (Kenya), Mpala Research Center (Kenya), and The Wilds Conservation Center (USA).
The dataset contains annotations for multiple species including Plains zebra, Grevy's zebra, Reticulated giraffe, Masai giraffe, Persian onager, and African wild dog.
All footage is captured from low-altitude aerial platforms, providing a realistic dataset for practical wildlife monitoring applications.
Our fine-tuned YOLO11m model achieves the following performance on the MMLA dataset:
Class | Images | Instances | Precision | Recall | mAP50 | mAP50-95 |
---|---|---|---|---|---|---|
all | 7,658 | 44,619 | 0.867 | 0.764 | 0.801 | 0.488 |
Zebra | 4,430 | 28,219 | 0.768 | 0.647 | 0.675 | 0.273 |
Giraffe | 868 | 1,357 | 0.788 | 0.634 | 0.678 | 0.314 |
Onager | 172 | 1,584 | 0.939 | 0.776 | 0.857 | 0.505 |
Dog | 3,022 | 13,459 | 0.973 | 0.998 | 0.995 | 0.860 |
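As a quick consistency check, the `all` row appears to be the unweighted (macro) mean of the four per-class rows rather than an instance-weighted mean. The sketch below recomputes it from the table values; the names `per_class`, `instances`, and `macro_average` are illustrative and not part of the project's code.

```python
# Per-class box metrics copied from the table above, in the order
# (precision, recall, mAP50, mAP50-95).
per_class = {
    "Zebra":   (0.768, 0.647, 0.675, 0.273),
    "Giraffe": (0.788, 0.634, 0.678, 0.314),
    "Onager":  (0.939, 0.776, 0.857, 0.505),
    "Dog":     (0.973, 0.998, 0.995, 0.860),
}

# Annotated instances per class, also from the table.
instances = {"Zebra": 28_219, "Giraffe": 1_357, "Onager": 1_584, "Dog": 13_459}


def macro_average(rows):
    """Return the unweighted mean of each metric column across classes."""
    columns = zip(*rows.values())
    return tuple(round(sum(col) / len(rows), 3) for col in columns)


macro = macro_average(per_class)
total_instances = sum(instances.values())

print("macro average:", macro)
print("total instances:", total_instances)
```

The recomputed values match the `all` row (0.867, 0.764, 0.801, 0.488 and 44,619 instances). Because the mean is unweighted, the small Onager class counts as much as the much larger Zebra class, which is worth keeping in mind when comparing the aggregate score with the per-class ones.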
Representative images from the MMLA dataset showcasing different environments and species:
Figure: Representative images from the MMLA dataset showing various environments and species from different collection sessions.
The MMLA dataset facilitates research in several key areas:
Enables more effective wildlife monitoring in conservation areas through automated detection systems that can process aerial footage.
Provides a challenging benchmark for object detection algorithms in real-world settings with varied lighting, backgrounds, and animal movements.
Facilitates research on wildlife populations through non-invasive monitoring methods that can count and track animal groups.
Supports studies of animal behavior by providing aerial perspectives that capture group dynamics and movement patterns.
We provide a YOLO11m model fine-tuned on the MMLA dataset. Across all classes it reaches 0.801 mAP50 and 0.488 mAP50-95 on the 7,658 evaluation images reported in the results table above.
Full model weights and details are available on HuggingFace.
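A minimal inference sketch, assuming the weights have been downloaded to a local file and the `ultralytics` package is installed. The paths `mmla_yolo11m.pt` and `frame.jpg` are hypothetical placeholders, and `count_by_class` and `detect_wildlife` are illustrative helpers, not part of the released code.

```python
from collections import Counter


def count_by_class(class_ids, names):
    """Count detections per class name from predicted class ids."""
    return dict(Counter(names[int(c)] for c in class_ids))


def detect_wildlife(weights_path, image_path, conf=0.25):
    """Run the detector on one image and return per-class counts."""
    # Imported lazily so the counting helper works without ultralytics.
    from ultralytics import YOLO

    model = YOLO(weights_path)  # load the fine-tuned weights
    result = model.predict(image_path, conf=conf)[0]  # one image, one result
    # result.boxes.cls holds one class id per detected box;
    # result.names maps class ids to class names.
    return count_by_class(result.boxes.cls.tolist(), result.names)


# Example call (hypothetical paths, replace with real files):
# counts = detect_wildlife("mmla_yolo11m.pt", "frame.jpg")
# print(counts)
```

The confidence threshold of 0.25 is only a starting point; a higher value trades recall for precision, which may matter for the Zebra and Giraffe classes where recall is lowest.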
The MMLA dataset comprises 155,074 frames from 11 sessions across 3 locations, featuring multiple species of wildlife captured from low-altitude aerial footage.
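Location | Session | Date (M/D/YY) | Frames | Species |
---|---|---|---|---|
Ol Pejeta Conservancy, Kenya | 1 | 1/31/25 | 16,726 | Plains zebra |
Ol Pejeta Conservancy, Kenya | 2 | 2/1/25 | 12,542 | Plains zebra |
Ol Pejeta Conservancy, Kenya | Subtotal | | 29,268 | |
Mpala Research Center, Kenya | 1 | 1/12/23 | 16,891 | Reticulated giraffe |
Mpala Research Center, Kenya | 2 | 1/17/23 | 11,165 | Plains zebra |
Mpala Research Center, Kenya | 3 | 1/18/23 | 17,940 | Grevy's zebra |
Mpala Research Center, Kenya | 4 | 1/20/23 | 33,960 | Grevy's zebra |
Mpala Research Center, Kenya | 5 | 1/21/23 | 24,106 | Giraffe, Plains and Grevy's zebras |
Mpala Research Center, Kenya | Subtotal | | 104,062 | |
The Wilds Conservation Center, USA | 1 | 6/14/24 | 13,749 | African wild dog |
The Wilds Conservation Center, USA | 2 | 7/31/24 | 3,436 | Reticulated and Masai giraffe |
The Wilds Conservation Center, USA | 3 | 4/18/24 | 4,053 | Persian onager |
The Wilds Conservation Center, USA | 4 | 7/31/24 | 506 | Grevy's zebra |
The Wilds Conservation Center, USA | Subtotal | | 21,744 | |
All locations | Total | | 155,074 | |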
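The subtotals and the overall total can be cross-checked from the per-session frame counts. The sketch below copies the counts from the table; the dictionary layout and variable names are illustrative choices.

```python
# Frames per session, copied from the table above (in session order).
frames = {
    "Ol Pejeta Conservancy": [16_726, 12_542],
    "Mpala Research Center": [16_891, 11_165, 17_940, 33_960, 24_106],
    "The Wilds Conservation Center": [13_749, 3_436, 4_053, 506],
}

# Sum frames per location, then across all locations.
subtotals = {site: sum(counts) for site, counts in frames.items()}
total_frames = sum(subtotals.values())
total_sessions = sum(len(counts) for counts in frames.values())

for site, subtotal in subtotals.items():
    share = subtotal / total_frames
    print(f"{site}: {subtotal:,} frames ({share:.1%})")
print(f"Total: {total_frames:,} frames in {total_sessions} sessions")
```

The sums reproduce the stated subtotals and the 155,074-frame, 11-session total. They also show that roughly two thirds of the frames come from Mpala, so per-location evaluation may be more informative than a single pooled score.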
The dataset is split into three main collections that are available on HuggingFace:
If you use this dataset or model in your research, please cite our paper:
We thank Samuel Mutisya and William Njoroge at Ol Pejeta Conservancy for their support in facilitating the WildDrone 2025 Hackathon, and Dan Beetem and The Wilds for their support in collecting the data. This work was supported by WildDroneEU (EU Horizon Europe MSCA grant 101071224), the Imageomics Institute (NSF HDR Award 2118240), and ICICLE (NSF grant OAC-2112606).
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.