Figure 1. Representative image from the dataset showing [describe what’s shown].
Add 2-3 paragraphs describing:
This dataset supports the following computer vision and ecological analysis tasks:
List those that apply; non-exhaustive suggestions by topic are given below.
🤖 Computer Vision Tasks:
🌿 Ecological Applications:
🤖 Robotics Applications:
Benchmark Results:
[If available, provide baseline performance metrics]
| Method | Detection mAP@50 | Tracking MOTA | Behavior Acc | Reference |
|---|---|---|---|---|
| YOLOv8 | 0.XX | - | - | [link] |
| … | … | … | … | … |
The suggested dataset structure for full drone data is provided below; modify as needed based on the tasks or applications supported by your data.
dataset/
├── images/
│   ├── train/
│   │   ├── rgb/
│   │   │   └── {mission_id}_{frame_id}.jpg
│   │   └── thermal/                 # if applicable
│   ├── val/
│   └── test/
├── annotations/
│   ├── train/
│   │   ├── coco_format.json
│   │   ├── yolo_format/
│   │   └── tracking/                # if applicable
│   ├── val/
│   └── test/
├── telemetry/                       # if available
│   └── {mission_id}_telemetry.csv
├── metadata/
│   ├── darwin_core_events.csv       # 🌿 Darwin Core Event records
│   ├── darwin_core_occurrences.csv  # 🌿 Darwin Core Occurrence records
│   ├── missions.csv                 # Mission-level metadata
│   ├── sensors.json                 # Sensor specifications
│   └── species_info.json            # Taxonomic information
└── README.md
Images:
Naming Convention:
{mission_id}_{frame_number}[_{sensor_type}].{extension}
Example: AWS2024-045_003821.jpg
└─mission─┘ └frame─┘
Mission ID: Unique identifier for each flight (format: [prefix]-[number])
Frame Number: Sequential frame number within mission (zero-padded to 6 digits)
Sensor Type: Optional suffix for multi-sensor data (_rgb, _thermal, etc.)
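File names that follow this convention can be split back into their parts with a small helper. The sketch below is one possible approach, not part of the standard: the regular expression, the `FrameName` type, and the `parse_frame_name` function are illustrative assumptions, and the pattern assumes an alphanumeric prefix, a numeric mission number, and a lowercase sensor suffix.

```python
import re
from typing import NamedTuple, Optional


class FrameName(NamedTuple):
    mission_id: str
    frame_number: int
    sensor_type: Optional[str]
    extension: str


# Assumed pattern: "<prefix>-<number>_<6-digit frame>[_<sensor>].<extension>"
FRAME_RE = re.compile(
    r"^(?P<mission>[A-Za-z0-9]+-\d+)"
    r"_(?P<frame>\d{6})"
    r"(?:_(?P<sensor>[a-z]+))?"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_frame_name(file_name: str) -> FrameName:
    """Split a file name into mission ID, frame number, sensor type, extension."""
    match = FRAME_RE.match(file_name)
    if match is None:
        raise ValueError(f"not a valid frame file name: {file_name!r}")
    return FrameName(
        mission_id=match["mission"],
        frame_number=int(match["frame"]),
        sensor_type=match["sensor"],  # None when the optional suffix is absent
        extension=match["ext"],
    )


parsed = parse_frame_name("AWS2024-045_003821.jpg")
print(parsed.mission_id, parsed.frame_number, parsed.sensor_type)
```

If your mission IDs or sensor suffixes use other characters, adjust the pattern rather than the file names.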
Temporal Information:
Provide more details about the included metadata.
Darwin Core Event records (metadata/darwin_core_events.csv):
⚠️ Required fields for minimal Darwin Core compliance are provided below; add any other available information you can:
| Field | Type | Description | Example |
|---|---|---|---|
| eventID | string | Unique identifier for sampling event | “AWS2024-045” |
| eventDate | date | Date of survey (ISO 8601) | “2024-03-15” |
| eventTime | time | Time of survey start | “06:30:00+03:00” |
| decimalLatitude | float | Latitude in decimal degrees (WGS84) | -2.3456 |
| decimalLongitude | float | Longitude in decimal degrees (WGS84) | 34.8123 |
| coordinateUncertaintyInMeters | integer | GPS precision | 5 |
| geodeticDatum | string | Coordinate system | “WGS84” |
| locality | string | Named location | “Serengeti National Park, Sector A3” |
| habitat | string | Habitat type | “Open savanna with scattered Acacia” |
| samplingProtocol | string | Survey method | “UAV transect at 60m AGL, 5m/s” |
| sampleSizeValue | float | Area/duration surveyed | 250 |
| sampleSizeUnit | string | Unit for sample size | “hectares” |
| samplingEffort | string | Effort description | “45 min flight, 70% overlap” |
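Before publishing, it can help to check the events file against the required fields listed in this table. The following is a minimal sketch using only the Python standard library; the `check_event_rows` function and its messages are assumptions for illustration, not an official Darwin Core validator, and it only checks for missing columns, empty values, and coordinate ranges.

```python
import csv
import io

# Required Event fields, copied from the table in this section.
REQUIRED_EVENT_FIELDS = [
    "eventID", "eventDate", "eventTime", "decimalLatitude", "decimalLongitude",
    "coordinateUncertaintyInMeters", "geodeticDatum", "locality", "habitat",
    "samplingProtocol", "sampleSizeValue", "sampleSizeUnit", "samplingEffort",
]


def check_event_rows(csv_file):
    """Return a list of human-readable problems found in an events CSV."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    problems = []
    missing = [f for f in REQUIRED_EVENT_FIELDS if f not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    # Data rows are numbered from 2 because row 1 is the header.
    for row_no, row in enumerate(reader, start=2):
        for field in REQUIRED_EVENT_FIELDS:
            if field in header and not (row.get(field) or "").strip():
                problems.append(f"row {row_no}: empty {field}")
        lat_text = row.get("decimalLatitude")
        lon_text = row.get("decimalLongitude")
        if lat_text and lon_text:
            try:
                lat, lon = float(lat_text), float(lon_text)
            except ValueError:
                problems.append(f"row {row_no}: non-numeric coordinates")
                continue
            if not (-90 <= lat <= 90 and -180 <= lon <= 180):
                problems.append(f"row {row_no}: coordinates out of range")
    return problems


# Usage with an in-memory file; in practice, open metadata/darwin_core_events.csv.
demo = io.StringIO("eventID,eventDate\nAWS2024-045,2024-03-15\n")
for problem in check_event_rows(demo):
    print(problem)
```

An empty result only means these basic checks passed; it does not establish full Darwin Core compliance.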
Optional but recommended:
- minimumElevationInMeters: Ground elevation at site
- weather: Weather conditions during survey
- fieldNotes: Additional observations

Darwin Core Occurrence records (metadata/darwin_core_occurrences.csv):
⚠️ Required fields for minimal Darwin Core compliance are provided below; add any other available information you can:
| Field | Type | Description | Example |
|---|---|---|---|
| occurrenceID | string | Unique observation identifier | “AWS2024-045_003821_001” |
| eventID | string | Links to Event record | “AWS2024-045” |
| scientificName | string | Full scientific name | “Loxodonta africana (Blumenbach, 1797)” |
| kingdom | string | Taxonomic kingdom | “Animalia” |
| phylum | string | Taxonomic phylum | “Chordata” |
| class | string | Taxonomic class | “Mammalia” |
| order | string | Taxonomic order | “Proboscidea” |
| family | string | Taxonomic family | “Elephantidae” |
| genus | string | Taxonomic genus | “Loxodonta” |
| species | string | Specific epithet | “africana” |
| taxonRank | string | Rank of identification | “species” |
| individualCount | integer | Number of individuals | 12 |
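Because each Occurrence record points to an Event record through `eventID`, a quick referential check can catch broken links before release. The sketch below is illustrative only: the function names `find_orphan_occurrences` and `individuals_per_taxon` are assumptions, and the second mission ID in the demo data is hypothetical.

```python
import csv
import io
from collections import Counter


def find_orphan_occurrences(events_file, occurrences_file):
    """Return occurrenceIDs whose eventID has no matching Event record."""
    event_ids = {row["eventID"] for row in csv.DictReader(events_file)}
    return [
        row["occurrenceID"]
        for row in csv.DictReader(occurrences_file)
        if row["eventID"] not in event_ids
    ]


def individuals_per_taxon(occurrences_file):
    """Sum individualCount for each scientificName."""
    totals = Counter()
    for row in csv.DictReader(occurrences_file):
        totals[row["scientificName"]] += int(row["individualCount"])
    return dict(totals)


# In-memory demo data; "AWS2024-046" is a hypothetical mission with no Event row.
EVENTS_CSV = "eventID\nAWS2024-045\n"
OCCURRENCES_CSV = (
    "occurrenceID,eventID,scientificName,individualCount\n"
    'AWS2024-045_003821_001,AWS2024-045,"Loxodonta africana (Blumenbach, 1797)",12\n'
    'AWS2024-046_000001_001,AWS2024-046,"Loxodonta africana (Blumenbach, 1797)",3\n'
)

print(find_orphan_occurrences(io.StringIO(EVENTS_CSV), io.StringIO(OCCURRENCES_CSV)))
print(individuals_per_taxon(io.StringIO(OCCURRENCES_CSV)))
```

The per-taxon totals can also serve as a cross-check against the species distribution table reported elsewhere in the card.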
Optional but recommended:
- lifeStage: “adult”, “juvenile”, “calf”
- sex: “male”, “female”, “undetermined”
- behavior: Behavioral state during observation
- occurrenceRemarks: Additional notes

Modify the example CV annotation format provided below to match the details of your data.
COCO Format (annotations/train/coco_format.json):
{
  "info": {
    "description": "[Dataset name and description]",
    "version": "1.0",
    "year": 2024,
    "date_created": "2024-01-15"
  },
  "licenses": [...],
  "images": [
    {
      "id": 1,
      "file_name": "AWS2024-045_003821.jpg",
      "width": 5280,
      "height": 2970,
      "date_captured": "2024-03-15T06:35:42+03:00",
      "mission_id": "AWS2024-045",
      "altitude_m": 60,
      "gsd_cm_per_px": 1.5
    }
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 1,
      "category_id": 1,
      "bbox": [x, y, width, height],
      "area": 12543.5,
      "iscrowd": 0,
      "attributes": {
        "occlusion": "none",
        "truncation": false,
        "life_stage": "adult",
        "behavior": "foraging",
        "group_size": 12,
        "confidence": 0.95
      },
      "occurrence_id": "AWS2024-045_003821_001" # Links to Darwin Core
    }
  ],
  "categories": [
    {
      "id": 1,
      "name": "african_elephant",
      "supercategory": "mammal",
      "scientific_name": "Loxodonta africana"
    }
  ]
}
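Since the suggested structure holds both COCO and YOLO annotations, the two should stay consistent. The sketch below shows one way to derive YOLO label lines from a COCO dictionary. It assumes COCO boxes are `[x_min, y_min, width, height]` in pixels and that YOLO class indices are assigned by sorted category ID; the function names and the bounding box values in the demo are hypothetical.

```python
def coco_bbox_to_yolo(bbox, image_width, image_height):
    """Convert a COCO pixel box [x_min, y_min, width, height] to a YOLO box
    (x_center, y_center, width, height) normalized to the image size."""
    x_min, y_min, box_w, box_h = bbox
    return (
        (x_min + box_w / 2) / image_width,
        (y_min + box_h / 2) / image_height,
        box_w / image_width,
        box_h / image_height,
    )


def coco_to_yolo_lines(coco):
    """Map each image file name to its list of 'class x y w h' label lines."""
    images = {img["id"]: img for img in coco["images"]}
    # YOLO class indices are zero-based; assign them by sorted COCO category id.
    ordered = sorted(coco["categories"], key=lambda cat: cat["id"])
    class_index = {cat["id"]: i for i, cat in enumerate(ordered)}
    lines = {img["file_name"]: [] for img in coco["images"]}
    for ann in coco["annotations"]:
        img = images[ann["image_id"]]
        xc, yc, w, h = coco_bbox_to_yolo(ann["bbox"], img["width"], img["height"])
        cls = class_index[ann["category_id"]]
        lines[img["file_name"]].append(f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines


# Demo with the image size from the example above and a hypothetical box.
demo_coco = {
    "images": [{"id": 1, "file_name": "AWS2024-045_003821.jpg",
                "width": 5280, "height": 2970}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1,
                     "bbox": [2640, 1485, 528, 297]}],
    "categories": [{"id": 1, "name": "african_elephant"}],
}
print(coco_to_yolo_lines(demo_coco))
```

Custom keys such as `attributes` and `occurrence_id` have no place in YOLO label files, so keep the COCO file as the source of truth if you rely on them.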
Tracking Format (if applicable):
MOT Challenge format with species labels:
{frame_number},{object_id},{bbox_left},{bbox_top},{bbox_width},{bbox_height},{confidence},{species_id},{life_stage},{behavior}
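A tracking file in this ten-field layout can be read with a few lines of Python. The sketch below is an assumption-laden example rather than a reference parser: the `TrackRecord` class and both function names are illustrative, and it assumes numeric `species_id` values and no commas inside the `life_stage` or `behavior` fields.

```python
from dataclasses import dataclass


@dataclass
class TrackRecord:
    frame_number: int
    object_id: int
    bbox_left: float
    bbox_top: float
    bbox_width: float
    bbox_height: float
    confidence: float
    species_id: int
    life_stage: str
    behavior: str


def parse_track_line(line: str) -> TrackRecord:
    """Parse one comma-separated tracking line into a TrackRecord."""
    parts = [part.strip() for part in line.split(",")]
    if len(parts) != 10:
        raise ValueError(f"expected 10 comma-separated fields, got {len(parts)}")
    return TrackRecord(
        frame_number=int(parts[0]),
        object_id=int(parts[1]),
        bbox_left=float(parts[2]),
        bbox_top=float(parts[3]),
        bbox_width=float(parts[4]),
        bbox_height=float(parts[5]),
        confidence=float(parts[6]),
        species_id=int(parts[7]),
        life_stage=parts[8],
        behavior=parts[9],
    )


def group_by_track(lines):
    """Group records by object_id, each track sorted by frame number."""
    tracks = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = parse_track_line(line)
        tracks.setdefault(record.object_id, []).append(record)
    for records in tracks.values():
        records.sort(key=lambda rec: rec.frame_number)
    return tracks


# Hypothetical lines for illustration only.
demo_lines = [
    "3822,7,102.0,201.0,50.0,40.0,0.93,1,adult,foraging",
    "3821,7,100.0,200.0,50.0,40.0,0.95,1,adult,foraging",
]
for object_id, records in group_by_track(demo_lines).items():
    print(object_id, [rec.frame_number for rec in records])
```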
| Split | Images | Annotations | Species Coverage | Temporal Coverage |
|---|---|---|---|---|
| Train | X,XXX | XX,XXX | All species | All seasons |
| Validation | X,XXX | XX,XXX | All species | Stratified |
| Test | X,XXX | XX,XXX | All species | Held-out missions |
Split Methodology:
Describe how splits were created, e.g.:
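If whole missions are held out, as the split table suggests for the test set, one simple way to keep every frame of a mission in the same split is to hash the mission ID. This is only a sketch of that idea, not a prescribed method: the function names and the default 10% validation and test fractions are assumptions, and the second mission ID in the demo is hypothetical.

```python
import hashlib


def assign_split(mission_id: str, val_fraction: float = 0.1,
                 test_fraction: float = 0.1) -> str:
    """Deterministically map a mission ID to 'train', 'val', or 'test'."""
    digest = hashlib.sha256(mission_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    if bucket < test_fraction:
        return "test"
    if bucket < test_fraction + val_fraction:
        return "val"
    return "train"


def split_files(file_names, val_fraction: float = 0.1, test_fraction: float = 0.1):
    """Group file names by split, keeping each mission's frames together."""
    splits = {"train": [], "val": [], "test": []}
    for name in file_names:
        # The mission ID is everything before the first underscore.
        mission_id = name.split("_", 1)[0]
        splits[assign_split(mission_id, val_fraction, test_fraction)].append(name)
    return splits


demo_files = [
    "AWS2024-045_003821.jpg",
    "AWS2024-045_003822.jpg",
    "AWS2024-046_000001.jpg",
]
for split_name, names in split_files(demo_files).items():
    print(split_name, names)
```

Hash-based assignment only approximates the requested fractions when there are many missions, and it does not stratify by species or season; report the method you actually used.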
Type: [UAV / AUV / ROV / USV / UGV]
Hardware:
Autonomy:
Payload:
Primary Sensor: [Name]
Spectral Bands (if applicable):
| Band | Wavelength (nm) | Purpose |
|---|---|---|
| Red | 590-700 | Vegetation health |
| … | … | … |
Calibration:
Synchronization (for multi-sensor):
Flight Specifications:
Telemetry Data:
Environmental Conditions:
⚠️ Full sampling protocol description:
[Fill in this detailed description following Barnas et al. (2020) reporting standards:]
Permits Obtained:
Regulations Followed:
Ethics Approval:
Animal Welfare Protocol:
Describe the motivation for creating this dataset:
Field Collection (describe what YOU did):
Software and Tools Used:
Field Team:
Local Collaboration:
🤖 Annotation Method:
Tools Used:
Annotation Guidelines:
Quality Control:
Annotation Coverage:
Annotator Team:
Subject Matter Experts:
⚠️ Privacy and Security Considerations:
Human Subjects:
Endangered Species:
Cultural Sensitivity:
Security:
Species Distribution:
| Species (Scientific Name) | Common Name | Train | Val | Test | Total |
|---|---|---|---|---|---|
| Species name | Common name | XXX | XX | XX | XXX |
| … | … | … | … | … | … |
Class Balance:
Image Characteristics:
Detection Difficulty Metadata:
| Difficulty Factor | Easy (%) | Medium (%) | Hard (%) |
|---|---|---|---|
| Occlusion | XX | XX | XX |
| Crowd density | XX | XX | XX |
| Scale (small objects) | XX | XX | XX |
| Weather/lighting | XX | XX | XX |
⚠️ Known Biases:
Technical Limitations:
Ethical Limitations:
Best Practices for Using This Dataset:
What This Dataset Should NOT Be Used For:
Associated Datasets (if applicable):
Synchronization:
Dataset License: Full license name
Citation Requirement: If you use this dataset, you MUST cite both the dataset and associated paper (see Citation section).
Image Licensing: [If different from compilation]
See metadata/licenses.csv for per-image license information.
Code License: [If releasing code alongside data]
If you use this dataset, please cite:
Dataset:
@misc{yourdataset2024,
author = {Last, First and Last, First},
title = {Dataset Title},
year = {2024},
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/your-org/your-dataset},
doi = {10.XXXX/XXXXX}
}
Paper:
@article{yourpaper2024,
title = {Paper Title},
author = {Last, First and Last, First},
journal = {Journal Name},
year = {2024},
volume = {X},
pages = {XX-XX},
doi = {10.XXXX/XXXXX}
}
FAIR² Drone Data Standard:
@article{kline2025wildfair,
title = {FAIR² Drones: An AI-Ready Standard for Cross-Domain Wildlife Drone Datasets},
author = {Kline, Jenna and others},
year = {2025},
doi = {10.XXXX/XXXXX}
}
This work was supported by [funding source].
We thank:
This work was supported by the Imageomics Institute, which is funded by the US National Science Foundation’s Harnessing the Data Revolution (HDR) program under Award #2118240 (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Conservation Partners:
Data Collection Permits: [List major permits with gratitude to issuing authorities]
🤖 AI-Readiness Validation:
🌿 Darwin Core Validation:
⚠️ FAIR² Compliance Checklist:
Data Loading:
# Example: Load dataset using Hugging Face datasets library
from datasets import load_dataset

dataset = load_dataset("your-org/your-dataset")

# Access images and annotations
train_data = dataset["train"]
for sample in train_data:
    image = sample["image"]
    annotations = sample["annotations"]
    # Your code here
Visualization Tools:
Evaluation Scripts:
- AGL: Above Ground Level - altitude measured from terrain surface below
- Darwin Core: Biodiversity data standard maintained by TDWG
- FAIR²: Extension of FAIR principles for AI-ready datasets
- GSD: Ground Sampling Distance - real-world size represented by one pixel
- UAV: Unmanned Aerial Vehicle (drone)
- TDWG: Biodiversity Information Standards organization
[List names of card authors]
For questions about this dataset:
Version History:
This dataset card follows the FAIR² Drone Data Standard (Kline et al., 2025) and extends the Imageomics dataset card template.