BioCLIP Data | TreeOfLife Datasets and Benchmarks

Which TreeOfLife dataset or benchmark do you need?

Largest Available

I need the largest available TreeOfLife dataset with images paired to their taxonomic labels.

Go to TreeOfLife-200M →

Image Captions

I want to train a model with morphological image captions for each image.

Go to TreeOfLife-10M Captions →

Original

I need the original TreeOfLife-10M dataset described in the 2024 CVPR paper.

Go to TreeOfLife-10M →

Rare Species Benchmark

I want to evaluate my model's performance on the Rare Species Benchmark presented in the 2024 CVPR paper.

Go to Rare Species Benchmark →

Camera Trap Benchmark

I want to evaluate my model's performance on the IDLE-OO Camera Trap Benchmark presented in the 2025 NeurIPS paper.

Go to IDLE-OO Camera Trap Benchmark →

All Data

I want to explore or download datasets or benchmarks used to train and evaluate the BioCLIP models.

Go to HF Collection →

TreeOfLife Datasets

The TreeOfLife datasets are curated collections of images representing a wide range of biological taxa, paired with their corresponding taxonomic labels. These datasets are designed to facilitate the training and evaluation of vision-based knowledge-guided biological foundation models.

Latest Release

TreeOfLife-200M

With nearly 214 million images representing more than 950 thousand taxa across the tree of life, TreeOfLife-200M is, at release, the largest and most diverse public ML-ready dataset for computer vision models in biology. It combines images and metadata from four core biodiversity data providers: the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EOL), BIOSCAN-5M, and FathomNet. Together they more than double the number of unique taxa covered by TreeOfLife-10M. TreeOfLife-200M also contains 50 million more images than BioTrove, with nearly triple the unique taxa.

TreeOfLife-200M also increases image context diversity: museum specimen, camera trap, and citizen science images are all well represented. Our rigorous curation process ensures that each image has the most specific taxonomic label possible and that the overall dataset provides a well-rounded foundation for training BioCLIP 2 and future biology foundation models.

Image Count: 213.9 million images
Unique Taxa: 952,257 unique 7-rank taxa strings
Image Types: Museum specimen, camera traps, citizen science, drawings (not labeled)
Sources: Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EOL), BIOSCAN-5M, FathomNet
Best for: Large foundation model training.
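
The "unique 7-rank taxa strings" figure above can be illustrated with a minimal sketch. It assumes each image record carries the seven Linnaean ranks as separate fields; the field names, the `|` separator, and the helper functions are hypothetical and are not taken from the dataset card.

```python
from typing import Iterable, Mapping, Optional

# Assumption: a "7-rank taxa string" spans these seven Linnaean ranks.
RANKS = ("kingdom", "phylum", "class", "order", "family", "genus", "species")


def taxon_string(record: Mapping[str, Optional[str]]) -> str:
    """Join the seven ranks into one string; missing ranks become empty."""
    return "|".join((record.get(rank) or "").strip() for rank in RANKS)


def count_unique_taxa(records: Iterable[Mapping[str, Optional[str]]]) -> int:
    """Count distinct 7-rank strings across all image records."""
    return len({taxon_string(record) for record in records})


magpie = {
    "kingdom": "Animalia", "phylum": "Chordata", "class": "Aves",
    "order": "Passeriformes", "family": "Corvidae",
    "genus": "Pica", "species": "hudsonia",
}
# A record labeled only to genus counts as its own taxon string.
genus_only = {**magpie, "species": None}

records = [magpie, dict(magpie), genus_only]
print(count_unique_taxa(records))  # 2
```

Two images of the same species collapse to one string, while an image labeled only to genus level produces a separate, shorter string.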

Visit BioCLIP 2 Site | Dataset Card

BioCLIP 2 model visualization showing the model architecture, a clustered embedding plot with organism thumbnails, showing the separation by age and sex orthogonal to the species axis

Captioned Dataset

TreeOfLife-10M Captions

A dataset of 10 million generated captions for the images in TreeOfLife-10M, together with the Wikipedia-derived descriptions and format examples used to produce them. The captions were generated with InternVL3-38B, which was given biological context to help it produce more accurate captions. The dataset was used to train BioCAP, a CLIP-based model, and is designed to enhance the training of vision models by providing rich, contextual information about each image in the original TreeOfLife-10M dataset.

10 Million Captions: A diverse set of captions generated for a wide range of biological images.
Contextual Information: Captions provide rich context to enhance model understanding.
Training Resource: Specifically designed to improve training for vision models like BioCAP.

Visit BioCAP Site | Dataset Card

BioCAP data generation pipeline visualization showing the caption generation process with a format example and image provided to an MLLM along with a morphological species-level description from Wikipedia; inaccurate output has an X and correct has a checkmark

Original Dataset

TreeOfLife-10M (Original)

The original TreeOfLife dataset, presented in "BioCLIP: A Vision Foundation Model for the Tree of Life". With over 10 million images covering 454 thousand taxa in the tree of life, TreeOfLife-10M was the largest ML-ready dataset to date of images of biological organisms paired with their associated taxonomic labels.

It expanded on the foundation established by existing high-quality datasets, such as iNat21 and BIOSCAN-1M, by further incorporating newly curated images from the Encyclopedia of Life (eol.org), which supplies most of TreeOfLife-10M’s data diversity. Every image in TreeOfLife-10M is labeled to the most specific taxonomic level possible, as well as higher taxonomic ranks in the tree of life. TreeOfLife-10M was generated for the purpose of training BioCLIP and future biology foundation models.

Image Count: 10,065,576 images
Unique Taxa: 454,103 unique 7-rank taxa strings
Text Types: Common names (black-billed magpie), scientific names (Pica hudsonia), and taxonomic names (Animalia Chordata Aves Passeriformes Corvidae Pica hudsonia)
Image Types (not labeled): Museum specimen, citizen science, drawings
Sources: EOL, BIOSCAN-1M, iNat21
Best for: Smaller foundation model or distilled model training.
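
The three text types listed above can be sketched as follows, using the black-billed magpie example from the list. The function names and the dictionary layout are hypothetical; the sketch only assumes that the taxonomic name is the seven ranks joined by spaces and that the scientific name is the last two ranks (genus and species).

```python
from typing import Dict, Optional, Sequence


def taxonomic_name(ranks: Sequence[str]) -> str:
    """Join all available ranks, kingdom first, with single spaces."""
    return " ".join(rank for rank in ranks if rank)


def scientific_name(ranks: Sequence[str]) -> str:
    """Assumption: the last two of the seven ranks are genus and species."""
    return " ".join(rank for rank in ranks[-2:] if rank)


def text_types(ranks: Sequence[str], common: Optional[str] = None) -> Dict[str, str]:
    """Build the text variants for one image; skip a missing common name."""
    texts = {
        "scientific": scientific_name(ranks),
        "taxonomic": taxonomic_name(ranks),
    }
    if common:
        texts["common"] = common
    return texts


magpie_ranks = ["Animalia", "Chordata", "Aves", "Passeriformes",
                "Corvidae", "Pica", "hudsonia"]
for kind, text in text_types(magpie_ranks, "black-billed magpie").items():
    print(f"{kind}: {text}")
```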

Visit BioCLIP Site | Dataset Card

Treemap of the TreeOfLife-10M dataset from phylum down to family. The largest phyla are Arthropoda, Tracheophyta, and Chordata, and their most represented classes are Insecta, Magnoliopsida, and Aves, respectively

TreeOfLife Benchmark Datasets

The TreeOfLife benchmark datasets are curated collections of images representing a wide range of biological taxa, paired with their corresponding taxonomic labels. These benchmarks are designed to provide biologically and ecologically relevant evaluation tasks.

Camera Trap Benchmark

IDLE-OO Camera Traps

IDLE-OO Camera Traps is a 5-dataset benchmark of camera trap images from the Labeled Information Library of Alexandria: Biology and Conservation (LILA BC), with a total of 2,586 images for species classification. Each of the 5 datasets is balanced so that every species within it has the same number of images. The datasets contain between 310 and 1,120 images each and represent between 16 and 39 species.

Image Count: 2,586 images
Unique Species: 96 species across 5 datasets
Image Types: Camera trap images
Sources: LILA BC: Desert Lion Conservation Camera Traps, ENA24-detection, Island Conservation Camera Traps, Ohio Small Animals, Orinoquia Camera Traps
Best for: Evaluating model performance on real-world camera trap data.
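
The balancing property described above (the same number of images for every species within a dataset) can be sketched as below. This is an illustrative approach under stated assumptions, not the authors' curation procedure: the `(image_id, species)` input format, the `per_species` parameter, and the choice to drop species with too few images are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def balance_by_species(
    items: List[Tuple[str, str]], per_species: int, seed: int = 0
) -> List[Tuple[str, str]]:
    """Sample exactly `per_species` images for each species.

    Species with fewer than `per_species` images are dropped
    (an assumption made for this sketch).
    """
    by_species: Dict[str, List[str]] = defaultdict(list)
    for image_id, species in items:
        by_species[species].append(image_id)

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced: List[Tuple[str, str]] = []
    for species in sorted(by_species):
        image_ids = by_species[species]
        if len(image_ids) < per_species:
            continue
        for image_id in rng.sample(image_ids, per_species):
            balanced.append((image_id, species))
    return balanced


# Placeholder labels: 5 images of "a", 3 of "b", 1 of "c".
items = (
    [(f"a{i}", "a") for i in range(5)]
    + [(f"b{i}", "b") for i in range(3)]
    + [("c0", "c")]
)
balanced = balance_by_species(items, per_species=2)
print(len(balanced))  # 4
```

With a balanced dataset, overall accuracy and per-species mean accuracy coincide, so no species dominates the score.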

Visit BioCLIP 2 Site | Dataset Card

Rare Species Benchmark

The Rare Species Benchmark was generated alongside TreeOfLife-10M as a benchmark for BioCLIP. Data (images and text) were pulled from the Encyclopedia of Life (EOL) to build a dataset of rare species for zero-shot classification and more refined image classification tasks. Here, "rare species" means species listed on the International Union for Conservation of Nature (IUCN) Red List as Near Threatened, Vulnerable, Endangered, Critically Endangered, or Extinct in the Wild.

Image Count: 11,983 images
Unique Species: 400 species
Sources: Encyclopedia of Life (EOL) and International Union for Conservation of Nature (IUCN) Red List
Best for: Zero-shot classification
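
The "rare species" definition above amounts to a membership test on the IUCN Red List category. A minimal sketch, assuming each species arrives with its category spelled out as a string; the function name and the placeholder species are hypothetical, and the statuses shown are invented for illustration only.

```python
from typing import Dict, List

# The five IUCN Red List categories that the benchmark treats as "rare".
RARE_CATEGORIES = frozenset({
    "Near Threatened",
    "Vulnerable",
    "Endangered",
    "Critically Endangered",
    "Extinct in the Wild",
})


def rare_species(statuses: Dict[str, str]) -> List[str]:
    """Return the species whose IUCN category counts as rare, sorted by name."""
    return sorted(
        name for name, category in statuses.items()
        if category in RARE_CATEGORIES
    )


# Placeholder species with made-up categories, not real Red List data.
statuses = {
    "species A": "Least Concern",
    "species B": "Vulnerable",
    "species C": "Extinct in the Wild",
    "species D": "Data Deficient",
}
print(rare_species(statuses))  # ['species B', 'species C']
```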

Visit BioCLIP Site | Dataset Card

BioCLIP model visualization showing the model architecture