biobench.inat21

Trains a simple ridge regression classifier on a vision backbone's image representations for the iNat21 challenge. The challenge covers 10,000 species (classes). We train on the mini training set, which has 50 images per species, and test on the validation set, which has 10 images per species.
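
As a quick sanity check on these numbers, the split sizes and the chance-level accuracy follow from simple arithmetic. The variable names below are illustrative only, and the chance figure assumes the balanced classes described above.

```python
# Split sizes implied by the task description (names are illustrative).
NUM_SPECIES = 10_000
TRAIN_IMAGES_PER_SPECIES = 50  # "mini" training set
VAL_IMAGES_PER_SPECIES = 10    # validation set, used here for testing

n_train = NUM_SPECIES * TRAIN_IMAGES_PER_SPECIES
n_val = NUM_SPECIES * VAL_IMAGES_PER_SPECIES

# Accuracy of guessing uniformly at random over balanced classes.
chance_accuracy = 1 / NUM_SPECIES

print(n_train, n_val, chance_accuracy)  # 500000 100000 0.0001
```

Any reported accuracy should be read against that 0.01% chance baseline.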

This task is a benchmark: it should help you understand how general a vision backbone's representations are. This is not a true, real-world task.

If you use this task, be sure to cite the original iNat21 dataset paper:

@misc{inat2021,
  author={Van Horn, Grant and Mac Aodha, Oisin},
  title={iNat Challenge 2021 - FGVC8},
  publisher={Kaggle},
  year={2021},
  url={https://kaggle.com/competitions/inaturalist-2021}
}

`Dataset` ¶

Bases: ImageFolder

Subclasses ImageFolder so that __getitem__ includes the path, which we use as the ID.

`__getitem__(index)` ¶

Parameters:

Name	Type	Description	Default
`index`	`int`	Index	required

Returns:

Name	Type	Description
`tuple`	`tuple[str, object, object]`	(path, sample, target) where target is class_index of the target class.

Source code in src/biobench/inat21/__init__.py

def __getitem__(self, index: int) -> tuple[str, object, object]:
    """
    Args:
        index (int): Index

    Returns:
        tuple: (path, sample, target) where target is class_index of the target class.
    """
    path, target = self.samples[index]
    sample = self.loader(path)
    if self.transform is not None:
        sample = self.transform(sample)
    if self.target_transform is not None:
        target = self.target_transform(target)

    return path, sample, target
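
The effect of the override can be illustrated without torchvision. The sketch below uses a hypothetical stand-in class, `TinyFolder` (not part of torchvision or biobench), that only mimics the `samples`, `loader`, `transform`, and `target_transform` attributes used above. `WithPath` is likewise a made-up name for the subclassing pattern.

```python
class TinyFolder:
    """Hypothetical stand-in for ImageFolder (illustration only)."""

    def __init__(self, samples, loader, transform=None, target_transform=None):
        self.samples = samples  # list of (path, class_index) pairs
        self.loader = loader
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # Base behavior: the path is used for loading, then dropped.
        path, target = self.samples[index]
        sample = self.loader(path)
        if self.transform is not None:
            sample = self.transform(sample)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return sample, target


class WithPath(TinyFolder):
    """Returns (path, sample, target) so the path can serve as an ID."""

    def __getitem__(self, index):
        path, _ = self.samples[index]
        sample, target = super().__getitem__(index)
        return path, sample, target


# A fake loader and transform stand in for image decoding and preprocessing.
samples = [("a/0.jpg", 0), ("b/1.jpg", 1)]
ds = WithPath(samples, loader=lambda p: f"<pixels of {p}>", transform=str.upper)
path, sample, target = ds[1]
print(path, target)  # b/1.jpg 1
```

Because each file path is unique, it works as a stable per-image identifier when predictions are reported later.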

`benchmark(cfg)` ¶

Steps:

1. Get features for all images.
2. Select lambda using validation data.
3. Report score on test data.
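
The three steps can be sketched in plain Python on synthetic data. This is a minimal illustration under stated assumptions, not biobench's `get_features` or `init_clf`, whose internals are not shown here. The 2-D Gaussian "features", the lambda grid, the three-way split, and all helper names (`solve`, `fit_ridge`, `predict`, `accuracy`, `make_split`) are hypothetical. The classifier is closed-form ridge regression onto ±1 class indicators, with the bias folded in as a regularized constant feature for brevity.

```python
import random


def solve(a, b):
    """Solve a @ w = b (a: d x d, b: d x k) by Gauss-Jordan elimination."""
    d, k = len(a), len(b[0])
    m = [ra[:] + rb[:] for ra, rb in zip(a, b)]  # augmented matrix
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(d):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [[m[i][d + j] / m[i][i] for j in range(k)] for i in range(d)]


def fit_ridge(xs, ys, n_classes, lam):
    """Return W minimizing ||XW - Y||^2 + lam * ||W||^2, Y in {-1, +1}."""
    xs = [x + [1.0] for x in xs]  # constant feature acts as a bias
    d = len(xs[0])
    gram = [
        [sum(x[i] * x[j] for x in xs) + (lam if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]
    rhs = [
        [sum(x[i] * (1.0 if y == c else -1.0) for x, y in zip(xs, ys))
         for c in range(n_classes)]
        for i in range(d)
    ]
    return solve(gram, rhs)


def predict(w, x):
    x = x + [1.0]
    scores = [sum(xi * w[i][c] for i, xi in enumerate(x)) for c in range(len(w[0]))]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(w, xs, ys):
    return sum(predict(w, x) == y for x, y in zip(xs, ys)) / len(ys)


def make_split(rng, n_per_class):
    """Synthetic stand-in for backbone features: one Gaussian blob per class."""
    centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    xs, ys = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            xs.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            ys.append(label)
    return xs, ys


# 1. Get features (synthetic here).
rng = random.Random(0)
train_x, train_y = make_split(rng, 30)
val_x, val_y = make_split(rng, 10)
test_x, test_y = make_split(rng, 10)

# 2. Select lambda using validation data.
grid = [1e-3, 1e-1, 1e1]
best_lam, best_acc = None, -1.0
for lam in grid:
    acc = accuracy(fit_ridge(train_x, train_y, 3, lam), val_x, val_y)
    if acc > best_acc:
        best_lam, best_acc = lam, acc

# 3. Report score on test data.
w = fit_ridge(train_x, train_y, 3, best_lam)
test_acc = accuracy(w, test_x, test_y)
print(best_lam, test_acc)
```

On real backbone features the weight matrix would have one row per feature dimension and one column per species, so a library solver would be the practical choice. The explicit elimination here only keeps the sketch dependency-free.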

Source code in src/biobench/inat21/__init__.py

@beartype.beartype
def benchmark(cfg: config.Experiment) -> reporting.Report:
    """
    Steps:
    1. Get features for all images.
    2. Select lambda using validation data.
    3. Report score on test data.
    """
    backbone = registry.load_vision_backbone(cfg.model)

    # 1. Get features
    val_features = get_features(cfg, backbone, is_train=False)
    train_features = get_features(cfg, backbone, is_train=True)

    # 2. Fit model.
    clf = init_clf(cfg)
    clf.fit(train_features.x, train_features.y)

    # 3. Predict on the validation split and score each image (1.0 if correct, else 0.0).
    true_labels = val_features.y
    pred_labels = clf.predict(val_features.x)

    preds = [
        reporting.Prediction(
            str(image_id),
            float(pred == true),
            {"y_pred": pred.item(), "y_true": true.item()},
        )
        for image_id, pred, true in zip(val_features.ids, pred_labels, true_labels)
    ]

    return reporting.Report("inat21", preds, cfg)
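
Each prediction above carries a per-image score of 1.0 when the predicted species matches the true one and 0.0 otherwise, so the mean of those scores is top-1 accuracy. The sketch below illustrates this with a hypothetical `Prediction` dataclass whose field names (`id`, `score`, `info`) are assumptions that only mirror the three positional arguments used above. How `reporting.Report` actually aggregates scores is not shown on this page.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Prediction:
    """Hypothetical stand-in; field names are assumed, not taken from biobench."""

    id: str
    score: float
    info: dict = field(default_factory=dict)


ids = ["img_0.jpg", "img_1.jpg", "img_2.jpg", "img_3.jpg"]
pred_labels = [3, 1, 4, 1]
true_labels = [3, 1, 2, 1]

preds = [
    Prediction(str(image_id), float(pred == true), {"y_pred": pred, "y_true": true})
    for image_id, pred, true in zip(ids, pred_labels, true_labels)
]

# Mean of the 0/1 scores is the fraction of correctly classified images.
top1 = mean(p.score for p in preds)
print(top1)  # 0.75
```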
