Python API
bioclip.TreeOfLifeClassifier(**kwargs)
Bases: BaseClassifier
A classifier for predicting taxonomic ranks for images.
See BaseClassifier for details on **kwargs.
Source code in src/bioclip/predict.py
predict(images, rank, min_prob=1e-09, k=5)
Predicts probabilities at the supplied taxonomic rank for the given images, using the Tree of Life embeddings.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
images | List[str] \| str \| List[Image] | A list of image file paths, a single image file path, or a list of PIL Image objects. | required |
rank | Rank | The rank at which to make predictions (e.g., species, genus). | required |
min_prob | float | The minimum probability threshold for predictions. | 1e-09 |
k | int | The number of top predictions to return. | 5 |
Returns:
Type | Description |
---|---|
dict[str, dict[str, float]] | List[dict]: A list of dicts with keys "file_name", taxon ranks, "common_name", and "score". |
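The list of dicts described above can be post-processed with plain Python. The following is a minimal sketch, assuming each record carries the documented "file_name" and "score" keys; the lowercase "species" key used for the taxon rank is an assumption, and `best_per_file` is a hypothetical helper that is not part of bioclip.

```python
from typing import Dict, List


def best_per_file(predictions: List[dict]) -> Dict[str, dict]:
    """Return the highest-scoring record for each file_name (hypothetical helper)."""
    best: Dict[str, dict] = {}
    for record in predictions:
        name = record["file_name"]
        if name not in best or record["score"] > best[name]["score"]:
            best[name] = record
    return best


# Hand-written records standing in for real output. With bioclip installed,
# they would come from a call along the lines of:
#   from bioclip import TreeOfLifeClassifier, Rank
#   predictions = TreeOfLifeClassifier().predict(["bear.jpg"], Rank.SPECIES)
# The "species" key name is an assumption about how taxon ranks are keyed.
sample_predictions = [
    {"file_name": "bear.jpg", "species": "Ursus arctos", "common_name": "Brown Bear", "score": 0.75},
    {"file_name": "bear.jpg", "species": "Ursus americanus", "common_name": "Black Bear", "score": 0.25},
]

top = best_per_file(sample_predictions)
print(top["bear.jpg"]["species"], top["bear.jpg"]["score"])
```

Keeping the helper separate from the classifier call makes it easy to reuse on results loaded from disk.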
Source code in src/bioclip/predict.py
get_label_data()
Retrieves label data for the tree of life embeddings as a pandas DataFrame.
Returns:
Type | Description |
---|---|
DataFrame | pd.DataFrame: A DataFrame containing label data for TOL embeddings. |
Source code in src/bioclip/predict.py
create_taxa_filter(rank, user_values)
Creates a filter for taxa based on the specified rank and user-provided values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rank | Rank | The taxonomic rank to filter by. | required |
user_values | List[str] | A list of user-provided values to filter the taxa. | required |
Returns:
Type | Description |
---|---|
List[bool] | List[bool]: A list of boolean values indicating whether each entry in the label data matches any of the user-provided values. |
Raises:
Type | Description |
---|---|
ValueError | If any of the user-provided values are not found in the label data for the specified taxonomic rank. |
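The behavior described above (one boolean per label entry, and a ValueError for unknown values) can be illustrated without the library. This is a sketch of the described contract only, not the code in src/bioclip/predict.py; `make_taxa_filter` and the `genus_column` list are hypothetical stand-ins for the method and for one rank column of the label data.

```python
from typing import List


def make_taxa_filter(rank_labels: List[str], user_values: List[str]) -> List[bool]:
    """Build a keep/drop mask over rank_labels for the requested values."""
    known = set(rank_labels)
    # Reject any requested value that never occurs at this rank.
    missing = [value for value in user_values if value not in known]
    if missing:
        raise ValueError(f"Values not found in label data: {missing}")
    wanted = set(user_values)
    # One boolean per label entry, in the same order as the label data.
    return [label in wanted for label in rank_labels]


# Hypothetical genus column, one entry per embedding.
genus_column = ["Ursus", "Felis", "Ursus", "Canis"]
mask = make_taxa_filter(genus_column, ["Ursus", "Canis"])
print(mask)
```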
Source code in src/bioclip/predict.py
apply_filter(keep_labels_ary)
Filters the TOL embeddings based on the provided boolean array. See create_taxa_filter() for an easy way to create this parameter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
keep_labels_ary | List[bool] | A list of boolean values indicating which TOL embeddings to keep. | required |
Raises:
Type | Description |
---|---|
ValueError | If the length of keep_labels_ary does not match the expected length. |
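As a rough illustration of the contract above, the sketch below applies a boolean mask to a plain list and rejects a mask of the wrong length. It assumes the "expected length" is the number of stored embeddings; `keep_selected` is a hypothetical helper operating on lists, whereas the real method works on the classifier's own embeddings.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def keep_selected(items: Sequence[T], keep_labels_ary: List[bool]) -> List[T]:
    """Keep only the items whose mask entry is True."""
    if len(keep_labels_ary) != len(items):
        raise ValueError(
            f"Expected {len(items)} mask entries, got {len(keep_labels_ary)}"
        )
    return [item for item, keep in zip(items, keep_labels_ary) if keep]


# Hypothetical stand-ins for embeddings: any per-label objects would do.
embeddings = ["emb_ursus_a", "emb_felis", "emb_ursus_b", "emb_canis"]
kept = keep_selected(embeddings, [True, False, True, True])
print(kept)
```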
Source code in src/bioclip/predict.py
bioclip.Rank
Rank for the Tree of Life classification.
KINGDOM
PHYLUM
CLASS
ORDER
FAMILY
GENUS
SPECIES
bioclip.CustomLabelsClassifier(cls_ary, **kwargs)
Bases: BaseClassifier
A classifier that predicts from a list of custom labels for images.
Initializes the classifier with the given class array and additional keyword arguments.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cls_ary | List[str] | A list of class names as strings. | required |
Source code in src/bioclip/predict.py
predict(images, k=None)
Predicts the probabilities for the given images.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
images | List[str] \| str \| List[Image] | A list of image file paths, a single image file path, or a list of PIL Image objects. | required |
k | int | The number of top probabilities to return. If not specified or if greater than the number of classes, all probabilities are returned. | None |
Returns:
Type | Description |
---|---|
dict[str, float] | List[dict]: A list of dicts with keys "file_name" and the custom class labels. |
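The rule for k stated above (return everything when k is None or exceeds the number of classes) can be sketched in a few lines. This is an illustration of that rule rather than the library's implementation; `top_k_probs` is a hypothetical helper, and sorting the result by descending probability is a choice made here.

```python
from typing import Dict, Optional


def top_k_probs(probs: Dict[str, float], k: Optional[int] = None) -> Dict[str, float]:
    """Return the k most probable labels, or all of them when k is None or too large."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if k is None or k > len(ranked):
        return dict(ranked)
    return dict(ranked[:k])


# Hand-written probabilities for three custom labels.
label_probs = {"cat": 0.7, "dog": 0.2, "fish": 0.1}
print(top_k_probs(label_probs, k=2))
print(top_k_probs(label_probs))
```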
Source code in src/bioclip/predict.py
bioclip.CustomLabelsBinningClassifier(cls_to_bin, **kwargs)
Bases: CustomLabelsClassifier
A classifier that creates predictions for images based on custom labels, groups the labels, and calculates probabilities for each group.
Initializes the class with a dictionary mapping class labels to the bins (groups) they belong to.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cls_to_bin | dict | A dictionary where keys are class labels and values are the bins (groups) those labels belong to. | required |
**kwargs | | Additional keyword arguments passed to the superclass initializer. | {} |
Raises:
Type | Description |
---|---|
ValueError | If any value in |
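The grouping step described for this classifier can be sketched as follows. The example assumes that a bin's probability is the sum of the probabilities of its member labels, which the text above does not state explicitly; `bin_probabilities` is a hypothetical helper, not the library's method.

```python
from collections import defaultdict
from typing import Dict


def bin_probabilities(probs: Dict[str, float], cls_to_bin: Dict[str, str]) -> Dict[str, float]:
    """Sum per-label probabilities into their bins (assumed aggregation rule)."""
    binned: Dict[str, float] = defaultdict(float)
    for label, prob in probs.items():
        binned[cls_to_bin[label]] += prob
    return dict(binned)


# Hand-written mapping and probabilities for illustration.
label_to_bin = {"cat": "mammal", "dog": "mammal", "fish": "other"}
per_label = {"cat": 0.5, "dog": 0.25, "fish": 0.25}
print(bin_probabilities(per_label, label_to_bin))
```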
Source code in src/bioclip/predict.py
bioclip.predict.BaseClassifier(model_str=BIOCLIP_MODEL_STR, pretrained_str=None, device='cpu')
Bases: Module
Initializes the prediction model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_str | str | The string identifier for the model to be used. | BIOCLIP_MODEL_STR |
pretrained_str | str | The string identifier for the pretrained model to be loaded. | None |
device | Union[str, device] | The device on which the model will be run. | 'cpu' |
Source code in src/bioclip/predict.py
forward(x)
Given an input tensor representing multiple images, return probabilities for each class for each image.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | torch.Tensor | Input tensor representing the multiple images. | required |
Returns:
Type | Description |
---|---|
torch.Tensor | Softmax probabilities of the logits for each class for each image. |
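To make "softmax probabilities of the logits" concrete, here is a minimal pure-Python sketch of a row-wise softmax, with one row of logits per image. It illustrates the math only and uses nested lists in place of torch tensors; `softmax_rows` is a hypothetical helper, and how the real method derives its logits is not shown here.

```python
import math
from typing import List


def softmax_rows(logits: List[List[float]]) -> List[List[float]]:
    """Apply softmax independently to each row (one row per image)."""
    result = []
    for row in logits:
        # Subtracting the row maximum keeps exp() from overflowing
        # and does not change the resulting probabilities.
        shift = max(row)
        exps = [math.exp(value - shift) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


# Two images, three classes each (hand-written logits).
probs = softmax_rows([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
for row in probs:
    print([round(p, 3) for p in row])
```

Each output row sums to 1, and equal logits yield a uniform distribution over the classes.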
Source code in src/bioclip/predict.py