Python API
bioclip.TreeOfLifeClassifier(**kwargs)
Bases: BaseClassifier
A classifier for predicting taxonomic ranks for images.
See BaseClassifier for details on **kwargs.
Source code in src/bioclip/predict.py
predict(images, rank, min_prob=1e-09, k=5)
Predicts probabilities for supplied taxa rank for given images using the Tree of Life embeddings.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
images | List[str] \| str \| List[Image] | A list of image file paths, a single image file path, or a list of PIL Image objects. | required |
rank | Rank | The rank at which to make predictions (e.g., species, genus). | required |
min_prob | float | The minimum probability threshold for predictions. | 1e-09 |
k | int | The number of top predictions to return. | 5 |
Returns:
Type | Description |
---|---|
List[dict] | A list of dicts, each with the keys "file_name", the taxon ranks, "common_name", and "score". |
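To make the roles of `min_prob` and `k` concrete, here is a minimal pure-Python sketch of one way a probability threshold and a top-k cutoff can be combined. It is not the library's implementation: the helper name `top_predictions`, the choice to threshold before truncating, the inclusive `>=` comparison, and the example probabilities are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def top_predictions(
    probs: Dict[str, float], min_prob: float = 1e-9, k: int = 5
) -> List[Tuple[str, float]]:
    """Drop labels below min_prob, then return the k most probable (hypothetical helper)."""
    # Assumption: the threshold is inclusive and is applied before the top-k cut.
    kept = [(label, p) for label, p in probs.items() if p >= min_prob]
    # Sort by probability, highest first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:k]


# Made-up probabilities, for illustration only.
example = {
    "Ursus arctos": 0.91,
    "Ursus americanus": 0.08,
    "Canis lupus": 0.01,
    "Felis catus": 1e-12,
}
result = top_predictions(example, min_prob=1e-9, k=2)
```

With the default `k=5`, the same input would yield only three entries, because the fourth falls below the threshold.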
get_label_data()
Retrieves label data for the tree of life embeddings as a pandas DataFrame.
Returns:
Type | Description |
---|---|
DataFrame | A pandas DataFrame containing label data for the TOL embeddings. |
create_taxa_filter(rank, user_values)
Creates a filter for taxa based on the specified rank and user-provided values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rank | Rank | The taxonomic rank to filter by. | required |
user_values | List[str] | A list of user-provided values to filter the taxa. | required |
Returns:
Type | Description |
---|---|
List[bool] | A list of boolean values indicating whether each entry in the label data matches any of the user-provided values. |
Raises:
Type | Description |
---|---|
ValueError | If any of the user-provided values are not found in the label data for the specified taxonomic rank. |
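The documented behavior (one boolean per label entry, and a ValueError for unknown values) can be sketched in plain Python. This is an illustrative stand-in, not the library's code: `make_taxa_filter`, the list-of-dicts label data, the string rank keys, and the exact-match comparison are all assumptions.

```python
from typing import Dict, List


def make_taxa_filter(
    label_rows: List[Dict[str, str]], rank: str, user_values: List[str]
) -> List[bool]:
    """Return one boolean per label row: True if the row's value at `rank` was requested."""
    # Collect every value that occurs at this rank in the label data.
    known = {row[rank] for row in label_rows}
    # Assumption: matching is exact and case-sensitive.
    missing = [value for value in user_values if value not in known]
    if missing:
        raise ValueError(f"Unknown {rank} value(s): {', '.join(missing)}")
    wanted = set(user_values)
    return [row[rank] in wanted for row in label_rows]


# Made-up label data, for illustration only.
labels = [
    {"genus": "Ursus", "species": "arctos"},
    {"genus": "Canis", "species": "lupus"},
    {"genus": "Ursus", "species": "americanus"},
]
mask = make_taxa_filter(labels, "genus", ["Ursus"])
```

Here `mask` marks the first and third rows; requesting a genus that is absent from `labels` raises `ValueError`.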
apply_filter(keep_labels_ary)
Filters the TOL embeddings based on the provided boolean array. See create_taxa_filter() for an easy way to create this parameter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
keep_labels_ary | List[bool] | A list of boolean values indicating which TOL embeddings to keep. | required |
Raises:
Type | Description |
---|---|
ValueError | If the length of keep_labels_ary does not match the expected length. |
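As a rough illustration of mask-based filtering with a length check, the sketch below keeps only the entries whose mask position is True. It is a simplified stand-in under stated assumptions: `keep_selected` is a hypothetical helper, the embeddings are plain Python lists, and it returns a new list, whereas the documented method has no listed return value and presumably updates the classifier itself.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def keep_selected(embeddings: Sequence[T], keep_labels_ary: List[bool]) -> List[T]:
    """Return the embeddings whose corresponding mask entry is True (hypothetical helper)."""
    # The mask must have exactly one entry per embedding.
    if len(keep_labels_ary) != len(embeddings):
        raise ValueError(
            f"Expected {len(embeddings)} mask entries, got {len(keep_labels_ary)}"
        )
    return [emb for emb, keep in zip(embeddings, keep_labels_ary) if keep]


# Made-up two-dimensional embeddings, for illustration only.
embeddings = [[0.1, 0.9], [0.7, 0.3], [0.5, 0.5]]
filtered = keep_selected(embeddings, [True, False, True])
```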
bioclip.Rank
Rank for the Tree of Life classification.
KINGDOM
PHYLUM
CLASS
ORDER
FAMILY
GENUS
SPECIES
bioclip.CustomLabelsClassifier(cls_ary, **kwargs)
Bases: BaseClassifier
A classifier that predicts from a list of custom labels for images.
Initializes the classifier with the given class array and additional keyword arguments.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cls_ary | List[str] | A list of class names as strings. | required |
predict(images, k=None)
Predicts the probabilities for the given images.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
images | List[str] \| str \| List[Image] | A list of image file paths, a single image file path, or a list of PIL Image objects. | required |
k | int | The number of top probabilities to return. If not specified or if greater than the number of classes, all probabilities are returned. | None |
Returns:
Type | Description |
---|---|
List[dict] | A list of dicts with keys "file_name" and the custom class labels. |
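The rule for `k` (return everything when `k` is None or exceeds the number of classes) can be sketched with the standard library's `heapq.nlargest`. The helper name `top_k_or_all` and the example scores are hypothetical; this shows the selection rule only, not how the library computes or formats its results.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def top_k_or_all(
    probs: Dict[str, float], k: Optional[int] = None
) -> List[Tuple[str, float]]:
    """Return the k most probable classes, or all of them if k is None or too large."""
    if k is None or k > len(probs):
        k = len(probs)
    # nlargest returns the selected items ordered from highest to lowest probability.
    return heapq.nlargest(k, probs.items(), key=lambda item: item[1])


# Made-up probabilities, for illustration only.
scores = {"duck": 0.05, "fish": 0.15, "bear": 0.80}
everything = top_k_or_all(scores)
best = top_k_or_all(scores, k=1)
```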
bioclip.CustomLabelsBinningClassifier(cls_to_bin, **kwargs)
Bases: CustomLabelsClassifier
A classifier that creates predictions for images based on custom labels, groups the labels, and calculates probabilities for each group.
Initializes the class with a dictionary mapping class labels to bin values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cls_to_bin | dict | A dictionary where keys are class labels and values are the bins (groups) the labels are assigned to. | required |
**kwargs | | Additional keyword arguments passed to the superclass initializer. | {} |
Raises:
Type | Description |
---|---|
ValueError | If any value in |
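One plausible reading of "groups the labels, and calculates probabilities for each group" is that each bin's probability is the sum of the probabilities of its member labels. The sketch below illustrates that reading only; the summation rule, the helper name `bin_probabilities`, and the example labels and bins are assumptions rather than confirmed library behavior.

```python
from collections import defaultdict
from typing import Dict


def bin_probabilities(
    probs: Dict[str, float], cls_to_bin: Dict[str, str]
) -> Dict[str, float]:
    """Sum per-label probabilities into their bins (assumed aggregation rule)."""
    totals: Dict[str, float] = defaultdict(float)
    for label, p in probs.items():
        # Each label contributes its probability to the bin it maps to.
        totals[cls_to_bin[label]] += p
    return dict(totals)


# Made-up mapping and probabilities, for illustration only.
cls_to_bin = {"dog": "mammal", "cat": "mammal", "fish": "other"}
probs = {"dog": 0.5, "cat": 0.25, "fish": 0.25}
binned = bin_probabilities(probs, cls_to_bin)
```

Under this assumption the bin probabilities still sum to 1 whenever the label probabilities do.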
bioclip.predict.BaseClassifier(model_str=BIOCLIP_MODEL_STR, pretrained_str=None, device='cpu')
Bases: Module
Initializes the prediction model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_str | str | The string identifier for the model to be used. | BIOCLIP_MODEL_STR |
pretrained_str | str | The string identifier for the pretrained model to be loaded. | None |
device | Union[str, device] | The device on which the model will be run. | 'cpu' |
forward(x)
Given an input tensor representing multiple images, return probabilities for each class for each image.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | torch.Tensor | Input tensor representing the multiple images. | required |
Returns:
Type | Description |
---|---|
torch.Tensor | Softmax probabilities of the logits for each class for each image. |
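For readers unfamiliar with the softmax step, the following sketch turns per-image logits into per-class probabilities using only the standard library. It illustrates the softmax operation itself, not the model's forward pass: how the logits are computed from the images is not shown, and `softmax_rows` and the example logits are hypothetical.

```python
import math
from typing import List


def softmax_rows(logits: List[List[float]]) -> List[List[float]]:
    """Apply a numerically stable softmax to each row (one row of logits per image)."""
    result = []
    for row in logits:
        # Subtracting the row maximum avoids overflow without changing the result.
        peak = max(row)
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


# Two made-up images with three class logits each.
probs = softmax_rows([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
```

Each output row sums to 1, and equal logits (the second row) produce a uniform distribution.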