
Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution

ECCV 2024

1Mridul Khurana, 2Arka Daw, 1M. Maruf, 1Josef Uyeda, 3Wasila Dahdul, 1Caleb Charpentier, 4Yasin Bakış, 4Henry L. Bart Jr., 5Paula Mabee, 6Hilmar Lapp, 7James Balhoff, 8Wei-Lun (Harry) Chao, 9Charles Stewart, 8Tanya Berger-Wolf, 1Anuj Karpatne

1Virginia Tech, 2Oak Ridge National Lab, 3University of California, Irvine, 4Tulane University, 5Battelle, 6Duke University, 7University of North Carolina at Chapel Hill, 8The Ohio State University, 9Rensselaer Polytechnic Institute

mridul@vt.edu, karpatne@vt.edu


Figure 1: Overview of the Phylo-Diffusion framework. Every species in the tree of life (phylogenetic tree) is encoded as a HIERarchical Embedding (HIER-Embed) comprising four vectors (one for each phylogenetic level), which is used to condition a latent diffusion model to generate synthetic images of the species. By structuring the embedding space with phylogenetic knowledge, Phylo-Diffusion enables visualization of changes in the evolutionary traits of a species (circled in pink) upon perturbing its embedding.

Abstract

A central problem in biology is to understand how organisms evolve and adapt to their environment by acquiring variations in the observable characteristics or traits of species across the tree of life. With the growing availability of large-scale image repositories in biology and recent advances in generative modeling, there is an opportunity to accelerate the discovery of evolutionary traits automatically from images. Toward this goal, we introduce Phylo-Diffusion, a novel framework for conditioning diffusion models with phylogenetic knowledge represented in the form of HIERarchical Embeddings (HIER-Embeds). We also propose two new experiments for perturbing the embedding space of Phylo-Diffusion: trait masking and trait swapping, inspired by counterpart experiments of gene knockout and gene editing/swapping. Our work represents a novel methodological advance in generative modeling to structure the embedding space of diffusion models using tree-based knowledge. Our work also opens a new chapter of research in evolutionary biology by using generative models to visualize evolutionary changes directly from images. We empirically demonstrate the usefulness of Phylo-Diffusion in capturing meaningful trait variations for fishes and birds, revealing novel insights about the biological mechanisms of their evolution.

Phylo-Diffusion

We introduce graphs as a conditioning modality for diffusion models. The resulting framework, Phylo-Diffusion, is designed for studying evolutionary traits in biological specimens. Building on this new modality, we propose two experiments, inspired by gene-knockout experiments, for understanding evolutionary traits:

  1. Trait Masking: we mask out one or more level embeddings by substituting them with Gaussian noise, to verify that the learned embeddings capture hierarchical information.
  2. Trait Swapping: we substitute the level-l embedding of a source species with the level-l embedding of a sibling subtree at the same level. Comparing the images generated before and after the swap visualizes the trait differences and helps us understand which evolutionary traits branched at that level.
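The two perturbations can be sketched in a few lines of Python. This is a minimal illustration, not the released implementation: the species and node names, the per-level dimension `LEVEL_DIM`, the random (rather than learned) vectors, and the final concatenation in `flatten` are all assumptions made for the example. Levels are 0-indexed in the code, so Level-2 in the text corresponds to index 1.

```python
import random
from typing import Dict, List, Sequence, Tuple

NUM_LEVELS = 4  # one vector per phylogenetic level, as in Figure 1
LEVEL_DIM = 8   # hypothetical per-level embedding size

HierEmbed = List[List[float]]  # NUM_LEVELS vectors of LEVEL_DIM floats

# Hypothetical ancestor paths: the node a species falls under at each level.
# Species that share an ancestor share that level's node id.
PATHS: Dict[str, Tuple[str, str, str, str]] = {
    "species_a1": ("root_x", "node_A", "node_A1", "species_a1"),
    "species_a2": ("root_x", "node_A", "node_A1", "species_a2"),
    "species_b1": ("root_x", "node_B", "node_B1", "species_b1"),
}


def build_tables(paths, rng: random.Random):
    """One vector per (level, node). Random here; learned in a real model."""
    tables: Dict[Tuple[int, str], List[float]] = {}
    for path in paths.values():
        for level, node in enumerate(path):
            if (level, node) not in tables:
                tables[(level, node)] = [rng.gauss(0.0, 1.0) for _ in range(LEVEL_DIM)]
    return tables


def hier_embed(species: str, paths, tables) -> HierEmbed:
    """Look up the vector of the species' ancestor at every level."""
    return [list(tables[(level, node)]) for level, node in enumerate(paths[species])]


def mask_levels(embed: HierEmbed, levels: Sequence[int], rng: random.Random) -> HierEmbed:
    """Trait masking: replace the chosen level vectors with Gaussian noise."""
    out = [list(v) for v in embed]
    for level in levels:
        out[level] = [rng.gauss(0.0, 1.0) for _ in out[level]]
    return out


def swap_level(source: HierEmbed, sibling: HierEmbed, level: int) -> HierEmbed:
    """Trait swapping: take one level vector from a sibling subtree."""
    out = [list(v) for v in source]
    out[level] = list(sibling[level])
    return out


def flatten(embed: HierEmbed) -> List[float]:
    """Assumed conditioning input: the level vectors joined end to end."""
    return [x for v in embed for x in v]


rng = random.Random(0)
tables = build_tables(PATHS, rng)
src = hier_embed("species_a1", PATHS, tables)
sib = hier_embed("species_b1", PATHS, tables)

masked = mask_levels(src, levels=[3], rng=rng)  # noise out the species-level vector
swapped = swap_level(src, sib, level=1)         # Level-2 in the text's numbering
condition = flatten(swapped)                    # would be passed to the diffusion model
```

Because both functions copy the source embedding before editing it, the unperturbed embedding stays available for generating the "before" image used in the comparison.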

(a) Trait Masking
(b) Trait Swapping
Figure 2: Schematic diagrams of the two proposed experiments for discovering evolutionary traits using Phylo-Diffusion.

Demo

Please note that the demo is currently running on free CPU resources provided by Hugging Face, so it may take up to 10 minutes to generate an image. We're working on securing additional resources to speed up the process. Thank you for your patience!

Experiments

We evaluate Phylo-Diffusion and show that it achieves image fidelity on par with text-to-image and class-conditional diffusion models.

Phylo-Diffusion performs on par with state-of-the-art generative models


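| Model Type | Method                  | FID ↓ | IS ↑ | Prec. ↑ | Recall ↑ |
|------------|-------------------------|-------|------|---------|----------|
| GAN        | Phylo-NN                | 28.08 | 2.35 | 0.625   | 0.084    |
| Diffusion  | Class Conditional       | 11.46 | 2.47 | 0.679   | 0.359    |
| Diffusion  | Scientific Name         | 11.76 | 2.43 | 0.683   | 0.332    |
| Diffusion  | Phylo-Diffusion (ours)  | 11.38 | 2.53 | 0.654   | 0.367    |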
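For readers unfamiliar with the FID column, the standard definition of the Fréchet Inception Distance is given below. The symbols are notation chosen for this note and do not come from the paper: \mu_r, \Sigma_r are the mean and covariance of feature embeddings of real images, and \mu_g, \Sigma_g those of generated images. Lower values mean the two feature distributions are closer.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```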

Trait Masking

Figure 3: Probability activations for Lepomis species at level 3 under Trait Masking. Species highlighted in green belong to the subtree at the given level.

Trait Swapping

Figure 4 illustrates trait swapping for the source species Noturus exilis (left), where the information at Level-2 is swapped with that of a sibling subtree at Node B (right). The image in the center is generated using the trait-swapped embedding. This visualization of the perturbed species helps us study the trait changes that would have branched out at Level-2 between Node A and Node B.

In the generated image (center), we observe the absence of barbels (whiskers) and a caudal fin (tail) that becomes forked (split). Both are highlighted in pink and are traits adopted from species in the subtree at Node B (Notropis). Other fins, such as the dorsal, pelvic, and anal fins, still resemble the source species Noturus exilis and are highlighted in green. The same pattern is reflected in the classification probabilities after perturbation: the probability of the source species Noturus exilis decreases, and the probability of the image being a Notropis increases slightly.
Figure 4: Visualization of changes in traits after swapping information at Level 2 (Node A) for Noturus exilis (left) with its sibling subtree at Node B (right) to generate the perturbed species (center). Traits shared with the source species are outlined in green, whereas those shared with the sibling subtree at Node B are outlined in pink.
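The probability comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the class names are placeholders, the probability values are made up for the example and are not results from the paper, and `subtree_mass` and `probability_shift` are hypothetical helper names.

```python
from typing import Dict, Iterable


def subtree_mass(probs: Dict[str, float], members: Iterable[str]) -> float:
    """Total classifier probability assigned to the species in one subtree."""
    return sum(probs.get(name, 0.0) for name in members)


def probability_shift(before: Dict[str, float],
                      after: Dict[str, float],
                      members: Iterable[str]) -> float:
    """Change in subtree probability mass caused by the perturbation."""
    members = list(members)  # allow generators to be reused for both sums
    return subtree_mass(after, members) - subtree_mass(before, members)


# Made-up classifier outputs for images generated before and after a swap.
before = {"source_species": 0.90, "sibling_sp1": 0.04, "sibling_sp2": 0.03, "other": 0.03}
after = {"source_species": 0.70, "sibling_sp1": 0.15, "sibling_sp2": 0.10, "other": 0.05}

source_shift = probability_shift(before, after, ["source_species"])
sibling_shift = probability_shift(before, after, ["sibling_sp1", "sibling_sp2"])
```

A negative `source_shift` together with a positive `sibling_shift` would match the qualitative pattern described for Figure 4.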

Comparison to Phylo-NN

Figure 5: Comparing Phylo-NN with Phylo-Diffusion for trait swapping

References

Please cite our paper if you use our code, data, model or results.

@article{khurana2024hierarchical,
  title={Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution},
  author={Khurana, Mridul and Daw, Arka and Maruf, M and Uyeda, Josef C and Dahdul, Wasila and Charpentier, Caleb and Bak{\i}{\c{s}}, Yasin and Bart Jr, Henry L and Mabee, Paula M and Lapp, Hilmar and others},
  journal={arXiv preprint arXiv:2408.00160},
  year={2024}
}

Also consider citing LDM:

@inproceedings{rombach2022high,
  title={High-resolution image synthesis with latent diffusion models},
  author={Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj{\"o}rn},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={10684--10695},
  year={2022}
}

Acknowledgements

This work was supported by the Imageomics Institute, which is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under Award #2118240 (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Thanks to the BioCLIP Team for the website template.