Large Scale Analysis of Microbial Genomes
Written on January 10th, 2023 by Leonardo De Oliveira Martins
What
This workshop will discuss state-of-the-art methods in the large-scale analysis of microbial genomes, with a focus on protein clustering, orthology inference and structural phylogenetics. We will discuss the latest advances in the field, including the use of artificial intelligence to improve the accuracy and scalability of these methods.
Who
The two international groups invited for this workshop are Christophe Dessimoz & Natasha Glover from the University of Lausanne and Swiss Institute of Bioinformatics (SIB), and Martin Steinegger from Seoul National University. The two teams collaborate on applying artificial intelligence to advance structural phylogenetics.
Martin was involved in the development of popular microbial genomics tools like MMseqs2 and PLASS. He has also been working with AlphaFold to cluster and classify all known protein sequences based on their structure. Christophe’s and Natasha’s group focuses on orthology across the tree of life, with an increasing emphasis on microbes. They are responsible for the comprehensive OMA database together with its ecosystem of tools, and for the Quest for Orthologs benchmark.
The SIB members visiting us are Natasha Glover (Associate Director, Comparative Genomics Group, Swiss Institute of Bioinformatics & University of Lausanne), Stefano Pascarelli (Postdoctoral researcher), Irene Julca (Senior postdoctoral researcher), David Moi (Senior postdoctoral researcher), Alex Warwick Vesztrocy (Postdoctoral researcher), and Mauricio Langleib (PhD student).
When
Both groups will be at the Institute from the 20th to the 24th of May, with workshop sessions on the morning of the 21st (Tuesday) and the afternoon of the 22nd (Wednesday). Tuesday’s session theme will be “Comparative Genomics of the Tree of Life”, and on Wednesday we’ll have a “Petascale Structure Prediction and Phylogenetics” session, with a panel discussion led by our own Dipali Singh. On Tuesday afternoon the guests will join our Data Science at QIB group meeting, where we will have an open discussion on topics brought up by the members of the group.
The NBI Calendar Events are listed below (no registration needed, but you can import the invites to Outlook):
Event | When | Where |
---|---|---|
Workshop: Comparative Genomics of the Tree of Life | 21st May, 9:30-12:00 | UG55 ABC |
QIB Data Science group meeting | 21st May, 14:00-16:30 | UG55 AB |
Workshop: Petascale Structure Prediction and Phylogenetics | 22nd May, 14:00-17:00 | UG55 ABC |
Please contact Leo if you would like to talk to the guests during their visit. They will be available for meetings on Monday (20th) and Thursday (23rd).
References
A short list of selected recent publications from the groups involved in the workshop (with guests highlighted in bold):
- Majidian, S., Nevers, Y., Kharrazi, A. Y., Vesztrocy, A. W., Pascarelli, S., Moi, D., Glover, N., Altenhoff, A. M., & Dessimoz, C. (2024). Orthology inference at scale with FastOMA. bioRxiv (p. 2024.01.29.577392).
- Nevers, Y., Warwick Vesztrocy, A., Rossier, V., Train, C.-M., Altenhoff, A., Dessimoz, C., & Glover, N. M. (2024). Quality assessment of gene repertoire annotations with OMArk. Nature Biotechnology.
- Moi, D., Dessimoz, C., Nevers, Y., Steinegger, M., Bernard, C., & Langleib, M. (2023). Structural phylogenetics unravels the evolutionary diversification of communication systems in gram-positive bacteria and their viruses. bioRxiv (p. 2023.09.19.558401).
- Nevers, Y., Jones, T. E. M., Jyothi, D., Yates, B., Ferret, M., Portell-Silva, L., Codo, L., Cosentino, S., Marcet-Houben, M., Vlasova, A., Poidevin, L., Kress, A., Hickman, M., Persson, E., Piližota, I., Guijarro-Clarke, C., OpenEBench team, the Quest for Orthologs Consortium, Iwasaki, W., Lecompte, O., … Altenhoff, A. (2022). The Quest for Orthologs orthology benchmark service in 2022. Nucleic Acids Research, 50(W1), W623–W632.
- Hodcroft, E. B., De Maio, N., Lanfear, R., MacCannell, D. R., Minh, B. Q., Schmidt, H. A., Stamatakis, A., Goldman, N., & Dessimoz, C. (2021). Want to track pandemic variants faster? Fix the bioinformatics bottleneck. Nature, 591(7848), 30–33.
- Lee, S., Kim, G., Karin, E. L., Mirdita, M., Park, S., Chikhi, R., Babaian, A., Kryshtafovych, A., & Steinegger, M. (2023). Petascale Homology Search for Structure Prediction. bioRxiv (p. 2023.07.10.548308).
- Barrio-Hernandez, I., Yeo, J., Jänes, J., Mirdita, M., Gilchrist, C. L. M., Wein, T., Varadi, M., Velankar, S., Beltrao, P., & Steinegger, M. (2023). Clustering-predicted structures at the scale of the known protein universe. Nature, 622, 637–645.
- Kim, J., & Steinegger, M. (2023). Metabuli: sensitive and specific metagenomic classification via joint analysis of amino-acid and DNA. bioRxiv (p. 2023.05.31.543018).
- Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., & Steinegger, M. (2022). ColabFold: making protein folding accessible to all. Nature Methods, 19(6), 679–682.
- Steinegger, M., Mirdita, M., & Söding, J. (2019). Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold. Nature Methods, 16(7), 603–606.