Home Page

Project Overview

This Shiny app communicates the results of our research project, published here: https://doi.org/10.1111/tpj.70291

Credits

Armin Dadras created this website based on the experimental design of the de Vries group. The design and maintenance of this website were made possible by the support offered by GWDG. The project was funded by ERC grant 852725 (ERC-StG TerreStriAL Terrestrialization: Stress Signalling Dynamics in the Algal Progenitors of Land Plants).

Species networks

Co-expression network of species

To construct co-expression networks for each species, we downloaded raw reads from NCBI SRA (see Data section) and quantified gene expression across samples. We then constructed a network for each condition using the Rapid Conditional Fused Graphical Lasso (RCFGL) method (paper and GitHub). We calculated the precision matrices and, based on the covariance, the correlation matrices. Each resulting correlation matrix contains Pearson correlation coefficients, which range from -1 to 1 and indicate the strength and direction of the linear relationship between pairs of variables. Next, we applied a truncation cut-off (0.05) to obtain the adjacency matrix. These adjacency matrices were used as input to igraph for network analysis and visualization. For enrichment analysis, we used the clusterProfiler package with all genes used for network construction as the background gene set and the following settings: pvalueCutoff = 0.05, qvalueCutoff = 0.05. The networks are visualized using visNetwork.
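The step from covariance to adjacency matrix described above can be illustrated with a minimal Python sketch. It is not the code used in this project: the toy covariance matrix is invented, the function names are hypothetical, and applying the cut-off to the absolute value of the correlation is an assumption.

```python
import math

def covariance_to_correlation(cov):
    """Convert a covariance matrix (list of lists) into a Pearson correlation matrix."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

def truncate_to_adjacency(corr, cutoff=0.05):
    """Build a 0/1 adjacency matrix: an edge wherever |r| exceeds the cut-off.

    Assumption: the cut-off is applied to the absolute correlation,
    and the diagonal (self-correlation) never produces an edge.
    """
    n = len(corr)
    return [
        [1 if i != j and abs(corr[i][j]) > cutoff else 0 for j in range(n)]
        for i in range(n)
    ]

# Toy covariance matrix for three hypothetical genes.
cov = [
    [4.0, 2.0, 0.02],
    [2.0, 9.0, 0.0],
    [0.02, 0.0, 1.0],
]
corr = covariance_to_correlation(cov)
adjacency = truncate_to_adjacency(corr, cutoff=0.05)
print(adjacency)
```

In this toy example only the first two genes are connected, because the other correlations (0.01 and 0) fall below the cut-off.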

Union, intersection, difference operations for graphs

To compare networks with each other, you can use the following operations: union (all edges from the input networks combined), intersection (edges conserved among all input networks), and difference (A-B; the set of all edges that are present in A but not in B). For the difference operation, the order of the inputs matters. If you select more than one graph for A or B, we first calculate their union and then the difference between A and B. Putting the results into biological context can be difficult for some comparisons.
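As an illustration of these three operations, here is a minimal Python sketch that treats each network as a set of undirected edges. The condition names and gene IDs are hypothetical, and the app itself relies on igraph rather than on this code.

```python
def edge(u, v):
    """Undirected edge as an order-independent key, so (a, b) equals (b, a)."""
    return frozenset((u, v))

def union(*graphs):
    """All edges that occur in at least one input network."""
    return set().union(*graphs)

def intersection(*graphs):
    """Edges conserved among all input networks (needs at least one network)."""
    return set.intersection(*map(set, graphs))

def difference(a_graphs, b_graphs):
    """Edges in A but not in B; several graphs per side are merged by union first."""
    return union(*a_graphs) - union(*b_graphs)

# Three toy networks for hypothetical conditions.
control = {edge("g1", "g2"), edge("g2", "g3")}
cold = {edge("g2", "g1"), edge("g3", "g4")}
heat = {edge("g1", "g2"), edge("g4", "g5")}

all_edges = union(control, cold, heat)
conserved = intersection(control, cold, heat)
control_minus_cold = difference([control], [cold])
cold_minus_control = difference([cold], [control])
merged_minus_heat = difference([control, cold], [heat])
print(len(all_edges), conserved, control_minus_cold, cold_minus_control)
```

Swapping the inputs of the difference gives a different edge set here (g2-g3 versus g3-g4), which is why the order of the inputs matters.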

Update the input and click the button

If you change the input or comparison, please do not forget to click the button at the end of the input list. A notification at the bottom right of your screen will tell you when the update has started and when it is ready.

General stats

Hubs

Table of GO enrichment analysis

Plot of GO enrichment analysis

Neighbourhood of specified genes

Table of genes that are in neighbourhood of input genes

Comparative networks

Comparative co-expression network

To construct co-expression networks for each species, we downloaded raw reads from NCBI SRA (see Data section) and quantified gene expression across samples. We then constructed a network for each condition using the Rapid Conditional Fused Graphical Lasso (RCFGL) method (paper and GitHub). We calculated the precision matrices and, based on the covariance, the correlation matrices. Each resulting correlation matrix contains Pearson correlation coefficients, which range from -1 to 1 and indicate the strength and direction of the linear relationship between pairs of variables. Next, we applied a truncation cut-off (0.05) to obtain the adjacency matrix. These adjacency matrices were used as input to igraph for network analysis and visualization. For enrichment analysis, we used the clusterProfiler package with all genes used for network construction as the background gene set and the following settings: pvalueCutoff = 0.05, qvalueCutoff = 0.05. The networks are visualized using visNetwork.

To compare co-expression networks across species, we used orthogroups inferred with OrthoFinder. For each network, we replaced the gene IDs with the IDs of the Phylogenetic Hierarchical Orthogroups (HOGs). In other words, we replaced each gene with the orthogroup it belongs to. Then, we removed any duplicated edges from the graph.
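The relabelling step can be sketched in a few lines of Python. The gene IDs, HOG IDs, and function name below are hypothetical. Two choices are assumptions of this sketch rather than documented behaviour of the app: skipping genes without a HOG, and dropping edges whose two genes fall into the same HOG.

```python
def to_orthogroup_edges(gene_edges, gene_to_hog):
    """Relabel gene-gene edges with HOG IDs and collapse duplicated edges."""
    hog_edges = set()
    for gene_a, gene_b in gene_edges:
        hog_a = gene_to_hog.get(gene_a)
        hog_b = gene_to_hog.get(gene_b)
        if hog_a is None or hog_b is None:
            continue  # assumption: genes without a HOG are skipped
        if hog_a == hog_b:
            continue  # assumption: edges within one HOG (self-loops) are dropped
        # A set of frozensets keeps each undirected HOG-HOG edge only once.
        hog_edges.add(frozenset((hog_a, hog_b)))
    return hog_edges

# Hypothetical gene-to-HOG mapping and gene-level edges.
gene_to_hog = {
    "geneA1": "HOG1",
    "geneA2": "HOG1",
    "geneB1": "HOG2",
    "geneC1": "HOG3",
}
gene_edges = [
    ("geneA1", "geneB1"),
    ("geneA2", "geneB1"),  # becomes a duplicate of HOG1-HOG2
    ("geneB1", "geneC1"),
    ("geneA1", "geneA2"),  # both genes belong to HOG1
    ("geneC1", "geneX"),   # geneX has no HOG in the mapping
]
hog_edges = to_orthogroup_edges(gene_edges, gene_to_hog)
print(sorted(sorted(e) for e in hog_edges))
```

Because every species network ends up with the same HOG IDs as node labels, the union, intersection, and difference operations can then be applied across species.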

Union, intersection, difference operations for graphs

To compare networks with each other, you can use the following operations: union (all edges from the input networks combined), intersection (edges conserved among all input networks), and difference (A-B; the set of all edges that are present in A but not in B). For the difference operation, the order of the inputs matters. If you select more than one graph for A or B, we first calculate their union and then the difference between A and B. Putting the results into biological context can be difficult for some comparisons.

Update the input and click the button

If you change the input or comparison, please do not forget to click the button at the end of the input list. A notification at the bottom right of your screen will tell you when the update has started and when it is ready.

General stats

Hubs

Table of GO enrichment analysis

Plot of GO enrichment analysis

Neighbourhood of specified orthogroups

Table of orthogroups that are in neighbourhood of input orthogroups

Community detection via Louvain method

Table of GO enrichment analysis of community detection

Plot of GO enrichment analysis of community detection

Search for Orthogroups

Comparative Genomics via orthogroups

OrthoFinder is software for identifying and categorizing orthologous genes across species or genomes (paper). It employs an algorithm that considers both sequence similarity and gene evolutionary relationships. By creating a similarity graph based on pairwise comparisons of protein sequences, OrthoFinder clusters genes into orthogroups, which represent genes descended from a common ancestral gene. This approach supports accurate grouping, even for genes with complex evolutionary histories. To obtain the best results from OrthoFinder, a phylogenetically diverse sample of species must be included in orthogroup inference. As a result, this section contains more species.

Run details

We ran OrthoFinder with two different settings and used the results of the second run, which we believe to be more accurate. The species tree that we used for the second run is shown below.

1st run

orthofinder.py -S diamond -M msa -A mafft -T fasttree -t 50 -a 6 -y -n run_1

2nd run

orthofinder.py -t 50 -a 6 -y -n run_2 -ft Results_run_1 -s SpeciesTree_input.txt

Species tree

BLAST Search

Here, you can use BLASTp to perform a sequence similarity search against the protein database that we created. This database includes representative protein sequences of the following species, obtained from the sources listed in our manuscript and in the Data section of this website: A. thaliana, C. reinhardtii, M. endlicherianum, M. polymorpha, O. sativa, P. patens, S. lycopersicum, Z. circumcarinatum, Z. mays. We use the following arguments for BLAST: -evalue 1e-8 -max_target_seqs 100
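For readers who want to run a comparable search locally, the following Python sketch assembles a blastp command line with the same two arguments. The query file, database name, and output path are hypothetical placeholders, and the tabular output format (-outfmt 6) is an assumption, not necessarily the format this website uses. The command is only built and printed, not executed.

```python
import shlex

def build_blastp_command(query_fasta, database, out_path):
    """Assemble a blastp call using the e-value and target limits stated above."""
    return [
        "blastp",
        "-query", query_fasta,        # hypothetical protein FASTA file
        "-db", database,              # hypothetical local BLAST protein database
        "-evalue", "1e-8",            # argument stated on this page
        "-max_target_seqs", "100",    # argument stated on this page
        "-outfmt", "6",               # assumption: tabular output
        "-out", out_path,
    ]

command = build_blastp_command("query.fasta", "protein_db", "hits.tsv")
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run on a machine where BLAST+ is installed and the database has been built with makeblastdb.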

Data and metadata

Analyses results files

In keeping with our commitment to reproducible and open science, we have made all the code, scripts, and results used in this project available on GitLab. These resources can be accessed here.

Annotation files that were used in this study

In this study, we used genome annotations and protein sequences of various species. The complete list of these resources and their locations is given below:

Species | Paper | Downloaded from
C. reinhardtii | Craig, Rory J., et al. “The Chlamydomonas Genome Project, version 6: Reference assemblies for mating-type plus and minus strains reveal extensive structural mutation in the laboratory.” The Plant Cell 35.2 (2023): 644-672. | https://data.jgi.doe.gov/refine-download/phytozome?organism=CreinhardtiiCC-4532&expanded=707
O. lucimarinus | Palenik, Brian, et al. “The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation.” Proceedings of the National Academy of Sciences 104.18 (2007): 7705-7710. | https://data.jgi.doe.gov/refine-download/phytozome?q=Ostreococcus+lucimarinus&expanded=Phytozome-231
M. viride | Liang, Zhe, et al. “Mesostigma viride genome and transcriptome provide insights into the origin and evolution of Streptophyta.” Advanced Science 7.1 (2020): 1901850. | https://genome.jgi.doe.gov/portal/pages/dynamicOrganismDownload.jsf?organism=Mesvir1
C. melkonianii | Wang, Sibo, et al. “Genomes of early-diverging streptophyte algae shed light on plant terrestrialization.” Nature Plants 6.2 (2020): 95-106. | https://ftp.cngb.org/pub/CNSA/data1/CNP0000228/CNS0021447/CNA0002353/
K. nitens | Hori, Koichi, et al. “Klebsormidium flaccidum genome reveals primary factors for plant terrestrial adaptation.” Nature Communications 5.1 (2014): 3978. | https://genome.jgi.doe.gov/portal/pages/dynamicOrganismDownload.jsf?organism=Klenit1
C. braunii | Nishiyama, Tomoaki, et al. “The Chara genome: secondary complexity and implications for plant terrestrialization.” Cell 174.2 (2018): 448-464. | https://bioinformatics.psb.ugent.be/gdb/Chara_braunii/
A. agrestis Oxford | Li, Fay-Wei, et al. “Anthoceros genomes illuminate the origin of land plants and the unique biology of hornworts.” Nature Plants 6.3 (2020): 259-272. | https://www.hornworts.uzh.ch/en/download.html
M. polymorpha | https://marchantia.info/data/README.html | https://marchantia.info/download/MpTak_v7.1/
P. patens | Bi, Guiqi, et al. “Near telomere-to-telomere genome of the model plant Physcomitrium patens.” Nature Plants 10.2 (2024): 327-343. | https://phytozome-next.jgi.doe.gov/info/Ppatens_v6_1
S. moellendorffii | Banks, Jo Ann, et al. “The Selaginella genome identifies genetic changes associated with the evolution of vascular plants.” Science 332.6032 (2011): 960-963. | https://data.jgi.doe.gov/refine-download/phytozome?q=Selaginella+moellendorffii&expanded=Phytozome-91
A. filiculoides | Li, Fay-Wei, et al. “Fern genomes elucidate land plant evolution and cyanobacterial symbioses.” Nature Plants 4.7 (2018): 460-472. | https://fernbase.org/ftp/Azolla_filiculoides/Azolla_asm_v1.1/
A. thaliana | Cheng, Chia‐Yi, et al. “Araport11: a complete reannotation of the Arabidopsis thaliana reference genome.” The Plant Journal 89.4 (2017): 789-804. | https://data.jgi.doe.gov/refine-download/phytozome?q=Arabidopsis+thaliana&expanded=Phytozome-447
S. lycopersicum | Hosmani, Prashant S., et al. “An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps.” bioRxiv (2019): 767764. | https://data.jgi.doe.gov/refine-download/phytozome?q=Solanum+lycopersicum&expanded=Phytozome-691
Z. mays | Jiao, Yinping, et al. “Improved maize reference genome with single-molecule technologies.” Nature 546.7659 (2017): 524-527. | https://data.jgi.doe.gov/refine-download/phytozome?q=zea+mays&expanded=Phytozome-493
B. distachyon | International Brachypodium Initiative. “Genome sequencing and analysis of the model grass Brachypodium distachyon.” Nature 463.7282 (2010): 763-768. | https://data.jgi.doe.gov/refine-download/phytozome?q=Brachypodium+distachyon&expanded=Phytozome-556
O. sativa | Ouyang, S., et al. “The TIGR Rice Genome Annotation Resource: improvements and new features.” Nucleic Acids Research 35 (2007): D883-D887. | https://data.jgi.doe.gov/refine-download/phytozome?q=Oryza+sativa&expanded=Phytozome-323
Closterium sp. NIES68 | Sekimoto, Hiroyuki, et al. “A divergent RWP‐RK transcription factor determines mating type in heterothallic Closterium.” New Phytologist 237.5 (2023): 1636-1651. | https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_027943385.1/
Closterium sp. NIES67 | Sekimoto, Hiroyuki, et al. “A divergent RWP‐RK transcription factor determines mating type in heterothallic Closterium.” New Phytologist 237.5 (2023): 1636-1651. | https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_027943415.1/
Z. circumcarinatum SAG698-1b | Feng, Xuehuan, et al. “Genomes of multicellular algal sisters to land plants illuminate signaling network evolution.” Nature Genetics (2024): 1-14. | https://phycocosm.jgi.doe.gov/Zygcir6981b_2/Zygcir6981b_2.home.html
Z. circumcarinatum SAG698-1a | Feng, Xuehuan, et al. “Genomes of multicellular algal sisters to land plants illuminate signaling network evolution.” Nature Genetics (2024): 1-14. | https://phycocosm.jgi.doe.gov/Zygcyl6981a_1/Zygcyl6981a_1.home.html
Z. circumcarinatum UTEX1560 | Feng, Xuehuan, et al. “Genomes of multicellular algal sisters to land plants illuminate signaling network evolution.” Nature Genetics (2024): 1-14. | https://phycocosm.jgi.doe.gov/Zygcir1560_1/Zygcir1560_1.home.html
Z. circumcarinatum UTEX1559 | Feng, Xuehuan, et al. “Genomes of multicellular algal sisters to land plants illuminate signaling network evolution.” Nature Genetics (2024): 1-14. | https://phycocosm.jgi.doe.gov/Zygcir1559_1/Zygcir1559_1.home.html
M. endlicherianum | Dadras, Armin, et al. “Environmental gradients reveal stress hubs pre-dating plant terrestrialization.” Nature Plants (2023): 1-20. | https://mesotaenium.uni-goettingen.de/download.html
S. muscicola | Cheng, Shifeng, et al. “Genomes of subaerial Zygnematophyceae provide insights into land plant evolution.” Cell 179.5 (2019): 1057-1067. | https://figshare.com/articles/dataset/
P. coloniale | Li, Linzhou, et al. “The genome of Prasinoderma coloniale unveils the existence of a third phylum within green plants.” Nature Ecology & Evolution 4.9 (2020): 1220-1231. | https://phycocosm.jgi.doe.gov/Praco1/Praco1.home.html

Publicly available RNA-Seq reads that were investigated in this study

Our work builds on the collection, cleaning, filtering, and modeling of many datasets that we gathered from public databases such as NCBI SRA. Below, we provide a comprehensive list of accession IDs and indicate whether each sample passed our quality control.