This is the website for our WeIghted Co-expression Calculation tool for plAsmodium genes, or WICCA for short. It uses datasets from different microarray platforms across four Plasmodium species (P. berghei, P. chabaudi, P. falciparum and P. vivax) to calculate the co-expression between genes. Combined with, for example, homology prediction based on hidden Markov models, this can be used to discover new genes (or new orthologs of existing genes) for known biological pathways. It is inspired by the WeGET tool by Szklarczyk et al., as described in the paper "WeGET: predicting new genes for molecular systems".
Since co-expression cannot be calculated for single genes, we instead offer a way to see how a gene ranks across all GO terms, which gives clues about the biological system(s) it is likely involved in. First select a species, then a gene ID, and hit the Submit button. A table will be displayed with the ranking of that gene among all Biological Process GO terms, as well as the overall co-expression value (range of -1 to 1) within each GO term. The In GO group column shows whether that gene is actually a member of that GO term group.
Precalculated GO term results are shown here as a reference for this tool's performance. The co-expression was calculated using the member genes of each GO term as query genes. The AUC values given are the areas under the receiver operating characteristic (ROC) curves (0 = worst possible, 0.5 = random, 1 = perfect).
Exported result tables can be uploaded here for performance estimation: a receiver operating characteristic (ROC) curve is created and the area under the curve (AUC) is calculated.
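As a rough illustration of what an AUC value means, the following Python sketch (the tool itself is written in R) computes the AUC as the probability that a randomly chosen member gene scores higher than a randomly chosen non-member gene, which is equivalent to the area under the ROC curve. The function name and the input layout (a list of scores plus a list of membership flags) are assumptions made for this example and are not taken from the WICCA code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from scores and boolean labels.

    Computed as the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as half a win.
    """
    positives = [s for s, is_member in zip(scores, labels) if is_member]
    negatives = [s for s, is_member in zip(scores, labels) if not is_member]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical co-expression scores; True marks members of the GO term.
perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [True, True, False, False])
mixed = roc_auc([0.9, 0.1, 0.8, 0.3], [True, True, False, False])
```

Here `perfect` corresponds to a ranking that places all members above all non-members, while `mixed` corresponds to a ranking that is no better than random.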
GPL4220: UTTH_Mouse/P.bergheiANKA_44K_v1.0
GPL8604: Winzeler Lab Plasmodium 17k array (PlasmoFBa520596)
GPL18005: Agilent-032976 Plasmodium berghei_Mar2010_transcriptome
GPL19614: Agilent-024169 P.berghei_ordered 024168
GPL21702: ZB/SBS-NTU Plasmodium Rodent YBC 11.6K v1
GPL15077: Agilent-031546 PBANKA1_X8
GPL15092: Agilent-031544 Custom Plasmodium berghei str. ANKA array [PBANKA1]
GSE5672: Simultaneous Host and Parasite Expression Profiling Identifies Transcriptional Programs Associated with Cerebral Malaria.
GSE9497: Acute Lung Injury In Experimental Malaria.
GSE16259: Plasmodium berghei PbeIK2KO sporozoites.
GSE34806: Comparison of gene expression between wild type and AP2-G KO parasites.
GSE34877: Comparison of gene expression between wild type and AP2-L KO parasites.
GSE52859: Differential gene expression in ap2-g and ap2-g2 mutants of Plasmodium berghei.
GSE53246: Plasmodium berghei Gametocytogenesis.
GSE58580: Identification of AP2-O targets [microarray].
GSE64887: Expression data from Plasmodium falciparum schizonts.
GSE65032: Global expression profiling reveals shared and distinct transcript signatures in arrested act2(-) and CDPK4(-) Plasmodium berghei gametocytes.
GSE80015: Global gene expression of the rodent malaria parasites Plasmodium yoelii, Plasmodium berghei and Plasmodium chabaudi blood-stage parasites.
GSE83667: Parasite in vivo blood transcriptomes from Malawian children with cerebral malaria.
GPL21702: ZB/SBS-NTU Plasmodium Rodent YBC 11.6K v1
GPL14814: ZB/SBS-NTU Plasmodium chabaudi chabaudi 4K v1.0
GSE33333: Characterization and gene expression analysis of the cir multi-gene family of Plasmodium chabaudi chabaudi (AS).
GSE80015: Global gene expression of the rodent malaria parasites Plasmodium yoelii, Plasmodium berghei and Plasmodium chabaudi blood-stage parasites.
GPL1321: [Plasmodium_Anopheles] Affymetrix Plasmodium/Anopheles Genome Array.
GPL1858: NIAID Pfab.
GPL1892: Scripps Malaria GeneChip.
GPL3575: SrcMalaria (Scripps/GNF 2002 affymetrix chip).
GPL3983: Malaria Oligo p19.
GPL5928: Affymetrix Plasmodium falciparum 5K (SrcMalaria).
GPL6187: Plasmodium falciparum Rathod Lab 8096.
GPL6269: Plasmodium falciparum Rathod Lab 8832.
GPL7628: Malaria Print 20060818ML_PMC.
GPL8088: Princeton Lewis-Sigler Institute Plasmodium 8.5K array version 1.
GPL8604: Winzeler Lab Plasmodium 17k array (PlasmoFBa520596).
GPL9109: Operon_malaria_8K.
GPL10991: ZB/SBS-NTU Plasmodium falciparum 3D7 11.0K v1.1
GPL11248: ZB/SBS-NTU Plasmodium falciparum 3D7 11.4K v1.1.
GPL11250: ZB/SBS-NTU Plasmodium falciparum 3D7 IGR+ORF 16.8K v1.0.
GPL13818: BIOTEC P. falciparum 8448 features.
GPL14830: ZB/SBS-NTU Plasmodium falciparum 11K v1.1 (DD2 strain).
GPL14831: ZB/SBS-NTU Plasmodium falciparum 10.4K v1.0.
GPL15130: Agilent-037237 P.falciparum_Pf3D7v7.1_Transcriptome_apimito.
GPL15221: Plasmodium falciparum Rathod09A_10816 (DD2 strain).
GPL16175: [scrMalariaa] Winzeler Lab Plasmodium falciparum expression array 5312 v2.
GPL17233: Agilent-021506 Plasmodium falciparum UP1_01_Of_1.
GPL17880: Agilent-032491 3D7v7.1_transcriptome.
GPL18267: Agilent-032491 3D7v7.1_transcriptome [ORF version].
GPL18893: ZB/SBS-NTU Plasmodium falciparum 3D7 11.4K v2.
GSE2878: Molecular mechanism for switching of P.falciparum invasion pathways into human erythrocytes.
GSE4582: T4_bisthiazolium effect on the malaria parasite transcriptome.
GSE8099: Whole genome analysis of mRNA decay in P. falciparum reveals a lengthening of mRNA half-life during the IDC.
GSE9152: Distinct physiological states of Plasmodium falciparum in malaria infected patients.
GSE9724: Hard-wiring in the Plasmodium falciparum transcriptome I.
GSE9853: Hard-wiring in the Plasmodium falciparum transcriptome 2.
GSE9868: Hard-wiring in the Plasmodium falciparum transcriptome II.
GSE10022: Expression and genomic changes after exposing drug-selected mutants to short term CQ treatment in Plasmodium falciparum.
GSE11763: The HAT Inhibitor Anacardic Acid Leads to Changes in Global Gene Expression During in vitro P.falciparum Development.
GSE13578: Functional genomics investigation of PfAdoMetDC/ODC co-inhibition of Plasmodium falciparum.
GSE14524: Transcriptomic and metabolomic profiles of the Plasmodium falciparum developmental cycle.
GSE16259: Plasmodium berghei PbeIK2KO sporozoites.
GSE18075: Plasmodium falciparum treated with cyclohexylamine.
GSE19010: Gene expression profiling of Plasmodium falciparum after co-culture with NK cells.
GSE24416: Quantitative time-course profiling of Plasmodium falciparum transcripts and proteins throughout the 48-hour intraerythrocytic developmental cycle.
GSE25642: Comparative gene expression profiling of P. falciparum malaria parasites exposed to three different histone deacetylase inhibitors.
GSE25878: Artemisinin resistance in Plasmodium falciparum is associated with an altered temporal pattern of transcription (expression).
GSE25879: Artemisinin resistance in Plasmodium falciparum is associated with an altered temporal pattern of transcription (CGH).
GSE28625: Transcript level responses of Plasmodium falciparum to antimycin A.
GSE28701: Transcript-level responses of Plasmodium falciparum to thiostrepton.
GSE28990: Whole transcriptome analysis of rosetting Plasmodium falciparum parasites.
GSE29874: In vitro antiplasmodial activity of Dicoma anomala subsp. gerrardii (Asteraceae): identification of its main active constituent and gene expression profiling.
GSE30869: Comparison of developmental stage transcripts in the K1 strain.
GSE31829: Mutually Exclusive Transcription of Subtelomeric Gene Families in Plasmodium falciparum is Restricted to var Genes.
GSE32211: Whole transcriptome analysis identifies a subset of Group A var genes that encode the malaria parasite ligands for binding to human brain endothelial cells.
GSE33605: P. falciparum (lab strain DD2) treated with ionomycin vs untreated [CGH] (DD2 strain).
GSE33764: P. falciparum (lab strain DD2) treated with A23187 vs untreated [CGH] (DD2 strain).
GSE33795: P. falciparum (lab strain 3d7) schizonts untreated control vs P. falciparum (lab strain 3d7) reference RNA pool.
GSE33796: P. falciparum (lab strain 3d7) schizonts treated with ionomycin vs P. falciparum (lab strain 3d7) reference RNA pool.
GSE33797: P. falciparum (lab strain 3d7) schizonts treated with A23187 vs P. falciparum (lab strain 3d7) reference RNA pool.
GSE33826: Gene expression in Plasmodium falciparum NF54 and P. falciparum HOX.
GSE33834: Extracellular calcium chelation experiment_P. falciparum (lab strain 3d7) schizonts untreated vs P. falciparum (lab strain 3d7) reference RNA pool.
GSE33835: Extracellular calcium chelation experiment_P. falciparum (lab strain 3d7) schizonts treated with ionomycin vs P. falciparum (lab strain 3d7) reference RNA pool.
GSE33836: Extracellular calcium chelation experiment_P. falciparum (lab strain 3d7) schizonts treated with EGTA and ionomycin vs P. falciparum (lab strain 3d7) reference RNA pool.
GSE35732: Early Genomic Amplifications in Blood-Stages of ARMD Plasmodium falciparum Acquiring De Novo Drug Resistance [Genome variation] (DD2 strain).
GSE35949: Early Genomic Amplifications in Blood-Stages of ARMD Plasmodium falciparum Acquiring De Novo Drug Resistance [Expression] (DD2 strain).
GSE39238: Dynamics of epigenetic regulation of gene expression during Plasmodium falciparum life cycle.
GSE39485: Identification of a New Chemical Class of Antimalarials, ACT-213615.
GSE41567: Expression data from TSA-treated Plasmodium.
GSE44127: Structural polymorphism in the promoter of pfmrp2 confers tolerance to mefloquine and chloroquine in Plasmodium falciparum [expression].
GSE47349: Expression data of gene knockout Plasmodium falciparum clones.
GSE47579: Epigenetic marks reduce erythrocyte uptake of antimalarials.
GSE47611: 4PEHz-treated P.falciparum parasites vs. untreated extracted at timepoints t1 (12 Hours post invasion or HPI), t2 (24 HPI) and t3 (36 HPI).
GSE52030: Comparison of Gametocyte producer 3D7-A subclone E5 to the gametotcyte nonprodcer strain F12 and pfap2-g deletion mutant.
GSE53176: Heterochromatin protein 1 secures survival and transmission of malaria parasites.
GSE54806: 3D7 PfHda2 Knockdown vs. Wildtype.
GSE56329: Molecular characterization of Plasmodium falciparum Bruno/CELF RNA binding proteins.
GSE57748: Genome-wide RNA polymerase II recruitment study during the intraerythrocytic life cycle of Plasmodium falciparum.
GSE59015: Gene expression analysis of a double knockout of TCA cycle enzymes IDH and KDH.
GSE59097: Whole genome expression profiling of artemsinin-resistant Plasmodium falciparum field isolates [in vivo].
GSE59098: Whole genome expression profiling of artemsinin-resistant Plasmodium falciparum field isolates [ex vivo].
GSE61536: Plasmodium parasites mount an arrest response against dihydroartemisinin [Microarray].
GSE62364: Identification of a subtelomeric gene family expressed during the asexual-sexual stage transition in Plasmodium falciparum.
GSE64887: Expression data from Plasmodium falciparum schizonts.
GSE64688: Whole genome expression profiling across the asexual life cycle in PfBDP1 knockdown malaria parasites.
GSE64690: Whole genome expression profiling across the life cycle of PfBDP1 overexpressing P. falciparum malaria parasites.
GSE66669: Plasmodium falciparum whole-genome real-time transcription and decay.
GSE72578: DNA damage regulation and its role in drug-related phenotypes in the malaria parasites. [RNA].
GSE72579: DNA damage regulation and its role in drug-related phenotypes in the malaria parasites. [genomic DNA].
GSE72695: Capturing the dynamic RNA landscape of the malaria parasite P. falciparum.
GSE75295: A crucial role of the zinc finger protein PfZnFP-G in sexual differentiation of the malaria parasites, Plasmodium falciparum.
GSE83667: Parasite in vivo blood transcriptomes from Malawian children with cerebral malaria.
GPL6667: P. vivax exon array v1.0 using oligos designed by Z. Bozdech and G. Hu
GPL18382: Agilent Custom P. vivax 8x15k Array designed by Ashis Das, Raja CM & Genotypic Technology Pvt. Ltd. (AMADID:022126)
GSE11075: P. vivax: proof-of-principle global transcription analysis from two wild isolates.
GSE12174: Identification of Plasmodium vivax genes whose expression is spleen-dependent.
GSE55644: A custom genome-wide Plasmodium vivax 8x15K microarray for expression profiling of Indian clinical isolates showing complicated malaria.
Raw data from 83 microarray experiments across 36 different platforms, covering four Plasmodium species, were downloaded from the Gene Expression Omnibus (see tab 'Datasets'). These species were P. berghei (strain ANKA, 12 datasets), P. chabaudi (P. chabaudi chabaudi, 2 datasets), P. falciparum (strain 3D7: 62 datasets, strain DD2: 4 datasets) and P. vivax (strain Sal-1, 3 datasets). Background correction and quantile normalization were performed using the R package limma.
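To give an idea of what quantile normalization does, here is a simplified Python sketch of the general technique. It is not the limma implementation used for this tool: ties are not treated specially, and the function name and data layout (one list of intensities per array) are assumptions made for this example.

```python
def quantile_normalize(arrays):
    """Give every array the same distribution of values.

    arrays: list of equal-length lists, one list of intensities per array.
    The k-th smallest value of each array is replaced by the mean of the
    k-th smallest values over all arrays.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(sa[rank] for sa in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical arrays with three probes each.
result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization both arrays contain the same set of values, and each probe keeps its rank within its own array.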
For the co-expression calculation, the Pearson correlation coefficient is calculated between the query genes and all other genes. Query genes can be chosen from the full list of genes, by selecting a (Biological Process) GO term, by uploading a file with gene identifiers, or by entering gene identifiers manually.
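The correlation step within a single dataset can be sketched in Python as follows (the tool itself is written in R). The gene names are hypothetical, and averaging the correlations over the query genes is an assumption of this example; the text above does not specify how the per-query correlations are summarized.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Constant profile: correlation is undefined, treated as 0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def coexpression_with_query(expression, query_genes):
    """Score every non-query gene against the query genes in one dataset.

    expression: dict mapping gene ID -> list of expression values.
    Returns a dict mapping gene ID -> mean correlation with the query
    genes (the mean is an assumption of this sketch).
    """
    scores = {}
    for gene, profile in expression.items():
        if gene in query_genes:
            continue
        correlations = [pearson(profile, expression[q]) for q in query_genes]
        scores[gene] = sum(correlations) / len(correlations)
    return scores


# Hypothetical dataset with four samples and made-up gene IDs.
expression = {
    "query1": [1.0, 2.0, 3.0, 4.0],
    "query2": [2.0, 4.0, 6.0, 8.0],
    "geneA": [10.0, 20.0, 30.0, 40.0],  # follows the query genes
    "geneB": [4.0, 3.0, 2.0, 1.0],      # opposite trend
}
scores = coexpression_with_query(expression, {"query1", "query2"})
```

In this toy dataset `geneA` obtains a score close to 1 and `geneB` a score close to -1.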
The pair-wise correlation between the query genes determines the importance of a given dataset and influences the final co-expression score of all genes in that dataset. The user is expected to enter query genes that have properties in common and should therefore correlate well with each other. For each dataset, the weight is the sum of the pair-wise correlations between all query genes divided by the maximum possible sum of pair-wise correlations. By assigning a higher weight (importance) to datasets with a higher pair-wise correlation between the query genes, much of the noise from less informative datasets is automatically filtered out.
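The weighting described above can be sketched in Python as follows (the tool itself is written in R). Since a Pearson correlation is at most 1, the maximum possible sum is taken to be the number of query gene pairs. Combining the per-dataset scores as a weighted average is an assumption of this example, as is the requirement that the weights sum to a positive number; how negative weights are handled is not described here.

```python
def dataset_weight(pairwise_correlations):
    """Weight of one dataset from the query gene pair correlations.

    pairwise_correlations: one Pearson r per pair of query genes.
    The maximum possible sum is 1.0 per pair.
    """
    if not pairwise_correlations:
        raise ValueError("need at least one pair of query genes")
    max_sum = 1.0 * len(pairwise_correlations)
    return sum(pairwise_correlations) / max_sum


def weighted_score(scores_per_dataset, weights):
    """Weighted average of one gene's scores over datasets.

    Assumption of this sketch: the final score is a weighted average
    and the weights sum to a positive number.
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(w * s for w, s in zip(weights, scores_per_dataset))
    return weighted_sum / total_weight


# Hypothetical example: three query genes give three pairs per dataset.
w_informative = dataset_weight([0.9, 0.8, 0.7])  # query genes agree
w_noisy = dataset_weight([0.1, 0.2, 0.0])        # query genes barely agree

# One candidate gene scoring 0.6 in the first dataset, -0.2 in the second.
final = weighted_score([0.6, -0.2], [w_informative, w_noisy])
```

The dataset in which the query genes correlate well dominates the final score, so the negative score from the noisy dataset has little influence.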
Background correction is applied by subtracting, for each gene, its correlation against every other gene on the array. This compensates for the fact that genes that correlate highly with every other gene (because of technical artifacts or for biological reasons, e.g. transcription factors) are less interesting: they receive a lower score and end up with a lower final rank.
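The effect of this correction on the ranking can be sketched in Python as follows (the tool itself is written in R). This example assumes that a gene's correlation against every other gene is summarized as a single mean value per gene; the gene names and numbers are hypothetical.

```python
def background_corrected(query_correlation, background_correlation):
    """Subtract each gene's background correlation from its query score.

    query_correlation: dict gene ID -> correlation with the query genes.
    background_correlation: dict gene ID -> mean correlation with all
    other genes on the array (the mean is an assumption of this sketch).
    """
    return {
        gene: query_correlation[gene] - background_correlation[gene]
        for gene in query_correlation
    }


def rank_genes(scores):
    """Gene IDs ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# A "hub" gene correlates with almost everything; a "specific" gene
# correlates mainly with the query genes.
query_correlation = {"hub": 0.7, "specific": 0.5}
background_correlation = {"hub": 0.6, "specific": 0.05}

corrected = background_corrected(query_correlation, background_correlation)
ranking = rank_genes(corrected)
```

Before correction the hub gene has the higher score; after correction the specific gene ranks first.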
PlasmoDB (version 29) was used to retrieve the full gene lists, gene identifiers and Gene Ontology term annotation. Because the number of datasets for P. berghei, P. chabaudi and P. vivax is very small compared to P. falciparum, ortholog groups were used to combine them. These ortholog group definitions were also obtained from PlasmoDB. This allows us to extend the collection for the species for which we have little data and hopefully still obtain a co-expression signal. The performance for P. falciparum still far exceeds that of the other three species, simply because more direct (non-ortholog) data is available. Using ortholog groups gives a much more coarse-grained signal than using the actual genes, but it is still better than no signal at all.
This website and the underlying calculations were made by Ronald Duim at the Centre for Molecular and Biomolecular Informatics (CMBI) in Nijmegen under the supervision of professor Martijn Huijnen. The co-expression calculation tool was written in R, version 3.3.1. This website was created using Shiny, version 0.14.2.