"Guilt by Association" — Finding Genes That Travel With Yours
Science loves a good collaborator. In the molecular world, genes that are co-expressed — that go up and down together across many samples — are often part of the same biological programme. If you understand your gene's "company," you understand something deeper about what it actually does.
R2's correlating genes module is built on this principle, and it's one of the most unexpectedly delightful tools in the platform.
Here's the scenario: you've been working on a transcription factor for two years. You know it matters. You have good evidence it regulates a handful of known targets. But you have a nagging suspicion there's more to the story — that it's coordinating a much broader programme that you haven't fully mapped. Your collaborator in computational biology just went on sabbatical. What now?
You go to R2, select your dataset, and run Find Correlated Genes. R2 calculates the Pearson correlation between your gene of interest and every other gene in the dataset. What comes back is two lists: genes that go up with yours (positive correlation) and genes that go down as yours goes up (negative correlation). Hundreds of them, ranked by the strength of the relationship.
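If you want a feel for what that calculation involves, here is a minimal sketch in Python with NumPy. It is not R2's code: the function name, the genes-by-samples matrix layout and the toy numbers are all assumptions made for illustration. It computes the Pearson correlation of one gene against every other gene and splits the results into the two ranked lists described above.

```python
import numpy as np

def correlate_with_gene(expr, gene_names, target):
    """Pearson r between `target` and every other gene.

    expr       : 2D array-like, one row per gene, one column per sample
                 (assumed layout, chosen for this sketch)
    gene_names : list of gene names, one per row of expr
    target     : name of the gene of interest
    """
    expr = np.asarray(expr, dtype=float)
    idx = gene_names.index(target)

    # Centre each gene on its own mean across samples.
    centered = expr - expr.mean(axis=1, keepdims=True)
    norms = np.sqrt((centered ** 2).sum(axis=1))

    # r = covariance / (sd_x * sd_y); genes with no variation give NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = centered @ centered[idx] / (norms * norms[idx])

    results = [
        (name, float(r[i]))
        for i, name in enumerate(gene_names)
        if i != idx and not np.isnan(r[i])
    ]
    # Strongest relationships first in both lists.
    positive = sorted((x for x in results if x[1] > 0), key=lambda x: -x[1])
    negative = sorted((x for x in results if x[1] < 0), key=lambda x: x[1])
    return positive, negative

# Toy data: four made-up genes measured in four samples.
toy_expr = [
    [1, 2, 3, 4],   # TF (gene of interest)
    [2, 4, 6, 8],   # GENE_A rises with TF
    [4, 3, 2, 1],   # GENE_B falls as TF rises
    [1, 3, 2, 5],   # GENE_C loosely follows TF
]
toy_names = ["TF", "GENE_A", "GENE_B", "GENE_C"]
positive, negative = correlate_with_gene(toy_expr, toy_names, "TF")
print("positive:", positive)
print("negative:", negative)
```

On a real dataset the matrix would have thousands of rows and hundreds of columns, and you would also want a p-value and a multiple-testing correction for each correlation, which this sketch leaves out.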
You skim the top positively correlated genes. Several are known targets, which is reassuring, like seeing familiar faces at a party. But then there are unfamiliar ones. Genes you haven't thought about. R2 lets you click any of them to immediately visualise the scatter plot: your gene on one axis, the candidate on the other, with each dot a sample. A tight diagonal line of dots tells you the correlation is strong and not driven by a few outlying samples. It does not tell you that one gene regulates the other, but it does tell you the candidate is worth a closer look.
Then you do something clever. You take your full list of correlated genes and ask R2 to place them in pathway context — are they enriched in any known biological processes? The answer comes back: your transcription factor's correlates are heavily overrepresented in a metabolic pathway you hadn't previously connected to your work. A new hypothesis is born.
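What does "overrepresented" mean in practice? One common way to score it is a hypergeometric test: given how many genes were measured and how many of them belong to a pathway, how surprising is the overlap with your list? The sketch below assumes that test for illustration; the source does not say which statistic R2 uses, and the function name and gene labels are made up.

```python
from math import comb

def enrichment_p_value(correlated, pathway, background):
    """One-sided hypergeometric test for over-representation.

    correlated : genes correlated with the gene of interest
    pathway    : genes annotated to one pathway
    background : all genes measured in the dataset
    Returns (overlap size, probability of an overlap at least that large
    if the correlated list had been drawn at random from the background).
    """
    background = set(background)
    correlated = set(correlated) & background
    pathway = set(pathway) & background

    N = len(background)            # genes measured
    K = len(pathway)               # pathway genes among them
    n = len(correlated)            # size of the correlated list
    k = len(correlated & pathway)  # observed overlap

    # Sum the hypergeometric probabilities for overlaps of k or more.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / comb(N, n)

# Toy example: 20 measured genes, a 5-gene pathway, a 5-gene correlated list.
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
correlated = ["g0", "g1", "g2", "g3", "g10"]
overlap, p = enrichment_p_value(correlated, pathway, background)
print(overlap, p)
```

When many pathways are tested at once, the resulting p-values need a multiple-testing correction before any single pathway is taken seriously.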
You can also flip the analysis around: rather than "what correlates with my gene," ask "what genes correlate with a clinical outcome?" The logic is the same, but the entry point is a survival track or a treatment response annotation rather than another gene.
The result, either way, is a richer picture of your gene's biology — built not from a single experiment, but from the accumulated signal of hundreds of patient samples.
This is part of an ongoing series on the R2 Genomics Analysis and Visualization Platform, developed at Amsterdam UMC. All analyses can be freely performed at r2.amc.nl. Full tutorials are available at r2-tutorials.readthedocs.io.