Transcriptomics
In the cells of multicellular organisms, not all genes are transcriptionally active. The transcriptional activity of a cell depends on the cell type, the developmental stage, and physiological cues. Transcriptomics is the study of the complete set of ribonucleic acid (RNA) transcripts, both coding and non-coding, expressed in a given entity, such as a cell, tissue, or organism. Studying transcriptomes helps us understand how different patterns of gene expression affect development and disease.
Before transcriptomics emerged, mRNA transcripts were collected and converted to complementary DNA (cDNA) using reverse transcriptase. In the late 1970s, downstream methods involved constructing a cDNA library and then characterizing cDNA clones with restriction enzymes, Southern blotting and hybridization analysis [1]. Northern blotting, described in 1977, allowed specific RNA transcripts to be analyzed [2]. In the 1980s, Sanger-based methods were used to sequence random transcripts, producing expressed sequence tags (ESTs). Sanger sequencing relies on the random incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication. Although lower in throughput, it remained popular until it was superseded by the higher-throughput methods described in later sections of this blog. The late 1980s saw the introduction of reverse transcription prior to PCR (RT-PCR or RT-qPCR), which was used to quantify RNA [3].
Molecular cytogenetic techniques, such as in situ hybridization (ISH), have also been instrumental in examining the spatial and temporal distribution of RNA in tissues and whole organisms, and have been used together with transcriptomics data. The drawbacks of these early techniques were that they were time-consuming and limited to analyzing subsets of the transcriptome. The earliest transcriptomic method, serial analysis of gene expression (SAGE), was described in 1995. By applying Sanger-based sequencing to concatenated random transcript fragments, SAGE enabled the quantitative and simultaneous analysis of a large number of transcripts [4].
Current transcriptomic technologies
Microarrays and RNA sequencing are the two main contemporary methods in transcriptomics. Microarrays were first described in 1995 and measure the abundance of a defined set of transcripts through their hybridization to an array of complementary probes. The probes are short nucleotide oligomers attached to a solid substrate, such as glass. Probe generation requires prior knowledge of sequence information obtained from either genome annotation or an EST library. Once mRNA is extracted from an organism, reverse transcriptase is used to produce stable double-stranded cDNA (ds-cDNA), which is fragmented and fluorescently labeled with either Cy3 (green) or Cy5 (red) dye so that two samples can be distinguished. For data acquisition, a microarray scanner measures the fluorescence from the two dye-labeled samples bound to the array, simultaneously providing the relative mRNA expression levels of thousands of genes in the two samples. Microarrays are still used in transcriptomics today because they have a low cost per sample for larger studies, and the data can be easily analyzed and stored in a standard laboratory setting (Table 1). Despite this, microarrays have declined in popularity, with the number of published studies citing their use decreasing since a peak in 2014 [5]. This is mainly due to the adoption of the second contemporary transcriptomic method, RNA sequencing (RNA-Seq), which uses high-throughput sequencing, also known as next-generation sequencing (NGS), to capture transcript sequences without relying on predefined probes.
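As a rough illustration of how two-dye fluorescence readings translate into relative expression, the sketch below computes a log2(Cy5/Cy3) ratio per probe. The gene names, intensity values, and the `background` and `floor` parameters are invented for illustration, and real microarray pipelines apply normalization steps that are not shown here.

```python
import math


def log2_ratio(cy5, cy3, background=0.0, floor=1.0):
    """Return log2(Cy5/Cy3) after a simple background subtraction.

    `floor` keeps both channels positive so the logarithm is defined.
    """
    red = max(cy5 - background, floor)
    green = max(cy3 - background, floor)
    return math.log2(red / green)


# Hypothetical probe intensities: gene -> (Cy5 signal, Cy3 signal)
spots = {
    "geneA": (8000.0, 2000.0),  # higher in the Cy5-labeled sample
    "geneB": (1500.0, 1500.0),  # equal in both samples
    "geneC": (500.0, 4000.0),   # higher in the Cy3-labeled sample
}

ratios = {gene: log2_ratio(cy5, cy3) for gene, (cy5, cy3) in spots.items()}

for gene, value in ratios.items():
    print(f"{gene}: log2(Cy5/Cy3) = {value:+.2f}")
```

A positive value indicates more signal from the Cy5-labeled sample, a negative value more from the Cy3-labeled sample, and a value near zero similar expression in both.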
Higher throughput RNA-Seq technologies
RNA-Seq was initially described in 2006 by Bainbridge et al., who characterized the transcriptome of a prostate cancer cell line using a sequencing-by-synthesis method [6]. A typical workflow involves extraction and purification of RNA, followed by library construction. RNA-Seq can be run on several deep-sequencing NGS platforms [5]. With Illumina NGS technology, library preparation involves reverse transcribing RNA to cDNA, adapter ligation and amplification. Sequencing by synthesis is then performed in either single-end or paired-end mode, with each base call accompanied by a quality (accuracy) score. Lastly, the reads are aligned to a reference genome or transcriptome for downstream analysis.
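Per-base accuracy in NGS reads is commonly reported as a Phred quality score Q, where the estimated error probability is 10^(-Q/10). The sketch below decodes a FASTQ quality string and converts scores to error probabilities. It assumes the Phred+33 ASCII encoding, and the quality string is a made-up example rather than real data.

```python
def decode_fastq_quality(quality_string, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores.

    Assumes Phred+33 encoding (ASCII code minus 33).
    """
    return [ord(ch) - offset for ch in quality_string]


def phred_to_error_prob(q):
    """Return the estimated probability that a base call is wrong."""
    return 10 ** (-q / 10)


# Hypothetical quality line for an 8-base read
quality_line = "IIIIH5+#"
scores = decode_fastq_quality(quality_line)

for q in scores:
    print(f"Q{q}: error probability = {phred_to_error_prob(q):.4f}")
```

Under this scale, Q20 corresponds to a 1 in 100 chance of an incorrect base call and Q30 to 1 in 1,000.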
RNA-Seq allows unbiased gene expression profiling of the RNA within cells and, unlike microarrays, does not require prior knowledge of the genome sequence. The technique can also detect single-nucleotide polymorphisms (SNPs), splice variants, and other forms of RNA [7] (Table 1). RNA-Seq has become very popular in scientific studies, with the number of publications citing it increasing since its development [5]. The cost of NGS has, in the past, been a barrier to adopting the method. However, the cost per gigabase (Gb) for NGS decreased by over 90% from 2014 to 2020, making the technology more accessible to researchers [7].
Table 1: Microarrays vs RNA-Seq
Characteristics | Microarray | RNA-Seq |
---|---|---|
RNA sample input | ~ µg quantities | ~ ng quantities |
Prior knowledge needed | Yes, reference transcripts needed for probe generation | Not required, though genome sequence is useful |
Sensitivity | 10⁻³ (limited by fluorescence detection) | 10⁻⁶ (limited by sequence coverage) |
Sequence resolution | Dedicated arrays needed to detect splice variants | Can detect SNPs and splice variants |
Dynamic range | 10³ to 10⁴ (limited by fluorescence saturation) | >10⁵ (limited by sequence coverage) |
Specificity | Signal noise generated from cross-hybridization of probes with mRNA transcripts due to sequence similarity can affect downstream analysis of gene expression profiles | Free from technical issues caused by probe redundancy and annotation |
Data analysis and storage | Performed in a standard laboratory setting; negligible storage capacity | Specialist bioinformatics software and training may be needed; however, intuitive bioinformatics apps enable sequence alignment, variant calling, and data visualization or interpretation of NGS data without specialist training or additional staff; higher requirements for data storage capacity needed |
Cost per sample | Lower cost per sample for larger studies | Cost per Gb is decreasing |
Why is spatial transcriptomics so important?
RNA-Seq is commonly applied to pooled populations of cells, tissue sections, or biopsies. Although this bulk analysis can reveal average gene expression, differences between treatment conditions, and highly regulated pathways, it cannot account for the contribution of individual cells or the heterogeneity of the tissue. For this reason, single-cell RNA sequencing (scRNA-seq) has been pivotal in identifying rare cell populations that are masked by bulk profiling.
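A toy calculation can show why averaging masks rare populations. The numbers below are entirely hypothetical: 990 cells that do not express a marker gene and 10 rare cells that express it strongly.

```python
from statistics import mean

# Hypothetical marker-gene counts per cell
common_cells = [0] * 990   # common cell type: marker not expressed
rare_cells = [100] * 10    # rare cell type: marker highly expressed
population = common_cells + rare_cells

# A bulk measurement approximates the average over all cells
bulk_average = mean(population)

# A single-cell measurement keeps each cell's value separate
expressing = [count for count in population if count > 0]
fraction_expressing = len(expressing) / len(population)
mean_in_expressing = mean(expressing)

print(f"Bulk average: {bulk_average}")
print(f"Fraction of cells expressing the marker: {fraction_expressing:.2%}")
print(f"Mean count within expressing cells: {mean_in_expressing}")
```

In this sketch the bulk average is low and cannot be told apart from uniform weak expression, whereas the per-cell view reveals a small, strongly expressing subpopulation.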
The limitation of scRNA-seq is that it does not capture the spatial environment, even though a growing number of studies highlight the importance of the microenvironment in regulating the biological functions of cells. At the cellular level, for example, cancer stem cells (CSCs) reside in niches at specific anatomical and physiological locations, which maintain and preserve the properties of CSCs [8]. At the mRNA level, an increasing body of evidence shows that the subcellular localization of mRNA affects gene function by regulating where a protein is produced and trafficked in cells [9]. This lack of spatial information is where scRNA-seq falls short and why spatial transcriptomics (ST) is so powerful. ST measures the number of transcripts of a gene at a particular location within cells or tissue. Nature Methods named spatially resolved transcriptomics its Method of the Year 2020 because of the technique's significant contributions to characterizing cell types and elucidating the importance of spatial location.
Spatial transcriptomics technologies
ST preserves the in situ spatial location of transcripts expressed within a tissue of interest [10]. To achieve this, ST workflows combine elements of different transcriptomics techniques (either imaging-based or sequencing-based) with molecular cytogenetic techniques. Choosing a suitable ST technology for a given application depends on several factors. Each method has its advantages and drawbacks, and the desired gene throughput, sequence information, sensitivity, resolution and area size (often a trade-off between tissue size and imaging time) must all be considered [11]. Another consideration is feasibility, such as access to the required instrumentation. In general, in situ hybridization (ISH) and in situ sequencing (ISS) methods require an imaging instrument, whereas in situ capture and microdissection methods require an NGS platform [12].
ST technologies can be broadly subdivided into four categories based on how spatial information is acquired:
1. In situ hybridization (ISH)
ISH visualizes RNA molecules in their biological context rather than extracting them from tissue sections. The ISH strategy is based on hybridizing single-stranded mRNAs to fluorescently labeled, gene-specific, single-stranded probes with a complementary sequence. Hybridization is used to propagate signals from probes targeting short sequences of transcripts. One current ISH technique is sequential FISH (seqFISH), which, like related techniques (MERFISH and seqFISH+), is based on the accurate detection of RNA molecules in single cells. During each round of seqFISH hybridization, transcripts are targeted with a set of FISH probes. Following imaging of the sample, the FISH probes are removed using DNase I. Subsequent rounds of hybridization employ the same FISH probes labeled with different fluorescent dyes. Because the cells in the tissue are fixed, each transcript stays in place, and the sequence of colors observed at its position across rounds forms a unique in situ mRNA barcode. All transcripts of the same gene carry the same barcode, so the abundance of each transcript can be quantified following imaging. The seqFISH technique was initially described for quantifying transcripts in single cells but has since been used for spatial transcriptomics of complex tissues such as the mouse hippocampus [13]. The benefits of sequential barcoding are that it is robust and can be scaled up quickly, enabling full Z-stack imaging of native samples.
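The sequential barcoding idea can be sketched in a few lines. With F distinguishable dyes and N hybridization rounds there are F^N possible color sequences, so the number of genes that can be encoded grows quickly with the number of rounds. The gene names, dye names, codebook assignment and observed spots below are hypothetical, and the sketch ignores error correction and image processing.

```python
from itertools import product


def barcode_capacity(n_colors, n_rounds):
    """Number of distinct color sequences: colors ** rounds."""
    return n_colors ** n_rounds


def build_codebook(genes, colors, n_rounds):
    """Assign each gene a unique color sequence (arbitrary assignment order)."""
    if len(genes) > barcode_capacity(len(colors), n_rounds):
        raise ValueError("Not enough barcodes for the number of genes")
    barcodes = product(colors, repeat=n_rounds)
    return {barcode: gene for gene, barcode in zip(genes, barcodes)}


def count_transcripts(observed_spots, codebook):
    """Count spots per gene; spots with an unknown color sequence are skipped."""
    counts = {}
    for colors_seen in observed_spots:
        gene = codebook.get(tuple(colors_seen))
        if gene is not None:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


colors = ("red", "green", "blue", "yellow")  # hypothetical dye set
genes = ["geneA", "geneB", "geneC"]          # hypothetical targets
codebook = build_codebook(genes, colors, n_rounds=2)

# Each spot is the color seen at one fixed position in round 1 and round 2
observed_spots = [
    ("red", "red"),
    ("red", "green"),
    ("red", "red"),
    ("blue", "blue"),  # not in the codebook, so it is ignored
]
counts = count_transcripts(observed_spots, codebook)

print(barcode_capacity(4, 2), counts)
```

Because every transcript of a gene shows the same color sequence at its fixed position, counting decoded spots gives a per-gene transcript count together with each spot's location.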
2. In situ sequencing (ISS) methods
ISS methods allow RNA to be analyzed by sequencing directly in preserved cells and tissues. ISS can be targeted or non-targeted, and non-targeted approaches can detect more genes than ISH. The first ISS methods described employed padlock probes to identify mRNA transcripts in tissues in a multiplex fashion. Padlock probes are linear single-stranded oligonucleotides that hybridize to a specific target sequence and form a circular structure, hence the name. A typical ISS workflow employing padlock probes first involves fixation and permeabilization of the tissue specimen. The mRNA is then reverse-transcribed to cDNA in situ, followed by RNA degradation. There are two approaches using padlock probes: a no-gap approach and a gap-filling approach. In both, ligation closes the probe into a circle, and rolling circle amplification (RCA) then generates RCA products (RCPs) containing many tandem copies of the circularized sequence, which are sequenced by ligation. Upon sample imaging, each RCP displays the color corresponding to the matched base. The no-gap padlock approach has the advantage of greater sensitivity, whilst the gap-filling padlock approach can capture the actual sequence of the targeted RNA. Accordingly, the gap-filling approach has been used to detect point mutations, and the no-gap approach to perform multiplexed gene expression profiling [14]. Further developments in ISS technology have included direct circularization of cDNA using single-stranded DNA ligase (FISSEQ) and automation using microfluidic platforms. Alternative sequencing chemistries have also been described, such as BaristaSeq, which uses Illumina sequencing by synthesis and has the advantage of a higher signal-to-noise ratio than sequencing by ligation [15].
3. In situ capture methods
A third way to perform spatial transcriptomics is to capture mRNAs from the tissue while preserving their spatial localization. The mRNA species are then profiled ex situ using next-generation sequencing (NGS). The first technology proposed using spatial barcodes involved overlaying tissue sections on glass slides carrying immobilized reverse transcription (RT) primers, which capture polyadenylated mRNA from the tissue sections through a poly-T sequence. The primers also contain spatial barcodes, which identify the coordinates of each array feature, and unique molecular identifiers (UMIs), which distinguish individual captured molecules. During the permeabilization step, the mRNA transcripts diffuse out of the tissue and hybridize with the primers on the slide. RT reagents are then used to synthesize cDNA using fluorescently labeled nucleotides. After enzymatic removal of the tissue, only the cDNA coupled to the primers remains on the glass slide. NGS is then performed, and the transcript data is mapped back to its point of origin in the tissue sample. The initial proof of principle showed that the strategy could perform RNA-seq whilst maintaining 2D positional information in mouse brain and human breast cancer samples [16]. Advances in in situ capture (ISC) methods allow RNA to be quantified from different tissue preparations, both frozen and formalin-fixed paraffin-embedded (FFPE), using UV-photocleavable barcodes incorporated into in situ hybridization probes [17,18]. When a region of interest is exposed to UV light, the barcodes are cleaved and can be counted, giving digital quantification of RNA expression with spatial context [18].
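The mapping step can be sketched as follows: each sequenced read carries a spatial barcode, a UMI and a gene assignment, and reads are grouped by array position while counting each UMI only once. The barcode sequences, coordinates, UMIs and gene names are invented for illustration, and real pipelines also handle sequencing errors in barcodes and UMIs, which this sketch does not.

```python
from collections import defaultdict

# Hypothetical lookup from spatial barcode to (x, y) position on the array
barcode_to_xy = {
    "AAAC": (0, 0),
    "AAAG": (0, 1),
    "AACA": (1, 0),
}

# Hypothetical aligned reads: (spatial barcode, UMI, gene)
reads = [
    ("AAAC", "TTGCA", "geneA"),
    ("AAAC", "TTGCA", "geneA"),  # same UMI: amplification duplicate
    ("AAAC", "GGATC", "geneA"),
    ("AAAG", "CCTGA", "geneB"),
    ("GGGG", "ACGTA", "geneA"),  # barcode not on the array: discarded
]


def spatial_counts(reads, barcode_to_xy):
    """Return {((x, y), gene): number of unique UMIs}."""
    umis_seen = defaultdict(set)
    for barcode, umi, gene in reads:
        xy = barcode_to_xy.get(barcode)
        if xy is None:
            continue  # read cannot be placed on the array
        umis_seen[(xy, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


counts = spatial_counts(reads, barcode_to_xy)

for (xy, gene), n in sorted(counts.items()):
    print(f"position {xy}, {gene}: {n} molecule(s)")
```

Counting unique UMIs rather than reads means that amplification duplicates of the same captured molecule are not counted twice.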
4. Microdissection
Laser capture microdissection (LCM) employs a laser beam under microscopic visualization to isolate cells from surrounding tissue with high spatial resolution. This technique is particularly important in clinical contexts as patient materials can be reused for multiple studies. In 2017, the Geo-seq method was described, in which tissue regions are dissected using LCM, followed by RNA extraction, cDNA library preparation and sequencing of transcripts. The technique profiled the transcriptome of cells whilst retaining their native spatial information [19].
In the next issue of the technical blog, we will discuss further examples of spatial transcriptomics workflows, the challenges faced in the field, and how researchers are combating these challenges. LubioSciences are a trusted life sciences product provider with antibodies compatible with spatial transcriptomics.
Article references
1. Sim, G. K. et al. Use of a cDNA library for studies on evolution and developmental expression of the chorion multigene families. Cell 18 (1979).
2. Alwine, J. C., Kemp, D. J. & Stark, G. R. Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes. Proc Natl Acad Sci U S A 74 (1977).
3. Becker-André, M. & Hahlbrock, K. Absolute mRNA quantification using the polymerase chain reaction (PCR). A novel approach by a PCR aided transcript titration assay (PATTY). Nucleic Acids Res 17 (1989).
4. Velculescu, V. E., Zhang, L., Vogelstein, B. & Kinzler, K. W. Serial analysis of gene expression. Science 270 (1995).
5. Lowe, R., Shirley, N., Bleackley, M., Dolan, S. & Shafee, T. Transcriptomics technologies. PLoS Comput Biol 13 (2017).
6. Bainbridge, M. N. et al. Analysis of the prostate cancer cell line LNCaP transcriptome using a sequencing-by-synthesis approach. BMC Genomics 7 (2006).
7. Illumina. High-impact discovery through gene expression and regulation research. emea.illumina.com/science/technology/next-generation-sequencing/plan-experiments/paired-end-vs-single-read.html (2023).
8. Noll, J. E., Vandyke, K. & Zannettino, A. C. W. The Role of the “Cancer Stem Cell Niche” in Cancer Initiation and Progression. In Adult Stem Cell Niches (2014).
9. Holt, C. E. & Bullock, S. L. Subcellular mRNA localization in animal cells and why it matters. Science 326 (2009). doi.org/10.1126/science.1176488
10. Piñeiro, A. J., Houser, A. E. & Ji, A. L. Research Techniques Made Simple: Spatial Transcriptomics. Journal of Investigative Dermatology 142 (2022). doi.org/10.1016/j.jid.2021.12.014
11. Rao, A., Barkley, D., França, G. S. & Yanai, I. Exploring tissue architecture using spatial transcriptomics. Nature 596 (2021). doi.org/10.1038/s41586-021-03634-9
12. Williams, C. G., Lee, H. J., Asatsuma, T., Vento-Tormo, R. & Haque, A. An introduction to spatial transcriptomics for biomedical research. Genome Medicine 14 (2022). doi.org/10.1186/s13073-022-01075-1
13. Shah, S., Lubeck, E., Zhou, W. & Cai, L. In Situ Transcription Profiling of Single Cells Reveals Spatial Organization of Cells in the Mouse Hippocampus. Neuron 92 (2016).
14. Ke, R. et al. In situ sequencing for RNA analysis in preserved tissue and cells. Nat Methods 10 (2013).
15. Chen, X., Sun, Y. C., Church, G. M., Lee, J. H. & Zador, A. M. Efficient in situ barcode sequencing using padlock probe-based BaristaSeq. Nucleic Acids Res 46 (2018).
16. Ståhl, P. L. et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 353 (2016). doi.org/10.1126/science.aaf2403
17. Asp, M., Bergenstråhle, J. & Lundeberg, J. Spatially Resolved Transcriptomes—Next Generation Tools for Tissue Exploration. BioEssays 42 (2020). doi.org/10.1002/bies.201900221
18. Zollinger, D. R., Lingle, S. E., Sorg, K., Beechem, J. M. & Merritt, C. R. GeoMx™ RNA Assay: High Multiplex, Digital, Spatial Analysis of RNA in FFPE Tissue. In Methods in Molecular Biology 2148 (2020).
19. Chen, J. et al. Spatial transcriptomic analysis of cryosectioned tissue samples with Geo-seq. Nat Protoc 12 (2017).