There are different omics disciplines, each describing its own biological domain: transcriptomics describes transcripts, and epigenomics describes epigenetic modifications, in a tissue, a cell, or another biological compartment. But is there a way to bring omics results together graphically? In this topic, we will talk about genome browsers: tools that let you visualize and search through an entire genome along with annotation derived from transcriptomics, epigenomics, and other omics results.
Genome browser components
Basically, every genome browser shows the genome sequence along a horizontal axis and lets you add annotation from other omics results on top of it (these annotation layers are called "tracks"). For instance, you can add a gene prediction track so that you can clearly observe the intron/exon structure. On top of that, you can add gene expression data, such as transcription levels measured in cell lines or expression levels of a specific gene in different tissue types. You can also add tracks based on genomics and epigenomics results, such as single nucleotide polymorphisms (SNPs) and epigenetic marks on histones (H3K27Ac, H3K27me3, and so on), respectively.
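To make the idea of tracks more concrete, here is a minimal Python sketch that models each track as a list of intervals and selects the features that overlap the current viewing window. It is only an illustration: the `Feature` class, the `visible_features` function, and all coordinates are hypothetical and do not come from any real genome browser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated interval on a chromosome (1-based, inclusive)."""
    chrom: str
    start: int
    end: int
    label: str


def visible_features(track, chrom, win_start, win_end):
    """Return the features of a track that overlap the viewing window."""
    return [
        f for f in track
        if f.chrom == chrom and f.start <= win_end and f.end >= win_start
    ]


# Hypothetical tracks: the coordinates are invented for illustration only.
tracks = {
    "genes": [Feature("chr1", 100, 900, "geneA"),
              Feature("chr1", 1500, 2400, "geneB")],
    "snps": [Feature("chr1", 450, 450, "snp1"),
             Feature("chr1", 2000, 2000, "snp2"),
             Feature("chr2", 450, 450, "snp3")],
}

# Print a crude text view of the window chr1:400-1000, one row per track.
for name, track in tracks.items():
    labels = [f.label for f in visible_features(track, "chr1", 400, 1000)]
    print(f"{name:>6}: {', '.join(labels) or '-'}")
```

Every track shares the same coordinate axis, which is why annotation from very different experiments can be stacked and compared at the same position.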
There is always a clear indication of where in the genome you are located. You can move along the genome by entering a gene name (TP53), a direct genome position (chr17:7,668,421-7,687,490), or the accession number of a gene or transcript (such as the Ensembl transcript ID ENST00000269305.9). Another feature of all genome browsers is the possibility of adding custom tracks.
Let's see what features the UCSC Genome Browser, the Ensembl genome browser, NCBI's Genome Data Viewer, and IGV have. For all the following examples, we will use the GRCh38 human genome assembly and the TP53 gene.
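A position string such as chr17:7,668,421-7,687,490 can also be handled programmatically. The sketch below splits it into chromosome, start, and end, and then formats the interval as a BED line, a common text format for custom tracks. The function names `parse_region` and `to_bed_line` and the feature name `my_region` are hypothetical. The code assumes that the displayed position is 1-based and inclusive, while BED uses a 0-based start and an exclusive end.

```python
import re

# Pattern for positions such as "chr17:7,668,421-7,687,490".
_REGION = re.compile(r"^\s*([\w.]+):(\d[\d,]*)-(\d[\d,]*)\s*$")


def parse_region(text):
    """Split 'chrom:start-end' into (chrom, start, end) with integer coordinates."""
    match = _REGION.match(text)
    if match is None:
        raise ValueError(f"not a genome position: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError(f"start is after end in {text!r}")
    return chrom, start, end


def to_bed_line(chrom, start, end, name):
    """Format a 1-based inclusive interval as a BED line (0-based start, exclusive end)."""
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


chrom, start, end = parse_region("chr17:7,668,421-7,687,490")
print(chrom, start, end)
print(to_bed_line(chrom, start, end, "my_region"))
```

The off-by-one shift in `to_bed_line` is the detail that most often goes wrong when preparing a custom track by hand.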
UCSC genome browser
The UCSC Genome Browser (UCSC stands for "University of California, Santa Cruz") is one of the most popular and comprehensive online tools for analyzing and visualizing more than 100 fully assembled genomes of different species. For instance, you can visualize the genomes of human, mouse, chicken, zebrafish, fruit fly (Drosophila melanogaster), worm (Caenorhabditis elegans), yeast (Saccharomyces cerevisiae), Monkeypox virus, and so on. So, first of all, you need to select the reference genome and its assembly version.
After you pick the reference genome, you will see a header indicating your exact location (a red vertical line on the chromosome bands). There are also some controls for adjusting the visualization and fields for genome navigation.
Below the header is the main body with many annotation tracks. Let's explore some. At the top, you see your position in the genome. Right below are the GENCODE-annotated transcripts of the TP53 gene, with introns drawn as lines with arrows and exons as blue rectangles. You can also display expression statistics from RNA sequencing in different tissues, shown as bar plots. If you click on the track, you will see detailed information on expression values in tissues. Then you can display transcription levels that support the intron/exon structure, i.e., transcription levels are high in exons but not in introns.
Epigenetics and genomics results can also be plotted in the genome browser. For example, we can plot the H3K27Ac histone mark, which can reveal regulatory relationships ("Layered H3K27Ac" track), and known common SNPs ("Common dbSNP (155)" track).
Right after that module, there is a track menu where you can select which tracks to show or hide. Part of this menu is depicted below.
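The assembly and position chosen in the UCSC Genome Browser can also be encoded in a link, which is handy for sharing a view. The sketch below builds such a link. It assumes that the track display is served at the `hgTracks` address shown, that it accepts the `db` and `position` query parameters, and that `hg38` is the UCSC name for GRCh38. The function name `ucsc_link` is hypothetical, and the address should be checked against the live site.

```python
from urllib.parse import urlencode

# Assumed address of the UCSC track display; verify it against the live site.
UCSC_TRACKS_URL = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def ucsc_link(db, position):
    """Build a link that opens the given assembly at a position or gene name."""
    query = urlencode({"db": db, "position": position})
    return f"{UCSC_TRACKS_URL}?{query}"


# "hg38" is assumed to be the UCSC database name for the GRCh38 assembly.
print(ucsc_link("hg38", "chr17:7668421-7687490"))
print(ucsc_link("hg38", "TP53"))
```

`urlencode` escapes the colon in the position, so the resulting string is safe to paste into a document or a script.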
Ensembl genome browser
Next, we will review the Ensembl genome browser. As of Ensembl release 108, its functionality is comparable to that of UCSC. For example, you can load tracks for gene expression, SNPs, and gene regulation. The genome browser page includes a text description of the loaded region, containing the gene name (if there is one), the genome location, the number of transcripts, and Ensembl IDs.
Let's see what the TP53 gene looks like in the Ensembl browser. In the following figure, you can see how common pathogenic frameshift variants are in the TP53 gene ("ClinVar" track). You can also see that somewhere in the middle of the gene body there is a transcription factor binding site ("Regulatory Build" track). Other tracks based on genomics, transcriptomics, and epigenomics results can be loaded as well.
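One practical detail when moving between browsers is chromosome naming: Ensembl labels human chromosomes without the "chr" prefix (17 rather than chr17). The sketch below builds a link to an Ensembl region view under that convention. It assumes the `Location/View` page and its `r` parameter as shown. The function name `ensembl_region_link` is hypothetical, and the address should be checked against the live site.

```python
from urllib.parse import urlencode

# Assumed base address of the Ensembl website; verify it against the live site.
ENSEMBL_SITE = "https://www.ensembl.org"


def ensembl_region_link(species, chrom, start, end):
    """Build a link to an Ensembl region view, dropping a leading 'chr' if present."""
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    query = urlencode({"r": f"{chrom}:{start}-{end}"})
    return f"{ENSEMBL_SITE}/{species}/Location/View?{query}"


print(ensembl_region_link("Homo_sapiens", "chr17", 7668421, 7687490))
```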
Genome data viewer
Another useful genome browser comes from NCBI and is called Genome Data Viewer. The core functionality of this browser is much the same as in the two previous ones. You can view gene annotation, SNP annotation, RNA sequencing exon coverage, and epigenetic marks in different tissues. A nice feature of NCBI's tool is its built-in compatibility with other NCBI tools such as BLAST.
IGV
IGV (Integrative Genomics Viewer) is probably the most minimalistic of the genome browsers listed above. Its features include adding SNP annotation, GENCODE/Ensembl gene annotation, and CpG islands (not depicted below). Because it lacks many extras, such as compatibility with NCBI's tools or the many built-in UCSC tracks, IGV is lightweight and can be used not only online as a web tool but also as a stand-alone offline program on a computer.
How to choose a genome browser?
The choice of a genome browser depends heavily on your needs. If you want to comprehensively analyze a gene or a set of genes, you should pick UCSC, Ensembl, or Genome Data Viewer, because they offer the widest variety of tracks. The final choice should be made based on the specific set of tracks that you need. For example, if you want to visualize the single-cell expression of a certain gene across different human tissues, then you should choose UCSC. If you don't need extra features and just want to visualize gene annotation and some SNPs, then IGV is the most suitable tool.
Conclusion
A genome browser is a useful tool for quickly analyzing and visualizing gene annotation, gene expression, SNPs, and epigenetic marks. There are a few widely used tools with different levels of complexity. The UCSC Genome Browser, the Ensembl browser, and Genome Data Viewer let you select from a broad range of tracks, so they are frequently used in comprehensive analysis. In contrast, IGV is used when you want to visualize gene and SNP annotation and load custom tracks.