Friday, August 12, 2016

Annotating sets of genomic intervals with genomic annotations such as chromHMM

Genomation is an R package to summarize, annotate and visualize genomic intervals. It contains a collection of tools for visualizing and analyzing genome-wide data sets, i.e. RNA-seq, bisulfite sequencing or chromatin-immunoprecipitation followed by sequencing (ChIP-seq) data.

Recently we added new features to genomation The new functionalities are available in the latest version of genomation that can be found on its github website.

This demo shows the new annotation functions in genomation. The functions can be used to annotate target regions or a list of target regions with a given set of genomic features. The genomic features to be used should be in named GRangesList format.

Get data to R

We will get p300, SP1 and Nanog peaks,and chromHMM annotations from ENCODE. We wil use the H1-Esc cells. The aim is to annotate the peaks with chromHMM annotations.

The new functions

here are the new functions annotateWithFeatures and heatTargetAnnotation. These functions are available as of version 1.5.6. There were similar functions within the package before but these are more generalized versions of the old ones.

annotateWithFeatures() will calculate the percentage of overlaps in a given GRanges object with a GRangesList object where each element correspods to a different category of genomic features such as promoters, exons and introns.

heatTargetAnnotation() will plot a heatmap of annotations returned from annotateWithFeatures() or can return the matrix that is used the create the heatmap.

Annotate peaks with chromHMM segments

We can annote a given peak set like this:

## summary of target set annotation with feature annotation:
## Rows in target set: 8934
## ----------------------------
## percentage of target elements overlapping with features:
## 1_Active_Promoter 10_Txn_Elongation       11_Weak_Txn      12_Repressed 
##             28.16              0.07              2.10              1.33 
## 13_Heterochrom/lo 14_Repetitive/CNV 15_Repetitive/CNV   2_Weak_Promoter 
##              2.74              0.08              0.18             23.86 
## 3_Poised_Promoter 4_Strong_Enhancer 5_Strong_Enhancer   6_Weak_Enhancer 
##              8.69             11.91             16.33             32.65 
##   7_Weak_Enhancer       8_Insulator  9_Txn_Transition 
##              8.03              5.72              1.00
## 
## percentage of feature elements overlapping with target:
## 1_Active_Promoter 10_Txn_Elongation       11_Weak_Txn      12_Repressed 
##             19.81              0.04              0.16              0.68 
## 13_Heterochrom/lo 14_Repetitive/CNV 15_Repetitive/CNV   2_Weak_Promoter 
##              0.27              0.19              0.57              6.76 
## 3_Poised_Promoter 4_Strong_Enhancer 5_Strong_Enhancer   6_Weak_Enhancer 
##              6.45             20.36             11.49              3.46 
##   7_Weak_Enhancer       8_Insulator  9_Txn_Transition 
##              0.53              0.84              0.71
## 

We can annotate the GRangesList of different peak sets like this:

## Working on: p300
## Working on: SP1
## Working on: NANOG

Make a heatmap of percentage of peaks overlapping with different chromHMM annotations

We can make a heatmap using heatTargetAnnotation() function. The function also returns a ggplot2 object, so it can be further manipulated via ggplot2.

We can also get the matrix that is used to make the heatmap and use it in other heatmap functions.

sessionInfo()

sessionInfo()
## R version 3.3.0 (2016-05-03)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.11.5 (El Capitan)
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] devtools_1.11.1  gplots_3.0.1     genomation_1.5.6
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.6                formatR_1.4               
##  [3] GenomeInfoDb_1.8.3         plyr_1.8.4                
##  [5] XVector_0.12.1             bitops_1.0-6              
##  [7] tools_3.3.0                zlibbioc_1.18.0           
##  [9] digest_0.6.10              memoise_1.0.0             
## [11] tibble_1.1                 evaluate_0.9              
## [13] gtable_0.2.0               BSgenome_1.40.1           
## [15] gridBase_0.4-7             yaml_2.1.13               
## [17] parallel_3.3.0             withr_1.0.1               
## [19] rtracklayer_1.32.2         stringr_1.0.0             
## [21] knitr_1.13.1               caTools_1.17.1            
## [23] gtools_3.5.0               Biostrings_2.40.2         
## [25] S4Vectors_0.10.2           IRanges_2.6.1             
## [27] stats4_3.3.0               Biobase_2.32.0            
## [29] data.table_1.9.6           impute_1.46.0             
## [31] plotrix_3.6-3              XML_3.98-1.4              
## [33] BiocParallel_1.6.5         seqPattern_1.4.0          
## [35] rmarkdown_0.9.6            gdata_2.17.0              
## [37] reshape2_1.4.1             readr_1.0.0               
## [39] ggplot2_2.1.0              magrittr_1.5              
## [41] codetools_0.2-14           matrixStats_0.50.2        
## [43] Rsamtools_1.24.0           scales_0.4.0              
## [45] htmltools_0.3.5            BiocGenerics_0.18.0       
## [47] GenomicRanges_1.24.2       GenomicAlignments_1.8.4   
## [49] assertthat_0.1             SummarizedExperiment_1.2.3
## [51] colorspace_1.2-6           labeling_0.3              
## [53] KernSmooth_2.23-15         stringi_1.1.1             
## [55] RCurl_1.95-4.8             munsell_0.4.3             
## [57] chron_2.3-47

5 comments:

  1. Hello! Fix all errors related to the use of passive and active voices with our passive voice detector. We have developed this assistant specifically so that every student can check their text and convert a passive voice into an active one. Our tool has many additional useful features! https://www.passivetoactive.com/

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. read this article if you are still spending a lot of time looking for and identifying various grammatical errors. There you will learn a lot of useful and interesting information, and you will also be able to start using a specialized tool that will help you identify errors and fix them! https://www.tensechecker.com/

    ReplyDelete
  4. Creating high-quality content requires a lot of time and effort, especially when it comes to editing and correcting text. However, this process can be both tedious and time-consuming, often detracting from the creative process of writing . To help make this process more efficient and productive, online software solutions are available that offer a range of useful features. These services allow for quick and accurate text editing, as well as insightful suggestions to improve the quality of your writing. Source: https://www.verbchecker.com/

    ReplyDelete
  5. Always lacking time for these endless checks and edits of your texts? Entrust the execution of these tasks to a real professional who, in a matter of seconds, will check and correct even the most inconspicuous mistake that could call into question all your work. Try it out today, it's absolutely free and the result can please even the pickiest copywriter! Source: https://www.runonchecker.com/

    ReplyDelete