Recipes, scripts and genomics: RNASeq


Wednesday, February 10, 2021

New Book: "Computational Genomics with R"

It now costs $299 to sequence your genome [https://nebula.org/whole-genome-sequencing-dna-test/]. Genomic data is getting cheaper to generate, and more of it is coming through. We do not see the same trend in the number of people who can analyze the data: there is always a shortage of people who can do decent analysis of genomic data. We need to train these people. However, the skillset required, apart from the domain-specific knowledge of genomics, is very similar to that of data science jobs. Some qualified people will go down that route, and some people who are trained in genomic data analysis will end up as data scientists in unrelated fields. This pushes the demand for genomic data analysts higher and higher. Not only are there not enough of them, but the ones we have might end up doing something else with the same skillset.

Because of this scarcity, but also to decrease our workload in the lab, we chose to train people in computational genomics. We have done this for many years. We have organized public hands-on courses and provided consultation and mentoring opportunities for people who want to learn how to do genomic data analysis. We see that there is a large demand for genomic data analysis and that we can teach those skills.

Now we have organized this material as a book and published it via CRC Press.

Who is this book for?

The book contains practical and theoretical aspects of computational genomics. Biology and medicine generate more data than ever before. Therefore, we need to educate more people with data analysis skills and understanding of computational genomics. Since computational genomics is interdisciplinary, this book aims to be accessible for biologists, medical scientists, computer scientists and people from other quantitative backgrounds. We wrote this book for the following audiences:

Biologists and medical scientists who generate the data and are keen on analyzing it themselves.
Students and researchers who are formally starting to do research on or using computational genomics, and who do not have extensive domain-specific knowledge but have at least a beginner-level understanding of a quantitative field, for example math or statistics.
Experienced researchers looking for recipes or quick how-to’s to get started in specific data analysis tasks related to computational genomics.

What will you get out of this?

This resource describes the skills and provides how-to’s that will help readers analyze their own genomics data.

After reading:

If you are not familiar with R, you will get the basics of R and dive right into specialized uses of R for computational genomics.
You will understand genomic intervals and operations on them, such as overlap.
You will be able to use R and its vast package library to do sequence analysis, such as calculating GC content for given segments of a genome or finding transcription factor binding sites.
You will be familiar with visualization techniques used in genomics, such as heatmaps, meta-gene plots, and genomic track visualization.
You will be familiar with supervised and unsupervised learning techniques which are important in data modeling and exploratory analysis of high-dimensional data.
You will be familiar with analysis of different high-throughput sequencing datasets (RNA-seq, ChIP-seq, BS-seq and multi-omics integration) mostly using R-based tools.
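
To give a flavor of the interval and sequence items in this list, here is a minimal sketch. It is not taken from the book, and it assumes the Bioconductor packages GenomicRanges and Biostrings are installed. The coordinates, sequences and object names (peaks, genes, seqs) are made up for illustration.

```r
library(GenomicRanges)
library(Biostrings)

# Two toy sets of intervals on chr1 (hypothetical coordinates)
peaks <- GRanges("chr1", IRanges(start = c(100, 500), end = c(200, 600)))
genes <- GRanges("chr1", IRanges(start = c(150, 1000), end = c(300, 1200)))

# Find which peaks overlap which genes
hits <- findOverlaps(peaks, genes)
queryHits(hits)    # indices into peaks
subjectHits(hits)  # indices into genes

# GC content (fraction of G or C) for two toy sequences
seqs <- DNAStringSet(c(a = "GGCCAATT", b = "ATATATGC"))
gc <- letterFrequency(seqs, letters = "GC", as.prob = TRUE)
gc
```

For real data the intervals would typically be read from a BED or GFF file and the sequences taken from a BSgenome package rather than typed in by hand.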

Monday, May 23, 2016

Computational Genomics Course in Berlin

The Berlin Institute for Medical Systems Biology is organizing a computational genomics course, and R programming will be used for most of the practical sessions. The course will cover basic statistics, programming and basic concepts in next-generation sequencing, as well as its applications such as RNA-seq, ChIP-seq, DNA-seq and metagenomics in the context of precision medicine. There will be practical sessions every day. The course will last 10 work days and will be taught at the MDC-Berlin campus between 12-23 September 2016.

Course Modules

Introduction to R and Bioconductor
Statistics and Exploratory Data analysis
Introduction to Next-gen sequencing
RNA-seq analysis
ChIP-seq analysis
Variant calling and annotation
Data integration and predictive modeling using HT-seq data
Metagenomics and human health

Instructors

The instructors include high-profile local and external scientists working on computational genomics.
See the full list at: http://compgen2016.mdc-berlin.de/

Travel Allowance

We offer successful candidates a travel allowance to cover travel and accommodation (in double rooms in downtown Berlin).

Apply By 01 July 2016

http://compgen2016.mdc-berlin.de/

Event poster

Computational genomics approaches to precision medicine from Altuna Akalin

Monday, November 22, 2010

Retrieving transcriptome sequences for RNASeq analysis

One approach to analyzing RNASeq data from an organism with a well-annotated genome is to align the reads to mRNA (cDNA) sequences instead of the genome. To do that, you need to extract the transcript sequences from a database. This is how to extract Ensembl transcript sequences from UCSC from within R:

_________________________________________________

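# Older Bioconductor releases call the three functions used below
# makeTranscriptDbFromUCSC(), extractTranscriptsFromGenome() and write.XStringSet().
library(GenomicFeatures)
library(BSgenome.Hsapiens.UCSC.hg18)

# Build a transcript database from the UCSC "ensGene" (Ensembl genes) table for hg18
tr <- makeTxDbFromUCSC(genome="hg18", tablename="ensGene")

# Extract the spliced transcript sequences from the genome, keeping transcript names
tr_seq <- extractTranscriptSeqs(Hsapiens, tr, use.names=TRUE)

# Write the sequences to a FASTA file, 80 bases per line
writeXStringSet(tr_seq, filepath="hg18.ensgene.transcripts.fasta", format="fasta", width=80, append=FALSE)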
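
Before handing a transcript FASTA file to an aligner, it can be worth reading it back to confirm that names and lengths survived the round trip. The sketch below shows the idea on a toy example so that it runs without downloading anything. It assumes a recent Biostrings release, where the writer is called writeXStringSet. The sequences and the names tx1 and tx2 are made up.

```r
library(Biostrings)

# Two toy "transcripts" (hypothetical sequences and names)
toy <- DNAStringSet(c(tx1 = "ATGGCCTAA", tx2 = "ATGAAATTTGGGTAG"))

# Write them to a temporary FASTA file, 80 bases per line
fa <- tempfile(fileext = ".fasta")
writeXStringSet(toy, filepath = fa, format = "fasta", width = 80)

# Read the file back and inspect names and sequence lengths
back <- readDNAStringSet(fa)
names(back)
width(back)
```

For the real file, replacing fa with "hg18.ensgene.transcripts.fasta" and looking at length(back) and summary(width(back)) gives a quick check on the number of transcripts and their length distribution.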