Friday, December 18, 2015

The isoforms of Ube3a

The isoform abundances of Ube3a (in colon and brain cortex of adult mice)

Data 
wgEncodeCshlLongRnaSeqColonAdult8wksAlnRep1.bam
wgEncodeCshlLongRnaSeqColonAdult8wksAlnRep1V2.bam
wgEncodeCshlLongRnaSeqCortexAdult8wksAlnRep1.bam
wgEncodeCshlLongRnaSeqCortexAdult8wksAlnRep2.bam
I used TopHat for the transcriptome analysis and examined the isoform_exp.diff file:
mRNA_id         gene_id         locus                   fpkm_colon  fpkm_cortex
NM_001033962    Ube3a_isoform_3 chr7:66484119-66562097  0.004558    5.07055
NM_011668       Ube3a_isoform_2 chr7:66484119-66562097  0.230222    0.60659
NM_173010       Ube3a_isoform_1 chr7:66484119-66562097  5.82296     1.36267
It seems that iso3 of Ube3a is the main variant in the mouse cortex, whereas iso1 is the most abundant in the colon. Iso2 of Ube3a encodes the full-length protein. Iso1 is considered to be E3-ligase deficient, as it lacks 87 amino acids from the C-terminal HECT domain. Both iso1 and iso3 also lack 21 amino acids from the N-terminus. Regarding localization, iso1 and iso2 are found ubiquitously throughout the cell, whereas iso3 is confined to the nucleus.
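
As a rough check of the abundances in the table, the FPKM values can be converted into each isoform's share of the total Ube3a signal per tissue. This is only a minimal sketch in Python: the numbers are copied by hand from the table, and the function names (isoform_fractions, dominant_isoform) are my own, hypothetical helpers, not part of any RNA-seq tool.

```python
# FPKM values copied by hand from the isoform_exp.diff excerpt in the table.
fpkm = {
    "Ube3a_isoform_3": {"colon": 0.004558, "cortex": 5.07055},
    "Ube3a_isoform_2": {"colon": 0.230222, "cortex": 0.60659},
    "Ube3a_isoform_1": {"colon": 5.82296, "cortex": 1.36267},
}


def isoform_fractions(fpkm, tissue):
    """Return each isoform's share of the summed FPKM in one tissue."""
    total = sum(values[tissue] for values in fpkm.values())
    return {iso: values[tissue] / total for iso, values in fpkm.items()}


def dominant_isoform(fpkm, tissue):
    """Return the isoform with the largest FPKM share in one tissue."""
    fractions = isoform_fractions(fpkm, tissue)
    return max(fractions, key=fractions.get)


if __name__ == "__main__":
    for tissue in ("colon", "cortex"):
        fractions = isoform_fractions(fpkm, tissue)
        for iso in sorted(fractions):
            print(f"{tissue}\t{iso}\t{fractions[iso]:.1%}")
        print(f"{tissue}\tdominant\t{dominant_isoform(fpkm, tissue)}")
```

Proportions of FPKM are only a crude summary, since they ignore replicate variation and any uncertainty in assigning reads to isoforms that share the same locus.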



Tuesday, December 1, 2015

MOOC Courses for Genetics/Genomics Data Analysis

I have taken a number of massive open online courses (MOOCs) since 2014, and here is a short summary.

In general, I believe learning by doing is the most efficient way, and learning a subject without applying the knowledge to a practical problem is unlikely to result in a good understanding. That said, I tend to resist learning things that feel less relevant to my current work; for example, I used to think it didn’t make much sense to learn NGS analysis if I was not doing NGS studies. The dilemma is that, in many circumstances, tackling a complex problem requires a certain level of skill and knowledge, as well as awareness of the existing tools. From this personal learning experience, I can say I benefited quite a lot from the MOOCs, and I am happy that I invested my time in them. These courses serve as a good starting point for building a broad knowledge base.

The most popular MOOC websites are Coursera, EdX and Stanford OpenEdX. The first two provide more courses on genetics/genomics/bioinformatics; on OpenEdX, some courses are not free.

At Coursera, I obtained certificates for courses including:
Johns Hopkins University  Regression Models 
Johns Hopkins University  Statistical Inference
Johns Hopkins University  The Data Scientist’s Toolbox
Johns Hopkins University  Reproducible Research
Johns Hopkins University  R Programming (highly recommended)
University of Michigan  Programming for Everybody (Python)

At EdX, I finished the following courses with a certificate:
HarvardX -  PH525x  Data Analysis for Genomics (highly recommended)
MITx -  6.00.1x   Introduction to Computer Science and Programming Using Python (highly recommended)

At OpenEdX, I took the course Statistics in Medicine but didn’t finish it. 

The courses I like the most are:
PH525x Data Analysis for Genomics - this course covers a wide range of genomics analyses, and the instruction is easy to follow. You can always find something useful. From this course I got to know pheatmap, a handy tool for plotting (elegant) heatmaps, and it has become one of my favourite R packages.

MITx 6.00.1x Introduction to Computer Science and Programming Using Python - very entertaining, and I like the way Prof. Grimson taught. It is not only about how to program in Python; I learned more about computational thinking.

R Programming - I skipped a lot of the lecture videos but enjoyed working on the assignments.


The Quantitative Biology Workshop (7.QBWx), in my opinion, is not very focused, and the transition from the learning material to the questions is not always smooth. Although I understand that MATLAB is widely used in neuroscience, I do hope the course can adopt R or Octave instead, since both are free software.