• We recently published "Network modeling of the transcriptional effects of copy number aberrations in glioblastoma" in Molecular Systems Biology (a top 10 MSB download in 2011). Check out our prognostic tool, which uses decompositions of models of mRNA expression based on DNA copy number aberrations to differentiate patient survival.
• The EPoC package appears in the Springer book series "Advances in Systems Biology". This R package generates network models for mRNA and mRNA-CNA data, and extracts prognostic markers using a sparse SVD of the model matrix.
• I am currently teaching a PhD course on "Bootstrap methods" in lp2 and 3 (winter 2011 - spring 2012). Contact me if you are interested in taking this course.
My research centers on the development of new statistical methodology for clustering and model selection, with applications to high-dimensional biological data.
I am particularly interested in integrating techniques from information theory into new tools for statistical model selection and high-dimensional data exploration. Recent efforts in this area include Simultaneous model selection via rate-distortion, which allows for the identification of genes and gene clusters associated with interpretable models derived from the experimental design. This work has been extended to data integration, involving mRNA expression, protein metabolic data and pathway information (work with recent graduate Alexandra Jauhiainen).
Data integration and joint modeling are a rich source of research problems. These types of problems are central components of several joint projects now underway in collaboration with Sven Nelander's group. Our group aims to formulate integrated models for mRNA, microRNA, DNA copy number, methylation and mutation in human cancer.
In addition to projects stemming from systems biology problems, my students and I are also investigating statistical clustering methodology, particularly how subsets of features play a role in the formation of clusters of observations. These projects constitute continuations of my research into mixture modeling, data depth and missing value imputation (see publication list).
My research is often motivated by my collaborative projects. The best thing about being a statistician is that you get the opportunity to work with people from other disciplines.
I work closely with the Nelander lab on problems pertaining to network modeling of cancer, data integration in cancer genomics, and the identification of therapeutic targets.
With Mikael Benson's group, I investigate data integration problems, looking for disease-causing (biomarker) genes using mRNA expression, SNP, and PPI data.
I have recently become involved in projects at the Center for Brain Repair and Rehabilitation (CBR). Here, together with a team of physicians, neuroscientists and musicians, we explore the therapeutic impact of music.
Since coming to Chalmers/GU I have also become involved in the SHRP2 project, sponsored by the US TSA and housed at Chalmers' SAFER.
R, Sweave and Reproducible research
I am an avid fan of the R project.
I believe that we should give more recognition to the scientists who provide easy-to-use implementations of their methods.
The power of R is that current research is often available as packages almost immediately upon the publication of the methodology in a journal. This leads to a fluidity of ideas and implementations and, more importantly, is a key component of reproducible research.
My interest in reproducible research was awakened some time ago while talking with Keith Baggerly (check out his papers and debate articles on the subject).
I recently gave a short lecture about the integration of R and LaTeX (R-Sweave) for dynamic report writing. I am so sold on this idea that I am implementing all my lecture notes in R-Sweave. My students can thus reproduce lecture notes and coding demos at home.
My classes are usually made up of a mix of students: undergraduates, master's students and PhD students, all from different fields of study. I enjoy this kind of dynamic classroom.
I tend to mix black-board lectures with computer demonstrations for all my classes. My goal in teaching is that the students leave my class recognizing that statistical modeling is not a "push-the-button" type exercise, and every data set requires unique consideration.
Courses & Workshops
I cycle between teaching applied statistics courses (Linear models, Applied multivariate analysis) and method courses (Statistical inference principles and Survival analysis).
I also teach PhD courses, sometimes jointly with upper-division master's programs (Sparse modeling, Empirical Bayes, Bootstrap methods).
Here is a mirror link to the courses I taught at Rutgers University.