• PhD opening.
We invite applications for a PhD position starting in September 2015; the application deadline is February 15th. Please check the department home page for details on how to apply.
The project, in collaboration with the Nelander group at Uppsala University, centers on uncertainty estimation and validation of high-dimensional models in cancer genomics. We are particularly interested in (i) the integration of data sources (gene expression, methylation, copy number variation, mutations); (ii) the development of disease-comparative models; and (iii) the derivation of formal validation tools for comparative models at different levels of detail.
The project requires us to develop new statistical methodology, fast and efficient implementations, and validation schemes, and to apply these to large-scale human cancer data sets (both public data sets from the Cancer Genome Atlas and tumor profiles obtained through our collaborators at Uppsala).
We are looking for people with strong mathematical, statistical and/or computational skills. You do not need to have any prior training in biology/genetics but will be expected to read up on these subjects in order to better participate in the collaboration.
My research centers on the development of new statistical methodology for network modeling, clustering and model selection, with applications to high-dimensional biological data.
I am particularly interested in integrating techniques from information theory into new tools for statistical model selection and high-dimensional data exploration. Recent efforts in this area include Simultaneous model selection via rate-distortion, which allows for the identification of genes and gene clusters associated with interpretable models derived from the experimental design. This work has been extended to data integration, involving mRNA expression, protein metabolic data and pathway information (work with recent graduate Alexandra Jauhiainen).
Data integration and joint modeling are a rich source of research problems. These types of problems are central components of several joint projects now underway in collaboration with Sven Nelander's group. Our group aims to formulate integrated models for mRNA, microRNA, DNA copy number, methylation and mutation in human cancer.
In addition to projects stemming from systems biology problems, my students and I are also investigating statistical clustering methodology, particularly how subsets of features play a role in the formation of clusters of observations. These projects constitute continuations of my research into mixture modeling, data depth and missing value imputation (see publication list).
My research is often motivated by my collaborative projects. The best thing about being a statistician is that you get the opportunity to work with people from other disciplines.
I work closely with the Nelander lab on problems pertaining to network modeling of cancer, data integration in cancer genomics, and the identification of therapeutic targets.
With Mikael Benson's group, I investigate data integration problems, looking for disease-causing (biomarker) genes using mRNA expression, SNP, and PPI data.
I have recently become involved in projects at the Center for Brain Repair and Rehabilitation (CBR). Here, together with a team of physicians, neuroscientists and musicians, we explore the therapeutic impact of music.
Since coming to Chalmers/GU I have also become involved in the SHRP2 project, sponsored by the US TSA and housed at Chalmers' SAFER.
R, Sweave and Reproducible research
I am an avid fan of the R project.
I believe that we should give more recognition to the scientists who provide easy-to-use implementations of their methods.
The power of R is that current research is often available as packages almost immediately upon the publication of the methodology in a journal. This leads to a fluidity of ideas and implementations and, more importantly, is a key component of reproducible research.
My interest in reproducible research was awakened some time ago when talking with Keith Baggerly (check out his papers and debate articles on the subject).
I recently gave a short lecture about the integration of R and LaTeX (R-Sweave) for dynamic report writing. I am so sold on this idea that I am implementing all my lecture notes in R-Sweave. My students can thus reproduce lecture notes and coding demos at home.
My classes are usually made up of a mix of students: undergraduates, master's students and PhD students, all from different fields of study. I enjoy this kind of dynamic classroom.
I tend to mix blackboard lectures with computer demonstrations in all my classes. My goal in teaching is that the students leave my class recognizing that statistical modeling is not a "push-the-button" type of exercise, and that every data set requires unique consideration.
Courses & Workshops
I alternate between teaching applied statistics courses (Linear models, Applied multivariate analysis) and methods courses (Statistical inference principles and Survival analysis).
I also teach PhD courses, sometimes jointly with upper-division master's programs (Sparse modeling, Empirical Bayes, Bootstrap methods).
Here is a mirror link to courses I taught at Rutgers University.