I am interested in methods and applications of single-cell gene expression assays. There is great hope, for good reason, that measuring expression profiles in single cells will aid immunological and cancer research and provide new insight into cellular biology (though it still provides only a cross-sectional sample of cells at various points in their lifespan and cell cycle). However, the biochemical, computational, and statistical challenges are sizable.
I focus on the combination of all three. Through the use of unique molecular identifiers, it may be possible to better understand the biases and sources of variability that current protocols introduce. I write R software to allow easier manipulation of single-cell data sets (which are sometimes almost as "tall" as they are "wide"). I develop statistical methods that accommodate the bimodality of single-cell data and thereby yield more sensitive and better-calibrated tests.
Statistical questions I am currently interested in include:
Graphical modeling and gene-gene interaction networks
Clustering and distance measures for zero-inflated data
Borrowing strength and regularizing vector generalized linear models through empirical Bayes procedures
Second-order univariate phenomena, such as exceptional stability or heterogeneity of expression (and how these can be identified given differing levels of technical variability)