Selected publications

Some stage of review
  1. A Cheeger-Type Inequality on Simplicial Complexes.
  2. Randomized Algorithms for Dimension Reduction on Massive Data.
  3. Fréchet Means for Distributions of Persistence Diagrams.
  4. Statistical inference for dynamical systems: a review.
  5. Towards stratification learning through homology inference.
  6. Partial factor regression.
  7. Geometric Representations of Hypergraphs for Prior Specification and Posterior Sampling.
  8. Multiscale factor models for molecular networks.
Published or in press (since ~2001)
  1. Bayesian Sparse Factor Analysis of Genetic Covariance Matrices. (2013), Genetics.
  2. Genetics of gene expression responses to temperature stress in a sea urchin gene network. (2012), Molecular Ecology.
  3. A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning. (2012), PLoS One.
  4. Genetic effects on mating success and partner choice in a social mammal. (2012), American Naturalist.
  5. Cyclin-Dependent Kinases Are Regulators and Effectors of Oscillations Driven by a Transcription Factor Network. (2012), Molecular Cell.
  6. Local Homology Transfer and Stratification Learning. (2012), ACM-SIAM Symposium on Discrete Algorithms.
  7. Probability measures on the space of persistence diagrams. (2012), Inverse Problems.
  8. Integrating genetic and gene expression evidence into genome-wide association analysis of gene sets. (2012), Genome Research.
  9. RS-SNP: a random-set method for genome-wide association studies. (2011), BMC Genomics.
  10. Discovering genetic variants in Crohn's disease by exploring genomic regions enriched of weak association signals. (2011), Digestive and Liver Disease.
  11. Cross Species Genomic Analysis Identifies a Mouse Model as Undifferentiated Pleomorphic Sarcoma/Malignant Fibrous Histiocytoma. (2011), PLoS One.
  12. Estimating variable structure and dependence in Multi-task learning via gradients. (2011), Machine Learning.
  13. Multiscale factor models for molecular networks. (2011), Proc of JSM.
  14. On the reproducibility of results of pathway analysis in genome-wide expression studies of colorectal cancers. (2010), Journal of Biomedical Informatics.
  15. Localized Sliced Inverse Regression. (2010), Journal of Computational and Graphical Statistics.
  16. Learning gradients: predictive models that infer geometry and dependence. (2010), Journal of Machine Learning Research.
  17. Bayesian mixture of inverse regressions. (2010), International Conference on Artificial Intelligence and Statistics.
  18. Learning Gradients and Feature Selection on Manifolds. (2010), Bernoulli.
  19. Evidence-ranked motif identification. (2010), Genome Biology.
  20. Comparative study of gene set enrichment methods. (2009), BMC Bioinformatics.
  21. Genomic features that predict allelic imbalance in humans suggest patterns of constraint on gene expression variation. (2009), Molecular Biology and Evolution.
  22. Do serum biomarkers really measure breast cancer? (2009), BMC Cancer.
  23. Characterizing the developmental pathways TTF-1, NKX2-8, and PAX9 in lung cancer. (2009), Proc. Natl. Acad. Sci. USA.
  24. Local sliced inverse regression. (2009), Proceedings of Advances in Neural Information Processing Systems.
  25. Modeling cancer progression via pathway dependencies. (2008), PLoS Comput Biol.
  26. Gene Expression Programs of Human Smooth Muscle Cells: Tissue-Specific Differentiation and Prognostic Significance in Breast Cancers. (2007), PLoS Genetics.
  27. Understanding the use of unlabelled data in predictive modelling. (2007), Statistical Science.
  28. Characterizing the Function Space for Bayesian Kernel Models. (2007), J Mach Learn Res.
  29. Genomic sweeping for hypermethylated genes. (2007), Bioinformatics.
  30. Evidence of influence of genomic DNA sequence on human X chromosome inactivation. (2006), PLoS Comput Biol.
  31. Analysis of Sample Set Enrichment Scores: assaying the enrichment of sets of genes for individual samples in genome-wide expression profiles. (2006), Bioinformatics.
  32. Gene expression changes and molecular pathways mediating activity-dependent plasticity in visual cortex. (2006), Nat Neurosci.
  33. Estimation of Gradients and Coordinate Covariation in Classification. (2006), J Mach Learn Res.
  34. Learning Coordinate Covariances via Gradients. (2006), J Mach Learn Res.
  35. Statistical Learning: Stability is Sufficient for Generalization and Necessary and Sufficient for Consistency of Empirical Risk Minimization. (2006), Adv Comput Math.
  36. Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-wide Expression Profiles. (2005), Proc Natl Acad Sci USA.
  37. An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis. (2005), Nat Genet.
  38. Stability Results in Learning Theory. (2005), Anal App.
  39. Permutation Tests for Classification. (2005), Proceedings of the Conference on Learning Theory.
  40. Risk Bounds for Mixture Density Estimation. (2005), ESAIM: Probability and Statistics.
  41. Androgen-Induced Differentiation and Tumorigenicity of Human Prostate Epithelial Cells. (2004), Cancer Research.
  42. Learning Theory: general conditions for predictivity. (2004), Nature.
  43. Estimating Dataset Size Requirements for Classifying DNA Microarray Data. (2003), J Comput Biol.
  44. An Analytical Method for Multi-class Molecular Cancer Classification. (2003), SIAM Review.
  45. Optimal gene expression analysis by microarrays. (2002), Cancer Cell.
  46. Gene Expression-Based Classification and Outcome Prediction of Central Nervous System Embryonal Tumors. (2002), Nature.
  47. Choosing Multiple Parameters for Support Vector Machines. (2002), Machine Learning.
  48. A Uniform Approach to Molecular Cancer Diagnosis Using Tumor Gene Expression Signatures. (2001), Proc Natl Acad Sci U S A.
  49. Molecular classification of multiple tumor types. (2001), Bioinformatics.
  50. Bounds on sample size for policy evaluation in Markov environments. (2001), Proceedings of the Conference on Learning Theory.
  51. Feature Selection for SVMs. J Weston, S Mukherjee, O Chapelle, M Pontil, T Poggio, V Vapnik. Proc Neural Information Processing Systems.
Book Chapters
  1. Classifying Microarray Data Using Support Vector Machines. Understanding and Using Microarray Analysis Techniques: A Practical Guide.
  2. Regression and Classification with Regularization. Nonlinear Estimation and Classification.
  3. Uncertainty in Geometric Computations.
Unpublished notes
  1. Consistency of regularized sliced inverse regression for kernel models, Working Paper.
  2. Non-parametric Bayesian kernel models, Working Paper.
  3. Gene Selection via a Spectral Approach, IEEE Workshop on Computer Vision Methods for Bioinformatics.
  4. Support Vector Method for Multivariate Density Estimation, CBCL/AI Memo.
  5. Support Vector Machine Classification of Microarray Data, CBCL/AI Memo.