MetAssign: Using Clustering to Identify Metabolites

In this tutorial I will present a method called MetAssign that provides probabilistic identifications of groups of peaks derived from metabolomics experiments. The method uses a mixture model with a Dirichlet Process prior to group together peaks that appear to be derived from the same metabolite. Each group is also assigned a putative chemical formula, and the groups are post-processed using a set of heuristic rules to filter out false positives. I will first introduce the metabolomic data generation process, highlighting the statistical and machine learning challenges it presents, before introducing mixture models and Dirichlet Process priors. Finally, I will describe the specific MetAssign model and present results on benchmark datasets.
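
As a rough illustration of why a Dirichlet Process prior suits this kind of grouping, the sketch below draws cluster labels from a Chinese Restaurant Process, a standard way of sampling partitions from such a prior. It is not the MetAssign model: there is no likelihood over peak masses or intensities here, and the function name `crp_assignments` and the parameter `alpha` are assumptions made for the example. It only shows that the number of groups is not fixed in advance but grows with the data.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese Restaurant Process.

    Illustrative only: this samples from the prior over partitions and
    ignores any observed peak data.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = crp_assignments(n_items=50, alpha=1.0)
    print("number of clusters:", len(counts))
    print("cluster sizes:", counts)
```

Larger values of `alpha` make new clusters more likely, so the prior favours more, smaller groups. In a full mixture model these prior weights would be multiplied by how well each peak fits each existing group.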

