What if we could identify peptides that are specific to a given biological function within a desired taxonomic group?
Getting answers to important questions from ocean metaproteomics data is especially difficult for one simple reason: the great majority of organisms producing the proteins in a seawater sample do not have assembled genomes! This means that we don't have protein databases to search our experimental spectra against, which is how most proteomics experiments identify peptides. Damon develops computational methods that use short-read DNA sequencing data from seawater samples to construct databases of peptides that might be found within the samples. Searching our spectra against such a database gives us a list of short peptide sequences that are present in the sample, but those peptide sequences by themselves don't tell us which organisms made them or what those organisms were doing. What next?
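To make the idea of a read-derived peptide database concrete, here is a minimal sketch rather than Damon's actual pipeline. It assumes a single forward reading frame, the standard genetic code, and trypsin-like cleavage (after K or R, unless followed by P); the function names `translate` and `tryptic_peptides` and the `min_len` cutoff are hypothetical choices made only for illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA read in its first forward frame; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def tryptic_peptides(protein, min_len=6):
    """Split a translated read at stop codons, then cleave after K/R (not before P).

    Assumption: trypsin-like digestion; fragments shorter than min_len or
    containing an ambiguous residue ('X') are discarded.
    """
    peptides = []
    for orf in protein.split("*"):
        start = 0
        for i, aa in enumerate(orf):
            is_last = i == len(orf) - 1
            if is_last or (aa in "KR" and orf[i + 1] != "P"):
                pep = orf[start:i + 1]
                if len(pep) >= min_len and "X" not in pep:
                    peptides.append(pep)
                start = i + 1
    return peptides


read = "ATGGCTAAAGGTTGGCGTTAA"  # toy read, not real data
candidates = tryptic_peptides(translate(read), min_len=3)
```

A real workflow would also have to handle all six reading frames, sequencing errors, and peptides truncated at read boundaries, none of which this sketch attempts.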
Damon is developing methods to robustly and confidently assign taxonomic and functional annotations to each peptide. He uses sequence homology to organisms within the NCBI taxonomy database to assign a Lowest Common Taxonomic Unit (LCTU) to many identified peptides. Similarly, he uses sequence similarity to derive a Lowest Common Functional Unit (LCFU) for the same peptides. These doubly-annotated peptides are then used as functional-taxonomic units that can be directly compared across samples (for example, across a gradient of ocean depth) to make a case for differential activity between the samples.
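One way to picture the "lowest common unit" idea is as the deepest node shared by every lineage a peptide matches. The sketch below is an illustration under stated assumptions, not the lab's implementation: lineages are given as root-to-leaf lists, the function name `lowest_common_unit` is hypothetical, and the example lineages are simplified toy values rather than records taken from the NCBI taxonomy database.

```python
def lowest_common_unit(lineages):
    """Return the deepest label shared by all lineages, or None if none is shared.

    Each lineage is a list ordered from the root to the most specific level.
    The same logic applies to a taxonomic or a functional hierarchy.
    """
    if not lineages:
        return None
    common = None
    # zip stops at the shortest lineage, so only levels present in all are compared.
    for level in zip(*lineages):
        if all(label == level[0] for label in level):
            common = level[0]
        else:
            break
    return common


# Toy lineages for one peptide that matches two different genera (illustrative only).
hits = [
    ["Bacteria", "Cyanobacteria", "Prochlorococcus"],
    ["Bacteria", "Cyanobacteria", "Synechococcus"],
]
lctu = lowest_common_unit(hits)  # the peptide is only informative down to this level
```

Here the peptide cannot distinguish the two genera, so it is assigned to the level they share. A peptide matching a single lineage keeps its most specific label, and one whose matches share nothing gets no assignment.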
We will be unraveling the complexities of metaproteomics and making all progress available to the public. Our most recent tool and progress on metapeptides can be found here.
Lowest Common Taxonomic Unit (LCTU)
Lowest Common Functional Unit (LCFU)