Cross-check metabolite IDs produced by R IPA and ipaPy2 on the same LC-MS dataset and keep only the high-confidence matches.
Reproduce the worked E. coli example from the linked Current Analytical Chemistry paper as a teaching exercise.
Score candidate annotations by HMDB or KEGG identifier match, molecular skeleton, formula, pathway, or name similarity.
Build a reproducible annotation-consensus step into a larger metabolomics pipeline written in Python.
The package is alpha (v0.1) and not yet on PyPI, so it must be installed from source with pip. Only the R IPA and ipaPy2 readers are supported today.
COMA stands for Consensus Of Metabolite Annotations. It is a Python tool for researchers in metabolomics, the study of the small chemical compounds found in living things. When scientists run samples through a mass spectrometer, they get back lists of possible chemical matches, and different software programs often disagree about which match is correct. COMA tries to settle those disagreements in a structured way.

The problem the project addresses is that two annotation tools, given the same raw data, can produce different top guesses for what a chemical signal represents. Researchers today usually deal with this by picking one tool and ignoring the others, or by comparing the outputs by hand in spreadsheets, which is slow and hard to reproduce. COMA reads the outputs of these tools, lines them up, and reports where they agree.

Agreement is judged at five levels of strictness. The strongest level is an exact match on a chemical's database identifier, such as an HMDB or KEGG code. The weaker levels are matching the chemical skeleton, matching the molecular formula within a small mass tolerance, sharing a metabolic pathway, and having similar names. Each level produces a confidence score, and the final output is a flat table with one row per pair of guesses, labelled high, medium, or low confidence.

The current version is marked alpha and called v0.1. It supports two reader formats, R IPA and ipaPy2. The roadmap for v0.2 lists additional readers for SIRIUS, GNPS, and MetFrag, along with a smarter scoring model, a visualisation module, and HTML reports. Installation is from source via pip; the package is on GitHub but not yet on PyPI. The licence is MIT.

The repository is tied to a research paper by Lita Doolan at City St George's and King's College London, currently under review at Current Analytical Chemistry. An example E. coli dataset from that paper is included in the source tree as a worked example.
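
The five agreement levels described above can be illustrated with a minimal Python sketch. This is not COMA's actual code or API: the `Annotation` schema, the function names, the numeric scores, the 5 ppm tolerance, the name-similarity cutoff, and the high/medium/low thresholds are all assumptions made for illustration. The skeleton check here compares the first 14 characters of an InChIKey (the connectivity block), which is one plausible way to test for a shared skeleton and may differ from what the package does.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """One candidate annotation from one tool (hypothetical schema)."""
    tool: str
    feature_id: str
    name: str
    database_id: Optional[str] = None   # e.g. an HMDB or KEGG code
    inchikey: Optional[str] = None
    formula: Optional[str] = None
    mass: Optional[float] = None        # monoisotopic mass in Da
    pathways: frozenset = frozenset()


def agreement_level(a, b, ppm=5.0, name_cutoff=0.85):
    """Return (level, score) for the strictest level at which a and b agree.

    Scores and thresholds are illustrative assumptions, not COMA's values.
    """
    if a.database_id and a.database_id == b.database_id:
        return "identifier", 1.0
    # Assumption: the first InChIKey block stands in for the skeleton.
    if a.inchikey and b.inchikey and a.inchikey[:14] == b.inchikey[:14]:
        return "skeleton", 0.8
    if (a.formula and a.formula == b.formula
            and a.mass is not None and b.mass is not None
            and abs(a.mass - b.mass) <= a.mass * ppm * 1e-6):
        return "formula", 0.6
    if a.pathways & b.pathways:
        return "pathway", 0.4
    ratio = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    if ratio >= name_cutoff:
        return "name", 0.2
    return "none", 0.0


def confidence_label(score):
    """Map a score to high / medium / low (hypothetical cut points)."""
    if score >= 0.8:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


def consensus(left, right):
    """Build a flat table: one row per pair of guesses for the same feature."""
    rows = []
    for a, b in product(left, right):
        if a.feature_id != b.feature_id:
            continue
        level, score = agreement_level(a, b)
        rows.append({
            "feature_id": a.feature_id,
            "left": f"{a.tool}: {a.name}",
            "right": f"{b.tool}: {b.name}",
            "level": level,
            "score": score,
            "confidence": confidence_label(score),
        })
    return rows


if __name__ == "__main__":
    ipa = [Annotation("R IPA", "F1", "L-Glutamate", database_id="C00025")]
    ipapy2 = [Annotation("ipaPy2", "F1", "L-Glutamic acid", database_id="C00025")]
    for row in consensus(ipa, ipapy2):
        print(row)
```

Checking the levels from strictest to weakest and returning on the first hit means each pair is reported once, at the strongest level at which the two tools agree. Rows with no agreement are kept and labelled low rather than dropped, so the table still records where the tools disagree.
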
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.