A PySide6 desktop application for processing direct-infusion FT-ICR mass spectrometry data using the CoreMS framework and an embedded AI agent.
CoreMS Orchestrator is an AI-augmented desktop application for automated processing, molecular formula assignment, and intelligent interpretation of Fourier-transform ion cyclotron resonance (FT-ICR) mass spectrometry data. The inventive features are:
(a) Embedded autonomous AI agent with dual-scope tool execution. A large language model (LLM) agent operates within the desktop application and possesses programmatic control over both the local four-stage processing workflow (import, calibrate, search, export) and remote computational infrastructure via the Model Context Protocol (MCP). The agent can autonomously execute any combination of local GUI operations and server-side actions (database queries, batch workflow submission, parameter management, object storage) within a single conversational interaction, without requiring the user to manually switch between interfaces.
(b) Structured concurrency bridge between synchronous GUI and asynchronous protocol layers. The application implements a signal-mediated architecture that bridges PySide6/Qt's synchronous event loop with asynchronous MCP transport (anyio/httpx) and OpenAI-compatible LLM inference, enabling real-time agentic tool-use loops within a responsive desktop GUI without blocking the user interface.
(c) Context-aware scientific reasoning with live spectral state injection. The agent receives a continuously updated representation of the current spectral state, including all assigned peaks with mass accuracy, molecular formula, double-bond equivalence, compound class, and confidence scores. This enables domain-specific analytical reasoning (van Krevelen analysis, compound class distributions, aromaticity assessment) grounded in the actual experimental data rather than generic knowledge.
(d) Instrument-agnostic, endpoint-agnostic design. The application supports multiple instrument vendors (Bruker SolarIX, Thermo Scientific) and can connect to any OpenAI-compatible LLM endpoint and any MCP-compliant server, permitting deployment within diverse institutional computing environments without vendor lock-in.
(e) Planned: Automated molecular networking from tandem MS data. Future extensions incorporate MS/MS fragmentation analysis for organometallic compound characterization and metabolomics, with AI-driven construction and annotation of molecular similarity networks.
| Feature | Details |
|---|---|
| Non-blocking processing | Each step runs in a QThread; the UI stays responsive |
| Four-step workflow | Import → Calibrate → Search → Export |
| Live Matplotlib plots | Mass spectrum, assigned bar chart, DBE vs C#, PPM error histogram |
| Sortable / filterable results table | Click any column header to sort; type to filter |
| Multiple export formats | CSV, HDF5, Excel (.xlsx) |
| Agent chatbot | Embedded OpenAI-compatible assistant — reads live spectrum context, triggers local workflow steps, and calls remote CoreMS MCP tools |
- Python 3.10 or newer
- CoreMS installed in the same environment
```bash
git clone https://github.com/EMSL-Computing/CoreMS-GUI.git
cd CoreMS-GUI
```

Follow the CoreMS installation guide or install from source:

```bash
git clone https://github.com/EMSL-Computing/CoreMS.git
cd CoreMS
pip install -e .
cd ..
```

Install corems-gui as an editable package (recommended for development):

```bash
pip install -e .
```

Or install only the runtime dependencies directly:

```bash
pip install -r requirements.txt
```

macOS note: PySide6 ships its own Qt libraries, so no separate Qt installation is required.
```bash
corems-gui
```

```bash
python -m corems_gui
```

```python
from corems_gui import main
main()

# or launch just the window inside an existing QApplication:
from corems_gui import FTICRMainWindow
window = FTICRMainWindow()
window.show()
```

```python
import corems_gui
print(corems_gui.__version__)  # 0.1.0
```

This project uses bump-my-version to keep version numbers consistent across pyproject.toml and corems_gui/__init__.py.
Install the tool:
```bash
pip install bump-my-version
```

Bump the version:

```bash
# patch: 0.1.0 → 0.1.1
bump-my-version bump patch
# minor: 0.1.0 → 0.2.0
bump-my-version bump minor
# major: 0.1.0 → 1.0.0
bump-my-version bump major
```

Each bump updates both files, creates a commit, and tags the release (e.g. v0.1.1).
┌──────────────────────────────────────────────────────────────┐
│ MenuBar (File · Help) │
├──────────────────┬───────────────────────────────────────────┤
│ Step tabs │ Spectrum plot (Matplotlib + nav toolbar) │
│ 1. Import │ ─────────────────────────────────────────│
│ 2. Calibrate │ Results table (sortable / filterable) │
│ 3. Search │ + Processing log │
│ 4. Export │ │
├──────────────────┴───────────────────────────────────────────┤
│ Status bar [ progress indicator ] │
└──────────────────────────────────────────────────────────────┘
Select a Bruker Solarix .d directory. Configure:
- Apodization method and zero-fill / truncation counts
- Noise threshold method and sensitivity parameters
- Peak-picking m/z range and minimum prominence
Click Import & Process to load the transient and build the mass spectrum.
Select a reference mass list (.ref format, two-column: Formula m/z). Configure:
- PPM error window for calibrant matching
- Polynomial order (1 = linear, 2 = quadratic)
- Minimisation method and signal-to-noise threshold for calibrant peaks
Click Run Calibration to apply m/z domain calibration.
Configure the molecular formula search:
- PPM error tolerance and DBE (double bond equivalent) range
- Ion types: protonated `[M±H]`, radical `[M]•`, or adduct
- Per-atom min/max ranges for C, H, O, N, S, P
- Scoring method and first-hit mode
Click Search Molecular Formulas to assign formulas to every peak.
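To make the PPM tolerance and DBE range concrete, here is a minimal sketch of how one candidate formula could be checked against a measured peak. It uses the standard DBE expression for neutral CHNOSP formulas and rounded monoisotopic masses. The helper names are hypothetical and the filtering logic is an assumption for illustration, not the CoreMS search algorithm.

```python
# Monoisotopic masses in Da, rounded to six decimals for illustration.
MONOISOTOPIC = {
    "C": 12.0, "H": 1.007825, "N": 14.003074,
    "O": 15.994915, "S": 31.972071, "P": 30.973762,
}
ELECTRON = 0.000549  # electron mass in Da


def dbe(formula: dict) -> float:
    """Double-bond equivalent of a neutral formula with standard valences:
    DBE = C - H/2 + (N + P)/2 + 1. Divalent O and S do not contribute."""
    c, h = formula.get("C", 0), formula.get("H", 0)
    n, p = formula.get("N", 0), formula.get("P", 0)
    return c - h / 2 + (n + p) / 2 + 1


def deprotonated_mz(formula: dict) -> float:
    """Theoretical m/z of the singly charged [M-H]- ion."""
    neutral = sum(MONOISOTOPIC[el] * count for el, count in formula.items())
    return neutral - MONOISOTOPIC["H"] + ELECTRON


def passes_filters(formula, measured_mz, ppm_tolerance, dbe_min, dbe_max) -> bool:
    """True if the candidate is within the mass tolerance and the DBE range."""
    theoretical = deprotonated_mz(formula)
    error_ppm = (measured_mz - theoretical) / theoretical * 1e6
    return abs(error_ppm) <= ppm_tolerance and dbe_min <= dbe(formula) <= dbe_max


glucose = {"C": 6, "H": 12, "O": 6}
benzene = {"C": 6, "H": 6}

accepted = passes_filters(glucose, 179.0562, ppm_tolerance=1.0, dbe_min=0, dbe_max=20)
rejected = passes_filters(glucose, 179.0570, ppm_tolerance=1.0, dbe_min=0, dbe_max=20)
```

Tightening the PPM tolerance removes candidates that only match by mass coincidence. The DBE range removes formulas that are chemically implausible for the sample.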
Choose one or more output formats and a filename stem, then click Export Results:
| Format | Description |
|---|---|
| `.csv` | Flat tabular results, Excel-compatible |
| `.hdf5` | Full CoreMS HDF5 archive (metadata + spectra) |
| `.xlsx` | Excel workbook via pandas |
A live Spectrum Summary shows peak count, assignment rate, m/z range, and baseline noise.
| View | Description |
|---|---|
| Mass Spectrum | All peaks (blue) with assigned peaks highlighted (red) |
| Assigned vs. Unassigned | Bar chart of peak assignment counts |
| DBE vs. C# | Scatter plot of double-bond equivalent vs carbon number |
| PPM Error Distribution | Histogram of formula assignment errors |
The Agent tab (right panel) embeds an OpenAI-compatible LLM assistant (default: PNNL AI Incubator Depot) that can:
- Answer questions about the currently loaded spectrum using a live peak table passed as context.
- Execute local workflow steps (Import, Calibrate, Search, Export) directly from the chat, with optional parameter overrides.
- Call remote tools via the CoreMS MCP server to query the EMSL database or submit server-side processing jobs.
- Install dependencies (included in `pip install -e .`):

  ```bash
  pip install openai 'mcp[cli]>=1.8' PyJWT
  ```

- Start the CoreMS MCP server (from `mcp/`):

  ```bash
  cd /path/to/corems-app/mcp
  python server.py
  ```

- In the Agent tab → Agent Settings:
  - Set LLM API Key (or set the `LLM_API_KEY` environment variable)
  - Set LLM Base URL (default: `https://ai-incubator-api.pnnl.gov`)
  - Set Model to any model available on the endpoint, e.g. `gpt-4o-birthright`
  - Set MCP URL to `http://localhost:8811/mcp` (default)

- To authenticate against protected MCP tools, expand Generate Auth Token:
  - Enter the server's Secret Key (matches `SECRET_KEY` in the CoreMS API config)
  - Optionally adjust User ID, First/Last Name, Email
  - Click Generate Token. A local HS256 JWT is created and filled into the Auth Token field automatically (same algorithm as `ftms_monet_etl`)
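For readers curious about what generating a local HS256 JWT amounts to, the following sketch builds and verifies one with only the standard library. The claim names (`user_id`, `first_name`, `last_name`, `email`, `exp`) and the secret are assumptions for illustration; the claims the CoreMS API actually expects are not specified here. PyJWT, which the application depends on, performs the same operation through `jwt.encode(claims, secret, algorithm="HS256")`.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256)
    return _b64url(digest.digest())


def make_hs256_jwt(claims: dict, secret: str) -> str:
    """Return header.payload.signature signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_hs256_jwt(token: str, secret: str) -> dict:
    """Check the signature and return the decoded claims."""
    signing_input, _, signature = token.rpartition(".")
    if not hmac.compare_digest(signature, _sign(signing_input, secret)):
        raise ValueError("signature mismatch")
    payload = signing_input.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical claims and secret, for illustration only.
claims = {
    "user_id": 1,
    "first_name": "Ada",
    "last_name": "Example",
    "email": "ada@example.org",
    "exp": int(time.time()) + 3600,  # expires in one hour
}
token = make_hs256_jwt(claims, secret="not-a-real-secret")
decoded = verify_hs256_jwt(token, secret="not-a-real-secret")
```

Because HS256 is symmetric, the token only validates on a server configured with the same secret, which is why the Secret Key must match `SECRET_KEY` in the CoreMS API config.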
The agent has five built-in tools that operate directly on the local GUI — no MCP server required:
| Tool | Description |
|---|---|
| `gui_get_state()` | Return a JSON snapshot of the current panel settings and loaded spectrum. The agent calls this first to confirm parameters before running any step. |
| `gui_run_import(...)` | Trigger Step 1 (Import) with optional parameter overrides (file path, apodization, noise method, m/z range, etc.). |
| `gui_run_calibrate(...)` | Trigger Step 2 (Calibrate) with optional overrides (reference file, PPM window, polynomial order, etc.). |
| `gui_run_search(...)` | Trigger Step 3 (Search) with optional overrides (PPM tolerance, DBE range, ion types, atom ranges, etc.). |
| `gui_run_export(...)` | Trigger Step 4 (Export) with optional overrides (output path, formats). |
All parameters are optional — omitted ones use the current panel values. The agent always asks for confirmation before executing a step unless explicitly instructed to proceed.
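The "omitted parameters fall back to the panel values" rule can be pictured as a dictionary overlay. The sketch below is one plausible way to express it. The function name and the parameter keys (`ppm_tolerance`, `dbe_min`, `dbe_max`, `atoms`) are hypothetical and are not taken from the application's code.

```python
from typing import Optional


def resolve_parameters(panel_values: dict, overrides: Optional[dict] = None) -> dict:
    """Overlay tool-call overrides on the current panel values.

    Unknown keys are rejected so that a misspelled argument from the agent
    cannot be silently ignored. Overrides set to None are treated as omitted.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(panel_values)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    supplied = {key: value for key, value in overrides.items() if value is not None}
    return {**panel_values, **supplied}  # a new dict; the panel state is untouched


# Hypothetical panel state for the Search step.
panel = {"ppm_tolerance": 1.0, "dbe_min": 0, "dbe_max": 50, "atoms": ["C", "H", "O", "N", "S", "P"]}

# "Search for molecular formulas using a ±2 ppm window and CHO only."
merged = resolve_parameters(panel, {"ppm_tolerance": 2.0, "atoms": ["C", "H", "O"]})
```

Returning a new dictionary instead of mutating the panel state keeps the confirmation step simple: the agent can show the user the merged values before anything runs.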
| Category | Auth required | Description |
|---|---|---|
| MonetResult queries | No | Query processed results from the EMSL database |
| FTMS data & parameters | Yes (JWT) | Register files, manage parameter sets |
| FTMS workflows | Yes (JWT) | Submit QC and DI molecular formula jobs |
| GCMS data & parameters | Yes (JWT) | Register GC-MS files, manage parameter sets |
| GCMS workflows | Yes (JWT) | Submit low-resolution GC-MS peak-picking jobs |
| MinIO storage | Yes (MinIO creds) | Generate presigned upload/download URLs |
Local processing:
- "Run the import step with the file I have selected."
- "Search for molecular formulas using a ±2 ppm window and CHO only."
- "Run the full workflow — import, calibrate, search, and export."
Data analysis:
- "I have 3500 peaks and 60% assigned — what should I check?"
- "What is the compound class distribution for the current spectrum?"
- "Plot DBE vs C# for the assigned peaks."
Server / database (requires MCP server):
- "What proposals are available in the database?"
- "List all FTMS results for proposal P12345."
- "Submit a DI workflow for data IDs 42 and 43 using parameter set 7."
corems_gui/
├── __init__.py ← main() entry point and public API
├── __main__.py ← enables python -m corems_gui
├── _constants.py ← shared enumerations (method lists, column names…)
├── _helpers.py ← Qt widget factory functions (int_spin, combo…)
├── app.py ← FTICRMainWindow (QMainWindow)
├── canvas.py ← SpectrumCanvas (Matplotlib + Qt toolbar)
├── models.py ← PeaksModel / SortFilterPeaksModel
├── workers.py ← ImportWorker, CalibrationWorker, SearchWorker, ExportWorker, ChatWorker
└── panels/
├── __init__.py
├── import_panel.py ← Step 1 form widget
├── calibration_panel.py ← Step 2 form widget
├── search_panel.py ← Step 3 form widget (atom ranges table)
├── export_panel.py ← Step 4 form widget + spectrum summary
└── chat_panel.py ← Agent chatbot (Chat / Settings / Auth Token tabs)
Each *Panel is a self-contained QWidget that emits a run_requested(dict) signal — it has no direct dependency on CoreMS. All CoreMS calls are isolated inside workers.py. ChatWorker runs the agentic tool-use loop (LLM + MCP + GUI actions) in a QThread.
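The worker pattern described above can be sketched without Qt. In this standard-library stand-in, a `threading.Thread` plays the role of the `QThread`, a `queue.Queue` plays the role of Qt signals, and `fake_tool_call` is a placeholder for an awaitable LLM or MCP request. None of these names come from `workers.py`; the sketch only illustrates how an asyncio loop can live in a background thread while the main thread stays free.

```python
import asyncio
import queue
import threading


async def fake_tool_call(name: str) -> str:
    """Stand-in for an awaitable MCP or LLM request."""
    await asyncio.sleep(0.01)
    return f"{name}: done"


async def agent_loop(tool_names, emit):
    """Run tool calls one after another, reporting each result as it arrives."""
    for name in tool_names:
        emit(("tool_result", await fake_tool_call(name)))
    emit(("finished", None))


def start_worker(tool_names, outbox: queue.Queue) -> threading.Thread:
    """Start a thread that owns its own asyncio event loop."""
    thread = threading.Thread(
        target=lambda: asyncio.run(agent_loop(tool_names, outbox.put)),
        daemon=True,
    )
    thread.start()
    return thread


# The "GUI" side: the main thread never awaits, it only receives messages.
outbox: queue.Queue = queue.Queue()
worker = start_worker(["gui_get_state", "gui_run_import"], outbox)

messages = []
while True:
    kind, payload = outbox.get(timeout=5)
    if kind == "finished":
        break
    messages.append(payload)
worker.join(timeout=5)
```

The blocking `outbox.get` is only there so the script can run end to end. In a Qt application the equivalent messages would arrive as signal emissions handled by slots on the GUI event loop, so nothing on the main thread waits.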
| Input | Extension | Notes |
|---|---|---|
| Bruker Solarix transient | `.d` directory | Reads `fid`/`ser` + `apexAcquisition.method` |
| Thermo Fisher RAW | `.raw` | Reads via CoreMS Thermo reader |
| Reference mass list | `.ref` | Two columns: Formula and m/z |
See requirements.txt for runtime dependencies. Key dependencies:
| Package | Version | Purpose |
|---|---|---|
| `PySide6` | ≥ 6.5.0 | Qt 6 bindings for the GUI |
| `matplotlib` | ≥ 3.7.0 | Embedded spectrum plots |
| `pandas` | ≥ 1.5.0 | Results table and Excel export |
| `openpyxl` | ≥ 3.1.0 | .xlsx file writing |
| `openai` | ≥ 1.30.0 | OpenAI-compatible LLM client for the agent chatbot |
| `mcp[cli]` | ≥ 1.8 | MCP client SDK (Streamable-HTTP transport) |
| `httpx` | ≥ 0.27 | Async HTTP (transitive dep of mcp[cli]) |
| `PyJWT` | ≥ 2.8 | Local HS256 JWT generation for CoreMS API auth |

Development extras:

```bash
pip install bump-my-version
```

First public release of CoreMS Orchestrator.
- Four-step processing pipeline: Import → Calibrate → Search → Export
- Bruker SolarIX (.d) and Thermo Scientific (.raw) file support
- Configurable noise thresholding (log, minima, S/N, relative, absolute)
- Reference-mass m/z domain calibration with polynomial regression (1st–3rd order)
- Bayesian-scored molecular formula assignment with full periodic table element support
- Multi-format export: CSV, HDF5, Excel (.xlsx)
- Persistent-homology peak picking for mass feature detection
- Automated MS1/MS2 spectral association (centroid and profile mode auto-detection)
- Molecular formula search on LC-MS mass features (SearchMolecularFormulasLC)
- Element-based mass feature filtering (e.g. Fe for siderophore discovery)
- FlashEntropy spectral library construction from MSP files
- Molecular networking: open search and neutral loss search types
- Entropy and cosine similarity matrices with greedy modularity clustering
- Interactive HTML network visualizations and edge list/matrix export
- Conversational LLM agent with real-time access to live spectral state
- Six local GUI tools: `gui_get_state`, `gui_run_import`, `gui_run_calibrate`, `gui_run_search`, `gui_run_export`, `gui_run_lcms`
- Remote MCP server integration for database queries, workflow submission, and object storage
- Compatible with any OpenAI-compatible LLM endpoint (GPT, Claude, Grok, o-series)
- Automatic retry with exponential backoff on transient API failures
- Structured concurrency bridge (Qt signals ↔ asyncio ↔ anyio/MCP)
- Built-in JWT token generator for authenticated MCP server operations
- PySide6 (Qt 6) responsive UI with non-blocking QThread workers
- Live Matplotlib spectrum visualization with multiple plot types
- Sortable/filterable results table with column-click sorting
- Persistent per-panel settings via QSettings
- Dark-themed high-contrast agent chat interface
This material is free to use, and attribution is always appreciated. Attribution may read as follows:
Authored by Yuri E. Corilo at the Pacific Northwest National Laboratory, operated by Battelle for the U.S. Department of Energy.
Please cite the following in your work: Yuri Corilo. (2026). EMSL-Computing/CoreMS-Orchestrator: CoreMS Orchestrator version 0.1.0 (0.1.0). Zenodo. https://doi.org/10.5281/zenodo.20821747