uQCme - Microbial Quality Control Tool

A comprehensive quality control (QC) tool for microbial sequencing data that provides both command-line processing and interactive web-based visualization capabilities.

Overview

uQCme consists of two main components:

CLI Tool (uqcme): A command-line interface that processes microbial sequencing QC data against configurable quality control rules. It determines QC outcomes (PASS/FAIL/WARNING) based on species-specific criteria.
Web Dashboard (uqcme-dashboard): An interactive Streamlit application for visualizing and exploring the QC results generated by the CLI tool.

Features

Core Functionality

Species-specific QC rules: Support for numerous microbial species with tailored quality control criteria defined in QC_rules.tsv.
Configurable QC tests: Define custom QC outcomes with priority-based rule conditions.
Flexible rule engine: Regex-based validation with threshold checks for various QC metrics.
Robust Validation: Data validation using Pandera schemas and Pydantic configuration management.
Interactive dashboard: Web-based visualization with filtering, sorting, and detailed sample exploration.
Comprehensive logging: Detailed logging system with both file and console output.

Supported Species

The tool handles specific QC rules for the following species (as defined in the default QC_rules.tsv):

Acinetobacter baumannii
Campylobacter coli
Campylobacter jejuni
Enterobacter spp.
Enterococcus faecium
Escherichia coli
Haemophilus influenzae
Helicobacter pylori
Klebsiella pneumoniae
Mycoplasma genitalium
Neisseria gonorrhoeae
Pseudomonas aeruginosa
Salmonella enterica
Shigella flexneri
Shigella sonnei
Staphylococcus aureus
Streptococcus pneumoniae

Note: Rules whose species is "all" apply to any species not explicitly listed, or serve as general baseline checks.
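As an illustration of how species-specific and "all" rules might combine, here is a minimal sketch. It is not uQCme's actual implementation: the in-memory `RULES` list, the rule IDs, and the `rules_for_species` helper are hypothetical, and it assumes that "all" rules are applied in addition to any species-specific ones.

```python
from typing import Dict, List

# Hypothetical in-memory rules; the real ones are defined in QC_rules.tsv.
RULES: List[Dict[str, str]] = [
    {"rule_id": "R1", "species": "all", "column_name": "completeness", "threshold": ">=90"},
    {"rule_id": "R2", "species": "Escherichia coli", "column_name": "n50", "threshold": ">=50000"},
    {"rule_id": "R3", "species": "Salmonella enterica", "column_name": "n50", "threshold": ">=40000"},
]


def rules_for_species(species: str, rules: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return rules that name the species explicitly, plus the "all" rules (assumed behavior)."""
    return [rule for rule in rules if rule["species"] in (species, "all")]


ecoli_ids = [rule["rule_id"] for rule in rules_for_species("Escherichia coli", RULES)]
unlisted_ids = [rule["rule_id"] for rule in rules_for_species("Vibrio cholerae", RULES)]
print(ecoli_ids)     # ['R1', 'R2']
print(unlisted_ids)  # ['R1']
```

Under this assumption, a species without its own rules is still checked against the baseline "all" rules.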

QC Metrics

Assembly statistics (N50, contigs, genome size)
CheckM completeness and contamination
Species identification validation
Coverage depth analysis
Quality score assessments

Installation

From PyPI

Core only (shared logic):

pip install uqcme

CLI only:

pip install "uqcme[cli]"

Dashboard/App only:

pip install "uqcme[app]"

Full installation (CLI + Web Dashboard):

pip install "uqcme[all]"

From Source

git clone https://github.com/ssi-dk/uQCme.git
cd uQCme

# Core only
pip install .

# Full installation
pip install ".[all]"

Usage

1. CLI Tool (`uqcme`)

The CLI tool reads your sequencing run data and applies the QC rules defined in your configuration.

Basic Usage (using defaults):

uqcme

This uses the bundled default configuration and looks for input files in the current directory, at the paths specified in that configuration.

Override Data Source: You can process a specific data file or API endpoint without creating a full config file:

# Process a local file
uqcme --file path/to/my_run_data.tsv

# Process data from an API
uqcme --api-call "https://api.example.com/runs/123"

Custom Configuration: For full control over rules, tests, and mappings, provide a custom configuration file:

uqcme --config my_config.yaml

What it does:

Loads run data (from file, API, or defaults).
Loads QC rules (QC_rules.tsv) and QC tests (QC_tests.tsv).
- Note: If local rule files are missing, it falls back to bundled defaults.
Evaluates each sample against the rules for its species.
Determines the final QC outcome (e.g., PASS, FAIL).
Outputs a new TSV file containing the original data plus the QC results (e.g., qc_results.tsv).
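The steps above can be pictured with a small, self-contained sketch. It only mirrors the overall flow (read TSV rows, evaluate each sample, append the outcome, write TSV) and is not uQCme's code: the `qc_outcome` column name, the sample rows, and the single hard-coded completeness check in `evaluate` are assumptions made for illustration.

```python
import csv
import io

# Hypothetical run data; real input comes from a file, an API, or the defaults.
RUN_DATA = (
    "sample_name\tspecies\tcompleteness\n"
    "S1\tEscherichia coli\t98.5\n"
    "S2\tEscherichia coli\t71.0\n"
)


def evaluate(row):
    """Toy stand-in for the rule engine: one hard-coded completeness check."""
    return "PASS" if float(row["completeness"]) >= 90 else "FAIL"


def process(tsv_text):
    """Return the input TSV with an extra outcome column appended to every row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=list(reader.fieldnames) + ["qc_outcome"],  # assumed column name
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        row["qc_outcome"] = evaluate(row)
        writer.writerow(row)
    return out.getvalue()


result = process(RUN_DATA)
print(result)
```

The original columns are preserved and the outcome is added as a new column, which matches the description of the output file as the original data plus the QC results.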

2. Web Dashboard (`uqcme-dashboard`)

The dashboard visualizes the results generated by the CLI tool.

Command:

uqcme-dashboard --config config.yaml

Export dashboard report view as PDF (non-interactive):

uqcme-dashboard --config config.yaml --export-pdf output/report.pdf

This renders the dashboard in report mode, using the app.dashboard.report_mode settings in the config, and exports a table-only report containing the filtered data table. PDF export requires Playwright and Chromium, which need to be installed once:

pip install playwright
playwright install chromium

What it does:

Launches a local web server (Streamlit).
Loads the processed data (from file or API) as specified in config.yaml.
Provides an interactive interface to:
- View summary statistics.
- Filter samples by QC outcome, species, or specific metrics.
- Inspect individual sample details and failed rules.
- Visualize metric distributions.

Configuration

The tool is driven by a config.yaml file. This file defines:

Input paths: Locations of your data, mapping file, and QC rules/tests.
Output paths: Where to save the results and logs.
App settings: Dashboard configuration (server port, UI preferences).

Key Input Files:

run_data.tsv: Your raw sequencing metrics.
mapping.yaml: Maps your data columns to the tool's internal field names.
QC_rules.tsv: Defines the specific thresholds and checks for each species. The default values were originally taken from https://www.pathogensurveillance.net/resources/quality/ and https://happykhan.github.io/qualibact/.
QC_tests.tsv: Defines how rule failures translate into overall QC outcomes.
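To make the role of the mapping file concrete, the sketch below renames the columns of one data row to internal field names. The structure of mapping.yaml is not shown in this README, so the flat `COLUMN_MAP` dictionary, the source column names, and the `rename_columns` helper are all hypothetical.

```python
from typing import Dict

# Hypothetical mapping: source column name -> internal field name.
COLUMN_MAP: Dict[str, str] = {
    "Sample": "sample_name",
    "N50 (bp)": "n50",
    "CheckM_completeness": "completeness",
}


def rename_columns(row: Dict[str, str], column_map: Dict[str, str]) -> Dict[str, str]:
    """Rename mapped keys; pass unmapped columns through unchanged."""
    return {column_map.get(key, key): value for key, value in row.items()}


raw = {"Sample": "S1", "N50 (bp)": "61234", "CheckM_completeness": "98.5", "run": "R42"}
mapped = rename_columns(raw, COLUMN_MAP)
print(mapped)
```

With a mapping step like this, rules can refer to stable names such as n50 or completeness regardless of how the upstream pipeline labels its columns.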

Config-driven sample API action buttons

You can configure one or more action buttons in the dashboard that send values from selected samples to an API endpoint:

app:
  dashboard:
    sample_api_actions:
      - label: "Notify LIMS"
        api_call: "https://example.org/api/notify"
        value_field: "sample_name"
        method: "POST"
        payload_field: "sample_ids"
        send_as_list: true
        include_sample_ids: false

When users select samples in the Data Preview table and click the configured button, uQCme sends the selected value_field values to api_call.
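The following sketch shows one way such a request could be assembled from the config values. It is an assumption-laden illustration rather than uQCme's behavior: the JSON body shape, the interpretation of send_as_list (a JSON array when true, a comma-joined string otherwise), and the `build_request` helper are hypothetical, and include_sample_ids is not modelled. The request is only constructed, never sent.

```python
import json
import urllib.request

# Values copied from the example config above.
action = {
    "api_call": "https://example.org/api/notify",
    "value_field": "sample_name",
    "method": "POST",
    "payload_field": "sample_ids",
    "send_as_list": True,
}


def build_request(action, selected_rows):
    """Build (but do not send) an HTTP request carrying the selected values."""
    values = [row[action["value_field"]] for row in selected_rows]
    # Assumed meaning of send_as_list: JSON array if true, comma-joined string if false.
    body = values if action["send_as_list"] else ",".join(values)
    payload = json.dumps({action["payload_field"]: body}).encode("utf-8")
    return urllib.request.Request(
        action["api_call"],
        data=payload,
        method=action["method"],
        headers={"Content-Type": "application/json"},
    )


request = build_request(action, [{"sample_name": "S1"}, {"sample_name": "S2"}])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```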

Config-driven report mode defaults

Use the same dashboard config to control deterministic report rendering:

app:
  dashboard:
    report_mode:
      enabled: false
      default_visible_sections:
        Basic: true
        QC_metrics: true
        Experimental: false
      default_filters:
        species: "Escherichia coli"
        completeness:
          min: 90

See the input/example/ directory for template files.

Output Files

1. QC Results (`qc_results.tsv`)

Enhanced run data with QC outcomes:

Original sample data
Failed rules per sample
Assigned QC outcome with priority
Color coding for visualization

2. Rule Warnings (`qc_warnings.tsv`)

Detailed log of rule evaluation issues:

Skipped rules and reasons
Data validation warnings
Processing statistics

Dashboard Features

Data Overview

Interactive data table with filtering and sorting
Priority-based color coding of QC outcomes
Summary statistics and sample counts

Sample Details

Detailed view of individual sample QC results
Failed rules and thresholds
Interactive metric exploration

Visualization

Plotly-based interactive charts
Customizable metric comparisons
Species-specific analysis views

Filtering and Search

Dynamic filtering by QC outcome, species, and metrics
Search functionality across all data columns
Export capabilities for filtered datasets

Advanced Usage

Custom QC Rules

Create custom QC rules by editing QC_rules.tsv:

rule_id	species	qc_tool	qc_metric	validation_type	threshold	column_name
CUSTOM1	Escherichia coli	Assembly	N50	threshold	>=50000	n50
CUSTOM2	all	CheckM	Completeness	threshold	>=90	completeness
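A threshold string such as >=50000 has to be split into an operator and a number before it can be compared with a metric value. The sketch below shows one way to do that with a regular expression. Only the >= form appears in this README; the other operators, the `THRESHOLD_RE` pattern, and the `passes` helper are assumptions for illustration and may differ from the tool's rule engine.

```python
import operator
import re

# Only ">=" is attested in the examples; the remaining operators are assumed.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# Two-character operators are listed first so ">=" is not read as ">".
THRESHOLD_RE = re.compile(r"^\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def passes(threshold: str, value: float) -> bool:
    """Return True when value satisfies a threshold string such as '>=50000'."""
    match = THRESHOLD_RE.match(threshold)
    if match is None:
        raise ValueError(f"Unsupported threshold: {threshold!r}")
    symbol, limit = match.groups()
    return OPERATORS[symbol](value, float(limit))


print(passes(">=50000", 61234))  # True
print(passes(">=90", 71.0))      # False
```

Raising on an unparseable threshold, instead of silently passing the sample, keeps malformed rules visible.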

Species-Specific Tests

Define new QC tests in QC_tests.tsv:

outcome_id	outcome_name	description	priority	passed_rule_conditions	failed_rule_conditions	action_required
FAIL_CUSTOM	Fail - Custom QC	Custom quality control failed	3		CUSTOM1,CUSTOM2	reject
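How rule failures turn into a single outcome can be sketched as follows. The semantics here are assumed, not confirmed by this README: a test is treated as applicable when every rule in its failed_rule_conditions has failed, and the applicable test with the lowest priority number wins. The `DEFAULT` row, `parse_ids`, and `pick_outcome` are hypothetical, and passed_rule_conditions is not modelled.

```python
def parse_ids(cell):
    """Split a comma-separated cell such as 'CUSTOM1,CUSTOM2' into a set of rule IDs."""
    return {item.strip() for item in cell.split(",") if item.strip()}


TESTS = [
    # Hypothetical catch-all row with no conditions, so it always applies.
    {"outcome_id": "DEFAULT", "priority": 10, "failed_rule_conditions": ""},
    # Row from the example above.
    {"outcome_id": "FAIL_CUSTOM", "priority": 3, "failed_rule_conditions": "CUSTOM1,CUSTOM2"},
]


def pick_outcome(failed_rules, tests):
    """Pick the applicable test with the lowest priority number (assumed ordering)."""
    applicable = [
        test for test in tests
        if parse_ids(test["failed_rule_conditions"]) <= failed_rules  # subset check
    ]
    return min(applicable, key=lambda test: test["priority"])["outcome_id"]


print(pick_outcome({"CUSTOM1", "CUSTOM2"}, TESTS))  # FAIL_CUSTOM
print(pick_outcome(set(), TESTS))                   # DEFAULT
```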

Development

This project uses pixi for dependency management and development workflow.

To set up the development environment:

pixi install

To run tests:

pixi run pytest

Project Structure

uQCme/
├── src/
│   └── uQCme/
│       ├── __init__.py
│       ├── app/
│       │   ├── main.py     # Streamlit web dashboard entry
│       │   └── plot.py     # Dashboard plotting utilities
│       ├── cli/
│       │   └── main.py     # CLI entry
│       └── core/
│           ├── engine.py   # QC processing engine
│           ├── loader.py   # Config and data loading
│           └── config.py   # Pydantic config models
├── config.yaml             # Configuration file
├── input/
│   └── example/            # Example input files
├── output/                 # Generated results
├── log/                    # Application logs
└── tests/                  # Unit tests

Running Tests

python -m pytest tests/

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Submit a pull request

Troubleshooting

Common Issues

Missing input files: Ensure all required input files exist and that the paths in config.yaml are correct.
Rule validation errors: Check that the QC rules reference column names that exist in your data.
Dashboard not loading: Verify that Streamlit is installed and that the configured port is available.

Logging

Check the log file (./log/log.tsv) for detailed processing information:

tail -f ./log/log.tsv

Citation

If you use uQCme in your research, please cite:

uQCme: A Comprehensive Quality Control Tool for Microbial Sequencing Data
SSI-DK, 2025

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For questions, issues, or feature requests:

Create an issue on GitHub
Contact: Kim Ng (kimn@ssi.dk)

Changelog

See CHANGELOG.md for version history and updates.
