A comprehensive quality control (QC) tool for microbial sequencing data that provides both command-line processing and interactive web-based visualization capabilities.
uQCme consists of two main components:
- CLI Tool (uqcme): A command-line interface that processes microbial sequencing QC data against configurable quality control rules. It determines QC outcomes (PASS/FAIL/WARNING) based on species-specific criteria.
- Web Dashboard (uqcme-dashboard): An interactive Streamlit application for visualizing and exploring the QC results generated by the CLI tool.
- Species-specific QC rules: Support for numerous microbial species with tailored quality control criteria defined in QC_rules.tsv.
- Configurable QC tests: Define custom QC outcomes with priority-based rule conditions.
- Flexible rule engine: Regex-based validation with threshold checks for various QC metrics.
- Robust validation: Data validation using Pandera schemas and Pydantic configuration management.
- Interactive dashboard: Web-based visualization with filtering, sorting, and detailed sample exploration.
- Comprehensive logging: Detailed logging system with both file and console output.
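As a rough illustration of the threshold and regex checks listed above, the following Python sketch evaluates an expression such as >=50000 against a metric value. The function names passes_threshold and matches_pattern and the set of supported operators are assumptions made for this example; they are not taken from the uQCme source.

```python
import operator
import re

# Comparison operators a threshold expression may start with (assumed set).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# An operator followed by a plain decimal number, e.g. ">=50000" or "<= 5.0".
_THRESHOLD_RE = re.compile(r"^\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def passes_threshold(value: float, expression: str) -> bool:
    """Return True if `value` satisfies an expression such as '>=50000'."""
    match = _THRESHOLD_RE.match(expression)
    if match is None:
        raise ValueError(f"Unsupported threshold expression: {expression!r}")
    op_symbol, limit = match.groups()
    return _OPS[op_symbol](value, float(limit))


def matches_pattern(value: str, pattern: str) -> bool:
    """Return True if the whole string `value` matches the regex `pattern`."""
    return re.fullmatch(pattern, value) is not None
```

For example, passes_threshold(62000, ">=50000") holds, while passes_threshold(85.5, ">=90") does not.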
The tool handles specific QC rules for the following species (as defined in the default QC_rules.tsv):
- Acinetobacter baumannii
- Campylobacter coli
- Campylobacter jejuni
- Enterobacter spp.
- Enterococcus faecium
- Escherichia coli
- Haemophilus influenzae
- Helicobacter pylori
- Klebsiella pneumoniae
- Mycoplasma genitalium
- Neisseria gonorrhoeae
- Pseudomonas aeruginosa
- Salmonella enterica
- Shigella flexneri
- Shigella sonnei
- Staphylococcus aureus
- Streptococcus pneumoniae
Note: "all" rules apply to any species not explicitly listed or as general baseline checks.
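One way to read the note above is that a sample is checked against the rules for its own species plus the generic "all" rules. The sketch below shows that selection in Python. The function name rules_for_species and the dictionary layout are hypothetical, and whether uQCme combines the two sets exactly this way is an assumption.

```python
# Example rules, using the column names of QC_rules.tsv (values are illustrative).
EXAMPLE_RULES = [
    {"rule_id": "CUSTOM1", "species": "Escherichia coli", "threshold": ">=50000", "column_name": "n50"},
    {"rule_id": "CUSTOM2", "species": "all", "threshold": ">=90", "column_name": "completeness"},
]


def rules_for_species(rules: list[dict], species: str) -> list[dict]:
    """Select the species-specific rules plus the generic 'all' rules (assumed behaviour)."""
    return [rule for rule in rules if rule["species"] in (species, "all")]
```

With these example rules, an Escherichia coli sample would be checked against both CUSTOM1 and CUSTOM2, while a species without its own entries would only be checked against CUSTOM2.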
- Assembly statistics (N50, contigs, genome size)
- CheckM completeness and contamination
- Species identification validation
- Coverage depth analysis
- Quality score assessments
Core only (shared logic):
pip install uqcme
CLI only:
pip install uqcme[cli]
Dashboard/App only:
pip install uqcme[app]
Full installation (CLI + Web Dashboard):
pip install uqcme[all]
From source:
git clone https://github.com/ssi-dk/uQCme.git
cd uQCme
# Core only
pip install .
# Full installation
pip install ".[all]"
The CLI tool reads your sequencing run data and applies the QC rules defined in your configuration.
Basic Usage (using defaults):
uqcme
This will use the bundled default configuration and look for input files in the current directory, as specified in the default config.
Override Data Source: You can process a specific data file or API endpoint without creating a full config file:
# Process a local file
uqcme --file path/to/my_run_data.tsv
# Process data from an API
uqcme --api-call "https://api.example.com/runs/123"
Custom Configuration: For full control over rules, tests, and mappings, provide a custom configuration file:
uqcme --config my_config.yaml
What it does:
- Loads run data (from file, API, or defaults).
- Loads QC rules (QC_rules.tsv) and QC tests (QC_tests.tsv).
  - Note: If local rule files are missing, it falls back to bundled defaults.
- Evaluates each sample against the rules for its species.
- Determines the final QC outcome (e.g., PASS, FAIL).
- Outputs a new TSV file containing the original data plus the QC results (e.g., qc_results.tsv).
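The step that turns failed rules into a final outcome can be pictured with the Python sketch below. It is only an illustration under stated assumptions: a test matches when every rule in its failed_rule_conditions has failed, a lower priority number wins, and samples matching no test default to PASS. The function name pick_outcome and the WARN_N50 outcome are hypothetical, and the real engine may resolve conditions differently.

```python
def pick_outcome(failed_rules: set[str], tests: list[dict], default: str = "PASS") -> str:
    """Pick the outcome of the best-priority test whose failed-rule conditions are all met.

    Assumptions (not confirmed by the uQCme source):
    - a test matches only if ALL of its failed_rule_conditions are in `failed_rules`;
    - a lower `priority` number takes precedence;
    - if nothing matches, the sample gets `default`.
    """
    candidates = [
        test
        for test in tests
        if test["failed_rule_conditions"]
        and set(test["failed_rule_conditions"]) <= failed_rules
    ]
    if not candidates:
        return default
    return min(candidates, key=lambda test: test["priority"])["outcome_id"]


# Illustrative tests; WARN_N50 is a made-up outcome for this example.
EXAMPLE_TESTS = [
    {"outcome_id": "FAIL_CUSTOM", "priority": 3, "failed_rule_conditions": ["CUSTOM1", "CUSTOM2"]},
    {"outcome_id": "WARN_N50", "priority": 5, "failed_rule_conditions": ["CUSTOM1"]},
]
```

Under these assumptions, a sample that failed both CUSTOM1 and CUSTOM2 matches both tests and receives FAIL_CUSTOM because of its lower priority number.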
The dashboard visualizes the results generated by the CLI tool.
Command:
uqcme-dashboard --config config.yaml
Export dashboard report view as PDF (non-interactive):
uqcme-dashboard --config config.yaml --export-pdf output/report.pdf
This uses report-mode rendering from app.dashboard.report_mode in the config and exports a table-only report (filtered data dataframe).
For PDF export support, install Playwright and Chromium once:
pip install playwright
playwright install chromium
What it does:
- Launches a local web server (Streamlit).
- Loads the processed data (from file or API) as specified in config.yaml.
- Provides an interactive interface to:
  - View summary statistics.
  - Filter samples by QC outcome, species, or specific metrics.
  - Inspect individual sample details and failed rules.
  - Visualize metric distributions.
The tool is driven by a config.yaml file. This file defines:
- Input paths: Locations of your data, mapping file, and QC rules/tests.
- Output paths: Where to save the results and logs.
- App settings: Dashboard configuration (server port, UI preferences).
Key Input Files:
- run_data.tsv: Your raw sequencing metrics.
- mapping.yaml: Maps your data columns to the tool's internal field names.
- QC_rules.tsv: Defines the specific thresholds and checks for each species (default values originally from https://www.pathogensurveillance.net/resources/quality/ and https://happykhan.github.io/qualibact/).
- QC_tests.tsv: Defines how rule failures translate into overall QC outcomes.
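To make the role of the mapping file concrete, the Python sketch below renames the columns of one data row to internal field names. The function name rename_columns, the example column names, and the flat source-to-internal dictionary are assumptions for illustration; the actual layout of mapping.yaml is defined by the templates in the repository and may differ.

```python
def rename_columns(row: dict, column_map: dict) -> dict:
    """Rename the keys of `row` using `column_map`; unmapped columns keep their name."""
    return {column_map.get(name, name): value for name, value in row.items()}


# Hypothetical mapping from the column headers in a run data file
# to internal field names such as those referenced by QC_rules.tsv.
EXAMPLE_COLUMN_MAP = {"Sample": "sample_name", "N50 (bp)": "n50"}
```

A row such as {"Sample": "S1", "N50 (bp)": "62000"} would then be seen by the rules as {"sample_name": "S1", "n50": "62000"}.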
Config-driven sample API action buttons
You can configure one or more action buttons in the dashboard that send values from selected samples to an API endpoint:
app:
  dashboard:
    sample_api_actions:
      - label: "Notify LIMS"
        api_call: "https://example.org/api/notify"
        value_field: "sample_name"
        method: "POST"
        payload_field: "sample_ids"
        send_as_list: true
        include_sample_ids: false
When users select samples in the Data Preview table and click the configured button, uQCme sends the selected value_field values to api_call.
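The Python sketch below shows one plausible way such an action could be turned into an HTTP request, using only the standard library. It is not the dashboard's actual code: the function name build_action_request is hypothetical, the JSON body shape is an assumption, and the comma-joined fallback when send_as_list is false is a guess. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_action_request(action: dict, selected_rows: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a request carrying values from the selected samples."""
    # Collect the configured field from every selected row.
    values = [row[action["value_field"]] for row in selected_rows]

    # Assumed behaviour: send a JSON list, or a comma-joined string otherwise.
    if action.get("send_as_list", True):
        body_value = values
    else:
        body_value = ",".join(str(value) for value in values)

    payload = {action["payload_field"]: body_value}
    return urllib.request.Request(
        url=action["api_call"],
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=action.get("method", "POST"),
    )


EXAMPLE_ACTION = {
    "label": "Notify LIMS",
    "api_call": "https://example.org/api/notify",
    "value_field": "sample_name",
    "method": "POST",
    "payload_field": "sample_ids",
    "send_as_list": True,
}
```

Passing the returned object to urllib.request.urlopen would perform the call; keeping construction separate makes the payload easy to inspect before anything is sent.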
Config-driven report mode defaults
Use the same dashboard config to control deterministic report rendering:
app:
  dashboard:
    report_mode:
      enabled: false
      default_visible_sections:
        Basic: true
        QC_metrics: true
        Experimental: false
      default_filters:
        species: "Escherichia coli"
        completeness:
          min: 90
See the input/example/ directory for template files.
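As an illustration of what the default_filters block in the report-mode configuration expresses, the Python sketch below keeps only rows that match an exact value (species) and a numeric lower bound (completeness). The function name apply_default_filters is hypothetical, and support for a max key is an assumption added for symmetry; only min appears in the configuration example.

```python
def apply_default_filters(rows: list[dict], filters: dict) -> list[dict]:
    """Keep rows that satisfy every filter.

    A plain value means exact match; a dict with 'min' and/or 'max'
    means a numeric range ('max' is an assumption, see the lead-in).
    """

    def keep(row: dict) -> bool:
        for field, condition in filters.items():
            value = row[field]
            if isinstance(condition, dict):
                number = float(value)
                if "min" in condition and number < condition["min"]:
                    return False
                if "max" in condition and number > condition["max"]:
                    return False
            elif value != condition:
                return False
        return True

    return [row for row in rows if keep(row)]


EXAMPLE_FILTERS = {"species": "Escherichia coli", "completeness": {"min": 90}}
```

Applied to a table with one E. coli sample at 98.5% completeness and another at 85%, only the first row would remain in the report.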
Enhanced run data with QC outcomes:
- Original sample data
- Failed rules per sample
- Assigned QC outcome with priority
- Color coding for visualization
Detailed log of rule evaluation issues:
- Skipped rules and reasons
- Data validation warnings
- Processing statistics
- Interactive data table with filtering and sorting
- Priority-based color coding of QC outcomes
- Summary statistics and sample counts
- Detailed view of individual sample QC results
- Failed rules and thresholds
- Interactive metric exploration
- Plotly-based interactive charts
- Customizable metric comparisons
- Species-specific analysis views
- Dynamic filtering by QC outcome, species, and metrics
- Search functionality across all data columns
- Export capabilities for filtered datasets
Create custom QC rules by editing QC_rules.tsv:
rule_id species qc_tool qc_metric validation_type threshold column_name
CUSTOM1 Escherichia coli Assembly N50 threshold >=50000 n50
CUSTOM2 all CheckM Completeness threshold >=90 completeness
Define new QC tests in QC_tests.tsv:
outcome_id outcome_name description priority passed_rule_conditions failed_rule_conditions action_required
FAIL_CUSTOM Fail - Custom QC Custom quality control failed 3 CUSTOM1,CUSTOM2 reject
This project uses pixi for dependency management and development workflow.
To set up the development environment:
pixi install
To run tests:
pixi run pytest
uQCme/
├── src/
│   └── uQCme/
│       ├── __init__.py
│       ├── app/
│       │   ├── main.py          # Streamlit web dashboard entry
│       │   └── plot.py          # Dashboard plotting utilities
│       ├── cli/
│       │   └── main.py          # CLI entry
│       └── core/
│           ├── engine.py        # QC processing engine
│           ├── loader.py        # Config and data loading
│           └── config.py        # Pydantic config models
├── config.yaml                  # Configuration file
├── input/
│   └── example/                 # Example input files
├── output/                      # Generated results
├── log/                         # Application logs
└── tests/                       # Unit tests
python -m pytest tests/
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
- Missing input files: Ensure all required input files exist and paths in config.yaml are correct
- Rule validation errors: Check that QC rules reference valid column names in your data
- Dashboard not loading: Verify Streamlit installation and port availability
Check the log file (./log/log.tsv) for detailed processing information:
tail -f ./log/log.tsv
If you use uQCme in your research, please cite:
uQCme: A Comprehensive Quality Control Tool for Microbial Sequencing Data
SSI-DK, 2025
This project is licensed under the MIT License - see the LICENSE file for details.
For questions, issues, or feature requests:
- Create an issue on GitHub
- Contact: Kim Ng (kimn@ssi.dk)
See CHANGELOG.md for version history and updates.
