Releases: alubbock/microbench

v2.0.0

17 Mar 22:02
470695b

Microbench v2 is a significant upgrade with many new features versus v1.1.0.
Be sure to review the breaking changes before upgrading.

New features

  • Command-line interface (microbench -- COMMAND): wrap any external
    command and record host metadata alongside timing without writing Python
    code. Useful for SLURM jobs, shell scripts, and compiled executables.

    • Records command and returncode (a list, one entry per timed
      iteration) alongside the standard timing fields.
    • Mixins are specified by short kebab-case names without the MB prefix
      (e.g. host-info); the original MB-prefixed names are also accepted.
    • Use --mixin MIXIN [MIXIN ...] to select metadata to capture (defaults
      to host-info, slurm-info, loaded-modules, python-info and working-dir).
    • Use --show-mixins to list all available mixins with descriptions.
    • Use --field KEY=VALUE to attach extra labels.
    • Use --iterations N and --warmup N for repeat timing.
    • Use --stdout[=suppress] and --stderr[=suppress] to capture subprocess
      output into the record (output is re-printed to the terminal unless
      =suppress is given).
    • Use --monitor-interval SECONDS to sample child process CPU and memory
      over time.
    • Some mixins expose their own configuration flags (shown in
      --show-mixins and --help).
    • Capture failures are non-fatal by default (capture_optional = True),
      making the CLI safe across heterogeneous cluster nodes.
    • The process exits with the first non-zero returncode seen across all
      timed iterations, or zero (success) if there is none.
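
The exit-status rule for the CLI can be sketched in a few lines of plain Python. This is an illustration of the rule as described, not microbench's own code; the function name cli_exit_code is hypothetical.

```python
def cli_exit_code(returncodes):
    """Return the first non-zero return code, or 0 if every iteration succeeded."""
    # `returncodes` stands in for the per-iteration returncode list in the record.
    return next((rc for rc in returncodes if rc != 0), 0)


print(cli_exit_code([0, 3, 1]))  # 3
print(cli_exit_code([0, 0, 0]))  # 0
```
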
  • summary(results) / bench.summary(): prints min / mean / median /
    max / stdev of call.durations across all results. No dependencies required
    beyond the Python standard library. bench.summary() is a one-liner
    convenience that calls bench.get_results() internally. The module-level
    summary(results) accepts any list of dicts and can be composed with other
    results-processing steps.

    from microbench import MicroBench, summary
    
    bench = MicroBench()
    
    @bench
    def my_function():
        ...
    
    for _ in range(10):
        my_function()
    
    bench.summary()
    # n=10  min=0.000031  mean=0.000038  median=0.000036  max=0.000059  stdev=0.000008
    
    # or with explicit results list:
    summary(bench.get_results())
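
For readers who want the same statistics without calling summary(), a rough stdlib-only equivalent might look like the sketch below. It assumes each record is a nested dict of the form {"call": {"durations": [...]}} and that the sample standard deviation is wanted; both are assumptions, and summarise_durations is a hypothetical name, not part of microbench.

```python
import statistics


def summarise_durations(results):
    """Pool every duration from a list of record dicts and summarise them."""
    # Assumed record layout: {"call": {"durations": [float, ...]}, ...}
    durations = [d for rec in results for d in rec["call"]["durations"]]
    return {
        "n": len(durations),
        "min": min(durations),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "max": max(durations),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
    }


records = [
    {"call": {"durations": [1.0, 2.0]}},
    {"call": {"durations": [3.0, 6.0]}},
]
stats = summarise_durations(records)
```
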
  • bench.time(name) sub-timing: [Python API] label phases inside a single benchmark
    record with named timing sections. Sub-timings accumulate in call.timings as
    [{"name": ..., "duration": ...}, ...] in call order. Compatible with
    bench.record(), bench.arecord(), @bench (sync and async), and
    bench.record_on_exit(). Calling outside an active benchmark is a silent
    no-op; call.timings is absent when bench.time() is never called.

    with bench.record('pipeline'):
        with bench.time('parse'):
            data = parse(raw)
        with bench.time('transform'):
            result = transform(data)
  • Async support: [Python API] the @bench decorator now detects async def functions
    and returns an async def wrapper that must be awaited. A new
    bench.arecord(name) method provides the async counterpart of
    bench.record() for use with async with. All mixins, static fields,
    output sinks, iterations, and warmup work identically to the sync path.
    MBLineProfiler raises NotImplementedError at decoration time when used
    with an async function (line profiling of coroutines is not supported).

    import asyncio
    
    @bench
    async def fetch():
        await asyncio.sleep(0.01)
    
    asyncio.run(fetch())
    
    # async with is only valid inside a coroutine
    async def main():
        async with bench.arecord('load'):
            await load_data()
    
    asyncio.run(main())

    Note: elapsed time includes event-loop interleaving from other concurrent
    tasks; run in an otherwise-idle event loop for repeatable results.
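
The general technique of one decorator serving both sync and async functions can be sketched with inspect.iscoroutinefunction. The timed decorator below is hypothetical and much simpler than @bench (no mixins, no output sinks); it only shows how a wrapper of the matching kind can be chosen at decoration time.

```python
import asyncio
import functools
import inspect
import time


def timed(func):
    """Record wall-clock durations in `wrapper.durations` for sync or async callables."""
    durations = []

    if inspect.iscoroutinefunction(func):
        # Async path: the wrapper is itself a coroutine function and must be awaited.
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return await func(*args, **kwargs)
            finally:
                durations.append(time.perf_counter() - start)
    else:
        # Sync path: an ordinary wrapper.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                durations.append(time.perf_counter() - start)

    wrapper.durations = durations
    return wrapper


@timed
async def fetch():
    await asyncio.sleep(0.01)
    return "done"


result = asyncio.run(fetch())
```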

  • bench.record_on_exit(name, handle_sigterm=True): [Python API] registers
    a process-exit handler that writes one benchmark record when the script
    terminates. Captures wall-clock duration from the call site to exit plus
    all mixin fields. Designed for SLURM jobs and batch scripts where
    restructuring code around a decorator is impractical. Key behaviours:

    • By default installs a SIGTERM handler (main thread only) that writes
      the record, chains to any existing SIGTERM handler, then re-delivers
      SIGTERM so the process exits with the conventional code 143 (128 + 15).
    • Wraps sys.excepthook to capture unhandled exceptions into an
      exception field before the process exits.
    • Adds an exit_signal field when the exit was triggered by SIGTERM.
    • Falls back to writing the record to sys.stderr if the primary output
      sink raises (e.g. filesystem unmounted at exit time).
    • Calling a second time on the same instance replaces the first
      registration and resets the start time.
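
The write-then-re-deliver pattern for SIGTERM can be demonstrated on POSIX systems with the standard library alone. The sketch below is not microbench's implementation: it omits chaining to a previously installed handler, and the "record" is just a line on stderr. A child process sends itself SIGTERM so the parent can inspect how it died; Python's subprocess reports death by signal as a negative return code (-15), which a shell would show as 143.

```python
import signal
import subprocess
import sys

# Script run in a child process so the parent can observe its exit status.
CHILD = r"""
import os, signal, sys, time

def on_sigterm(signum, frame):
    # Stand-in for writing the benchmark record.
    sys.stderr.write("record written\n")
    sys.stderr.flush()
    # Restore the default action and re-deliver the signal so the process
    # is reported as killed by SIGTERM (143 = 128 + 15 in a shell).
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    os.kill(os.getpid(), signal.SIGTERM)

signal.signal(signal.SIGTERM, on_sigterm)
os.kill(os.getpid(), signal.SIGTERM)  # simulate the scheduler's SIGTERM
time.sleep(5)  # safety margin; the handler normally runs before this
"""


def run_child():
    """Run the child script and return (returncode, stderr text)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD], capture_output=True, text=True
    )
    return proc.returncode, proc.stderr


returncode, stderr_text = run_child()
```
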
  • bench.record(name) context manager: [Python API] times an arbitrary code block
    and writes one record, without requiring the code to be in a named
    function. All mixins, static fields, and output sinks behave identically
    to the decorator form.

  • Exception capture: [Python API] when a benchmarked block raises — via
    bench.record() or a @bench-decorated function — the record is
    written before the exception propagates. An exception field is added
    containing {"type": ..., "message": ...}. The exception is always
    re-raised. With --iterations N, timing stops at the first exception.
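
The "write the record, then re-raise" behaviour can be sketched with a small context manager. The record helper and the list used as a sink are hypothetical; whether microbench stores the bare class name or a qualified name in "type" is not stated above, so the bare name is an assumption here.

```python
import time
from contextlib import contextmanager


@contextmanager
def record(name, sink):
    """Time a block, append one record to `sink`, and let any exception propagate."""
    rec = {"name": name}
    start = time.perf_counter()
    try:
        yield rec
    except Exception as exc:
        # Assumed shape, following the {"type": ..., "message": ...} description.
        rec["exception"] = {"type": type(exc).__name__, "message": str(exc)}
        raise  # the exception is always re-raised
    finally:
        # Runs on success and failure, so the record is written either way.
        rec["duration"] = time.perf_counter() - start
        sink.append(rec)


sink = []
try:
    with record("pipeline", sink):
        raise ValueError("bad input")
except ValueError:
    pass  # the caller still sees the original exception
```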

  • MBPythonInfo mixin replaces MBPythonVersion: records a python dict
    with version, prefix (sys.prefix), and executable (sys.executable),
    giving a complete picture of the running interpreter in one field. MBPythonVersion
    has been removed (see breaking changes above). MBPythonInfo is included
    in MicroBench by default (Python API) and in the CLI default mixin set;
    --no-mixin suppresses it on the CLI as usual.

  • MBLoadedModules mixin: captures the loaded Lmod / Environment
    Modules software stack into a loaded_modules dict mapping module name
    to version string (e.g. {"gcc": "12.2.0", "openmpi": "4.1.5"}). Reads
    the standard LOADEDMODULES environment variable — no subprocess, no
    extra dependencies. Empty dict when no modules are loaded. Included in
    the CLI defaults alongside MBHostInfo, MBSlurmInfo, and MBWorkingDir.
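
LOADEDMODULES is a colon-separated list of entries such as gcc/12.2.0, so the mapping described above can be approximated as below. How microbench treats entries without a version, or hierarchical names with several slashes, is not stated; the choices in this sketch (empty version string, split on the last slash) are assumptions, and parse_loaded_modules is a hypothetical name.

```python
import os


def parse_loaded_modules(value):
    """Map module name to version from a LOADEDMODULES-style string."""
    modules = {}
    for entry in value.split(":"):
        if not entry:
            continue  # empty variable or stray separator
        # Split on the last "/" so "tools/gcc/12.2.0" keeps "tools/gcc" as the name.
        name, sep, version = entry.rpartition("/")
        if not sep:
            name, version = entry, ""  # module loaded without a version suffix
        modules[name] = version
    return modules


# Reads the real environment; empty dict when no modules are loaded.
loaded_modules = parse_loaded_modules(os.environ.get("LOADEDMODULES", ""))

example = parse_loaded_modules("gcc/12.2.0:openmpi/4.1.5")
```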

  • MBWorkingDir mixin: captures the absolute path of the working
    directory at benchmark time into call.working_dir. No dependencies.
    Included in the CLI defaults — useful for reproducibility when comparing
    results across nodes or directories.

  • MBGitInfo mixin: captures the repository root path, current commit
    hash, branch name, and dirty flag (uncommitted changes present) via
    git ≥ 2.11 on PATH. Stored in git. Set git_repo to inspect
    a specific repository directory.

  • MBPeakMemory mixin: captures peak Python memory allocation during the
    benchmarked function as call.peak_memory_bytes (bytes), using
    tracemalloc from the standard library. No extra dependencies required.
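
Measuring a peak with tracemalloc follows a start / run / read / stop pattern, sketched below with a hypothetical helper peak_memory_bytes. Note that tracemalloc only sees allocations made through Python's memory allocators, so memory obtained directly by C extensions may not be counted.

```python
import tracemalloc


def peak_memory_bytes(func, *args, **kwargs):
    """Run `func` and return (result, peak traced allocation in bytes)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) since tracing started.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Allocating a 1 MB bytearray should push the peak to at least 1,000,000 bytes.
_, peak = peak_memory_bytes(lambda: len(bytearray(1_000_000)))
```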

  • MBSlurmInfo mixin: captures all SLURM_* environment variables into
    a slurm dict (keys lowercased, SLURM_ prefix stripped). Empty dict
    when running outside a SLURM job. Supersedes the manual
    env_vars = ('SLURM_JOB_ID', ...) pattern.
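
The key transformation described above (strip the SLURM_ prefix, lowercase the rest) amounts to a dict comprehension over the environment. The slurm_info function is a hypothetical stand-in for illustration, not the mixin itself.

```python
import os


def slurm_info(environ=None):
    """Collect SLURM_* variables with the prefix stripped and keys lowercased."""
    environ = os.environ if environ is None else environ
    prefix = "SLURM_"
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


example = slurm_info(
    {"SLURM_JOB_ID": "12345", "SLURM_NTASKS": "8", "PATH": "/usr/bin"}
)
# example == {"job_id": "12345", "ntasks": "8"}
```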

  • MBFileHash mixin: records a cryptographic checksum of specified files
    in the file_hashes field (a dict mapping path to hex digest). Defaults to
    hashing sys.argv[0] — the running script. Set hash_files to an iterable
    of paths to hash specific files instead. Set hash_algorithm to any
    algorithm accepted by hashlib.new (default: 'sha256').
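
A path-to-digest mapping of this kind can be built with hashlib.new and chunked reads, as in the sketch below. The hash_files function and its chunk_size parameter are hypothetical; the demo hashes a temporary file rather than sys.argv[0] so that it runs anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_files(paths, algorithm="sha256", chunk_size=1 << 20):
    """Return {path: hex digest}, reading each file in chunks to bound memory use."""
    digests = {}
    for path in paths:
        hasher = hashlib.new(algorithm)
        with open(path, "rb") as handle:
            # iter() with a sentinel stops when read() returns b"" at end of file.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                hasher.update(chunk)
        digests[str(path)] = hasher.hexdigest()
    return digests


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "job.py"
    script.write_bytes(b"print('hello')\n")
    file_hashes = hash_files([script])
```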

  • MBCgroupLimits mixin: captures the CPU quota and memory limit enforced by
    the Linux cgroup filesystem. Works for SLURM jobs and Kubernetes pods (cgroup
    v1 and v2). Fields in cgroups: cpu_cores_limit (float — quota ÷ period,
    or null if unlimited), memory_bytes_limit (int or null if unlimited),
    version (1 or 2). Returns {} on non-Linux systems or when the
    cgroup filesystem is unavailable.

    class Bench(MicroBench, MBSlurmInfo, MBCgroupLimits):
        pass

    {
      "cgroups": {
        "cpu_cores_limit": 4.0,
        "memory_bytes_limit": 17179869184,
        "version": 2
      }
    }
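
As a worked example of the quota ÷ period arithmetic: on cgroup v2 the CPU limit lives in a file named cpu.max containing "<quota> <period>" (or "max <period>" when unlimited), and the memory limit in memory.max as a byte count or "max". The parsers below are a sketch for v2 only, with hypothetical function names; they take the file contents as strings and do not locate the files or handle cgroup v1.

```python
def parse_cpu_max(text):
    """Parse cgroup v2 `cpu.max` content into a core count, or None if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU quota set
    return int(quota) / int(period)


def parse_memory_max(text):
    """Parse cgroup v2 `memory.max` content into bytes, or None if unlimited."""
    value = text.strip()
    return None if value == "max" else int(value)


cgroups = {
    "cpu_cores_limit": parse_cpu_max("400000 100000\n"),  # 400000 / 100000 = 4.0
    "memory_bytes_limit": parse_memory_max("17179869184\n"),  # 16 GiB
    "version": 2,
}
```
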
  • MBCondaPackages improvements:

    • Queries the environment identified by CONDA_PREFIX (the shell's active conda
      environment) rather than sys.prefix. Falls back to sys.prefix when
      CONDA_PREFIX is not set.
    • Falls back to CONDA_EXE if conda is not on PATH (common in non-interactive
      SLURM batch scripts where conda is activated but its bin/ is not on PATH).
    • Replaces the separate conda_versions field with a unified conda dict
      containing name (CONDA_DEFAULT_ENV), path (CONDA_PREFIX), and
      packages (the version dict). Either of name/path may be None if
      the corresponding variable is unset. With get_results(flat=True) these
      expand to conda.name, conda.path, conda.packages.<pkg> etc.
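
The dotted-key expansion can be pictured as a recursive flatten over nested dicts. The flatten function below is a hypothetical sketch of that idea, not the code behind get_results(flat=True); for instance, it silently drops empty dicts, which the library may handle differently.

```python
def flatten(record, parent=""):
    """Flatten nested dicts into a single dict with dot-separated keys."""
    flat = {}
    for key, value in record.items():
        dotted = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))  # recurse into sub-dicts
        else:
            flat[dotted] = value
    return flat


record = {
    "conda": {"name": "analysis", "path": None, "packages": {"numpy": "1.26.4"}}
}
flat = flatten(record)
# flat == {"conda.name": "analysis", "conda.path": None,
#          "conda.packages.numpy": "1.26.4"}
```
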
  • MicroBenchBase: the core benchmarking machinery is now exposed as
    MicroBenchBase (no default mixins). MicroBench inherits from both
    MicroBenchBase and MBPythonInfo. Subclass MicroBenchBase directly when
    you need a completely bare benchmark class with no automatic captures.

  • warmup parameter: pass warmup=N to run the function N times
    before timing begins, priming caches or JIT compilation without affecting
    results. Warmup calls are unrecorded and do not interact with the monitor
    thread or capture triggers.
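
Conceptually, warmup is an untimed loop that runs before the timed one. The run_benchmark function below is a hypothetical stdlib-only sketch of that ordering; it ignores mixins, monitoring, and output sinks.

```python
import time


def run_benchmark(func, iterations=1, warmup=0, timer=time.perf_counter):
    """Call `func` `warmup` times untimed, then `iterations` times timed."""
    for _ in range(warmup):
        func()  # primes caches / JIT; nothing is recorded
    durations = []
    for _ in range(iterations):
        start = timer()
        func()
        durations.append(timer() - start)
    return durations


calls = []
durations = run_benchmark(lambda: calls.append(1), iterations=3, warmup=2)
# The function ran 5 times in total, but only 3 durations were recorded.
```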

  • Multi-sink output architecture (#52): Results can now be written to
    multiple d...

v1.1.0

13 Mar 18:19
60f8cdf

What's new

  • mb_run_id and mb_version fields added to every record — no configuration required.
    • mb_run_id — UUID generated once at import time, shared by all MicroBench instances in the same process. Use groupby('mb_run_id') to correlate records from independent bench suites within the same run.
    • mb_version — version of the microbench package that produced the record; useful for long-running studies where benchmark code evolves.

Upgrading from v1.0

No changes required. The two new fields appear automatically in all existing benchmark suites.

v1.0.0

13 Mar 16:31
203fdc5

First stable release of microbench.

Highlights since last PyPI release

Bug fixes

  • Fix MBCondaPackages double-concatenation of version strings (#28)
  • Fix inverted set difference in MBNvidiaSmi attribute validation (#29)
  • Fix test_multi_iterations using wrong timezone parameter name (#30)
  • Fix self instead of cls in MBLineProfiler classmethod (#27)
  • Fix MBFunctionCall double-encoding args/kwargs as JSON strings (#34)
  • Fix TelemetryThread crash when called from non-main thread (#32)
  • Fix CSS syntax error in diff.py (#26)
  • Fix nvidia_gpus rejecting integer GPU indexes (#43)
  • Fix MicroBenchRedis.get_results() returning empty DataFrame (#45)
  • Fix psutil.cpu_count() returning None on macOS with psutil 7.x (#49)
  • Remove overly restrictive nvidia attribute allowlist (#48)
  • Drop internal conda.testing API, use subprocess (#44)

Features

  • Implement get_results() for MicroBenchRedis (#45)
  • Capture both logical and physical CPU core counts (#49)

Code quality

  • Migrate to pyproject.toml + setuptools-scm (#39)
  • Remove Python 2 patterns (#38)
  • Add ruff linter/formatter with pre-commit hooks (#41)
  • Add __all__ export list (#37)
  • Fix LiveStream exit condition (#36)
  • Robust stack-frame walking in MBGlobalPackages (#35)
  • Add pickle security warning for MBLineProfiler (#33)
  • Expand test coverage from 78% to 88% (#42)
  • Add explicit workflow permissions (#40)
  • Remove dead Python <3.9 fallback (#46)

Documentation

  • Update README for v1.0 (#47, #50)

Requirements

  • Python >= 3.10
  • No required dependencies (pandas recommended for results analysis)

v1.0.0-rc.1

13 Mar 11:00
203fdc5

v1.0.0-rc.1 Pre-release

v1.0.0 Release Candidate 1

First release candidate for microbench v1.0.

Highlights since last release

Bug fixes

  • Fix MBCondaPackages double-concatenation of version strings (#28)
  • Fix inverted set difference in MBNvidiaSmi attribute validation (#29)
  • Fix test_multi_iterations using wrong timezone parameter name (#30)
  • Fix self instead of cls in MBLineProfiler classmethod (#27)
  • Fix MBFunctionCall double-encoding args/kwargs as JSON strings (#34)
  • Fix TelemetryThread crash when called from non-main thread (#32)
  • Fix CSS syntax error in diff.py (#26)
  • Fix typo "fuctionality" (#25)
  • Fix nvidia_gpus rejecting integer GPU indexes (#43)
  • Fix MicroBenchRedis.get_results() returning empty DataFrame (#45)
  • Fix psutil.cpu_count() returning None on macOS with psutil 7.x (#49)
  • Remove overly restrictive nvidia attribute allowlist (#48)
  • Drop internal conda.testing API, use subprocess (#44)

Features

  • Implement get_results() for MicroBenchRedis (#45)
  • Capture both logical and physical CPU core counts (#49)

Code quality

  • Migrate to pyproject.toml + setuptools-scm (#39)
  • Remove Python 2 patterns (#38)
  • Add ruff linter/formatter with pre-commit hooks (#41)
  • Add __all__ export list (#37)
  • Fix LiveStream exit condition (#36)
  • Robust stack-frame walking in MBGlobalPackages (#35)
  • Add pickle security warning for MBLineProfiler (#33)
  • Expand test coverage from 78% to 88% (#42)
  • Add explicit workflow permissions (#40)
  • Remove dead Python <3.9 fallback (#46)

Documentation

  • Update README for v1.0 (#47)
  • Remove outdated setuptools requirement (#50)

Requirements

  • Python >= 3.10
  • No required dependencies (pandas recommended for results analysis)

v0.9.1

17 Aug 16:40
74d06ab

This release should fix installation of microbench from PyPI with Python 3.12.

What's Changed

  • chore: Configure Renovate by @renovate in #11
  • chore(deps): update actions/checkout action to v4 by @renovate in #12
  • chore(deps): update actions/setup-python action to v5 by @renovate in #13
  • chore: upgrade versioneer to support python 3.12 by @alubbock in #16

Full Changelog: v0.9...v0.9.1

v0.9

17 Aug 02:09
e51208a

What's Changed

  • feat: multiple iterations of functions by @alubbock in #7
    The wrapped function can now be evaluated multiple times, with timings given in a new run_durations field.
  • feat: customisable duration timer functions by @alubbock in #10
    The new run_durations field uses time.perf_counter by default, but the timer function is customisable.
  • feat: use importlib instead of pkg_resources by @alubbock in #6
    pkg_resources is deprecated, so importlib is used instead.
  • fix: calls to deprecated conda api by @alubbock in #5
  • fix: line_profiler hook by @alubbock in #8
    The hook was not storing timings properly.
  • fix: update pandas.read_json syntax by @alubbock in #9
    Another deprecation fix.

For further details on how to use these new features, see the README.

Full Changelog: v0.8...v0.9