diff --git a/CHANGELOG.md b/CHANGELOG.md index b99f021..226dd04 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,12 +4,34 @@ All notable changes to microbench are documented here. ## [Unreleased] +### New features + +- **`MBResourceUsage` mixin**: captures POSIX `getrusage()` data — user and + system CPU time, peak RSS (in bytes, normalised across platforms), minor and + major page faults, block I/O operations, and voluntary/involuntary context + switches. Results are stored as a **list** in `resource_usage`, one entry per + timed iteration, aligned index-for-index with `call.durations` and + `call.returncode`. Added as a **default CLI mixin** so every CLI run captures + it automatically. + + - *CLI mode*: on POSIX, uses `os.wait4()` to capture the exact rusage of each + individual child process as reported by the kernel, including a reliable + `maxrss` per iteration regardless of `--iterations` or `--warmup` count. + - *Python API mode*: uses `RUSAGE_SELF` with a before/after delta around each + individual call — one entry per timed iteration, aligned with + `call.durations`. Warmup calls are excluded. `maxrss` is omitted (lifetime + process HWM, not per-call). Use `MBPeakMemory` for per-call peak memory. + + On platforms where the stdlib `resource` module is unavailable, the mixin + records an empty list without raising an error. + ### Enhancements - **`--mixin defaults` keyword** (CLI): `defaults` can be used as a mixin name to expand to the standard default set (`python-info`, `host-info`, - `slurm-info`, `loaded-modules`, `working-dir`). This makes it easy to add - one or more extra mixins without listing all five defaults explicitly: + `slurm-info`, `loaded-modules`, `working-dir`, `resource-usage`). This + makes it easy to add one or more extra mixins without listing all six + defaults explicitly: `microbench --mixin defaults file-hash -- ./job.sh`. - **`file-hash` mixin — automatic argument file scanning** (CLI): the diff --git a/docs/cli.md b/docs/cli.md index a6a05d2..a68132c 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -117,11 +117,13 @@ Every record contains the standard `mb.*` and `call.*` fields plus: ## Default mixins When no `--mixin` is specified, `python-info`, `host-info`, `slurm-info`, -`loaded-modules`, and `working-dir` are included automatically, capturing -the Python interpreter version, prefix, and executable path; hostname and -operating system; all `SLURM_*` environment variables; the loaded -Lmod/Environment Modules software stack; and the current working directory. -All five degrade gracefully or produce stable values outside their respective +`loaded-modules`, `working-dir`, and `resource-usage` are included +automatically, capturing the Python interpreter version, prefix, and +executable path; hostname and operating system; all `SLURM_*` environment +variables; the loaded Lmod/Environment Modules software stack; the current +working directory; and POSIX resource usage (CPU time, peak RSS, page faults, +I/O, and context switches) for the benchmarked subprocess. +All six degrade gracefully or produce stable values outside their respective environments. Mixin names use a short kebab-case form without the `MB` prefix @@ -236,6 +238,15 @@ microbench \ -- ./run_simulation.sh ``` +### `resource-usage` options + +`resource-usage` has no CLI flags. It is included in the defaults and records +POSIX `getrusage()` data automatically for the benchmarked subprocess. + +On platforms where the Python `resource` module is unavailable, +`resource-usage` records an empty `resource_usage` list and does not raise an +error. + ## Capture failures Metadata capture failures (e.g. `nvidia-smi` not installed on this node, @@ -289,6 +300,7 @@ With 10 iterations and 2 warmup runs, the record contains: - `call.durations` — list of 10 wall-clock durations in seconds - `call.returncode` — list of 10 exit codes (one per timed iteration) - `call.stdout` / `call.stderr` — list of 10 captured strings, if `--stdout`/`--stderr` is used +- `resource_usage` — list of 10 per-iteration rusage dicts (when `resource-usage` mixin is active) Warmup runs are excluded from all three lists. The process exits with the first non-zero return code, if present, so any failing iteration diff --git a/docs/user-guide/mixins.md b/docs/user-guide/mixins.md index 2956843..3a04537 100644 --- a/docs/user-guide/mixins.md +++ b/docs/user-guide/mixins.md @@ -19,9 +19,9 @@ to list all available mixins with descriptions: microbench --show-mixins ``` -By default, `python-info`, `host-info`, `slurm-info`, `loaded-modules`, and -`working-dir` are included automatically. Specifying `--mixin` replaces the -defaults entirely. Use `--no-mixin` to disable all mixins: +By default, `python-info`, `host-info`, `slurm-info`, `loaded-modules`, +`working-dir`, and `resource-usage` are included automatically. Specifying +`--mixin` replaces the defaults entirely. Use `--no-mixin` to disable all mixins: ```bash # Only peak-memory — no host info or SLURM @@ -70,6 +70,7 @@ combine any number of microbench mixins without conflicts, and their | `MBLoadedModules` | `loaded-modules` *(default)* | `loaded_modules` dict mapping module name to version (empty dict if no Lmod/Environment Modules are loaded) | — | | `MBWorkingDir` | `working-dir` *(default)* | `call.working_dir` — absolute path of the working directory at benchmark time | — | | `MBCgroupLimits` | `cgroup-limits` | `cgroups` dict with `cpu_cores_limit`, `memory_bytes_limit`, `version` (empty dict if not on Linux or cgroup fs unavailable) | Linux only | +| `MBResourceUsage` | `resource-usage` *(default)* | `resource_usage` list of dicts with CPU times, peak RSS, page faults, I/O ops, and context switches (`[]` when the stdlib `resource` module is unavailable) | POSIX only (stdlib) | | `MBGitInfo` | `git-info` | `git` dict with `repo`, `commit`, `branch`, `dirty` | `git` ≥ 2.11 on PATH | | `MBGlobalPackages` | Python only | `python.loaded_packages` for every package in the caller's global scope | — | | `MBInstalledPackages` | `installed-packages` | `python.installed_packages` (and optionally `python.installed_package_paths`) for every installed package | — | @@ -338,6 +339,172 @@ platforms or when the cgroup filesystem is unavailable. scheduler metadata (job ID, node list, etc.) while `MBCgroupLimits` captures the kernel-enforced resource limits. +### `MBResourceUsage` + +Captures POSIX [`getrusage(2)`](https://man7.org/linux/man-pages/man2/getrusage.2.html) +data — CPU time, page faults, block I/O operations, and context switches — +using only the Python standard library (`resource` module). No extra +dependencies are required. + +**Modes** + +- **CLI mode**: on POSIX, uses `os.wait4()` to get the exact rusage of each + child process as reported by the kernel — one dict per timed iteration, + aligned index-for-index with `call.durations`. + `maxrss` is the child's own peak RSS. +- **Python API mode**: uses `RUSAGE_SELF` — one dict per timed iteration, + each a before/after delta around that single call (aligned index-for-index + with `call.durations`). Warmup calls are excluded. + `maxrss` is **omitted** — `RUSAGE_SELF.maxrss` is a lifetime process + high-water mark that reflects the peak since the interpreter started, + not just since the decorated function was called, making it unreliable + for function-level measurement. + +On platforms where the stdlib `resource` module is unavailable, `resource_usage` +is recorded as an empty list. + +```python +from microbench import MicroBench, MBResourceUsage + +class Bench(MicroBench, MBResourceUsage): + pass + +bench = Bench() + +@bench +def work(): + return list(range(1_000_000)) + +work() +``` + +Python API record (one entry per timed iteration, no `maxrss`): + +```json +{ + "resource_usage": [ + { + "utime": 0.052, + "stime": 0.003, + "minflt": 1024, + "majflt": 0, + "inblock": 0, + "oublock": 0, + "nvcsw": 2, + "nivcsw": 1 + } + ] +} +``` + +CLI record with `--iterations 2` (one entry per iteration, includes `maxrss`): + +```json +{ + "resource_usage": [ + { + "utime": 0.068, + "stime": 0.029, + "maxrss": 11386880, + "minflt": 621, + "majflt": 0, + "inblock": 0, + "oublock": 0, + "nvcsw": 1, + "nivcsw": 2 + }, + { + "utime": 0.071, + "stime": 0.031, + "maxrss": 11386880, + "minflt": 618, + "majflt": 0, + "inblock": 0, + "oublock": 0, + "nvcsw": 1, + "nivcsw": 3 + } + ] +} +``` + +| Field | Modes | Description | +|---|---|---| +| `utime` | Both | User CPU time in seconds (float) | +| `stime` | Both | System CPU time in seconds (float) | +| `maxrss` | CLI only | Peak RSS in bytes (int) — see platform notes | +| `minflt` | Both | Minor page faults — pages reclaimed without I/O (int) | +| `majflt` | Both | Major page faults — pages requiring disk I/O (int) | +| `inblock` | Both | Block input operations (int) — see platform notes | +| `oublock` | Both | Block output operations (int) — see platform notes | +| `nvcsw` | Both | Voluntary context switches (int) | +| `nivcsw` | Both | Involuntary context switches (int) | + +All fields are before/after **deltas** so they reflect only the benchmarked +work. `utime`, `stime`, `minflt`, `nvcsw`, and `nivcsw` are the most +reliable across platforms. + +#### Platform notes and known quirks + +**`maxrss` — CLI mode with `os.wait4()` (all POSIX)** + +`os.wait4()` returns the exact rusage of each individual child process as +reported by the kernel. `maxrss` is the child's own peak RSS, accurate +regardless of iteration count or warmup. Values are normalised to bytes +(Linux reports kilobytes; macOS already reports bytes). + +**`maxrss` — Python API mode (`RUSAGE_SELF`)** + +`RUSAGE_SELF.maxrss` is a lifetime high-water mark for the Python interpreter +process. It is intentionally omitted. Use +[`MBPeakMemory`](#mbpeakmemory) if you need per-call peak memory tracking. + +**`inblock` / `oublock` — macOS** + +These counters are **almost always zero on macOS**, even for substantial file +I/O. The macOS unified buffer cache charges block I/O to the *first* process +that touches each page; subsequent reads and writes to cached pages are not +counted against the process that performed them. In practice, nearly all file +I/O is served from the cache and the counters never increment. + +This is a macOS kernel accounting limitation. It is documented in the +[`getrusage(2)` man page](https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/getrusage.2.html): +*"The numbers ru_inblock and ru_oublock account only for real I/O; data +supplied by the caching mechanism is charged only to the first process to +read or write the data."* + +**`inblock` / `oublock` — Linux** + +On Linux these counters increment only for I/O that truly bypasses the page +cache — cold-cache reads (first access to a file since it was last evicted) +or writes with `O_DIRECT`. Warm-cache reads also show zero. Drop the page +cache (`echo 3 > /proc/sys/vm/drop_caches` as root) before benchmarking if +you need to measure true cold-cache I/O. + +**`majflt` — macOS** + +Major page faults are rare on macOS because the unified buffer cache handles +most page-in activity. Zero is normal. + +**`utime`, `stime`, `minflt`, `nvcsw`, `nivcsw`** + +These are the most reliable fields across both Linux and macOS and are +non-zero for any non-trivial workload. + +!!! note "Non-POSIX platforms" + When the Python `resource` module is unavailable, `resource_usage` is + recorded as an empty list without raising an error. + +**CLI:** `resource-usage` is a default mixin — no flags needed: + +```bash +# Included automatically +microbench --outfile results.jsonl -- ./run_simulation.sh + +# Explicit, if defaults have been overridden +microbench --mixin resource-usage -- ./run_simulation.sh +``` + ## Code provenance ### `MBGitInfo` diff --git a/microbench/__init__.py b/microbench/__init__.py index dc89281..a077fbd 100644 --- a/microbench/__init__.py +++ b/microbench/__init__.py @@ -43,6 +43,7 @@ MBCgroupLimits, MBHostInfo, MBLoadedModules, + MBResourceUsage, MBSlurmInfo, MBWorkingDir, ) @@ -77,6 +78,7 @@ 'MBCgroupLimits', 'MBGitInfo', 'MBFileHash', + 'MBResourceUsage', 'MBGlobalPackages', 'MBInstalledPackages', 'MBCondaPackages', diff --git a/microbench/cli/main.py b/microbench/cli/main.py index ee5ea11..1e086ce 100644 --- a/microbench/cli/main.py +++ b/microbench/cli/main.py @@ -19,6 +19,7 @@ 'slurm-info', 'loaded-modules', 'working-dir', + 'resource-usage', ) @@ -199,6 +200,8 @@ def capture_subprocess_reset(self, bm_data): self._subprocess_monitor = [] self._subprocess_timed_out = False self._subprocess_timed_phase = True + # Reset resource-usage accumulator (populated by run() per iteration). + self._subprocess_resource_usage = [] def capturepost_subprocess_result(self, bm_data): call = bm_data.setdefault('call', {}) @@ -275,6 +278,7 @@ def capturepost_subprocess_result(self, bm_data): bench._subprocess_stdout = [] bench._subprocess_stderr = [] bench._subprocess_monitor = [] + bench._subprocess_resource_usage = [] bench._subprocess_timed_out = False bench._subprocess_timed_phase = False # becomes True after warmup @@ -282,24 +286,18 @@ def capturepost_subprocess_result(self, bm_data): _real_stdout = sys.__stdout__ _real_stderr = sys.__stderr__ + # Lazy import: resource module is POSIX-only. + try: + import resource as _resource + except ImportError: + _resource = None + def run(): capture_stdout = args.stdout in _CAPTURE_CHOICES capture_stderr = args.stderr in _CAPTURE_CHOICES monitor_interval = args.monitor_interval timeout = args.timeout - if ( - not capture_stdout - and not capture_stderr - and monitor_interval is None - and timeout is None - ): - result = subprocess.run(cmd) - bench._subprocess_returncodes.append(result.returncode) - return - - # Use Popen so we have the PID (needed for monitoring) and can read - # stdout/stderr pipes in real time when capture is requested. stdout_chunks = [] stderr_chunks = [] @@ -317,7 +315,8 @@ def _reader(pipe, chunks, real_stream, passthrough): if capture_stderr: popen_kwargs['stderr'] = subprocess.PIPE - with subprocess.Popen(cmd, **popen_kwargs) as proc: + proc = subprocess.Popen(cmd, **popen_kwargs) + try: threads = [] if capture_stdout: t = threading.Thread( @@ -352,17 +351,65 @@ def _reader(pipe, chunks, real_stream, passthrough): monitor_thread.start() timed_out = False - try: - proc.wait(timeout=timeout) - except subprocess.TimeoutExpired: - timed_out = True - proc.terminate() + child_rusage = None + + if _resource is not None: + # os.wait4() reaps the child and returns its exact per-child rusage + # in a single syscall. It must be called *instead* of proc.wait(). + # + # Timeout handling: os.wait4() is a blocking call with no built-in + # deadline. We handle it by running wait4 in a daemon thread and + # joining with a timeout. If the join times out, we terminate/kill + # the child, then block on wait4 (the child is now dying so this + # completes quickly). + _wait4_result = [None] # [(pid, status, rusage)] + _wait4_error = [None] + + def _do_wait4(): + try: + _wait4_result[0] = os.wait4(proc.pid, 0) + except BaseException as exc: + _wait4_error[0] = exc + + _wait4_thread = threading.Thread(target=_do_wait4, daemon=True) + _wait4_thread.start() + _wait4_thread.join(timeout=timeout) + + if _wait4_thread.is_alive(): + # Timed out — terminate/kill the child and wait for it to exit. + timed_out = True + proc.terminate() + try: + grace = args.timeout_grace_period or _SIGTERM_GRACE_PERIOD + _wait4_thread.join(timeout=grace) + except Exception: + pass + if _wait4_thread.is_alive(): + proc.kill() + _wait4_thread.join() # child is dead; this will return quickly + + if _wait4_error[0] is not None: + raise _wait4_error[0] + + if _wait4_result[0] is None: # pragma: no cover - defensive + raise RuntimeError('os.wait4() returned no child status') + + _, wait_status, raw_ru = _wait4_result[0] + proc.returncode = os.waitstatus_to_exitcode(wait_status) + child_rusage = raw_ru + else: + # No resource module available: use proc.wait() with optional timeout. try: - grace = args.timeout_grace_period or _SIGTERM_GRACE_PERIOD - proc.wait(timeout=grace) + proc.wait(timeout=timeout) except subprocess.TimeoutExpired: - proc.kill() - proc.wait() + timed_out = True + proc.terminate() + try: + grace = args.timeout_grace_period or _SIGTERM_GRACE_PERIOD + proc.wait(timeout=grace) + except subprocess.TimeoutExpired: + proc.kill() + proc.wait() if monitor_thread is not None: monitor_thread.stop() @@ -372,13 +419,36 @@ def _reader(pipe, chunks, real_stream, passthrough): for t in threads: t.join() - if timed_out and bench._subprocess_timed_phase: - bench._subprocess_timed_out = True - bench._subprocess_returncodes.append(proc.returncode) - if capture_stdout: - bench._subprocess_stdout.append(''.join(stdout_chunks)) - if capture_stderr: - bench._subprocess_stderr.append(''.join(stderr_chunks)) + if timed_out and bench._subprocess_timed_phase: + bench._subprocess_timed_out = True + bench._subprocess_returncodes.append(proc.returncode) + if capture_stdout: + bench._subprocess_stdout.append(''.join(stdout_chunks)) + if capture_stderr: + bench._subprocess_stderr.append(''.join(stderr_chunks)) + + # Accumulate per-iteration resource usage (only during timed phase). + if bench._subprocess_timed_phase and child_rusage is not None: + from microbench.mixins.system import _rusage_from_wait4 + + bench._subprocess_resource_usage.append( + _rusage_from_wait4(child_rusage) + ) + except BaseException: + # Ensure the child is not left running as an orphan on any unexpected + # exception (including KeyboardInterrupt during a join). + # proc.kill() is safe after os.wait4() has already reaped the child + # (it sends SIGKILL to a dead PID which Popen silently ignores). + # proc.wait() is safe after os.wait4() too: returncode is already set. + proc.kill() + proc.wait() + raise + finally: + # Always close pipe FDs to prevent file-descriptor leaks. + if proc.stdout is not None: + proc.stdout.close() + if proc.stderr is not None: + proc.stderr.close() run.__name__ = os.path.basename(cmd[0]) bench(run)() diff --git a/microbench/cli/registry.py b/microbench/cli/registry.py index 16b3149..0a70527 100644 --- a/microbench/cli/registry.py +++ b/microbench/cli/registry.py @@ -12,6 +12,7 @@ MBCgroupLimits, MBHostInfo, MBLoadedModules, + MBResourceUsage, MBSlurmInfo, MBWorkingDir, ) @@ -28,6 +29,7 @@ 'cgroup-limits': MBCgroupLimits, 'git-info': MBGitInfo, 'file-hash': MBFileHash, + 'resource-usage': MBResourceUsage, 'installed-packages': MBInstalledPackages, 'conda-packages': MBCondaPackages, 'peak-memory': MBPeakMemory, diff --git a/microbench/core/bench.py b/microbench/core/bench.py index 7157d1b..752cc9c 100644 --- a/microbench/core/bench.py +++ b/microbench/core/bench.py @@ -248,8 +248,16 @@ def post_finish_triggers(self, bm_data): def pre_run_triggers(self, bm_data): bm_data['_run_start'] = self._duration_counter() + # Forward to mixin overrides via cooperative super() chaining. + parent = super() + if hasattr(parent, 'pre_run_triggers'): + parent.pre_run_triggers(bm_data) def post_run_triggers(self, bm_data): + # Forward to mixin overrides before recording the elapsed time. + parent = super() + if hasattr(parent, 'post_run_triggers'): + parent.post_run_triggers(bm_data) bm_data['call']['durations'].append( self._duration_counter() - bm_data['_run_start'] ) diff --git a/microbench/mixins/system.py b/microbench/mixins/system.py index 3e2e8e4..06ec298 100644 --- a/microbench/mixins/system.py +++ b/microbench/mixins/system.py @@ -1,6 +1,7 @@ """System-info mixins. -Classes: MBHostInfo, MBWorkingDir, MBSlurmInfo, MBCgroupLimits, MBLoadedModules. +Classes: MBHostInfo, MBWorkingDir, MBSlurmInfo, MBCgroupLimits, MBLoadedModules, +MBResourceUsage. """ import os @@ -12,6 +13,11 @@ except ImportError: psutil = None +try: + import resource as _resource +except ImportError: + _resource = None + class MBHostInfo: """Capture hostname, operating system, and (optionally) CPU and RAM info. @@ -256,3 +262,216 @@ def capture_cgroup_limits(self, bm_data): bm_data['cgroups'] = _read_cgroup_v1() except (OSError, ValueError, ZeroDivisionError): bm_data['cgroups'] = {} + + +def _rusage_to_dict(ru, *, include_maxrss=True): + """Convert a ``struct_rusage`` to a plain dict with normalised fields. + + ``maxrss`` is normalised to bytes: macOS already reports bytes; Linux + reports kilobytes and is multiplied by 1024. + + Pass ``include_maxrss=False`` to omit ``maxrss`` (used in Python API mode + where ``RUSAGE_SELF.maxrss`` is a lifetime process high-water mark and + cannot meaningfully isolate a single function call). + """ + d = { + 'utime': ru.ru_utime, + 'stime': ru.ru_stime, + 'minflt': ru.ru_minflt, + 'majflt': ru.ru_majflt, + 'inblock': ru.ru_inblock, + 'oublock': ru.ru_oublock, + 'nvcsw': ru.ru_nvcsw, + 'nivcsw': ru.ru_nivcsw, + } + if include_maxrss: + maxrss = ru.ru_maxrss + if sys.platform == 'linux': + maxrss *= 1024 + d['maxrss'] = maxrss + return d + + +def _rusage_from_wait4(raw_ru): + """Convert the raw rusage object returned by ``os.wait4()`` to a dict. + + ``os.wait4()`` returns a ``resource.struct_rusage``-compatible object + whose ``ru_maxrss`` already reflects **only that child process** — no + cumulative HWM subtraction is needed. ``maxrss`` is normalised to bytes + using the same platform rule as ``_rusage_to_dict``. + """ + maxrss = raw_ru.ru_maxrss + if sys.platform == 'linux': + maxrss *= 1024 + return { + 'utime': raw_ru.ru_utime, + 'stime': raw_ru.ru_stime, + 'maxrss': maxrss, + 'minflt': raw_ru.ru_minflt, + 'majflt': raw_ru.ru_majflt, + 'inblock': raw_ru.ru_inblock, + 'oublock': raw_ru.ru_oublock, + 'nvcsw': raw_ru.ru_nvcsw, + 'nivcsw': raw_ru.ru_nivcsw, + } + + +def _rusage_delta(before, after): + """Return ``after - before`` for all accumulator fields. + + ``maxrss`` is only included when ``before`` has a ``maxrss`` key. + """ + d = { + 'utime': after['utime'] - before['utime'], + 'stime': after['stime'] - before['stime'], + 'minflt': after['minflt'] - before['minflt'], + 'majflt': after['majflt'] - before['majflt'], + 'inblock': after['inblock'] - before['inblock'], + 'oublock': after['oublock'] - before['oublock'], + 'nvcsw': after['nvcsw'] - before['nvcsw'], + 'nivcsw': after['nivcsw'] - before['nivcsw'], + } + if 'maxrss' in before: + d['maxrss'] = after['maxrss'] - before['maxrss'] + return d + + +class MBResourceUsage: + """Capture POSIX ``getrusage()`` data for the benchmarked code. + + Records CPU time, page faults, block I/O operations, and context switches. + Results are stored as a **list** of dicts, one entry per timed iteration + in both CLI and Python API mode, aligning index-for-index with + ``call.durations`` and ``call.returncode``. + + **Modes** + + - *CLI mode* (subprocess): on POSIX, uses ``os.wait4()`` to capture the + exact rusage of each child process as reported by the kernel — one entry + per timed iteration, aligned with ``call.durations`` and + ``call.returncode``. + + - *Python API mode* (function): uses ``RUSAGE_SELF`` — one entry per + timed iteration (matching ``call.durations`` index-for-index), each a + before/after delta around that single call. ``maxrss`` is **always + omitted** in this mode because ``RUSAGE_SELF.maxrss`` is a lifetime + process high-water mark; it reflects peak usage since program start, + not just the decorated function. Use ``MBPeakMemory`` for per-call + peak RSS in Python API mode. + + On non-POSIX platforms where the ``resource`` module is unavailable, this + mixin records an empty list without raising an error. + + Output key: ``resource_usage`` (list of dicts) + + Fields per entry (CLI mode with ``os.wait4()`` — the common POSIX case): + + - ``utime``: user CPU time consumed by the child (seconds, float) + - ``stime``: system CPU time consumed by the child (seconds, float) + - ``maxrss``: peak RSS of the child in bytes (int) — see platform notes + - ``minflt``: minor page faults (int) + - ``majflt``: major page faults (int) + - ``inblock``: block input operations (int) — see platform notes + - ``oublock``: block output operations (int) — see platform notes + - ``nvcsw``: voluntary context switches (int) + - ``nivcsw``: involuntary context switches (int) + + Fields per entry (Python API mode): all of the above **except** ``maxrss``. + + **Platform notes** + + *maxrss* (CLI mode with ``os.wait4()``): + Reported directly by the kernel as the child's own peak RSS. Exact + and per-child regardless of iteration count or warmup. + + *inblock / oublock* (macOS): + These counters are almost always zero on macOS regardless of actual + I/O performed. The macOS unified buffer cache charges block I/O to + the *first* process that touches each page; subsequent reads and + writes to cached pages are not counted. In practice nearly all file + I/O is served from the cache and the counters never increment. This + is a kernel accounting limitation, not a microbench bug. On Linux, + ``inblock`` and ``oublock`` increment only for I/O that bypasses the + page cache (i.e. cold-cache reads or ``O_DIRECT`` writes); warm-cache + reads also show zero. + + *majflt* (macOS): + Major page faults are rare on macOS because the unified buffer cache + handles most page-in activity. Zero is normal. + + *utime / stime / minflt / nvcsw / nivcsw*: + These are the most reliable fields across both Linux and macOS and + should be non-zero for any non-trivial workload. + + Example output (CLI mode, 2 iterations):: + + { + "resource_usage": [ + { + "utime": 0.123456, + "stime": 0.012345, + "maxrss": 10485760, + "minflt": 512, + "majflt": 0, + "inblock": 0, + "oublock": 0, + "nvcsw": 3, + "nivcsw": 1 + }, + { + "utime": 0.118000, + "stime": 0.011000, + "maxrss": 10485760, + "minflt": 498, + "majflt": 0, + "inblock": 0, + "oublock": 0, + "nvcsw": 2, + "nivcsw": 1 + } + ] + } + + Note: + CLI compatible. + """ + + def capture_resource_usage(self, bm_data): + """Initialise resource-usage accumulator before all iterations.""" + if _resource is None: + return + if hasattr(self, '_subprocess_command'): + # CLI mode: accumulator populated by run() in main.py. + self._subprocess_resource_usage = [] + else: + # Python API mode: per-iteration list populated by pre/post_run_triggers. + self._rusage_iter_entries = [] + + def pre_run_triggers(self, bm_data): + if _resource is not None and not hasattr(self, '_subprocess_command'): + self._rusage_iter_before = _rusage_to_dict( + _resource.getrusage(_resource.RUSAGE_SELF), include_maxrss=False + ) + + def post_run_triggers(self, bm_data): + if _resource is not None and not hasattr(self, '_subprocess_command'): + after = _rusage_to_dict( + _resource.getrusage(_resource.RUSAGE_SELF), include_maxrss=False + ) + self._rusage_iter_entries.append( + _rusage_delta(self._rusage_iter_before, after) + ) + + def capturepost_resource_usage(self, bm_data): + """Write the resource_usage list to bm_data after all iterations.""" + if _resource is None: + bm_data['resource_usage'] = [] + return + if hasattr(self, '_subprocess_command'): + # CLI mode: list already populated by run() in main.py. + bm_data['resource_usage'] = list( + getattr(self, '_subprocess_resource_usage', []) + ) + else: + # Python API mode: one entry per timed iteration. + bm_data['resource_usage'] = list(getattr(self, '_rusage_iter_entries', [])) diff --git a/tests/test_cli.py b/tests/test_cli.py index 4a0f662..1d899d8 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -2,6 +2,7 @@ import json import os import subprocess +import sys from unittest.mock import MagicMock, patch import pytest @@ -10,30 +11,99 @@ from microbench.__main__ import main -def _make_mock_popen(returncode=0, stdout_lines=None, stderr_lines=None): - """Create a mock Popen process for --stdout/--stderr capture tests.""" +class _FakeRusage: + def __init__( + self, + *, + utime=0.0, + stime=0.0, + maxrss=1, + minflt=0, + majflt=0, + inblock=0, + oublock=0, + nvcsw=0, + nivcsw=0, + ): + self.ru_utime = utime + self.ru_stime = stime + self.ru_maxrss = maxrss + self.ru_minflt = minflt + self.ru_majflt = majflt + self.ru_inblock = inblock + self.ru_oublock = oublock + self.ru_nvcsw = nvcsw + self.ru_nivcsw = nivcsw + + +class _MockPipe: + """Iterable pipe mock that supports .close(), as real subprocess pipes do.""" + + def __init__(self, lines): + self._iter = iter(lines) + self.closed = False + + def __iter__(self): + return self._iter + + def close(self): + self.closed = True + + +def _make_mock_popen(returncode=0, stdout_lines=None, stderr_lines=None, pid=12345): + """Create a mock Popen process.""" mock_proc = MagicMock() - mock_proc.__enter__.return_value = mock_proc - mock_proc.__exit__.return_value = False mock_proc.returncode = returncode - mock_proc.stdout = iter(stdout_lines) if stdout_lines is not None else None - mock_proc.stderr = iter(stderr_lines) if stderr_lines is not None else None + mock_proc.pid = pid + mock_proc.stdout = _MockPipe(stdout_lines) if stdout_lines is not None else None + mock_proc.stderr = _MockPipe(stderr_lines) if stderr_lines is not None else None return mock_proc def _run_main(argv, mock_returncode=0): """Run main() with a mocked subprocess and captured stdout.""" - mock_result = MagicMock() - mock_result.returncode = mock_returncode - mock_result.stdout = None - mock_result.stderr = None + mock_proc = _make_mock_popen(returncode=mock_returncode) buf = io.StringIO() - with patch('subprocess.run', return_value=mock_result) as mock_run: - with patch('sys.stdout', buf): - with pytest.raises(SystemExit) as exc: - main(argv) - return exc.value.code, json.loads(buf.getvalue()), mock_run + wait_status = 0 if mock_returncode == 0 else mock_returncode << 8 + fake_wait4 = (mock_proc.pid, wait_status, _FakeRusage()) + with patch('subprocess.Popen', return_value=mock_proc) as mock_popen: + with patch('os.wait4', return_value=fake_wait4): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit) as exc: + main(argv) + return exc.value.code, json.loads(buf.getvalue()), mock_popen + + +def _patch_wait4_success(mock_proc, *, rusage=None, status=0): + if rusage is None: + rusage = _FakeRusage() + return patch('os.wait4', return_value=(mock_proc.pid, status, rusage)) + + +def _patch_wait4_sleep(mock_proc, delay, *, status=0, rusage=None): + if rusage is None: + rusage = _FakeRusage() + + def _wait4(pid, options): + import time + + time.sleep(delay) + return (mock_proc.pid, status, rusage) + + return patch('os.wait4', side_effect=_wait4) + + +def _patch_wait4_sequence(*results): + iterator = iter(results) + + def _wait4(pid, options): + result = next(iterator) + if callable(result): + return result(pid, options) + return result + + return patch('os.wait4', side_effect=_wait4) def test_cli_records_command_and_timing(): @@ -176,14 +246,12 @@ def test_cli_mixin_defaults_keyword_invalid_extra(): def test_cli_outfile(tmp_path): """--outfile writes JSONL to the specified file.""" outfile = tmp_path / 'results.jsonl' - mock_result = MagicMock() - mock_result.returncode = 0 - mock_result.stdout = None - mock_result.stderr = None + mock_proc = _make_mock_popen(returncode=0) - with patch('subprocess.run', return_value=mock_result): - with pytest.raises(SystemExit): - main(['--outfile', str(outfile), '--', 'true']) + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with pytest.raises(SystemExit): + main(['--outfile', str(outfile), '--', 'true']) record = json.loads(outfile.read_text()) assert record['call']['command'] == ['true'] @@ -235,12 +303,16 @@ def test_cli_all_flag_includes_all_mixins(): all_names = set(_get_mixin_map()) # Patch every mixin's capture methods to no-ops to avoid external calls - with patch('subprocess.run', return_value=MagicMock(returncode=0)): - buf = io.StringIO() - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - with patch('subprocess.check_output', side_effect=Exception('skip')): - main(['--all', '--', 'true']) + mock_proc = _make_mock_popen(returncode=0) + buf = io.StringIO() + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + with patch( + 'subprocess.check_output', side_effect=Exception('skip') + ): + main(['--all', '--', 'true']) # At minimum the record should be written (even with capture_optional errors) record = json.loads(buf.getvalue()) @@ -265,55 +337,60 @@ def test_cli_includes_mb_run_id_and_version(): def test_cli_double_dash_separator(): """-- separator is stripped before passing the command to subprocess.""" - _, _, mock_run = _run_main(['--', 'echo', 'hello']) + _, _, mock_popen = _run_main(['--', 'echo', 'hello']) - mock_run.assert_called_once() - called_cmd = mock_run.call_args[0][0] + mock_popen.assert_called_once() + called_cmd = mock_popen.call_args[0][0] assert called_cmd == ['echo', 'hello'] def test_cli_iterations(): """--iterations N runs the command N times and produces N durations entries.""" - _, record, mock_run = _run_main(['--iterations', '3', '--', 'true']) + _, record, mock_popen = _run_main(['--iterations', '3', '--', 'true']) - assert mock_run.call_count == 3 + assert mock_popen.call_count == 3 assert len(record['call']['durations']) == 3 assert len(record['call']['returncode']) == 3 def test_cli_warmup(): """--warmup N runs the command N extra times before timing begins.""" - _, record, mock_run = _run_main(['--warmup', '2', '--', 'true']) + _, record, mock_popen = _run_main(['--warmup', '2', '--', 'true']) # 2 warmup calls + 1 timed call - assert mock_run.call_count == 3 + assert mock_popen.call_count == 3 assert len(record['call']['durations']) == 1 assert len(record['call']['returncode']) == 1 def test_cli_iterations_and_warmup(): """--iterations and --warmup together produce the right call count.""" - _, record, mock_run = _run_main( + _, record, mock_popen = _run_main( ['--iterations', '4', '--warmup', '2', '--', 'true'] ) - assert mock_run.call_count == 6 + assert mock_popen.call_count == 6 assert len(record['call']['durations']) == 4 assert len(record['call']['returncode']) == 4 def test_cli_returncode_is_first_nonzero_across_iterations(): """Process exits with the first non-zero returncode seen across timed iterations.""" - mock_results = [ - MagicMock(returncode=0, stdout=None, stderr=None), - MagicMock(returncode=2, stdout=None, stderr=None), - MagicMock(returncode=1, stdout=None, stderr=None), + mock_procs = [ + _make_mock_popen(returncode=0), + _make_mock_popen(returncode=2), + _make_mock_popen(returncode=1), ] buf = io.StringIO() - with patch('subprocess.run', side_effect=mock_results): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit) as exc: - main(['--iterations', '3', '--', 'true']) + with patch('subprocess.Popen', side_effect=mock_procs): + with _patch_wait4_sequence( + (mock_procs[0].pid, 0, _FakeRusage()), + (mock_procs[1].pid, 2 << 8, _FakeRusage()), + (mock_procs[2].pid, 1 << 8, _FakeRusage()), + ): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit) as exc: + main(['--iterations', '3', '--', 'true']) assert exc.value.code == 2 assert json.loads(buf.getvalue())['call']['returncode'] == [0, 2, 1] @@ -321,16 +398,21 @@ def test_cli_returncode_is_first_nonzero_across_iterations(): def test_cli_returncode_preserves_first_nonzero_even_if_later_is_larger(): """A later larger returncode does not override the first failure.""" - mock_results = [ - MagicMock(returncode=0, stdout=None, stderr=None), - MagicMock(returncode=1, stdout=None, stderr=None), - MagicMock(returncode=2, stdout=None, stderr=None), + mock_procs = [ + _make_mock_popen(returncode=0), + _make_mock_popen(returncode=1), + _make_mock_popen(returncode=2), ] buf = io.StringIO() - with patch('subprocess.run', side_effect=mock_results): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit) as exc: - main(['--iterations', '3', '--', 'true']) + with patch('subprocess.Popen', side_effect=mock_procs): + with _patch_wait4_sequence( + (mock_procs[0].pid, 0, _FakeRusage()), + (mock_procs[1].pid, 1 << 8, _FakeRusage()), + (mock_procs[2].pid, 2 << 8, _FakeRusage()), + ): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit) as exc: + main(['--iterations', '3', '--', 'true']) assert exc.value.code == 1 assert json.loads(buf.getvalue())['call']['returncode'] == [0, 1, 2] @@ -338,16 +420,21 @@ def test_cli_returncode_preserves_first_nonzero_even_if_later_is_larger(): def test_cli_returncode_preserves_first_nonzero_signal_code(): """Negative subprocess returncodes are also returned if they are first non-zero.""" - mock_results = [ - MagicMock(returncode=0, stdout=None, stderr=None), - MagicMock(returncode=-15, stdout=None, stderr=None), - MagicMock(returncode=1, stdout=None, stderr=None), + mock_procs = [ + _make_mock_popen(returncode=0), + _make_mock_popen(returncode=-15), + _make_mock_popen(returncode=1), ] buf = io.StringIO() - with patch('subprocess.run', side_effect=mock_results): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit) as exc: - main(['--iterations', '3', '--', 'true']) + with patch('subprocess.Popen', side_effect=mock_procs): + with _patch_wait4_sequence( + (mock_procs[0].pid, 0, _FakeRusage()), + (mock_procs[1].pid, 15, _FakeRusage()), + (mock_procs[2].pid, 1 << 8, _FakeRusage()), + ): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit) as exc: + main(['--iterations', '3', '--', 'true']) assert exc.value.code == -15 assert json.loads(buf.getvalue())['call']['returncode'] == [0, -15, 1] @@ -363,12 +450,16 @@ def test_cli_multiple_mixins(): def test_cli_all_overrides_mixin(): """--all takes precedence over --mixin.""" - with patch('subprocess.run', return_value=MagicMock(returncode=0)): - buf = io.StringIO() - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - with patch('subprocess.check_output', side_effect=Exception('skip')): - main(['--mixin', 'MBHostInfo', '--all', '--', 'true']) + mock_proc = _make_mock_popen(returncode=0) + buf = io.StringIO() + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + with patch( + 'subprocess.check_output', side_effect=Exception('skip') + ): + main(['--mixin', 'MBHostInfo', '--all', '--', 'true']) record = json.loads(buf.getvalue()) # --all should activate every mixin, so slurm (from MBSlurmInfo) must be present @@ -417,10 +508,11 @@ def test_cli_capture_stdout_records_output(): buf = io.StringIO() terminal = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc) as mock_popen: - with patch('sys.stdout', buf): - with patch('sys.__stdout__', terminal): - with pytest.raises(SystemExit): - main(['--stdout', '--', 'echo', 'hello']) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with patch('sys.__stdout__', terminal): + with pytest.raises(SystemExit): + main(['--stdout', '--', 'echo', 'hello']) record = json.loads(buf.getvalue()) assert record['call']['stdout'] == ['hello\n'] @@ -436,10 +528,11 @@ def test_cli_capture_stdout_suppress(): buf = io.StringIO() terminal = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with patch('sys.__stdout__', terminal): - with pytest.raises(SystemExit): - main(['--stdout=suppress', '--', 'echo', 'hello']) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with patch('sys.__stdout__', terminal): + with pytest.raises(SystemExit): + main(['--stdout=suppress', '--', 'echo', 'hello']) record = json.loads(buf.getvalue()) assert record['call']['stdout'] == ['hello\n'] @@ -453,10 +546,11 @@ def test_cli_capture_stderr_records_output(): buf = io.StringIO() terminal_err = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with patch('sys.__stderr__', terminal_err): - with pytest.raises(SystemExit): - main(['--stderr', '--', 'cmd']) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with patch('sys.__stderr__', terminal_err): + with pytest.raises(SystemExit): + main(['--stderr', '--', 'cmd']) record = json.loads(buf.getvalue()) assert record['call']['stderr'] == ['warning\n'] @@ -471,10 +565,11 @@ def test_cli_capture_stderr_suppress(): buf = io.StringIO() terminal_err = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with patch('sys.__stderr__', terminal_err): - with pytest.raises(SystemExit): - main(['--stderr=suppress', '--', 'cmd']) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with patch('sys.__stderr__', terminal_err): + with pytest.raises(SystemExit): + main(['--stderr=suppress', '--', 'cmd']) record = json.loads(buf.getvalue()) assert record['call']['stderr'] == ['warning\n'] @@ -492,12 +587,26 @@ def test_cli_capture_stdout_multiple_iterations(): buf = io.StringIO() with patch('subprocess.Popen', side_effect=mock_procs): - with patch('sys.stdout', buf): - with patch('sys.__stdout__', io.StringIO()): - with pytest.raises(SystemExit): - main( - ['--stdout', '--warmup', '1', '--iterations', '3', '--', 'cmd'] - ) + with _patch_wait4_sequence( + (mock_procs[0].pid, 0, _FakeRusage()), + (mock_procs[1].pid, 0, _FakeRusage()), + (mock_procs[2].pid, 0, _FakeRusage()), + (mock_procs[3].pid, 0, _FakeRusage()), + ): + with patch('sys.stdout', buf): + with patch('sys.__stdout__', io.StringIO()): + with pytest.raises(SystemExit): + main( + [ + '--stdout', + '--warmup', + '1', + '--iterations', + '3', + '--', + 'cmd', + ] + ) record = json.loads(buf.getvalue()) assert record['call']['stdout'] == ['run1\n', 'run2\n', 'run3\n'] @@ -512,11 +621,12 @@ def test_cli_capture_stdout_and_stderr(): buf = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with patch('sys.__stdout__', io.StringIO()): - with patch('sys.__stderr__', io.StringIO()): - with pytest.raises(SystemExit): - main(['--stdout', '--stderr', '--', 'cmd']) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with patch('sys.__stdout__', io.StringIO()): + with patch('sys.__stderr__', io.StringIO()): + with pytest.raises(SystemExit): + main(['--stdout', '--stderr', '--', 'cmd']) record = json.loads(buf.getvalue()) assert record['call']['stdout'] == ['out\n'] @@ -614,9 +724,10 @@ def _run_main_with_monitor(argv, mock_pid=12345, mock_returncode=0, fake_samples buf = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - main(argv) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main(argv) return json.loads(buf.getvalue()), MockThread, mock_thread @@ -695,19 +806,20 @@ def make_thread(pid, interval): buf = io.StringIO() with patch('microbench.cli.main._SubprocessMonitorThread', side_effect=make_thread): with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - main( - [ - '--no-mixin', - '--monitor-interval', - '5', - '--iterations', - '3', - '--', - 'cmd', - ] - ) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main( + [ + '--no-mixin', + '--monitor-interval', + '5', + '--iterations', + '3', + '--', + 'cmd', + ] + ) record = json.loads(buf.getvalue()) assert len(record['call']['monitor']) == 3 @@ -732,21 +844,22 @@ def make_thread(pid, interval): buf = io.StringIO() with patch('microbench.cli.main._SubprocessMonitorThread', side_effect=make_thread): with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - main( - [ - '--no-mixin', - '--monitor-interval', - '5', - '--warmup', - '2', - '--iterations', - '2', - '--', - 'cmd', - ] - ) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main( + [ + '--no-mixin', + '--monitor-interval', + '5', + '--warmup', + '2', + '--iterations', + '2', + '--', + 'cmd', + ] + ) record = json.loads(buf.getvalue()) assert len(record['call']['monitor']) == 2 @@ -778,7 +891,7 @@ def test_cli_monitor_interval_requires_psutil(): def test_cli_monitor_interval_with_stdout_capture(): """--monitor-interval and --stdout can be combined.""" mock_proc = _make_mock_popen_for_monitor(pid=55) - mock_proc.stdout = iter([b'hello\n']) + mock_proc.stdout = _MockPipe([b'hello\n']) mock_proc.stderr = None fake_samples = [{'timestamp': 'T', 'cpu_percent': 8.0, 'rss_bytes': 2048}] @@ -788,19 +901,20 @@ def test_cli_monitor_interval_with_stdout_capture(): mock_thread.samples = fake_samples MockThread.return_value = mock_thread with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with patch('sys.__stdout__', io.StringIO()): - with pytest.raises(SystemExit): - main( - [ - '--no-mixin', - '--monitor-interval', - '5', - '--stdout', - '--', - 'cmd', - ] - ) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with patch('sys.__stdout__', io.StringIO()): + with pytest.raises(SystemExit): + main( + [ + '--no-mixin', + '--monitor-interval', + '5', + '--stdout', + '--', + 'cmd', + ] + ) record = json.loads(buf.getvalue()) assert record['call']['stdout'] == ['hello\n'] @@ -818,14 +932,16 @@ def _run_main_http(argv, fake_status=200): mock_response.__enter__ = lambda s: s mock_response.__exit__ = MagicMock(return_value=False) + mock_proc = _make_mock_popen(returncode=0) buf = io.StringIO() - with patch('subprocess.run', return_value=MagicMock(returncode=0)): - with patch( - 'urllib.request.urlopen', return_value=mock_response - ) as mock_urlopen: - with patch('sys.stdout', buf): - with pytest.raises(SystemExit) as exc: - main(argv) + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch( + 'urllib.request.urlopen', return_value=mock_response + ) as mock_urlopen: + with patch('sys.stdout', buf): + with pytest.raises(SystemExit) as exc: + main(argv) return exc.value.code, mock_urlopen @@ -848,11 +964,21 @@ def test_cli_http_output_no_stdout_record(): mock_response = MagicMock() mock_response.__enter__ = lambda s: s mock_response.__exit__ = MagicMock(return_value=False) - with patch('subprocess.run', return_value=MagicMock(returncode=0)): - with patch('urllib.request.urlopen', return_value=mock_response): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - main(['--no-mixin', '--http-output', 'https://x.com', '--', 'true']) + mock_proc = _make_mock_popen(returncode=0) + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch('urllib.request.urlopen', return_value=mock_response): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main( + [ + '--no-mixin', + '--http-output', + 'https://x.com', + '--', + 'true', + ] + ) assert buf.getvalue() == '' @@ -862,22 +988,24 @@ def test_cli_http_output_and_outfile(tmp_path): mock_response = MagicMock() mock_response.__enter__ = lambda s: s mock_response.__exit__ = MagicMock(return_value=False) - with patch('subprocess.run', return_value=MagicMock(returncode=0)): - with patch( - 'urllib.request.urlopen', return_value=mock_response - ) as mock_urlopen: - with pytest.raises(SystemExit): - main( - [ - '--no-mixin', - '--outfile', - str(outfile), - '--http-output', - 'https://x.com', - '--', - 'true', - ] - ) + mock_proc = _make_mock_popen(returncode=0) + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch( + 'urllib.request.urlopen', return_value=mock_response + ) as mock_urlopen: + with pytest.raises(SystemExit): + main( + [ + '--no-mixin', + '--outfile', + str(outfile), + '--http-output', + 'https://x.com', + '--', + 'true', + ] + ) assert outfile.exists() assert json.loads(outfile.read_text())['call']['name'] == 'true' mock_urlopen.assert_called_once() @@ -996,12 +1124,14 @@ def _run_main_redis(argv): mock_client.rpush.side_effect = lambda key, val: redis_store.append(val) mock_redis = MagicMock() mock_redis.StrictRedis.return_value = mock_client + mock_proc = _make_mock_popen(returncode=0) buf = io.StringIO() - with patch('subprocess.run', return_value=MagicMock(returncode=0)): - with patch.dict('sys.modules', {'redis': mock_redis}): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit) as exc: - main(argv) + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch.dict('sys.modules', {'redis': mock_redis}): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit) as exc: + main(argv) return exc.value.code, mock_redis, mock_client, redis_store @@ -1031,13 +1161,21 @@ def test_cli_redis_output_no_stdout_record(): mock_client = MagicMock() mock_redis = MagicMock() mock_redis.StrictRedis.return_value = mock_client - with patch('subprocess.run', return_value=MagicMock(returncode=0)): - with patch.dict('sys.modules', {'redis': mock_redis}): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - main( - ['--no-mixin', '--redis-output', 'bench:results', '--', 'true'] - ) + mock_proc = _make_mock_popen(returncode=0) + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch.dict('sys.modules', {'redis': mock_redis}): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main( + [ + '--no-mixin', + '--redis-output', + 'bench:results', + '--', + 'true', + ] + ) assert buf.getvalue() == '' @@ -1047,20 +1185,22 @@ def test_cli_redis_output_and_outfile(tmp_path): mock_client = MagicMock() mock_redis = MagicMock() mock_redis.StrictRedis.return_value = mock_client - with patch('subprocess.run', return_value=MagicMock(returncode=0)): - with patch.dict('sys.modules', {'redis': mock_redis}): - with pytest.raises(SystemExit): - main( - [ - '--no-mixin', - '--outfile', - str(outfile), - '--redis-output', - 'bench:results', - '--', - 'true', - ] - ) + mock_proc = _make_mock_popen(returncode=0) + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch.dict('sys.modules', {'redis': mock_redis}): + with pytest.raises(SystemExit): + main( + [ + '--no-mixin', + '--outfile', + str(outfile), + '--redis-output', + 'bench:results', + '--', + 'true', + ] + ) assert outfile.exists() assert json.loads(outfile.read_text())['call']['name'] == 'true' mock_client.rpush.assert_called_once() @@ -1414,9 +1554,10 @@ def test_cli_timeout_not_exceeded(): buf = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit) as exc: - main(['--no-mixin', '--timeout', '30', '--', 'sleep', '1']) + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit) as exc: + main(['--no-mixin', '--timeout', '30', '--', 'sleep', '1']) assert exc.value.code == 0 record = json.loads(buf.getvalue()) assert 'timed_out' not in record['call'] @@ -1427,17 +1568,13 @@ def test_cli_timeout_sigterm_sufficient(): """--timeout: process exits after SIGTERM; SIGKILL is not sent.""" mock_proc = _make_mock_popen() mock_proc.returncode = -15 # killed by SIGTERM - mock_proc.wait.side_effect = [ - subprocess.TimeoutExpired(cmd=['sleep', '100'], timeout=5), # main timeout - None, # exits cleanly after SIGTERM (within grace period) - None, # called again by Popen.__exit__ - ] buf = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - main(['--no-mixin', '--timeout', '5', '--', 'sleep', '100']) + with _patch_wait4_sleep(mock_proc, 0.05, status=15): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main(['--no-mixin', '--timeout', '0.001', '--', 'sleep', '100']) record = json.loads(buf.getvalue()) assert record['call']['timed_out'] is True mock_proc.terminate.assert_called_once() @@ -1448,18 +1585,24 @@ def test_cli_timeout_sigkill_required(): """--timeout: SIGKILL sent when process ignores SIGTERM past the grace period.""" mock_proc = _make_mock_popen() mock_proc.returncode = -9 # killed by SIGKILL - mock_proc.wait.side_effect = [ - subprocess.TimeoutExpired(cmd=['sleep', '100'], timeout=5), # main timeout - subprocess.TimeoutExpired(cmd=['sleep', '100'], timeout=5), # grace period - None, # exits after SIGKILL - None, # called again by Popen.__exit__ - ] buf = io.StringIO() with patch('subprocess.Popen', return_value=mock_proc): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit): - main(['--no-mixin', '--timeout', '5', '--', 'sleep', '100']) + with _patch_wait4_sleep(mock_proc, 0.05, status=9): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main( + [ + '--no-mixin', + '--timeout', + '0.001', + '--timeout-grace-period', + '0.001', + '--', + 'sleep', + '100', + ] + ) record = json.loads(buf.getvalue()) assert record['call']['timed_out'] is True mock_proc.terminate.assert_called_once() @@ -1470,31 +1613,36 @@ def test_cli_timeout_during_warmup_does_not_set_timed_out(): """A timeout in warmup is discarded along with other warmup-only state.""" warmup_proc = _make_mock_popen() warmup_proc.returncode = -15 - warmup_proc.wait.side_effect = [ - subprocess.TimeoutExpired(cmd=['sleep', '100'], timeout=5), - None, - None, - ] timed_proc = _make_mock_popen() timed_proc.returncode = 0 + def warmup_wait4(pid, options): + import time + + time.sleep(0.05) + return (warmup_proc.pid, 15, _FakeRusage()) + buf = io.StringIO() with patch('subprocess.Popen', side_effect=[warmup_proc, timed_proc]): - with patch('sys.stdout', buf): - with pytest.raises(SystemExit) as exc: - main( - [ - '--no-mixin', - '--warmup', - '1', - '--timeout', - '5', - '--', - 'sleep', - '100', - ] - ) + with _patch_wait4_sequence( + warmup_wait4, + (timed_proc.pid, 0, _FakeRusage()), + ): + with patch('sys.stdout', buf): + with pytest.raises(SystemExit) as exc: + main( + [ + '--no-mixin', + '--warmup', + '1', + '--timeout', + '0.001', + '--', + 'sleep', + '100', + ] + ) assert exc.value.code == 0 record = json.loads(buf.getvalue()) @@ -1502,6 +1650,43 @@ def test_cli_timeout_during_warmup_does_not_set_timed_out(): assert 'timed_out' not in record['call'] +def test_cli_keyboard_interrupt_kills_child(): + """KeyboardInterrupt during wait4 join causes proc.kill() and re-raises.""" + mock_proc = _make_mock_popen() + + def _wait4_interrupt(pid, options): + raise KeyboardInterrupt + + with patch('subprocess.Popen', return_value=mock_proc): + with patch('os.wait4', side_effect=_wait4_interrupt): + with pytest.raises(KeyboardInterrupt): + main(['--no-mixin', '--', 'sleep', '100']) + + mock_proc.kill.assert_called() + mock_proc.wait.assert_called() + + +def test_cli_pipe_fds_closed_after_run(): + """stdout and stderr pipe FDs are closed after a normal run.""" + mock_proc = _make_mock_popen( + returncode=0, + stdout_lines=[b'out\n'], + stderr_lines=[b'err\n'], + ) + + buf = io.StringIO() + with patch('subprocess.Popen', return_value=mock_proc): + with _patch_wait4_success(mock_proc): + with patch('sys.stdout', buf): + with patch('sys.__stdout__', io.StringIO()): + with patch('sys.__stderr__', io.StringIO()): + with pytest.raises(SystemExit): + main(['--stdout', '--stderr', '--', 'cmd']) + + assert mock_proc.stdout.closed + assert mock_proc.stderr.closed + + def test_cli_version(capsys): """--version prints the package version and exits 0.""" with pytest.raises(SystemExit) as exc: @@ -1729,3 +1914,118 @@ def test_cli_show_mixins_includes_nvidia_flags(): output = buf.getvalue() assert '--nvidia-attributes' in output assert '--nvidia-gpus' in output + + +# --------------------------------------------------------------------------- +# MBResourceUsage — CLI tests +# --------------------------------------------------------------------------- + +_RUSAGE_FIELDS = frozenset( + { + 'utime', + 'stime', + 'maxrss', + 'minflt', + 'majflt', + 'inblock', + 'oublock', + 'nvcsw', + 'nivcsw', + } +) + + +def test_cli_resource_usage_in_defaults(): + """resource-usage is included in the default mixin set.""" + _, record, _ = _run_main(['--', 'true']) + assert 'resource_usage' in record + + +def test_cli_resource_usage_fields_present(): + """resource-usage records all expected fields.""" + _, record, _ = _run_main(['--mixin', 'resource-usage', '--', 'true']) + ru_list = record.get('resource_usage', []) + assert isinstance(ru_list, list) + assert len(ru_list) == 1 + assert set(ru_list[0].keys()) == _RUSAGE_FIELDS + + +def test_cli_resource_usage_values_are_numeric(): + """resource-usage field values are all numbers (int or float).""" + _, record, _ = _run_main(['--mixin', 'resource-usage', '--', 'true']) + ru_list = record.get('resource_usage', []) + assert len(ru_list) == 1 + for field, value in ru_list[0].items(): + assert isinstance(value, int | float), ( + f'{field}: expected number, got {type(value)}' + ) + + +def test_cli_resource_usage_no_mixins_omits_field(): + """--no-mixin produces a record without resource_usage.""" + _, record, _ = _run_main(['--no-mixin', '--', 'true']) + assert 'resource_usage' not in record + + +def test_cli_show_mixins_includes_resource_usage(): + """--show-mixins lists resource-usage with a default marker.""" + buf = io.StringIO() + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main(['--show-mixins']) + output = buf.getvalue() + assert 'resource-usage' in output + # resource-usage is a default mixin — should be starred + lines = [line for line in output.splitlines() if 'resource-usage' in line] + assert lines, 'resource-usage not found in --show-mixins output' + assert lines[0].startswith(' *'), 'resource-usage should be marked as default (*)' + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_cli_resource_usage_real_subprocess_maxrss(): + """resource-usage records a positive maxrss from a real subprocess.""" + buf = io.StringIO() + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main(['--mixin', 'resource-usage', '--', sys.executable, '-c', 'pass']) + record = json.loads(buf.getvalue()) + ru_list = record.get('resource_usage', []) + assert isinstance(ru_list, list) + assert len(ru_list) == 1 + ru = ru_list[0] + assert set(ru.keys()) == _RUSAGE_FIELDS + assert ru['maxrss'] > 0, ( + 'maxrss should be positive after running a Python subprocess' + ) + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_cli_resource_usage_real_subprocess_cpu_nonnegative(): + """utime and stime are non-negative after a real subprocess.""" + buf = io.StringIO() + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main(['--mixin', 'resource-usage', '--', sys.executable, '-c', 'pass']) + record = json.loads(buf.getvalue()) + ru = record['resource_usage'][0] + assert ru['utime'] >= 0.0 + assert ru['stime'] >= 0.0 + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_cli_resource_usage_real_subprocess_counts_nonnegative(): + """All integer count fields are non-negative after a real subprocess.""" + buf = io.StringIO() + with patch('sys.stdout', buf): + with pytest.raises(SystemExit): + main(['--mixin', 'resource-usage', '--', sys.executable, '-c', 'pass']) + record = json.loads(buf.getvalue()) + ru = record['resource_usage'][0] + for field in ('minflt', 'majflt', 'inblock', 'oublock', 'nvcsw', 'nivcsw'): + assert ru[field] >= 0, f'{field} should be non-negative' diff --git a/tests/test_mixins.py b/tests/test_mixins.py index d50d8be..5a72c6f 100644 --- a/tests/test_mixins.py +++ b/tests/test_mixins.py @@ -13,6 +13,7 @@ MBLineProfiler, MBLoadedModules, MBPeakMemory, + MBResourceUsage, MBSlurmInfo, MBWorkingDir, MicroBench, @@ -836,3 +837,175 @@ def noop(): noop() assert bench.get_results()[0].get('cgroups', {}) == {} + + +# --------------------------------------------------------------------------- +# MBResourceUsage +# --------------------------------------------------------------------------- + +_RUSAGE_FIELDS_PYTHON_API = { + 'utime', + 'stime', + 'minflt', + 'majflt', + 'inblock', + 'oublock', + 'nvcsw', + 'nivcsw', +} + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_resource_usage_python_api_fields(): + """MBResourceUsage records all expected fields in Python API mode.""" + + class Bench(MicroBench, MBResourceUsage): + pass + + bench = Bench() + + @bench + def work(): + return sum(range(50000)) + + work() + ru_list = bench.get_results()[0].get('resource_usage', []) + assert isinstance(ru_list, list) + assert len(ru_list) == 1 # 1 timed iteration + assert set(ru_list[0].keys()) == _RUSAGE_FIELDS_PYTHON_API + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_resource_usage_python_api_one_entry_per_iteration(): + """resource_usage has one entry per timed iteration in Python API mode.""" + + class Bench(MicroBench, MBResourceUsage): + pass + + bench = Bench(iterations=3) + + @bench + def work(): + return sum(range(1000)) + + work() + ru_list = bench.get_results()[0].get('resource_usage', []) + assert len(ru_list) == 3 + for entry in ru_list: + assert set(entry.keys()) == _RUSAGE_FIELDS_PYTHON_API + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_resource_usage_python_api_warmup_excluded(): + """Warmup iterations are not counted in resource_usage.""" + + class Bench(MicroBench, MBResourceUsage): + pass + + bench = Bench(iterations=2, warmup=3) + + @bench + def work(): + return sum(range(1000)) + + work() + ru_list = bench.get_results()[0].get('resource_usage', []) + assert len(ru_list) == 2 # only timed iterations + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_resource_usage_python_api_maxrss_absent(): + """maxrss is absent in Python API mode (RUSAGE_SELF.maxrss is a lifetime HWM).""" + + class Bench(MicroBench, MBResourceUsage): + pass + + bench = Bench() + + @bench + def work(): + return list(range(10000)) + + work() + ru_list = bench.get_results()[0].get('resource_usage', []) + assert isinstance(ru_list, list) + assert len(ru_list) == 1 + assert 'maxrss' not in ru_list[0], ( + 'maxrss must not appear in Python API mode records because ' + 'RUSAGE_SELF.maxrss is a lifetime process high-water mark and ' + 'cannot isolate a single function call' + ) + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_resource_usage_python_api_cpu_nonnegative(): + """utime and stime are non-negative floats.""" + + class Bench(MicroBench, MBResourceUsage): + pass + + bench = Bench() + + @bench + def work(): + return sum(range(10000)) + + work() + ru = bench.get_results()[0]['resource_usage'][0] + assert ru['utime'] >= 0.0 + assert ru['stime'] >= 0.0 + + +@pytest.mark.skipif( + sys.platform == 'win32', reason='resource module not available on Windows' +) +def test_resource_usage_python_api_counts_nonnegative(): + """All integer count fields are non-negative.""" + + class Bench(MicroBench, MBResourceUsage): + pass + + bench = Bench() + + @bench + def noop(): + pass + + noop() + ru = bench.get_results()[0]['resource_usage'][0] + for field in ('minflt', 'majflt', 'inblock', 'oublock', 'nvcsw', 'nivcsw'): + assert ru[field] >= 0, f'{field} should be non-negative' + + +def test_resource_usage_windows_fallback(): + """Records empty list when the resource module is unavailable (Windows guard).""" + + class Bench(MicroBench, MBResourceUsage): + pass + + bench = Bench() + + @bench + def noop(): + pass + + import microbench.mixins.system as _sys_mod + + original = _sys_mod._resource + try: + _sys_mod._resource = None + noop() + finally: + _sys_mod._resource = original + + assert bench.get_results()[0].get('resource_usage') == []