diff --git a/CHANGELOG.md b/CHANGELOG.md
index b99f021..226dd04 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,12 +4,34 @@ All notable changes to microbench are documented here.
 
 ## [Unreleased]
 
+### New features
+
+- **`MBResourceUsage` mixin**: captures POSIX `getrusage()` data — user and
+  system CPU time, peak RSS (in bytes, normalised across platforms), minor and
+  major page faults, block I/O operations, and voluntary/involuntary context
+  switches. Results are stored as a **list** in `resource_usage`, one entry per
+  timed iteration, aligned index-for-index with `call.durations` and
+  `call.returncode`. Added as a **default CLI mixin** so every CLI run captures
+  it automatically.
+
+  - *CLI mode*: on POSIX, uses `os.wait4()` to capture the exact rusage of each
+    individual child process as reported by the kernel, including a reliable
+    `maxrss` per iteration regardless of `--iterations` or `--warmup` count.
+  - *Python API mode*: uses `RUSAGE_SELF` with a before/after delta around each
+    individual call — one entry per timed iteration, aligned with
+    `call.durations`. Warmup calls are excluded. `maxrss` is omitted (lifetime
+    process HWM, not per-call). Use `MBPeakMemory` for per-call peak memory.
+
+  On platforms where the stdlib `resource` module is unavailable, the mixin
+  records an empty list without raising an error.
+
 ### Enhancements
 
 - **`--mixin defaults` keyword** (CLI): `defaults` can be used as a mixin
   name to expand to the standard default set (`python-info`, `host-info`,
-  `slurm-info`, `loaded-modules`, `working-dir`). This makes it easy to add
-  one or more extra mixins without listing all five defaults explicitly:
+  `slurm-info`, `loaded-modules`, `working-dir`, `resource-usage`). This
+  makes it easy to add one or more extra mixins without listing all six
+  defaults explicitly:
   `microbench --mixin defaults file-hash -- ./job.sh`.
 
 - **`file-hash` mixin — automatic argument file scanning** (CLI): the
diff --git a/docs/cli.md b/docs/cli.md
index a6a05d2..a68132c 100644
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -117,11 +117,13 @@ Every record contains the standard `mb.*` and `call.*` fields plus:
 ## Default mixins
 
 When no `--mixin` is specified, `python-info`, `host-info`, `slurm-info`,
-`loaded-modules`, and `working-dir` are included automatically, capturing
-the Python interpreter version, prefix, and executable path; hostname and
-operating system; all `SLURM_*` environment variables; the loaded
-Lmod/Environment Modules software stack; and the current working directory.
-All five degrade gracefully or produce stable values outside their respective
+`loaded-modules`, `working-dir`, and `resource-usage` are included
+automatically, capturing the Python interpreter version, prefix, and
+executable path; hostname and operating system; all `SLURM_*` environment
+variables; the loaded Lmod/Environment Modules software stack; the current
+working directory; and POSIX resource usage (CPU time, peak RSS, page faults,
+I/O, and context switches) for the benchmarked subprocess.
+All six degrade gracefully or produce stable values outside their respective
 environments.
 
 Mixin names use a short kebab-case form without the `MB` prefix
@@ -236,6 +238,15 @@ microbench \
     -- ./run_simulation.sh
 ```
 
+### `resource-usage` options
+
+`resource-usage` has no CLI flags. It is included in the defaults and records
+POSIX `getrusage()` data automatically for the benchmarked subprocess.
+
+On platforms where the Python `resource` module is unavailable,
+`resource-usage` records an empty `resource_usage` list and does not raise an
+error.
+
 ## Capture failures
 
 Metadata capture failures (e.g. `nvidia-smi` not installed on this node,
@@ -289,6 +300,7 @@ With 10 iterations and 2 warmup runs, the record contains:
 - `call.durations` — list of 10 wall-clock durations in seconds
 - `call.returncode` — list of 10 exit codes (one per timed iteration)
 - `call.stdout` / `call.stderr` — list of 10 captured strings, if `--stdout`/`--stderr` is used
+- `resource_usage` — list of 10 per-iteration rusage dicts (when `resource-usage` mixin is active)
 
 Warmup runs are excluded from all three lists. The process exits with
 the first non-zero return code, if present, so any failing iteration
diff --git a/docs/user-guide/mixins.md b/docs/user-guide/mixins.md
index 2956843..3a04537 100644
--- a/docs/user-guide/mixins.md
+++ b/docs/user-guide/mixins.md
@@ -19,9 +19,9 @@ to list all available mixins with descriptions:
 microbench --show-mixins
 ```
 
-By default, `python-info`, `host-info`, `slurm-info`, `loaded-modules`, and
-`working-dir` are included automatically. Specifying `--mixin` replaces the
-defaults entirely. Use `--no-mixin` to disable all mixins:
+By default, `python-info`, `host-info`, `slurm-info`, `loaded-modules`,
+`working-dir`, and `resource-usage` are included automatically. Specifying
+`--mixin` replaces the defaults entirely. Use `--no-mixin` to disable all mixins:
 
 ```bash
 # Only peak-memory — no host info or SLURM
@@ -70,6 +70,7 @@ combine any number of microbench mixins without conflicts, and their
 | `MBLoadedModules` | `loaded-modules` *(default)* | `loaded_modules` dict mapping module name to version (empty dict if no Lmod/Environment Modules are loaded) | — |
 | `MBWorkingDir` | `working-dir` *(default)* | `call.working_dir` — absolute path of the working directory at benchmark time | — |
 | `MBCgroupLimits` | `cgroup-limits` | `cgroups` dict with `cpu_cores_limit`, `memory_bytes_limit`, `version` (empty dict if not on Linux or cgroup fs unavailable) | Linux only |
+| `MBResourceUsage` | `resource-usage` *(default)* | `resource_usage` list of dicts with CPU times, peak RSS, page faults, I/O ops, and context switches (`[]` when the stdlib `resource` module is unavailable) | POSIX only (stdlib) |
 | `MBGitInfo` | `git-info` | `git` dict with `repo`, `commit`, `branch`, `dirty` | `git` ≥ 2.11 on PATH |
 | `MBGlobalPackages` | Python only | `python.loaded_packages` for every package in the caller's global scope | — |
 | `MBInstalledPackages` | `installed-packages` | `python.installed_packages` (and optionally `python.installed_package_paths`) for every installed package | — |
@@ -338,6 +339,172 @@ platforms or when the cgroup filesystem is unavailable.
     scheduler metadata (job ID, node list, etc.) while `MBCgroupLimits` captures
     the kernel-enforced resource limits.
 
+### `MBResourceUsage`
+
+Captures POSIX [`getrusage(2)`](https://man7.org/linux/man-pages/man2/getrusage.2.html)
+data — CPU time, page faults, block I/O operations, and context switches —
+using only the Python standard library (`resource` module). No extra
+dependencies are required.
+
+**Modes**
+
+- **CLI mode**: on POSIX, uses `os.wait4()` to get the exact rusage of each
+  child process as reported by the kernel — one dict per timed iteration,
+  aligned index-for-index with `call.durations`.
+  `maxrss` is the child's own peak RSS.
+- **Python API mode**: uses `RUSAGE_SELF` — one dict per timed iteration,
+  each a before/after delta around that single call (aligned index-for-index
+  with `call.durations`). Warmup calls are excluded.
+  `maxrss` is **omitted** — `RUSAGE_SELF.maxrss` is a lifetime process
+  high-water mark that reflects the peak since the interpreter started,
+  not just since the decorated function was called, making it unreliable
+  for function-level measurement.
+
+On platforms where the stdlib `resource` module is unavailable, `resource_usage`
+is recorded as an empty list.
+
+```python
+from microbench import MicroBench, MBResourceUsage
+
+class Bench(MicroBench, MBResourceUsage):
+    pass
+
+bench = Bench()
+
+@bench
+def work():
+    return list(range(1_000_000))
+
+work()
+```
+
+Python API record (one entry per timed iteration, no `maxrss`):
+
+```json
+{
+  "resource_usage": [
+    {
+      "utime": 0.052,
+      "stime": 0.003,
+      "minflt": 1024,
+      "majflt": 0,
+      "inblock": 0,
+      "oublock": 0,
+      "nvcsw": 2,
+      "nivcsw": 1
+    }
+  ]
+}
+```
+
+CLI record with `--iterations 2` (one entry per iteration, includes `maxrss`):
+
+```json
+{
+  "resource_usage": [
+    {
+      "utime": 0.068,
+      "stime": 0.029,
+      "maxrss": 11386880,
+      "minflt": 621,
+      "majflt": 0,
+      "inblock": 0,
+      "oublock": 0,
+      "nvcsw": 1,
+      "nivcsw": 2
+    },
+    {
+      "utime": 0.071,
+      "stime": 0.031,
+      "maxrss": 11386880,
+      "minflt": 618,
+      "majflt": 0,
+      "inblock": 0,
+      "oublock": 0,
+      "nvcsw": 1,
+      "nivcsw": 3
+    }
+  ]
+}
+```
+
+| Field | Modes | Description |
+|---|---|---|
+| `utime` | Both | User CPU time in seconds (float) |
+| `stime` | Both | System CPU time in seconds (float) |
+| `maxrss` | CLI only | Peak RSS in bytes (int) — see platform notes |
+| `minflt` | Both | Minor page faults — pages reclaimed without I/O (int) |
+| `majflt` | Both | Major page faults — pages requiring disk I/O (int) |
+| `inblock` | Both | Block input operations (int) — see platform notes |
+| `oublock` | Both | Block output operations (int) — see platform notes |
+| `nvcsw` | Both | Voluntary context switches (int) |
+| `nivcsw` | Both | Involuntary context switches (int) |
+
+All fields are before/after **deltas** so they reflect only the benchmarked
+work. `utime`, `stime`, `minflt`, `nvcsw`, and `nivcsw` are the most
+reliable across platforms.
+
+#### Platform notes and known quirks
+
+**`maxrss` — CLI mode with `os.wait4()` (all POSIX)**
+
+`os.wait4()` returns the exact rusage of each individual child process as
+reported by the kernel. `maxrss` is the child's own peak RSS, accurate
+regardless of iteration count or warmup. Values are normalised to bytes
+(Linux reports kilobytes; macOS already reports bytes).
+
+**`maxrss` — Python API mode (`RUSAGE_SELF`)**
+
+`RUSAGE_SELF.maxrss` is a lifetime high-water mark for the Python interpreter
+process. It is intentionally omitted. Use
+[`MBPeakMemory`](#mbpeakmemory) if you need per-call peak memory tracking.
+
+**`inblock` / `oublock` — macOS**
+
+These counters are **almost always zero on macOS**, even for substantial file
+I/O.  The macOS unified buffer cache charges block I/O to the *first* process
+that touches each page; subsequent reads and writes to cached pages are not
+counted against the process that performed them. In practice, nearly all file
+I/O is served from the cache and the counters never increment.
+
+This is a macOS kernel accounting limitation. It is documented in the
+[`getrusage(2)` man page](https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/getrusage.2.html):
+*"The numbers ru_inblock and ru_oublock account only for real I/O; data
+supplied by the caching mechanism is charged only to the first process to
+read or write the data."*
+
+**`inblock` / `oublock` — Linux**
+
+On Linux these counters increment only for I/O that truly bypasses the page
+cache — cold-cache reads (first access to a file since it was last evicted)
+or writes with `O_DIRECT`. Warm-cache reads also show zero. Drop the page
+cache (`echo 3 > /proc/sys/vm/drop_caches` as root) before benchmarking if
+you need to measure true cold-cache I/O.
+
+**`majflt` — macOS**
+
+Major page faults are rare on macOS because the unified buffer cache handles
+most page-in activity. Zero is normal.
+
+**`utime`, `stime`, `minflt`, `nvcsw`, `nivcsw`**
+
+These are the most reliable fields across both Linux and macOS and are
+non-zero for any non-trivial workload.
+
+!!! note "Non-POSIX platforms"
+    When the Python `resource` module is unavailable, `resource_usage` is
+    recorded as an empty list without raising an error.
+
+**CLI:** `resource-usage` is a default mixin — no flags needed:
+
+```bash
+# Included automatically
+microbench --outfile results.jsonl -- ./run_simulation.sh
+
+# Explicit, if defaults have been overridden
+microbench --mixin resource-usage -- ./run_simulation.sh
+```
+
 ## Code provenance
 
 ### `MBGitInfo`
diff --git a/microbench/__init__.py b/microbench/__init__.py
index dc89281..a077fbd 100644
--- a/microbench/__init__.py
+++ b/microbench/__init__.py
@@ -43,6 +43,7 @@
     MBCgroupLimits,
     MBHostInfo,
     MBLoadedModules,
+    MBResourceUsage,
     MBSlurmInfo,
     MBWorkingDir,
 )
@@ -77,6 +78,7 @@
     'MBCgroupLimits',
     'MBGitInfo',
     'MBFileHash',
+    'MBResourceUsage',
     'MBGlobalPackages',
     'MBInstalledPackages',
     'MBCondaPackages',
diff --git a/microbench/cli/main.py b/microbench/cli/main.py
index ee5ea11..1e086ce 100644
--- a/microbench/cli/main.py
+++ b/microbench/cli/main.py
@@ -19,6 +19,7 @@
     'slurm-info',
     'loaded-modules',
     'working-dir',
+    'resource-usage',
 )
 
 
@@ -199,6 +200,8 @@ def capture_subprocess_reset(self, bm_data):
             self._subprocess_monitor = []
             self._subprocess_timed_out = False
             self._subprocess_timed_phase = True
+            # Reset resource-usage accumulator (populated by run() per iteration).
+            self._subprocess_resource_usage = []
 
         def capturepost_subprocess_result(self, bm_data):
             call = bm_data.setdefault('call', {})
@@ -275,6 +278,7 @@ def capturepost_subprocess_result(self, bm_data):
     bench._subprocess_stdout = []
     bench._subprocess_stderr = []
     bench._subprocess_monitor = []
+    bench._subprocess_resource_usage = []
     bench._subprocess_timed_out = False
     bench._subprocess_timed_phase = False  # becomes True after warmup
 
@@ -282,24 +286,18 @@ def capturepost_subprocess_result(self, bm_data):
     _real_stdout = sys.__stdout__
     _real_stderr = sys.__stderr__
 
+    # Lazy import: resource module is POSIX-only.
+    try:
+        import resource as _resource
+    except ImportError:
+        _resource = None
+
     def run():
         capture_stdout = args.stdout in _CAPTURE_CHOICES
         capture_stderr = args.stderr in _CAPTURE_CHOICES
         monitor_interval = args.monitor_interval
         timeout = args.timeout
 
-        if (
-            not capture_stdout
-            and not capture_stderr
-            and monitor_interval is None
-            and timeout is None
-        ):
-            result = subprocess.run(cmd)
-            bench._subprocess_returncodes.append(result.returncode)
-            return
-
-        # Use Popen so we have the PID (needed for monitoring) and can read
-        # stdout/stderr pipes in real time when capture is requested.
         stdout_chunks = []
         stderr_chunks = []
 
@@ -317,7 +315,8 @@ def _reader(pipe, chunks, real_stream, passthrough):
         if capture_stderr:
             popen_kwargs['stderr'] = subprocess.PIPE
 
-        with subprocess.Popen(cmd, **popen_kwargs) as proc:
+        proc = subprocess.Popen(cmd, **popen_kwargs)
+        try:
             threads = []
             if capture_stdout:
                 t = threading.Thread(
@@ -352,17 +351,65 @@ def _reader(pipe, chunks, real_stream, passthrough):
                 monitor_thread.start()
 
             timed_out = False
-            try:
-                proc.wait(timeout=timeout)
-            except subprocess.TimeoutExpired:
-                timed_out = True
-                proc.terminate()
+            child_rusage = None
+
+            if _resource is not None:
+                # os.wait4() reaps the child and returns its exact per-child rusage
+                # in a single syscall.  It must be called *instead* of proc.wait().
+                #
+                # Timeout handling: os.wait4() is a blocking call with no built-in
+                # deadline.  We handle it by running wait4 in a daemon thread and
+                # joining with a timeout.  If the join times out, we terminate/kill
+                # the child, then block on wait4 (the child is now dying so this
+                # completes quickly).
+                _wait4_result = [None]  # [(pid, status, rusage)]
+                _wait4_error = [None]
+
+                def _do_wait4():
+                    try:
+                        _wait4_result[0] = os.wait4(proc.pid, 0)
+                    except BaseException as exc:
+                        _wait4_error[0] = exc
+
+                _wait4_thread = threading.Thread(target=_do_wait4, daemon=True)
+                _wait4_thread.start()
+                _wait4_thread.join(timeout=timeout)
+
+                if _wait4_thread.is_alive():
+                    # Timed out — terminate/kill the child and wait for it to exit.
+                    timed_out = True
+                    proc.terminate()
+                    try:
+                        grace = args.timeout_grace_period or _SIGTERM_GRACE_PERIOD
+                        _wait4_thread.join(timeout=grace)
+                    except Exception:
+                        pass
+                    if _wait4_thread.is_alive():
+                        proc.kill()
+                        _wait4_thread.join()  # child is dead; this will return quickly
+
+                if _wait4_error[0] is not None:
+                    raise _wait4_error[0]
+
+                if _wait4_result[0] is None:  # pragma: no cover - defensive
+                    raise RuntimeError('os.wait4() returned no child status')
+
+                _, wait_status, raw_ru = _wait4_result[0]
+                proc.returncode = os.waitstatus_to_exitcode(wait_status)
+                child_rusage = raw_ru
+            else:
+                # No resource module available: use proc.wait() with optional timeout.
                 try:
-                    grace = args.timeout_grace_period or _SIGTERM_GRACE_PERIOD
-                    proc.wait(timeout=grace)
+                    proc.wait(timeout=timeout)
                 except subprocess.TimeoutExpired:
-                    proc.kill()
-                    proc.wait()
+                    timed_out = True
+                    proc.terminate()
+                    try:
+                        grace = args.timeout_grace_period or _SIGTERM_GRACE_PERIOD
+                        proc.wait(timeout=grace)
+                    except subprocess.TimeoutExpired:
+                        proc.kill()
+                        proc.wait()
 
             if monitor_thread is not None:
                 monitor_thread.stop()
@@ -372,13 +419,36 @@ def _reader(pipe, chunks, real_stream, passthrough):
             for t in threads:
                 t.join()
 
-        if timed_out and bench._subprocess_timed_phase:
-            bench._subprocess_timed_out = True
-        bench._subprocess_returncodes.append(proc.returncode)
-        if capture_stdout:
-            bench._subprocess_stdout.append(''.join(stdout_chunks))
-        if capture_stderr:
-            bench._subprocess_stderr.append(''.join(stderr_chunks))
+            if timed_out and bench._subprocess_timed_phase:
+                bench._subprocess_timed_out = True
+            bench._subprocess_returncodes.append(proc.returncode)
+            if capture_stdout:
+                bench._subprocess_stdout.append(''.join(stdout_chunks))
+            if capture_stderr:
+                bench._subprocess_stderr.append(''.join(stderr_chunks))
+
+            # Accumulate per-iteration resource usage (only during timed phase).
+            if bench._subprocess_timed_phase and child_rusage is not None:
+                from microbench.mixins.system import _rusage_from_wait4
+
+                bench._subprocess_resource_usage.append(
+                    _rusage_from_wait4(child_rusage)
+                )
+        except BaseException:
+            # Ensure the child is not left running as an orphan on any unexpected
+            # exception (including KeyboardInterrupt during a join).
+            # proc.kill() is safe after os.wait4() has already reaped the child
+            # (it sends SIGKILL to a dead PID which Popen silently ignores).
+            # proc.wait() is safe after os.wait4() too: returncode is already set.
+            proc.kill()
+            proc.wait()
+            raise
+        finally:
+            # Always close pipe FDs to prevent file-descriptor leaks.
+            if proc.stdout is not None:
+                proc.stdout.close()
+            if proc.stderr is not None:
+                proc.stderr.close()
 
     run.__name__ = os.path.basename(cmd[0])
     bench(run)()
diff --git a/microbench/cli/registry.py b/microbench/cli/registry.py
index 16b3149..0a70527 100644
--- a/microbench/cli/registry.py
+++ b/microbench/cli/registry.py
@@ -12,6 +12,7 @@
     MBCgroupLimits,
     MBHostInfo,
     MBLoadedModules,
+    MBResourceUsage,
     MBSlurmInfo,
     MBWorkingDir,
 )
@@ -28,6 +29,7 @@
     'cgroup-limits': MBCgroupLimits,
     'git-info': MBGitInfo,
     'file-hash': MBFileHash,
+    'resource-usage': MBResourceUsage,
     'installed-packages': MBInstalledPackages,
     'conda-packages': MBCondaPackages,
     'peak-memory': MBPeakMemory,
diff --git a/microbench/core/bench.py b/microbench/core/bench.py
index 7157d1b..752cc9c 100644
--- a/microbench/core/bench.py
+++ b/microbench/core/bench.py
@@ -248,8 +248,16 @@ def post_finish_triggers(self, bm_data):
 
     def pre_run_triggers(self, bm_data):
         bm_data['_run_start'] = self._duration_counter()
+        # Forward to mixin overrides via cooperative super() chaining.
+        parent = super()
+        if hasattr(parent, 'pre_run_triggers'):
+            parent.pre_run_triggers(bm_data)
 
     def post_run_triggers(self, bm_data):
+        # Forward to mixin overrides before recording the elapsed time.
+        parent = super()
+        if hasattr(parent, 'post_run_triggers'):
+            parent.post_run_triggers(bm_data)
         bm_data['call']['durations'].append(
             self._duration_counter() - bm_data['_run_start']
         )
diff --git a/microbench/mixins/system.py b/microbench/mixins/system.py
index 3e2e8e4..06ec298 100644
--- a/microbench/mixins/system.py
+++ b/microbench/mixins/system.py
@@ -1,6 +1,7 @@
 """System-info mixins.
 
-Classes: MBHostInfo, MBWorkingDir, MBSlurmInfo, MBCgroupLimits, MBLoadedModules.
+Classes: MBHostInfo, MBWorkingDir, MBSlurmInfo, MBCgroupLimits, MBLoadedModules,
+MBResourceUsage.
 """
 
 import os
@@ -12,6 +13,11 @@
 except ImportError:
     psutil = None
 
+try:
+    import resource as _resource
+except ImportError:
+    _resource = None
+
 
 class MBHostInfo:
     """Capture hostname, operating system, and (optionally) CPU and RAM info.
@@ -256,3 +262,216 @@ def capture_cgroup_limits(self, bm_data):
                 bm_data['cgroups'] = _read_cgroup_v1()
         except (OSError, ValueError, ZeroDivisionError):
             bm_data['cgroups'] = {}
+
+
+def _rusage_to_dict(ru, *, include_maxrss=True):
+    """Convert a ``struct_rusage`` to a plain dict with normalised fields.
+
+    ``maxrss`` is normalised to bytes: macOS already reports bytes; Linux
+    reports kilobytes and is multiplied by 1024.
+
+    Pass ``include_maxrss=False`` to omit ``maxrss`` (used in Python API mode
+    where ``RUSAGE_SELF.maxrss`` is a lifetime process high-water mark and
+    cannot meaningfully isolate a single function call).
+    """
+    d = {
+        'utime': ru.ru_utime,
+        'stime': ru.ru_stime,
+        'minflt': ru.ru_minflt,
+        'majflt': ru.ru_majflt,
+        'inblock': ru.ru_inblock,
+        'oublock': ru.ru_oublock,
+        'nvcsw': ru.ru_nvcsw,
+        'nivcsw': ru.ru_nivcsw,
+    }
+    if include_maxrss:
+        maxrss = ru.ru_maxrss
+        if sys.platform == 'linux':
+            maxrss *= 1024
+        d['maxrss'] = maxrss
+    return d
+
+
+def _rusage_from_wait4(raw_ru):
+    """Convert the raw rusage object returned by ``os.wait4()`` to a dict.
+
+    ``os.wait4()`` returns a ``resource.struct_rusage``-compatible object
+    whose ``ru_maxrss`` already reflects **only that child process** — no
+    cumulative HWM subtraction is needed.  ``maxrss`` is normalised to bytes
+    using the same platform rule as ``_rusage_to_dict``.
+    """
+    maxrss = raw_ru.ru_maxrss
+    if sys.platform == 'linux':
+        maxrss *= 1024
+    return {
+        'utime': raw_ru.ru_utime,
+        'stime': raw_ru.ru_stime,
+        'maxrss': maxrss,
+        'minflt': raw_ru.ru_minflt,
+        'majflt': raw_ru.ru_majflt,
+        'inblock': raw_ru.ru_inblock,
+        'oublock': raw_ru.ru_oublock,
+        'nvcsw': raw_ru.ru_nvcsw,
+        'nivcsw': raw_ru.ru_nivcsw,
+    }
+
+
+def _rusage_delta(before, after):
+    """Return ``after - before`` for all accumulator fields.
+
+    ``maxrss`` is only included when ``before`` has a ``maxrss`` key.
+    """
+    d = {
+        'utime': after['utime'] - before['utime'],
+        'stime': after['stime'] - before['stime'],
+        'minflt': after['minflt'] - before['minflt'],
+        'majflt': after['majflt'] - before['majflt'],
+        'inblock': after['inblock'] - before['inblock'],
+        'oublock': after['oublock'] - before['oublock'],
+        'nvcsw': after['nvcsw'] - before['nvcsw'],
+        'nivcsw': after['nivcsw'] - before['nivcsw'],
+    }
+    if 'maxrss' in before:
+        d['maxrss'] = after['maxrss'] - before['maxrss']
+    return d
+
+
+class MBResourceUsage:
+    """Capture POSIX ``getrusage()`` data for the benchmarked code.
+
+    Records CPU time, page faults, block I/O operations, and context switches.
+    Results are stored as a **list** of dicts, one entry per timed iteration
+    in both CLI and Python API mode, aligning index-for-index with
+    ``call.durations`` and ``call.returncode``.
+
+    **Modes**
+
+    - *CLI mode* (subprocess): on POSIX, uses ``os.wait4()`` to capture the
+      exact rusage of each child process as reported by the kernel — one entry
+      per timed iteration, aligned with ``call.durations`` and
+      ``call.returncode``.
+
+    - *Python API mode* (function): uses ``RUSAGE_SELF`` — one entry per
+      timed iteration (matching ``call.durations`` index-for-index), each a
+      before/after delta around that single call.  ``maxrss`` is **always
+      omitted** in this mode because ``RUSAGE_SELF.maxrss`` is a lifetime
+      process high-water mark; it reflects peak usage since program start,
+      not just the decorated function.  Use ``MBPeakMemory`` for per-call
+      peak RSS in Python API mode.
+
+    On non-POSIX platforms where the ``resource`` module is unavailable, this
+    mixin records an empty list without raising an error.
+
+    Output key: ``resource_usage`` (list of dicts)
+
+    Fields per entry (CLI mode with ``os.wait4()`` — the common POSIX case):
+
+    - ``utime``: user CPU time consumed by the child (seconds, float)
+    - ``stime``: system CPU time consumed by the child (seconds, float)
+    - ``maxrss``: peak RSS of the child in bytes (int) — see platform notes
+    - ``minflt``: minor page faults (int)
+    - ``majflt``: major page faults (int)
+    - ``inblock``: block input operations (int) — see platform notes
+    - ``oublock``: block output operations (int) — see platform notes
+    - ``nvcsw``: voluntary context switches (int)
+    - ``nivcsw``: involuntary context switches (int)
+
+    Fields per entry (Python API mode): all of the above **except** ``maxrss``.
+
+    **Platform notes**
+
+    *maxrss* (CLI mode with ``os.wait4()``):
+        Reported directly by the kernel as the child's own peak RSS.  Exact
+        and per-child regardless of iteration count or warmup.
+
+    *inblock / oublock* (macOS):
+        These counters are almost always zero on macOS regardless of actual
+        I/O performed.  The macOS unified buffer cache charges block I/O to
+        the *first* process that touches each page; subsequent reads and
+        writes to cached pages are not counted.  In practice nearly all file
+        I/O is served from the cache and the counters never increment.  This
+        is a kernel accounting limitation, not a microbench bug.  On Linux,
+        ``inblock`` and ``oublock`` increment only for I/O that bypasses the
+        page cache (i.e. cold-cache reads or ``O_DIRECT`` writes); warm-cache
+        reads also show zero.
+
+    *majflt* (macOS):
+        Major page faults are rare on macOS because the unified buffer cache
+        handles most page-in activity.  Zero is normal.
+
+    *utime / stime / minflt / nvcsw / nivcsw*:
+        These are the most reliable fields across both Linux and macOS and
+        should be non-zero for any non-trivial workload.
+
+    Example output (CLI mode, 2 iterations)::
+
+        {
+            "resource_usage": [
+                {
+                    "utime": 0.123456,
+                    "stime": 0.012345,
+                    "maxrss": 10485760,
+                    "minflt": 512,
+                    "majflt": 0,
+                    "inblock": 0,
+                    "oublock": 0,
+                    "nvcsw": 3,
+                    "nivcsw": 1
+                },
+                {
+                    "utime": 0.118000,
+                    "stime": 0.011000,
+                    "maxrss": 10485760,
+                    "minflt": 498,
+                    "majflt": 0,
+                    "inblock": 0,
+                    "oublock": 0,
+                    "nvcsw": 2,
+                    "nivcsw": 1
+                }
+            ]
+        }
+
+    Note:
+        CLI compatible.
+    """
+
+    def capture_resource_usage(self, bm_data):
+        """Initialise resource-usage accumulator before all iterations."""
+        if _resource is None:
+            return
+        if hasattr(self, '_subprocess_command'):
+            # CLI mode: accumulator populated by run() in main.py.
+            self._subprocess_resource_usage = []
+        else:
+            # Python API mode: per-iteration list populated by pre/post_run_triggers.
+            self._rusage_iter_entries = []
+
+    def pre_run_triggers(self, bm_data):
+        if _resource is not None and not hasattr(self, '_subprocess_command'):
+            self._rusage_iter_before = _rusage_to_dict(
+                _resource.getrusage(_resource.RUSAGE_SELF), include_maxrss=False
+            )
+
+    def post_run_triggers(self, bm_data):
+        if _resource is not None and not hasattr(self, '_subprocess_command'):
+            after = _rusage_to_dict(
+                _resource.getrusage(_resource.RUSAGE_SELF), include_maxrss=False
+            )
+            self._rusage_iter_entries.append(
+                _rusage_delta(self._rusage_iter_before, after)
+            )
+
+    def capturepost_resource_usage(self, bm_data):
+        """Write the resource_usage list to bm_data after all iterations."""
+        if _resource is None:
+            bm_data['resource_usage'] = []
+            return
+        if hasattr(self, '_subprocess_command'):
+            # CLI mode: list already populated by run() in main.py.
+            bm_data['resource_usage'] = list(
+                getattr(self, '_subprocess_resource_usage', [])
+            )
+        else:
+            # Python API mode: one entry per timed iteration.
+            bm_data['resource_usage'] = list(getattr(self, '_rusage_iter_entries', []))
diff --git a/tests/test_cli.py b/tests/test_cli.py
index 4a0f662..1d899d8 100644
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -2,6 +2,7 @@
 import json
 import os
 import subprocess
+import sys
 from unittest.mock import MagicMock, patch
 
 import pytest
@@ -10,30 +11,99 @@
 from microbench.__main__ import main
 
 
-def _make_mock_popen(returncode=0, stdout_lines=None, stderr_lines=None):
-    """Create a mock Popen process for --stdout/--stderr capture tests."""
+class _FakeRusage:
+    def __init__(
+        self,
+        *,
+        utime=0.0,
+        stime=0.0,
+        maxrss=1,
+        minflt=0,
+        majflt=0,
+        inblock=0,
+        oublock=0,
+        nvcsw=0,
+        nivcsw=0,
+    ):
+        self.ru_utime = utime
+        self.ru_stime = stime
+        self.ru_maxrss = maxrss
+        self.ru_minflt = minflt
+        self.ru_majflt = majflt
+        self.ru_inblock = inblock
+        self.ru_oublock = oublock
+        self.ru_nvcsw = nvcsw
+        self.ru_nivcsw = nivcsw
+
+
+class _MockPipe:
+    """Iterable pipe mock that supports .close(), as real subprocess pipes do."""
+
+    def __init__(self, lines):
+        self._iter = iter(lines)
+        self.closed = False
+
+    def __iter__(self):
+        return self._iter
+
+    def close(self):
+        self.closed = True
+
+
+def _make_mock_popen(returncode=0, stdout_lines=None, stderr_lines=None, pid=12345):
+    """Create a mock Popen process."""
     mock_proc = MagicMock()
-    mock_proc.__enter__.return_value = mock_proc
-    mock_proc.__exit__.return_value = False
     mock_proc.returncode = returncode
-    mock_proc.stdout = iter(stdout_lines) if stdout_lines is not None else None
-    mock_proc.stderr = iter(stderr_lines) if stderr_lines is not None else None
+    mock_proc.pid = pid
+    mock_proc.stdout = _MockPipe(stdout_lines) if stdout_lines is not None else None
+    mock_proc.stderr = _MockPipe(stderr_lines) if stderr_lines is not None else None
     return mock_proc
 
 
 def _run_main(argv, mock_returncode=0):
     """Run main() with a mocked subprocess and captured stdout."""
-    mock_result = MagicMock()
-    mock_result.returncode = mock_returncode
-    mock_result.stdout = None
-    mock_result.stderr = None
+    mock_proc = _make_mock_popen(returncode=mock_returncode)
 
     buf = io.StringIO()
-    with patch('subprocess.run', return_value=mock_result) as mock_run:
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit) as exc:
-                main(argv)
-    return exc.value.code, json.loads(buf.getvalue()), mock_run
+    wait_status = 0 if mock_returncode == 0 else mock_returncode << 8
+    fake_wait4 = (mock_proc.pid, wait_status, _FakeRusage())
+    with patch('subprocess.Popen', return_value=mock_proc) as mock_popen:
+        with patch('os.wait4', return_value=fake_wait4):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit) as exc:
+                    main(argv)
+    return exc.value.code, json.loads(buf.getvalue()), mock_popen
+
+
+def _patch_wait4_success(mock_proc, *, rusage=None, status=0):
+    if rusage is None:
+        rusage = _FakeRusage()
+    return patch('os.wait4', return_value=(mock_proc.pid, status, rusage))
+
+
+def _patch_wait4_sleep(mock_proc, delay, *, status=0, rusage=None):
+    if rusage is None:
+        rusage = _FakeRusage()
+
+    def _wait4(pid, options):
+        import time
+
+        time.sleep(delay)
+        return (mock_proc.pid, status, rusage)
+
+    return patch('os.wait4', side_effect=_wait4)
+
+
+def _patch_wait4_sequence(*results):
+    iterator = iter(results)
+
+    def _wait4(pid, options):
+        result = next(iterator)
+        if callable(result):
+            return result(pid, options)
+        return result
+
+    return patch('os.wait4', side_effect=_wait4)
 
 
 def test_cli_records_command_and_timing():
@@ -176,14 +246,12 @@ def test_cli_mixin_defaults_keyword_invalid_extra():
 def test_cli_outfile(tmp_path):
     """--outfile writes JSONL to the specified file."""
     outfile = tmp_path / 'results.jsonl'
-    mock_result = MagicMock()
-    mock_result.returncode = 0
-    mock_result.stdout = None
-    mock_result.stderr = None
+    mock_proc = _make_mock_popen(returncode=0)
 
-    with patch('subprocess.run', return_value=mock_result):
-        with pytest.raises(SystemExit):
-            main(['--outfile', str(outfile), '--', 'true'])
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with pytest.raises(SystemExit):
+                main(['--outfile', str(outfile), '--', 'true'])
 
     record = json.loads(outfile.read_text())
     assert record['call']['command'] == ['true']
@@ -235,12 +303,16 @@ def test_cli_all_flag_includes_all_mixins():
     all_names = set(_get_mixin_map())
 
     # Patch every mixin's capture methods to no-ops to avoid external calls
-    with patch('subprocess.run', return_value=MagicMock(returncode=0)):
-        buf = io.StringIO()
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit):
-                with patch('subprocess.check_output', side_effect=Exception('skip')):
-                    main(['--all', '--', 'true'])
+    mock_proc = _make_mock_popen(returncode=0)
+    buf = io.StringIO()
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit):
+                    with patch(
+                        'subprocess.check_output', side_effect=Exception('skip')
+                    ):
+                        main(['--all', '--', 'true'])
 
     # At minimum the record should be written (even with capture_optional errors)
     record = json.loads(buf.getvalue())
@@ -265,55 +337,60 @@ def test_cli_includes_mb_run_id_and_version():
 
 def test_cli_double_dash_separator():
     """-- separator is stripped before passing the command to subprocess."""
-    _, _, mock_run = _run_main(['--', 'echo', 'hello'])
+    _, _, mock_popen = _run_main(['--', 'echo', 'hello'])
 
-    mock_run.assert_called_once()
-    called_cmd = mock_run.call_args[0][0]
+    mock_popen.assert_called_once()
+    called_cmd = mock_popen.call_args[0][0]
     assert called_cmd == ['echo', 'hello']
 
 
 def test_cli_iterations():
     """--iterations N runs the command N times and produces N durations entries."""
-    _, record, mock_run = _run_main(['--iterations', '3', '--', 'true'])
+    _, record, mock_popen = _run_main(['--iterations', '3', '--', 'true'])
 
-    assert mock_run.call_count == 3
+    assert mock_popen.call_count == 3
     assert len(record['call']['durations']) == 3
     assert len(record['call']['returncode']) == 3
 
 
 def test_cli_warmup():
     """--warmup N runs the command N extra times before timing begins."""
-    _, record, mock_run = _run_main(['--warmup', '2', '--', 'true'])
+    _, record, mock_popen = _run_main(['--warmup', '2', '--', 'true'])
 
     # 2 warmup calls + 1 timed call
-    assert mock_run.call_count == 3
+    assert mock_popen.call_count == 3
     assert len(record['call']['durations']) == 1
     assert len(record['call']['returncode']) == 1
 
 
 def test_cli_iterations_and_warmup():
     """--iterations and --warmup together produce the right call count."""
-    _, record, mock_run = _run_main(
+    _, record, mock_popen = _run_main(
         ['--iterations', '4', '--warmup', '2', '--', 'true']
     )
 
-    assert mock_run.call_count == 6
+    assert mock_popen.call_count == 6
     assert len(record['call']['durations']) == 4
     assert len(record['call']['returncode']) == 4
 
 
 def test_cli_returncode_is_first_nonzero_across_iterations():
     """Process exits with the first non-zero returncode seen across timed iterations."""
-    mock_results = [
-        MagicMock(returncode=0, stdout=None, stderr=None),
-        MagicMock(returncode=2, stdout=None, stderr=None),
-        MagicMock(returncode=1, stdout=None, stderr=None),
+    mock_procs = [
+        _make_mock_popen(returncode=0),
+        _make_mock_popen(returncode=2),
+        _make_mock_popen(returncode=1),
     ]
     buf = io.StringIO()
-    with patch('subprocess.run', side_effect=mock_results):
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit) as exc:
-                main(['--iterations', '3', '--', 'true'])
+    with patch('subprocess.Popen', side_effect=mock_procs):
+        with _patch_wait4_sequence(
+            (mock_procs[0].pid, 0, _FakeRusage()),
+            (mock_procs[1].pid, 2 << 8, _FakeRusage()),
+            (mock_procs[2].pid, 1 << 8, _FakeRusage()),
+        ):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit) as exc:
+                    main(['--iterations', '3', '--', 'true'])
 
     assert exc.value.code == 2
     assert json.loads(buf.getvalue())['call']['returncode'] == [0, 2, 1]
@@ -321,16 +398,21 @@ def test_cli_returncode_is_first_nonzero_across_iterations():
 
 def test_cli_returncode_preserves_first_nonzero_even_if_later_is_larger():
     """A later larger returncode does not override the first failure."""
-    mock_results = [
-        MagicMock(returncode=0, stdout=None, stderr=None),
-        MagicMock(returncode=1, stdout=None, stderr=None),
-        MagicMock(returncode=2, stdout=None, stderr=None),
+    mock_procs = [
+        _make_mock_popen(returncode=0),
+        _make_mock_popen(returncode=1),
+        _make_mock_popen(returncode=2),
     ]
     buf = io.StringIO()
-    with patch('subprocess.run', side_effect=mock_results):
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit) as exc:
-                main(['--iterations', '3', '--', 'true'])
+    with patch('subprocess.Popen', side_effect=mock_procs):
+        with _patch_wait4_sequence(
+            (mock_procs[0].pid, 0, _FakeRusage()),
+            (mock_procs[1].pid, 1 << 8, _FakeRusage()),
+            (mock_procs[2].pid, 2 << 8, _FakeRusage()),
+        ):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit) as exc:
+                    main(['--iterations', '3', '--', 'true'])
 
     assert exc.value.code == 1
     assert json.loads(buf.getvalue())['call']['returncode'] == [0, 1, 2]
@@ -338,16 +420,21 @@ def test_cli_returncode_preserves_first_nonzero_even_if_later_is_larger():
 
 def test_cli_returncode_preserves_first_nonzero_signal_code():
     """Negative subprocess returncodes are also returned if they are first non-zero."""
-    mock_results = [
-        MagicMock(returncode=0, stdout=None, stderr=None),
-        MagicMock(returncode=-15, stdout=None, stderr=None),
-        MagicMock(returncode=1, stdout=None, stderr=None),
+    mock_procs = [
+        _make_mock_popen(returncode=0),
+        _make_mock_popen(returncode=-15),
+        _make_mock_popen(returncode=1),
     ]
     buf = io.StringIO()
-    with patch('subprocess.run', side_effect=mock_results):
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit) as exc:
-                main(['--iterations', '3', '--', 'true'])
+    with patch('subprocess.Popen', side_effect=mock_procs):
+        with _patch_wait4_sequence(
+            (mock_procs[0].pid, 0, _FakeRusage()),
+            (mock_procs[1].pid, 15, _FakeRusage()),
+            (mock_procs[2].pid, 1 << 8, _FakeRusage()),
+        ):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit) as exc:
+                    main(['--iterations', '3', '--', 'true'])
 
     assert exc.value.code == -15
     assert json.loads(buf.getvalue())['call']['returncode'] == [0, -15, 1]
@@ -363,12 +450,16 @@ def test_cli_multiple_mixins():
 
 def test_cli_all_overrides_mixin():
     """--all takes precedence over --mixin."""
-    with patch('subprocess.run', return_value=MagicMock(returncode=0)):
-        buf = io.StringIO()
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit):
-                with patch('subprocess.check_output', side_effect=Exception('skip')):
-                    main(['--mixin', 'MBHostInfo', '--all', '--', 'true'])
+    mock_proc = _make_mock_popen(returncode=0)
+    buf = io.StringIO()
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit):
+                    with patch(
+                        'subprocess.check_output', side_effect=Exception('skip')
+                    ):
+                        main(['--mixin', 'MBHostInfo', '--all', '--', 'true'])
 
     record = json.loads(buf.getvalue())
     # --all should activate every mixin, so slurm (from MBSlurmInfo) must be present
@@ -417,10 +508,11 @@ def test_cli_capture_stdout_records_output():
     buf = io.StringIO()
     terminal = io.StringIO()
     with patch('subprocess.Popen', return_value=mock_proc) as mock_popen:
-        with patch('sys.stdout', buf):
-            with patch('sys.__stdout__', terminal):
-                with pytest.raises(SystemExit):
-                    main(['--stdout', '--', 'echo', 'hello'])
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with patch('sys.__stdout__', terminal):
+                    with pytest.raises(SystemExit):
+                        main(['--stdout', '--', 'echo', 'hello'])
 
     record = json.loads(buf.getvalue())
     assert record['call']['stdout'] == ['hello\n']
@@ -436,10 +528,11 @@ def test_cli_capture_stdout_suppress():
     buf = io.StringIO()
     terminal = io.StringIO()
     with patch('subprocess.Popen', return_value=mock_proc):
-        with patch('sys.stdout', buf):
-            with patch('sys.__stdout__', terminal):
-                with pytest.raises(SystemExit):
-                    main(['--stdout=suppress', '--', 'echo', 'hello'])
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with patch('sys.__stdout__', terminal):
+                    with pytest.raises(SystemExit):
+                        main(['--stdout=suppress', '--', 'echo', 'hello'])
 
     record = json.loads(buf.getvalue())
     assert record['call']['stdout'] == ['hello\n']
@@ -453,10 +546,11 @@ def test_cli_capture_stderr_records_output():
     buf = io.StringIO()
     terminal_err = io.StringIO()
     with patch('subprocess.Popen', return_value=mock_proc):
-        with patch('sys.stdout', buf):
-            with patch('sys.__stderr__', terminal_err):
-                with pytest.raises(SystemExit):
-                    main(['--stderr', '--', 'cmd'])
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with patch('sys.__stderr__', terminal_err):
+                    with pytest.raises(SystemExit):
+                        main(['--stderr', '--', 'cmd'])
 
     record = json.loads(buf.getvalue())
     assert record['call']['stderr'] == ['warning\n']
@@ -471,10 +565,11 @@ def test_cli_capture_stderr_suppress():
     buf = io.StringIO()
     terminal_err = io.StringIO()
     with patch('subprocess.Popen', return_value=mock_proc):
-        with patch('sys.stdout', buf):
-            with patch('sys.__stderr__', terminal_err):
-                with pytest.raises(SystemExit):
-                    main(['--stderr=suppress', '--', 'cmd'])
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with patch('sys.__stderr__', terminal_err):
+                    with pytest.raises(SystemExit):
+                        main(['--stderr=suppress', '--', 'cmd'])
 
     record = json.loads(buf.getvalue())
     assert record['call']['stderr'] == ['warning\n']
@@ -492,12 +587,26 @@ def test_cli_capture_stdout_multiple_iterations():
 
     buf = io.StringIO()
     with patch('subprocess.Popen', side_effect=mock_procs):
-        with patch('sys.stdout', buf):
-            with patch('sys.__stdout__', io.StringIO()):
-                with pytest.raises(SystemExit):
-                    main(
-                        ['--stdout', '--warmup', '1', '--iterations', '3', '--', 'cmd']
-                    )
+        with _patch_wait4_sequence(
+            (mock_procs[0].pid, 0, _FakeRusage()),
+            (mock_procs[1].pid, 0, _FakeRusage()),
+            (mock_procs[2].pid, 0, _FakeRusage()),
+            (mock_procs[3].pid, 0, _FakeRusage()),
+        ):
+            with patch('sys.stdout', buf):
+                with patch('sys.__stdout__', io.StringIO()):
+                    with pytest.raises(SystemExit):
+                        main(
+                            [
+                                '--stdout',
+                                '--warmup',
+                                '1',
+                                '--iterations',
+                                '3',
+                                '--',
+                                'cmd',
+                            ]
+                        )
 
     record = json.loads(buf.getvalue())
     assert record['call']['stdout'] == ['run1\n', 'run2\n', 'run3\n']
@@ -512,11 +621,12 @@ def test_cli_capture_stdout_and_stderr():
 
     buf = io.StringIO()
     with patch('subprocess.Popen', return_value=mock_proc):
-        with patch('sys.stdout', buf):
-            with patch('sys.__stdout__', io.StringIO()):
-                with patch('sys.__stderr__', io.StringIO()):
-                    with pytest.raises(SystemExit):
-                        main(['--stdout', '--stderr', '--', 'cmd'])
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with patch('sys.__stdout__', io.StringIO()):
+                    with patch('sys.__stderr__', io.StringIO()):
+                        with pytest.raises(SystemExit):
+                            main(['--stdout', '--stderr', '--', 'cmd'])
 
     record = json.loads(buf.getvalue())
     assert record['call']['stdout'] == ['out\n']
@@ -614,9 +724,10 @@ def _run_main_with_monitor(argv, mock_pid=12345, mock_returncode=0, fake_samples
 
         buf = io.StringIO()
         with patch('subprocess.Popen', return_value=mock_proc):
-            with patch('sys.stdout', buf):
-                with pytest.raises(SystemExit):
-                    main(argv)
+            with _patch_wait4_success(mock_proc):
+                with patch('sys.stdout', buf):
+                    with pytest.raises(SystemExit):
+                        main(argv)
 
     return json.loads(buf.getvalue()), MockThread, mock_thread
 
@@ -695,19 +806,20 @@ def make_thread(pid, interval):
     buf = io.StringIO()
     with patch('microbench.cli.main._SubprocessMonitorThread', side_effect=make_thread):
         with patch('subprocess.Popen', return_value=mock_proc):
-            with patch('sys.stdout', buf):
-                with pytest.raises(SystemExit):
-                    main(
-                        [
-                            '--no-mixin',
-                            '--monitor-interval',
-                            '5',
-                            '--iterations',
-                            '3',
-                            '--',
-                            'cmd',
-                        ]
-                    )
+            with _patch_wait4_success(mock_proc):
+                with patch('sys.stdout', buf):
+                    with pytest.raises(SystemExit):
+                        main(
+                            [
+                                '--no-mixin',
+                                '--monitor-interval',
+                                '5',
+                                '--iterations',
+                                '3',
+                                '--',
+                                'cmd',
+                            ]
+                        )
 
     record = json.loads(buf.getvalue())
     assert len(record['call']['monitor']) == 3
@@ -732,21 +844,22 @@ def make_thread(pid, interval):
     buf = io.StringIO()
     with patch('microbench.cli.main._SubprocessMonitorThread', side_effect=make_thread):
         with patch('subprocess.Popen', return_value=mock_proc):
-            with patch('sys.stdout', buf):
-                with pytest.raises(SystemExit):
-                    main(
-                        [
-                            '--no-mixin',
-                            '--monitor-interval',
-                            '5',
-                            '--warmup',
-                            '2',
-                            '--iterations',
-                            '2',
-                            '--',
-                            'cmd',
-                        ]
-                    )
+            with _patch_wait4_success(mock_proc):
+                with patch('sys.stdout', buf):
+                    with pytest.raises(SystemExit):
+                        main(
+                            [
+                                '--no-mixin',
+                                '--monitor-interval',
+                                '5',
+                                '--warmup',
+                                '2',
+                                '--iterations',
+                                '2',
+                                '--',
+                                'cmd',
+                            ]
+                        )
 
     record = json.loads(buf.getvalue())
     assert len(record['call']['monitor']) == 2
@@ -778,7 +891,7 @@ def test_cli_monitor_interval_requires_psutil():
 def test_cli_monitor_interval_with_stdout_capture():
     """--monitor-interval and --stdout can be combined."""
     mock_proc = _make_mock_popen_for_monitor(pid=55)
-    mock_proc.stdout = iter([b'hello\n'])
+    mock_proc.stdout = _MockPipe([b'hello\n'])
     mock_proc.stderr = None
     fake_samples = [{'timestamp': 'T', 'cpu_percent': 8.0, 'rss_bytes': 2048}]
 
@@ -788,19 +901,20 @@ def test_cli_monitor_interval_with_stdout_capture():
         mock_thread.samples = fake_samples
         MockThread.return_value = mock_thread
         with patch('subprocess.Popen', return_value=mock_proc):
-            with patch('sys.stdout', buf):
-                with patch('sys.__stdout__', io.StringIO()):
-                    with pytest.raises(SystemExit):
-                        main(
-                            [
-                                '--no-mixin',
-                                '--monitor-interval',
-                                '5',
-                                '--stdout',
-                                '--',
-                                'cmd',
-                            ]
-                        )
+            with _patch_wait4_success(mock_proc):
+                with patch('sys.stdout', buf):
+                    with patch('sys.__stdout__', io.StringIO()):
+                        with pytest.raises(SystemExit):
+                            main(
+                                [
+                                    '--no-mixin',
+                                    '--monitor-interval',
+                                    '5',
+                                    '--stdout',
+                                    '--',
+                                    'cmd',
+                                ]
+                            )
 
     record = json.loads(buf.getvalue())
     assert record['call']['stdout'] == ['hello\n']
@@ -818,14 +932,16 @@ def _run_main_http(argv, fake_status=200):
     mock_response.__enter__ = lambda s: s
     mock_response.__exit__ = MagicMock(return_value=False)
 
+    mock_proc = _make_mock_popen(returncode=0)
     buf = io.StringIO()
-    with patch('subprocess.run', return_value=MagicMock(returncode=0)):
-        with patch(
-            'urllib.request.urlopen', return_value=mock_response
-        ) as mock_urlopen:
-            with patch('sys.stdout', buf):
-                with pytest.raises(SystemExit) as exc:
-                    main(argv)
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch(
+                'urllib.request.urlopen', return_value=mock_response
+            ) as mock_urlopen:
+                with patch('sys.stdout', buf):
+                    with pytest.raises(SystemExit) as exc:
+                        main(argv)
     return exc.value.code, mock_urlopen
 
 
@@ -848,11 +964,21 @@ def test_cli_http_output_no_stdout_record():
     mock_response = MagicMock()
     mock_response.__enter__ = lambda s: s
     mock_response.__exit__ = MagicMock(return_value=False)
-    with patch('subprocess.run', return_value=MagicMock(returncode=0)):
-        with patch('urllib.request.urlopen', return_value=mock_response):
-            with patch('sys.stdout', buf):
-                with pytest.raises(SystemExit):
-                    main(['--no-mixin', '--http-output', 'https://x.com', '--', 'true'])
+    mock_proc = _make_mock_popen(returncode=0)
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch('urllib.request.urlopen', return_value=mock_response):
+                with patch('sys.stdout', buf):
+                    with pytest.raises(SystemExit):
+                        main(
+                            [
+                                '--no-mixin',
+                                '--http-output',
+                                'https://x.com',
+                                '--',
+                                'true',
+                            ]
+                        )
     assert buf.getvalue() == ''
 
 
@@ -862,22 +988,24 @@ def test_cli_http_output_and_outfile(tmp_path):
     mock_response = MagicMock()
     mock_response.__enter__ = lambda s: s
     mock_response.__exit__ = MagicMock(return_value=False)
-    with patch('subprocess.run', return_value=MagicMock(returncode=0)):
-        with patch(
-            'urllib.request.urlopen', return_value=mock_response
-        ) as mock_urlopen:
-            with pytest.raises(SystemExit):
-                main(
-                    [
-                        '--no-mixin',
-                        '--outfile',
-                        str(outfile),
-                        '--http-output',
-                        'https://x.com',
-                        '--',
-                        'true',
-                    ]
-                )
+    mock_proc = _make_mock_popen(returncode=0)
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch(
+                'urllib.request.urlopen', return_value=mock_response
+            ) as mock_urlopen:
+                with pytest.raises(SystemExit):
+                    main(
+                        [
+                            '--no-mixin',
+                            '--outfile',
+                            str(outfile),
+                            '--http-output',
+                            'https://x.com',
+                            '--',
+                            'true',
+                        ]
+                    )
     assert outfile.exists()
     assert json.loads(outfile.read_text())['call']['name'] == 'true'
     mock_urlopen.assert_called_once()
@@ -996,12 +1124,14 @@ def _run_main_redis(argv):
     mock_client.rpush.side_effect = lambda key, val: redis_store.append(val)
     mock_redis = MagicMock()
     mock_redis.StrictRedis.return_value = mock_client
+    mock_proc = _make_mock_popen(returncode=0)
     buf = io.StringIO()
-    with patch('subprocess.run', return_value=MagicMock(returncode=0)):
-        with patch.dict('sys.modules', {'redis': mock_redis}):
-            with patch('sys.stdout', buf):
-                with pytest.raises(SystemExit) as exc:
-                    main(argv)
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch.dict('sys.modules', {'redis': mock_redis}):
+                with patch('sys.stdout', buf):
+                    with pytest.raises(SystemExit) as exc:
+                        main(argv)
     return exc.value.code, mock_redis, mock_client, redis_store
 
 
@@ -1031,13 +1161,21 @@ def test_cli_redis_output_no_stdout_record():
     mock_client = MagicMock()
     mock_redis = MagicMock()
     mock_redis.StrictRedis.return_value = mock_client
-    with patch('subprocess.run', return_value=MagicMock(returncode=0)):
-        with patch.dict('sys.modules', {'redis': mock_redis}):
-            with patch('sys.stdout', buf):
-                with pytest.raises(SystemExit):
-                    main(
-                        ['--no-mixin', '--redis-output', 'bench:results', '--', 'true']
-                    )
+    mock_proc = _make_mock_popen(returncode=0)
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch.dict('sys.modules', {'redis': mock_redis}):
+                with patch('sys.stdout', buf):
+                    with pytest.raises(SystemExit):
+                        main(
+                            [
+                                '--no-mixin',
+                                '--redis-output',
+                                'bench:results',
+                                '--',
+                                'true',
+                            ]
+                        )
     assert buf.getvalue() == ''
 
 
@@ -1047,20 +1185,22 @@ def test_cli_redis_output_and_outfile(tmp_path):
     mock_client = MagicMock()
     mock_redis = MagicMock()
     mock_redis.StrictRedis.return_value = mock_client
-    with patch('subprocess.run', return_value=MagicMock(returncode=0)):
-        with patch.dict('sys.modules', {'redis': mock_redis}):
-            with pytest.raises(SystemExit):
-                main(
-                    [
-                        '--no-mixin',
-                        '--outfile',
-                        str(outfile),
-                        '--redis-output',
-                        'bench:results',
-                        '--',
-                        'true',
-                    ]
-                )
+    mock_proc = _make_mock_popen(returncode=0)
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch.dict('sys.modules', {'redis': mock_redis}):
+                with pytest.raises(SystemExit):
+                    main(
+                        [
+                            '--no-mixin',
+                            '--outfile',
+                            str(outfile),
+                            '--redis-output',
+                            'bench:results',
+                            '--',
+                            'true',
+                        ]
+                    )
     assert outfile.exists()
     assert json.loads(outfile.read_text())['call']['name'] == 'true'
     mock_client.rpush.assert_called_once()
@@ -1414,9 +1554,10 @@ def test_cli_timeout_not_exceeded():
 
     buf = io.StringIO()
     with patch('subprocess.Popen', return_value=mock_proc):
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit) as exc:
-                main(['--no-mixin', '--timeout', '30', '--', 'sleep', '1'])
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit) as exc:
+                    main(['--no-mixin', '--timeout', '30', '--', 'sleep', '1'])
     assert exc.value.code == 0
     record = json.loads(buf.getvalue())
     assert 'timed_out' not in record['call']
@@ -1427,17 +1568,13 @@ def test_cli_timeout_sigterm_sufficient():
     """--timeout: process exits after SIGTERM; SIGKILL is not sent."""
     mock_proc = _make_mock_popen()
     mock_proc.returncode = -15  # killed by SIGTERM
-    mock_proc.wait.side_effect = [
-        subprocess.TimeoutExpired(cmd=['sleep', '100'], timeout=5),  # main timeout
-        None,  # exits cleanly after SIGTERM (within grace period)
-        None,  # called again by Popen.__exit__
-    ]
 
     buf = io.StringIO()
     with patch('subprocess.Popen', return_value=mock_proc):
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit):
-                main(['--no-mixin', '--timeout', '5', '--', 'sleep', '100'])
+        with _patch_wait4_sleep(mock_proc, 0.05, status=15):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit):
+                    main(['--no-mixin', '--timeout', '0.001', '--', 'sleep', '100'])
     record = json.loads(buf.getvalue())
     assert record['call']['timed_out'] is True
     mock_proc.terminate.assert_called_once()
@@ -1448,18 +1585,24 @@ def test_cli_timeout_sigkill_required():
     """--timeout: SIGKILL sent when process ignores SIGTERM past the grace period."""
     mock_proc = _make_mock_popen()
     mock_proc.returncode = -9  # killed by SIGKILL
-    mock_proc.wait.side_effect = [
-        subprocess.TimeoutExpired(cmd=['sleep', '100'], timeout=5),  # main timeout
-        subprocess.TimeoutExpired(cmd=['sleep', '100'], timeout=5),  # grace period
-        None,  # exits after SIGKILL
-        None,  # called again by Popen.__exit__
-    ]
 
     buf = io.StringIO()
     with patch('subprocess.Popen', return_value=mock_proc):
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit):
-                main(['--no-mixin', '--timeout', '5', '--', 'sleep', '100'])
+        with _patch_wait4_sleep(mock_proc, 0.05, status=9):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit):
+                    main(
+                        [
+                            '--no-mixin',
+                            '--timeout',
+                            '0.001',
+                            '--timeout-grace-period',
+                            '0.001',
+                            '--',
+                            'sleep',
+                            '100',
+                        ]
+                    )
     record = json.loads(buf.getvalue())
     assert record['call']['timed_out'] is True
     mock_proc.terminate.assert_called_once()
@@ -1470,31 +1613,36 @@ def test_cli_timeout_during_warmup_does_not_set_timed_out():
     """A timeout in warmup is discarded along with other warmup-only state."""
     warmup_proc = _make_mock_popen()
     warmup_proc.returncode = -15
-    warmup_proc.wait.side_effect = [
-        subprocess.TimeoutExpired(cmd=['sleep', '100'], timeout=5),
-        None,
-        None,
-    ]
 
     timed_proc = _make_mock_popen()
     timed_proc.returncode = 0
 
+    def warmup_wait4(pid, options):
+        import time
+
+        time.sleep(0.05)
+        return (warmup_proc.pid, 15, _FakeRusage())
+
     buf = io.StringIO()
     with patch('subprocess.Popen', side_effect=[warmup_proc, timed_proc]):
-        with patch('sys.stdout', buf):
-            with pytest.raises(SystemExit) as exc:
-                main(
-                    [
-                        '--no-mixin',
-                        '--warmup',
-                        '1',
-                        '--timeout',
-                        '5',
-                        '--',
-                        'sleep',
-                        '100',
-                    ]
-                )
+        with _patch_wait4_sequence(
+            warmup_wait4,
+            (timed_proc.pid, 0, _FakeRusage()),
+        ):
+            with patch('sys.stdout', buf):
+                with pytest.raises(SystemExit) as exc:
+                    main(
+                        [
+                            '--no-mixin',
+                            '--warmup',
+                            '1',
+                            '--timeout',
+                            '0.001',
+                            '--',
+                            'sleep',
+                            '100',
+                        ]
+                    )
 
     assert exc.value.code == 0
     record = json.loads(buf.getvalue())
@@ -1502,6 +1650,43 @@ def test_cli_timeout_during_warmup_does_not_set_timed_out():
     assert 'timed_out' not in record['call']
 
 
+def test_cli_keyboard_interrupt_kills_child():
+    """KeyboardInterrupt during wait4 join causes proc.kill() and re-raises."""
+    mock_proc = _make_mock_popen()
+
+    def _wait4_interrupt(pid, options):
+        raise KeyboardInterrupt
+
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with patch('os.wait4', side_effect=_wait4_interrupt):
+            with pytest.raises(KeyboardInterrupt):
+                main(['--no-mixin', '--', 'sleep', '100'])
+
+    mock_proc.kill.assert_called()
+    mock_proc.wait.assert_called()
+
+
+def test_cli_pipe_fds_closed_after_run():
+    """stdout and stderr pipe FDs are closed after a normal run."""
+    mock_proc = _make_mock_popen(
+        returncode=0,
+        stdout_lines=[b'out\n'],
+        stderr_lines=[b'err\n'],
+    )
+
+    buf = io.StringIO()
+    with patch('subprocess.Popen', return_value=mock_proc):
+        with _patch_wait4_success(mock_proc):
+            with patch('sys.stdout', buf):
+                with patch('sys.__stdout__', io.StringIO()):
+                    with patch('sys.__stderr__', io.StringIO()):
+                        with pytest.raises(SystemExit):
+                            main(['--stdout', '--stderr', '--', 'cmd'])
+
+    assert mock_proc.stdout.closed
+    assert mock_proc.stderr.closed
+
+
 def test_cli_version(capsys):
     """--version prints the package version and exits 0."""
     with pytest.raises(SystemExit) as exc:
@@ -1729,3 +1914,118 @@ def test_cli_show_mixins_includes_nvidia_flags():
     output = buf.getvalue()
     assert '--nvidia-attributes' in output
     assert '--nvidia-gpus' in output
+
+
+# ---------------------------------------------------------------------------
+# MBResourceUsage — CLI tests
+# ---------------------------------------------------------------------------
+
+_RUSAGE_FIELDS = frozenset(
+    {
+        'utime',
+        'stime',
+        'maxrss',
+        'minflt',
+        'majflt',
+        'inblock',
+        'oublock',
+        'nvcsw',
+        'nivcsw',
+    }
+)
+
+
+def test_cli_resource_usage_in_defaults():
+    """resource-usage is included in the default mixin set."""
+    _, record, _ = _run_main(['--', 'true'])
+    assert 'resource_usage' in record
+
+
+def test_cli_resource_usage_fields_present():
+    """resource-usage records all expected fields."""
+    _, record, _ = _run_main(['--mixin', 'resource-usage', '--', 'true'])
+    ru_list = record.get('resource_usage', [])
+    assert isinstance(ru_list, list)
+    assert len(ru_list) == 1
+    assert set(ru_list[0].keys()) == _RUSAGE_FIELDS
+
+
+def test_cli_resource_usage_values_are_numeric():
+    """resource-usage field values are all numbers (int or float)."""
+    _, record, _ = _run_main(['--mixin', 'resource-usage', '--', 'true'])
+    ru_list = record.get('resource_usage', [])
+    assert len(ru_list) == 1
+    for field, value in ru_list[0].items():
+        assert isinstance(value, int | float), (
+            f'{field}: expected number, got {type(value)}'
+        )
+
+
+def test_cli_resource_usage_no_mixins_omits_field():
+    """--no-mixin produces a record without resource_usage."""
+    _, record, _ = _run_main(['--no-mixin', '--', 'true'])
+    assert 'resource_usage' not in record
+
+
+def test_cli_show_mixins_includes_resource_usage():
+    """--show-mixins lists resource-usage with a default marker."""
+    buf = io.StringIO()
+    with patch('sys.stdout', buf):
+        with pytest.raises(SystemExit):
+            main(['--show-mixins'])
+    output = buf.getvalue()
+    assert 'resource-usage' in output
+    # resource-usage is a default mixin — should be starred
+    lines = [line for line in output.splitlines() if 'resource-usage' in line]
+    assert lines, 'resource-usage not found in --show-mixins output'
+    assert lines[0].startswith('  *'), 'resource-usage should be marked as default (*)'
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_cli_resource_usage_real_subprocess_maxrss():
+    """resource-usage records a positive maxrss from a real subprocess."""
+    buf = io.StringIO()
+    with patch('sys.stdout', buf):
+        with pytest.raises(SystemExit):
+            main(['--mixin', 'resource-usage', '--', sys.executable, '-c', 'pass'])
+    record = json.loads(buf.getvalue())
+    ru_list = record.get('resource_usage', [])
+    assert isinstance(ru_list, list)
+    assert len(ru_list) == 1
+    ru = ru_list[0]
+    assert set(ru.keys()) == _RUSAGE_FIELDS
+    assert ru['maxrss'] > 0, (
+        'maxrss should be positive after running a Python subprocess'
+    )
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_cli_resource_usage_real_subprocess_cpu_nonnegative():
+    """utime and stime are non-negative after a real subprocess."""
+    buf = io.StringIO()
+    with patch('sys.stdout', buf):
+        with pytest.raises(SystemExit):
+            main(['--mixin', 'resource-usage', '--', sys.executable, '-c', 'pass'])
+    record = json.loads(buf.getvalue())
+    ru = record['resource_usage'][0]
+    assert ru['utime'] >= 0.0
+    assert ru['stime'] >= 0.0
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_cli_resource_usage_real_subprocess_counts_nonnegative():
+    """All integer count fields are non-negative after a real subprocess."""
+    buf = io.StringIO()
+    with patch('sys.stdout', buf):
+        with pytest.raises(SystemExit):
+            main(['--mixin', 'resource-usage', '--', sys.executable, '-c', 'pass'])
+    record = json.loads(buf.getvalue())
+    ru = record['resource_usage'][0]
+    for field in ('minflt', 'majflt', 'inblock', 'oublock', 'nvcsw', 'nivcsw'):
+        assert ru[field] >= 0, f'{field} should be non-negative'
diff --git a/tests/test_mixins.py b/tests/test_mixins.py
index d50d8be..5a72c6f 100644
--- a/tests/test_mixins.py
+++ b/tests/test_mixins.py
@@ -13,6 +13,7 @@
     MBLineProfiler,
     MBLoadedModules,
     MBPeakMemory,
+    MBResourceUsage,
     MBSlurmInfo,
     MBWorkingDir,
     MicroBench,
@@ -836,3 +837,175 @@ def noop():
         noop()
 
     assert bench.get_results()[0].get('cgroups', {}) == {}
+
+
+# ---------------------------------------------------------------------------
+# MBResourceUsage
+# ---------------------------------------------------------------------------
+
+_RUSAGE_FIELDS_PYTHON_API = {
+    'utime',
+    'stime',
+    'minflt',
+    'majflt',
+    'inblock',
+    'oublock',
+    'nvcsw',
+    'nivcsw',
+}
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_resource_usage_python_api_fields():
+    """MBResourceUsage records all expected fields in Python API mode."""
+
+    class Bench(MicroBench, MBResourceUsage):
+        pass
+
+    bench = Bench()
+
+    @bench
+    def work():
+        return sum(range(50000))
+
+    work()
+    ru_list = bench.get_results()[0].get('resource_usage', [])
+    assert isinstance(ru_list, list)
+    assert len(ru_list) == 1  # 1 timed iteration
+    assert set(ru_list[0].keys()) == _RUSAGE_FIELDS_PYTHON_API
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_resource_usage_python_api_one_entry_per_iteration():
+    """resource_usage has one entry per timed iteration in Python API mode."""
+
+    class Bench(MicroBench, MBResourceUsage):
+        pass
+
+    bench = Bench(iterations=3)
+
+    @bench
+    def work():
+        return sum(range(1000))
+
+    work()
+    ru_list = bench.get_results()[0].get('resource_usage', [])
+    assert len(ru_list) == 3
+    for entry in ru_list:
+        assert set(entry.keys()) == _RUSAGE_FIELDS_PYTHON_API
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_resource_usage_python_api_warmup_excluded():
+    """Warmup iterations are not counted in resource_usage."""
+
+    class Bench(MicroBench, MBResourceUsage):
+        pass
+
+    bench = Bench(iterations=2, warmup=3)
+
+    @bench
+    def work():
+        return sum(range(1000))
+
+    work()
+    ru_list = bench.get_results()[0].get('resource_usage', [])
+    assert len(ru_list) == 2  # only timed iterations
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_resource_usage_python_api_maxrss_absent():
+    """maxrss is absent in Python API mode (RUSAGE_SELF.maxrss is a lifetime HWM)."""
+
+    class Bench(MicroBench, MBResourceUsage):
+        pass
+
+    bench = Bench()
+
+    @bench
+    def work():
+        return list(range(10000))
+
+    work()
+    ru_list = bench.get_results()[0].get('resource_usage', [])
+    assert isinstance(ru_list, list)
+    assert len(ru_list) == 1
+    assert 'maxrss' not in ru_list[0], (
+        'maxrss must not appear in Python API mode records because '
+        'RUSAGE_SELF.maxrss is a lifetime process high-water mark and '
+        'cannot isolate a single function call'
+    )
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_resource_usage_python_api_cpu_nonnegative():
+    """utime and stime are non-negative floats."""
+
+    class Bench(MicroBench, MBResourceUsage):
+        pass
+
+    bench = Bench()
+
+    @bench
+    def work():
+        return sum(range(10000))
+
+    work()
+    ru = bench.get_results()[0]['resource_usage'][0]
+    assert ru['utime'] >= 0.0
+    assert ru['stime'] >= 0.0
+
+
+@pytest.mark.skipif(
+    sys.platform == 'win32', reason='resource module not available on Windows'
+)
+def test_resource_usage_python_api_counts_nonnegative():
+    """All integer count fields are non-negative."""
+
+    class Bench(MicroBench, MBResourceUsage):
+        pass
+
+    bench = Bench()
+
+    @bench
+    def noop():
+        pass
+
+    noop()
+    ru = bench.get_results()[0]['resource_usage'][0]
+    for field in ('minflt', 'majflt', 'inblock', 'oublock', 'nvcsw', 'nivcsw'):
+        assert ru[field] >= 0, f'{field} should be non-negative'
+
+
+def test_resource_usage_windows_fallback():
+    """Records empty list when the resource module is unavailable (Windows guard)."""
+
+    class Bench(MicroBench, MBResourceUsage):
+        pass
+
+    bench = Bench()
+
+    @bench
+    def noop():
+        pass
+
+    import microbench.mixins.system as _sys_mod
+
+    original = _sys_mod._resource
+    try:
+        _sys_mod._resource = None
+        noop()
+    finally:
+        _sys_mod._resource = original
+
+    assert bench.get_results()[0].get('resource_usage') == []