Pure-Rust port of the LZ4 compression library (v1.10.0), providing the full LZ4 C API surface — block compression, high-compression mode, streaming frame format, and file I/O — with no C dependencies.
Compressed output is bit-for-bit identical to the reference C implementation across all compression modes and levels.
| Project | Description |
|---|---|
| lz4/lz4 | Original C implementation — the upstream reference this crate ports |
| inikep/lzbench | In-memory benchmark harness used for apples-to-apples throughput comparison |
| jafreck/AAMF | Automated Architecture Migration Framework — the AI-assisted toolchain that performed the primary migration work |
- Block API: one-shot `compress_default`, `compress_fast`, `decompress_safe`, and partial decompression
- High-Compression (HC): `compress_hc` with configurable compression levels 1–12
- Frame API: `LZ4F`-prefixed streaming compress/decompress with content checksums, dictionary support, and auto-flush
- File I/O: `Lz4ReadFile` / `Lz4WriteFile` wrappers for `std::io::{Read, Write}`
- C ABI shim: optional `c-abi` feature exports `LZ4_compress_default`, `LZ4_compress_fast`, `LZ4_decompress_safe`, and `LZ4_compress_HC` as a `staticlib` for drop-in use with C consumers (e.g. lzbench)
- Multi-threaded I/O: optional `multithread` feature mirrors the `LZ4IO_MULTITHREAD` path from the C programs
```toml
[dependencies]
lz4 = "1.10.0"
```

```rust
use lz4::{compress_default, decompress_safe};

let input = b"hello world hello world hello world";
let bound = lz4::compress_bound(input.len() as i32) as usize;
let mut compressed = vec![0u8; bound];
let compressed_size = compress_default(input, &mut compressed).unwrap();
compressed.truncate(compressed_size as usize);

let mut output = vec![0u8; input.len()];
let n = decompress_safe(&compressed, &mut output).unwrap();
assert_eq!(&output[..n as usize], &input[..]);
```

```rust
use lz4::compress_hc;

let input = b"highly compressible text content …";
let bound = lz4::compress_bound(input.len() as i32) as usize;
let mut compressed = vec![0u8; bound];
// Levels 1–12; 9 is a good balance of ratio vs speed
let n = compress_hc(input, &mut compressed, 9).unwrap();
```

```rust
use lz4::frame::{Lz4FCompressContext, Lz4FDecompressContext, Preferences};

let prefs = Preferences::default();
let mut ctx = Lz4FCompressContext::new()?;
// … write chunks via ctx.compress_update(…)
```

```sh
# Debug build
cargo build
# Optimised release build
cargo build --release
# With multi-threaded I/O support
cargo build --release --features multithread
# As a C-compatible static library (for lzbench integration)
RUSTFLAGS="-C panic=abort" cargo build --release --features c-abi
# → target/release/liblz4.a
```

```sh
cargo test
```

All 856 tests across 23 integration test suites and 2 doc-tests are expected to pass.

```sh
# Fuzz targets (requires cargo-fuzz + nightly)
cargo +nightly fuzz run block_roundtrip
cargo +nightly fuzz run frame_roundtrip
cargo +nightly fuzz run decompress_block_arbitrary
cargo +nightly fuzz run decompress_frame_arbitrary
```

```sh
cargo bench
```

Results are written to `target/criterion/`. HTML reports are available at `target/criterion/report/index.html`.
The definitive throughput comparison uses the lzbench harness to run both the C reference and this Rust port through the identical timing loop, eliminating harness artefacts.
`liblz4.a` is built with `--features c-abi` and linked into a patched lzbench binary (`lzbench-rust`) that replaces `lz4.o` / `lz4hc.o` with the Rust archive. The four C-ABI symbols (`LZ4_compress_default`, `LZ4_compress_fast`, `LZ4_decompress_safe`, `LZ4_compress_HC`) are exported as `#[no_mangle] pub unsafe extern "C"` shims that forward to the native Rust block and HC APIs.
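
The shim pattern described above can be sketched roughly as follows. This is an illustration only, not the crate's source: `demo_compress_default` and `compress_stub` are hypothetical names, and the stub merely copies bytes instead of performing LZ4 compression. The parameter list mirrors upstream's `LZ4_compress_default(const char*, char*, int, int)`, which returns 0 on failure.

```rust
use std::os::raw::{c_char, c_int};
use std::slice;

/// Hypothetical stand-in for a safe Rust compressor. It only copies bytes
/// (it is NOT LZ4) and returns the number of bytes written, or `None` if
/// `dst` is too small.
fn compress_stub(src: &[u8], dst: &mut [u8]) -> Option<usize> {
    let out = dst.get_mut(..src.len())?;
    out.copy_from_slice(src);
    Some(src.len())
}

/// C-callable shim with a hypothetical symbol name. It validates the raw
/// arguments, rebuilds slices, and forwards to the safe function.
/// `#[unsafe(no_mangle)]` is the current spelling; older toolchains use
/// `#[no_mangle]`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn demo_compress_default(
    src: *const c_char,
    dst: *mut c_char,
    src_size: c_int,
    dst_capacity: c_int,
) -> c_int {
    if src.is_null() || dst.is_null() || src_size < 0 || dst_capacity < 0 {
        return 0; // same "0 means failure" convention as the C API
    }
    // SAFETY: the caller promises that `src` is readable for `src_size`
    // bytes, `dst` is writable for `dst_capacity` bytes, and the two
    // regions do not overlap.
    let (input, output) = unsafe {
        (
            slice::from_raw_parts(src as *const u8, src_size as usize),
            slice::from_raw_parts_mut(dst as *mut u8, dst_capacity as usize),
        )
    };
    match compress_stub(input, output) {
        Some(written) => written as c_int,
        None => 0,
    }
}
```

Keeping the `unsafe` pointer handling in the shim and the real work in a safe function means the C-facing layer stays thin and easy to audit.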
Environment: lzbench 2.2.1 | Clang 17 | Apple M2 Max (12-core, 64 GB) | Silesia corpus
| File | C | Rust | Δ |
|---|---|---|---|
| webster | 623 | 588 | −6% |
| mozilla | 906 | 819 | −10% |
| mr | 882 | 823 | −7% |
| dickens | 554 | 515 | −7% |
| x-ray | 2718 | 2339 | −14% |
| ooffice | 838 | 751 | −10% |
| xml | 1131 | 1119 | −1% |
Averaged across all 120 data points (5 codecs × 12 files × 2 directions, compress and decompress), Rust is ~10% slower than C. Compress-only mean: −9.9%; decompress-only mean: −9.7%.
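
The Δ column in the table above is consistent with the relative difference (Rust − C) / C, rounded to the nearest percent. A minimal sketch under that assumption, using only the seven rows shown in the excerpt table (the helper names are hypothetical):

```rust
/// Relative difference of the Rust figure versus the C figure, in percent.
/// Negative values mean the Rust port is slower.
fn delta_percent(c: f64, rust: f64) -> f64 {
    (rust - c) / c * 100.0
}

/// (file, C, Rust) triples copied from the excerpt table.
const ROWS: [(&str, f64, f64); 7] = [
    ("webster", 623.0, 588.0),
    ("mozilla", 906.0, 819.0),
    ("mr", 882.0, 823.0),
    ("dickens", 554.0, 515.0),
    ("x-ray", 2718.0, 2339.0),
    ("ooffice", 838.0, 751.0),
    ("xml", 1131.0, 1119.0),
];

/// Recomputes the Δ column, rounded to whole percent.
fn rounded_deltas() -> Vec<(&'static str, i32)> {
    ROWS.iter()
        .map(|&(file, c, rust)| (file, delta_percent(c, rust).round() as i32))
        .collect()
}
```

Recomputing the column this way reproduces the printed values for all seven rows. The 120-point averages quoted above come from the full result set, so they cannot be rederived from this excerpt alone.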
Full per-file tables for all five codec variants (`lz4`, `lz4hc -1/-4/-8/-12`), correctness verification (byte-for-byte size identity across all 12 Silesia files), and lzbench integration instructions are in docs/benchmark-results.md.
This crate was ported from LZ4 v1.10.0 (≈13,000 lines across 11 C source / header files) using AAMF (Automated Architecture Migration Framework), an AI-assisted toolchain powered by Claude Sonnet 4.6.
Migration followed a bottom-up, dependency-ordered approach across 21 tasks in 6 serial phases so that each module could be built and tested in isolation before any downstream consumer depended on it:
lz4.c (block) → xxhash (crate) → lz4hc.c → lz4frame.c → lz4file.c → lib.rs
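
A bottom-up, dependency-ordered plan like the one above can be derived mechanically: repeatedly pick the first module whose dependencies have all been migrated. The sketch below is an illustration only. The function name is hypothetical, and the dependency edges in `MODULES` are assumptions chosen to match the ordering shown here, not a record of the actual task graph.

```rust
/// Returns the modules in an order where each one appears after all of its
/// dependencies. Ties are broken by position in `modules`. Returns `None`
/// if there is a cycle or a dependency that is not listed.
fn migration_order<'a>(modules: &[(&'a str, &[&'a str])]) -> Option<Vec<&'a str>> {
    let mut done: Vec<&'a str> = Vec::new();
    while done.len() < modules.len() {
        // First not-yet-migrated module whose dependencies are all done.
        let next = modules
            .iter()
            .find(|m| !done.contains(&m.0) && m.1.iter().all(|d| done.contains(d)))?;
        done.push(next.0);
    }
    Some(done)
}

/// (module, dependencies). The edges are assumed for illustration.
const MODULES: &[(&str, &[&str])] = &[
    ("lz4.c", &[]),
    ("xxhash", &[]),
    ("lz4hc.c", &["lz4.c"]),
    ("lz4frame.c", &["lz4.c", "lz4hc.c", "xxhash"]),
    ("lz4file.c", &["lz4frame.c"]),
    ("lib.rs", &["lz4.c", "lz4hc.c", "lz4frame.c", "lz4file.c"]),
];
```

With these assumed edges, `migration_order(MODULES)` yields the same sequence as the chain above, so every module can be built and tested before anything downstream depends on it.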
| C source | Rust target |
|---|---|
| `lz4.c` / `lz4.h` | `src/block/{types,compress,stream,decompress_core,decompress_api}.rs` |
| `lz4hc.c` / `lz4hc.h` | `src/hc/{types,encode,lz4mid,search,compress_hc,dispatch,api}.rs` |
| `lz4frame.c` / headers | `src/frame/{types,header,cdict,compress,decompress}.rs` |
| `lz4file.c` / `lz4file.h` | `src/file.rs` |
| `xxhash.c` / `xxhash.h` | `xxhash-rust` crate (intentional substitution) |
| All public headers | `src/lib.rs` |
- `xxhash` replaced by crate: `xxhash.c` (~1,000 lines of endianness / SIMD `#ifdef` chains) was substituted with `xxhash-rust = "0.8"`, verified to produce identical wire output.
- Large C files split into focused modules: `lz4.c` (2,829 lines) and `lz4hc.c` (2,192 lines) were decomposed into Rust modules of 150–400 lines each.
- `unsafe` confined to hot paths: pointer arithmetic in the decompressor core and C-ABI shims is kept in dedicated files; the rest of the API surface is safe.
- `malloc`/`free` → RAII: all state objects become `Box<T>` with `Drop` impls; no explicit memory management at call sites.
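
The `malloc`/`free` → RAII mapping can be pictured with a small sketch. `StreamState` is a hypothetical stand-in, not one of the crate's types, and the `released` counter exists only so the example can observe that cleanup ran.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for a compression state object.
struct StreamState {
    table: Vec<u32>,
    /// Incremented in `drop` so callers can see that cleanup happened.
    released: Rc<Cell<u32>>,
}

impl StreamState {
    /// Plays the role of a C `create` function: allocates on the heap and
    /// hands ownership to the caller.
    fn new(released: Rc<Cell<u32>>) -> Box<Self> {
        Box::new(StreamState { table: vec![0; 4096], released })
    }

    /// Number of entries in the (illustrative) hash table.
    fn table_len(&self) -> usize {
        self.table.len()
    }
}

impl Drop for StreamState {
    /// Plays the role of a C `free` function, but runs automatically when
    /// the owning `Box` goes out of scope.
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}
```

At a call site, a `Box<StreamState>` created inside a block is released exactly once when the block ends, with no explicit free call to forget or repeat.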
See docs/migration-summary.md and docs/decision-log.md for the complete record.
| Document | Description |
|---|---|
| docs/architecture-guide.md | Module structure, layer diagram, and design rationale |
| docs/api-reference.md | Public API surface — functions, types, and error codes |
| docs/developer-guide.md | Build, test, lint, and contribution workflow |
| docs/benchmark-results.md | Full lzbench Rust-vs-C throughput tables and methodology |
| docs/migration-summary.md | Source-to-target file map and pattern mapping reference |
| docs/decision-log.md | Architectural decisions with rationale and rejected alternatives |
| docs/known-issues.md | Known warnings, limitations, and open items |
GPL-2.0-only — same as the upstream LZ4 programs. The LZ4 library itself (which this crate ports) is BSD-2-Clause.