RVXV -- RISC-V AI Extension Verification Generator

RVXV generates verification infrastructure for custom RISC-V AI vector instructions from YAML specifications. Given one YAML file describing an instruction's encoding, operands, and semantics, it produces Spike C++ extensions, self-checking RISC-V assembly tests, SVA assertions, and documentation.

What Problem This Solves

RISC-V AI chip companies (SiFive, Tenstorrent, Esperanto, etc.) each add custom vector instructions for operations like INT8 dot products, BF16 FMA, and quantized matrix multiply. Verifying each instruction requires:

A Spike golden reference model (C++ extension)
Directed and random assembly tests with expected values
Formal properties (SVA) for decode and functional correctness
RVFI trace comparison infrastructure

Most of this is repetitive infrastructure -- the same decode patterns, the same vector loop structure, the same mask handling -- just with different opcodes and element types. RVXV automates the boilerplate.

What Works Today (v0.1.0)

Tested and verified with real tools:

Spike C++ extensions compile against real Spike headers (extern/spike) using g++ -fsyntax-only. The generated code uses the correct insn_desc_t (10 fields), extension_t method signatures, and softfloat float32_t conversions.
Assembly tests assemble with riscv64-linux-gnu-gcc -march=rv64gcv. Uses .word encoding (not .insn r) because GCC does not support .insn r with vector register operands and large funct7 values.
RVV encoding is correct: format: vector with funct6 properly separates the vm bit (bit 25) from the instruction encoding. MASK does not include bit 25, so both masked and unmasked variants share the same MATCH/MASK pair.
End-to-end Spike execution verified for all 7 instruction types: each generated extension was built into Spike, test binaries compiled and executed, and all golden value comparisons passed.
Golden values are independently verified against manual Python computation (not computed by the same engine that generates them).
Numeric library is exhaustively validated against Berkeley SoftFloat (the IEEE 754 reference implementation used by Spike): 66,048 bit patterns tested across FP8 E4M3, FP8 E5M2, and BFloat16, plus 50,000 rounding mode comparisons. 100% match rate.

216 tests pass, 4 expected failures.

End-to-End Spike Execution Results

All 7 instruction types pass end-to-end in Spike:

Instruction Type	Operation	Element Types	E2E
INT8 dot product	`dot_product`	INT8 -> INT32	PASS
FP8 E4M3 dot product	`dot_product`	FP8 E4M3 -> FP32	PASS
INT4 packed MAC	`dot_product`	INT4 -> INT32	PASS
BF16 FMA	`fma`	BF16 -> FP32	PASS
BF16-to-FP32 convert	`convert`	BF16 -> FP32	PASS
Fused exponential	`fused_exp`	BF16 -> BF16	PASS
INT32 reduction sum	`reduction_sum`	INT32 -> INT32	PASS

What Has NOT Been Tested

Be honest about what we haven't proven:

Masked reduction tests are skipped. Masked reductions have different semantics from element-wise masking and need separate golden value logic. The reduction instruction itself works correctly (verified E2E).
SVA assertions are structural templates. They produce syntactically valid SystemVerilog but have not been tested in a real simulation environment (VCS, Questa, etc.).
RVFI checker is a template. The Python trace checker script would need adaptation for your specific trace format.
Only tested with GCC 15. Other cross-compiler versions may behave differently.

Quick Start

git clone --recurse-submodules https://github.com/jyrj/rvxv.git
cd rvxv
pip install -e .

# If you already cloned without --recurse-submodules:
# git submodule update --init --recursive

# Generate from a YAML spec
rvxv generate --spec examples/int8_dot_product.yaml --output ./build/

# Validate a spec without generating
rvxv validate --spec examples/int8_dot_product.yaml

# Use a preset
rvxv preset --list
rvxv preset --name common_ai_ops --output ./build/common/

# Export JSON Schema for editor autocompletion
rvxv schema --output ./instruction_spec.schema.json

Real Example: INT8 Dot Product

Here is the complete workflow from YAML spec to generated artifacts.

1. The YAML Spec

name: vdot4_i8i8_acc32
description: "4-way INT8 dot product with INT32 accumulation"
encoding:
  format: vector
  opcode: 0x5b    # custom-2
  funct3: 0x0
  funct6: 0x20    # bits[31:26], vm bit[25] is separate
operands:
  vs2: { type: vector, element: int8, groups: 4 }
  vs1: { type: vector, element: int8, groups: 4 }
  vd:  { type: vector, element: int32 }
semantics:
  operation: dot_product
  accumulate: true
  saturation: none
constraints:
  vl_dependent: true
  mask_agnostic: true
  tail_agnostic: true

2. Generated Files (13 total)

build/
├── spike/
│   ├── rvxv_encoding.h                        # MATCH/MASK defines
│   ├── insns/vdot4_i8i8_acc32.h               # Execute function body
│   ├── extension.cc                           # Extension registration
│   ├── disasm.cc                              # Disassembly support
│   └── Makefile.patch                         # Build integration
├── tests/
│   ├── directed/vdot4_i8i8_acc32_directed.S   # Corner case tests
│   └── random/vdot4_i8i8_acc32_random.S       # Random vector tests
├── assertions/
│   ├── vdot4_i8i8_acc32_assertions.sv         # SVA decode + result
│   ├── rvxv_coverage.sv                       # Functional coverage
│   ├── rvxv_bind.sv                           # Bind file
│   └── rvfi_checker.py                        # RVFI trace checker
└── docs/
    ├── vdot4_i8i8_acc32.md
    └── instruction_summary.md

3. What the Generated Code Looks Like

Encoding header (rvxv_encoding.h):

#define MATCH_VDOT4_I8I8_ACC32  0x8000005B
#define MASK_VDOT4_I8I8_ACC32   0xFC00707F

Execute body (insns/vdot4_i8i8_acc32.h) -- this is the Spike instruction implementation:

require_vector_vs;
require(P.VU.vsew == e8);
{
  const reg_t vl = P.VU.vl->read();
  const reg_t vstart_val = P.VU.vstart->read();
  const bool masked = insn.v_vm() == 0;
  const reg_t rd_num = insn.rd();
  const reg_t rs2_num = insn.rs2();
  const reg_t rs1_num = insn.rs1();

  const reg_t num_outputs = vl / 4;
  for (reg_t i = vstart_val; i < num_outputs; ++i) {
    if (masked && !(P.VU.elt<uint8_t>(0, i / 8) & (1 << (i % 8))))
      continue;
    int32_t acc = P.VU.elt<int32_t>(rd_num, i);
    for (int g = 0; g < 4; g++) {
      int8_t a = P.VU.elt<int8_t>(rs2_num, i * 4 + g);
      int8_t b = P.VU.elt<int8_t>(rs1_num, i * 4 + g);
      acc += (int32_t)a * (int32_t)b;
    }
    P.VU.elt<int32_t>(rd_num, i, true) = acc;
  }
  P.VU.vstart->write(0);
}

Assembly test (excerpt from directed test):

  # Test 1: All zeros -> expect 0
  vsetvli t0, t1, e8, m1, ta, ma
  # Load test vectors into v16, v24...
  .word 0x8388045b  # vdot4_i8i8_acc32 v8, v16, v24 (vm=1, unmasked)
  # Check: v8[0] == expected golden value
  li t3, 0x00000000
  bne t2, t3, fail

4. Validate the Output

python scripts/validate_output.py ./build/

This checks: directory structure exists, MATCH/MASK defines present, vstart handling in execute body, .word encoding used (not .insn r), golden values and bne comparisons present, and optionally assembles with riscv-gcc.

Integrating with Spike

# 1. Clone Spike (or use the submodule)
git submodule update --init extern/spike

# 2. Generate your extension
rvxv generate --spec examples/int8_dot_product.yaml --output ./build/

# 3. Copy into the Spike source tree
cp build/spike/rvxv_encoding.h   extern/spike/riscv/
cp build/spike/insns/*.h         extern/spike/riscv/insns/
cp build/spike/extension.cc      extern/spike/riscv/rvxv_extension.cc

# 4. Build Spike with the extension
cd extern/spike && mkdir build && cd build
../configure --prefix=$RISCV && make -j$(nproc)

# 5. Run tests
spike --extension=rvxv path/to/test.elf

Note: The INT8 dot product workflow has been verified end-to-end (generate, build into Spike, compile test, execute with correct results). Other operation types have not been tested through this full integration workflow yet. Requires riscv64-linux-gnu-gcc and a built Spike tree.

Numeric Library (Standalone)

The numeric library works independently from the rest of RVXV:

from rvxv.numeric import FP8E4M3, FP8E5M2, BFloat16, RoundingMode

# FP8 E4M3 -- no infinity, saturates to +/-448
fp8 = FP8E4M3()
bits = fp8.encode(1.5)        # -> 0x3E
value = fp8.decode(0x7E)      # -> 448.0 (max finite)
fp8.encode(float('inf'))      # -> 0x7E (saturates, no infinity)

# BFloat16 with configurable rounding
bf16_bits = BFloat16.from_fp32(3.14159, RoundingMode.RNE)
fp32_val = BFloat16.to_fp32(bf16_bits)

# ULP-aware comparison for verification
from rvxv.numeric import compare_with_tolerance, TOLERANCE_FMA
result = compare_with_tolerance(actual=1.0000001, expected=1.0, tolerance=TOLERANCE_FMA)
print(result.passed, result.ulp_dist)  # True, 1

Supported Types and Operations

Element Types

Format	Bits	Range	Notes
INT4 / UINT4	4	[-8,7] / [0,15]	Packed sub-byte
INT8 / UINT8	8	[-128,127] / [0,255]	Standard integer
INT16 / UINT16	16	Standard	Standard integer
INT32 / UINT32	32	Standard	Accumulator type
INT64	64	Standard	Wide accumulator
FP8 E4M3	8	+/-448	No infinity, 2 NaN encodings
FP8 E5M2	8	+/-57344	IEEE 754-like
FP16	16	IEEE 754	Standard half-precision
BFloat16	16	Same as FP32	5 rounding modes
FP32	32	IEEE 754	Via softfloat in Spike
FP64	64	IEEE 754	Double precision
MXFP8/6/4	4-8	Varies	MX block floating point

Semantic Operations

Operation	Description	Example
`dot_product`	N-way dot product with accumulation	INT8x4 -> INT32
`fma`	Fused multiply-add	BF16 * BF16 + FP32
`mac`	Multiply-accumulate	INT4 * INT4 + INT32
`multiply`	Element-wise multiply	Any type
`add`	Element-wise add	Any type
`fused_exp`	Fused exponential (for softmax)	BF16 -> BF16
`convert`	Type conversion	FP8 -> BF16
`compare`	Element-wise comparison	Any type
`outer_product`	Outer product (tiled matrix)	INT8 x INT8 -> INT32
`reduction_sum` / `reduction_max`	Vector reductions	Any type

Writing Your Own Spec

Encoding

RVXV supports two encoding formats:

encoding:
  format: vector      # Recommended for vector AI instructions
  opcode: 0x5b        # custom-2 (0x0b, 0x2b, 0x5b, 0x7b are custom spaces)
  funct3: 0x0
  funct6: 0x20        # bits[31:26]; vm bit[25] handled automatically

Use format: vector for vector instructions. This correctly separates the vm bit from funct6. Use format: R-type only for scalar custom instructions where funct7 encodes all 7 bits [31:25].

Operands

operands:
  vs2: { type: vector, element: int8, groups: 4 }   # 4 elements per output
  vs1: { type: vector, element: int8, groups: 4 }
  vd:  { type: vector, element: int32 }              # wider accumulator

groups: N means N source elements contribute to one destination element (e.g., 4-way dot product).

Semantics

semantics:
  operation: dot_product   # See table above
  accumulate: true         # Add to existing vd value
  saturation: none         # Or: signed, unsigned

Pre-Built Presets

Presets are example specs showing how to describe common AI instruction patterns. They are NOT official RISC-V extensions.

Preset	Instructions	Description
`common_ai_ops`	5	INT8 dot, BF16 FMA, FP8 dot, fused exp, INT4 MAC
`vme_zvbdot`	4	4-element dot product variants
`ime_option_e`	4	Outer product matrix tile operations
`ame_outer_product`	3	Tiled matrix multiply-accumulate

rvxv preset --list
rvxv preset --name common_ai_ops --output ./build/

Testing

# Run all tests (216 tests)
pytest tests/ -v

# What the tests cover:
# - Numeric library: bit-accurate FP8/BF16/INT4/MX encode/decode,
#   cross-validated against Berkeley SoftFloat (115 tests)
# - Generators: structural validation of generated artifacts (32 tests)
# - Core: spec parser, IR, type system, semantics engine (18 tests)
# - End-to-end: Spike E2E execution (3 tests), Spike artifacts (5 tests),
#   Verilator output validation (15 tests)
# - Golden values: independently computed expected values (13 tests)
# - Real toolchain: riscv-gcc assembly, g++ Spike compilation (13 tests)
# - CLI: all 4 commands work end-to-end (6 tests)

# Lint
ruff check src/ tests/ scripts/

# Validate generated output
python scripts/validate_output.py ./build/

Git Submodules

For real-world integration testing, RVXV includes git submodules:

git submodule update --init --recursive

# This provides:
# extern/riscv-tests/  -- real riscv_test.h and test_macros.h headers
# extern/spike/        -- real Spike headers for compilation testing

When submodules are present, tests automatically use real headers instead of minimal stubs.

Project Structure

src/rvxv/
├── core/           # Spec parser, instruction IR, type system, semantics engine
├── numeric/        # Bit-accurate AI numeric types (FP8, BF16, INT4, MX formats)
├── generators/
│   ├── spike/      # Spike C++ extension generator (Jinja2 templates)
│   ├── tests/      # Assembly test generator (directed + random)
│   ├── assertions/ # SVA properties, RVFI checker, coverage, bind
│   └── docs/       # Markdown documentation generator
├── presets/        # Pre-built YAML specs for common AI instructions
└── cli.py          # Click CLI (generate, validate, preset, schema)

extern/
├── riscv-tests/    # Git submodule: riscv-tests framework headers
└── spike/          # Git submodule: Spike ISA simulator headers

scripts/
└── validate_output.py  # One-command output validation

How to Extend

Adding a New Operation Type

Add the operation to SemanticOp enum in src/rvxv/core/type_system.py
Create a Jinja2 template in src/rvxv/generators/spike/templates/execute_body/
Register it in _TEMPLATE_MAP in src/rvxv/generators/spike/execute_gen.py
Add test generation support in src/rvxv/generators/tests/directed_gen.py
Add golden value computation in src/rvxv/core/semantics.py

Adding a New Element Type

Add the type to ElementType enum in src/rvxv/core/type_system.py
Add TypeInfo entry with width, signedness, and C type name
If it's a non-standard format, add encode/decode to src/rvxv/numeric/
Add conversion helpers to _CONVERSION_IMPLS in spike_gen.py
Add the type to _TO_F32 / _FROM_F32 in execute_gen.py

Adding a New Generator

Implement the Generator base class in src/rvxv/generators/base.py:

class MyGenerator(Generator):
    def generate(self, specs: list[InstructionSpec], output_dir: Path) -> list[Path]:
        # Generate files, return list of created paths
        ...

Comparison with Existing Tools

	riscv-dv	riscv-formal	RVXV
Custom AI instructions	No	No	Yes (YAML-driven)
FP8/BF16/INT4 numerics	No	No	Bit-accurate
Spike extension gen	No	No	Yes (real API)
Assembly test gen	Constrained random	No	Directed + random
SVA properties	No	Base ISA only	Custom instruction templates
Open source	Yes	Yes	Yes (Apache 2.0)

RVXV is complementary to these tools, not a replacement. riscv-dv is better for general ISA-level random testing. riscv-formal is better for proving base ISA correctness. RVXV fills the gap for custom AI instruction verification infrastructure.

License

Apache 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
docs		docs
examples		examples
scripts		scripts
src/rvxv		src/rvxv
tests		tests
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

RVXV -- RISC-V AI Extension Verification Generator

What Problem This Solves

What Works Today (v0.1.0)

End-to-End Spike Execution Results

What Has NOT Been Tested

Quick Start

Real Example: INT8 Dot Product

1. The YAML Spec

2. Generated Files (13 total)

3. What the Generated Code Looks Like

4. Validate the Output

Integrating with Spike

Numeric Library (Standalone)

Supported Types and Operations

Element Types

Semantic Operations

Writing Your Own Spec

Encoding

Operands

Semantics

Pre-Built Presets

Testing

Git Submodules

Project Structure

How to Extend

Adding a New Operation Type

Adding a New Element Type

Adding a New Generator

Comparison with Existing Tools

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages