Fix matmul_4bit gemv path for mismatched quant_state layout #1906

Open
datavorous wants to merge 1 commit into bitsandbytes-foundation:main from datavorous:fix/matmul4bit-gemv-shape-guard

@datavorous

Fixes #1862

Problem: the gemv fast path in matmul_4bit assumes quant_state.shape follows (out_features, in_features), so it can silently produce an output with the wrong shape or values when the layout metadata is transposed, i.e. (in_features, out_features).

Fix: add a minimal shape guard to the vector fast path in matmul_4bit; on a mismatch, fall back to MatMul4Bit.apply.

Scope: no kernel changes and no API changes; the guard is two lines.
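
To make the idea of the guard concrete, here is a minimal, library-free sketch of the kind of check described above. It is not the diff in this PR: the helper names `gemv_layout_ok` and `choose_path` are hypothetical, and it assumes that `weight_shape` stands in for quant_state.shape and that the fast path expects (out_features, in_features).

```python
from typing import Sequence


def gemv_layout_ok(a_shape: Sequence[int], weight_shape: Sequence[int]) -> bool:
    """Return True if a single-vector fast path looks safe to take.

    Hypothetical helper for illustration only. Assumes weight_shape mirrors
    quant_state.shape and that the fast path expects the layout
    (out_features, in_features).
    """
    if not a_shape or len(weight_shape) != 2:
        return False

    # The input must be a single vector: every element lives in the last dim.
    numel = 1
    for dim in a_shape:
        numel *= dim
    if numel != a_shape[-1]:
        return False

    # The weight's in_features (second dim) must match the input's last dim.
    return weight_shape[1] == a_shape[-1]


def choose_path(a_shape: Sequence[int], weight_shape: Sequence[int]) -> str:
    """Pick "gemv" when the guard passes, otherwise the general "fallback"."""
    return "gemv" if gemv_layout_ok(a_shape, weight_shape) else "fallback"


if __name__ == "__main__":
    # Expected layout (out_features=128, in_features=64) with a 64-wide vector.
    print(choose_path((1, 1, 64), (128, 64)))
    # Transposed metadata (in_features=64, out_features=128): take the fallback.
    print(choose_path((1, 1, 64), (64, 128)))
```

In this sketch the "fallback" branch plays the role of MatMul4Bit.apply. Note that a square weight passes the check in either layout, so a shape comparison alone cannot detect a transposed square matrix.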

Development

Successfully merging this pull request may close these issues.

gemv_4bit silently produces wrong results when weight is quantized in (in_features, out_features) layout