ore,server-core,pgwire,balancerd,environmentd: migrate TLS from openssl to rustls#35864
Closed
jasonhernandez wants to merge 39 commits intoMaterializeInc:mainfrom
Add bin/lint-openssl to detect all openssl dependencies, feature flags, and source imports across the workspace. This is the first step toward migrating from openssl to rustls—it serves as a tracking tool for migration progress and can later be promoted to a CI gate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
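To illustrate the kind of check such a linter performs, here is a minimal sketch that scans the text of one `Cargo.toml` for openssl-related dependency keys. The crate list, the `Finding` type, and `scan_manifest` are hypothetical names chosen for this example; this is not the actual `bin/lint-openssl` implementation, and it does not cover feature flags, source imports, or `[dependencies.openssl]`-style tables.

```rust
/// Crate names treated as openssl-related in this sketch. This list is an
/// assumption for illustration, not the list used by bin/lint-openssl.
const OPENSSL_CRATES: &[&str] = &[
    "openssl",
    "openssl-sys",
    "openssl-probe",
    "native-tls",
    "hyper-tls",
    "hyper-openssl",
    "tokio-openssl",
    "postgres-openssl",
];

/// One flagged dependency line (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Finding {
    /// 1-based line number within the manifest text.
    pub line: usize,
    /// Name of the flagged crate.
    pub krate: String,
}

/// Reports `name = ...` entries inside any `*dependencies` table whose key
/// is one of the openssl-related crates above.
pub fn scan_manifest(manifest: &str) -> Vec<Finding> {
    let mut in_deps = false;
    let mut findings = Vec::new();
    for (idx, raw) in manifest.lines().enumerate() {
        let line = raw.trim();
        if line.starts_with('[') {
            // Section header: remember whether this is a dependencies table.
            in_deps = line.trim_end_matches(']').ends_with("dependencies");
            continue;
        }
        if !in_deps || line.starts_with('#') {
            continue;
        }
        if let Some((key, _value)) = line.split_once('=') {
            let key = key.trim();
            if OPENSSL_CRATES.iter().any(|c| *c == key) {
                findings.push(Finding { line: idx + 1, krate: key.to_string() });
            }
        }
    }
    findings
}
```

Run over every manifest in the workspace, the number of findings gives a simple progress counter for the migration.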
Add migration plan (doc/developer/openssl-to-rustls-migration.md) with tiered breakdown of all 28 affected crates, dependency graph, replacement crate mapping, and links to Linear issues (SEC-176 through SEC-200). Include raw linter output snapshots (.txt and .json) as baseline for tracking progress. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace all non-FIPS crypto crate recommendations (ring, sha2, hmac, pbkdf2, subtle, rsa, ed25519-dalek, aes+cbc) with aws-lc-rs equivalents. Add FIPS 140-3 strategy section, workspace fips feature flag (SEC-201), and updated replacement crate mapping table. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add [[bans.deny]] entries for crypto crates that are not FIPS 140-3 validated: sha2, hmac, subtle, ring, pbkdf2, ed25519-dalek, aes, cbc, rsa. All new crypto code must use aws-lc-rs instead. Existing workspace and third-party usage is allowed via wrappers, with TODO comments to remove them as each crate is migrated to aws-lc-rs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add bin/lint-fips-containers to scan Dockerfiles for FIPS 140-3 compliance gaps: non-FIPS base images, crypto-relevant package installations, and non-FIPS algorithms in cert generation scripts. Distinguishes production images (must comply) from test/dev (informational). Supports --strict and --json flags. Current results: 8 production findings across 4 base images. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers all three compliance layers: Rust binaries (137 openssl findings across 28 crates + sha2/hmac/subtle), container images (8 production findings across 4 base images), and Kubernetes/Helm deployment (Ed25519, image validation, external services, FIPS toggle). Includes full issue inventory (SEC-176 through SEC-213), remediation strategy, recommended execution order, and FIPS validation caveat. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the rustls ban from deny.toml, unblocking all openssl-to-rustls migration work. Add `aws-lc-rs` as an optional dependency in mz-ore with two feature flags: - `crypto`: enables aws-lc-rs in standard mode - `fips`: enables aws-lc-rs with FIPS 140-3 validated module mz-ore is the natural distribution channel since every crate in the workspace depends on it. Downstream crates enable `mz-ore/crypto` (or `mz-ore/fips` for FIPS builds) to get the validated backend. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The `fips` feature on mz-ore enables `aws-lc-rs/fips`, which pulls in `aws-lc-fips-sys`. That crate builds the FIPS module of AWS-LC (a BoringSSL-derived library) via cmake, requiring Go for integrity verification. Since cargo-test runs with `--all-features`, Go must be available in the CI builder. Fixes SEC-232. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
uuid 1.23.0 changed error message format which breaks the fmt_ids test in mz-persist-types. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pin three deps that were inadvertently bumped during Cargo.lock regeneration: - os_info 3.11.0: avoids objc2 0.6.x which causes E0275 on macOS - chrono-tz 0.8.1: avoids Egypt timezone data change that breaks test_pg_timezone_abbrevs - serde_path_to_error 0.1.8: avoids error message format change that breaks test_mcp_observatory Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch TLS backend from openssl/native-tls to rustls: - mz-cloud-resources: kube openssl-tls → rustls-tls - mz-npm: reqwest native-tls-vendored → rustls-tls-webpki-roots-no-provider - mz-testdrive: reqwest native-tls-vendored → rustls-tls-webpki-roots-no-provider Uses the -no-provider variant for reqwest to avoid pulling in ring, allowing aws-lc-rs to serve as the crypto provider instead. Deferred: tiberius (SEC-223, fork needs rustls fix), segment and duckdb (no rustls feature available), storage-types (has direct native-tls dep). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The client-legacy feature was previously activated transitively. After Cargo.lock regeneration, the transitive activation stopped and `hyper_openssl::client` became configured out. Enable it explicitly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The webpki-roots 1.0.6 crate uses the CDLA-Permissive-2.0 license, which is already allowed in deny.toml but was missing from about.toml (the cargo-about config that must be manually kept in sync). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- mz-aws-util: remove custom hyper-tls HTTP client override; the AWS SDK already uses rustls by default, so the native-TLS override was unnecessary - mz (CLI): reqwest default-tls → rustls-tls-webpki-roots-no-provider - mz-persist: reqwest default-tls → rustls-tls-webpki-roots-no-provider Deferred: mz-dyncfg-launchdarkly (LD SDK takes hyper_tls::HttpsConnector directly — needs upstream/fork change), mz-persist openssl-sys removal (has openssl_sys::init() hack that needs investigation), mz CLI openssl-probe removal (needs source changes for cert discovery). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove references to native-TLS policy override and hyper-tls dep in generated docs. The AWS SDK's default rustls client is now used directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The client-legacy feature was previously activated transitively through hyper-tls in mz-ore. After replacing hyper-tls with hyper-rustls, the transitive activation stopped and `hyper_openssl::client` became configured out. Enable it explicitly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The PR removed the custom hyper-tls HTTP client from aws-util but didn't enable the `default-https-client` feature on aws-config. With default-features = false, no HTTP client was bundled, causing environmentd to crash on startup. Also pin os_info to 3.11.0 to avoid pulling in objc2 0.6.x which causes E0275 overflow on macOS clippy due to its blanket IntoIterator impl on Retained<T>. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eature Add a `telemetry` feature (default-enabled) to mz-adapter that gates `launchdarkly-server-sdk` and `mz-segment` as optional deps. Add #[cfg] guards on LD-specific code in config.rs and config/frontend.rs. WIP: segment client refs in client.rs, coord.rs, coord/ddl.rs, and coord/message_handler.rs still need cfg guards. The pattern is proven but the threading is extensive — see SEC-229 for remaining work. Compiles with default features. Does not yet compile with --no-default-features (missing segment cfg guards). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…for FIPS In FIPS mode, non-essential third-party SDKs must be excluded from the binary at compile time (not just disabled at runtime). This adds a `telemetry` Cargo feature to mz-adapter, mz-environmentd, and mz-balancerd, plus a `sentry` feature to mz-ore, mz-orchestrator-tracing, and mz-service. When these features are disabled: - Segment analytics client is compiled out via SegmentClient type alias - LaunchDarkly SDK and dyncfg sync are excluded - Sentry error reporting and panic integration are excluded - CLI args are still accepted but values are ignored All features are default-enabled so standard builds are unaffected. FIPS builds use `--no-default-features` to exclude them. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
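The "compiled out via type alias" pattern described above can be sketched as follows. The module layout, the `Client` methods, and `record_login` are hypothetical and only illustrate the shape of the technique; the real `SegmentClient` alias and its call sites in mz-adapter may look different.

```rust
// With the `telemetry` feature enabled, the alias points at a client that
// actually records events (a stand-in here, not the mz-segment API).
#[cfg(feature = "telemetry")]
mod segment {
    #[derive(Default)]
    pub struct Client {
        pub sent: Vec<String>,
    }

    impl Client {
        pub fn track(&mut self, event: &str) {
            self.sent.push(event.to_string());
        }
        pub fn enabled(&self) -> bool {
            true
        }
    }
}

// Without the feature, a zero-sized stub with the same method surface is
// compiled instead, so no third-party SDK code ends up in the binary.
#[cfg(not(feature = "telemetry"))]
mod segment {
    #[derive(Default)]
    pub struct Client;

    impl Client {
        pub fn track(&mut self, _event: &str) {}
        pub fn enabled(&self) -> bool {
            false
        }
    }
}

/// Call sites name only this alias, so they need no `#[cfg]` guards.
pub type SegmentClient = segment::Client;

/// Hypothetical call site: compiles identically in both configurations.
pub fn record_login(client: &mut SegmentClient, user: &str) -> bool {
    client.track(&format!("login:{user}"));
    client.enabled()
}
```

The design choice is that the feature gate lives in one place (the module definition) instead of being threaded through every caller.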
Add a section to the FIPS compliance report documenting the Cargo feature flags that gate third-party telemetry SDKs (Segment, LaunchDarkly, Sentry) for FIPS builds. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace all openssl cryptographic primitives in src/auth/src/hash.rs with aws-lc-rs equivalents to ensure FIPS 140-3 compliance: - openssl::rand::rand_bytes -> aws_lc_rs::rand::SystemRandom - openssl::memcmp::eq -> aws_lc_rs::constant_time::verify_slices_are_equal - openssl::pkey::PKey::hmac + openssl::sign::Signer -> aws_lc_rs::hmac - openssl::sha::sha256 -> aws_lc_rs::digest - openssl::pkcs5::pbkdf2_hmac -> aws_lc_rs::pbkdf2 Removes the openssl dependency from mz-auth entirely. Part of SEC-198. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
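For readers unfamiliar with why password-hash comparison goes through a constant-time primitive rather than `==`, the following standard-library-only sketch shows the idea. It is an illustration of the technique, not the code in `src/auth/src/hash.rs`, and it is not a substitute for `aws_lc_rs::constant_time::verify_slices_are_equal`: a hand-written loop gives no FIPS validation and weaker guarantees against compiler optimizations.

```rust
use std::hint::black_box;

/// Compares two byte slices without returning early at the first
/// mismatching byte, so timing does not reveal how long the matching
/// prefix is. The length check is not secret-dependent for fixed-size
/// digests. Illustrative only (hypothetical helper).
pub fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences instead of branching on each byte.
        diff |= x ^ y;
    }
    // black_box discourages the optimizer from reintroducing an early exit.
    black_box(diff) == 0
}
```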
### Motivation
This is a stacked PR on top of the OIDC login PR: MaterializeInc#35440

This PR lets the user retrieve the ID token for the psql connection string. The changes that would go in are from the last commit.

### Description
- Added an OIDC Connection modal, similar to the Connect modal for the cloud console, to show the connection instructions and ID token

<img width="2680" height="1598" alt="image" src="https://github.com/user-attachments/assets/494b2949-827f-489d-afd9-6ca86bf890b5" />

### Verification
Once logged in using SSO, take the connection string and paste it into the terminal. You will be prompted for a password; copy and paste the ID token to authenticate.
Context: Instructions for running bin/environmentd with postgres cause a panic https://materializeinc.slack.com/archives/CU7ELJ6E9/p1775151866083609.
Replace sha2, hmac, and subtle crate usage with aws-lc-rs equivalents to consolidate on a single FIPS 140-3 validated crypto backend.

Changes by crate:
- mz-adapter: sha2::Sha256 → aws_lc_rs::digest
- mz-avro: sha2/digest traits → aws_lc_rs::digest (fingerprint API changed from generic type parameter to algorithm reference)
- mz-catalog: sha2::Sha256 → aws_lc_rs::digest
- mz-expr: sha2/sha1 digests → aws_lc_rs::digest, hmac (sha1-512) → aws_lc_rs::hmac, subtle::ConstantTimeEq → aws_lc_rs::constant_time (hmac crate retained for MD5 HMAC only)
- mz-fivetran-destination: sha2::Sha256 → aws_lc_rs::digest
- mz-license-keys: sha2::Sha256 → aws_lc_rs::digest
- mz-npm: sha2::Sha256 → aws_lc_rs::digest
- mz-orchestrator-kubernetes: sha2::Sha256 → aws_lc_rs::digest
- mz-orchestratord: sha2::Sha256 → aws_lc_rs::digest
- mz-persist: removed sha2 "asm" perf dependency (aws-lc-rs uses native assembly)
- mz-storage: sha2 Digest trait → aws_lc_rs::digest::Context

Dependencies removed: sha2 (10 crates), subtle (1 crate), sha1 (1 crate), digest (1 crate). The hmac crate remains in mz-expr for HMAC-MD5 (not available in aws-lc-rs).

Part of SEC-206. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…c-rs/rustls

Migrate 7 leaf crates away from direct openssl/native-tls dependencies as part of FIPS 140-3 compliance:
- mz-adapter: openssl::rand::rand_bytes → aws_lc_rs::rand::fill
- mz-catalog: openssl::rand::rand_bytes → aws_lc_rs::rand::fill, openssl::sha::sha256 → aws_lc_rs::digest
- mz-ssh-util: Ed25519 keygen from openssl PKey → aws_lc_rs::signature
- mz-frontegg-mock: test RSA keygen from openssl → aws_lc_rs::rsa
- mz-oidc-mock: RSA key parsing from openssl → aws_lc_rs::rsa + manual DER
- mz-ccsr: native-tls cert handling → base64 PEM parsing + reqwest validation, reqwest native-tls-vendored → rustls-tls-webpki-roots
- mz-storage-types: remove NativeTls/Openssl error variants from CsrConnectError

mz-debug and mz-postgres-util migrations are blocked by SEC-192 (mz-tls-util) since they consume mz_tls_util::make_tls which returns OpenSSL-based types.

Part of SEC-220. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ustls Replace openssl-based test certificate generation and TLS connector construction with rcgen (cert generation) and tokio-postgres-rustls. - Ca struct now uses rcgen::CertificateParams + KeyPair instead of openssl X509/PKey. Certificate and key are stored as PEM bytes. - New TestTlsConfig builder replaces the closure-based SslConnectorBuilder pattern with a declarative config struct. - make_pg_tls now takes TestTlsConfig and returns MakeRustlsConnect. Test files (auth.rs, server.rs, balancerd/tests) still need call-site migration to the new API — tracked as remaining work for SEC-219. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
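The move from closure-based configuration to a declarative config struct can be sketched like this. The field and method names below are hypothetical and chosen for illustration; the actual `TestTlsConfig` in the test utilities may expose a different surface.

```rust
/// Declarative description of a test client's TLS setup (hypothetical
/// fields). Certificates and keys are held as PEM bytes, matching how the
/// commit says the Ca struct now stores them.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct TestTlsConfig {
    /// CA certificate (PEM) the client should trust, if any.
    pub ca_cert_pem: Option<Vec<u8>>,
    /// Client certificate and private key (both PEM) for mutual TLS.
    pub client_identity: Option<(Vec<u8>, Vec<u8>)>,
    /// Whether the server certificate is verified at all.
    pub verify_server: bool,
}

impl TestTlsConfig {
    pub fn new() -> Self {
        Self::default()
    }

    /// Trust the given CA and turn server verification on.
    pub fn trust_ca(mut self, ca_cert_pem: &[u8]) -> Self {
        self.ca_cert_pem = Some(ca_cert_pem.to_vec());
        self.verify_server = true;
        self
    }

    /// Present a client certificate during the handshake.
    pub fn client_cert(mut self, cert_pem: &[u8], key_pem: &[u8]) -> Self {
        self.client_identity = Some((cert_pem.to_vec(), key_pem.to_vec()));
        self
    }

    /// Skip server verification (for negative tests).
    pub fn no_verify(mut self) -> Self {
        self.verify_server = false;
        self
    }
}
```

Because the config is plain data, a test can build it once, compare or clone it, and hand it to whichever connector constructor it needs, which is harder to do with an opaque closure over a builder.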
Update all test files to use the new TestTlsConfig-based API: - auth.rs: Migrate ~50 make_pg_tls call sites, replace SslConnectorBuilder closures with TestTlsConfig builder. Switch JWT from RS256 to ES256 (matching rcgen's ECDSA key generation). Stub make_http_tls/make_ws_tls with TODO comments for full rustls migration. - environmentd/tests/server.rs: Migrate make_pg_tls calls, JWT keys, reqwest cert access, and X509 comparisons (with TODO stubs). - balancerd/tests/server.rs: Migrate make_pg_tls calls, JWT keys, reqwest cert access, and X509 comparisons (with TODO stubs). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace TODO stubs with working implementations: - peer_certificate_der(): raw tokio-rustls handshake to inspect peer certificates. reqwest's TlsInfo::peer_certificate() only works with the native-tls backend, returning None with rustls — so we drop down to tokio_rustls::TlsConnector directly where ServerConnection::peer_certificates() always works. - cert_file_to_der(): parse PEM cert files to DER for comparison. - make_http_tls(): now honors TestTlsConfig (builds hyper-rustls connector from the client config that trusts the test CA). - make_ws_tls(): uses rustls::StreamOwned for synchronous TLS WebSocket connections. Cert reloading test assertions in both environmentd and balancerd are now fully restored — no remaining TODO stubs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
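As an illustration of what a helper like `cert_file_to_der()` has to do, here is a dependency-free sketch that extracts the first `CERTIFICATE` block from PEM text and base64-decodes it. The function names are hypothetical; the real helper presumably relies on a PEM-parsing crate rather than a hand-rolled decoder.

```rust
/// Decodes standard base64 (RFC 4648 alphabet), stopping at the first '='
/// padding character. Returns None on any other unexpected character.
fn base64_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf: u32 = 0;
    let mut bits: u32 = 0;
    for c in input.bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            b'=' => break,
            _ => return None,
        };
        // Shift in 6 bits; emit a byte whenever at least 8 are buffered.
        buf = (buf << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1;
        }
    }
    Some(out)
}

/// Returns the decoded bytes of the first certificate block in `pem`,
/// or None if the markers are missing or the body is not valid base64.
pub fn pem_cert_to_der(pem: &str) -> Option<Vec<u8>> {
    const BEGIN: &str = "-----BEGIN CERTIFICATE-----";
    const END: &str = "-----END CERTIFICATE-----";
    let start = pem.find(BEGIN)? + BEGIN.len();
    let end = start + pem[start..].find(END)?;
    // PEM bodies are wrapped across lines; drop the whitespace first.
    let body: String = pem[start..end]
        .chars()
        .filter(|c| !c.is_whitespace())
        .collect();
    base64_decode(&body)
}
```

With both sides reduced to DER bytes, the cert-reloading assertions can compare the certificate served on the wire against the file on disk byte for byte.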
Rewrite the central TLS utility crate to use rustls instead of openssl:
- make_tls: returns MakeRustlsConnect (rustls-based) instead of postgres-openssl MakeTlsConnector. Supports SslMode verification, client certificates, and a NoVerifier for non-verifying modes.
- pkcs12der_from_pem: validates PEM with rustls-pki-types instead of openssl. Stores concatenated PEM in the Pkcs12Archive for backward compatibility (consumers use reqwest::Identity::from_pem).
- TlsError: OpenSsl variant replaced with Rustls variant.
- MakeRustlsConnect + RustlsConnect: implements tokio_postgres MakeTlsConnect trait using tokio-rustls, with RustlsTlsStream wrapper for TlsStream trait.

Updated consumers:
- mz-postgres-util: removed openssl + postgres-openssl deps, updated error types
- mz-postgres-client: updated TlsError match arm
- mz-debug: replaced MakeTlsConnector/TlsStream with rustls equivalents
- mz-ccsr: pkcs12der_from_pem error type changed (already updated in SEC-220)
- mz-storage-types: pkcs12der_from_pem returns anyhow::Error (compatible)

Part of SEC-192. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
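The distinction between verifying and non-verifying modes can be sketched as a small mapping. The verification levels follow the documented libpq `sslmode` semantics; the enum and function names are hypothetical, and how `make_tls` actually selects the NoVerifier is an assumption here.

```rust
/// PostgreSQL connection security modes, as documented for libpq's
/// `sslmode` parameter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SslMode {
    Disable,
    Prefer,
    Require,
    VerifyCa,
    VerifyFull,
}

/// What the client checks about the server certificate (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verification {
    /// TLS is not used at all.
    NoTls,
    /// Encrypt, but accept any certificate (where a NoVerifier would apply).
    Unverified,
    /// Validate the chain against trusted roots, ignore the hostname.
    ChainOnly,
    /// Validate the chain and match the hostname.
    ChainAndHostname,
}

/// Maps each mode to the verification level libpq documents for it.
pub fn verification_for(mode: SslMode) -> Verification {
    match mode {
        SslMode::Disable => Verification::NoTls,
        SslMode::Prefer | SslMode::Require => Verification::Unverified,
        SslMode::VerifyCa => Verification::ChainOnly,
        SslMode::VerifyFull => Verification::ChainAndHostname,
    }
}
```

In a rustls-based connector, the two non-default levels need custom certificate verifiers, since rustls verifies both chain and hostname unless told otherwise.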
…sl to rustls

Replace openssl/tokio-openssl/native-tls/hyper-tls with rustls/tokio-rustls/hyper-rustls across the core TLS infrastructure and all downstream consumers:
- mz-ore: Add `crypto` module with FIPS-aware CryptoProvider helper. Replace openssl/tokio-openssl in `async` feature with tokio-rustls. Replace native-tls/hyper-tls in `tracing` feature with hyper-rustls. Update AsyncReady impls for tokio-rustls stream types.
- mz-server-core: Rewrite TlsCertConfig::load_context() to produce rustls::ServerConfig. Update ReloadingSslContext to wrap Arc<RwLock<Arc<ServerConfig>>> with acceptor() method.
- mz-pgwire-common: Introduce TlsStream enum wrapping both server and client rustls streams with SNI extraction support.
- mz-pgwire: Update SSL accept to TlsAcceptor pattern.
- mz-balancerd: Migrate server-side TLS accept, client-side TLS connect (with NoVerifier for internal connections), and SNI extraction.
- mz-environmentd: Migrate HTTP server TLS accept and console proxy HTTPS client from hyper-tls to hyper-rustls.

Part of SEC-218. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
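The `Arc<RwLock<Arc<ServerConfig>>>` shape used for certificate reloading can be sketched with the standard library alone. `ServerConfig` below is a placeholder struct rather than `rustls::ServerConfig`, and `ReloadingContext`, `current`, and `reload` are hypothetical names; the real `ReloadingSslContext` exposes an `acceptor()` method instead.

```rust
use std::sync::{Arc, RwLock};

/// Placeholder for rustls::ServerConfig (assumption for this sketch).
pub struct ServerConfig {
    pub cert_pem: String,
}

/// Shared handle to a TLS config that can be swapped at runtime.
/// The outer Arc shares the lock between listeners; the inner Arc lets
/// each connection keep the snapshot it started with.
#[derive(Clone)]
pub struct ReloadingContext {
    inner: Arc<RwLock<Arc<ServerConfig>>>,
}

impl ReloadingContext {
    pub fn new(config: ServerConfig) -> Self {
        Self {
            inner: Arc::new(RwLock::new(Arc::new(config))),
        }
    }

    /// Cheap snapshot of the current config; the read lock is held only
    /// long enough to clone the inner Arc.
    pub fn current(&self) -> Arc<ServerConfig> {
        let guard = self.inner.read().expect("lock poisoned");
        Arc::clone(&*guard)
    }

    /// Installs a freshly loaded config. Handshakes already in progress
    /// keep using the snapshot they obtained earlier.
    pub fn reload(&self, config: ServerConfig) {
        *self.inner.write().expect("lock poisoned") = Arc::new(config);
    }
}
```

The inner `Arc` is the key design point: a reload replaces the pointer under the lock, so readers are never blocked for the duration of a handshake.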
- Fix clippy::clone_on_ref_ptr in tls-util (Arc::clone instead of .clone()) - Fix rustfmt line width in pgwire server.rs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Frontegg mock server uses RS256 (RSA) for JWT signing, but the tests were passing the CA's ECDSA key via `from_ec_pem()`. This worked with the old openssl backend (which generated RSA keys for CAs) but fails with rcgen (which generates ECDSA keys). Use `Ca::generate_jwt_rsa_keypair()` to create a separate RSA keypair for JWT signing, and switch all `from_ec_pem` calls to `from_rsa_pem`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…wt_keys - DecodingKey needs the RSA public key, not private key - Add jwt_keys initialization to OIDC-only test functions that were missing it - Remove unused JwtRsaKeyPair import from balancerd tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Change JWT algorithm from ES256 to RS256 to match RSA keys - Fix OIDC issuer switch test: use jwt_keys instead of ca1.key_pem - Update error message assertions for rustls: - "packet length too long" → "InvalidContentType" - "unable to get local issuer certificate" → "UnknownIssuer" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix second occurrence of "unable to get local issuer certificate" assertion that was missed in prior commit - Update miri ignore comments: remove stale OPENSSL_init_ssl reference - Clean up outdated SslConnectorBuilder reference in test_util.rs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closing in favor of a new PR from the upstream repo branch (needed for Docker image builds in CI).
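Summary
- mz-ore: Add `crypto` module with FIPS-aware CryptoProvider helper. Replace openssl/tokio-openssl in `async` feature with tokio-rustls. Replace native-tls/hyper-tls in `tracing` feature with hyper-rustls. Update AsyncReady impls for tokio-rustls stream types.
- mz-server-core: Rewrite `TlsCertConfig::load_context()` to produce `rustls::ServerConfig`. Update `ReloadingSslContext` to wrap `Arc<RwLock<Arc<ServerConfig>>>` with `acceptor()` method.
- mz-pgwire-common: Introduce `TlsStream` enum wrapping both server and client rustls streams with SNI extraction support.
- mz-pgwire: Update SSL accept to `TlsAcceptor` pattern.
- mz-balancerd: Migrate server-side TLS accept, client-side TLS connect (with NoVerifier for internal connections), and SNI extraction.
- mz-environmentd: Migrate HTTP server TLS accept and console proxy HTTPS client from hyper-tls to hyper-rustls.

Part of SEC-218 (PR 6: Core TLS infrastructure and downstream consumers).

Test plan
- `cargo check` passes for all modified crates
- `cargo check -p mz-environmentd --features test` passes

Checklist
- `cargo fmt` clean

🤖 Generated with Claude Code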