REGEX_PATTERNS static in secrets_redactor.rs increased the P50 memory usage #330

Closed
ZhidongPeng wants to merge 6 commits into Azure:dev from ZhidongPeng:regex

@ZhidongPeng
Collaborator

Issue:
REGEX_PATTERNS static in secrets_redactor.rs (PR #317)

  • Before PR #317 (regex perf improvements), Regex::new() was called on the fly and the compiled automaton was dropped after use.
  • Now 17 compiled regex::Regex objects live permanently, including several extremely complex patterns:
  1. "(?i)((app(lication)?|client|api)[_ \-]?(se?cre?t|key(url)?)|(refresh|twilio...)..."
  2. "(?-i)eyJ(?i)[a-z0-9\-_%]+\.(?-i)eyJ"
  3. The 500+ character Azure Storage and auth header patterns

The regex crate compiles each pattern to an NFA and, as the pattern is matched, builds and caches lazy DFA transition tables. For complex patterns with deep alternation and case-insensitive matching, this can easily consume hundreds of KB each. 17 patterns × ~200–400 KB each = an estimated 3.4–6.8 MB permanently resident.

Fix:
Part 1 — Eliminate 6 compiled Regex objects entirely
The patterns pwd=[^;], password=[^;], AccountKey=[^;], PrimaryKey=[^;], SecondaryKey=[^;], and sig=[^&] are structurally identical: each matches a fixed literal prefix, then captures everything up to a single stop character. The new redact_prefixed() helper does this with plain str::find, so it needs no compiled automaton and no regex heap allocation.

Memory saved: 6 compiled Regex objects removed entirely (even these simple ones cost tens of KB each).
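
For illustration, the prefix-and-stop-character approach could look roughly like the sketch below. The PR only names redact_prefixed(); the signature, the mask parameter, the choice to keep the prefix in the output, and the case-sensitive matching are all assumptions made for this example, not the actual implementation.

```rust
/// Hypothetical sketch of a prefix-based redactor (signature assumed, not from the PR).
/// For every occurrence of `prefix`, the text that follows it is replaced with `mask`
/// up to, but not including, the next `stop` character or the end of the input.
fn redact_prefixed(input: &str, prefix: &str, stop: char, mask: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;

    // str::find is a plain substring search: nothing is compiled or cached.
    while let Some(pos) = rest.find(prefix) {
        let value_start = pos + prefix.len();
        // Keep everything up to and including the prefix (assumption: the prefix stays visible).
        out.push_str(&rest[..value_start]);

        let after = &rest[value_start..];
        // The secret ends at the stop character, or at end of input if there is none.
        let value_end = after.find(stop).unwrap_or(after.len());
        if value_end > 0 {
            out.push_str(mask);
        }
        // Continue scanning from the stop character onwards.
        rest = &after[value_end..];
    }

    out.push_str(rest);
    out
}
```

Under these assumptions, calling redact_prefixed("Server=x;pwd=hunter2;Db=y", "pwd=", ';', "<redacted>") leaves the surrounding connection-string fields intact and masks only the password value. If the original regexes were case-insensitive, the helper would need a case-insensitive search instead of a direct str::find on the input.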

Part 2 — Per-indicator LazyLock dispatch for the 11 complex patterns
Previously: the presence of any indicator (e.g., the word "key" in a path like "key_keeper") caused all 17 patterns to run.
Now: each pattern has its own static LazyLock, so it is compiled and run only when its own specific indicator is present.
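
A minimal sketch of the per-indicator dispatch idea follows. To stay dependency-free, the hypothetical CompiledPattern type stands in for regex::Regex and does a literal replacement only. The static names, the indicators, and the COMPILED counter are likewise assumptions for illustration; the counter exists only to make the lazy initialization observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

// Counts how many stand-in patterns have been built (illustration only).
static COMPILED: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for `regex::Regex` (assumption: keeps the sketch std-only).
struct CompiledPattern {
    needle: &'static str,
}

impl CompiledPattern {
    fn new(needle: &'static str) -> Self {
        // In the real code this is where the expensive Regex::new() would run.
        COMPILED.fetch_add(1, Ordering::SeqCst);
        CompiledPattern { needle }
    }

    fn redact(&self, input: &str) -> String {
        input.replace(self.needle, "<redacted>")
    }
}

// One static per pattern: each is built on first access, not at startup.
static JWT: LazyLock<CompiledPattern> =
    LazyLock::new(|| CompiledPattern::new("eyJhbGciOi"));
static BEARER: LazyLock<CompiledPattern> =
    LazyLock::new(|| CompiledPattern::new("Bearer abc123"));

/// Runs a pattern only when its own cheap indicator is present in the input.
fn redact(input: &str) -> String {
    let mut out = input.to_string();
    if out.contains("eyJ") {
        out = JWT.redact(&out); // first access builds JWT
    }
    if out.contains("Bearer ") {
        out = BEARER.redact(&out); // first access builds BEARER
    }
    out
}
```

With this shape, an input such as "path=/key_keeper" touches neither static, so nothing is built for it, and an input containing only "Bearer " builds a single pattern. The trade-off is that the first request to hit a given indicator pays that pattern's compile cost.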
