fix(vm): make findAll linear by stopping matchAt when no threads remain #8

Open
DaliVana wants to merge 1 commit into zig-utils:main from DaliVana:fix/findall-quadratic

DaliVana commented May 15, 2026

Fixes: #6

matchAt seeds only the start-state thread, yet once a match had completed and every thread had died it kept iterating to input.len, doing nothing. findAll restarts matchAt for each match, so that dead walk cost Σ(n − posᵢ) ≈ O(n²) in total: a 64 KiB scan with a match every 64 bytes took ~6.5 s. This also made findIter/replaceAll/split unusable past a few KB and contradicted the documented O(n×m) guarantee.
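To make the Σ(n − posᵢ) estimate concrete, here is a rough Python calculation for the 64 KiB case. It assumes, as a simplification, that each restart begins exactly at a multiple of the match spacing and walks to the end of the input; `dead_walk_steps` is a hypothetical helper for illustration, not a function in the library.

```python
def dead_walk_steps(n, spacing):
    """Sum of (n - pos) over restart positions pos = 0, spacing, 2*spacing, ..."""
    return sum(n - pos for pos in range(0, n, spacing))


n, spacing = 64 * 1024, 64           # 64 KiB input, one match every 64 bytes
steps = dead_walk_steps(n, spacing)
print(steps)                         # 33587200 loop iterations in total

# When spacing divides n, the sum has the closed form n^2 / (2 * spacing) + n / 2,
# which is where the quadratic growth comes from.
print(steps == n * n // (2 * spacing) + n // 2)   # True
```

Under these assumptions, doubling `n` roughly quadruples the count, which matches the ~4× per doubling that the old code showed.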

Break out of the matchAt loop as soon as the live thread set is empty; no further match can begin from this start_pos, and any accept on the consumed input was already recorded. Restores O(n·states) end to end.
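The effect of the early exit can be sketched with a toy thread-set matcher in Python. This is not the project's Zig VM: it assumes a non-empty literal pattern, an anchored `match_at`, and a `find_all` that retries from the next position after a failed attempt; the names and the `stop_when_dead` flag are hypothetical and exist only to compare the two behaviours.

```python
def match_at(pattern, text, start_pos, stop_when_dead):
    """Anchored thread-set simulation for a non-empty literal pattern.

    Returns (match_end_or_None, number_of_input_loop_iterations).
    """
    accept = len(pattern)
    threads = {0}                  # seed only the start-state thread
    match_end = None
    steps = 0
    for i in range(start_pos, len(text)):
        if stop_when_dead and not threads:
            break                  # nothing alive: no later match from start_pos
        steps += 1
        next_threads = set()
        for state in threads:
            if text[i] == pattern[state]:
                if state + 1 == accept:
                    match_end = i + 1      # accept is recorded when it is seen
                else:
                    next_threads.add(state + 1)
        threads = next_threads
    return match_end, steps


def find_all(pattern, text, stop_when_dead):
    """Restart match_at per attempt; return (matches, total loop iterations)."""
    matches, total_steps, pos = [], 0, 0
    while pos < len(text):
        end, steps = match_at(pattern, text, pos, stop_when_dead)
        total_steps += steps
        if end is None:
            pos += 1
        else:
            matches.append((pos, end))
            pos = end
    return matches, total_steps


text = ("ab" + "x" * 62) * 16      # 1024 bytes, one match every 64 bytes
slow_matches, slow_steps = find_all("ab", text, stop_when_dead=False)
fast_matches, fast_steps = find_all("ab", text, stop_when_dead=True)
print(slow_matches == fast_matches, fast_steps, slow_steps > 100 * fast_steps)
```

In this toy, both modes report the same matches, because the accept is recorded before the thread set empties; only the number of wasted iterations differs.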

Reproduction (from the issue), ReleaseFast: 16 → 7.7 ms, 32 → 15.4 ms, 64 → 30.8 ms, 128 → 56.3 ms, 256 → 119 ms, i.e. ~2× per doubling (was ~4×). Match counts are unchanged.
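The "~2× per doubling" reading can be checked directly from the timings above; a quick Python calculation, using only the five numbers quoted in this description:

```python
# Post-fix timings quoted above, in milliseconds, one entry per input size.
timings_ms = [7.7, 15.4, 30.8, 56.3, 119.0]

# Ratio of each measurement to the previous one, i.e. the cost of one doubling.
ratios = [round(b / a, 2) for a, b in zip(timings_ms, timings_ms[1:])]
print(ratios)   # [2.0, 2.0, 1.83, 2.11]
```

Each ratio is close to 2, which is what linear scaling predicts; quadratic scaling would give ratios near 4.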

Adds a build-mode-independent regression test: the time for a 2× input divided by the time for the original input must stay below 3×. Links libc for tests/regression.zig to get a monotonic clock (Zig 0.16 std.time has none).
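The idea behind the ratio check can be sketched in Python. This is not the contents of tests/regression.zig; `best_time`, `doubling_ratio`, and `is_subquadratic` are hypothetical names, and the 3.0 limit is taken from the description above. Because the check compares two timings from the same build, the absolute speed of the build mode cancels out.

```python
import time


def best_time(work, n, repeats=5):
    """Smallest elapsed time of work(n) over a few runs, using a monotonic clock."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        work(n)
        best = min(best, time.perf_counter() - start)
    return best


def doubling_ratio(cost, n):
    """cost(2n) / cost(n): about 2 for linear work, about 4 for quadratic work."""
    return cost(2 * n) / cost(n)


def is_subquadratic(cost, n, limit=3.0):
    """True if doubling the input costs less than `limit` times as much."""
    return doubling_ratio(cost, n) < limit


# Deterministic stand-ins for measured cost, to show where the threshold sits.
print(is_subquadratic(lambda n: 5 * n, 1024))    # True  (ratio 2.0)
print(is_subquadratic(lambda n: n * n, 1024))    # False (ratio 4.0)
```

With real measurements one would pass something like `lambda n: best_time(scan, n)`, where `scan` is whatever runs the search on an input of size `n`. A limit of 3 sits midway between the linear and quadratic ratios, which leaves some room for timing noise in either direction.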

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Development

Successfully merging this pull request may close these issues.

findAll is O(n²) — re-inits the VM and rescans from each match position
