fix(vm): make findAll linear by stopping matchAt when no threads remain #8
Open
DaliVana wants to merge 1 commit into
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Fixes: #6
matchAt seeds only the start-state thread, yet after a match completed and all threads died it kept iterating to input.len while doing no work. findAll restarts matchAt once per match, so that dead walk cost Σ(n − posᵢ) ≈ O(n²) in total: a 64 KiB scan with a match every 64 bytes took ~6.5 s. This also made findIter/replaceAll/split unusable past a few KB and contradicted the documented O(n×m) guarantee.
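The Σ(n − posᵢ) term can be checked with a small step-count model. This is a Python sketch for illustration only, not the project's Zig code; `dead_walk_steps` and the fixed `stride` between match starts are hypothetical names and assumptions chosen to mirror the "match every 64 bytes" scenario.

```python
def dead_walk_steps(n: int, stride: int) -> int:
    """Positions visited if every restart at `pos` walks on to the end of the input.

    Assumes one restart every `stride` bytes, which models the pre-fix
    behaviour described above (each matchAt call keeps iterating to input.len).
    """
    return sum(n - pos for pos in range(0, n, stride))


KIB = 1024
small = dead_walk_steps(64 * KIB, 64)   # 64 KiB input, a match every 64 bytes
large = dead_walk_steps(128 * KIB, 64)  # same pattern density, doubled input

# Doubling the input roughly quadruples the visited positions (quadratic growth).
print(small, large, round(large / small, 2))
```

Under these assumptions the model visits tens of millions of positions for a 64 KiB input, even though the input itself is only 65,536 bytes long.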
Break out of the matchAt loop as soon as the live thread set is empty; no further match can begin from this start_pos, and any accept on the consumed input was already recorded. Restores O(n·states) end to end.
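The early exit can be illustrated with a toy thread-set simulation. This is a minimal Python sketch, not the actual matchAt implementation: the hand-built NFA for `ab+`, the `TRANSITIONS` table, and the `early_exit` flag are all hypothetical, and the real VM's instruction set and match semantics may differ.

```python
from typing import Optional, Tuple

# Hypothetical toy NFA for the pattern "ab+": state -> {char: next states}.
TRANSITIONS = {0: {"a": {1}}, 1: {"b": {2}}, 2: {"b": {2}}}
START = 0
ACCEPT = {2}


def match_at(text: str, start_pos: int, early_exit: bool = True) -> Tuple[Optional[int], int]:
    """Return (end of the longest match anchored at start_pos, input positions visited)."""
    threads = {START}      # only the start-state thread is seeded
    last_accept = None     # end offset of the most recent accept, if any
    steps = 0
    for pos in range(start_pos, len(text)):
        if early_exit and not threads:
            break          # no live threads: nothing further can match from start_pos
        steps += 1
        next_threads = set()
        for state in threads:
            next_threads |= TRANSITIONS.get(state, {}).get(text[pos], set())
        threads = next_threads
        if threads & ACCEPT:
            last_accept = pos + 1
    return last_accept, steps


print(match_at("abbbxxxx", 0, early_exit=True))   # stops one step after the threads die
print(match_at("abbbxxxx", 0, early_exit=False))  # same match, but walks the whole input
```

Both calls report the same match end, which mirrors the "match counts unchanged" observation: the accept is recorded before the thread set empties, so breaking afterwards cannot lose a result.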
Reproduction (issue), ReleaseFast: 16 → 7.7 ms, 32 → 15.4 ms, 64 → 30.8 ms, 128 → 56.3 ms, 256 → 119 ms. That is ~2× per doubling (was ~4×). Match counts unchanged.
Adds a build-mode-independent regression test (time ratio of input vs 2×input must stay sub-3×); links libc for tests/regression.zig to get a monotonic clock (Zig 0.16 std.time has none).
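The idea behind a ratio-based timing check can be sketched as follows. This is an illustrative Python model, not the contents of tests/regression.zig: `best_time`, `scales_subquadratically`, and the two stand-in workloads are hypothetical. Comparing the time for 2×input against the time for input cancels out the absolute speed of the build, so the same threshold plausibly works in debug and release modes.

```python
import time
from typing import Callable


def best_time(work: Callable[[int], object], n: int, repeats: int = 5) -> float:
    """Best-of-`repeats` wall time for work(n), measured with a monotonic clock."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        work(n)
        best = min(best, time.perf_counter() - start)
    return best


def scales_subquadratically(work: Callable[[int], object], n: int,
                            max_ratio: float = 3.0) -> bool:
    """True if doubling the input costs less than max_ratio times as long."""
    return best_time(work, 2 * n) < max_ratio * best_time(work, n)


def linear_work(n: int) -> int:
    """Stand-in for a scan that visits each position once."""
    total = 0
    for _ in range(n):
        total += 1
    return total


def quadratic_work(n: int, stride: int = 64) -> int:
    """Stand-in for a scan that walks to the end of input on every restart."""
    total = 0
    for pos in range(0, n, stride):
        for _ in range(pos, n):
            total += 1
    return total


print(scales_subquadratically(linear_work, 1_000_000))
print(scales_subquadratically(quadratic_work, 8_192))
```

A threshold of 3× sits between the ~2× expected from linear scaling and the ~4× expected from quadratic scaling, which leaves some room for timer noise on either side. Taking the best of several runs is one assumed way to reduce that noise further.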