Shared: Improvements to SensitiveDataHeuristics.qll #21806
Open
geoffw0 wants to merge 22 commits into
Pull request overview
This PR updates the shared sensitive-data naming heuristics used across multiple languages to improve classification of passwords/private information (increasing true positives and reducing false positives), and refreshes language-specific tests and change notes to reflect the updated behavior.
Changes:
- Refines shared sensitive-data heuristics (regex patterns and exclusions) in `SensitiveDataHeuristics.qll`.
- Updates Swift/Python/Rust tests and expected baselines to reflect newly-detected (or no-longer-detected) sensitive data sources.
- Adds per-language change notes documenting the heuristics improvement.
Show a summary per file
| File | Description |
|---|---|
| shared/concepts/codeql/concepts/internal/SensitiveDataHeuristics.qll | Updates shared sensitive-data heuristic patterns/exclusions used by multiple languages. |
| rust/ql/test/library-tests/sensitivedata/test.rs | Extends Rust library test coverage for the updated sensitive-data name heuristics. |
| python/ql/test/query-tests/Security/CWE-312-CleartextLogging/test.py | Updates Python cleartext logging test to reflect newly-classified sensitive values. |
| swift/ql/test/query-tests/Security/CWE-328/testCryptoKit.swift | Extends Swift hashing tests to cover additional API spellings. |
| swift/ql/test/query-tests/Security/CWE-311/testSend.swift | Updates Swift transmission test to reflect newly-detected sensitive field. |
| swift/ql/test/query-tests/Security/CWE-328/WeakSensitiveDataHashing.expected | Updates Swift expected results baseline for weak sensitive data hashing. |
| swift/ql/test/query-tests/Security/CWE-328/WeakPasswordHashing.expected | Updates Swift expected results baseline for weak password hashing. |
| swift/ql/test/query-tests/Security/CWE-311/SensitiveExprs.expected | Updates Swift expected sensitive-expression baseline. |
| swift/ql/test/query-tests/Security/CWE-311/CleartextTransmission.expected | Updates Swift expected cleartext transmission baseline. |
| swift/ql/lib/change-notes/2026-05-14-sensitive-data.md | Adds Swift change note for the sensitive-data heuristics update. |
| rust/ql/lib/change-notes/2026-05-14-sensitive-data.md | Adds Rust change note for the sensitive-data heuristics update. |
| python/ql/lib/change-notes/2026-05-14-sensitive-data.md | Adds Python change note for the sensitive-data heuristics update. |
| javascript/ql/lib/change-notes/2026-05-14-sensitive-data.md | Adds JavaScript change note for the sensitive-data heuristics update. |
Copilot's findings
- Files reviewed: 14/14 changed files
- Comments generated: 5
This PR consists of a series of small improvements to `SensitiveDataHeuristics.qll`, intended to find more true and fewer false sources of sensitive data. One of these changes addresses a request from a user; the rest are motivated by issues we've spotted at various points in the past. None are expected to have a big impact by themselves (but 7 changes x 5 affected languages is quite a lot of surface area).

- `card.?no`, `api.?tok`, `security.?code` patterns. We already had similar cases but no exact coverage for these.
- `wildcard_no` is not `card.?no`; `profile` is not `file`; `coauthor` is not `oauth`.
- `security_code` for containing `code`. It was also handling `unencrypted` incorrectly: while `unencrypt` was not matched due to the special case, the `crypt` substring was matched due to the entire `unen` part of the regex being optional. Copilot gets most of the credit for spotting this one.

Draft PR because I need to:

`account` matches a bit widely and we could potentially add a "not sensitive" rule for `validator`, if we see more of either of these cases.
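
The two regex pitfalls mentioned in the description (substring collisions such as `wildcard_no` vs `card.?no`, and the optional prefix that let `crypt` match inside `unencrypted`) can be reproduced in a few lines. The sketch below is in Python rather than QL, and every pattern in it is a simplified, hypothetical stand-in: these are not the regexes in `SensitiveDataHeuristics.qll`, and the lookbehind-based guards are just one possible way to get the described behaviour, not necessarily what this PR does.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# They are NOT the regexes used in SensitiveDataHeuristics.qll.

# 1. Plain substring patterns also fire inside longer, unrelated words.
NAIVE = re.compile(r"card.?no|file|oauth", re.IGNORECASE)

# One possible guard: fixed-width negative lookbehinds that reject the
# specific prefixes named in the description (wild-, pro-, c-).
GUARDED = re.compile(r"(?<!wild)card.?no|(?<!pro)file|(?<!c)oauth", re.IGNORECASE)

# 2. A lookbehind inside an OPTIONAL group cannot veto the overall match:
#    when the group fails, the engine skips it and matches "crypt" alone.
BUGGY_CRYPT = re.compile(r"((?<!un)en)?crypt", re.IGNORECASE)

# One possible repair: make each alternative carry its own guard.
GUARDED_CRYPT = re.compile(r"(?<!un)encrypt|(?<!en)crypt", re.IGNORECASE)


def matches(pattern: re.Pattern, name: str) -> bool:
    """Return True if the pattern occurs anywhere in the identifier."""
    return pattern.search(name) is not None


if __name__ == "__main__":
    for name in ["card_no", "wildcard_no", "profile", "coauthor", "oauth_token"]:
        print(name, matches(NAIVE, name), matches(GUARDED, name))
    for name in ["encrypted_password", "unencrypted_password", "crypt_salt"]:
        print(name, matches(BUGGY_CRYPT, name), matches(GUARDED_CRYPT, name))
```

With these stand-in patterns, `BUGGY_CRYPT` still matches `unencrypted_password` because the `((?<!un)en)?` group is simply skipped, which mirrors the issue described above. `GUARDED_CRYPT` rejects it while still matching `encrypted_password` and `crypt_salt`.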