Add Promptfoo (LLM eval & red-teaming) parser #15081
Draft
Dashtid wants to merge 1 commit into
Adds a file-based parser for promptfoo (https://promptfoo.dev) results JSON, produced by `promptfoo eval -o results.json` or `promptfoo redteam run -o results.json`.

- Inverted semantics: a result with success:false (a failed assertion / successful red-team attack) becomes a Finding; success:true (the model defended) is skipped, as are failureReason==ERROR (provider) results.
- Severity from red-team metadata.severity, with a Medium fallback for plain-eval failures; CWE mapped from the plugin/category as a coarse starter (89/78/1427/200, default 1426).
- Failures for the same plugin against the same target aggregate into one Finding (nb_occurences), keeping the most severe rung.
- Deduplicated via hash_code on title + component_name; severity and description are excluded as unstable across runs.

Verified against the promptfoo v3 results schema (results.version == 3); the sample scan files are real promptfoo output. Adds unit tests and parser documentation.

Signed-off-by: David Dashti <dashti.dat@gmail.com>
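The filtering, severity, and aggregation rules in the commit message can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the record keys `pluginId` and `target`, the helper names, and the numeric value `2` for the error reason are assumptions made for the example.

```python
# Assumption: promptfoo reports provider/eval errors as failureReason == 2.
FAILURE_REASON_ERROR = 2
# Severity rungs, ordered from least to most severe.
SEVERITY_ORDER = ["Info", "Low", "Medium", "High", "Critical"]


def is_finding(result: dict) -> bool:
    """Inverted semantics: only a failed assertion counts as a finding."""
    if result.get("success", True):
        return False  # every assertion passed, i.e. the model defended
    if result.get("failureReason") == FAILURE_REASON_ERROR:
        return False  # the test could not run, so nothing was demonstrated
    return True


def severity_of(result: dict) -> str:
    """Use metadata.severity when present, otherwise fall back to Medium."""
    metadata = result.get("metadata") or {}
    raw = str(metadata.get("severity", "")).capitalize()
    return raw if raw in SEVERITY_ORDER else "Medium"


def aggregate(results: list[dict]) -> dict[tuple[str, str], dict]:
    """Group failures by (plugin, target), keeping the most severe rung."""
    findings: dict[tuple[str, str], dict] = {}
    for result in results:
        if not is_finding(result):
            continue
        metadata = result.get("metadata") or {}
        # "pluginId" and "target" are hypothetical keys for this sketch.
        key = (metadata.get("pluginId", "unknown"), result.get("target", "unknown"))
        severity = severity_of(result)
        entry = findings.setdefault(key, {"severity": severity, "nb_occurences": 0})
        entry["nb_occurences"] += 1
        if SEVERITY_ORDER.index(severity) > SEVERITY_ORDER.index(entry["severity"]):
            entry["severity"] = severity
    return findings
```

Keying the aggregate on plugin and target means repeated attack attempts raise `nb_occurences` on a single entry instead of producing one entry per attempt.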
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Description
Adds a parser for promptfoo, an open-source LLM evaluation and red-teaming tool, aligned with the AI-testing direction in discussion #13242.
The parser ingests the JSON results file written by `promptfoo eval -o results.json` (and `promptfoo redteam run -o results.json`). promptfoo's pass/fail semantics are inverted relative to most scanners, which is the central design point:

- `success: true` means every assertion passed (for a red-team probe, the target model defended the attack), so it is not a finding.
- `success: false` is a failed assertion (for a red-team probe, the attack succeeded) and becomes a Finding.
- Results with `failureReason == 2` (a provider/eval error rather than an assertion failure) are skipped: the test could not run, so it is not a vulnerability.

Other decisions:

- Severity comes from the red-team `metadata.severity` (critical/high/medium/low). A plain `promptfoo eval` failure carries no severity metadata and defaults to Medium.
- Failures for the same plugin against the same target aggregate into one Finding (`nb_occurences`, keeping the most severe rung).
- `hash_code` deduplication on `title` + `component_name`. `severity` and `description` are intentionally excluded: `description` holds the per-run attack input/output, and `severity` is an aggregate that shifts as the set of failed attempts changes, so neither is stable enough for the dedup hash.

Test results
Adds `unittests/tools/test_promptfoo_parser.py` (14 tests) covering: zero/one/many findings, the severity matrix, the CWE mapping (including the specific `*-injection` rules taking precedence over the broad rule), aggregation, the plain-eval fallback (severity/title/identity from the failed assertion, metric-over-type), skipping passed and errored results, the lenient input shapes (bare list, top-level `results` list), `shareableUrl` -> `references`, string-form providers, bytes + UTF-8 BOM + non-ASCII input, and rejection of non-JSON input.

The sample scan files under `unittests/scans/promptfoo/` are real promptfoo v3 output (`results.version == 3`).

Documentation
Adds `docs/content/supported_tools/parsers/file/promptfoo.md`.

Checklist
- […] `dev`
- […] (`ruff.toml`, ruff 0.15.16)
- […] `Import Scans` and `settings_changes` (touches `settings.dist.py` for deduplication)
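The lenient input handling listed under Test results (bytes, UTF-8 BOM, bare list, top-level `results` list, rejection of non-JSON input) could look roughly like the sketch below. The function name `load_results` and the nested `results.results` layout for v3 files are assumptions for illustration, not taken from the PR's code.

```python
import json


def load_results(data) -> list:
    """Return the per-test result list from several accepted input shapes."""
    if isinstance(data, bytes):
        # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
        data = data.decode("utf-8-sig")
    else:
        data = data.lstrip("\ufeff")
    try:
        tree = json.loads(data)
    except json.JSONDecodeError as exc:
        raise ValueError("input is not a promptfoo JSON results file") from exc

    if isinstance(tree, list):
        return tree  # bare list of results
    if isinstance(tree, dict):
        results = tree.get("results")
        if isinstance(results, list):
            return results  # top-level "results" list
        if isinstance(results, dict):
            # Assumed v3 layout: {"results": {"version": 3, "results": [...]}}
            nested = results.get("results")
            if isinstance(nested, list):
                return nested
    return []  # valid JSON, but no recognizable result list
```

Returning an empty list for unrecognized but valid JSON keeps the zero-findings case separate from the hard failure raised for non-JSON input.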