Add Turkish (tr) prompt-injection & data-extraction test cases — AltayDuel#8
Conversation
Signed-off-by: Fevzi Ege Yurtsevenler <egeyurtsevenler@gmail.com> Signed-off-by: Fevzi Ege Yurtsevenler <127763772+fevziegeyurtsevenler@users.noreply.github.com>
emmanuelgjr
left a comment
There was a problem hiding this comment.
Approving. Clean, in-scope contribution: valid JSON, fields match the dataset README schema, all DSGAI IDs are real, and it complies with the defensive-only / no-undisclosed-vendor-vulns rule. Turkish cross-lingual and homoglyph surfaces are a genuine gap-filler.
One small consistency fix applied on top of this branch: TR-ALTAYDUEL-002 and TR-ALTAYDUEL-004 both target the system prompt/configuration, so I've re-mapped them from DSGAI01 to DSGAI15 (and set category to 'System prompt extraction') to match the README's category↔risk convention and the precedent set by TR-001. Thanks for the contribution, @fevziegeyurtsevenler.
Both cases target the system prompt/configuration, so DSGAI15 fits better than DSGAI01, matching the dataset README's category<->risk convention and TR-001.
emmanuelgjr
left a comment
There was a problem hiding this comment.
Re-approving to cover the DSGAI re-mapping commit (5539d5b). Final state LGTM.
Adds an initial set of Turkish-language prompt-injection / data-extraction test cases to the promptinj_dataextraction_testcases dataset (currently being built from scratch).
Source: AltayDuel — an agent-vs-agent Turkish prompt-injection arena (2,594 duel transcripts). https://huggingface.co/datasets/AltaySec/altayduel-transcripts (CC-BY-4.0)
5 cases mapped to DSGAI risks (DSGAI01 / DSGAI11 / DSGAI15), each with secure vs vulnerable expected behavior, severity, prerequisites and source. They exercise Turkish-specific bypass surfaces that English-only test sets miss: system-tag spoofing, translate-then-execute cross-lingual smuggling, Cyrillic-homoglyph filter evasion, and verbatim system-prompt leak.
These are original contributions for defensive validation; no undisclosed vendor vulnerabilities are referenced. Happy to align the file to a final schema and expand the set — flagging the open schema TODO as a good place to coordinate.
Contributor: Fevzi Ege Yurtsevenler, AltaySec.