feat: add Ansible file parsing (playbooks, roles, tasks, handlers) #415
Open
jnovack wants to merge 1 commit into tirth8205:main from
It seems the Julia tests, the imports in main.py, and the import in review.py failed CI prior to my PR. I'll rebase once they are fixed.
What this does
Adds structural parsing for Ansible YAML files, mapping Ansible's semantic concepts onto the existing graph model so that playbooks, roles, and task files show up as navigable nodes and edges alongside the rest of the codebase.
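To make the mapping concrete, here is a minimal sketch of how one play could be turned into nodes and edges. It is not the PR's parser: it assumes the play has already been loaded into plain Python dicts and lists, it represents nodes and edges as tuples, and the helper names (`as_list`, `role_name`, `map_play`) and the choice of the play as the source of role edges are hypothetical.

```python
def as_list(value):
    """notify: may be a single string or a list of strings."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def role_name(entry):
    """roles: entries may be a bare name, {role: name, ...} or {name: ns.role}."""
    if isinstance(entry, str):
        return entry
    return entry.get("role") or entry.get("name")


def map_play(play):
    """Map one play (a dict) to (nodes, edges); both are lists of tuples."""
    play_name = play.get("name", "unnamed play")
    nodes = [("Class", play_name)]  # the play itself
    edges = []
    for entry in play.get("roles", []):
        # Assumption: role edges originate at the play.
        edges.append(("IMPORTS_FROM", play_name, role_name(entry)))
    for section in ("pre_tasks", "tasks", "post_tasks", "handlers"):
        for task in play.get(section, []):
            nodes.append(("Function", task["name"]))
            for handler in as_list(task.get("notify")):
                edges.append(("CALLS", task["name"], handler))
    return nodes, edges


play = {
    "name": "web",
    "hosts": "webservers",
    "roles": ["common", {"role": "nginx", "when": "use_nginx"}, {"name": "acme.tls"}],
    "tasks": [
        {"name": "install nginx",
         "ansible.builtin.package": {"name": "nginx"},
         "notify": "restart nginx"},
        {"name": "write config",
         "ansible.builtin.template": {"src": "nginx.conf.j2", "dest": "/etc/nginx/nginx.conf"},
         "notify": ["restart nginx", "reload firewall"]},
    ],
    "handlers": [
        {"name": "restart nginx",
         "ansible.builtin.service": {"name": "nginx", "state": "restarted"}},
        {"name": "reload firewall",
         "ansible.builtin.command": "firewall-cmd --reload"},
    ],
}

nodes, edges = map_play(play)
for edge in edges:
    print(edge)
```

The sketch requires every task to have a `name:` and ignores `block:` nesting; both cases are covered in the caveats further down.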
The graph now understands:
- Plays (`hosts:` + task sections in a playbook) → `Class` nodes
- Tasks and handlers → `Function` nodes, with the full module name (including FQCNs like `ansible.builtin.package`) stored in `extra.ansible_module`
- `notify:` chains → `CALLS` edges from task to handler, covering both scalar and list forms
- `include_tasks` / `import_tasks` → `IMPORTS_FROM` edges to the included file
- `include_role` / `import_role` → `IMPORTS_FROM` edges to the role name
- `roles:` list in a play → `IMPORTS_FROM` edges, including the `{role: name, when: ...}` dict form and the `{name: ns.role}` collections format
- `import_playbook:` → `IMPORTS_FROM` at the file level
- `vars_files:` → `IMPORTS_FROM` edges so you can trace variable provenance
- `block:` / `rescue:` / `always:` nesting → tasks extracted recursively, parented to the enclosing play
- `pre_tasks:` and `post_tasks:` → treated the same as `tasks:`
- `listen:` on handlers → stored in `extra.ansible_listen` so notify-by-alias can be resolved
- `meta/main.yml` dependencies → `DEPENDS_ON` edges

How detection works
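In outline, detection is a path check followed by a content sniff for paths that are not clearly typed. Here is a minimal sketch of that flow, assuming the file has already been loaded into Python lists and dicts (for example with `yaml.safe_load`). The function names and key sets are illustrative and not taken from the PR, and `PLAY_KEYS` holds only the three play keys this description names.

```python
from pathlib import PurePosixPath

YAML_SUFFIXES = {".yml", ".yaml"}
TRUSTED_DIRS = {"tasks", "handlers", "meta"}     # clearly typed paths
PLAY_KEYS = {"tasks", "gather_facts", "become"}  # illustrative subset


def content_looks_like_playbook(doc):
    """True if a top-level sequence item looks like a play."""
    if not isinstance(doc, list):
        return False
    for item in doc:
        if not isinstance(item, dict):
            continue
        if "import_playbook" in item:
            return True  # unambiguous on its own
        if "hosts" in item and PLAY_KEYS.intersection(item):
            return True  # hosts: plus at least one play key
    return False


def is_ansible(path, doc):
    """Decide whether a YAML file should be parsed as Ansible."""
    p = PurePosixPath(path)
    if p.suffix not in YAML_SUFFIXES:
        return False
    if TRUSTED_DIRS.intersection(p.parts[:-1]):
        return True  # trust the directory name, skip the sniff
    return content_looks_like_playbook(doc)


print(is_ansible("roles/web/tasks/main.yml", [{"name": "x", "debug": {}}]))
print(is_ansible("site.yml", [{"hosts": "all", "tasks": []}]))
print(is_ansible("ci/matrix.yml", [{"hosts": "all"}]))
print(is_ansible("docker-compose.yml", {"services": {}}))
```

Requiring a second play key alongside `hosts:` is what keeps unrelated YAML that happens to contain a `hosts` list from being classified as a playbook.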
`.yml` / `.yaml` files now map to `"yaml"` in the extension table. If the path contains an Ansible directory component (`playbooks/`, `roles/`, `tasks/`, `handlers/`, `group_vars/`, `host_vars/`) or a well-known top-level filename (`site.yml`, `deploy.yml`, etc.), `detect_language()` promotes the result to `"ansible"`.

For clearly typed paths (`tasks/`, `handlers/`, `meta/`) we trust the path and parse directly. For playbooks and unknown paths we run a lightweight content sniff first to avoid false positives: a top-level sequence item must have both `hosts:` and at least one unambiguous Ansible play key (`tasks:`, `gather_facts:`, `become:`, etc.), not just `hosts:` alone. `import_playbook:` by itself is treated as unambiguous and skips the check.

What's NOT here yet
A few things I knowingly left out or couldn't handle without more scope:
- Inventory files (`hosts/*.yml`, INI-format inventories): the structure is completely different; it deserves its own sub-parser
- `vars:` inline dictionaries: individual variable names aren't extracted as nodes; they add noise more than signal at this point
- `listen:` cross-file resolution: the alias is stored, but wiring notify targets across files to their matching `listen:` handler would require a post-parse pass similar to what the ReScript resolver does
- Templated include paths (e.g. `include_tasks: "tasks/{{ ansible_os_family }}.yml"`): the raw template string is stored as the edge target; actual resolution at graph time would require knowing the variable values
- `ansible.cfg` and `requirements.yml`: not parsed; ansible-galaxy role dependencies in requirements files would be a natural follow-on

Known Ansible quirks / caveats
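Two of the caveats in this section, the fallback task name and the `with_*` guard, are easier to see in code. The following is a hypothetical sketch, not the PR's implementation: `META_KEYS` is a small illustrative subset of task keywords, and `line` is assumed to come from the YAML node's position (PyYAML nodes expose a 0-based `start_mark.line`).

```python
# Task keywords that are never module names (illustrative subset only).
META_KEYS = {"name", "when", "notify", "register", "become", "tags", "vars", "loop"}


def module_key(task):
    """Return the first key that is not a task keyword or a with_* loop."""
    for key in task:
        if key in META_KEYS or key.startswith("with_"):
            continue  # the prefix guard also covers with_* variants not listed above
        return key
    return None


def task_name(task, line):
    """Use name: when present, otherwise a line-based fallback."""
    return task.get("name") or f"task@line{line}"


task = {
    "with_items": ["nginx", "curl"],
    "ansible.builtin.package": {"name": "{{ item }}"},
    "when": "install_packages",
}
print(module_key(task))     # ansible.builtin.package
print(task_name(task, 42))  # task@line42
```

The prefix guard means a module whose name began with `with_` would be skipped as well, which is the intentional trade-off noted in the list.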
- `block:` tasks with no `name:` get a fallback name like `task@line42`. These are common in real playbooks and show up in the graph, but the name is less useful than a named task.
- `with_*:` loop keywords (`with_items`, `with_first_found`, etc.) are correctly skipped when scanning for the module key. Any `with_*` variant not in the explicit meta-key list is also skipped (by the `startswith("with_")` guard), which is intentional.
- Vault-encrypted values: a `!vault |` block is read as a scalar. No special handling needed.
- A `handlers:` section in a play and a standalone handlers file are both parsed the same way, so handler nodes from inline play handlers and from `roles/myrole/handlers/main.yml` both appear as `Function` nodes with `ansible_kind=handler`.
- Multi-document YAML (`---` separated documents in one file): `yaml.compose()` only reads the first document. Multi-document Ansible files are rare in practice, but worth noting.

Testing
Ran against a real production Ansible repo (20 roles, 27 playbooks managing Docker Swarm, Elasticsearch, and supporting infrastructure) to validate the patterns before writing fixtures. Fixtures are sanitized versions of patterns found there.
36 new tests across three classes (`TestAnsiblePlaybookParsing`, `TestAnsibleTasksParsing`, `TestAnsibleMetaParsing`), all passing. Pre-existing test failures (Julia, Java, PHP, GDScript parsers) are unchanged.

🤖 Summary generated with Claude Code