Commit 7f53552

jbloom and claude authored
add color_tree_by parameter for Auspice-style tree coloring (#2)
Introduces a color_tree_by parameter that colors the tree's branches and tip circles by:

- A node_attrs key like "subclade" — colored by node_attrs[<key>].value on each node.
- "genotype:<GENE>:<SITE>" — colored by the inferred amino-acid (or nucleotide) state at the site, computed from branch_attrs.mutations walked from the root.
- "genotype:<GENE>:<SITE1>,<SITE2>,..." — same but for a haplotype across sites; sites that don't vary in the tree drop out of the label; if every requested site is invariant, every node gets a single "<no variation>" category.

Color and ordering are chosen to match Nextstrain views for the same tree: when the Auspice JSON defines meta.colorings[<key>].scale that palette is used; otherwise the same per-N palette Auspice's frontend uses (reproduced in _color.py with attribution to AGPL-licensed Auspice) fills in. Categories are sorted by descending frequency (ties broken alphabetically), matching Auspice's sortedDomain. The "unknown" category renders in gray and is hidden from the legend when only internal nodes lack the attribute.

Other changes bundled in:

- TreeNode now carries node_attrs and branch_attrs for downstream consumers; load_auspice_with_meta sibling helper exposes the JSON's top-level meta dict alongside the parsed root.
- Default tree_line_width bumped from 1.5 to 2 and tree_node_size from 28 to 45 with explicit opacity=1, since the prior defaults were tuned for unicolor black trees and read poorly when colored.
- New "Color the tree" subsection in docs/examples.md with a second H3N2 example colored by genotype HA1:158.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
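The genotype inference and frequency ordering described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code added in this commit: the helper names (`site_mutation`, `root_state`, `infer_states`, `sorted_categories`) are hypothetical, nodes are plain dicts shaped like Auspice JSON, each branch is assumed to carry at most one mutation per site, and the toy tree is made-up data.

```python
import re
from collections import Counter

# Mutation strings look like "N158K": from-state, site, to-state.
MUTATION_RE = re.compile(r"^([A-Z*-])(\d+)([A-Z*-])$")


def site_mutation(node, gene, site):
    """Return (from_state, to_state) for a mutation at gene:site on this branch, else None."""
    mutations = node.get("branch_attrs", {}).get("mutations", {}).get(gene, [])
    for mutation in mutations:
        match = MUTATION_RE.match(mutation)
        if match and int(match.group(2)) == site:
            return match.group(1), match.group(3)
    return None


def root_state(root, gene, site):
    """Ancestral state: the from-state of the first mutation met on any root-to-tip path."""
    stack = [root]
    while stack:
        node = stack.pop()
        hit = site_mutation(node, gene, site)
        if hit:
            return hit[0]
        stack.extend(node.get("children", []))
    return None  # the site never mutates in this tree


def infer_states(root, gene, site):
    """Map node name -> inferred state by carrying the state down from the root."""
    states = {}
    stack = [(root, root_state(root, gene, site))]
    while stack:
        node, state = stack.pop()
        hit = site_mutation(node, gene, site)
        if hit:
            state = hit[1]
        states[node["name"]] = state
        stack.extend((child, state) for child in node.get("children", []))
    return states


def sorted_categories(values):
    """Order categories by descending frequency, breaking ties alphabetically."""
    counts = Counter(values)
    return sorted(counts, key=lambda category: (-counts[category], category))


# Toy tree (hypothetical data) with two mutations at HA1 site 158.
toy_tree = {
    "name": "root",
    "children": [
        {"name": "A", "branch_attrs": {"mutations": {"HA1": ["N158K"]}}},
        {
            "name": "B",
            "branch_attrs": {"mutations": {"HA1": ["T135A"]}},
            "children": [
                {"name": "C", "branch_attrs": {"mutations": {"HA1": ["N158D"]}}},
                {"name": "D"},
            ],
        },
    ],
}
toy_states = infer_states(toy_tree, "HA1", 158)
toy_order = sorted_categories(toy_states.values())
```

In this toy tree the root state is recovered as N from the first mutation encountered, so nodes above or beside the mutated branches still get a state. A site with no mutations anywhere yields `None` for every node, which corresponds to the invariant case the commit message describes.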
1 parent ef5b46c commit 7f53552

8 files changed

Lines changed: 1560 additions & 42 deletions

CHANGELOG.md

Lines changed: 14 additions & 0 deletions
@@ -16,6 +16,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `strain_label_font_size` (default `10`), `strain_label_font_weight`
 (default `"normal"`), and `shift_tree_loc` (default `0`) for tuning
 the size, weight, and placement of the connected labels.
+- `color_tree_by` (default `None`): color the tree's branches and tip
+circles by an Auspice node attribute (e.g. `"subclade"`) or by the
+inferred genotype state at one or more sites
+(e.g. `"genotype:HA1:158"` or `"genotype:HA1:158,189"`). Colors,
+category ordering, and the bottom-of-plot legend match the
+Nextstrain view of the same tree.
+
+### Changed
+
+- Default `tree_line_width` bumped from `1.5` to `2`, default
+`tree_node_size` from `28` to `45`. Tree branch lines and tip
+circles are now drawn at full opacity. The thicker / fuller
+defaults read better when the tree is colored (the prior values
+were tuned for unicolor black trees).
 
 ## [0.1.0] - 2026-05-04
 

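The two forms of `color_tree_by` named in the changelog entry could be told apart along these lines. This is a hedged sketch only: `ColorSpec` and `parse_color_tree_by` are hypothetical names, not part of the package's API, and the package's actual validation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColorSpec:
    """Hypothetical parsed form of a color_tree_by value."""

    kind: str          # "attr" for a node_attrs key, "genotype" for sites in a gene
    key: str           # the node_attrs key, or the gene name
    sites: tuple = ()  # 1-based site numbers; empty for the "attr" form


def parse_color_tree_by(value):
    """Split 'genotype:<GENE>:<SITE>[,<SITE>...]' from a plain attribute name."""
    if not value.startswith("genotype:"):
        return ColorSpec(kind="attr", key=value)
    parts = value.split(":")
    if len(parts) != 3 or not parts[1] or not parts[2]:
        raise ValueError(f"expected 'genotype:<GENE>:<SITES>', got {value!r}")
    # int() raises ValueError on a non-numeric site, which is what we want here.
    sites = tuple(int(site) for site in parts[2].split(","))
    return ColorSpec(kind="genotype", key=parts[1], sites=sites)


single_site = parse_color_tree_by("genotype:HA1:158")
haplotype = parse_color_tree_by("genotype:HA1:158,189")
attribute = parse_color_tree_by("subclade")
```

Keeping the parsed form in a small frozen dataclass means later stages can branch on `kind` without re-parsing the string.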
docs/examples.md

Lines changed: 41 additions & 0 deletions
@@ -103,6 +103,11 @@ chart-builder wrote them (fonts, ticks, axis title, and all), and the
 tree's dashed leader lines stop at the tree panel's chart-facing
 edge.
 
+The H3N2 example above is rendered with `color_tree_by="subclade"`,
+which colors the tree's branches and tip circles by the
+`node_attrs.subclade` value at each node and adds a categorical legend
+below the plot. See "Color the tree" below for the full set of options.
+
 ### Optional: connect leaders all the way to the labels
 
 If you'd prefer the dashed leaders to run flush into the strain
@@ -131,6 +136,40 @@ CLI flags: `--connect-leader-to-label --strain-label-font-size 9
 --shift-tree-loc 60`. In Python:
 `connect_leader_to_label=True, strain_label_font_size=9, shift_tree_loc=60`.
 
+### Color the tree
+
+Pass `color_tree_by` to color the tree's branches and tip circles by
+any property the Auspice JSON exposes — broadly, anything that
+appears in the "Color By" dropdown on the Nextstrain view of the same
+tree. Two forms are supported:
+
+- A named attribute, e.g. `color_tree_by="subclade"` (used in the
+example above). Common alternatives include `"clade_membership"`,
+`"region"`, `"country"` — whichever the tree provides.
+- A genotype at one or more sites in a gene. For a single site,
+`color_tree_by="genotype:HA1:158"` colors each tip by the amino
+acid at HA1 site 158. A comma-separated list gives a haplotype:
+`"genotype:HA1:158,189"`. Sites that don't vary in the tree are
+dropped from the haplotype label.
+
+Colors match what you'd see on the Nextstrain view of the same tree —
+either from the JSON's palette information when the build provides it,
+or from the same default palette Auspice uses when it doesn't.
+Categories are ordered by descending frequency in both cases. Missing
+values render in gray, and the legend is drawn at the bottom of the
+combined plot.
+
+The example below colors the same H3N2 chart by genotype at HA1
+site 158, which has two mutations in the tree (`N158K`, `N158D`) and
+so renders three states (N, K, D):
+
+![H3N2 combined chart, colored by genotype HA1:158](images/h3n2_combined_genotype_158.svg)
+
+[Open the interactive chart in a new tab →](charts/h3n2_combined_genotype_158.html){target="_blank"}
+
+CLI flag: `--color-tree-by genotype:HA1:158`. In Python:
+`color_tree_by="genotype:HA1:158"`.
+
 ### Reproduce — command line
 
 ```bash
@@ -145,6 +184,7 @@ tree-annotated-plot \
 --tree-size 140 \
 --scale-bar \
 --branch-length-units substitutions \
+--color-tree-by subclade \
 --output examples/data/h3n2_combined.json
 ```
 
@@ -162,6 +202,7 @@ out = tree_annotated_plot.plot(
     tree_size=140,
     scale_bar=True,
     branch_length_units="substitutions",
+    color_tree_by="subclade",
 )
 ```
 

scripts/generate_docs_assets.py

Lines changed: 37 additions & 3 deletions
@@ -110,16 +110,24 @@ def _render_kikawa() -> None:
         # Render the bare chart (no tree) so the docs page can show what
         # the chart looks like before tree-annotated-plot wraps it.
         _save_pair(chart, f"{basename}_chart_only")
-        out = tree_annotated_plot.plot(
-            DATA_DIR / f"flu-seqneut-2025to2026_{subtype}.json",
-            chart,
+        plot_kwargs = dict(
             chart_strain_field="axis_label",
             tree_strain_field="derived_haplotype",
             branch_length="div",
             tree_size=140,
             scale_bar=True,
             branch_length_units="substitutions",
         )
+        if subtype == "H3N2":
+            # Color H3N2 by subclade so the docs SVG matches what users see
+            # on Nextstrain. The Auspice JSON's meta.colorings.subclade has
+            # no `scale` defined, so colors come from the default palette.
+            plot_kwargs["color_tree_by"] = "subclade"
+        out = tree_annotated_plot.plot(
+            DATA_DIR / f"flu-seqneut-2025to2026_{subtype}.json",
+            chart,
+            **plot_kwargs,
+        )
         _save_pair(out, f"{basename}_combined")
 
     # H3N2 again, with `connect_leader_to_label=True` and a 9-point label
@@ -146,9 +154,35 @@ def _render_kikawa() -> None:
         connect_leader_to_label=True,
         strain_label_font_size=9,
         shift_tree_loc=60,
+        color_tree_by="subclade",
     )
     _save_pair(out, "h3n2_combined_label_connect")
 
+    # H3N2 once more, colored by genotype at HA1 site 158: same chart and
+    # default layout as `h3n2_combined`, with `color_tree_by` switched to
+    # the genotype form. Site 158 has two mutations (N158K, N158D) in the
+    # tree, so this renders three states (N, K, D).
+    h3n2_chart_genotype = builder.make_chart(
+        subtype="H3N2",
+        chart_type="iqr",
+        titers=titers,
+        viruses=viruses,
+        metadata=metadata,
+        all_cohorts=all_cohorts,
+    )
+    out = tree_annotated_plot.plot(
+        DATA_DIR / "flu-seqneut-2025to2026_H3N2.json",
+        h3n2_chart_genotype,
+        chart_strain_field="axis_label",
+        tree_strain_field="derived_haplotype",
+        branch_length="div",
+        tree_size=140,
+        scale_bar=True,
+        branch_length_units="substitutions",
+        color_tree_by="genotype:HA1:158",
+    )
+    _save_pair(out, "h3n2_combined_genotype_158")
+
 
 def main() -> None:
     """Render every example to SVG + interactive HTML under `docs/`."""
