fix: render text/html display_data output instead of raw IPython repr #58

Open
merlinran wants to merge 1 commit into googlecolab:main from merlinran:fix/html-display-output

@merlinran

When a kernel cell produces a rich display object like
IPython.display.HTML, the display_data message includes both
text/html (the actual content) and text/plain (the Python repr,
e.g. <IPython.core.display.HTML object>).

All three display_output paths — colab exec, colab run, and
the REPL — only checked for text/plain in the data dict and printed
that raw repr, which is useless to the user.
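
To make the failure mode concrete, here is a minimal sketch of the `content` of a `display_data` message for an HTML display object, plus a handler that only looks at `text/plain`. The name `old_display_output` and the sample HTML payload are illustrative assumptions, not code from this repository.

```python
from typing import Optional

# Content of a Jupyter display_data message for an HTML display object.
# The text/html payload is made up for illustration.
content = {
    "data": {
        "text/html": "<b>loss: 0.42</b>",
        "text/plain": "<IPython.core.display.HTML object>",
    },
    "metadata": {},
}


def old_display_output(content: dict) -> Optional[str]:
    """Hypothetical stand-in for the old behavior: only text/plain is checked."""
    data = content.get("data", {})
    if "text/plain" in data:
        return data["text/plain"]
    return None


# The useful text/html payload is never consulted, so only the repr comes back.
print(old_display_output(content))
```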

This adds a text/html branch ahead of the text/plain fallback
in all three locations. HTML tags are stripped for terminal display
via html.parser.HTMLParser. The result isn't a full HTML renderer,
but it surfaces the actual content — which is a clear improvement
over the current behavior of printing the Python object repr.
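
The tag-stripping approach described above can be sketched roughly as follows. This is an approximation under assumptions: `strip_html` and `pick_output` are hypothetical names, and the actual diff may structure the branches differently.

```python
from html.parser import HTMLParser
from typing import Optional


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops every tag."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data: str) -> None:
        self._chunks.append(data)

    def text(self) -> str:
        return "".join(self._chunks)


def strip_html(html: str) -> str:
    """Return the text content of an HTML fragment with all tags removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


def pick_output(data: dict) -> Optional[str]:
    """Prefer text/html (stripped) and fall back to text/plain."""
    if "text/html" in data:
        return strip_html(data["text/html"])
    if "text/plain" in data:
        return data["text/plain"]
    return None
```

Whitespace between tags is preserved as-is, which is why the "After" output in the next comment contains long runs of blank lines.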

Fixes in:

  • execution.py: display_output()
  • automation.py: inline output loop in run_automation()
  • repl.py: ColabREPL.display_output()

@merlinran

Author

Before:

Starting training...

<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
...

After:

Starting training...





      [   2/1075 : < :, Epoch 0.00/5]




      Step
      Training Loss
      Validation Loss









      [   3/1075 00:08 < 2:32:58, 0.12 it/s, Epoch 0.01/5]




      Step
      Training Loss
      Validation Loss









      [   4/1075 00:16 < 2:31:04, 0.12 it/s, Epoch 0.01/5]
...

@teeler

teeler commented Jun 15, 2026

Contributor

Hi there,

Thanks for the PR - please glance at https://github.com/googlecolab/google-colab-cli/blob/main/CONTRIBUTING.md

That said - I do like this idea, but I'm wondering if there is a better/more exciting alternative than just stripping out the tags. (This is why I prefer to have the discussion first, btw, so we can sort out these details before having folks write any code.)

Instead of stripping, I'm wondering if there's a nicer way to render the HTML outputs in the terminal, e.g. using something like rich?

@merlinran

Author

Yeah, I mostly meant this as a starting point for the discussion. I'm not very familiar with this domain. A quick search suggests that rich is great at exporting content as HTML, but not at taking HTML as input. Would https://pypi.org/project/html2text/ be a good fit?

@merlinran

merlinran commented Jun 16, 2026

Author

Updated to use html2text + rich (left text/latex unhandled). Tested with test_display.py:

uv run colab new -s display-test
uv run colab exec -s display-test -f /tmp/test_display.py
uv run colab stop -s display-test
[image]

@teeler

teeler commented Jun 16, 2026

Contributor

@merlinran what do you think? I think that looks pretty good!

@sethtroisi any thoughts?

@merlinran

Author

This works well enough for me. Just tested with a fine-tuning task.

Starting training...

[ 2/1075 : < :, Epoch 0.00/5]


 Step  Training Loss  Validation Loss
 ────────────────────────────────────

[ 3/1075 00:08 < 2:39:12, 0.11 it/s, Epoch 0.01/5]


 Step  Training Loss  Validation Loss
 ────────────────────────────────────

[ 4/1075 00:17 < 2:36:37, 0.11 it/s, Epoch 0.01/5]


 Step  Training Loss  Validation Loss
 ────────────────────────────────────

[ 5/1075 00:26 < 2:37:36, 0.11 it/s, Epoch 0.02/5]
...

@teeler

teeler commented Jun 16, 2026

Contributor

Yeah, html2text was a good find; I think I've used that one before.

Can we add some tests here? If you're using an agent, it can read AGENTS.md; I think there is some testing guidance in there.

@merlinran force-pushed the fix/html-display-output branch from 55d7694 to ab9576c on June 16, 2026 07:00

Previously text/html display_data output was either ignored (log) or
printed raw. Now:

- New dependency: html2text (>=2024.2.26)
- New util: render_display_data() in utils.py — returns Rich renderables
  (Markdown or Text) using priority text/markdown > text/html > text/plain.
  text/html is converted via html2text; text/plain is wrapped with
  Text.from_ansi to handle embedded ANSI escapes.
- All three call sites (exec.py, automation.py, repl.py) refactored to
  use the shared function instead of duplicated if/elif chains.
- Each module uses a shared Console instance (_console / self.console)
  to avoid re-allocating per-output during streaming.
- Removed direct html2text and Markdown imports from the three call sites
  — they now just pass the renderable to Console.print().
- Tests in test_utils.py cover Markdown return, Text return, priority,
  and no-text fallback.
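
The priority order in this commit message can be illustrated with a stdlib-only sketch. Everything here is an assumption for illustration: `select_display_text`, the `(kind, text)` return shape, and the injected `html_to_markdown` callable are hypothetical and are not the signature of `render_display_data()`. In practice the converter would presumably be `html2text.html2text`, with the results wrapped in `rich.markdown.Markdown` or `rich.text.Text.from_ansi`.

```python
from typing import Callable, Optional, Tuple

# Highest priority first, matching the order given in the commit message.
MIME_PRIORITY = ("text/markdown", "text/html", "text/plain")


def select_display_text(
    data: dict,
    html_to_markdown: Callable[[str], str],
) -> Optional[Tuple[str, str]]:
    """Return (kind, text) for the best available mime type, or None.

    kind is "markdown" or "plain". Returning None when no text mime type
    is present is an assumption about the no-text fallback.
    """
    for mime in MIME_PRIORITY:
        if mime not in data:
            continue
        value = data[mime]
        if mime == "text/markdown":
            return ("markdown", value)
        if mime == "text/html":
            # HTML is converted to Markdown before rendering.
            return ("markdown", html_to_markdown(value))
        return ("plain", value)
    return None


def toy_converter(html: str) -> str:
    """Toy converter so the sketch runs without third-party packages."""
    return html.replace("<b>", "**").replace("</b>", "**")


print(select_display_text({"text/html": "<b>x</b>", "text/plain": "repr"}, toy_converter))
```

Injecting the converter keeps the selection logic testable without html2text installed, which is one way the four test cases listed above (Markdown return, Text return, priority, no-text fallback) could be exercised.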

@merlinran force-pushed the fix/html-display-output branch from ab9576c to 924f901 on June 16, 2026 07:01

@merlinran

Author

Extracted the logic into a single utility with test coverage. Also squashed into a single commit for better history.

