fix: render text/html display_data output instead of raw IPython repr #58
merlinran wants to merge 1 commit into
|
Before: After: |
|
Hi there, thanks for the PR. Please glance at CONTRIBUTING.md (https://github.com/googlecolab/google-colab-cli/blob/main/CONTRIBUTING.md). That said, I do like this idea, but I'm wondering if there is a better/more exciting alternative than just stripping out the tags (this is why I prefer to have the discussion, by the way, so we can sort out these details before having folks write any code). Instead of stripping, I'm wondering if there's a nicer way to render the HTML outputs in the terminal, e.g. using something like rich? |
|
Yeah I mostly use this as an idea for the discussion. I'm not very familiar with this domain. A quick search suggests that |
|
Updated to use
|
|
@merlinran what do you think? I think that looks pretty good! @sethtroisi any thoughts? |
|
This works well enough for me. Just tested it with a fine-tuning task. |
|
Yeah, html2text was a good find; I think I've used that one before. Can we add some tests here? If you're using an agent, it can read AGENTS.md; I think there is some testing guidance in there. |
55d7694 to ab9576c
Previously text/html display_data output was either ignored (log) or printed raw. Now:

- New dependency: html2text (>=2024.2.26)
- New util: render_display_data() in utils.py, which returns Rich renderables (Markdown or Text) using priority text/markdown > text/html > text/plain. text/html is converted via html2text; text/plain is wrapped with Text.from_ansi to handle embedded ANSI escapes.
- All three call sites (exec.py, automation.py, repl.py) refactored to use the shared function instead of duplicated if/elif chains.
- Each module uses a shared Console instance (_console / self.console) to avoid re-allocating per-output during streaming.
- Removed direct html2text and Markdown imports from the three call sites; they now just pass the renderable to Console.print().
- Tests in test_utils.py cover Markdown return, Text return, priority, and no-text fallback.
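
For readers who want to see the MIME priority from the commit message in isolation, here is a minimal stdlib-only sketch. It is not the PR's render_display_data(): the name pick_text_payload and its tuple return value are hypothetical, and the real utility is described as returning Rich renderables built with html2text and Text.from_ansi, both of which are left out here.

```python
from typing import Mapping, Optional, Tuple

# Priority order as stated in the commit message:
# text/markdown > text/html > text/plain.
MIME_PRIORITY = ("text/markdown", "text/html", "text/plain")


def pick_text_payload(data: Mapping[str, object]) -> Optional[Tuple[str, str]]:
    """Return (mime_type, payload) for the highest-priority text entry.

    Hypothetical helper for illustration only. Returns None when the
    bundle carries no non-empty textual representation.
    """
    for mime in MIME_PRIORITY:
        payload = data.get(mime)
        if isinstance(payload, str) and payload:
            return mime, payload
    return None


if __name__ == "__main__":
    bundle = {
        "text/plain": "<IPython.core.display.HTML object>",
        "text/html": "<b>loss</b>: 0.12",
    }
    print(pick_text_payload(bundle))
```

A caller would then branch on the returned MIME type to decide how to convert the payload for the terminal, which is presumably where the html2text and Rich steps would slot in.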
ab9576c to 924f901
|
Extracted the logic into a single utility with test coverage. Also squashed into a single commit for better history. |

When a kernel cell produces a rich display object like `IPython.display.HTML`, the `display_data` message includes both `text/html` (the actual content) and `text/plain` (the Python repr, e.g. `<IPython.core.display.HTML object>`). All three `display_output` paths (`colab exec`, `colab run`, and the REPL) only checked for `text/plain` in the data dict and printed that raw repr, which is useless to the user.

This adds a `text/html` branch ahead of the `text/plain` fallback in all three locations. HTML tags are stripped for terminal display via `html.parser.HTMLParser`. The result isn't a full HTML renderer, but it surfaces the actual content, which is a clear improvement over the current behavior of printing the Python object repr.

Fixes in:

- `execution.py`: `display_output()`
- `automation.py`: inline output loop in `run_automation()`
- `repl.py`: `ColabREPL.display_output()`
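
The tag-stripping approach described above can be sketched roughly as follows. This is an illustration, not the code from the PR: the names TagStripper, strip_tags, and display_text are hypothetical, and the choices to skip script/style bodies and to insert newlines for a few block-level tags are assumptions.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect text nodes and drop the markup around them."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in ("br", "p", "div", "tr", "li"):
            # Assumption: a line break is a reasonable stand-in for these tags.
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks).strip()


def strip_tags(html):
    """Return the visible text of an HTML fragment."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text()


def display_text(data):
    """Prefer text/html over the text/plain repr, as the description outlines."""
    if "text/html" in data:
        return strip_tags(data["text/html"])
    return data.get("text/plain", "")


if __name__ == "__main__":
    message_data = {
        "text/html": "<b>loss</b>: 0.12",
        "text/plain": "<IPython.core.display.HTML object>",
    }
    print(display_text(message_data))
```

With the `text/html` branch first, the user sees the content of the HTML object rather than its repr; payloads without `text/html` still fall back to `text/plain`.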