requests: HTTP/1.1 with Content-Length and streaming raw#1124
requests: HTTP/1.1 with Content-Length and streaming raw#1124pablogventura wants to merge 11 commits into
Conversation
Introduce read_status_line(), read_headers(), and BodyStream with Content-Length and until-close modes. No request() changes yet. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
Wire _http into request(); send HTTP/1.1; return BodyStream as .raw without reading or closing the socket in request(). Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
Adapt outbound tests to HTTP/1.1; add Content-Length and incremental .raw.read() coverage. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
Document HTTP/1.1 requests and Content-Length body reads via .raw. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
Read part of the body via .raw then the remainder via .content; avoid "hel" token flagged by codespell. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
ESP32 hardware testingTested on device connected via
Mock tests (
|
| Test | Result |
|---|---|
GET /get with Content-Length body |
200, 11 bytes |
GET /bytes/8 via .content |
abcdefgh, socket closed after read |
GET /bytes/10 incremental .raw.read(3) + .raw.read(3) + .raw.read() |
3+3+4 bytes → 0123456789 |
GET /chunked |
ValueError: Unsupported Transfer-Encoding: chunked (expected for v1) |
Server log confirmed outbound requests use HTTP/1.1:
GET /get HTTP/1.1
GET /bytes/8 HTTP/1.1
GET /bytes/10 HTTP/1.1
GET /chunked HTTP/1.1
Notes
httpbin.orgreturned 503 from this network, so live body tests used a local test server instead.- Mock tests replace
sys.modules["socket"]; live tests reloadusocketafterwards before hitting the network.
| @@ -1,5 +1,7 @@ | |||
| import socket | |||
|
|
|||
| from ._http import open_body, read_headers, read_status_line | |||
There was a problem hiding this comment.
I don't think these functions need to be in a separate module. MicroPython aims to be minimal, and we prefer minimal code over breaking things up into lots of modules.
So, please put everything from _http.py back in this file. That will also make it easier to review the changes.
Per review feedback: keep the package in a single module for easier review. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
|
Done — I moved everything from Re-ran Let me know if anything else should change before you take another look. |
|
Also re-tested on the same ESP32 as in my earlier comment (ESP32-D0WD on |
| return None | ||
|
|
||
|
|
||
| def read_status_line(stream): |
There was a problem hiding this comment.
This function is used only once so please inline it at its point of use.
| return status, reason | ||
|
|
||
|
|
||
| def read_headers(stream, parse_headers=True): |
There was a problem hiding this comment.
This function is used only once so please inline it at its point of use.
| self._sock.close() | ||
|
|
||
|
|
||
| def open_body(stream, headers): |
There was a problem hiding this comment.
This function is used only once so please inline it at its point of use.
Fold read_status_line, read_headers, and open_body into request() per review feedback. Keep BodyStream as the only helper class. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
|
Thanks for the review, @dpgeorge — much appreciated. I've inlined CI should re-run shortly. Happy to adjust anything else. |
Use resp_d.items() when scanning for Location header. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
|
Re-tested after inlining the helpers and the ruff fix (commit d3c3628): Unix port - ESP32 (ESP32-D0WD,
CI re-running on d3c3628. |
| reason = "" | ||
| if len(l) > 2: | ||
| reason = l[2].rstrip() | ||
| line = s.readline() |
There was a problem hiding this comment.
Please keep original code unchanged if possible. In this case it's just renaming l to line, and that doesn't really need to change. It makes it much easier to review if the diff is minimal.
| reason = l[2].rstrip() | ||
| line = s.readline() | ||
| if not line: | ||
| raise ValueError("HTTP error: empty status line") |
There was a problem hiding this comment.
I don't think this check is needed. If line is empty then the len(parts) < 2 will catch that.
| line = s.readline() | ||
| if not line: | ||
| raise ValueError("HTTP error: empty status line") | ||
| parts = line.split(None, 2) |
There was a problem hiding this comment.
Please keep this as l = l.split(None, 2) to keep the diff minimal (it's also more efficient bytecode to reuse the variable).
| l = s.readline() | ||
| if not l or l == b"\r\n": | ||
| line = s.readline() | ||
| if not line or line == b"\r\n": |
There was a problem hiding this comment.
Please keep these 2 lines as they were.
| status = int(parts[1]) | ||
| reason = parts[2].rstrip() if len(parts) > 2 else "" | ||
|
|
||
| resp_d = {} |
There was a problem hiding this comment.
This can be set to None and save a dict alloc if parse_headers is False. That's actually how the code was, so best to keep it as it was.
Keep l-based status/header parsing, resp_d = None when headers disabled, and in-loop Location/Transfer-Encoding checks. Only add BodyStream and HTTP/1.1 where needed. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
|
Thanks again for the detailed review, @dpgeorge. Addressed the latest round in the new commit:
Re-tested on unix port: Happy to adjust further if needed. |
| remaining = None | ||
| if resp_d is not None: | ||
| for k, v in resp_d.items(): | ||
| if k.lower() == "content-length": |
There was a problem hiding this comment.
To save code size, this check for content-length can probably go up in the code where resp_d is being populated.
| if k.lower() == "content-length": | ||
| remaining = int(v) | ||
| break | ||
| resp = Response(BodyStream(s, remaining)) |
There was a problem hiding this comment.
If remaining is None I suggest not wrapping in BodyStream, it doesn't need it. So that means the behaviour is unchanged from before if there's no content-length, and the BodyStream class can be simpler (because remaining is now non-None).
Detect Content-Length while parsing headers; pass the socket through unchanged otherwise. Simplify BodyStream to require a byte count. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
|
Thanks for the follow-up comments, @dpgeorge. Addressed in commit 8b20e5c:
Re-tested on unix port: |
| data = self._sock.read(n) | ||
| self._remaining -= len(data) | ||
| if self._remaining > 0 and not data: | ||
| raise ValueError("Connection closed before Content-Length satisfied") |
There was a problem hiding this comment.
How does CPython's requests behave in this situation, does it raise an exception, or just return short data?
(Also, the check can be just if not data:.)
Use `if not data:` when a read returns empty before Content-Length is satisfied. Signed-off-by: Pablo Ventura <pablogventura@gmail.com>
|
Good question. I checked CPython 3.12 with requests 2.31 / urllib3 2.0.7 on a response with Content-Length: 10 but only 3 body bytes before close:
So streaming behaviour is: partial reads are fine; exception when you try to read past EOF with Content-Length still outstanding. That matches what BodyStream does (ValueError instead of IncompleteRead, for simplicity). Applied your suggestion in 6372c93: the check is now |
Summary
_httpmodule.Content-Lengthresponse bodies without buffering inrequest()..rawas a liveBodyStreamwrapper;.contentremains lazy.Closes the approach in #1119 (which buffered the full body and closed the socket). Chunked responses,
max_body, and relative redirects are intentionally left for follow-up PRs.Testing
micropython test_requests.py(unix port) — 13 mock tests including incremental.raw.read().Trade-offs and Alternatives
ValueError(unchanged from master until a follow-up PR).BodyStreamadds a small wrapper object (~two fields) instead of buffering the body inrequest().stream=Trueis not implemented yet (issue requests: stream parameter is noop #777).Generative AI
I used generative AI tools when creating this PR, but a human has checked the
code and is responsible for the code and the description above.