Conversation
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3951/docs/iroh/ Last updated: 2026-04-14T11:45:13Z
);

// ALPN filter: runs after the connection is established for both
// relay and direct connections.
Maybe the EndpointId filter should be run here for IncomingAddr::Ip now that the endpoint id is known for ip connections as well? If not, it should be clearly documented that the endpoint id filter is not run for ip connections. I think it would be simpler if we'd run it here, so that you can impl just that filter if all you want is to filter by endpoint id.
Maybe just naming? Call the other relay filter?
This is the one that is going to be called for every request, the one that takes just endpoint id is the fastest way to intercept a relay request.
You can of course use this one to filter or rate limit by endpoint id.
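To make the "rate limit by endpoint id" idea concrete, here is a minimal sketch of a per-endpoint token bucket that such a filter could consult. Everything in it is an assumption for illustration: `EndpointId` is a stand-in byte array rather than iroh's real type, `PerEndpointLimiter` and `allow` are hypothetical names, and none of this is the filter API added in this PR. The clock is passed in explicitly so the behaviour is easy to test.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Stand-in for an endpoint id (hypothetical alias, not iroh's type).
type EndpointId = [u8; 32];

/// Token bucket state for a single endpoint.
struct Bucket {
    tokens: f64,
    last_refill: Instant,
}

/// Hypothetical per-endpoint limiter; not part of iroh.
pub struct PerEndpointLimiter {
    capacity: f64,
    refill_per_sec: f64,
    buckets: HashMap<EndpointId, Bucket>,
}

impl PerEndpointLimiter {
    /// `capacity` is the burst size, `refill_per_sec` the sustained rate.
    pub fn new(capacity: u32, refill_per_sec: f64) -> Self {
        Self {
            capacity: f64::from(capacity),
            refill_per_sec,
            buckets: HashMap::new(),
        }
    }

    /// Returns true if a request from `id` arriving at `now` is allowed.
    pub fn allow(&mut self, id: EndpointId, now: Instant) -> bool {
        let capacity = self.capacity;
        // A new endpoint starts with a full bucket.
        let bucket = self.buckets.entry(id).or_insert(Bucket {
            tokens: capacity,
            last_refill: now,
        });
        // Refill proportionally to the elapsed time, capped at capacity.
        let elapsed = now
            .saturating_duration_since(bucket.last_refill)
            .as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * self.refill_per_sec).min(capacity);
        bucket.last_refill = now;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

A filter would call `allow(remote_id, Instant::now())` and reject or ignore the request when it returns false. A real deployment would also need to evict idle buckets so the map does not grow without bound.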
Also, it just now occurs to me that maybe all this filtering could be done through EndpointHooks with some new hooks instead of adding it as a new concept to the Router?
Yes we need more hooks. I am using Irpc-iroh and I need a hook to validate JWT token on each request.
Also, it just now occurs to me that maybe all this filtering could be done through EndpointHooks with some new hooks instead of adding it as a new concept to the Router?
I will take a look, but I am not sure about this. We already have all the stages of the state machine exposed with the Incoming -> Accepting -> Connection, with various ways to retry and reject. A state machine is vastly superior to a hook if you want maximum flexibility. And if you want something predefined and more opinionated, that's the router.
I am using irpc-iroh with the router. I want to verify/decode a JWT token on each method request (not connection request) and pass that data (user ID, endpoint ID, user role, etc.) to the RPC method for request authorization. Is this somehow possible?
So the JWT token comes with each request, or on a separate ALPN?
The latter is already possible. You could have an auth ALPN where you just send the JWT token, then have a map from endpoint id to data that you share between the auth ALPN and the service ALPN.
But I am not sure what exactly you want to do.
I want to authorize each RPC method request and pass the decoded user info (id, role) to the RPC method, so the JWT token comes with each request. Is there a better method?
I think it is preferable to do this separately. Otherwise you will have to extend every single protocol with auth.
Basically, you have a server side state that keeps track of endpoints and their data. Then you do a request using ALPN "auth" or whatever that includes the JWT token. At this point you update the shared state and store the JWT data for this endpoint (indefinitely or for some time).
Then in the actual RPC handler you don't need any extra bytes per request. You just check if the request is allowed as soon as you have the endpoint id, otherwise fail.
This is safe because the endpoint id can not be forged unless you have exposed the private key or Ed25519 is broken.
If you store the credentials only for a limited time, you need logic on the client side to re-send a token, either proactively or when an RPC request fails.
But this is a matter of taste really. I would find having a JWT on each possibly tiny RPC request not just cumbersome but also inefficient.
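A rough sketch of the shared server-side state described above, assuming the token has already been verified elsewhere. All names here (`AuthState`, `UserInfo`, `grant`, `lookup`) are hypothetical, and `EndpointId` is a stand-in byte array rather than iroh's real type. The auth ALPN handler would call `grant` after checking the JWT, and the RPC handler would call `lookup` as soon as it knows the remote endpoint id.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Stand-in for an endpoint id (hypothetical alias, not iroh's type).
type EndpointId = [u8; 32];

/// Data decoded from a verified token (illustrative fields only).
#[derive(Clone, Debug, PartialEq)]
pub struct UserInfo {
    pub user_id: u64,
    pub role: String,
}

/// State shared between the auth handler and the RPC handler.
/// Cloning is cheap because the map lives behind an `Arc`.
#[derive(Clone, Default)]
pub struct AuthState {
    inner: Arc<Mutex<HashMap<EndpointId, (UserInfo, Instant)>>>,
}

impl AuthState {
    /// Called by the auth ALPN handler after it has verified the token.
    pub fn grant(&self, id: EndpointId, info: UserInfo, now: Instant, ttl: Duration) {
        self.inner.lock().unwrap().insert(id, (info, now + ttl));
    }

    /// Called by the RPC handler once the remote endpoint id is known.
    /// Returns `None` for unknown endpoints and drops expired entries.
    pub fn lookup(&self, id: &EndpointId, now: Instant) -> Option<UserInfo> {
        let mut map = self.inner.lock().unwrap();
        let (info, expires) = map.get(id)?.clone();
        if now < expires {
            Some(info)
        } else {
            map.remove(id);
            None
        }
    }
}
```

With this shape the RPC handler fails the request when `lookup` returns `None`, and the client re-sends its token on the auth ALPN when that happens.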
I think the separate ALPN solution will not work for my use case. My client app is a full-stack web app accessible on the local LAN. Users log in with their own username/password through the same client endpoint. The irpc-iroh server has a login method for user auth. The login method returns a JWT token, which is included in every request, so I need a per-request hook/middleware.
https://github.com/n0-computer/quinn would add another place to filter incoming packets by ALPN
Arqu
left a comment
Looks like a solid improvement over the current state. I do agree that given we're working against adverse conditions here, we should maybe have a check/guard against attempts of Relay like conns via IP and inverse.
d1f2aee to
4cf66a9
Compare
0a4e3fa to
8fc9301
Compare
All the complex logic about retries etc. moves into the example for people to copy there.
d357c42 to
a8188b8
Compare
a8188b8 to
ac4fd73
Compare
d092a6b to
5ef38a9
Compare
Also strip down the example a lot.
5ef38a9 to
b2b9757
Compare
flub
left a comment
Only reviewed the public api, not the implementation. But that looks nice to me and makes sense. I even love the docs on them!
Frando
left a comment
Looks good in general.
I'm wondering a bit still if this should be an endpoint hook. Or, how we intend to evolve both APIs (endpoint hooks and router hooks) - we have a similar "can be done on both levels" with access control.
But this does not have to hold up the PR. We still have some time to settle until 1.0, and I also don't have a strong opinion in either direction.
/// Two direct endpoints with a filtered router on the first.
///
/// Binds to IPv4 loopback only so retry-token validation works on
/// multi-homed CI hosts (tokens are tied to the source address).
Does that mean retrying is broken when there are multiple interfaces?
If so that should be prominently documented, because this can not only happen in CI but also for regular endpoints, right?
If I have an EndpointAddr with two IPs that are both reachable, and the remote asks for address validation retry, and it doesn't work, this is bad. It basically would mean that you cannot use address validation reliably if your server is reachable over two IP addresses?
Maybe we should add a patchbay test to verify current behavior, and/or document the limitation clearly. I also wonder, though, if we shouldn't fix this?
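To illustrate what "tokens are tied to the source address" implies, here is a toy model, not a confirmed diagnosis of the failure discussed here. It assumes the server derives the token from a secret and the client's source address, so a retried attempt that arrives from a different address fails validation. `issue_token` and `validate_token` are hypothetical names, and the hash is deliberately simplistic and NOT cryptographically secure; real QUIC stacks use authenticated tokens.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::SocketAddr;

/// Toy token: a hash over a server secret and the client's source address.
/// Illustration only, not secure and not how iroh or its QUIC stack does it.
fn issue_token(server_secret: u64, client_addr: SocketAddr) -> u64 {
    let mut hasher = DefaultHasher::new();
    server_secret.hash(&mut hasher);
    client_addr.hash(&mut hasher);
    hasher.finish()
}

/// A token is only valid when presented from the address it was issued to.
fn validate_token(server_secret: u64, client_addr: SocketAddr, token: u64) -> bool {
    issue_token(server_secret, client_addr) == token
}
```

In this model, a client that receives the retry on one interface and sends the follow-up from another would present a token that no longer matches its source address. Whether that is what happens on multi-homed hosts is exactly what a patchbay test could confirm.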
Yeah, retrying under some circumstances is broken. I had this happen in CI and had to make sure I only subscribe to a single interface:
https://discord.com/channels/949724860232392765/950683937661935667/1493210239116251157
If we fix it, that should not hold up this PR, since this PR only exposes existing functionality of the iroh Endpoint at the ProtocolHandler level.
if you have a rough idea of when it fails, would be great to add this to the docs somewhere
Frando
left a comment
Requesting changes for the debug_assert (needs to be documented or removed IMO) and the unused governor dep
modify some tests, remove unused governor dependency.
3e4e913 to
12ea8ed
Compare
It is already more hook like than before, but I think it is fine to have this only on the Router. If you use the endpoint state machine manually you can already do all of this without a hook.
);
}
if let Err(err) = incoming.retry() {
    err.into_incoming().refuse();
Actually we should definitely log this error, otherwise real users not following the protocol to the dot won't know why they just keep getting refused.
Yeah, a log would be good. maybe even an iroh::_events::conn::refused (if you grep for _events::conn you can see the other events i added recently).
I added a log. Not sure about the event. Do we want an event every time we refuse, or just when we refuse because the filter incorrectly tells us to retry? A misbehaving filter isn't really the fault of the endpoint, so that event would have to be something about the router. An event for every reject would live somewhere different, no?
The example is now pretty minimal.
Description
Add an optional mechanism for rate limiting and rejecting requests in the iroh router.
The reason to have something like this is that a single iroh endpoint exposed to the world can be overloaded or hit by a DoS attack, e.g. a public irpc service, n0des, ...
Unfortunately this has quite some API surface: a trait for the callbacks, and enums for the fn returns.
I also added an example that hammers a server with lots of clients and measures the CPU load. Here is the output of the latest version:
Simplified the sample. The hammering tool now lives in a separate repo, since it got a bit too big for an example.
ignore is fastest, reject just a tad slower. reject alpn doesn't buy much over just closing the connection. It is just a convenient place to throttle or reject by alpn.
For relay connections, the cheapest is to reject by endpoint id. This is quite a bit faster than closing the connection since we get the endpoint id early. Rejecting by alpn in this case helps a bit more.
The example starts a server in a subprocess and then hammers it with n connections. The subprocess is measuring its own CPU time.
Breaking Changes
None
Notes & open questions
Change checklist
quic-rpc
iroh-gossip
iroh-blobs
dumbpipe
sendme