Fix tailscale#106
- Place binaries in /usr/{bin,sbin} instead of /usr/local/{bin,sbin} to match provided systemd service definition
- Add tmpfiles.d config to copy vendor supplied tailscaled.defaults to /etc/default/tailscaled so service can start
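A minimal sketch of the tmpfiles.d part, assuming the sysext ships its defaults file under /usr. The helper name write_tailscale_tmpfiles, the file name 10-tailscale.conf and the default source path are hypothetical and not taken from this PR. A tmpfiles.d line of type C copies the source to the target only when the target does not exist yet, so a locally edited /etc/default/tailscaled is left alone.

```shell
# Sketch only: write a tmpfiles.d entry into a sysext build directory.
# ASSUMPTION: the config file name and the vendor defaults path are
# illustrative, not the ones used by the real build script.
write_tailscale_tmpfiles() {
  local sysext_root="$1"
  local vendor_defaults="${2:-/usr/share/tailscale/tailscaled.defaults}"  # hypothetical path

  mkdir -p "${sysext_root}/usr/lib/tmpfiles.d"
  # Fields: Type Path Mode User Group Age Argument
  # "C" copies Argument to Path if Path does not already exist.
  printf 'C /etc/default/tailscaled - - - - %s\n' "${vendor_defaults}" \
    > "${sysext_root}/usr/lib/tmpfiles.d/10-tailscale.conf"
}
```

Calling write_tailscale_tmpfiles "${SYSEXTNAME}" inside the build script would place the entry in the image.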
cat <<EOF >"${SYSEXTNAME}"/usr/lib/systemd/system/tailscaled.service.d/10-networkd-reload.conf
# Reload systemd-networkd.service to pick up 50-tailscale.network

[Service]
ExecStartPre=systemctl reload systemd-networkd.service
EOF
Is there a better way to achieve this? The sysext will place /usr/lib/systemd/network/50-tailscale.network but that happens after systemd-networkd.service has started. It would be better if we had some mechanism to reload this service when this sysext is loaded.
I ran some tests and this does not seem to be required. I booted an instance without this drop-in and I see the tailscale0 link unmanaged, as expected.
When you run networkctl list what does it show for “TYPE” for tailscale0?
Mine was showing “none” but I expected it to show “tun” or “tunnel”. If it did, then a default networkd config would have made it unmanaged.
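To compare results, a small helper can pull the TYPE and SETUP columns for one link out of the networkctl list table. This is a sketch that assumes the usual column order (IDX, LINK, TYPE, OPERATIONAL, SETUP); the function name link_type_and_setup is made up for illustration.

```shell
# Sketch only: read `networkctl list` style output on stdin and print
# "<TYPE> <SETUP>" for the named link.
# ASSUMPTION: columns are IDX LINK TYPE OPERATIONAL SETUP, in that order.
link_type_and_setup() {
  awk -v link="$1" '$2 == link { print $3, $5 }'
}
```

On a running machine this would be used as networkctl list --no-pager | link_type_and_setup tailscale0.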
Ok, there's some kind of race condition with my (minimal) Ignition config.
I observed the same as you: with a minimal Ignition config, the link comes up unmanaged.
However, if I add just a bit more config to Ignition (specifically, mount another volume to /var/lib/docker), the link repeatedly comes up as managed.
variant: flatcar
version: 1.0.0
storage:
  filesystems:
    - device: /dev/disk/by-id/scsi-0HC_Volume_XXXXXX
      format: ext4
      wipe_filesystem: false
      label: VOLUME
  files:
    - path: /opt/extensions/tailscale/tailscale-1.76.6-x86-64.raw
      contents:
        source: https://XXXXXX.s3.com/tailscale.raw
    - path: /etc/sysupdate.d/noop.conf
      contents:
        source: https://github.com/flatcar/sysext-bakery/releases/download/latest/noop.conf
    - path: /etc/sysupdate.tailscale.d/tailscale.conf
      contents:
        source: https://github.com/flatcar/sysext-bakery/releases/download/latest/tailscale.conf
  links:
    - path: /etc/resolv.conf
      target: /run/systemd/resolve/stub-resolv.conf
      overwrite: true
    - path: /etc/extensions/tailscale.raw
      target: /opt/extensions/tailscale/tailscale-1.76.6-x86-64.raw
      hard: false
    - path: /etc/systemd/system/multi-user.target.wants/tailscaled.service
      target: /usr/local/lib/systemd/system/tailscaled.service
      overwrite: true
systemd:
  units:
    # Docker volume mount
    - name: var-lib-docker.mount
      enabled: true
      contents: |
        [Unit]
        Description=Mount external volume to /var/lib/docker
        Before=local-fs.target
        [Mount]
        What=/dev/disk/by-label/VOLUME
        Where=/var/lib/docker
        Type=ext4
        [Install]
        WantedBy=local-fs.target
    - name: docker.service
      dropins:
        - name: 10-wait-docker.conf
          contents: |
            [Unit]
            After=var-lib-docker.mount
            Requires=var-lib-docker.mount
    # Tailscale sysext
    - name: systemd-sysupdate.timer
      enabled: true
    - name: systemd-sysupdate.service
      dropins:
        - name: tailscale.conf
          contents: |
            [Service]
            ExecStartPre=/usr/lib/systemd/systemd-sysupdate -C tailscale update
        - name: sysext.conf
          contents: |
            [Service]
            ExecStartPost=systemctl restart systemd-sysext
There's no explicit link between the docker mount and sysext, so I'm not sure what's going on.
Ok, there is a race between systemd-sysext.service and systemd-networkd.service.
If I add a sleep 3 as an ExecStartPre to systemd-sysext.service then the link will come up as managed, even with a minimal config (shown below). Change to sleep 0 or remove the drop-in and it comes up unmanaged as expected.
variant: flatcar
version: 1.0.0
storage:
  files:
    - path: /opt/extensions/tailscale/tailscale-1.76.6-x86-64.raw
      contents:
        source: https://XXXXXX.s3.com/tailscale.raw
    - path: /etc/sysupdate.d/noop.conf
      contents:
        source: https://github.com/flatcar/sysext-bakery/releases/download/latest/noop.conf
    - path: /etc/sysupdate.tailscale.d/tailscale.conf
      contents:
        source: https://github.com/flatcar/sysext-bakery/releases/download/latest/tailscale.conf
  links:
    - path: /etc/resolv.conf
      target: /run/systemd/resolve/stub-resolv.conf
      overwrite: true
    - path: /etc/extensions/tailscale.raw
      target: /opt/extensions/tailscale/tailscale-1.76.6-x86-64.raw
      hard: false
    - path: /etc/systemd/system/multi-user.target.wants/tailscaled.service
      target: /usr/local/lib/systemd/system/tailscaled.service
      overwrite: true
systemd:
  units:
    # Make sysext wait a bit, remove this and link will be unmanaged
    - name: systemd-sysext.service
      dropins:
        - name: fake-wait.conf
          contents: |
            [Service]
            ExecStartPre=sleep 3
    # Tailscale sysext
    - name: systemd-sysupdate.timer
      enabled: true
    - name: systemd-sysupdate.service
      dropins:
        - name: tailscale.conf
          contents: |
            [Service]
            ExecStartPre=/usr/lib/systemd/systemd-sysupdate -C tailscale update
        - name: sysext.conf
          contents: |
            [Service]
            ExecStartPost=systemctl restart systemd-sysext
If we add an explicit dependency then the sleeps no longer expose the race. However, I can't provide this in the sysext because (of course) that's only loaded after systemd has decided to load both services in parallel.
systemd:
  units:
    - name: systemd-networkd.service
      dropins:
        - name: 10-after-sysext.conf
          contents: |
            [Unit]
            After=systemd-sysext.service
So, options are:
- Keep systemctl reload systemd-networkd.service as ExecStartPre in the sysext-provided tailscaled.service
- Add something to README.md to tell the user to add this drop-in to their Butane config 😞
- Change Flatcar upstream to add the above dependency by default 😬
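For anyone who wants to try the ordering dependency on an already provisioned machine instead of through Butane, the same drop-in can be written by hand. This is only a sketch; the function name and its optional root-prefix argument are invented here so the helper can be pointed at a scratch directory.

```shell
# Sketch only: create the drop-in that orders systemd-networkd after
# systemd-sysext. The optional first argument is a root prefix
# (hypothetical convenience for testing); leave it empty on a real host.
write_networkd_after_sysext_dropin() {
  local root="${1:-}"
  local dir="${root}/etc/systemd/system/systemd-networkd.service.d"

  mkdir -p "${dir}"
  cat > "${dir}/10-after-sysext.conf" <<'EOF'
[Unit]
After=systemd-sysext.service
EOF
}
```

On a real host, run it as root with no argument and follow it with systemctl daemon-reload. The ordering only matters during boot, so a reboot is needed to see the effect.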
Thanks a lot for this investigation. On Flatcar there is a helper to run sysext provided units: https://github.com/flatcar/init/blob/b5a6cbcfaabe605e28e075b8ac674edaf576a0eb/systemd/system/ensure-sysext.service#L15 - I am wondering if we could not add a restart of systemd-networkd here.
Agreed. There is no need to restart systemd-networkd, a reload works perfectly and avoids bringing eth0 down and up.
I've added a PR to do exactly this: flatcar/init#128
Full disclosure, I've moved to just having tailscale run in docker instead of using this sysext, but happy to test/fix this for this PR ;)
> Full disclosure, I've moved to just having tailscale run in docker instead of using this sysext, but happy to test/fix this for this PR ;)
Oh, good to know. One of the goals of sysext-bakery is to provide alternatives when containers are not usable (e.g. alternative container runtimes, kernel modules, etc.). There are a few exceptions of course (e.g. Kubernetes components, to benefit from auto-updates). If you have a tailscale setup running with containers, I think we could add some documentation here: https://www.flatcar.org/docs/latest/setup/customization/ and perhaps sunset this sysext image.
Unfortunately, you lose out on some optional tailscale functionality when running in Docker, such as built-in SSH and its file transfer features. These work, but since you're in the equivalent of a chroot it's not as useful as running it natively. I don't use those features so Docker works for me. In fact, it's an officially supported way of running Tailscale: https://tailscale.com/kb/1282/docker
All this being said, do you want another thing to maintain and update? 😄
I suppose another solution would be to build sysexts directly from Gentoo: https://packages.gentoo.org/packages/net-vpn/tailscale
But that'll quickly get tricky maintaining runtime dependencies.
tormath1
left a comment
Thanks a lot, that looks good. Can I ask you to bump the released version to 1.78.1?
sysext-bakery/release_build_versions.txt
Line 26 in 6cb91d8
Bumped in f3d6287
great work @jmacdonagh. I tested this on a testvm. have you observed similar? need to check how I can debug as I can no longer log into the machine. edit: reverting to copying the binary to /usr/local/sbin instead of /usr/sbin fixes it for me.
I can reproduce the breakage on 4152.2.0 - repros in qemu for me all the time. This is caused by a "unified I think we should just put everything into
Fix tailscale sysext so the service can start by default
The tailscale sysext had a number of issues as described in #105. Specifically:
- EnvironmentFile, so service could not start up
- networkd was managing the tailscale0 interface so Tailscale couldn't enable MagicDNS.
How to use
- Run ./create_tailscale_sysext.sh 1.76.6 tailscale and scp the resulting tailscale.raw to a running Flatcar machine without the tailscale sysext already configured.
- ssh into the Flatcar machine.
- systemctl status tailscaled.service reports that the unit does not exist
- mv /path/to/tailscale.raw /etc/extensions/tailscale.raw
- systemd-sysext refresh
- systemctl status tailscaled.service now shows that the unit exists
- systemctl start tailscaled.service
- networkctl list shows tailscale0 as unmanaged (note: the state will show as degraded until you run tailscale up, which isn't needed for this test).
Testing done
- Ran flatcar-reset with an Ignition config that fetched the tailscale.raw from an S3 bucket, then checked that the service starts, that the interface is unmanaged, etc.
Closes: #105