Introduce dedicated stage EFS; fix MQ broker drift and Memcached SG #378
Open
e9e4e5f0faef wants to merge 3 commits into stage from …
Summary
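- Dedicated stage EFS with an access point (`/addons`, POSIX `9500:9500`) so the eventual `NETAPP_STORAGE_ROOT` flip does not fail on filesystem permissions for the non-root `olympia` runtime user
- Fix MQ broker drift caused by the `engineType` casing mismatch
- Validator recognises Amazon MQ `.on.aws` domain endpoints

Changes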
- `infra/pulumi/__main__.py`: … `addons-efs` volume); correct Memcached ingress rule placement; fix `engine_type` casing to prevent force-new broker replacement; remove obsolete SG rules
- `infra/pulumi/config.stage.yaml`: …
- `infra/scripts/preflight_check.py`: … `.mq.<region>.on.aws` endpoints in broker isolation and SG reachability checks

Why
Dedicated stage EFS: the previous approach tried to mount a shared filesystem across VPC boundaries and failed with `MountTargetConflict`. A dedicated stage filesystem keeps storage aligned with the stage isolation model and avoids the cross-VPC limitation.

EFS access point for `olympia` UID/GID 9500: the application runs as `olympia` (UID 9500) per `Dockerfile.ecs`. Without an access point, the eventual `NETAPP_STORAGE_ROOT` flip from `/tmp/storage` to `/var/addons` would fail with `EACCES` because an empty EFS root directory is owned by `root:root` with `0755` permissions. The access point at `/addons` with POSIX `9500:9500` exposes a writable subtree without requiring a root-task bootstrap ritual at activation time. Containers still mount it at `/var/addons` via `mountPoints`. Injection is scoped to the volume named `addons-efs` and applies to web/worker (YAML-defined) and cron (Python-constructed).

MQ broker drift fix: AWS returns `engineType` as `RabbitMQ`, while the code previously used `RABBITMQ`. Because this is a force-new field, the mismatch caused a perpetual broker replacement diff. The fix aligns the configured value with the value returned by AWS.

Memcached SG correction: the 11211 ingress rule was attached to the wrong security group, so it had no effect. This moves it to the correct SG and restores the intended cache connectivity.
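
For reviewers, the access-point injection described above can be pictured roughly as follows. This is a minimal sketch and not the code in `infra/pulumi/__main__.py`: the helper name `inject_access_point` and the plain-dict volume shape are assumptions made for illustration, and only the volume name `addons-efs` comes from this PR. It follows the ECS rule that a volume using an access point must enable transit encryption and must leave `rootDirectory` unset or `/`.

```python
from copy import deepcopy

ADDONS_VOLUME_NAME = "addons-efs"  # volume name taken from the PR description


def inject_access_point(volumes, file_system_id, access_point_id):
    """Hypothetical helper: return a copy of `volumes` in which the volume
    named ADDONS_VOLUME_NAME points at the given EFS filesystem through the
    given access point. All other volumes are returned unchanged."""
    patched = deepcopy(volumes)
    for volume in patched:
        if volume.get("name") != ADDONS_VOLUME_NAME:
            continue  # injection is scoped to the addons volume only
        efs = volume.setdefault("efsVolumeConfiguration", {})
        efs["fileSystemId"] = file_system_id
        # ECS requires transit encryption when an access point is used.
        efs["transitEncryption"] = "ENABLED"
        # With an access point, rootDirectory must be omitted or "/";
        # the access point itself enforces the /addons subtree.
        efs.pop("rootDirectory", None)
        efs["authorizationConfig"] = {"accessPointId": access_point_id}
    return patched
```

Because the helper works on plain dicts, the same call could serve both the YAML-defined web/worker volumes and the Python-constructed cron volumes, which is the scoping the paragraph above describes.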
Validator fix: Amazon MQ RabbitMQ endpoints use `.mq.<region>.on.aws`, which the validator did not previously recognise.

Storage activation model (staged)
This PR creates and mounts the dedicated stage EFS filesystem at `/var/addons` on web/worker/cron, with the access point in place so the runtime user can write into it. It does not switch application writes to EFS.

After this PR deploys:
- Web, worker, and cron mount the dedicated stage EFS at `/var/addons` via the `/addons` access point
- `NETAPP_STORAGE_ROOT` remains `/tmp/storage`

The `NETAPP_STORAGE_ROOT` flip is intentionally a separate, future operational step. It is gated on post-deploy validation (mount verified, write/read/delete as `olympia` UID 9500 succeeds, persistence across task restart confirmed).

Validation
- `pulumi preview` shows `+ 6 to create / ~ 20 to update / - 3 to delete / +- 3 to replace / = 140 unchanged`. The creates are the expected dedicated EFS resources, access point, and the corrected Memcached SG reachability. The replaces are the task-definition updates that pick up the new filesystem ID and access-point authorisation. No broker replacement.
- `ruff check` and `ruff format --check` pass
- … `.on.aws` fix applied

Safety
- `NETAPP_STORAGE_ROOT` still points at `/tmp/storage`

Follow-up
- … `pulumi up`
- … `/var/addons`)
- Write/read/delete as `olympia` UID 9500 against the access point
- Flip `NETAPP_STORAGE_ROOT` to the EFS-backed path after mount and write verification
- … `stage`

Addresses part of #375, with issue closure to follow post-deploy validation.
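
The write/read/delete check in the follow-up list could be run with something like the sketch below. It is not part of this PR: `probe_storage` is a hypothetical name, and the sketch assumes it is executed inside a running task as the `olympia` user with `/var/addons` passed as the root. It does not cover the persistence-across-restart check, which needs a second task.

```python
import os
import uuid


def probe_storage(root):
    """Hypothetical probe: write, read back, and delete a small file under
    `root`. Raises OSError (for example PermissionError on EACCES) if any
    step fails, and returns the identity the probe ran as."""
    path = os.path.join(root, f".probe-{uuid.uuid4().hex}")
    payload = b"efs-probe"
    with open(path, "wb") as handle:
        handle.write(payload)
    try:
        with open(path, "rb") as handle:
            if handle.read() != payload:
                raise OSError(f"read-back mismatch for {path}")
    finally:
        # Always clean up so repeated probes leave nothing behind.
        os.remove(path)
    return {"uid": os.getuid(), "gid": os.getgid(), "root": root}
```

The returned `uid` and `gid` let the operator confirm that the probe actually ran as 9500 and not as root, since a root-owned probe would succeed even without the access point.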