Idea
Add an artifact-type suffix to the standardized filename scheme so a downloaded file is self-describing out of context. Today the scheme (from get_filename_without_ext_for_artifact in website/utils/fileutils.py) is:
LastName_TitleInTitleCase_VenueYear.ext
e.g. Froehlich_MakingInTheHCIL_CSDepartmentExternalReview2014.pdf
A talk, paper, and poster can all produce the same basename. That's not a storage problem (each type is saved in its own subdir — talks/, posters/, publications/ — and serve_pdf only serves /media/publications/ and queries only Publication, so they never collide). But once a file is downloaded, Froehlich_MakingInTheHCIL_CHI2024.pdf in someone's Downloads folder doesn't reveal whether it's the talk, the paper, or the poster.
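To make the ambiguity concrete, here is a minimal sketch. `build_basename` is a hypothetical stand-in for the scheme described above, not the actual code in website/utils/fileutils.py, and it takes the title already in its TitleCase form. The subdirectory names are the ones listed in this issue.

```python
from pathlib import PurePosixPath


def build_basename(last_name: str, title_part: str, venue_year: str) -> str:
    """Hypothetical stand-in for the LastName_Title_VenueYear scheme."""
    return f"{last_name}_{title_part}_{venue_year}"


# Same author, title, and venue for three different artifact types.
basename = build_basename("Froehlich", "MakingInTheHCIL", "CHI2024")
stored = [
    PurePosixPath(subdir) / f"{basename}.pdf"
    for subdir in ("talks", "posters", "publications")
]

# On the server the full paths differ, so nothing collides...
distinct_paths = len(set(stored))

# ...but a download keeps only the final path component, which is identical.
distinct_downloads = len({path.name for path in stored})
```

Here `distinct_paths` is 3 while `distinct_downloads` is 1, which is the gap this issue is about.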
get_filename_without_ext_for_artifact already takes an optional suffix param (LastName_Title_suffix_VenueYear) — though we'd likely want the type at the end, so the exact placement is part of the discussion.
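The two placements under discussion can be compared side by side. This is a sketch only: `basename_with_type` and its `placement` argument are hypothetical names for illustration, and the real function's signature may differ.

```python
def basename_with_type(last_name, title_part, venue_year, type_label, placement="end"):
    """Hypothetical helper illustrating the two candidate placements."""
    if placement == "mid":
        # Reuses the existing suffix slot: LastName_Title_suffix_VenueYear
        parts = [last_name, title_part, type_label, venue_year]
    elif placement == "end":
        # Trailing type: LastName_Title_VenueYear_Type
        parts = [last_name, title_part, venue_year, type_label]
    else:
        raise ValueError(f"unknown placement: {placement!r}")
    return "_".join(parts)


mid_name = basename_with_type("Froehlich", "MakingInTheHCIL", "CHI2024", "Talk", placement="mid")
end_name = basename_with_type("Froehlich", "MakingInTheHCIL", "CHI2024", "Talk", placement="end")
```

With the trailing form, the current basename remains a prefix of the new one, which may make old and new names easier to correlate during a migration.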
Why this is a Discuss (not just a do)
It changes generate_filename, which is the scheme used everywhere — Artifact.save(), the backfill_original_filenames comparison, and the restandardize_artifact_filenames idempotency gate (#1401). Consequences:
Every artifact file gets re-renamed once (more churn, more chances for -<timestamp> collisions).
Link risk: the original_pdf_filename + serve_pdf fallback covers publications only; talk/poster links have no such fallback.
Need to decide: types to cover (Talk/Poster/Demo/Video?), placement (end vs the existing mid suffix slot), label text (Talk vs InvitedTalk vs talk_type-derived), and whether to apply retroactively or only to new uploads.
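The one-time churn can be sketched as follows, assuming the idempotency gate works by comparing each file's current basename against a freshly generated one (an assumption; the actual gate in #1401 may differ). All function names here are hypothetical.

```python
def old_scheme(last_name, title_part, venue_year):
    """Hypothetical: the current LastName_Title_VenueYear scheme."""
    return f"{last_name}_{title_part}_{venue_year}"


def new_scheme(last_name, title_part, venue_year, type_label):
    """Hypothetical: the same scheme with a trailing type label."""
    return f"{old_scheme(last_name, title_part, venue_year)}_{type_label}"


def restandardize(current_names, expected_names):
    """Return (names after the pass, number of renames performed)."""
    renamed = 0
    for current, expected in zip(current_names, expected_names):
        # Assumed idempotency gate: files already matching the scheme are skipped.
        if current != expected:
            renamed += 1
    return list(expected_names), renamed


artifacts = [
    ("Froehlich", "MakingInTheHCIL", "CHI2024", "Talk"),
    ("Froehlich", "MakingInTheHCIL", "CHI2024", "Poster"),
]
on_disk = [old_scheme(*a[:3]) for a in artifacts]  # already standardized today
expected = [new_scheme(*a) for a in artifacts]     # what the new scheme produces

on_disk, first_pass = restandardize(on_disk, expected)
on_disk, second_pass = restandardize(on_disk, expected)
```

Under these assumptions every file fails the gate exactly once after the scheme changes (`first_pass` is 2), and the second pass renames nothing (`second_pass` is 0).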
Sequencing
Best done after #1401 lands and prod is standardized, so it's a single well-understood scheme change with the #1391 provenance safety net already in place. Until then, leave the scheme as-is.
Decisions to make
Worth doing at all? (cosmetic/provenance benefit vs. re-rename churn + link risk)
Which types, and what label text? (static Talk/Poster vs. derived from talk_type)
Suffix placement (trailing _Talk vs. the existing mid suffix slot).
Retroactive (re-rename everything) or forward-only (new uploads only)?
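For the label question, the two options could look roughly like this. Everything here is illustrative: `STATIC_LABELS`, `static_label`, and `derived_talk_label` are hypothetical names, the `talk_type` value shown is an assumed example rather than the model's real choices, and returning `None` for uncovered types is just one possible convention.

```python
# Option A: one fixed label per covered artifact type (hypothetical mapping).
STATIC_LABELS = {"Talk": "Talk", "Poster": "Poster"}


def static_label(artifact_type):
    """Return the fixed label, or None for types left unsuffixed."""
    return STATIC_LABELS.get(artifact_type)


# Option B: derive the label from a talk_type string.
def derived_talk_label(talk_type):
    """Collapse a talk_type string into a TitleCase label with no separators."""
    words = talk_type.replace("_", " ").split()
    return "".join(word.capitalize() for word in words)


label_a = static_label("Talk")
label_b = derived_talk_label("invited talk")  # assumed example value
uncovered = static_label("Publication")
```

Option A keeps the set of possible filenames small and predictable; Option B carries more information but ties the filename to whatever values `talk_type` can take, so a change to those values would trigger another rename.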
Spun out of discussion on #1401.