@osfameron
Collaborator

Context:

  1. @malarky noted that pages like

BAD https://docs.couchbase.com/admin/admin/enterprise-edition.html

still exist and are indexed by Google.

  2. The most recent builds of those docs were replaced with an AWS S3 redirect file that would take you to the DA site, e.g.

404 https://developer.couchbase.com/documentation/server/3.x/admin/pdfs.html

  3. ... but those links are now unmaintained and return 404.

  4. But there are 13 files that were *moved or deleted* and therefore not replaced with those redirect files.
    Those remain as *zombie* 200 files with bad content under /admin/admin.

  5. In the meantime, we have restored our 3.x archive docs:

GOOD https://docs-archive.couchbase.com/docs-3x/admin/Couchbase-intro.html

Solution:

Step 1: this commit adds nginx rewrites to:

  • rewrite zombie 200 files without a candidate location to the archive link above
  • rewrite zombie 200 files that have a good candidate page to a specific page under that tree
  • rewrite everything else to within that tree under the same path
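
To make the precedence of those three rules concrete, here is a minimal Python sketch of the mapping they describe. It is not the nginx configuration from this commit. The file names in the two lookup tables are hypothetical placeholders, and treating `https://docs-archive.couchbase.com/docs-3x/admin` as the root of "that tree" is an assumption based on the GOOD link above.

```python
# Assumed root of the restored archive tree (inferred from the GOOD link).
ARCHIVE_ROOT = "https://docs-archive.couchbase.com/docs-3x/admin"
ARCHIVE_LANDING = f"{ARCHIVE_ROOT}/Couchbase-intro.html"
OLD_PREFIX = "/admin/admin/"

# Hypothetical entries: the real list of 13 zombie files is not shown here.
ZOMBIES_WITHOUT_CANDIDATE = {"old-page-b.html"}
ZOMBIES_WITH_CANDIDATE = {"old-page-a.html": "new-page-a.html"}


def redirect_target(path):
    """Return the redirect URL for a request path, or None if it is not covered."""
    if not path.startswith(OLD_PREFIX):
        return None
    rest = path[len(OLD_PREFIX):]
    # Rule 1: zombie with no good candidate goes to the archive landing page.
    if rest in ZOMBIES_WITHOUT_CANDIDATE:
        return ARCHIVE_LANDING
    # Rule 2: zombie with a good candidate goes to that specific page.
    if rest in ZOMBIES_WITH_CANDIDATE:
        return f"{ARCHIVE_ROOT}/{ZOMBIES_WITH_CANDIDATE[rest]}"
    # Rule 3: everything else keeps the same path inside the archive tree.
    return f"{ARCHIVE_ROOT}/{rest}"
```

The specific cases are checked before the catch-all, which mirrors the ordering concern in nginx: a general pattern must not shadow the specific ones.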

Step 2: once tested, we can expunge these unused files from the /admin/admin/ path in the main bucket.

NOTE ON Testing

Er, don't currently have a good way to do this, so propose 🤠 (deploy -> test -> rollback).
Happy to put work into improving that if worthwhile, or to schedule this change at the least inconvenient time if not!
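
One possible way to make the "test" step of deploy -> test -> rollback repeatable is a small script that requests each old URL without following redirects and compares the `Location` header to the expected target. The sketch below uses only the Python standard library; the function names and the shape of the `expectations` table are illustrative assumptions, not an existing tool in this repo.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_STATUSES = {301, 302, 307, 308}


def fetch_head(url):
    """Send a HEAD request and return (status, Location header).

    http.client does not follow redirects, so the first response is what we see.
    """
    parts = urlsplit(url)
    conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def check_redirects(expectations, fetch=fetch_head):
    """Check a {old_url: expected_location} table; return a list of failure messages."""
    failures = []
    for url, expected in expectations.items():
        status, location = fetch(url)
        if status not in REDIRECT_STATUSES or location != expected:
            failures.append(
                f"{url}: got {status} -> {location}, expected redirect to {expected}"
            )
    return failures
```

Passing `fetch` as a parameter means the same table can be checked against staging, production, or a stub, and an empty result list is an unambiguous signal that the rewrites took effect.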

Contributor

@simon-dew simon-dew left a comment

Some suggestions for redirects.

(Incidentally the PDFs page for v3.x, mentioned in DOC-13531, is here.)

@osfameron osfameron marked this pull request as ready for review April 3, 2025 08:47
@osfameron osfameron requested a review from simon-dew April 3, 2025 08:47
@osfameron osfameron merged commit ad2bedb into master Apr 3, 2025
2 checks passed
osfameron added a commit that referenced this pull request Apr 3, 2025
This reverts commit ad2bedb.

1) This change does not fix the issue it was logged for. In itself, that isn't a huge problem, as it could be iterated on, but:
2) I don't have a good way to test the changes beyond deploying to staging. Previously we'd have deployed to `staging`, but now (for convenience) those share the same branch, and therefore we can't update one without the other.
3) It's hard to tell if the `sudo update-nginx-rewrites` command succeeded. There's nothing mentioned in the GitHub Workflow log. In the past, when invoked via `ssh`, this would have guaranteed that the command ran to completion with a successful exit code. But now it's run via `aws ssm run`, which only returns that the job is pending.
4) I can't ssh to the production jenkins to examine the state of the nginx there. That's generally not something I want to be doing regularly, but it is useful for this kind of case, where we're unsure if a change has taken effect.
5) I don't entirely understand the `update-nginx-rewrites` script, so I'm unsure what happens if it fails. (Does it retain the bad config or revert it? It's been hacked at and commented over some time, including by me, and may not work as specified.)
6) `nginx` won't ever *reload* to a bad config, which is GOOD! BUT... it does mean that if a bad config is uploaded and doesn't return an error, there is a risk that the bad config was silently uploaded and rejected, but remains on the filesystem... waiting for the time to pounce.
7) If AWS reboot the prod Jenkins for maintenance, the nginx might not come up. This could happen at an unexpected time.
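
Points 5 and 6 describe the risk of a rejected config staying on disk. A minimal sketch of one way to avoid that is below: install the candidate, validate it, and restore the previous file if validation fails. This is not the `update-nginx-rewrites` script, whose behaviour is unknown here; the function name, the `.bak` backup convention, and the file paths are assumptions. `nginx -t` is the standard configuration test command and exits non-zero on an invalid config.

```python
import shutil
import subprocess
from pathlib import Path


def install_config(candidate, live, validate_cmd=("nginx", "-t")):
    """Copy candidate over live, validate, and roll back if validation fails.

    Raises RuntimeError (after restoring the previous file) on a bad config,
    so the caller always gets an explicit success or failure.
    """
    candidate, live = Path(candidate), Path(live)
    backup = live.with_name(live.name + ".bak")  # hypothetical backup naming
    had_live = live.exists()
    if had_live:
        shutil.copy2(live, backup)
    shutil.copy2(candidate, live)

    result = subprocess.run(list(validate_cmd), capture_output=True, text=True)
    if result.returncode != 0:
        # Never leave a rejected config on the filesystem.
        if had_live:
            shutil.move(str(backup), str(live))
        else:
            live.unlink()
        raise RuntimeError(
            f"config rejected, previous file restored:\n{result.stderr}"
        )
    if had_live:
        backup.unlink()
    return result
```

With this shape, a reload (`nginx -s reload`) would only be attempted after the function returns, and a reboot would always find the last known-good file in place.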
osfameron added a commit that referenced this pull request Apr 7, 2025
osfameron added a commit that referenced this pull request Apr 8, 2025
This reverts commit b33ce7c. Fixed infrastructure (broken duplicate location removal logic) and tested in infratest.
sarahlwelton pushed a commit that referenced this pull request Jun 23, 2025
* DOC-13151 fix zombie 3.x docs issue
* DOC-13151 tweaks to target location via @simon-dew, thanks!
sarahlwelton pushed a commit that referenced this pull request Jun 23, 2025
This reverts commit ad2bedb.
sarahlwelton pushed a commit that referenced this pull request Jun 23, 2025
This reverts commit b33ce7c. Fixed infrastructure (broken duplicate location removal logic) and tested in infratest.