testing some workflow optimizations #1
base: main
Conversation
Walkthrough

Introduces URL caching with TTL persistence and HTTP retry/redirect handling, validates numeric inputs, improves PR comment robustness with graceful error handling, optimizes file reading, and refactors the entrypoint script for maintainability.
Sequence Diagrams

```mermaid
sequenceDiagram
    actor User
    participant slinky as Slinky Check
    participant cache as URL Cache
    participant http as HTTP Fetcher
    participant github as GitHub API
    User->>slinky: Start check with cache enabled
    slinky->>cache: Load() from disk (if path set)
    cache-->>slinky: Cache populated (non-expired entries)
    loop For each URL
        slinky->>cache: Get(url)
        alt Cache hit & not expired
            cache-->>slinky: Return cached result
        else Cache miss or expired
            slinky->>http: fetchWithMethod(url)
            http->>http: Attempt request with retry/backoff
            alt Retryable error
                http->>http: Exponential backoff, then retry
            else Success or final error
                http-->>slinky: Return status + error
            end
            slinky->>cache: Set(url, ok, status, errMsg)
        end
    end
    slinky->>cache: Save() to disk (if path set)
    cache-->>slinky: Cache persisted
    alt INPUT_COMMENT_PR enabled & report exists
        slinky->>github: upsertPRComments(chunks)
        github-->>slinky: Post/update comment (with error recovery)
    end
    slinky-->>User: Return results
```

```mermaid
sequenceDiagram
    participant slinky as cmd/check.go
    participant github as GitHub API
    alt Report exists and COMMENT_PR enabled
        slinky->>slinky: Read report markdown (warn on error, continue)
        slinky->>slinky: Split into chunks
        slinky->>github: List existing PR comments
        alt Request/parse succeeds
            github-->>slinky: Comment list
            slinky->>slinky: Find previous slinky comment
            alt Found
                slinky->>github: Delete old comment
            end
        else Request/parse fails
            slinky->>slinky: Log debug, continue (non-fatal)
        end
        loop For each chunk
            slinky->>github: Create/post comment
            alt Success
                slinky->>slinky: Log successful post
            else Error
                slinky->>slinky: Log error, continue (non-fatal)
            end
        end
    end
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Enhance the description for the 'comment_pr' input in action.yml to specify that it defaults to true when GITHUB_TOKEN is present. In check.go, update the logic to explicitly disable PR commenting only when the input is set to "false", improving clarity on the commenting behavior during PR checks.
…TOKEN usage

This commit modifies action.yml to remove the direct reference to GITHUB_TOKEN and updates the README.md to specify that it should be accessed via secrets.GITHUB_TOKEN. Additionally, the example workflow is updated to include the GITHUB_TOKEN in the environment variables, ensuring clarity on its necessity for PR comment functionality.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/web/checker.go (1)
58-64: Deadlock risk when context is cancelled

If `ctx.Done()` fires inside the worker loop, the worker returns without ever sending on `done`, but the main goroutine still waits for `concurrency` `done` signals. That can deadlock `CheckURLs` under cancellation.

You can fix this by guaranteeing one `done` send per worker via `defer`:

```diff
 worker := func() {
+	defer func() { done <- struct{}{} }()
 	for j := range jobs {
 		select {
 		case <-ctx.Done():
 			return
 		default:
 		}
 		...
 	}
-	done <- struct{}{}
 }
```

This preserves the current behavior while making cancellation safe.
Also applies to: 135-135
🧹 Nitpick comments (7)
internal/fsurls/fsurls.go (2)
249-256: Consider adding debug logging for read errors.

The file reading optimization using `io.LimitReader` and `io.ReadAll` is sound and provides defense-in-depth even though file size is pre-checked. However, silently skipping files on read errors could make troubleshooting difficult.

Consider adding debug logging when files are skipped due to read errors:

```diff
 contentBytes, readErr := io.ReadAll(limitedReader)
 if readErr != nil {
 	// Non-critical error: skip file and continue
+	if isDebugEnv() {
+		fmt.Printf("::debug:: Skipping file %s due to read error: %v\n", path, readErr)
+	}
 	return nil
 }
```
422-429: Same optimization, consider debug logging here too.

This segment mirrors the file reading optimization in `CollectURLsWithIgnoreConfig`. The consistent implementation is good, but the same debug logging suggestion applies.

Consider adding debug logging for read errors:

```diff
 contentBytes, readErr := io.ReadAll(limitedReader)
 if readErr != nil {
 	// Non-critical error: skip file and continue
+	if isDebugEnv() {
+		fmt.Printf("::debug:: Skipping file %s due to read error: %v\n", path, readErr)
+	}
 	return nil
 }
```

README.md (1)
19-29: Docs around PR comments and GITHUB_TOKEN are almost perfectly aligned

The workflow example + note clearly explain:

- `pull-requests: write` is only needed when `comment-pr` is enabled.
- `GITHUB_TOKEN` is only required for PR comment functionality.

One small polish opportunity: in the Inputs list (`comment-pr` bullet at Line 42), you currently say "Default: `true`" without mentioning the "when GITHUB_TOKEN is present" nuance that you added to `action.yml`. Mirroring that phrasing here would make the behavior 100% consistent across docs.

Also applies to: 31-31
entrypoint.sh (1)
4-14: Entrypoint refactor is solid; consider two robustness tweaks

Overall, this is a nice cleanup: PR head SHA handling is safer, ARGS construction is centralized, and outputs/summary handling are clearer.
Two things to consider tightening:
1. `set -e` vs. EXIT_CODE handling

With `set -e` in effect, any non-zero exit from `slinky $ARGS` will terminate the script immediately, meaning `EXIT_CODE=$?`, the output writes, and the summary append won't run in those cases. If your intention is to always write JSON/MD outputs and step summaries even when `--fail-on-failures=true` causes a non-zero exit, it would be safer to explicitly guard the call, e.g.:

```sh
set +e
slinky $ARGS
EXIT_CODE=$?
set -e
```

or:

```sh
slinky $ARGS || EXIT_CODE=$?
```

(and initialize `EXIT_CODE=0` before the call).
Building
ARGS="check ..."and later doingslinky $ARGSrelies on word-splitting, which is fine for your current numeric/path inputs but will misbehave if any input or target ever legitimately contains spaces. A more robust pattern is to build an argument vector:ARGS="check" # or, more robustly: # set -- check # [ -n "${INPUT_CONCURRENCY:-}" ] && set -- "$@" --concurrency "$INPUT_CONCURRENCY" # ... # slinky "$@"That keeps quoting correct while still allowing globs like
**/*.mdto be passed as intended.Neither point is a blocker, but addressing them would make the entrypoint more future‑proof.
Please double-check how GitHub runs this container (shell implementation on the runner image) to confirm `set -e` behavior and argument splitting match your expectations.

Also applies to: 16-40, 43-47, 51-52, 60-60
internal/web/checker.go (1)
65-81: Verify URLCache concurrency safety and minor cache-path nits

The cache-first path is a nice optimization and the `CacheHit` flag in `Result` is helpful. A couple of points to double-check:

- Multiple workers call `cfg.Cache.Get` and `cfg.Cache.Set` concurrently; please ensure `URLCache`'s implementation (in `internal/web/cache.go`) protects its internal `entries` map with appropriate locking or otherwise guarantees concurrent safety. If it's just a plain map, this will race under load.
- For reconstructing cached errors, `fmt.Errorf("%s", cached.ErrMsg)` could be simplified to `errors.New(cached.ErrMsg)` if you don't need formatting/stack context, but that's purely cosmetic.

Everything else in this cache integration looks consistent with the rest of the checker.
Also applies to: 84-104, 118-121
cmd/check.go (2)
176-199: Cache wiring is reasonable; TTL semantics worth documenting

The URL cache initialization via `SLINKY_CACHE_PATH` and optional `SLINKY_CACHE_TTL_HOURS` looks good, with best-effort `Load`/`Save` and debug-only logging on failure.

One minor suggestion: since `SLINKY_CACHE_TTL_HOURS` is treated as hours and values ≤ 0 fall back to 24h inside `NewURLCache`, it might be worth documenting that:

- `0` or negative TTL doesn't disable caching; it reverts to the default.
- Fractional hours are truncated when converted to `int`.

Functionally this is fine, but explicit docs will avoid surprises for users tuning cache behavior.
Also applies to: 201-205
499-587: upsertPRComments robustness: good overall, but close DELETE responses

The refactor here nicely hardens the PR comment flow: clear errors on list failure, non-fatal handling of JSON parse issues, and defensive behavior when marshalling/posting chunks.
One small improvement:
`http.DefaultClient.Do(dReq)` for DELETE ignores both the error and the response body, which can leak idle connections:

```diff
-_, _ = http.DefaultClient.Do(dReq) // Non-critical: ignore delete errors
+if res, err := http.DefaultClient.Do(dReq); err == nil && res != nil {
+	res.Body.Close()
+}
+// Non-critical: we still ignore delete errors
```

This keeps the non-critical semantics while ensuring connections are properly returned to the pool.
Also applies to: 537-544
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- .github/workflows/example-slinky.yml (1 hunks)
- Dockerfile (1 hunks)
- README.md (1 hunks)
- action.yml (1 hunks)
- cmd/check.go (4 hunks)
- entrypoint.sh (1 hunks)
- internal/fsurls/fsurls.go (2 hunks)
- internal/web/cache.go (1 hunks)
- internal/web/checker.go (6 hunks)
- internal/web/http.go (1 hunks)
- internal/web/types.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
internal/web/checker.go (1)
internal/web/types.go (2)
`Result` (5-16), `Stats` (18-21)
internal/web/types.go (1)
internal/web/cache.go (1)
`URLCache` (20-24)
cmd/check.go (2)
internal/web/cache.go (2)
`URLCache` (20-24), `NewURLCache` (27-37)

internal/web/types.go (1)

`Config` (23-30)
🔇 Additional comments (8)
.github/workflows/example-slinky.yml (1)
20-21: GITHUB_TOKEN wiring for the step looks correct

Providing `GITHUB_TOKEN` via `env` here matches the permissions block (`pull-requests: write`) and is what the action expects for PR comments. No changes needed.

action.yml (1)
35-35: Improved comment_pr description is clear and consistent

The updated description explicitly tying the default to `GITHUB_TOKEN` presence is helpful and avoids ambiguity about when PR comments are actually enabled.

Please confirm `cmd/check.go`'s defaulting logic for `INPUT_COMMENT_PR` still matches this description (default-on only when a token is available).

Dockerfile (1)
10-16: Runtime image and ENTRYPOINT wiring look correct

Adding `jq` and `ca-certificates` matches the needs of `entrypoint.sh` and HTTPS requests, and the ENTRYPOINT to `/entrypoint.sh` is consistent with the new startup flow. No changes needed.

If you haven't already, it's worth building and running the image locally (`docker run` with a minimal `GITHUB_EVENT_PATH`) to confirm jq is reachable and the entrypoint runs as expected on Alpine's `/bin/sh`.

internal/web/types.go (1)
23-30: Config extension with Cache pointer looks appropriate

Adding `Cache *URLCache` to `Config` is a clean way to thread caching through the web layer. Just make sure callers consistently nil-check `cfg.Cache` before calling its methods so that configurations without caching remain supported.

Please scan `internal/web/checker.go` and `cmd/check.go` to confirm all uses of `cfg.Cache` are guarded against `nil`.

internal/web/http.go (1)
6-12: HTTP retry/redirect implementation looks solid for link-checking use

The new `fetchWithMethod` + `followRedirects` flow is well-structured:

- Exponential backoff with bounded retries (`maxRetries`) and context-aware cancellation is correctly implemented.
- Redirect handling closes bodies along the way, caps depth via `maxRedirects`, and copes with missing `Location` headers and simple redirect loops.
- Status classification via `isOKStatus` (200–299, 401, 403) and `isRetryableStatus` (408, 429, 5xx) matches the documented behavior and is appropriate for a link checker.
- Error handling for DNS, connection refused, and timeouts uses clear `simpleError` messages while still exposing status codes.
I don’t see correctness issues here; this should significantly improve robustness versus transient HTTP failures and redirects.
Once wired into your checker, it’d be good to run against a sample repo containing redirects, 401/403 endpoints, and flaky hosts to confirm the observed behavior (retries, final statuses, and marking links OK/broken) matches your expectations.
Also applies to: 15-16, 18-106, 108-161, 163-182
internal/web/checker.go (1)
3-11: HTTP client tuning looks solid

Dynamic `MaxIdleConns` with a minimum and aligning per-host limits with configured concurrency is a sensible way to improve throughput across many domains without over-opening connections. No issues from my side here.

Also applies to: 18-23, 29-34
cmd/check.go (2)
161-171: Numeric flag clamping looks correct

Clamping `maxConcurrency` to [1, 100] and `timeoutSeconds` to [1, 300] gives sane bounds and prevents obviously bad inputs from the CLI without surprising behavior. No issues here.
319-327: PR comment gating and error handling look robust

The new `INPUT_COMMENT_PR` toggle (default-on, only disabled by `"false"`) and the guard `ghOK && commentPR && finalMDPath != ""` make the PR commenting behavior predictable and easy to opt out of.

Gracefully handling markdown read failures and treating `upsertPRComments` errors as warnings (without failing the run) is also a good balance between observability and resilience.

Also applies to: 328-345
…rkflow

This commit updates the input names in action.yml to use hyphenated formats for better consistency and clarity. Additionally, the example workflow is modified to demonstrate the new input names and to showcase the functionality of disabling blocking failures and enabling PR comments.
…handling in CheckURLs function

This commit introduces a mutex to the URLCache struct to ensure thread-safe access to cache entries during load, get, set, and clear operations. Additionally, it refines the job handling logic in the CheckURLs function by using atomic counters for processed and pending jobs, improving concurrency management.
…robustness

This commit modifies the argument construction in entrypoint.sh to use 'set --' for better handling of optional flags and targets. This change ensures that arguments are properly quoted, accommodating spaces and enhancing overall script reliability. Additionally, debug output is updated to display the constructed command more clearly.
Slinky Test Report

Last Run: 2025-11-14 15:00:48 CST (Duration: 10.008s)
Failures by URL
Summary by CodeRabbit

- New Features
- Improvements
- Documentation