Collaborator

@michelle0927 michelle0927 commented Oct 6, 2025

Resolves #18486

Summary by CodeRabbit

  • Refactor
    • Improved BigQuery sources to use native query jobs for more reliable querying, safer async handling, reduced memory use, and clearer processing logs.
  • Removed Features
    • VM management (Compute Engine) helpers and related options removed from the Google Cloud integration.
  • Chores
    • Multiple Google Cloud package and component version bumps (BigQuery sources, Pub/Sub source, and several actions).

vercel bot commented Oct 6, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
pipedream-docs Ignored Ignored Oct 8, 2025 5:00pm
pipedream-docs-redirect-do-not-edit Ignored Ignored Oct 8, 2025 5:00pm
Contributor

coderabbitai bot commented Oct 6, 2025

Walkthrough

Bumps many Google Cloud component versions; BigQuery new-row source switches to BigQuery client query execution and removes its deactivate hook; common BigQuery run() becomes async; Google Cloud app drops Compute VM helper methods; several actions bump metadata versions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Package metadata: components/google_cloud/package.json | Increment package version 0.6.2 → 0.6.3. |
| BigQuery - New Row source: components/google_cloud/sources/bigquery-new-row/bigquery-new-row.mjs | Version 0.1.8 → 0.1.9. Remove hooks.deactivate. Replace prior in-memory/getRowsForQuery flow with BigQuery client usage (createQueryJob + getQueryResults) for fetching rows and for _getIdOfLastRow; adjust query-building and result-starting/dedup logic. |
| BigQuery - Query Results source: components/google_cloud/sources/bigquery-query-results/bigquery-query-results.mjs | Version bump 0.1.7 → 0.1.8 (metadata only). |
| BigQuery common utilities: components/google_cloud/sources/common/bigquery.mjs | Change run(event) → async run(event) and return await this.processCollection(...) to await query/job processing (no other logic changes). |
| Google Cloud app (core): components/google_cloud/google_cloud.app.mjs | Remove Compute Engine-related imports and helper methods (InstancesClient, ZoneOperationsClient, ZonesClient, listZones, listVmInstancesByZone, switchInstanceBootStatus, waitOperation) and remove ConfigurationError import; VM management helpers removed. |
| Compute-related action (refactor to local methods): components/google_cloud/actions/switch-instance-boot-status/switch-instance-boot-status.mjs | Version 0.0.3 → 0.0.4. Add @google-cloud/compute clients and ConfigurationError import; replace propDefinition-driven inputs with explicit UI props and async options; add methods (zonesClient, zoneOperationsClient, instancesClient, listZones, waitOperation, listVmInstancesByZone, switchInstanceBootStatus); run() now uses local methods and may return operation (new control flow). |
| Other action version bumps: components/google_cloud/actions/* | Multiple action modules bumped metadata versions only: bigquery-insert-rows 0.0.4 → 0.0.5; create-bucket 0.0.4 → 0.0.5; create-scheduled-query 0.0.2 → 0.0.3; get-bucket 0.0.5 → 0.0.6; get-object 0.0.4 → 0.0.5; list-buckets 0.0.4 → 0.0.5; logging-write-log 0.0.5 → 0.0.6; run-query 0.0.2 → 0.0.3; search-objects 0.0.4 → 0.0.5; upload-object 0.0.4 → 0.0.5. No behavioral changes. |
| Pub/Sub source: components/google_cloud/sources/new-pubsub-messages/new-pubsub-messages.mjs | Version bump 0.1.6 → 0.1.7 (metadata only). |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Runner as Source Runner
    participant SRC as bigquery-new-row source
    participant BQ as BigQuery Client
    participant JOB as Query Job
    Runner->>SRC: run(event) / poll()
    activate SRC
    SRC->>BQ: createQueryJob({ query, params })
    BQ-->>SRC: job
    SRC->>JOB: await job.getQueryResults() / job.promise()
    JOB-->>SRC: rows (paged)
    loop per page
        SRC->>SRC: process rows, dedupe/emit
        alt continue paging
            SRC->>JOB: getQueryResults(nextPageToken)
        end
    end
    SRC->>SRC: update lastResultId
    deactivate SRC
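
The paging flow in the diagram can be sketched roughly as follows. This is a minimal illustration and not the PR's actual code: `fetchAllRows`, `onPage`, and `pageSize` are hypothetical names, and the sketch assumes `client` behaves like a `@google-cloud/bigquery` `BigQuery` instance, where `createQueryJob()` resolves to `[job]` and `job.getQueryResults()` with `autoPaginate: false` resolves to `[rows, nextQuery]`.

```javascript
// Hypothetical helper: run a query job and hand rows to `onPage` one page at a time.
// `client` is assumed to behave like a @google-cloud/bigquery BigQuery instance.
async function fetchAllRows(client, queryOpts, onPage, pageSize = 1000) {
  const [job] = await client.createQueryJob(queryOpts);

  // With autoPaginate disabled, each call is assumed to resolve to [rows, nextQuery];
  // nextQuery is null/undefined once the last page has been read.
  let pageOpts = { maxResults: pageSize, autoPaginate: false };
  let total = 0;
  while (pageOpts) {
    const [rows, nextQuery] = await job.getQueryResults(pageOpts);
    await onPage(rows); // process, dedupe, and emit this page before fetching the next
    total += rows.length;
    // Re-apply autoPaginate: false so the next call also returns a single page.
    pageOpts = nextQuery ? { ...nextQuery, autoPaginate: false } : null;
  }
  return total;
}
```

Processing one page at a time keeps only a single page of rows in memory, which is in line with the "reduced memory use" goal stated in the summary.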

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • jcortes
  • GTFalcao

Poem

I twitch my whiskers at the query's hum,
Jobs spin up, I wait until they're done.
Pages of rows hop into view,
I splice and emit, then bid adieu.
New versions dance — a carrot for fun. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes Check | ⚠️ Warning | The pull request includes widespread version bumps across unrelated components, removal of Compute Engine integration in google_cloud.app.mjs, and the new switch-instance-boot-status action, none of which relate to the BigQuery source bug fix and fall outside the linked issue’s scope. | Please isolate the BigQuery bug fix by reverting unrelated version changes and Compute Engine modifications or move those changes into separate pull requests. |
| Description Check | ⚠️ Warning | The description only contains a single line resolving the issue and does not include the required “## WHY” section or any explanation of the problem, its context, or the solution, so it does not meet the repository’s template requirements. | Please fill out the description using the project template by adding a “## WHY” section that explains the root cause of the bug, its impact on users, and a summary of how the changes resolve it. |
| Title Check | ❓ Inconclusive | The title indicates a bug fix in the Google Cloud BigQuery integration but is too generic to convey the specific issue addressed, namely the TypeError in job.getQueryResults, so it does not clearly summarize the main change. | Please update the title to clearly reference the specific bug and fix, for example “Fix TypeError: job.getQueryResults is not a function in BigQuery source.” |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues Check | ✅ Passed | The modifications to the BigQuery source and common helper directly address the job.getQueryResults TypeError by switching to the BigQuery client’s createQueryJob and getQueryResults methods, fulfilling the primary objective from issue #18486. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d268b9a and 169cd95.

📒 Files selected for processing (2)
  • components/google_cloud/package.json (1 hunks)
  • components/google_cloud/sources/bigquery-new-row/bigquery-new-row.mjs (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: pnpm publish
  • GitHub Check: Verify TypeScript components
  • GitHub Check: Publish TypeScript components
  • GitHub Check: Lint Code Base
GTFalcao previously approved these changes Oct 7, 2025
Collaborator

@GTFalcao GTFalcao left a comment

LGTM!

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/google_cloud/sources/bigquery-new-row/bigquery-new-row.mjs (1)

102-106: Qualify the table with dataset to avoid resolution issues.

These queries reference the table without a dataset. Either fully qualify it or set defaultDataset on the job.

- FROM \`${this.tableId}\`
+ FROM \`${this.datasetId}.${this.tableId}\`
♻️ Duplicate comments (1)
components/google_cloud/sources/bigquery-new-row/bigquery-new-row.mjs (1)

140-152: Honor falsy-but-valid lastResultId (0, "").

Using if (lastResultId) skips these legitimate values, causing re-scan and duplicates. Compare against null/undefined instead. This mirrors prior feedback.

-    if (lastResultId) {
+    if (lastResultId !== null && lastResultId !== undefined) {
       query += ` WHERE \`${this.uniqueKey}\` >= @lastResultId`;
     }
@@
-      ...(lastResultId
-        ? {
-          lastResultId,
-        }
-        : {}),
+      ...((lastResultId !== null && lastResultId !== undefined)
+        ? { lastResultId }
+        : {}),
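
The effect of the null/undefined check above can be illustrated with a small standalone sketch. `buildNewRowsQuery` and its argument names are hypothetical and only mirror the shape of the suggested change; they are not taken from the component.

```javascript
// Hypothetical helper illustrating the null/undefined check suggested above.
function buildNewRowsQuery({ tableId, uniqueKey, lastResultId }) {
  // 0 and "" are valid ids, so only null/undefined mean "no previous result".
  const hasLastResult = lastResultId !== null && lastResultId !== undefined;

  let query = `SELECT * FROM \`${tableId}\``;
  if (hasLastResult) {
    query += ` WHERE \`${uniqueKey}\` >= @lastResultId`;
  }
  return {
    query,
    // Only bind the parameter when the WHERE clause actually references it.
    params: hasLastResult ? { lastResultId } : {},
  };
}

// A last id of 0 still produces a filtered query; a truthiness check would drop the filter.
const withZeroId = buildNewRowsQuery({ tableId: "events", uniqueKey: "id", lastResultId: 0 });
const withoutId = buildNewRowsQuery({ tableId: "events", uniqueKey: "id" });
console.log(withZeroId.query);
console.log(withoutId.query);
```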
🧹 Nitpick comments (2)
components/google_cloud/sources/common/bigquery.mjs (2)

138-144: Clearing rows: prefer reassignment or length=0; splice isn’t faster.

rows is local and re-assigned each page; explicit clearing is unnecessary. If you keep it, rows.length = 0 is simpler and typically fastest.

- rows.splice(0, rows.length); // More efficient than rows.length = 0
+ // rows.length = 0; // simple and fast, or omit clearing entirely
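
For context, the two clearing styles mentioned in this nitpick differ in what other references observe. The snippet below is a generic JavaScript illustration with made-up variable names, not code from the component.

```javascript
// In-place clearing vs. reassignment: only the former affects other references.
const page = [1, 2, 3];
const alias = page; // second reference to the same array object

page.length = 0; // truncates the existing array in place
// `alias` is now empty too, because it points at the same array.

let rows = [4, 5, 6];
const oldRows = rows;
rows = []; // rebinds the variable; the old array itself is untouched
// `oldRows` still holds three elements until nothing references it any more.
```

When a variable is reassigned on every page anyway, as the comment notes, the old array becomes unreachable on its own and no explicit clearing is needed.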

182-186: Nit: avoid redundant return await.

return await in async functions adds no benefit without try/catch.

- return await this.processCollection(queryOpts, timestamp);
+ return this.processCollection(queryOpts, timestamp);
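
The case where `return await` does change behavior is inside a try/catch. The following generic sketch uses hypothetical function names to show the difference; outside a try/catch, the caller sees the same result either way.

```javascript
// Hypothetical stand-in for an async call that fails.
async function failingQuery() {
  throw new Error("query failed");
}

// Without await, the promise is returned as-is, so its rejection
// bypasses this function's catch block and reaches the caller.
async function withoutAwait() {
  try {
    return failingQuery();
  } catch (err) {
    return "handled";
  }
}

// With await, the rejection is raised inside the try block and is caught here.
async function withAwait() {
  try {
    return await failingQuery();
  } catch (err) {
    return "handled";
  }
}
```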
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 169cd95 and 75091ec.

📒 Files selected for processing (3)
  • components/google_cloud/sources/bigquery-new-row/bigquery-new-row.mjs (3 hunks)
  • components/google_cloud/sources/bigquery-query-results/bigquery-query-results.mjs (1 hunks)
  • components/google_cloud/sources/common/bigquery.mjs (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • components/google_cloud/sources/bigquery-query-results/bigquery-query-results.mjs
🧰 Additional context used
🧬 Code graph analysis (2)
components/google_cloud/sources/bigquery-new-row/bigquery-new-row.mjs (1)
components/google_cloud/sources/common/bigquery.mjs (5)
  • client (54-56)
  • job (64-66)
  • queryOpts (184-184)
  • rows (92-95)
  • maxRowsPerExecution (73-73)
components/google_cloud/sources/common/bigquery.mjs (1)
components/google_cloud/sources/bigquery-new-row/bigquery-new-row.mjs (5)
  • queryOpts (107-112)
  • job (116-118)
  • client (113-115)
  • maxRowsPerExecution (134-134)
  • rows (119-121)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Publish TypeScript components
  • GitHub Check: Verify TypeScript components
  • GitHub Check: Lint Code Base
🔇 Additional comments (3)
components/google_cloud/sources/common/bigquery.mjs (1)

29-33: Max rows per execution defaults look good.

Lowering the default (1000) and raising the max (10000) are reasonable safeguards for memory.

components/google_cloud/sources/bigquery-new-row/bigquery-new-row.mjs (2)

12-12: Version bump OK.


113-121: Use BigQuery client for createQueryJob instead of Dataset.createQueryJob

  • Dataset.createQueryJob returns an object without getQueryResults; replace with:
- const client = this.googleCloud.getBigQueryClient().dataset(this.datasetId);
- const [job] = await client.createQueryJob(queryOpts);
+ const bigquery = this.googleCloud.getBigQueryClient();
+ const opts = { ...queryOpts, defaultDataset: { datasetId: this.datasetId } };
+ const [job] = await bigquery.createQueryJob(opts);
  • Optional: narrow SELECT * to SELECT \`${this.uniqueKey}\` to reduce payload.

Likely an incorrect or invalid review comment.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
components/google_cloud/actions/switch-instance-boot-status/switch-instance-boot-status.mjs (1)

77-90: Consider rate limiting in the wait loop.

The while loop continuously polls operationsClient.wait() without an explicit delay between iterations. If the Google Cloud SDK's wait() method doesn't include built-in rate limiting, this could result in excessive API calls for long-running operations.

Consider adding a small delay between iterations or verify that operationsClient.wait() includes built-in throttling:

 async waitOperation(operation) {
   const operationsClient = this.zoneOperationsClient();
   const sdkParams = this.googleCloud.sdkParams();
   while (operation.status !== "DONE") {
     [
       operation,
     ] = await operationsClient.wait({
       operation: operation.name,
       project: sdkParams.projectId,
       zone: operation.zone.split("/").pop(),
     });
+    // Add a small delay to prevent excessive API calls if wait() doesn't throttle
+    if (operation.status !== "DONE") {
+      await new Promise(resolve => setTimeout(resolve, 1000));
+    }
   }
   return operation;
 }
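
A variant of the same idea that also bounds the number of polls might look like the sketch below. `waitUntilDone`, `pollOnce`, `delayMs`, and `maxAttempts` are hypothetical names; `pollOnce` stands in for a call such as `operationsClient.wait(...)`, and the sketch makes no claim about whether that SDK call throttles internally.

```javascript
// Hypothetical generic poller; `pollOnce` must resolve to an object with a `status` field.
async function waitUntilDone(pollOnce, { delayMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const operation = await pollOnce();
    if (operation.status === "DONE") {
      return operation;
    }
    // Pause before the next poll so a fast-returning call cannot spin in a tight loop.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Operation not DONE after ${maxAttempts} attempts`);
}
```

Capping the attempts trades an unbounded wait for an explicit error, which may or may not be desirable for an action that is expected to block until the VM changes state.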
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c173cc and ea3e814.

📒 Files selected for processing (14)
  • components/google_cloud/actions/bigquery-insert-rows/bigquery-insert-rows.mjs (1 hunks)
  • components/google_cloud/actions/create-bucket/create-bucket.mjs (1 hunks)
  • components/google_cloud/actions/create-scheduled-query/create-scheduled-query.mjs (1 hunks)
  • components/google_cloud/actions/get-bucket/get-bucket.mjs (1 hunks)
  • components/google_cloud/actions/get-object/get-object.mjs (1 hunks)
  • components/google_cloud/actions/list-buckets/list-buckets.mjs (1 hunks)
  • components/google_cloud/actions/logging-write-log/logging-write-log.mjs (1 hunks)
  • components/google_cloud/actions/run-query/run-query.mjs (1 hunks)
  • components/google_cloud/actions/search-objects/search-objects.mjs (1 hunks)
  • components/google_cloud/actions/switch-instance-boot-status/switch-instance-boot-status.mjs (4 hunks)
  • components/google_cloud/actions/upload-object/upload-object.mjs (1 hunks)
  • components/google_cloud/google_cloud.app.mjs (0 hunks)
  • components/google_cloud/sources/bigquery-query-results/bigquery-query-results.mjs (1 hunks)
  • components/google_cloud/sources/new-pubsub-messages/new-pubsub-messages.mjs (1 hunks)
💤 Files with no reviewable changes (1)
  • components/google_cloud/google_cloud.app.mjs
✅ Files skipped from review due to trivial changes (9)
  • components/google_cloud/actions/bigquery-insert-rows/bigquery-insert-rows.mjs
  • components/google_cloud/actions/list-buckets/list-buckets.mjs
  • components/google_cloud/actions/run-query/run-query.mjs
  • components/google_cloud/sources/new-pubsub-messages/new-pubsub-messages.mjs
  • components/google_cloud/actions/create-bucket/create-bucket.mjs
  • components/google_cloud/actions/get-object/get-object.mjs
  • components/google_cloud/actions/logging-write-log/logging-write-log.mjs
  • components/google_cloud/actions/upload-object/upload-object.mjs
  • components/google_cloud/actions/get-bucket/get-bucket.mjs
🚧 Files skipped from review as they are similar to previous changes (1)
  • components/google_cloud/sources/bigquery-query-results/bigquery-query-results.mjs
🧰 Additional context used
🧬 Code graph analysis (1)
components/google_cloud/actions/switch-instance-boot-status/switch-instance-boot-status.mjs (1)
components/google_cloud/google_cloud.app.mjs (2)
  • zones (16-16)
  • instances (26-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Lint Code Base
  • GitHub Check: Publish TypeScript components
  • GitHub Check: Verify TypeScript components
🔇 Additional comments (8)
components/google_cloud/actions/search-objects/search-objects.mjs (1)

5-5: LGTM!

The version bump from "0.0.4" to "0.0.5" is appropriate as part of the broader package update.

components/google_cloud/actions/create-scheduled-query/create-scheduled-query.mjs (1)

15-15: LGTM! Metadata version bump aligns with package update.

The version increment from "0.0.2" to "0.0.3" is appropriate as part of the broader Google Cloud package update (0.6.2 → 0.6.3 as noted in the PR summary). No functional changes were made to this action.

components/google_cloud/actions/switch-instance-boot-status/switch-instance-boot-status.mjs (6)

2-7: LGTM!

The imports are correctly structured and all are utilized in the methods section below.


11-11: LGTM!

Version increment is appropriate for the refactoring changes.


22-40: LGTM!

The props are well-structured with appropriate UI metadata and dynamic options. The zone dependency in instanceName is correctly handled with the guard clause on Line 36.


57-76: LGTM!

Client factory methods and listZones() are correctly implemented using the Google Cloud Compute SDK.


91-119: LGTM!

Both listVmInstancesByZone() and switchInstanceBootStatus() are correctly implemented. The validation and dynamic method invocation in switchInstanceBootStatus() are handled properly.


121-141: LGTM!

The run() method correctly orchestrates the boot status switch operation with optional wait for completion. The logic flow is clear and appropriate.
