@codelixir
Member

  1. Add OpenLineage properties to the Spark31BigQueryTable class.
  2. Add BigQueryRelationProvider as an abstract class in the v2 module, to be extended by BaseBigQuerySource (the parent class of all the Spark BigQuery Table Provider classes).
@vishalkarve15
Contributor

/gcbrun

Member

@davidrabinowitz davidrabinowitz left a comment

Please add an integration test that verifies the lineage events are created.

@vishalkarve15
Contributor

/gcbrun

@codelixir
Member Author

codelixir commented Apr 22, 2024

As discussed, I have moved the logic to the common module so that both the DSv1 and DSv2 connectors call the same method internally.
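
For readers following along, here is a minimal sketch of the shape of that refactoring, not the code in this PR. The class names (`LineageProperties`, `V1RelationSketch`, `V2TableSketch`) are hypothetical, and the property keys are assumptions chosen for illustration; the point is only that both connector flavors delegate to one shared helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shared helper, standing in for code in the common module.
final class LineageProperties {
  private LineageProperties() {}

  // Builds the dataset properties a lineage listener could read.
  // The key names below are assumptions, not confirmed by this PR.
  static Map<String, String> forTable(String project, String dataset, String table) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put("openlineage.dataset.namespace", "bigquery");
    props.put("openlineage.dataset.name", project + "." + dataset + "." + table);
    return props;
  }
}

// Stand-in for a DSv1 relation: delegates to the shared helper.
final class V1RelationSketch {
  private final String project, dataset, table;

  V1RelationSketch(String project, String dataset, String table) {
    this.project = project;
    this.dataset = dataset;
    this.table = table;
  }

  Map<String, String> lineageProperties() {
    return LineageProperties.forTable(project, dataset, table);
  }
}

// Stand-in for a DSv2 table: exposes the same values as table properties.
final class V2TableSketch {
  private final String project, dataset, table;

  V2TableSketch(String project, String dataset, String table) {
    this.project = project;
    this.dataset = dataset;
    this.table = table;
  }

  Map<String, String> properties() {
    return LineageProperties.forTable(project, dataset, table);
  }
}
```

Because both stand-ins call `LineageProperties.forTable`, a change to the naming scheme only has to be made in one place.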

@vishalkarve15
Contributor

/gcbrun

@vishalkarve15
Contributor

/gcbrun

@ddebowczyk92
Contributor

Hey @codelixir, thank you for your contribution! We appreciate your effort. Have you thought about leveraging the spark-interfaces-scala package for generating metadata for OpenLineage events? This package is designed to facilitate the transition of lineage extraction ownership to the Spark extension owners. You can find more information about it here. Thanks once again for your contribution!

@davidrabinowitz
Member

Hi @ddebowczyk92, thanks for the input! We try to keep the DataSource v2 connectors Scala-agnostic in order to simplify usage for customers, given the incompatibility between Scala 2.12 and 2.13. Once this PR is done, we can think about how to incorporate the interface into the connector.

@davidrabinowitz davidrabinowitz changed the title from "Add BigQueryRelationProvider class for OpenLineage" to "Support OpenLineage in spark-3.x-bigquery connectors" Apr 25, 2024
Signed-off-by: Pahulpreet Singh <pahulpreets@google.com>
@vishalkarve15
Contributor

/gcbrun

codelixir and others added 2 commits April 30, 2024 05:14
@codelixir codelixir requested a review from vishalkarve15 May 1, 2024 13:33
@davidrabinowitz
Member

/gcbrun

@davidrabinowitz
Member

/gcbrun

@davidrabinowitz davidrabinowitz merged commit 558f18f into GoogleCloudDataproc:master May 1, 2024
isha97 pushed a commit that referenced this pull request May 29, 2024
Signed-off-by: Pahulpreet Singh <pahulpreets@google.com>
isha97 pushed a commit that referenced this pull request May 29, 2024
Signed-off-by: Pahulpreet Singh <pahulpreets@google.com>
isha97 pushed a commit that referenced this pull request May 30, 2024
Signed-off-by: Pahulpreet Singh <pahulpreets@google.com>
codelixir added a commit that referenced this pull request Mar 12, 2025
Signed-off-by: Pahulpreet Singh <pahulpreets@google.com> (cherry picked from commit 558f18f)
@codelixir codelixir deleted the bigquery-relation-provider-v2 branch May 22, 2025 04:11