Skip to content

Conversation

@m1n0
Copy link
Contributor

@m1n0 m1n0 commented May 12, 2025

This changes the model to use connections with a nested object inside to provide named connections instead of just the singular connection we had previously. I thought about supporting both, or adding backwards compatibility but the change is so small, clear and easy to understand that I don't really think we can add that extra layer in yaml without confusing people too much.

The implemented logic just uses the first connection in the list to create data source connections for now, we will extend that once we know the details for use cases that will use multiple connections. Any name can be used for that first connection.

I also left the parsing as is for now since originally there was no parsing of Datasource yamls at all, (not even checking for type to be present) so pydantic errors are surfaces with no alteration for now. This will be tackled later.

so this

type: postgres name: contracts connection: host: localhost user: postgres password: secret database: postgres 

becomes

type: postgres name: contracts connections: default: host: localhost user: postgres password: secret database: postgres 
@m1n0 m1n0 requested review from Copilot, nielsn and tombaeyens May 13, 2025 10:26
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR supports multiple datasource connections by updating the configuration from a single "connection" to a dictionary under "connections" and adding a default connection selector. Key changes include:

  • Updates to test helpers and model files for Snowflake, Postgres, and Databricks to use the "connections" key with a nested "default" connection.
  • Modification of the base DataSource model in soda-core to store connection properties as a dictionary and introduce the default_connection property.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
soda-snowflake/tests/data_sources/snowflake_data_source_test_helper.py Updated YAML to use "connections" with a nested "default" connection.
soda-snowflake/src/soda_snowflake/model/data_source/snowflake_data_source.py Removed the old connection field and validator to support multiple connections.
soda-snowflake/src/soda_snowflake/common/data_sources/snowflake_data_source.py Adjusted connection access to use the new default_connection property.
soda-postgres/tests/data_sources/postgres_data_source_test_helper.py Updated YAML to reflect the new "connections" structure.
soda-postgres/src/soda_postgres/model/data_source/postgres_data_source.py Removed the field validator and adjusted the connection alias in line with multiple connections.
soda-postgres/src/soda_postgres/common/data_sources/postgres_data_source.py Updated connection access to use default_connection.
soda-databricks/tests/data_sources/databricks_data_source_test_helper.py Updated YAML to use "connections" with a nested "default" connection.
soda-databricks/src/soda_databricks/model/data_source/databricks_data_source.py Removed the old connection field to support the multiple connection approach.
soda-databricks/src/soda_databricks/common/data_sources/databricks_data_source.py Adjusted connection access to use default_connection.
soda-core/tests/components/test_data_source_api.py Updated tests to access the default connection from the new connections dictionary.
soda-core/src/soda_core/model/data_source/data_source.py Changed connection_properties to a dictionary, added default_connection property, and implemented a field validator for connections.

@property
def default_connection(self) -> DataSourceConnectionProperties:
"""Default first iteration implemetation. Use first connection in the dict for now. Empty dict is not allowed."""
Copy link

Copilot AI May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling mistake: 'implemetation' should be corrected to 'implementation'.

Suggested change
"""Default first iteration implemetation. Use first connection in the dict for now. Empty dict is not allowed."""
"""Default first iteration implementation. Use first connection in the dict for now. Empty dict is not allowed."""
Copilot uses AI. Check for mistakes.
Comment on lines +26 to +28
"""Default first iteration implemetation. Use first connection in the dict for now. Empty dict is not allowed."""

# TODO: Decide and implement the connection selection strategy. Use "default" or "primary" connection if available? etc
Copy link

Copilot AI May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider explicitly selecting the 'default' connection key (if present) instead of relying on iteration order, to improve clarity and maintainability when multiple connections are provided.

Suggested change
"""Default first iteration implemetation. Use first connection in the dict for now. Empty dict is not allowed."""
# TODO: Decide and implement the connection selection strategy. Use "default" or "primary" connection if available? etc
"""Return the default connection. Prioritize the 'default' key if present; otherwise, use the first connection in the dict."""
if "default" in self.connection_properties:
return self.connection_properties["default"]
Copilot uses AI. Check for mistakes.
Copy link
Contributor

@nielsn nielsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still in doubt if this is the way to go forward (and if we should not introduce an additional CLI argument with its own configuration file instead)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants