Support multiple datasource connections #2276
base: v4
Conversation
Pull Request Overview
This PR supports multiple datasource connections by updating the configuration from a single "connection" to a dictionary under "connections" and adding a default connection selector. Key changes include:
- Updates to test helpers and model files for Snowflake, Postgres, and Databricks to use the "connections" key with a nested "default" connection.
- Modification of the base DataSource model in soda-core to store connection properties as a dictionary and introduce the default_connection property.
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| soda-snowflake/tests/data_sources/snowflake_data_source_test_helper.py | Updated YAML to use "connections" with a nested "default" connection. |
| soda-snowflake/src/soda_snowflake/model/data_source/snowflake_data_source.py | Removed the old connection field and validator to support multiple connections. |
| soda-snowflake/src/soda_snowflake/common/data_sources/snowflake_data_source.py | Adjusted connection access to use the new default_connection property. |
| soda-postgres/tests/data_sources/postgres_data_source_test_helper.py | Updated YAML to reflect the new "connections" structure. |
| soda-postgres/src/soda_postgres/model/data_source/postgres_data_source.py | Removed the field validator and adjusted the connection alias in line with multiple connections. |
| soda-postgres/src/soda_postgres/common/data_sources/postgres_data_source.py | Updated connection access to use default_connection. |
| soda-databricks/tests/data_sources/databricks_data_source_test_helper.py | Updated YAML to use "connections" with a nested "default" connection. |
| soda-databricks/src/soda_databricks/model/data_source/databricks_data_source.py | Removed the old connection field to support the multiple connection approach. |
| soda-databricks/src/soda_databricks/common/data_sources/databricks_data_source.py | Adjusted connection access to use default_connection. |
| soda-core/tests/components/test_data_source_api.py | Updated tests to access the default connection from the new connections dictionary. |
| soda-core/src/soda_core/model/data_source/data_source.py | Changed connection_properties to a dictionary, added default_connection property, and implemented a field validator for connections. |
```python
    @property
    def default_connection(self) -> DataSourceConnectionProperties:
        """Default first iteration implemetation. Use first connection in the dict for now. Empty dict is not allowed."""
```
Copilot AI May 13, 2025
Spelling mistake: 'implemetation' should be corrected to 'implementation'.
Suggested change:
```diff
-        """Default first iteration implemetation. Use first connection in the dict for now. Empty dict is not allowed."""
+        """Default first iteration implementation. Use first connection in the dict for now. Empty dict is not allowed."""
```
```python
        """Default first iteration implemetation. Use first connection in the dict for now. Empty dict is not allowed."""

        # TODO: Decide and implement the connection selection strategy. Use "default" or "primary" connection if available? etc
```
Copilot AI May 13, 2025
[nitpick] Consider explicitly selecting the 'default' connection key (if present) instead of relying on iteration order, to improve clarity and maintainability when multiple connections are provided.
Suggested change:
```diff
-        """Default first iteration implemetation. Use first connection in the dict for now. Empty dict is not allowed."""
-        # TODO: Decide and implement the connection selection strategy. Use "default" or "primary" connection if available? etc
+        """Return the default connection. Prioritize the 'default' key if present; otherwise, use the first connection in the dict."""
+        if "default" in self.connection_properties:
+            return self.connection_properties["default"]
```
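The selection strategy suggested in this review can be sketched as a small standalone model. This is a hedged illustration, not the actual soda-core code: the class names `ConnectionProperties` and `DataSource` and the `host` field are hypothetical stand-ins, and plain dataclasses are used in place of pydantic.

```python
from dataclasses import dataclass


@dataclass
class ConnectionProperties:
    """Hypothetical stand-in for the per-connection properties model."""
    host: str


@dataclass
class DataSource:
    """Hypothetical stand-in for the data source model with named connections."""
    connection_properties: dict[str, ConnectionProperties]

    def __post_init__(self) -> None:
        # Mirrors the PR's validator: an empty connections dict is not allowed.
        if not self.connection_properties:
            raise ValueError("at least one connection must be configured")

    @property
    def default_connection(self) -> ConnectionProperties:
        # Prefer an explicit "default" key, as the review comment suggests;
        # otherwise fall back to the first connection in insertion order.
        if "default" in self.connection_properties:
            return self.connection_properties["default"]
        return next(iter(self.connection_properties.values()))
```

Relying on insertion order alone works because Python dicts preserve it, but an explicit `"default"` key makes the behavior predictable when multiple connections are configured.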
nielsn left a comment
Still in doubt whether this is the way to go forward (and whether we should instead introduce an additional CLI argument with its own configuration file)



This changes the model to use `connections` with a nested object inside to provide named connections, instead of just the singular `connection` we had previously. I thought about supporting both, or adding backwards compatibility, but the change is so small, clear, and easy to understand that I don't think we can add that extra layer in YAML without confusing people too much. The implemented logic just uses the first connection in the dict to create data source connections for now; we will extend that once we know the details of the use cases that need multiple connections. Any name can be used for that first connection.
I also left the parsing as-is for now, since originally there was no parsing of data source YAMLs at all (not even checking for `type` to be present), so pydantic errors are surfaced without alteration for now. This will be tackled later.
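A sketch of the configuration change described above. The `type`, `host`, and `username` keys are illustrative; only the move from `connection` to a named entry under `connections` is taken from this PR.

```yaml
# Before: a single, unnamed connection
type: postgres
connection:
  host: localhost
  username: soda

# After: named connections nested under "connections"
type: postgres
connections:
  default:
    host: localhost
    username: soda
```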