Databricks Platform Discussions
Dive into comprehensive discussions covering various aspects of the Databricks platform. Join the co...
Engage in vibrant discussions covering diverse learning topics within the Databricks Community. Expl...
Hello All, I am getting the below error when trying to create an ODBC DSN (Simba 64-bit) on my local system to connect to the Databricks server using a token, with SSL (System trust store) and Thrift Transport: HTTP enabled. Any helping hand is really appreciated. [Simba][ThriftE...
Solved for my case. Still not sure why/how it was working on one server but not the other. The final fix was to add the HTTPPath value to the connection string I listed above.
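For anyone hitting the same error, here is a minimal sketch of a connection string that includes HTTPPath, tested via pyodbc rather than a DSN; the host, HTTP path, and token values are placeholders, not from the original post:

```python
import pyodbc

# Placeholder values -- replace Host, HTTPPath, and the token with your own workspace details.
conn_str = (
    "Driver=Simba Spark ODBC Driver;"
    "Host=adb-1234567890123456.7.azuredatabricks.net;"
    "Port=443;"
    "HTTPPath=/sql/1.0/warehouses/abcdef1234567890;"  # the value the OP had to add
    "SSL=1;"
    "ThriftTransport=2;"  # 2 = HTTP transport
    "AuthMech=3;"         # 3 = username/password, where the password is a PAT
    "UID=token;"
    "PWD=dapiXXXXXXXXXXXXXXXX;"
)

conn = pyodbc.connect(conn_str, autocommit=True)
print(conn.cursor().execute("SELECT 1").fetchone())
```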
Hello, I have a Databricks workspace with SSO authentication; the IdP is on Azure. The client certificate expired and now I can't log on to Databricks to add the new one. What can I do? Any idea is welcome. Thank you!! Best regards, Daniela
Hello, I’m facing severe performance issues with a MERGE INTO in Databricks. merge_condition = """ source.data_hierarchy = target.data_hierarchy AND source.sensor_id = target.sensor_id AND source.timestamp = target.timestamp """ The target Delt...
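A common mitigation for slow Delta merges like this is to add a pruning predicate on the target so the merge does not scan the whole table. A minimal sketch of that idea, where the table names, source DataFrame, and the 7-day window are assumptions, not details from the post:

```python
from delta.tables import DeltaTable

# Hypothetical names for illustration only.
source_df = spark.table("staging.sensor_readings_updates")
target = DeltaTable.forName(spark, "target_db.sensor_readings")

(
    target.alias("target")
    .merge(
        source_df.alias("source"),
        # Same join keys as in the post, plus a predicate that narrows the
        # target scan to recent data instead of the full table.
        """
        source.data_hierarchy = target.data_hierarchy
        AND source.sensor_id = target.sensor_id
        AND source.timestamp = target.timestamp
        AND target.timestamp >= current_date() - INTERVAL 7 DAYS
        """,
    )
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```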
I’m building a dashboard in a Power BI Pro workspace, connecting data via DirectQuery from Databricks (around 60 million rows from 15 combined tables), using a serverless SQL warehouse (small size, 4 clusters). The problem is that the dashboard is taking to...
@viniciuscini have you managed to get it working well for you?
Hi everyone, I’m running into an issue with a Delta Live Tables (DLT) pipeline that processes a few transformation layers (raw → intermediate → primary → feature). When I trigger the entire pipeline, it fails with the following error: can not infer sche...
@databricksero Explicit schema definition: when calling spark.createDataFrame(pdf_cleaned), explicitly provide the schema even if the DataFrame is empty. This gives Spark the column types up front and prevents the “cannot infer schema from empty dataset” erro...
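A minimal sketch of what that looks like; the schema and column names below are illustrative, not from the original pipeline:

```python
import pandas as pd
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

# Illustrative schema -- the real column names and types come from your pipeline.
schema = StructType([
    StructField("sensor_id", StringType(), True),
    StructField("timestamp", TimestampType(), True),
    StructField("value", DoubleType(), True),
])

# An empty pandas DataFrame, as might come out of a cleaning step.
pdf_cleaned = pd.DataFrame(columns=["sensor_id", "timestamp", "value"])

# With an explicit schema, Spark never has to infer types, so the empty
# DataFrame no longer triggers "can not infer schema from empty dataset".
df = spark.createDataFrame(pdf_cleaned, schema=schema)
```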
If you were creating Unity Catalogs again, what would you do differently based on your past experience?
@nayan_wylde no, don't do that, hehe. That was an example of an extreme approach. Usually you use catalogs to separate environments and, in enterprises, to separate divisions like customer tower, marketing tower, finance tower, etc.
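For illustration, a sketch of that environment-plus-division layout in Unity Catalog; the catalog and schema names are hypothetical, not a recommendation from the thread:

```python
# Hypothetical layout: one catalog per environment and division.
for env in ("dev", "prod"):
    for division in ("customer", "marketing", "finance"):
        spark.sql(f"CREATE CATALOG IF NOT EXISTS {env}_{division}")
        spark.sql(f"CREATE SCHEMA IF NOT EXISTS {env}_{division}.bronze")
```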
After completing all the relevant courses for the certification, I haven’t received the coupon code yet.
Wait till 15th November; if you still have no voucher, open a support ticket.
Let’s say we have a big data application where data loss is not an option. With GZRS (geo-zone-redundant storage) redundancy we would achieve zero data loss as long as the primary region is alive – the writer waits for acks from two or more Azure availability zo...
Databricks is working on improvements and new functionality related to that. For now, the only solution is a DEEP CLONE. You can run it more frequently or implement your own replication based on a change data feed. You could use delta sharing for tha...
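A minimal sketch of the DEEP CLONE approach mentioned above, run as a scheduled job; the source and target table names are hypothetical:

```python
# Hypothetical names: replicate a production table into a secondary-region catalog.
SOURCE = "prod_catalog.sales.orders"
TARGET = "dr_catalog.sales.orders"

# DEEP CLONE copies data files and metadata; re-running the statement on an
# existing clone only syncs the changes made since the previous run, so it can
# be scheduled as frequently as your recovery point objective requires.
spark.sql(f"CREATE OR REPLACE TABLE {TARGET} DEEP CLONE {SOURCE}")
```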
Hi! We are creating a table in a streaming job every micro-batch using the spark.sql('create or replace table ... using delta as ...') command. The query combines data from multiple tables. Sometimes our job fails with the error: py4j.Py4JException: An e...
Hi @deng_dev, did you discover any way to raise this error gracefully? I'm facing the same error when running the Kinesis stream. I'm aware of what the error is, but my intent is to raise and log it gracefully.
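One common pattern for this (not confirmed in the thread, just a sketch) is to wrap the per-batch SQL in a try/except inside foreachBatch, so the failure gets logged before being re-raised; the table names and checkpoint path below are placeholders:

```python
import logging

logger = logging.getLogger("stream_batch")

def process_batch(batch_df, batch_id):
    batch_df.createOrReplaceTempView("incoming_batch")
    try:
        # Hypothetical statement, same shape as the job described above.
        spark.sql(
            "CREATE OR REPLACE TABLE target_db.snapshot USING DELTA AS "
            "SELECT * FROM incoming_batch JOIN dim_db.reference USING (id)"
        )
    except Exception as exc:  # the Py4JException surfaces here on the driver
        logger.error("Batch %s failed: %s", batch_id, exc)
        raise  # re-raise so the stream still fails visibly after logging

query = (
    spark.readStream.table("source_db.events")            # placeholder source
    .writeStream
    .foreachBatch(process_batch)
    .option("checkpointLocation", "/tmp/checkpoints/snapshot_job")  # placeholder path
    .start()
)
```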
Hi All, how are you doing today? I wanted to share something interesting from my recent Databricks work — I’ve been playing around with an idea I call “Real-Time Metadata Intelligence.” Most of us focus on optimizing data pipelines, query performance,...
I like the core idea. You are mining signals the platform already emits. I would start rules-first: track the small-files ratio and the average file size trend, watch skew per partition and shuffle bytes per input gigabyte. Compare job time to input size to c...
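As a starting point for the small-files rule, here is a sketch that pulls file counts and sizes from Delta table metadata; the table name and the 128 MB threshold are assumptions, not values from the thread:

```python
# Hypothetical table and threshold.
TABLE = "prod_catalog.sales.orders"
SMALL_FILE_BYTES = 128 * 1024 * 1024  # treat an average under 128 MB as "small files"

# DESCRIBE DETAIL on a Delta table exposes numFiles and sizeInBytes.
detail = spark.sql(f"DESCRIBE DETAIL {TABLE}").first()
avg_file_bytes = detail["sizeInBytes"] / max(detail["numFiles"], 1)

print(f"{TABLE}: {detail['numFiles']} files, avg {avg_file_bytes / 1024**2:.1f} MB per file")
if avg_file_bytes < SMALL_FILE_BYTES:
    print("Average file size below threshold -- consider OPTIMIZE or auto-compaction.")
```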
Hello Everyone, happy to be a part of the Virtual Journey!! I enrolled in Associate Spark Developer and completed the learning path in Databricks Academy. Can anyone please confirm whether completing the learning path is enough to obtain the 50% off voucher for certifi...
Hello @Bhavana_Y! To be eligible for the incentives, you’ll need to complete one of the pathways mentioned in the Learning Festival post. Based on your screenshot, it looks like you’ve completed all four modules of LEARNING PATHWAY 7: APACHE SPARK DE...
Hi, I'm a Solution Architect at a reputed insurance company looking for a few key technical details about the Lakebase architecture. For a fully managed serverless OLTP offering from Databricks, there is no clear documentation that talks about data st...
Hi @YugandharG,
1. Lakebase data is stored in Databricks-managed cloud object storage. There's no option to use customer storage as of now.
2. File format: vanilla Postgres pages. The storage format of Postgres has nothing to do with Parquet/Delta. Wa...
When issuing a query from Informatica using a Delta connection, the statement use catalog_name.schema_name is executed first. At that time, the following error appeared in the query history: Query could not be scheduled: (conn=5073499) Deadlock found w...
I’ll try making adjustments on the Informatica side. Thank you for your help.
We are using PySpark and notice that when we do many transformations/aggregations/joins on the data, at some point the execution time of simple tasks (count, display, union of 2 tables, ...) becomes very slow even if we have small data (ex...
This is a pretty common issue with PySpark when working on large DAGs with lots of joins and transformations. As the DAG grows, Spark has to maintain a huge execution plan, and performance can drop due to shuffling, serialization, and memory overhead...
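The usual mitigation (the reply is truncated, so this is a sketch rather than its exact advice) is to break the lineage by checkpointing or materialising intermediate results, assuming a DataFrame df built from many prior joins:

```python
# Option 1: truncate the logical plan with a checkpoint (needs a checkpoint directory).
spark.sparkContext.setCheckpointDir("/tmp/checkpoints/plan_reset")  # placeholder path
df = df.checkpoint(eager=True)  # later actions plan against the checkpointed data, not the full DAG

# Option 2: materialise to a Delta table and re-read it, which also resets the plan.
df.write.mode("overwrite").saveAsTable("tmp_db.intermediate_result")  # hypothetical table name
df = spark.table("tmp_db.intermediate_result")
```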
Hi all, we are running into an issue with Databricks Asset Bundles (DAB) when trying to destroy a DLT pipeline. The setup is as follows: two separate service principals, a Deployment SP used by Azure DevOps for deploying bundles, and a Run_as SP used for running t...
We just released https://github.com/databricks/cli/releases/tag/v0.273.0 with a mitigation for this; the error should disappear if you upgrade. Please try it and let us know how it goes. The Terraform fix is in https://github.com/databricks/terraform-provid...