Discussions - Databricks Community

Discussions

Engage in dynamic conversations covering diverse topics within the Databricks Community. Explore discussions on data engineering, machine learning, and more. Join the conversation and expand your knowledge base with insights from experts and peers.

Activity in Discussions

by Rajeshwar_Reddy > • New Contributor II

01-10-2025 9:57:00 AM

2011 Views
4 replies
0 kudos

ODBC connection issue Simba 64 bit

Hello all, I am getting the error below when trying to create a 64-bit Simba ODBC DSN on my local system to connect to a Databricks server using a token, with SSL (system trust store) enabled and Thrift Transport set to HTTP. Any helping hand is really appreciated. [Simba][ThriftE...

Databricks Free Edition Help

Latest Reply

DougJames
New Contributor II

3m ago

0 kudos

Solved for my case. Still not sure why/how it was working on one server but not the other. The final fix was to add the HTTPPath value to the connection string I listed above.
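
The connection string mentioned in the reply is not shown on this page, so the following is only a minimal sketch of what a DSN-less connection string with HTTPPath might look like. It assumes the commonly documented Simba Spark ODBC driver keys (Host, Port, HTTPPath, SSL, ThriftTransport, AuthMech, UID, PWD); the driver name, host, HTTP path, and token are placeholders, and `build_connection_string` is a hypothetical helper.

```python
def build_connection_string(host: str, http_path: str, token: str) -> str:
    """Assemble a DSN-less ODBC connection string from key=value pairs."""
    parts = {
        "Driver": "Simba Spark ODBC Driver",  # assumed name of the installed driver
        "Host": host,
        "Port": "443",
        "HTTPPath": http_path,    # the value the reply reports adding
        "SSL": "1",
        "ThriftTransport": "2",   # 2 selects HTTP transport
        "AuthMech": "3",          # user name and password authentication
        "UID": "token",           # literal user name used with access tokens
        "PWD": token,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


conn_str = build_connection_string(
    host="adb-0000000000000000.0.azuredatabricks.net",  # placeholder host
    http_path="/sql/1.0/warehouses/0000000000000000",    # placeholder HTTP path
    token="<personal-access-token>",                     # placeholder secret
)
print(conn_str)
```

Keeping the keys in one dictionary makes it easy to compare the string used on a working server with the one used on a failing server.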

by Daniela_Boamba > • New Contributor III

24m ago

4 Views
0 replies
0 kudos

Databricks certificate expired

Hello, I have a Databricks workspace with SSO authentication. The IdP is on Azure. The client certificate expired, and now I can't log on to Databricks to add the new one. What can I do? Any ideas are welcome. Thank you! Best regards, Daniela

Administration & Architecture

by Mous92i > • New Contributor

Wednesday

121 Views
3 replies
1 kudos

Resolved! Liquid Clustering With Merge

Hello, I’m facing severe performance issues with a MERGE INTO in Databricks. merge_condition = """ source.data_hierarchy = target.data_hierarchy AND source.sensor_id = target.sensor_id AND source.timestamp = target.timestamp """ The target Delt...

Data Engineering

Latest Reply

Mous92i
New Contributor

24m ago

1 kudos

Thanks for your response

by viniciuscini > • New Contributor

08-12-2024 5:21:23 PM

5198 Views
2 replies
0 kudos

Improve query performance of direct query with Databricks

I’m building a dashboard in Power BI’s Pro Workspace, connecting data via Direct Query from Databricks (around 60 million rows from 15 combined tables), using SQL Serverless (small size, 4 clusters). The problem is that the dashboard is taking to...

Get Started Discussions

Latest Reply

ArekKemp
Visitor

2 hours ago

0 kudos

@viniciuscini have you managed to get it working well for you?

by databricksero > • New Contributor

Wednesday

231 Views
8 replies
3 kudos

DLT pipeline fails with “can not infer schema from empty dataset” — works fine when run manually

Hi everyone, I’m running into an issue with a Delta Live Tables (DLT) pipeline that processes a few transformation layers (raw → intermediate → primary → feature). When I trigger the entire pipeline, it fails with the following error: can not infer sche...

Data Engineering

Latest Reply

ManojkMohan
Honored Contributor

yesterday

3 kudos

@databricksero Explicit Schema Definition: When calling spark.createDataFrame(pdf_cleaned), explicitly provide the schema even if the DataFrame is empty. This helps Spark infer the types and prevents the “cannot infer schema from empty dataset” erro...
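
As a rough illustration of the explicit-schema advice, the sketch below builds a DDL schema string that could be passed as the `schema` argument of `spark.createDataFrame`. The column names and types are hypothetical (the original post does not list them), `to_ddl_schema` is an illustrative helper, and the Spark call itself is left as a comment so the snippet runs without a cluster.

```python
from typing import Mapping


def to_ddl_schema(columns: Mapping[str, str]) -> str:
    """Render {'name': 'TYPE', ...} as a DDL schema string such as 'a INT, b STRING'."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns.items())


# Hypothetical columns; replace with the real columns of pdf_cleaned.
EXPECTED_COLUMNS = {
    "sensor_id": "STRING",
    "reading": "DOUBLE",
    "event_time": "TIMESTAMP",
}

schema = to_ddl_schema(EXPECTED_COLUMNS)
print(schema)

# Inside a Spark session, the string can be supplied so that no inference
# is attempted, even when the pandas DataFrame has zero rows:
# df = spark.createDataFrame(pdf_cleaned, schema=schema)
```

Declaring the schema once and reusing it across layers keeps empty and non-empty runs on the same column types.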

by Rezakorehi > • New Contributor II

2 weeks ago

400 Views
7 replies
11 kudos

Unity catalogues - What would you do

If you were creating Unity Catalogs again, what would you do differently based on your past experience?

Get Started Discussions

Latest Reply

Hubert-Dudek
Esteemed Contributor III

3 hours ago

11 kudos

@nayan_wylde no, don't do that, hehe. It was an example of an extreme approach. Usually, use catalogs to separate environments and, in enterprises, to separate divisions such as customer tower, marketing tower, finance tower, etc.

by sjujjuru > • Visitor

yesterday

56 Views
2 replies
1 kudos

REGARDING COUPON CODE

After completing all the relevant courses for the certification, I haven’t received the coupon code yet.

Latest Reply

Hubert-Dudek
Esteemed Contributor III

3 hours ago

1 kudos

Wait until 15th November; if you still have no voucher, open a support ticket.

by YuriS > • New Contributor II

a week ago

280 Views
3 replies
1 kudos

Resolved! How to reduce data loss for Delta Lake on Azure when failing from primary to secondary regions?

Let’s say we have a big data application where data loss is not an option. With GZRS (geo-zone-redundant storage) redundancy, we would achieve zero data loss if the primary region is alive: the writer waits for acks from two or more Azure availability zo...

Get Started Discussions

Latest Reply

Hubert-Dudek
Esteemed Contributor III

3 hours ago

1 kudos

Databricks is working on improvements and new functionality related to that. For now, the only solution is a DEEP CLONE. You can run it more frequently or implement your own replication based on a change data feed. You could use delta sharing for tha...
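
A minimal sketch of the DEEP CLONE approach from the reply: the helper below only builds the SQL statement, which a scheduled job could then run through `spark.sql`. The table names are hypothetical, and `deep_clone_statement` is an illustrative helper rather than a Databricks API.

```python
def deep_clone_statement(source: str, target: str) -> str:
    """Build a Delta DEEP CLONE statement that copies source into target."""
    return f"CREATE OR REPLACE TABLE {target} DEEP CLONE {source}"


# Hypothetical three-level names: a primary-region table and its replica.
stmt = deep_clone_statement(
    source="prod.sales.orders",
    target="prod_dr.sales.orders",
)
print(stmt)

# In a scheduled Databricks job, the statement would be executed with:
# spark.sql(stmt)
```

How often the job runs bounds how much recent data could be missing from the replica after a failover, so the schedule is the main tuning knob in this approach.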

by deng_dev > • New Contributor III

11-27-2023 1:44:03 AM

11238 Views
1 replies
0 kudos

py4j.protocol.Py4JJavaError: An error occurred while calling o359.sql. : java.util.NoSuchElementExce

Hi! We are creating a table in a streaming job every micro-batch using the spark.sql('create or replace table ... using delta as ...') command. This query combines data from multiple tables. Sometimes our job fails with the error: py4j.Py4JException: An e...

Data Engineering

Latest Reply

sahilchavan
New Contributor II

3 hours ago

0 kudos

Hi @deng_dev, did you discover any way to raise this error gracefully? I'm facing the same error when running the Kinesis stream. I'm aware of what the error is, but my intent is to raise and log it gracefully.

by Brahmareddy > • Esteemed Contributor

a week ago

112 Views
1 replies
4 kudos

I Tried Teaching Databricks About Itself — Here’s What Happened

Hi All, How are you doing today? I wanted to share something interesting from my recent Databricks work: I’ve been playing around with an idea I call “Real-Time Metadata Intelligence.” Most of us focus on optimizing data pipelines, query performance,...

Data Engineering

Latest Reply

ruicarvalho_de
New Contributor II

3 hours ago

4 kudos

I like the core idea. You are mining signals the platform already emits. I would start with rules first: track the small-files ratio and the average file size trend, and watch skew per partition and shuffle bytes per input gigabyte. Compare job time to input size to c...
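
The first two rules in the reply can be sketched in a few lines. This is an illustration only: the 32 MiB threshold for a "small" file is an assumption, `small_file_stats` is a hypothetical helper, and the file sizes would in practice come from table metadata rather than a hard-coded list.

```python
from typing import Dict, Sequence

SMALL_FILE_BYTES = 32 * 1024 * 1024  # assumed cutoff for a "small" file


def small_file_stats(
    file_sizes: Sequence[int], threshold: int = SMALL_FILE_BYTES
) -> Dict[str, float]:
    """Return the share of files below the threshold and the mean file size."""
    if not file_sizes:
        return {"small_files_ratio": 0.0, "avg_file_size_bytes": 0.0}
    small = sum(1 for size in file_sizes if size < threshold)
    return {
        "small_files_ratio": small / len(file_sizes),
        "avg_file_size_bytes": sum(file_sizes) / len(file_sizes),
    }


MIB = 1024 * 1024
# Two 1 MiB files and two 128 MiB files as sample input.
stats = small_file_stats([1 * MIB, 1 * MIB, 128 * MIB, 128 * MIB])
print(stats)
```

Recording these two numbers per table on each run gives the trend the reply suggests watching.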

by Bhavana_Y > • New Contributor

yesterday

42 Views
1 replies
1 kudos

Learning Path for Spark Developer Associate

Hello everyone, happy to be a part of the Virtual Journey! I enrolled in Associate Spark Developer and completed the learning path in Databricks Academy. Can anyone please confirm whether completing the learning path is enough to obtain the 50% off voucher for certifi...

Data Engineering

Latest Reply

Advika
Databricks Employee

5 hours ago

1 kudos

Hello @Bhavana_Y! To be eligible for the incentives, you’ll need to complete one of the pathways mentioned in the Learning Festival post. Based on your screenshot, it looks like you’ve completed all four modules of LEARNING PATHWAY 7: APACHE SPARK DE...

by YugandharG > • New Contributor

yesterday

48 Views
1 replies
0 kudos

Lakebase storage location

Hi, I'm a Solution Architect at an insurance company looking for a few key pieces of technical information about the Lakebase architecture. Although it is a fully managed serverless OLTP offering from Databricks, there is no clear documentation that talks about data st...

Administration & Architecture

Latest Reply

szymon_dybczak
Esteemed Contributor III

5 hours ago

0 kudos

Hi @YugandharG, 1. Lakebase data is stored in Databricks-managed cloud object storage. There's no option to use customer storage as of now. 2. File format: vanilla Postgres pages. The storage format of Postgres has nothing to do with Parquet/Delta. Wa...

by donlxz > • New Contributor III

Wednesday

122 Views
4 replies
3 kudos

Resolved! deadlock occurs with use statement

When issuing a query from Informatica using a Delta connection, the statement use catalog_name.schema_name is executed first. At that time, the following error appeared in the query history: Query could not be scheduled: (conn=5073499) Deadlock found w...

Data Engineering

Latest Reply

donlxz
New Contributor III

6 hours ago

3 kudos

I’ll try making adjustments on the Informatica side. Thank you for your help.

by Jonathan_ > • New Contributor II

Tuesday

151 Views
4 replies
6 kudos

Slow PySpark operations after long DAG that contains many joins and transformations

We are using PySpark and have noticed that after many transformations/aggregations/joins of the data, the execution time of simple tasks (count, display, union of 2 tables, ...) at some point becomes very slow, even with small data (ex...

Data Engineering

Latest Reply

tarunnagar
New Contributor

6 hours ago

6 kudos

This is a pretty common issue with PySpark when working on large DAGs with lots of joins and transformations. As the DAG grows, Spark has to maintain a huge execution plan, and performance can drop due to shuffling, serialization, and memory overhead...

by mikvaar > • New Contributor III

09-16-2025 4:11:32 AM

586 Views
8 replies
5 kudos

DAB + DLT destroy fails due to ownership/permissions mismatch

Hi all, We are running into an issue with Databricks Asset Bundles (DAB) when trying to destroy a DLT pipeline. Setup is as follows: Two separate service principals: Deployment SP: used by Azure DevOps for deploying bundles. Run_as SP: used for running t...

Data Engineering

Databricks

Databricks Asset Bundles

DevOps

Latest Reply

denis-dbx
Databricks Employee

7 hours ago

5 kudos

We just released https://github.com/databricks/cli/releases/tag/v0.273.0 with a mitigation for this; the error should disappear if you upgrade. Please try it and let us know how it goes. Terraform fix is in https://github.com/databricks/terraform-provid...

Featured Posts

Level Up with Databricks Specialist Sessions

BrickCon 2025 — Dec 3–5 | A Community Conference for Databricks Builders

Databricks Community Champion - September 2025 - Nayanjyoti Sonowal
