Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

rcostanza
by New Contributor III
  • 81 Views
  • 2 replies
  • 2 kudos

Resolved! Stateless streaming with aggregations on a DLT/Lakeflow pipeline

In a DLT pipeline I have a bronze table that ingests files using Autoloader, and a derived silver table that, for this example, just stores the number of rows for each file ingested into bronze. The basic code example: import dlt from pyspark.sql impo...

Latest Reply
rcostanza
New Contributor III
  • 2 kudos

Gotcha. Thanks for the reply. We've had a medallion architecture in production for a while, based on DLT pipelines holding most bronze/silver tables, with some separate jobs for silver tables that are more complex to materialize. It works, ...

1 More Reply
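A minimal sketch of the setup described in this thread (table names and the landing path are hypothetical, and the code needs a Databricks DLT/Lakeflow pipeline runtime, so it will not run standalone):

```python
import dlt
from pyspark.sql import functions as F

# Bronze: ingest files with Autoloader (path and format are placeholders).
@dlt.table
def bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/catalog/schema/landing/")
    )

# Silver: one row per ingested file with its row count.
# Grouping by the source file name makes this a stateful streaming
# aggregation rather than a stateless transformation.
@dlt.table
def silver_file_counts():
    return (
        dlt.read_stream("bronze")
        .withColumn("source_file", F.col("_metadata.file_path"))
        .groupBy("source_file")
        .count()
    )
```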
jimoskar
by New Contributor
  • 193 Views
  • 6 replies
  • 6 kudos

Resolved! Cluster cannot find init script stored in Volume

I have created an init script stored in a Volume which I want to execute on a cluster with runtime 16.4 LTS. The cluster has policy = Unrestricted and Access mode = Standard. I have additionally added the init script to the allowlist. This should be ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 6 kudos

Hi @jimoskar, since you're using standard access mode you need to add the init script to the allowlist. Did you add your init script to the allowlist? If not, do the following: In your Databricks workspace, click Catalog. Click the gear icon. Click the metastore ...

5 More Replies
YuriS
by New Contributor II
  • 137 Views
  • 1 reply
  • 1 kudos

Resolved! How to reduce data loss for Delta Lake on Azure when failing from primary to secondary regions?

Let’s say we have a big data application where data loss is not an option. With GZRS (geo-zone-redundant storage) redundancy we would achieve zero data loss if the primary region is alive – the writer waits for acks from two or more Azure availability zo...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

In Azure and Databricks environments, ensuring zero data loss during a primary-to-secondary failover—especially for Delta Lake/streaming workloads—is extremely challenging due to asynchronous replication, potential ordering issues, and inconsistent s...

cbhoga
by New Contributor II
  • 76 Views
  • 2 replies
  • 3 kudos

Delta sharing with Celonis

Is there any way, or are there plans, for Databricks to use Delta Sharing to provide data access to Celonis?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @cbhoga, Delta Sharing is an open protocol for secure data sharing. Databricks already supports it natively, so you can publish data using Delta Sharing. However, whether Celonis can directly consume that shared data depends on whether Celonis sup...

1 More Reply
ChristianRRL
by Valued Contributor III
  • 101 Views
  • 3 replies
  • 4 kudos

Performance Comparison: spark.read vs. Autoloader

Hi there, I would appreciate some help to compare the runtime performance of two approaches to performing ELT in Databricks: spark.read vs. Autoloader. We already have a process in place to extract highly nested json data into a landing path, and fro...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 4 kudos

Hi @ChristianRRL, for that kind of ingestion scenario Autoloader is the winner. It will scale much better than a batch approach, especially if we are talking about a large number of files. If you configure Autoloader with file notification mode it can sca...

2 More Replies
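As a rough sketch, the file-notification configuration mentioned in the reply looks like this (the paths, table name, and bucket are placeholders, and the fragment only runs on a Databricks cluster):

```python
# Autoloader with file notification mode: instead of repeatedly listing
# the input directory, it subscribes to cloud storage events, which
# scales much better for very large numbers of files.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")   # file notification mode
    .option("cloudFiles.schemaLocation", "/Volumes/catalog/schema/_schemas/")
    .load("s3://my-bucket/landing/")
)

(df.writeStream
   .option("checkpointLocation", "/Volumes/catalog/schema/_checkpoints/")
   .trigger(availableNow=True)   # incremental, batch-like run
   .toTable("catalog.schema.bronze"))
```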
ChristianRRL
by Valued Contributor III
  • 91 Views
  • 1 reply
  • 2 kudos

Resolved! AutoLoader Ingestion Best Practice

Hi there, I would appreciate some input on AutoLoader best practice. I've read that some people recommend that the latest data should be loaded in its rawest form into a raw delta table (i.e. highly nested json-like schema) and from that data the app...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor II
  • 2 kudos

I think the key thing with holding the raw data in a table, and not transforming that table, is that you have more flexibility at your disposal. There's a great resource available via Databricks Docs for best practices in the Lakehouse. I'd highly re...

ChristianRRL
by Valued Contributor III
  • 92 Views
  • 2 replies
  • 4 kudos

Resolved! What is `read_files`?

Bit of a silly question, but wondering if someone can help me better understand what `read_files` is? read_files table-valued function | Databricks on AWS. There are at least 3 ways to pull raw JSON data into a Spark dataframe: df = spark.read..., df = spark...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor II
  • 4 kudos

Also, @ChristianRRL, with a slight adjustment to the syntax it does indeed behave like Autoloader: https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/patterns?language=SQL I'd also advise looking at the different options th...

1 More Reply
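For reference, the three ingestion styles this question contrasts can be sketched as follows (paths are placeholders; `read_files` and the `cloudFiles` source require a Databricks runtime):

```python
# 1) Plain batch read.
df1 = spark.read.json("/Volumes/catalog/schema/landing/")

# 2) read_files: a SQL table-valued function that shares much of
#    Autoloader's options surface; callable from Python via spark.sql.
df2 = spark.sql("""
    SELECT * FROM read_files(
        '/Volumes/catalog/schema/landing/',
        format => 'json'
    )
""")

# 3) Autoloader: incremental, checkpointed streaming ingestion.
df3 = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/Volumes/catalog/schema/_schemas/")
    .load("/Volumes/catalog/schema/landing/")
)
```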
Maria_fed
by New Contributor III
  • 5039 Views
  • 8 replies
  • 0 kudos

Need help migrating company customer and partner academy accounts to work properly

Hi, originally I accidentally made a customer academy account with my company, which is a Databricks partner. Then I made an account using my personal email and listed my company email as the partner email for the partner academy account. That account ...

Latest Reply
Vaishali2
New Contributor
  • 0 kudos

I need help merging my customer portal ID with my partner email ID. My case number is 00754330.

7 More Replies
rcostanza
by New Contributor III
  • 162 Views
  • 4 replies
  • 2 kudos

Trying to reduce latency on DLT pipelines with Autoloader and derived tables

What I'm trying to achieve: ingest files into bronze tables with Autoloader, then produce Kafka messages for each file ingested using a DLT sink. The issue: the latency between a file being ingested and its message being produced gets exponentially higher the more tables ar...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Hi, I think it is a delay of the Autoloader, as it doesn't know about the ingested files. It has nothing to do with state; Autoloader just keeps a list of processed files. Autoloader scans the directory every minute, usually a...

3 More Replies
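For context, a DLT sink like the one this thread describes can be sketched as below. The broker address, topic, and table names are assumptions, and `dlt.create_sink` plus `@dlt.append_flow` require a Lakeflow/DLT pipeline runtime, so this is only an illustrative fragment:

```python
import dlt
from pyspark.sql import functions as F

# Declare a Kafka sink (connection options are placeholders).
dlt.create_sink(
    name="file_events",
    format="kafka",
    options={
        "kafka.bootstrap.servers": "broker:9092",
        "topic": "ingested-files",
    },
)

# Append one message per newly ingested file; Kafka sinks expect a
# string or binary "value" column.
@dlt.append_flow(target="file_events")
def emit_file_events():
    return (
        dlt.read_stream("bronze")
        .select(F.col("_metadata.file_path").cast("string").alias("value"))
    )
```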
Rezakorehi
by New Contributor
  • 139 Views
  • 3 replies
  • 7 kudos

Unity Catalogs - what would you do differently?

If you were creating Unity Catalogs again, what would you do differently based on your past experience?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 7 kudos

From my experience:
- Don't create separate catalogs for every project; try to think about your design before implementation.
- Try to come up with a consistent naming convention to avoid cognitive overhead.
- Principle of least privilege: grant users and ...

2 More Replies
frunzy
by New Contributor
  • 107 Views
  • 2 replies
  • 2 kudos

how to import sample notebook to azure databricks workspace

In the second onboarding video, the Quickstart Notebook is shown. I found that notebook here: https://www.databricks.com/notebooks/gcp-qs-notebook.html I wanted to import it into my workspace in my Azure Databricks account to play with it. However, selecti...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

I reported this as a bug.

1 More Reply
thethirtyfour
by New Contributor III
  • 5761 Views
  • 3 replies
  • 3 kudos

Resolved! Configure Databricks in VSCode through WSL

Hi, I am having a hard time configuring my Databricks workspace when working in VSCode via WSL. When following the steps to set up Databricks authentication I receive the following error on Step 5 of "Step 4: Set up Databricks authentication"....

Latest Reply
RaulMoraM
New Contributor III
  • 3 kudos

What worked for me was NOT opening the browser via the pop-up (which generated the 3-legged OAuth flow error), but clicking the link provided by the CLI (or copy-pasting the link into the browser).

2 More Replies
masterelaichi
by New Contributor II
  • 120 Views
  • 3 replies
  • 0 kudos

Data analyst learning plan lab files

Hi all, I am very new to Databricks and to this community. I recently signed up for the data analyst learning plan and the data engineering one. The learning platform page seems like a confusing maze to navigate! In the course material for the data analy...

Latest Reply
masterelaichi
New Contributor II
  • 0 kudos

Thanks for your reply. I believe I have access to the partner academy through my company, as I am able to access the partner academy through this link. I can access the courses and various learning plans. However, I am not able to see the actual lab fi...

2 More Replies
Lakshmipriya_N
by New Contributor II
  • 79 Views
  • 1 reply
  • 1 kudos

Request to Extend Partner Tech Summit Lab Access

Hi Team, I would appreciate it if my Partner Tech Summit lab access could be extended, as two of the assigned labs were inaccessible. Could you please advise whom I should contact for this? Thank you. Regards, Lakshmipriya

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Lakshmipriya_N, create a support ticket and wait for a reply: Contact Us

