load data Flashcards

(8 cards)

1
Q

… (existing content unchanged) …

A
2
Q

How can you load data from existing Databricks tables in Lakeflow?

A

Use CREATE OR REFRESH MATERIALIZED VIEW in SQL, or define a DLT table in Python, reading from the existing table. The data can then be transformed further within the pipeline.
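A minimal sketch in SQL (catalog, schema, table, and column names are hypothetical):

CREATE OR REFRESH MATERIALIZED VIEW shipped_orders AS
SELECT order_id, customer_id, amount
FROM main.sales.orders        -- existing Databricks table, hypothetical name
WHERE status = 'shipped';     -- further transformation inside the pipeline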

3
Q

How do you load data from cloud object storage using Auto Loader in SQL?

A

Use the read_files function in a CREATE OR REFRESH STREAMING TABLE query, specifying the file path and format.
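For example (the volume path is hypothetical; inside a streaming table, read_files ingests incrementally via Auto Loader):

CREATE OR REFRESH STREAMING TABLE raw_events AS
SELECT *
FROM STREAM read_files(
  '/Volumes/main/landing/events/',   -- hypothetical path
  format => 'json'
);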

4
Q

How do you ingest data from Kafka into a streaming table?

A

Use read_kafka in a SQL CREATE OR REFRESH STREAMING TABLE statement, providing the Kafka bootstrap servers and the topic name.
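A sketch (broker address and topic are hypothetical; read_kafka returns the message value as binary, so it is cast to a string here):

CREATE OR REFRESH STREAMING TABLE kafka_events AS
SELECT value::string AS payload
FROM STREAM read_kafka(
  bootstrapServers => 'broker1.example.com:9092',  -- hypothetical broker
  subscribe => 'events'                            -- hypothetical topic
);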

5
Q

Can Lakeflow ingest data from systems like PostgreSQL?

A

Yes. Using Spark DataFrame readers in Python (e.g., .format("postgresql") with host, port, and credential options), you can read tables from external databases into the pipeline.
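A sketch, assuming a reachable PostgreSQL host and a secret scope for the password (host, database, table, and secret names are all hypothetical):

import dlt

@dlt.table(comment="Customers replicated from PostgreSQL")
def pg_customers():
    # spark and dbutils are provided by the pipeline runtime.
    return (
        spark.read.format("postgresql")
        .option("host", "pg.example.com")       # hypothetical host
        .option("port", "5432")
        .option("database", "shop")             # hypothetical database
        .option("dbtable", "public.customers")  # hypothetical table
        .option("user", "reader")
        .option("password", dbutils.secrets.get("pg-scope", "pg-password"))
        .load()
    )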

6
Q

How can you ingest small or static files directly into a materialized view?

A

Use read_files in a CREATE OR REFRESH MATERIALIZED VIEW SQL command to load the file contents.
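For example, loading a small reference CSV (path and options are hypothetical):

CREATE OR REFRESH MATERIALIZED VIEW country_codes AS
SELECT *
FROM read_files(
  '/Volumes/main/reference/country_codes.csv',  -- hypothetical path
  format => 'csv',
  header => true
);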

7
Q

How do you ignore updates and deletes in a streaming table source?

A

Set the skipChangeCommits option to true on spark.readStream so the stream ignores transactions that update or delete rows in the source table and processes only appends.
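A sketch in Python (the source table name is hypothetical):

import dlt

@dlt.table
def orders_append_log():
    return (
        spark.readStream
        # Skip commits that update or delete existing rows;
        # only newly appended rows are streamed.
        .option("skipChangeCommits", "true")
        .table("main.sales.orders")  # hypothetical source table
    )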

8
Q

How do you securely load data from Azure Data Lake using secrets?

A

Store the storage credentials in Databricks Secrets and reference them in the pipeline's spark.hadoop.* configuration. Then define a DLT table that uses Auto Loader to read from the ADLS path.
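A sketch, assuming the account key sits in a secret scope and is referenced in the pipeline settings (storage account, container, scope, and key names are hypothetical):

# Pipeline configuration (set in pipeline settings, not in code) references the secret
# rather than a literal key:
# spark.hadoop.fs.azure.account.key.mystorageacct.dfs.core.windows.net = {{secrets/adls-scope/storage-key}}

import dlt

@dlt.table
def adls_raw():
    return (
        spark.readStream.format("cloudFiles")  # Auto Loader
        .option("cloudFiles.format", "json")
        .load("abfss://landing@mystorageacct.dfs.core.windows.net/events/")  # hypothetical path
    )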
