Columns Flashcards

1
Q

What are the different ways you can select columns from customerDf?

A
.select("column_name")
.select('column_name)
.select($"column_name")
.select(col("column_name"))
.select(column("column_name"))
.select(customerDf.col("column_name"))
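
As a rough sketch of how these notations line up, the snippet below selects the same column five ways. The sample data, the object name `SelectColumnsSketch`, and the local SparkSession are assumptions made for illustration; the symbol form ('column_name) is left out because symbol literals are deprecated in newer Scala versions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, column}

object SelectColumnsSketch {
  // Selects the same column with each notation and returns the results.
  def selectVariants(spark: SparkSession): Seq[DataFrame] = {
    import spark.implicits._ // brings in toDF and the $"..." interpolator

    // Hypothetical sample data; the real customerDf is not shown in the cards.
    val customerDf = Seq(("Ada", "Lovelace"), ("Alan", "Turing"))
      .toDF("firstname", "lastname")

    Seq(
      customerDf.select("firstname"),                // plain string
      customerDf.select($"firstname"),               // $ interpolator
      customerDf.select(col("firstname")),           // functions.col
      customerDf.select(column("firstname")),        // functions.column
      customerDf.select(customerDf.col("firstname")) // column bound to this DataFrame
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("select-sketch").getOrCreate()
    selectVariants(spark).foreach(_.show())
    spark.stop()
  }
}
```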
2
Q

How do you combine two columns?

A

.select(expr("concat(firstname, lastname) name"))

3
Q

Is there an overload that takes a string and a column object?

A

No. You can mix the different column-object notations in one call, but you can't mix strings and column objects.
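
A minimal sketch of that rule, assuming a made-up three-column DataFrame and a hypothetical object name `MixedSelectSketch`: several column-object notations share one select call, while the string-plus-column call is left commented out because it does not compile.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object MixedSelectSketch {
  def mixedNotations(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Hypothetical sample data standing in for customerDf.
    val customerDf = Seq(("Ada", "Lovelace", 36)).toDF("firstname", "lastname", "age")

    // Not allowed: there is no select overload taking (String, Column).
    // customerDf.select("firstname", $"lastname")

    // Allowed: every argument is a Column, whatever notation produced it.
    customerDf.select($"firstname", col("lastname"), customerDf.col("age"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("mixed-select-sketch").getOrCreate()
    mixedNotations(spark).show()
    spark.stop()
  }
}
```

One way around the restriction is to wrap each string in col(...) so that all arguments are column objects.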

4
Q

How can you select columns using SQL?

A

.selectExpr("column")

5
Q

With SQL, how do you get and rename a column?

A

.selectExpr("birthdate birthday")

6
Q

How do you see all the columns of a DataFrame?

A

customerDf.columns

7
Q

How do you rename a column in the DataFrame?

A

.withColumnRenamed("old_name", "new_name")

8
Q

True or false: if you rename a column that does not exist, Spark will fail.

A

False. The call succeeds but does nothing.
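
The no-op behaviour can be checked with a small sketch like the one below. The sample data, the column names, and the object name `RenameMissingColumnSketch` are assumptions for illustration only.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RenameMissingColumnSketch {
  // Hypothetical sample data standing in for customerDf.
  def sampleDf(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Ada", "1815-12-10")).toDF("firstname", "birthdate")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("rename-sketch").getOrCreate()
    val customerDf = sampleDf(spark)

    // Existing column: the rename takes effect.
    val renamed = customerDf.withColumnRenamed("birthdate", "birthday")
    println(renamed.columns.mkString(", "))

    // Missing column: no exception is thrown and the column list is unchanged.
    val untouched = customerDf.withColumnRenamed("no_such_column", "anything")
    println(untouched.columns.mkString(", "))

    spark.stop()
  }
}
```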

9
Q

Can withColumnRenamed take column objects?

A

No, strings only.

10
Q

How do you print the schema of the DataFrame?

A

.printSchema()

11
Q

How can you change the data type of a column without using Apache Spark types?

A

.select($"column_object".cast("long"))

12
Q

How can you change the data type of a column using Apache Spark types?

A

import org.apache.spark.sql.types._ (the _ imports all types)

.select($"column_object".cast(StringType))

13
Q

How can you change the data type of a column using a select expression?

A

.selectExpr("cast(complex_object.property_in_there[0] as double) rename_if_want")

This example casts a property of a complex type and renames the result. The same approach works with any column.
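
The three casting styles (type as a string, type as a Spark DataType, and SQL cast inside selectExpr) can be compared side by side. This is only a sketch: the string-typed sample columns, the alias `amount_double`, and the object name `CastSketch` are invented for the example, and it uses flat columns rather than a complex type.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.LongType

object CastSketch {
  def castVariants(spark: SparkSession): Seq[DataFrame] = {
    import spark.implicits._

    // Hypothetical sample data: both columns start out as strings.
    val df = Seq(("1", "2.5"), ("2", "4.0")).toDF("id", "amount")

    Seq(
      df.select($"id".cast("long")),                        // type given as a string
      df.select($"id".cast(LongType)),                      // type given as a Spark DataType
      df.selectExpr("cast(amount as double) amount_double") // SQL cast plus rename
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("cast-sketch").getOrCreate()
    castVariants(spark).foreach(_.printSchema())
    spark.stop()
  }
}
```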

14
Q

How can you add a column that joins two existing columns with a space between them?

A

.withColumn("new_column_name", concat_ws(" ", $"first_column", $"second_column"))
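
A small sketch of the concat_ws approach, next to the expr-based concat for contrast. The sample names, the column `full_name`, and the object name `CombineColumnsSketch` are assumptions; col(...) is used instead of $"..." only so the helpers do not need the implicits import.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws, expr}

object CombineColumnsSketch {
  // Hypothetical sample data standing in for customerDf.
  def sampleDf(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Ada", "Lovelace"), ("Alan", "Turing")).toDF("firstname", "lastname")
  }

  // Adds a new column made of both names separated by a space.
  def withFullName(df: DataFrame): DataFrame =
    df.withColumn("full_name", concat_ws(" ", col("firstname"), col("lastname")))

  // SQL-style concat: no separator, result aliased as "name".
  def concatenated(df: DataFrame): DataFrame =
    df.select(expr("concat(firstname, lastname) name"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("combine-sketch").getOrCreate()
    val customerDf = sampleDf(spark)
    withFullName(customerDf).show()
    concatenated(customerDf).show()
    spark.stop()
  }
}
```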

15
Q

How can you remove a column using a string?

A

.drop("column_name")

16
Q

How do you remove multiple columns using strings?

A

.drop("column1", "column2")

17
Q

Can the column-removal function (.drop) take column objects?

A

Yes

18
Q

What are the different ways you can write a column object?

A
'column_name
$"column_name"
col("column_name")
column("column_name")
customerDf.col("column_name")
19
Q

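Using column objects, how do you remove multiple columns?

A

In Spark versions before 3.4 you can't: with column objects, .drop removes only one column at a time. Spark 3.4 added an overload that accepts several column objects.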
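
The drop variants can be sketched as follows. The four-column sample data and the object name `DropColumnsSketch` are assumptions; chaining single-column drops is shown as one possible way to remove several columns by column object without relying on a newer overload.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object DropColumnsSketch {
  // Hypothetical sample data standing in for customerDf.
  def sampleDf(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Ada", "Lovelace", 36, "London")).toDF("firstname", "lastname", "age", "city")
  }

  // Removes two columns by chaining single-column drops on column objects.
  def dropAgeAndCity(df: DataFrame): DataFrame =
    df.drop(col("age")).drop(col("city"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("drop-sketch").getOrCreate()
    val customerDf = sampleDf(spark)

    customerDf.drop("age").show()         // one column, by name
    customerDf.drop("age", "city").show() // several columns, by name
    customerDf.drop(col("age")).show()    // one column, by column object
    dropAgeAndCity(customerDf).show()     // several columns, by chaining

    spark.stop()
  }
}
```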

20
Q

How do you create a new column by multiplying two other columns together?

A

.withColumn("column_name", $"column_a" * $"column_b")

21
Q

How do you create a new column by dividing two other columns, using an expression?

A

.withColumn("new_column", expr("column1 / column2"))

Using expr is not required here; this just shows how it works.

22
Q

How do you create a new column by rounding another column?

A

.withColumn("new_column", round($"column_name", 2))
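
The computed-column cards (multiplying, dividing with expr, rounding) can be combined into one short sketch. The order data, the column names, and the object name `ComputedColumnsSketch` are made up for illustration; col(...) stands in for $"..." so the helper does not need the implicits import.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, expr, round}

object ComputedColumnsSketch {
  // Hypothetical sample data: one order line.
  def sampleDf(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq((3, 2.5)).toDF("quantity", "unit_price")
  }

  def withComputedColumns(df: DataFrame): DataFrame =
    df
      .withColumn("total", col("quantity") * col("unit_price"))   // multiply two columns
      .withColumn("price_ratio", expr("unit_price / quantity"))   // divide via a SQL expression
      .withColumn("total_rounded", round(col("total"), 2))        // round to 2 decimal places

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("computed-sketch").getOrCreate()
    withComputedColumns(sampleDf(spark)).show()
    spark.stop()
  }
}
```

Each new column appears to the right of the existing ones, in the order the withColumn calls were made.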

23
Q

With .toDF, how can you specify column names?

A

.toDF("column_name")

24
Q

Where does .withColumn put the new column?

A

At the end, as the last column. If a column with that name already exists, it is replaced instead.