Week 2: Data Extraction Flashcards

1
Q

REVERSED

  • Physical level: describes how a record is stored
  • Logical level: describes what data stored in the database and the relationships among the data e.g. name: string, salary: real
  • View level: application programs which hide details of data types. Views can also hide information for security/privacy purposes
A

What are the 3 database levels of abstraction?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

REVERSED

-structures: e.g. rows and columns, nodes and edges, key-value pairs
-constraints: e.g. all rows must have same number of columns, age must be positive, c
* operations: e.g. return the values of record x, find rows where the column “name” is “Jordan”

A

What are the 3 main components of data models?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

REVERSED

  • Schema: specifies name of relation, plus name and type of each column
  • Instance: a table with rows and columns. rows = records = tuples, column = attributes = fields = features (instance is a snapshot of the schema at a point in time)
A

What are the 2 parts that a relation is made up of?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

REVERSED

The number of columns

A

What is the degree of a relation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

REVERSED

The number of columns

A

What is the arity of a relation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

REVERSED

The number of rows

A

What is the cardinality of a relation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

REVERSED

attribute values are normally required to be atomic: indivisible (not containing more than one value)

A

What is the requirement of attribute values in a relation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

REVERSED

K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation

A

What is a superkey of R?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

REVERSED

Superkey K is a candidate key if K is minimal

A

What is a candidate key of R?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

REVERSED

A selected candidate key used to represent the relation

A

What is the primary key of R?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

REVERSED

column or set of columns in table whose values correspond to the primary key in another table

A

What is a foreign key of R?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

REVERSED

  • used for defining the database schema
  • DDL is used to create and modify database objects such as tables, indexes and users
A

What is data definition language?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

REVERSED

Language for accessing and manipulating the data in a model
DML is used for adding, deleting and modifying the records in a database

A

What is data manipulation language?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

REVERSED

a statement requesting the retrieval of information

A

What is a query?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

REVERSED

  • and (^): retrieves records that satisfy both conditions
  • or (v): retrieves records that satisfy one of the conditions
  • not (~): retrieves records that don’t satisfy the condition
A

What are the three boolean operators?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

REVERSED

A^F = F 
AvF = A 
A^T = A 
AvT = T
A

What are the 4 boolean laws including true and false

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

REVERSED

A^A = A 
AvA = A
A

What is A^A and AvA?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

REVERSED

Av ~A = T
A^ ~A = F

A

What are the complement laws?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

REVERSED

~(~A) = A

A

What is the double negation law?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

REVERSED

~(A^B) = ~A v ~B 
~(AvB) = ~A ^ ~B
A

What are de morgans laws?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

REVERSED

A ^ (BvC) = (A^B) v (A^C) 
A v (B^C) = (AvB) ^ (AvC)
A

What are the distributive laws?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

REVERSED

A v (A^B) = A 
A ^ (AvB) = A
A

What are the absorptive laws?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

REVERSED

  • the output of a relational algebra (RA) is a relation that can be input for another RA operator
  • don’t need heaps of functions for each line of code
A

What are the advantages of composing relational algebra operators?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

REVERSED

Union (U): total set of records in both relations, only records each record once even if it is in both
intersection: common records
difference (-): records in one table but not the other
selection (sigma): a specific set of rows
projection (pi): a specific set of columns. Duplicate tuples are removed
join: the natural join of two tables

A

What are the 6 main relational algebra operators?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

REVERSED

rho (p)

A

What is the relational algebra symbol for renaming?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

REVERSED

G or gamma

A

What is the relational algebra symbol for grouping and aggregation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

REVERSED

Gets all the tuples from relation R and pair them with all the tuples from relation S, end up with huge table

A

What does a cartesian product do?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

REVERSED

set of all combinations of tuples in R and S that are equal on their common attribute names. and includes all other attributes from both relations filled in

A

What does a natural join do?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

REVERSED

Pi(A1, sum(A3)) (A1, A2 G sum(A3) (sigma P (r1, r2)))

A

What is the form of relational algebra statement for the general sql query:
select A1, sum (A3)
from r1, r2
where P
group by A1, A2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

REVERSED

Can do the union, intersection or set difference (-) of full relational algebra statements - but not include them within the statements

A

How do you use union, intersection and set difference in relational algebra?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

REVERSED

  • char(n) or character(n): character string with fixed length n
  • varchar(n): variable length character string
  • int or integer: integer of any length
  • numeric(p,d): fixed point number with p digits and d of the digits are to the right of the decimal
  • float(n): floating point number with precision at least n digits
A

What are the 5 basic types in SQL?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

REVERSED

SELECT desired attributes
FROM one or more tables
WHERE condition about tuples of the tables

A

What is the basic form of a select from where statement?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

REVERSED

SELECT DISTINCT

A

How do you force the elimination of duplicates in an SQL query?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

REVERSED

SELECT *

A

How do you select all attributes in an SQL query?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

REVERSED

results in a relation with one column and one row with value ‘542’

A

What is the result of SELECT ‘542’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

REVERSED

returns a relation with one column and N rows(the number of tuples in instructor) where each row will contain the value ‘A’

A

What is the result of SELECT ‘A’ FROM instructor

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

REVERSED

using “as”
e.g. SELECT ID, name, salary/12.0 as monthly_salary
FROM instructor
or simply write salary/12.0 monthly_salary
do not need the “as”

A

How do you rename an attribute in SQL? (2 ways)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

REVERSED

SELECT ID, name
FROM student
WHERE dept_name = ‘Math’ AND tot_cred >= 24

can also use OR

A

What do you use in the where clause to meet more than one criteria?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

REVERSED

generates every possible instructor - teaches pair, with all attributes from both relations/ common attributes in the resulting table are renamed using the relation name
cartesian product

A

What is the result of
SELECT *
FROM instructor, teaches

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

REVERSED

=, <> (not equal), , <=, >=

A

What are the comparison signs in SQL?

41
Q

REVERSED

Self join is where a table joins with itself on two columns with different names but storing data from the same domain. distinguish copies by renaming the relations

A

What is a self join?

42
Q

REVERSED

SELECT T.name AS N1, S.name AS N2
FROM instructor AS T, instructor AS S
WHERE T.salary = S.salary AND T.name < S.name

A

What is an SQL query for getting the name of instructors who have the same salary, avoids pairs like (miller, miller) and produces pairs in alphabetical order. From the relation instructor

43
Q

REVERSED

All attributes from left table are listed. Matching attributes from the right table are added where they are available, null when not available
SELECT *
FROM course
LEFT OUTER JOIN prereq
ON course.course_id = prereq.course_id

A

What is a left outer join and its code in SQL?

44
Q

REVERSED

All attributes from the right table are listed. Matching attributes from the left table are added where they are available, null when not available
SELECT *
FROM course
RIGHT OUTER JOIN prereq
ON course.course_id = prereq.course_id

A

What is a right outer join and its code in SQL?

45
Q

REVERSED

takes all attributes from left and right tables, fills in all matching attributes and puts null where not available
SELECT *
FROM course
FULL OUTER JOIN prereq
ON course.course_id = prereq.course_id

A

What is a full outer join and its code in SQL?

46
Q

REVERSED

takes only the attributes that are common in both tables
Inner join keeps both the columns in the output even though they will be the same, while natural join will only keep one common column

A

What is a natural join and an inner join?

47
Q

REVERSED

SELECT *
FROM course
JOIN prereq
ON course.course_id = prereq.course_id

INNER JOIN is also the same as JOIN

A

How do you do an inner join in SQL?

48
Q

REVERSED

SELECT *
FROM course
NATURAL JOIN prereq
ON course.course_id = prereq.course_id

A

How do you do a natural join in SQL?

49
Q

REVERSED

WHERE name LIKE ‘%Van der%’

A

How do you match a string in the where clause?

50
Q

REVERSED

‘Intro%’ #matches any string beginning with “Intro”
‘%Comp%’ #matches any string containing “Comp” as a substring

A

How do you match an exact substring and the beginning of a string?

51
Q

REVERSED

‘___’ #matches any string of exactly three characters
‘___%’ #matches any string of at least three characters

A

How do you match a string of 3 characters and a string of at least 3 characters?

52
Q

REVERSED

WHERE … NOT LIKE …

A

How do you find mismatches of strings?

53
Q

REVERSED

LIKE ‘100\%’ ESCAPE ‘'

A

How would you match the string “100%”?

54
Q

REVERSED

WHERE (instructor.ID, dept_name) = (teaches.ID, ‘Biology’)

A

What is a tuple comparison?

55
Q

REVERSED

SELECT name
FROM instructor
WHERE salary BETWEEN 90000 and 100000

A

How can you find the names of all instructors with salary between $90,000 and $100,000 (this includes the values equal to)

56
Q

REVERSED

(SELECT course_id FROM section WHERE sem = ‘Fall’ AND year = 2009)
EXCEPT
(SELECT course_id FROM section WHERE sem = ‘Spring’ AND year = 2010)

A

How do you find the difference of two SQL queries?

57
Q

REVERSED

WHERE salary IS NULL

A

How do you check for null values using SQL?

58
Q

REVERSED

Unknown

A

What does 5> unknown return?

59
Q

REVERSED

  • unknown OR true = true
  • unknown OR false = unknown
  • unknown OR unknown = unknown
A

What are the unknowns combined with the or operator?

60
Q

REVERSED

  • true AND unknown = unknown
  • false AND unknown = false
  • unknown AND unknown = unknown
A

What are the unknowns combined with the and operator?

61
Q

REVERSED

treated as false if it evaluates to unknown

A

What happens to unknown values in the where clause?

62
Q

REVERSED

WHERE dept_name IN (‘Comp sci’, ‘Biology’)

A

What is another way to write
Where dept_name = ‘Comp Sci’ or dept_name = ‘Biology’

63
Q

REVERSED

SELECT *
FROM instructor
WHERE dept_name = ‘Physics’
ORDER BY name
-puts results in ascending order

A

How do you order the results of an SQL query?
What order does it put them in?

64
Q

REVERSED

  • avg: average value
  • min: minimum value
  • max: maximum value
  • sum: sum of values
  • count: number of values
A

What aggregate functions can be used in SQL? (5)

65
Q

REVERSED

SELECT dept_name, AVG(salary) AS avg_salary
FROM instructor
GROUP BY dept_name

A

How do you find the average salary of instructors in each department

66
Q

REVERSED

SELECT dept_name, AVG(salary) AS avg_salary
FROM instructor
GROUP BY dept_name
HAVING avg_salary > 80000

A

How do you find the average salary of instructors in each department which has average salary greater than 80000

67
Q

REVERSED

Having must be used after a GROUP BY clasue

A

When do you use WHERE vs HAVING?

68
Q

REVERSED

A subquery is a SELECT-FROM-WHERE expression that is nested within another query

A

What is a subquery in SQL?

69
Q

REVERSED

SELECT A1, A2, …, An
FROM r1, r2, …, rn
WHERE P

as follows:

  • Ai can be replaced by a sub query that generates a single value
  • ri can be replaced by any valid subquery
  • P can be replaced with an expression of the form: B (subquery)
A

What type of subqueries can be nested in each of the select, from and where clauses?

70
Q

REVERSED

SELECT dept_name,
(SELECT COUNT(*) FROM instructor WHERE department.dept_name = instructor.dept_name) AS num_instructors
FROM department

A

Code to list all departments along with the number of instructors in each department in SQL? where there is an instructor relation and department relation

71
Q

REVERSED

data = {‘State’: [‘Ohio, ‘Ohio’, ‘Ohio’, ‘Nevada’, ‘Nevada’, ’Nevada'], ‘Year’: [2000, 2001, 2002, 2001, 2002, 2003], ‘Population’: [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]} 
df = pd.DataFrame(data) 
data = [[‘Ohio’, 2000, 1.5], [‘Ohio’, 2001, 1.7], [‘Ohio’, 2002, 3.6], [Nevada, 2001, 2.4], [‘Nevada, 2002, 2.9], [‘Nevada’, 2003, 3.2]] 
cols = [‘State’, ‘Year’, ‘Population’] 
df = pd.DataFrame(data, columns = cols)
A

What are the 2 ways to create dataframes in python pandas?

72
Q

REVERSED

df. head(n) #get first n rows
df. tail(n) #get last n rows

A

How do you get the first and last rows of a dataframe in python?

73
Q

REVERSED

df.info()

A

How do you get the index and data types of a dataframe in python?

74
Q

REVERSED

df.describe()

A

How do you get the summary statistics of dataframe columns in python?

75
Q

REVERSED

df.sum()

A

How do you get column sums in python?

76
Q

REVERSED

df.count()

A

How do you get the count for each column in python?

77
Q

REVERSED

df[‘column’].isnull()

A

How do you get a true/false vector for whether the value is null in a column in python?

78
Q

REVERSED

df. iloc[5]
df. iloc[5, :]
df. loc[‘rowname’]

A

How can you extract row 5 in python by index (2) and rowname (1)?

79
Q

REVERSED

df.iloc[0:5,1]

A

How do you extract the first 5 rows and column 2 in python?

80
Q

REVERSED

df.columns

A

How do you get a vector containing the column names in python?

81
Q

REVERSED

df[[‘actor’, ‘facebook_likes’, ‘content_rating’]]

A

How do you extract multiple columns by name in python?

82
Q

REVERSED

df.director_name
df[“director_name”]

A

What are 2 ways to extract a single column by name in python?

83
Q

REVERSED

df[df.columns[[5,7,9]]][:] #extracts columns 5,7,9

A

How can you extract multiple columns by index in python?

84
Q

REVERSED

df.loc[df[‘content_rating’]==‘PG_13’, [‘facebook_likes’, ‘budget’]]
#or
df.iloc[(df[‘content_rating]==‘PG_13’).values,[3,5]]
#or
df[df.content_rating == ‘PG_13’][[‘facebook_likes’, ‘budget’]]

A

What are 3 ways to extract the rows with the condition that content rating = PG and the columns with names facebook likes and budget or index 3 and 5?

85
Q

REVERSED

df[‘newcolumn’] = …

A

How do you add a new column to a dataframe in python?

86
Q

REVERSED

del df[‘column1’]
df.drop(‘column1’, axis=1)

A

How do you delete a column in python? (2 ways)

87
Q

REVERSED

df.T

A

How do you transpose a dataframe in python?

88
Q

REVERSED

df.drop([‘row1’])

A

How do you delete a row in python?

89
Q

REVERSED

print(len(df.columns))

A

How do you display the number of columns in python?

90
Q

REVERSED

print(len(df))

A

How do you display the number of rows in python?

91
Q

REVERSED

df.columns[1].min()

A

How do you get the minimum of column 1 in python?

92
Q

REVERSED

pd.concat([df1, df2])

A

How do you concatenate dataframes row-wise in python?

93
Q

REVERSED

pd.concat([df1, df2], axis=1)

A

How do you concatenate dataframes column-wise in python?

94
Q

REVERSED

pd.merge(df1, df2, on=‘attribute’, how = ‘outer’)

A

How can you do a full outer join in python?

95
Q

REVERSED

pd.merge(df1, df2, on=‘attribute’, how = ‘left’)

A

How do you do a left outer join in python?

96
Q

REVERSED

list.unique()

A

How do you get the unique values in a list?

97
Q

REVERSED

list.append(c)

A

How do you add values to the end of a list?

98
Q

REVERSED

df.groupby([‘column’]).count()

A

How do you group by a particular attribute and count how many in each group in python?

99
Q

REVERSED

Where columns match, it will add the data in that column from each df, e.g. “BIOBIO” because strings will concatenate. Where columns don’t match, will include all columns from both dataframes but entries will all be NaN

A

What is the result in python of adding two dataframes df1+df2