f Flashcards

(123 cards)

1
Q

Aggregation:

A

Used to model a relationship involving a relationship set

Allows us to treat a relationship set as an entity set for purposes of participation in other relationships

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Lowest-level ER model:

A

physical data model (PDM) – most detailed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

is ER design subjective

A

Yes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Entity:

A

real-world object, distinguishable from other objects. An entity is described as using a set of attributes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Reasons to use ISA:

A

Add descriptive attributes specific to a subclass (ex. Not appropriate for all entities in the superclass)

Identify entities that participate in a particular relationship (ex. Not all superclass entities participate)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Weak Entity:

A

Weak entity can be identified uniquely only by considering the primary key of another (owner) entity.

Weak entity set must have total participation in this identifying relationship set.

Weak entities only have a “partial key” (dashed underline)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

The steps:

A
  1. Identify the entities
  2. Identify the relationships between the entities
  3. Determine whether a relationship is 1:1 1:M or M:N
  4. Determine whether participation in a relationship mandatory (At least one instance or full participation) or not
  5. Identify weak entities
  6. Identify ISA hierarchies and aggregations
  7. Consider possible refinements: should a concept be modeled as an entity or attribute? Should a concept be modeled as an entity or a relationship? Identifying relationships? Binary or ternary?
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Mid-level ER model:

A

Logical data model (LDM)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

In translation a relationship set to a relation, attributes of the relation must include(3):

A
  • keys for each participating entity set (as foreign keys)
  • this set of attributes forms a superkey for the relation
  • all descriptive attributes
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Owner entity set and weak entity set must participate in

A

Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Entity set:

A

a connection of similar entities (ex. All employees). All entities in an entity set have the same set of attributes, each entity set has a key (underlined), each attribute has a domain.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Locks and transaction design:

A

When you change a row, no one else can modify it until you issue a COMMIT, try not to hold locks for too long since it will slow down other sessions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Two biggest causes of contention are

A

locks and latches (or mutexes)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Types of contention (3):

A

locks
latches and mutex
buffer contention

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Two ways to design transaction locking?

A

Pessimistic and Optimistic

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Contention prevents:

A

Contention prevents the database from working on all of the requests that are outstanding

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

High-level ER model:

A

Conceptual data model (CDM)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Contention is another word for

A

bottleneck

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Pessimistic locking:

A

works best if you think someone else is going to “grab” your row before you are finished

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Hints:

A

You can add a hint to your SQL to change the execution plan

SELECT /* index(index name) */

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Performance problems are often seen as

A

unacceptable response time or throughput

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Are indexes good for small or large amount of the tables?

A

Indexes are only good for getting small amount of the table

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Tuning involves

A

proactive monitoring and bottleneck elimination, providing room for system scalability (process more workload)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Partitioning:

A

Split table up to make them smaller

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Execution plan:
Called explain plan or query plan. Tells you how the optimizer will execute your SQL
26
Transaction design:
Minimize lock duration using pessimistic and optimistic locking strategies
27
Histograms:
Histogram allows the optimizer to understand how the data is being distributed and make the best decision
28
Network overhead:
Avoid unnecessary network round trips.
29
Optimistic locking:
works best if you think no one else will “grab” the row
30
Denormalize:
merge tables to avoid joins, create “materialized views: to avoid big grouping or filtering
31
SQL tuning
optimizer determines execution plan
32
does a concatenated index have 1 or many columns?
Concatenated index has more than one column
33
Two ways to reduce round trips:
Use the “Array” interface in your program code Use the stored procedures for complex interactions with the database
34
Should you create an index on all columns?
Do not create an index on every column
35
what do you violate in Denormalization?
In denormalization you violate 3NF to improve performance
36
Configure Memory:
use memory to avoid I/O, operations to read from memory, operations to share memory
37
IT Service Management (ITSM) can be used to
IT Service Management (ITSM) can be used to identify and maintain service levels
38
When to not use NULLS?
Do not use NULL if you will search for NULLS use something that can be indexed (N/A)
39
Reduce contention:
avoid contention for locks and latches
40
Instance tuning
avoid bottlenecks in initial design and later monitor performance
41
Indexes:
Create indexes on groups of columns that are queried together Indexes speed up queries but make DML slower
42
Nulls:
Using nulls has significant performance implications (nulls do not take any space, nulls cannot be indexed)
43
Performance management tools do what?
Monitor, estimate, plan capacity, analyze, reorganize, optimize, cache, compress, sort
44
What can you do for optimal performance(4)?
Tune the application Reduce contention Configure Memory Tune I/O
45
Tune the application:
choose the best data model, reduce the load on the database, tune the SQL statements
46
3 Things we can do to tune our application:
1. Structure the tables in a way that the database would work better 2. Tune the application code (Java, C++ etc.) 3. Tune the SQL statements
47
Should you put a high or low load on a database and why?
Do not put much load on the database | When the application puts too much load on the database, and the performance decrease, it is not the database fault
48
When do subtypes occur?
Subtypes can occur when you are modelling things that are almost the same
49
Tune I/O:
use fast disks, use RAID (o + 1), use SSDs
50
Performance management tools(7):
* Oracle performance method – eliminating bottlenecks and developing efficient SQL statements * Database self-monitoring – sends alerts to notify of impending problem using expected values for comparison * AWR (Automatic Workload Repository) – performance history * AWR baseline-statistics with DB performing well at peak load * Adaptive threshold-warning and critical alert thresholds * ADDM uses AWR statistics to diagnose performance * OEM (Oracle Enterprise Manager) – GUI for maintenance
51
Why start in 3NF?
Starting point is 3NF, it takes all redundancy in the data and determines the PKs. Every column in a table should be identified by: the PK, all of the PK, nothing but the PK.
52
usually locking problems are due to :
usually locking problems are due to application locks, sometimes internal locks can cause problems
53
Database buffers cache table and index data to avoid what?
Database buffers cache table and index data to avoid reading data from the disk
54
What do locks prevent?
Prevent 2 sessions from changing table data at the same time – this avoids “lost updates”
55
Database might have to perform a sort if:
* SQL contains an ORDER BY or GROUP BY * The two tables are being joined without an index, both tables are sorted and the results merged (called a SORT-MERGE join)
56
what are soft areas and what do they allow?
Soft areas (Called PGA in Oracle) allow sorting and hash structures to be maintained in memory; otherwise they would be written to a temporary
57
Latches and Mutex:
Latches are very light weight locks that protect memory instead of tables
58
Buffer cache ‘hit rate’:
When asked for data, the database first looks for it in the memory buffers. If the data is found in memory it is called a “hit”, otherwise the data must be read from the disk (Called a “miss”)
59
Buffer contention:
Memory itself can become a problem; no one can get the memory they want.
60
If there is not enough memory to do the hash table or a sort in memory, then the database will...?
If there is not enough memory to do the hash table or a sort in memory, then the database will read and write data to a temporary file group
61
What do database buffers improve?
Database buffers improve performance by caching data in memory
62
Hash join
Hash joins are more efficient alternative to SORT-MERGE. A hash table is built on one of the tables and acts like an “on the fly” index
63
What happens when a buffer is modified?
When buffers are modified they are called dirty; these have to be written to disk
64
Hit rate ratio?
The ratio (hit/(hits + misses)) is called the ‘hit rate’
65
What do latches do?
Latches are like locks, but instead of protecting table rows, they protect memory (buffers)
66
Disk IO is the fastest or slowest part of the database system?
Disk IO is the slowest part of the database system, so it is critical to performance
67
What are locking problems caused by?
Most locking problems are caused by application code, optimistic locking strategy is often the solution There are system locks that can cause problems, these are rare and database specific
68
Latency
Latency is the time taken to perform a single IO
69
What happens when all database buffers are dirty?
When all the blocks are dirty then sessions have to wait for the buffers to be written before new data can be read
70
Throughput
Throughput is the number of operations over time (IO/second)
71
Difference between a latch and a lock?
for a latch: If two sessions try to access the same area of memory, then one will wait Instead of “sleeping” (like a lock) the waiting session will “spin” on the CPU for a very short time
72
High latency means
High latency means you are overloading the disk
73
Disk latency increases with
throughput, disk fill
74
To avoid overloading disks
To avoid overloading disks, we combine multiple disks into an “array”. The array can then support higher amounts of disk IO
75
To get best performance disk should be
sparsely populate, under only moderate load
76
what can the multiple disk array protect against?
The array can also protect the disk data loss by storing multiple copies of data
77
what is the “hockey stick” curve?
When the disk is overloaded, latency goes up and throughput stalls (called the “hockey stick” curve)
78
“RAID” levels describe
“RAID” levels describe the type of array. RAID levels 1,0 and 5 are the most frequently used
79
RAID FIVE:
Distributes data across disks like RAID 0 Creates a “parity” block for every data block that can be used to recover data if the disk fails
80
why are SSD a poor choice?
But they are a poor choice for data that does not get accessed very often
81
RAID 5 vs RAID 0+1
RAID 5 requires less disks that RAID 0+1 so it’s cheaper (but also much slower when writing)
82
Using SSD can be the best way to provide
very high IP rates
83
RAID 0+1:
Stripping and mirroring together Best performance Protection against data loss More expensive (more disks) than RAID 5 Best solution for database files
84
raid 0 is also called
Also called striping
85
Pitfalls
use of information as directive rather than indicator of potential problem
86
RAID ZERO:
Data is spread across multiple disks to distribute IO evenly Good performance but no protection against data loss
87
RAID ONE is also called
Also called mirroring
88
RAID ONE:
Data is duplicated on two or more disks Protects against data loss, but does not spread the IO across disks
89
Data governance program oversees the management of (7)?
Data governance program oversees the management of the quality, maintainability, availability, usability, integrity, scalability and security of enterprise data
90
Types of changes(5):
* DBMS software – migrations, procedures * Hardware configuration * Logical and physical design * Applications * Physical database structures
91
Management visibility:(4)
Impact Prosecution Cost Durability
92
Impact:
upper-level management is keenly aware of the need to comply
93
who requests changes to databases?
DBA does not request change (programmers, application owners, business owners do)
94
who implements changes to databases?
DBA carries out most database changes
95
Change management requirements:(8)
* Proactivity * Intelligence * Analysis * Automation * Standardization of procedure * Reliable and predictable process * Availability * Quick and efficient delivery
96
Prosecution:
can result in huge fines and imprisonment
97
Cost:
can be significant but so can the cost of non-compliance
98
Durability:
increasing regulation – increasing time, effort and capital will be spent on compliance
99
Collaborative approach:(3)
Business legal IT
100
Nulling out:
removes the sensitive data by deleting it
101
Business:
must understand the legal requirements imposed on their data and systems as dictated in regulations
102
Encryption:
scrambles the data algorithmically. Thi5s technique will not produce realistic looking data and can make the data larger
103
Compliance-related tasks that impact DBA role include:(5)
* Metadata management and data quality * Database and data access auditing * Data masking and obfuscation * Long-term data retention and database archiving * Closer tracking of traditional DBA tasks
104
Number and data variance:
varies the existing values in a specified range in order to obfuscate them
105
Five techniques that can be used to audit database access:
1. Adding columns to tables – not a good idea 2. DBMS traces – ISV offering is better 3. Log based - missing read activity 4. Network sniffing – missing server requests 5. Capture requests at the server
106
Shuffling:
uses the existing data and moves the values between rows in such a way that the no values are present in their original rows
107
Legal:
must be involved to interpret the legal language of the regulations and ensure that the business is taking proper steps to protect itself
108
Substitution:
replaces existing data with random values from a pre-prepared data set
109
Data masking:
Data masking is the process of protecting sensitive and personally identifiable information (PII) in non-production databases from inappropriate visibility
110
IT:
must be involved to implement the policies and procedures to enact the technology to support the regulatory mandates
111
Table-to-table synchronization:
masks data assuring that the results are referentially intact.
112
Data masking techniques:(6)
``` Substitution Shuffling Number and data variance Encryption Nulling out Table-to-table synchronization ```
113
OLAP
online analytical processing
114
Data warehouse
a collection of integrated, non-volatile, time-variant, subject oriented databases designed to support the DSS function.
115
DDS data
differs from transactional operational data in timespan, granularity and dimension
116
DDS
data distrubution system
117
VLDB
very large databases
118
Data warehouses are designed for
Data warehouses are designed for analytical processing.
119
OLTP
online transaction processing
120
Life cycle of data:
* Create * Operational (completing business transactions) * Reference (reporting or queries) * Archive (compliance and business protection) * discard
121
Data warehouse contains:
The data warehouse contains atomic data and lightly summarized data.
122
Business Intelligence:
Comprehensive, cohesive, integrated tools and processes.
123
BI Actions include:
* Drilling up/down hierarchies * Comparing aggregate values * Parallel execution