Databashantering Flashcards

1
Q

ER Model

A
  • Makes a visual model of the database
  • Easy to understand and interpret
  • Physcological representations of how tasks should be carried out
  • Easy to convert into relations
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Super key

A

A superkey is an attribute or set of attributes that is used to uniquely identifies all attributes in a table, such as ID or phone-number

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Candidate key

A

A minimal super key. It has the least possible number of attributes to still be a super key

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Weak entity set

A
  • Do not have sufficient attributes to form a primary key
  • Depend on another entity
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

One-to-one

A

For example, one person can only be married to one person

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

One-to-many

A

For example a scientist can invent many inventions, but the invention is done by one specific scientist

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Many-to-many

A

For example, classmates

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Data redundancy

A
  • When the same piece of data can be found in multiple places
  • Leads to update anomalies
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Anomalies

A

Inconsistencies or errors that can arise when working with relational databases

Type of anomalies
- Update anomalies
- Deletion anomalies
- Insertion anomalies

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Normalisation

A
  • Fix the problem with data redundancy and anomalies
  • Efficiency
  • Divides larger table into smaller and linked them using relationships
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Advantages with normalization

A
  • Data consistency: Normal forms ensure that data is consistent and does not contain any redundant information.
  • Data redundancy: Normal forms minimize data redundancy by organizing data into tables that contain only unique data
  • Response time: Normal forms can improve query performance by reducing the number of joins required to retrieve data. This helps to speed up query processing and improve overall system performance.
  • Database maintenance: Normal forms make it easier to maintain the database by reducing the amount of redundant data that needs to be updated, deleted, or modified.
  • Database design: Normal forms provide guidelines for designing databases that are efficient, flexible, and scalable. This helps to ensure that the database can be easily modified, updated, or expanded as needed.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Steps of database normalisation

A
  1. 1NF: Every table have a primary key. The tables cannot be broken down further. Each value needs to be in its own column
  2. 2NF: Any column that is not the primary key needs to be dependent on the primary key.
  3. 3NF: Any column that is not the primary key needs to be dependent on the primary key (and no other column)
    Has no transitive functional dependencies
  4. BCNF: A stricter version of 3NF, BCNF ensures that every non-trivial functional dependency is a superkey. This means that no partial dependencies or transitive dependencies are allowed.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What does 3NF and BCNF give you?

A
  • Lossless join
    It should be possible to project the original relations onto the decomposed schema, and then reconstruct the original
  • Dependency Preservation
    It should be possible to check in the projected relations whether all the given FD’s are satisfied
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Inner/outer join

A
  • Inner join joins all matching rows
  • Left outer joins all matching rows and keeps the rows of the left table
  • Right outer joins all matching rows and keeps the rows of the right table
  • Full outer join joins all matching rows and keeps the other non-matching rows
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Extended Relational Algebra

A
  • An extension of relational algebra
  • Introduces aggregate functions (sum, average, count, min max)
  • Grouping
  • Set Difference
  • Division
  • Allows more complex queries
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

SQL Triggers

A
  • Stored programs that runs automatically when a condition is met
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Advantages of Triggers

A
  • It will examine the data and make changes if necessary
  • Automate repetitive or complex tasks
  • Improves efficiency
  • Saving time
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is Concurrency control and why is it important?

A
  • Ensures that correct and consistent behaviour when multiple transactions are executed concurrently in a multi-user environment
  • Usually happens when multiple users try to write data at the same time or when one write and the other read
  • Without proper control this concurrent access can lead to data inconsistency, lost updates and anomalies
  • To ensure data integrity when updates occur to the database in a multi-user environment
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Advantages of Semi-structured Data (XQuery) over traditional structured data (SQL)

A
  • The data is not constrained by a fixed schema
  • Data is portable
  • Provides more flexibility in terms of data storage and management
  • Scalability, well suited for managing large volumes of data
  • Richer data analysis, contains more contextual information, such as metadata or tags. This provides additional insights and context that can approve the accuracy and relevance of data analysis
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

ACID transaction

A
  • Atomic
    All the operations within the transaction are completed successfully or none of the are
  • Consistent
    Database constraints preserved.
    The database remains in a consistent state before and after the transaction
  • Isolated
    It appears to the user as if only one process executes at a time
    Ensures that the intermediate state of the transaction is invisible to other concurrent transactions until it is committed
  • Durable
    Effects of a process survive a crash
    Once a transaction is committed, the changes made by the transaction are permanent and will not be lost, even in the event of a system failure
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Why do we need Normalisation?

A
  • The main reason for normalizing the relations is for removing anomalies.
  • Failure to eliminate anomalies lead to data redundancy and can cause data integrity and other problems as the database grows.
  • We want the database to be accessed and used by any user, data needs to be connected so that it can be joined, or linked, and so that changes to one piece of data automatically propagate throughout the system
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Advantages of Normalisation

A
  • Helps to minimize data redundancy
  • Greater overall database organization
  • Data consistency within the database
  • Much more flexible database design
  • Enforces the concept of relational integrity
  • Protected from
    insertion anomalies
    deletion anomalies
    update anonalies
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

GRANT

A
  • Grant a specific privilege on one object in a single statement, grant a list of privileges, or grant privileges over a list of object
  • Grant ALL, for all the privileges of accessing a single table or for all privileges that are associated with a specific package
  • GRANT is a command used to provide access or privileges on the database objects to the users
  • Syntax: GRANT privilege ON object To user_or_role;
    privileges can be various actions or operations such as SELECT, INSERT, UPDATE, DELETE..
    GRANT SELECT, INSERT ON employees TO john;
24
Q

Advantages of the relational model by SQL over semi-structured as represented by XML with XPATH/XQuery

A
  • Schema
    SQL databases have fixed schema, ensuring that all data follows the same structure, which promotes data integrity and consistency
  • Efficiency
    SQL queries are generally more efficient for large datasets and complex queries
  • ACID properties
    SQL databases are built to guarantee ACID properties (Atomic, consistency, isolation, durability)
25
Q

T/F: Primary key attributes may be NULL?

A

False. It removes the uniqueness. Primary keys are unique and NULL values would undermine this

26
Q

For user A, a superuser must explicitly grant privileges to A on
the objects that A creates?

A

False.

In many database management systems, when a user creates an object (like a table, view, etc.), that user automatically becomes the owner of that object. As the owner, the user typically has certain default privileges on the object, such as the ability to modify or drop it.

27
Q

Materialized views take up more storage space than regular views?

A

True.
-Materialized views store the result set of a query as a physical table on disk
- Regular views do not store the result set directly but instead store the query definition

Since materialized views store the actual data, they typically consume more storage space compared to regular views, which only store metadata about the query

28
Q

T/F: With triggers, ‘ECA rules’ stands for Event, Condition, Action rules

A

True.

  • ECA rules refer to Event, Condition, Action rules:
  • Event: This is the specific operation or event that triggers the execution of the trigger. Examples include INSERT, UPDATE, DELETE, or other database events.
  • Condition:
    This is an optional part of a trigger that specifies a condition that must be satisfied for the trigger action to be executed. If the condition evaluates to true, the trigger action will be executed; otherwise, it will not.
  • Action:
    This is the set of instructions or commands that are executed when the trigger is fired. The action is typically some operation that modifies data or performs some other action in response to the event and condition.
    These components together define the behavior of a trigger in response to specific events occurring within the database system.
29
Q

Under set semantics and no NULLs, R⋈R = R

A

True.

Under set semantics and in the absence of NULL values, the natural join (⋈) of a relation R with itself (R⋈R) would result in the relation R itself.

30
Q

Armstrongs Axiom

A

Armstrong’s Axiom
1. Inclusion rule
2. Transitivity rule
3. Augmentation rule

Other derived rules
1. Union rule
2. Decomposition rule
3. Psuedotransitivity
4. Accumulation rule

31
Q

CRUD

A

The data layer ensures data integrity and security. It provides the application layer with data access and service functionalities via CRUD (Create, Read, Update, Delete) operation

32
Q

The three tiers are:

A
  1. Client tier (Web browser, displays information to the client)
  2. Server tier (Python etc, interacts with the server that stores the data)
  3. Data tier (Postgres etc, the retrieval of data and data storage)
33
Q

ACID Transactions

A

Atomic
All the operations within the transaction are completed successfully or none of the are

Consistent
Database constraints preserved.
The database remains in a consistent state before and after the transaction

Isolated
It appears to the user as if only one process executes at a time
Ensures that the intermediate state of the transaction is invisible to other concurrent transactions until it is committed

Durable
Effects of a process survive a crash
Once a transaction is committed, the changes made by the transaction are permanent and will not be lost, even in the event of a system failure

34
Q

Indexing

A
  • The process of creating data structures that improves the speed of data retrieval operations
  • By creating an index on one or more columns of a table, the database engine can locate the desired data more efficiently
  • Think of it as index in the back of a book
    Not all columns need to be indexed.
35
Q

The chase test

A

is commonly used to determine whether a set of
functional dependencies in preserved over a decomposition.

36
Q

Using UPDATE in SQL, remove all the products of type ‘Tobacco’.

A

Product(pid,name,type, price)

UPDATE is used to modify existing data rather than remove it.
UPDATE Product
SET pid = ‘inactive’, name=’inactive’,type=’inactive’,price=’inactive’
WHERE type = ‘Tobacco’;

OR delete
DELETE FROM Product
WHERE type = ‘Tobacco’;

37
Q

Relational model

A

Table

38
Q

Semi-structured model

A
  • Trees/graphs
  • Nested data structure
  • For example XML, e-mails, data from apps, photos and videos
  • The flexibility of schemaless design and the ability to represent a wide range of information are important reasons that semi-structured data has become so common
39
Q

Difference between TEXT, CHAR and VARCHAR

A
  • Unlike CHAR, VARCHAR only uses as much storage as the length of the data plus one or two additional bytes to record the length of the data (depending on the maximum length specified or the database system).
  • If you store a string of 10 characters in a VARCHAR(20) column, it will only use storage space for those 10 characters plus the additional bytes needed to record the length, not for the full 20 characters.
    This makes VARCHAR a more space-efficient choice for columns that store strings of widely varying lengths
  • TEXT is appropriate for storing longer text, such as descriptions, since it has no limit of the length
40
Q

Difference between UNIQUE and PRIMARY KEY

A
  1. There can be only one PRIMARY KEY
    for a relation, but several UNIQUE
    attributes.
  2. No attribute of a PRIMARY KEY can
    ever be NULL in any tuple. But
    attributes declared UNIQUE may have
    NULL’s, and there may be several
    tuples with NULL.
41
Q

Aggregation operators

A

The most important examples: SUM,
AVG, COUNT, MIN, and MAX.

42
Q

Three-Valued Logic

A

● To understand how AND, OR, and NOT work in
3-valued logic, think of TRUE = 1, FALSE = 0, and
UNKNOWN = ½.
● AND = MIN; OR = MAX, NOT(x) = 1-x.
● Example:
TRUE AND (FALSE OR NOT(UNKNOWN)) = MIN(1,
MAX(0, (1 - ½ ))) =
MIN(1, MAX(0, ½ )) = MIN(1, ½ ) = ½.

43
Q

Why does SQL have a keyword CASCADE used with CREATE TABLE?

A

If changes on the PT it changes in CT too
Automatically

44
Q

T/F: XQuery uses FLWOR expressions

A

FLWOR (pronounced “flower”) is an acronym for “For, Let, Where, Order by, Return”.

For - selects a sequence of nodes
Let - binds a sequence to a variable
Where - filters the nodes
Order by - sorts the nodes
Return - what to return (gets evaluated once for every node)

45
Q

T/F: An XML document must have an associated DTD to evaluate XPATH queries over it.

A

False

An XML document does not need to have an associated DTD (Document Type Definition) to evaluate XPath queries over it. XPath can be used to navigate and query XML documents regardless of whether they have a DTD, XML Schema, or any other type of schema defined.

46
Q

What are SQL Indexes and what are they good for?

A
  • SQL indexes are special lookup tables that the database management system maintains to speed up data retrieval operations on a database table.
  • An index in a database is somewhat similar to an index in a book - it allows the database engine to quickly find and access the data without scanning every row of the table every time a query is executed.
47
Q

Synthesis algorithm

A

Algorithm to obtain a decomposition of a relation into 3NF that is:
- Lossless
- dependency-preserving

48
Q

The 3 steps of Synthesis Algorithm

A
  1. Cannicalization
  2. Minimal cover
  3. Grouping
49
Q

Core Relational Algebra

A

● Union, intersection, and difference.
● Usual set operations, but
both operands
must have the same relation schema.
● Selection: picking certain rows.
● Projection: picking certain columns.
● Products and joins: compositions of relations.
● Renaming of relations and attributes.

50
Q

The isolation levels

A
  • Read Uncommitted
    Highest performance but leads to most undesirable read phenomena
  • Read Committed
    Eliminates “dirty read”
  • Repeatable Read
    The transaction has to wait till the other transaction’s update or read query is complete.
    Avoids dirty read and repeatable read
  • Serializable
    We can ask any transaction to wait until the current transaction completes
    Highest level, avoids dirty read, repeatable read and phantoms
51
Q

Transaction

A

A transaction is a set of queries (like SELECT, INSERT, UPDATE, DELETE) sent to the database

52
Q

Dirty read

A
  • When two transactions access the same data and we allow for reading values that are not yet committed, we may get a dirty read.

Let’s say that we have two transactions doing the following:

Transaction 1:
SELECT salary FROM People WHERE id = 1

Transaction 2:
UPDATE People SET salary = 180 WHERE id = 1
ROLLBACK

‍Transaction 2 modifies row with id = 1, then Transaction 1 reads the row and gets value 180, and Transaction 2 rolls things back. Effectively, Transaction 1 uses value that doesn’t exist in the database

53
Q

Repeatable Read

A

Repeatable read is a problem when a transaction reads the same thing twice and gets different results each time.
Transaction 1:
SELECT salary FROM People WHERE id = 1

Transaction 2:
UPDATE People SET salary = 180 WHERE id = 1
COMMIT

Transaction 1:
SELECT salary FROM People WHERE id = 1

Transaction 1 reads a row and gets value 150. Transaction 2 modifies the same row. Then Transaction 1 reads the row again and gets a different value (180 this time).

54
Q

Phantom read

A

Phantom read is a case when a transaction looks for rows the same way twice but gets different results

Transaction 1
SELECT * FROM People WHERE salary < 250

Transaction 2
INSERT INTO People(id, name, salary) VALUES (3, Jacob, 120)
COMMIT

Transaction 1
SELECT * FROM People WHERE salary < 250

‍Transaction 1 reads rows and finds two of them matching the conditions. Transaction 2 adds another row that matches the conditions used by the Transaction 1. When the Transaction 1 reads again, it gets a different set of rows

55
Q

T/F: Different transactions can run under different isolation levels

A

The isolation level is defined per transaction. For example, it’s allowed for one transaction to run with SERIALIZALBLE level, and for another to run with READ UNCOMMITED