SQL Day 5 Flashcards

(46 cards)

1
Q

What is a B-tree
(Binary Tree) in
database indexing,
and why is it used?

A

-Used to retrieve data from a table
-Is the logical structure of the index
-It has three levels
1. The Root level
2. The Branch level
3. The Leaf levels

Levels on a clustered index contain the actual data pages where the data is physically stored.

In a nonclustered index, the leaf level has a pointer to the location in memory where the data is stored.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is the difference between
CHAR and VARCHAR data types in SQL?

A

CHAR is a fixed-length data type,
meaning it always reserves a set amount of space for the data, even if the actual data is shorter.

VARCHAR, on the other
hand, is a variable-length data type,
which only uses as much space as the data requires.

CHAR is more efficient for
storing fixed-length data, while

VARCHAR is better for variable-length
data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are some performance considerations when choosing between CHAR and VARCHAR?

A

CHAR can be faster for fixed-length strings because the database knows exactly how much space to allocate and retrieve.

However, it wastes space for shorter strings.

VARCHAR is more space-efficient for variable-length data,

but

It may introduce a slight performance overhead.

When calculating the string’s actual length.

For large datasets with variable-length data, VARCHAR is generally preferred to save storage.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the difference between
VARCHAR and VARCHAR(MAX)
in SQL?

A

The VARCHAR data type is used to store variable-length strings, with a maximum length specified by the user (up to 8000 characters).

VARCHAR(MAX) can store much larger strings,
up to 2 GB of data.

The key difference is that
VARCHAR has a defined size limit, while VARCHAR(MAX) is designed to store much larger data when needed, such as large text or
documents.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

When should you use
VARCHAR(MAX) instead of
VARCHAR?

A

VARCHAR(MAX) should be used
when you need to store very large strings that could exceed 8000 characters, such as file contents, large text blobs, or lengthy descriptions.

For smaller, predictable data lengths, VARCHAR with a defined limit is preferred because it is more memory-efficient and can perform
better for regular string operations.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are theperformance
implications ofusing
VARCHAR(MAX)?

A

VARCHAR(MAX) can impact performance if
not used carefully.

Since it can store very large
data, SQL Server may allocate more resources than necessary for operations like sorting or
indexing.

In most cases, it’s better to use a
specific VARCHAR size if the data length is known and predictable, as VARCHAR(MAX) may also be stored out-of-row, adding extra overhead
for retrieval.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the
difference between
VARCHAR and
NVARCHAR in
SQL?

A

That VARCHAR stores non-Unicode characters, while NVARCHAR
stores Unicode characters.

NVARCHAR allows for storing multilingual or special characters and uses more storage space (twice the space per character) compared to VARCHAR.

VARCHAR is ideal for data that
doesn’t need to support multiple languages or special characters.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are the performance
considerations when using NVARCHAR vs VARCHAR?

A

NVARCHAR requires more storage space (2 bytes per character) compared to VARCHAR (1 byte per character), which can lead to larger database sizes and slightly slower performance for large datasets.

However, if you need to store
multilingual data or special characters, NVARCHAR is necessary.

For performance optimization, use VARCHAR when Unicode support is not needed, as it uses less storage and may improve performance.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the
difference between
an Index Seek and
an Index Scan in
SQL?

A

An Index Seek is an efficient operation where SQL Server uses an index to directly locate specific rows that satisfy the query conditions.

An Index Scan, on the other hand, scans through all the rows in an index to find matching data, which is less efficient.

Index Seeks are preferred because they minimize the number of rows
accessed, while Index Scans can impact performance, especially on large tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

When might SQL
Server use an
Index Scan instead
of an Index Seek?

A

SQL Server might use an Index Scan
when no appropriate index exists for the query’s WHERE clause or when a large portion of the table’s data needs to be retrieved.

It can also occur if the query is complex or requires reading the majority of the table’s rows, making a scan more efficient than performing multiple seeks.

Adding or optimizing indexes can often reduce unnecessary scans.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What are some
other important
operators seen in
an SQL execution
plan?

A

Other important operators in an execution plan include:

Table Scan, Nested Loop Join, Hash Join, Sort

Table Scan: Reads all rows in a table when no index is available.

Nested Loop Join: Iterates over rows from one table for each row of another, useful for small datasets.

Hash Join: Uses a hashing algorithm to join tables, often for large datasets.

Sort: Orders rows based on specified columns, which can be resource-intensive for large datasets.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What does ACID
stand for in
database
management, and
why is it important?

A

ACID stands for Atomicity,
Consistency, Isolation, and Durability.

These are the core principles that ensure the reliable processing of the database transactions.

They are essential for
maintaining data integrity and ensuring that transactions are processed completely, accurately, and reliably, even in cases of system failure or concurrent operations.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is Atomicity in
the context of ACID,
and how does
it affect
transactions?

A

Atomicity ensures that a transaction is
treated as a single unit of work, meaning
either all operations within the transaction
are completed, or none of them are. This
prevents partial updates or changes from
being saved if an error occurs, ensuring
that the database remains in a consistent
state.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How do Isolation
and Durability
contribute to
database
reliability?

A

Isolation ensures that transactions are
executed independently of one another,
preventing concurrent transactions from
interfering with each other. Durability guarantees
that once a transaction has been committed, the
changes are permanent and will survive system
failures. Together, these principles protect data
integrity during concurrent processing and
ensure that committed data remains safe.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is
Consistency in
the ACID
Principles?

A

Consistency
means that a
transaction will
never be left in a
half-finished state.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Principles?
What are
subqueries in SQL,
and where can
they be used?

A

A subquery is a query nested
within another query. It can be used
in SELECT, INSERT, UPDATE, or
DELETE statements or inside
clauses like WHERE, FROM, or
HAVING. Subqueries are often
used to return data that will be used
by the main query to filter or
manipulate results.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is SQL
Profiler, and what
is its primary
function?

A

SQL Profiler is a tool in SQL Server
used to monitor and capture events
happening in the database. It allows
database administrators to track and
analyze queries, transactions, and server
activity in real time, helping diagnose
performance issues, optimize queries,
and troubleshoot problems like deadlocks
or excessive resource usage.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What are some
key considerations
when using SQL
Profiler?

A

When using SQL Profiler, it’s important to
capture only relevant events to avoid
performance overhead on the server. Filtering by
specific databases, users, or events can reduce
the volume of data captured. Running Profiler on
a production system can also affect
performance, so it’s recommended to run traces
during off-peak hours or use it sparingly for
troubleshooting specific issues.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is a local
temporary table
in SQL, and
how is it used?

A

A local temporary table in SQL is a
temporary table that is visible only to the
session that created it. It is created with a
single # before the table name (e.g.,
#TempTable). Local temp tables are
used to store temporary data for that
session and are automatically dropped
when the session ends. They are useful
for intermediate calculations or storing
temporary results.

20
Q

What is a global
temporary table,
and how does it
differ from a local
temporary table?

A

A global temporary table is accessible
by any session or connection and is
created with two ## before the table
name (e.g., ##GlobalTempTable). It
remains available to all users until the
session that created it is closed, and no
other sessions are referencing it. Global
temp tables are less commonly used but
are helpful when multiple sessions need
access to the same temporary data.

21
Q

What are some
considerations
when using
temporary tables in
SQL?

A

When using temp tables, it’s important
to be mindful of performance, especially
with large datasets, as temporary tables
are stored in the tempdb database.
Indexing temp tables can improve query
performance. Additionally, be sure to
drop temp tables when they are no longer
needed to free up resources, although
SQL Server often handles this
automatically for local temp tables.

22
Q

What is the
REPLACE function
in SQL, and how
does it differ from
STUFF?

A

The REPLACE function in SQL is used
to replace al occurrences of a specified
substring within a string with another
substring. Unlike STUFF, REPLACE
doesn’t work with character positions but
instead searches for and replaces al
instances of a specified substring. It’s
useful when you need to substitute
specific characters or words within a
string.

23
Q

When should you
use the NOT NULL
constraint in a
database table?

A

The NOT NULL constraint should be
used when a column is required to
always have a value, meaning it cannot
be left blank. It is typically applied to
columns that are essential for data
completeness, such as identifiers (e.g.,
customer ID), dates, or critical
information (e.g., names or email
addresses). It ensures that essential data
is always provided.

24
Q

What is a
Check
constraint?

A

It hinders the insertion of
data into a column if it
doesn’t meet certain
criteria specified by the
user. For example an
Email column must contain
‘@’ as well as a ‘.’.

25
Explain what a CASE statement is and describe its primary usage in T-SQL queries.
A CASE statement in T-SQL is a way to implement conditional logic directly within SQL queries. It allows you to execute different SQL expressions based on specified conditions. The primary usage of a CASE statement is to conditionally modify the output of a query without using multiple queries or external application logic. It's often used in data transformation, data categorization, or implementing business logic directly within the database.
26
Can you describe a scenario where a CASE statement might be more beneficial than using multiple IF statements in a T-SQL query?
A CASE statement is particularly beneficial when you need to embed conditional logic within a single SQL statement, such as a SELECT, UPDATE, or DELETE. For example, in a report generation scenario where you need to categorize output based on certain thresholds (e.g., categorizing sales into low, medium, and high), using a CASE statement directly in the SELECT clause can streamline the process and improve performance by reducing the need for multiple conditional IF statements or separate queries for each category.
27
Discuss any potential performance impacts of using CASE statements within a SQL query. Are there best practices for optimizing their usage?
CASE statements can affect performance if used improperly, such as in scenarios where multiple CASE statements evaluate the same expressions repeatedly or when used in predicates that lead to inefficient query plans. Best practices for optimizing CASE statements include avoiding redundant evaluations by simplifying expressions, using them judiciously within predicates to ensure they do not prevent the use of indexes, and carefully managing their placement in SELECT clauses versus WHERE clauses to optimize how and when conditions are evaluated during query execution.
28
What is the purpose of the WHERE clause in SQL?
The WHERE clause in SQL is used to filter records from a table to include only those that meet specific conditions. It helps in narrowing down the result set by specifying criteria that the retrieved data must satisfy, making the query more efficient and the data more relevant. It is also used with the UPDATE and DELETE to specify Which Columns are to be manipulated.
29
What are some common mistakes to avoid when using WHERE clauses in SQL queries?
Common mistakes include not accounting for NULL values which can lead to unexpected results, misusing logical operators which can alter the logic of the query, and confusing the WHERE and HAVING clauses. It’s crucial to ensure that the conditions within the WHERE clause are correctly stated and logicaly consistent to avoid retrieving incorrect or unintended data.
30
What is a unique key constraint in T-SQL?
A unique key constraint in T-SQL ensures that all values in a column, or a set of columns, are unique across the database table. This means no two rows are allowed to have the same value(s) for the columns that the unique key constraint covers, enforcing uniqueness for the data. It also creates a nonclustered index.
31
How does a unique key constraint differ from a primary key constraint?
Both unique key and primary key constraints ensure data uniqueness, but a primary key constraint additionally enforces that the column(s) cannot accept NULL values. A table can have multiple unique key constraints but only one primary key. Unique keys can be used to enforce uniqueness on non-essential data columns that can accept NULL values. A PK creates a clustered index while a unique key constraint creates a nonclustered index.
32
Can you temporarily disable a unique key constraint? If so, why might you want to do this?
Yes, unique key constraints can be temporarily disabled, primarily for performance reasons during large data imports or when bulk updating data where checking constraints would significantly slow down the process. After the data manipulation, the constraint should be re-enabled to ensure the data's integrity. Disabling is done through altering the table structure, but it should be handled with caution to avoid data corruption.
33
What is the purpose of using a TRY...CATCH block in T-SQL?
A TRY...CATCH block in T-SQL is used to handle exceptions that occur during the execution of a set of T-SQL statements. The TRY block contains the SQL code that might generate an error, while the CATCH block contains the code that executes if an error occurs. This structure is used to gracefully handle errors, log them, and perform cleanup or recovery steps, preventing the entire script or transaction from failing abruptly.
34
What is the purpose of the TOP clause in T-SQL?
The TOP clause in T-SQL is used to limit the number of rows returned by a query. It is often used to retrieve a specific quantity of records from a database table or to sample a subset of data from larger result sets. The TOP clause helps in managing large datasets by allowing the focus on a manageable number of records, which is particularly useful in performance tuning and quick data analysis tasks.
35
What is a transaction in T-SQL and why is it important?
A transaction in T-SQL is a sequence of operations performed as a single logical unit of work. It is important because it ensures data integrity and consistency, even in the face of errors, power failures, or other issues. Transactions adhere to the ACID properties (Atomicity, Consistency, Isolation, Durability), which guarantee that all operations within the transaction either complete fully or have no effect at all.
36
How can transactions affect database performance and how should they be managed?
Transactions can lock resources like data rows or tables, potentially leading to blocking and deadlocks, which can degrade database performance. To manage transactions effectively, keep them as short as possible to minimize locking time, use appropriate isolation levels to balance accuracy and performance, and handle exceptions properly to avoid leaving transactions open. Ensuring efficient transaction management helps maintain smooth database operations and enhances concurrency.
37
What is an INNER JOIN in T-SQL?
An INNER JOIN in T-SQL is used to combine rows from two or more tables based on a related column between them. It returns only those rows that have matching values in both tables, effectively intersecting the datasets.
38
What is a LEFT OUTER JOIN and how does it function in T-SQL?
A LEFT OUTER JOIN in T-SQL returns all rows from the left table, and the matched rows from the right table. If there is no match, the result is NULL on the side of the right table.
39
Describe what a RIGHT OUTER JOIN does in T-SQL.
A RIGHT OUTER JOIN in T-SQL operates similarly to a LEFT OUTER JOIN, but it returns all rows from the right table and the matched rows from the left table. If there is no match, the result will include NULLs on the side of the left table.
40
What is a FULL OUTER JOIN in T-SQL?
A FULL OUTER JOIN in T-SQL returns all rows when there is a match in either the left or right table records. Where there is no match, the result will have NULLs for all columns of the table that does not have a matching row.
41
What is a SELF JOIN and its purpose in T-SQL?
A SELF JOIN in T-SQL is a regular join, but the table is joined with itself rather than another table.
42
Can you explain the concept of a MERGE JOIN in T-SQL?
A MERGE JOIN in T-SQL is a type of join that requires both input tables to be sorted on the join columns. It is very efficient if the data is already sorted because it merges the rows by walking through each table only once. This join type is ideal for large volume data operations and where indexes that support sorting are in place. In T-SQL, a MERGE JOIN is not explicitly written in SQL syntax but is instead an internal operation that SQL Server's query optimizer may choose based on the query's execution plan. However, if you want to mimic the behavior of a MERGE JOIN, you can write a query that leverages ordered data from two tables,
43
What is a Table Valued Function (TVF) in T-SQL?
A Table Valued Function (TVF) in T-SQL is a function that returns a table data type. It can be invoked in the FROM clause of a SELECT statement, similar to how you would use a table. TVFs are useful for encapsulating complex logic that can transform or generate sets of rows, making them reusable across different SQL queries.
44
What are the types of Table Valued Functions in T-SQL and how do they differ?
There are two main types of Table Valued Functions in T-SQL: Inline Table-Valued Functions (ITVFs) and Multi-Statement Table-Valued Functions (MSTVFs). ITVFs contain a single SELECT statement and cannot have any other kind of SQL logic or variables, making them generally faster and more efficient. MSTVFs, on the other hand, can contain multiple T-SQL statements, declare variables, and perform complex logic before producing a result set, offering more flexibility at the cost of potential performance overhead.
45
How can Table Valued Functions enhance SQL query operations, and what are their limitations?
Table Valued Functions enhance SQL query operations by allowing complex operations to be packaged as functions, promoting code reusability and modular architecture. They simplify the management of complex queries and can be optimized by the database engine when used properly. However, their limitations include potential performance drawbacks, especially with MSTVFs, which might not always be optimized as effectively as inline SQL code or stored procedures due to their inherent complexity and the temporary object overhead they can generate.
46