SQL Day 5 Flashcards
(46 cards)
What is a B-tree
(Binary Tree) in
database indexing,
and why is it used?
-Used to retrieve data from a table
-Is the logical structure of the index
-It has three levels
1. The Root level
2. The Branch level
3. The Leaf levels
Levels on a clustered index contain the actual data pages where the data is physically stored.
In a nonclustered index, the leaf level has a pointer to the location in memory where the data is stored.
What is the difference between
CHAR and VARCHAR data types in SQL?
CHAR is a fixed-length data type,
meaning it always reserves a set amount of space for the data, even if the actual data is shorter.
VARCHAR, on the other
hand, is a variable-length data type,
which only uses as much space as the data requires.
CHAR is more efficient for
storing fixed-length data, while
VARCHAR is better for variable-length
data.
What are some performance considerations when choosing between CHAR and VARCHAR?
CHAR can be faster for fixed-length strings because the database knows exactly how much space to allocate and retrieve.
However, it wastes space for shorter strings.
VARCHAR is more space-efficient for variable-length data,
but
It may introduce a slight performance overhead.
When calculating the string’s actual length.
For large datasets with variable-length data, VARCHAR is generally preferred to save storage.
What is the difference between
VARCHAR and VARCHAR(MAX)
in SQL?
The VARCHAR data type is used to store variable-length strings, with a maximum length specified by the user (up to 8000 characters).
VARCHAR(MAX) can store much larger strings,
up to 2 GB of data.
The key difference is that
VARCHAR has a defined size limit, while VARCHAR(MAX) is designed to store much larger data when needed, such as large text or
documents.
When should you use
VARCHAR(MAX) instead of
VARCHAR?
VARCHAR(MAX) should be used
when you need to store very large strings that could exceed 8000 characters, such as file contents, large text blobs, or lengthy descriptions.
For smaller, predictable data lengths, VARCHAR with a defined limit is preferred because it is more memory-efficient and can perform
better for regular string operations.
What are theperformance
implications ofusing
VARCHAR(MAX)?
VARCHAR(MAX) can impact performance if
not used carefully.
Since it can store very large
data, SQL Server may allocate more resources than necessary for operations like sorting or
indexing.
In most cases, it’s better to use a
specific VARCHAR size if the data length is known and predictable, as VARCHAR(MAX) may also be stored out-of-row, adding extra overhead
for retrieval.
What is the
difference between
VARCHAR and
NVARCHAR in
SQL?
That VARCHAR stores non-Unicode characters, while NVARCHAR
stores Unicode characters.
NVARCHAR allows for storing multilingual or special characters and uses more storage space (twice the space per character) compared to VARCHAR.
VARCHAR is ideal for data that
doesn’t need to support multiple languages or special characters.
What are the performance
considerations when using NVARCHAR vs VARCHAR?
NVARCHAR requires more storage space (2 bytes per character) compared to VARCHAR (1 byte per character), which can lead to larger database sizes and slightly slower performance for large datasets.
However, if you need to store
multilingual data or special characters, NVARCHAR is necessary.
For performance optimization, use VARCHAR when Unicode support is not needed, as it uses less storage and may improve performance.
What is the
difference between
an Index Seek and
an Index Scan in
SQL?
An Index Seek is an efficient operation where SQL Server uses an index to directly locate specific rows that satisfy the query conditions.
An Index Scan, on the other hand, scans through all the rows in an index to find matching data, which is less efficient.
Index Seeks are preferred because they minimize the number of rows
accessed, while Index Scans can impact performance, especially on large tables.
When might SQL
Server use an
Index Scan instead
of an Index Seek?
SQL Server might use an Index Scan
when no appropriate index exists for the query’s WHERE clause or when a large portion of the table’s data needs to be retrieved.
It can also occur if the query is complex or requires reading the majority of the table’s rows, making a scan more efficient than performing multiple seeks.
Adding or optimizing indexes can often reduce unnecessary scans.
What are some
other important
operators seen in
an SQL execution
plan?
Other important operators in an execution plan include:
Table Scan, Nested Loop Join, Hash Join, Sort
Table Scan: Reads all rows in a table when no index is available.
Nested Loop Join: Iterates over rows from one table for each row of another, useful for small datasets.
Hash Join: Uses a hashing algorithm to join tables, often for large datasets.
Sort: Orders rows based on specified columns, which can be resource-intensive for large datasets.
What does ACID
stand for in
database
management, and
why is it important?
ACID stands for Atomicity,
Consistency, Isolation, and Durability.
These are the core principles that ensure the reliable processing of the database transactions.
They are essential for
maintaining data integrity and ensuring that transactions are processed completely, accurately, and reliably, even in cases of system failure or concurrent operations.
What is Atomicity in
the context of ACID,
and how does
it affect
transactions?
Atomicity ensures that a transaction is
treated as a single unit of work, meaning
either all operations within the transaction
are completed, or none of them are. This
prevents partial updates or changes from
being saved if an error occurs, ensuring
that the database remains in a consistent
state.
How do Isolation
and Durability
contribute to
database
reliability?
Isolation ensures that transactions are
executed independently of one another,
preventing concurrent transactions from
interfering with each other. Durability guarantees
that once a transaction has been committed, the
changes are permanent and will survive system
failures. Together, these principles protect data
integrity during concurrent processing and
ensure that committed data remains safe.
What is
Consistency in
the ACID
Principles?
Consistency
means that a
transaction will
never be left in a
half-finished state.
Principles?
What are
subqueries in SQL,
and where can
they be used?
A subquery is a query nested
within another query. It can be used
in SELECT, INSERT, UPDATE, or
DELETE statements or inside
clauses like WHERE, FROM, or
HAVING. Subqueries are often
used to return data that will be used
by the main query to filter or
manipulate results.
What is SQL
Profiler, and what
is its primary
function?
SQL Profiler is a tool in SQL Server
used to monitor and capture events
happening in the database. It allows
database administrators to track and
analyze queries, transactions, and server
activity in real time, helping diagnose
performance issues, optimize queries,
and troubleshoot problems like deadlocks
or excessive resource usage.
What are some
key considerations
when using SQL
Profiler?
When using SQL Profiler, it’s important to
capture only relevant events to avoid
performance overhead on the server. Filtering by
specific databases, users, or events can reduce
the volume of data captured. Running Profiler on
a production system can also affect
performance, so it’s recommended to run traces
during off-peak hours or use it sparingly for
troubleshooting specific issues.
What is a local
temporary table
in SQL, and
how is it used?
A local temporary table in SQL is a
temporary table that is visible only to the
session that created it. It is created with a
single # before the table name (e.g.,
#TempTable). Local temp tables are
used to store temporary data for that
session and are automatically dropped
when the session ends. They are useful
for intermediate calculations or storing
temporary results.
What is a global
temporary table,
and how does it
differ from a local
temporary table?
A global temporary table is accessible
by any session or connection and is
created with two ## before the table
name (e.g., ##GlobalTempTable). It
remains available to all users until the
session that created it is closed, and no
other sessions are referencing it. Global
temp tables are less commonly used but
are helpful when multiple sessions need
access to the same temporary data.
What are some
considerations
when using
temporary tables in
SQL?
When using temp tables, it’s important
to be mindful of performance, especially
with large datasets, as temporary tables
are stored in the tempdb database.
Indexing temp tables can improve query
performance. Additionally, be sure to
drop temp tables when they are no longer
needed to free up resources, although
SQL Server often handles this
automatically for local temp tables.
What is the
REPLACE function
in SQL, and how
does it differ from
STUFF?
The REPLACE function in SQL is used
to replace al occurrences of a specified
substring within a string with another
substring. Unlike STUFF, REPLACE
doesn’t work with character positions but
instead searches for and replaces al
instances of a specified substring. It’s
useful when you need to substitute
specific characters or words within a
string.
When should you
use the NOT NULL
constraint in a
database table?
The NOT NULL constraint should be
used when a column is required to
always have a value, meaning it cannot
be left blank. It is typically applied to
columns that are essential for data
completeness, such as identifiers (e.g.,
customer ID), dates, or critical
information (e.g., names or email
addresses). It ensures that essential data
is always provided.
What is a
Check
constraint?
It hinders the insertion of
data into a column if it
doesn’t meet certain
criteria specified by the
user. For example an
Email column must contain
‘@’ as well as a ‘.’.