SQL Flashcards Preview

Data Sci/Programming > SQL > Flashcards

Flashcards in SQL Deck (49):

relational database

A relational database is a database that organizes information into one or more tables.


SQL statement

SQL statement is text that the database recognizes as a valid command. Statements always end in a semi-colon ;.


let's create a new table in SQL

CREATE TABLE celebs (id INTEGER, name TEXT, age INTEGER); CREATE TABLE is a clause that tells SQL you want to create a new table. 2. celebs is the name of the table. 3. (id INTEGER, name TEXT, age INTEGER) is a list of parameters defining each column in the table and its data type. id is the first column in the table. It stores values of data type INTEGER name is the second column in the table. It stores values of data type TEXT age is the third column in the table. It stores values of data type INTEGER


Add a row to the table.

INSERT INTO celebs (id, name, age) VALUES (1, 'Justin Bieber', 21);


How to query only interested columns.

SELECT is used every time you want to query data from a database. SELECT column1, column2 FROM table_name;


SELECT * FROM celebs;

You can also query data from all columns in a table with SELECT. SELECT statements always return a new table called the result set.


let's edit a row

UPDATE celebs SET age = 25 WHERE id = 1;


Add a new column to the table.

ALTER TABLE celebs ADD COLUMN twitter_handle TEXT;


Delete all of the rows that have a NULL value in the twitter column.

DELETE FROM celebs WHERE twitter_handle IS NULL;


Create a new table with constraints on the values

CREATE TABLE awards ( id INTEGER PRIMARY KEY, recipient TEXT NOT NULL, award_name TEXT DEFAULT "Grammy");


Constraints that add information about how a column can be used are invoked after specifying the data type for a column. They can be used to tell the database to reject inserted data that does not adhere to a certain restriction. examples?

1. PRIMARY KEY columns can be used to uniquely identify the row. Attempts to insert a row with an identical value to a row already in the table will result in a constraint violation which will not allow you to insert the new row. 2. UNIQUE columns have a different value for every row. This is similar to PRIMARY KEY except a table can have many different UNIQUE columns. 3. NOT NULL columns must have a value. Attempts to insert a row without a value for a NOT NULL column will result in a constraint violation and the new row will not be inserted. 4. DEFAULT columns take an additional argument that will be the assumed value for an inserted row if the new row does not specify a value for that column.



SQL is a programming language designed to manipulate and manage data stored in relational databases.One of the core purposes of the SQL language is to retrieve information stored in a database. This is commonly referred to as querying. A relational database is a database that organizes information into one or more tables. A table is a collection of data organized into rows and columns.


how to renamed the "name" column to Movies in table "movies"

SELECT name AS 'Movies' FROM movies; AS is a keyword in SQL that allows you to rename a column or table using an alias. The new name can be anything you want as long as you put it inside of single quotes. It is important to remember that the columns have not been renamed in the table. The aliases only appear in the result.


DISTINCT statement

DISTINCT is used to return unique values in the output. It filters out all duplicate values in the specified column(s).e.g. SELECT DISTINCT tools FROM inventory;


WHERE clause

We can restrict our query results using the WHERE clause in order to obtain only the information we want. Following this format, the statement below filters the result set to only include top rated movies (IMDb ratings greater than 8): SELECT * FROM movies WHERE imdb_rating > 8;


what is is an operator

Operators create a condition that can be evaluated as either true or false. Comparison operators used with the WHERE clause are: = equal to != not equal to > greater than < less than


LIKE statement

LIKE is a special operator used with the WHERE clause to search for a specific pattern in a column. e.g. The movies table contains two films with similar titles, 'Se7en' and 'Seven'. SELECT * FROM movies WHERE name LIKE 'Se_en'; here, name LIKE 'Se_en' is a condition evaluating the name column for a specific pattern. Se_en represents a pattern with a wildcard character. The _ means you can substitute any individual character here without breaking the pattern. The names Seven and Se7en both match this pattern.


wildcard character % and _


% is a wildcard character that matches zero or more missing letters in the pattern. This statement below filters the result set to only include movies with names that begin with the letter 'A': SELECT * FROM movies WHERE name LIKE 'A%'; another e.g. , any movie that contains the word 'man' in its name will be returned in the result below. SELECT * FROM movies WHERE name LIKE '%man%'; LIKE is not case sensitive. 'Batman' and 'Man of Steel' will both appear in the result of the query above.

The _ means you can substitute any individual character here without breaking the pattern.


To filter for all movies with an IMDb rating in the name column of table movies

SELECT name FROM movies WHERE imdb_rating IS NOT NULL;


BETWEEN operator

BETWEEN operator can be used in a WHERE clause to filter the result set within a certain range. The values can be numbers, text or dates. This statement filters the result set to only include movies with names that begin with letters 'A' up to but not including 'J'. SELECT * FROM movies WHERE name BETWEEN 'A' AND 'J'; In this statement, the BETWEEN operator is being used to filter the result set to only include movies with years between 1990 up to and including 1999. SELECT * FROM movies WHERE year BETWEEN 1990 AND 1999; Really interesting point to emphasize again: BETWEEN two letters is not inclusive. BETWEEN two numbers is inclusive.


how to combine multiple conditions in a WHERE clause to make the result set more specific and useful.

e.g. we use the AND operator to only return 90's romance movies. SELECT * FROM movies WHERE year BETWEEN 1990 AND 1999 AND genre = 'romance';


Suppose we want to check out a new movie or something action-packed:

SELECT * FROM movies WHERE year > 2014 OR genre = 'action';



ORDER BY always goes after WHERE! ORDER BY is a clause that indicates you want to sort the result set by a particular column. if we want to sort everything by the movie's title from A through Z SELECT * FROM movies ORDER BY name; Sometimes we want to sort things in a decreasing order. For example, if we want to select all of the well-received movies, sorted from highest to lowest by their year: SELECT * FROM movies WHERE imdb_rating > 8 ORDER BY year DESC; DESC is a keyword used in ORDER BY to sort the results in descending order (high to low or Z-A). ASC is a keyword used in ORDER BY to sort the results in ascending order (low to high or A-Z). The column that we ORDER BY doesn't even have to be one of the columns that we're displaying.


CASE statement

It is SQL's way of handling if/then logic. A CASE statement allows us to create different outputs (usually in the SELECT statement). SELECT name, CASE WHEN imdb_rating > 7 THEN 'Good' WHEN imdb_rating > 5 THEN 'Okay' END FROM movies; Each WHEN tests a condition. The following THEN gives us the string if the condition is true. CASE statement must end with END. Here, we set all movies above 8 as 'Good' and movies above 5 'Okay'.


code for - Use a CASE statement to create a column called mood that is: fun if genre is romance fun if genre is comedy serious in all other cases

SELECT name, CASE WHEN genre = 'romance' OR 'comedy' THEN 'fun' ELSE 'serious' END AS 'mood' FROM movies;


what are aggregates. recall some important aggregates 

Calculations performed on multiple rows of a table are called aggregates.

Here is a quick preview of some important aggregates 

COUNT: count the number of values

SUM: add up all of the values in a column

MAX/MIN: get the largest/smallest value

AVG: get the mean for all values in a column

ROUND: round the values in the column


The fastest way to calculate how many rows are in a table is to use what 

The fastest way to calculate how many rows are in a table is to use the COUNT function.

COUNT is a function that takes the name of a column as an argument and counts the number of non-empty values in that column.

SELECT COUNT(*) FROM table_name;

Here, we want to count every row, so we pass *as an argument inside parenthesis.


define SUM function

SUM is a function that takes the name of a column as an argument and returns the sum of all the values in that column.e.g.

SELECT SUM(downloads) FROM fake_apps;

This adds all values in the downloads column.


define MAX and MIN function

MAX takes the name of a column as an argument and returns the largest value in that column. Here, we returned the largest value in the downloadscolumn.

MIN works the same way but it does the exact opposite; it returns the smallest value.e.g. 

How many downloads does the most popular app have?

SELECT MAX(downloads) FROM fake_apps;

The most popular app has 31,090 downloads!


ROUND function , How does it work

ROUND function takes two arguments inside the parenthesis:

a column name

an integer

It rounds the values in the column to the number of decimal places specified by the integer.

SELECT ROUND(price, 0) FROM fake_apps;

Here, we pass the column price and integer 0 as arguments. SQL rounds the values in the column to zero decimal places in the output.


define GROUP BY clause

GROUP BY is a clause in SQL that is used with aggregate functions. It is used in collaboration with the SELECT statement to arrange identical data into groups.

The GROUP BY statement comes after any WHEREstatements, but before ORDER BY or LIMIT.e.g. 

SELECT year, AVG(imdb_rating) FROM movies GROUP BY year ORDER BY year;


how to Make the code here simplified -

we might want to know how many movies have IMDb ratings that round to 1, 2, 3, 4, 5. We could do this using the following syntax:

SELECT ROUND(imdb_rating),
FROM movies
GROUP BY ROUND(imdb_rating)
ORDER BY ROUND(imdb_rating);

However, this query may be time-consuming to write and more prone to error.

The following query is equivalent to the above:

SELECT ROUND(imdb_rating), COUNT(name) FROM movies GROUP BY 1 ORDER BY 1;

Here, the 1 refers to the first column in our SELECT statement, ROUND(imdb_rating) 

SQL lets us use column reference(s) in our GROUP BY that will make our lives easier.

1 is the first column selected, 2 is the second column selected, and so on.


Difference between WHERE and HAVING clause


HAVING is very similar to WHERE. In fact, all types of WHERE clauses you learned about thus far can be used with HAVING.

When we want to limit the results of a query based on values of the individual rows, we use WHERE.

When we want to limit the results of a query based on an aggregate property, we use HAVING.

HAVING statement always comes after GROUP BY, but before ORDER BY and LIMIT.e.g. 

 SELECT price, 
 FROM fake_apps
 GROUP BY price;

Certain price points don't have very many apps, so the average is less meaningful.

Add a HAVING clause to restrict the query to prices where the total number of apps at that price point is greater than 9.

 SELECT price, 
 FROM fake_apps
 GROUP BY price


Suppose we have the three tables below. How do you combine table orders and table customers  using join clause,

Q image thumb


FROM orders

JOIN customers

  ON orders.customer_id = customers.customer_id

If we only want to select certain columns, we can specify which ones we want.e.g.

SELECT orders.order_id, customers.customer_name

FROM orders

JOIN customers

ON orders.customer_id = customers.customer_id


Describe what this code does -


FROM orders

JOIN subscriptions

ON orders.subscription_id = subscriptions.subscription_id

WHERE subscriptions.description = 'Fashion Magazine';

only selects rows from the join where description is equal to 'Fashion Magazine'.

Again, WHERE statement is used to refine and limit the results of a query based on values of individual rows


What are inner joins?

Consider an inner join of two tables on table1.key = table2.key:

suppose each table has 3 rows. The first and last rows have matching values of key. The middle rows do not match. The final result has all values form the first and last row, but does not include the non-matching middle row.


What data does the code below retrieve

FROM newspaper;

FROM online;

FROM newspaper
JOIN online
    ON newspaper.id = online.id;

A image thumb


An inner JOIN does not return unmatched rows, while LEFT JOIN does. Describe how it works

A left join will keep all rows from the first table, regardless of whether there is a matching row in the second table.  The final result will keep all rows of the first table but will omit the un-matched row from the second table.


How do you define primary keys, and what are the requirements?


Def-  a column in a table that uniquely identifies each row of that table. 

Primary keys have a few requirements:

None of the values can be NULL

Each value must be unique (i.e., you can't have two customers with the same customer_id in the customers table)

A table can not have more than one primary key column


How do you define a foreign key? and how is it used the task of  joining tables?

When the primary key for one table appears in a different table, it is called a foreign key. So customer_id is a primary key when it appears in customers, but a foreign key when it appears in orders.

 The most common types of joins will be joining a foreign key from one table with the primary key from another table. For instance, when we join orders and customers, we join on customer_id, which is a foreign key in orders and the primary key in customers.


What is a CROSS JOIN and how is it  constructed?

CROSS JOIN is used to combine all rows of one table with all rows of another table. A cross join is used when you wish to create combination of every row from two tables.  All row combinations are included in the result; this is commonly called cross product join.  A common use for a cross join is to create obtain all combinations of items, such as colors and sizes. e.g

SELECT shirts.shirt_color, pants.pant_color

FROM shirts


The first two lines select the columns shirt_color and pant_color

The third line pulls data from the table shirts

The fourth line performs a CROSS JOINwith pants

Notice that cross joins don't require an ONstatement. You're not really joining on any columns!

If we have three different shirts (red, yellow, and blue) and two different pants (navy and black), the results might look like this:

shirt_color pant_color

red navy

red black

yellow navy

yellow black

blue navy

blue black



A more common usage of CROSS JOIN is when we need to compare each row of a table to a list of values. how? 

Let's start by counting the number of customers who were subscribed to the newspaper during March (3).


-- First query


FROM newspaper

WHERE start_month < 3

AND end_month > 3;

The previous query lets us investigate one month at a time. In order to check across all months, we're going to need to use a cross join.

Our database contains another table called months which contains the numbers between 1 and 12. Select all columns from the cross join of newspaper and months.

-- Second query  


FROM newspaper

CROSS JOIN months;

Create a third query where you add a where statement to your cross join. The column month should be greater than start_month, but less than end_month. This will select all months where a user was subscribed.

-- Third query


FROM newspaper


WHERE start_month < month

AND end_month > month;


Create a final query where you aggregate over each month.

-- Fourth query

SELECT months.month,


FROM newspaper


WHERE start_month < month

AND end_month > month

GROUP BY months.month;


What  clause allows us to  stack one dataset on top of the other.?



Suppose we have two tables and they have the same columns:


name email

Sonny sonny@foobar.com

Eric posture@boy.net


name email

Chloe wallflower1991@gsnail.com

Ned sk8terboi@lotmail.com



FROM table1



FROM table2;

SQL has strict rules for appending data:

Tables must have the same # of columns.

The columns must have the same data types in the same order as the first table.


what is and when is the WITH clause used

WITH allows us to define a bunch of temporary tables that can be used in the final query. i.e. allows us to combine two tables, when one of the tables is the result of another calculation. 

The WITH statement allows us to perform a separate query (such as aggregating customer's subscriptions)

previous_results is the alias that we will use to reference any columns from the query inside of the WITH clause

We can then go on to join our results with another table. e.g. 

WITH previous_query AS (
SELECT customer_id,
       COUNT(subscription_id) as subscriptions
FROM orders
GROUP BY customer_id)
SELECT customers.customer_name,
FROM previous_query
JOIN customers
    ON customers.customer_id = previous_query.customer_id;



SELECT last_name, first_name

FROM employees

WHERE last_name = 'Johnson'

ORDER BY last_name DESC, first_name ASC;

When sorting your result set using the SQL Server ORDER BY clause, you can use the ASC and DESC attributes in a singleSELECT statement.This SQL Server ORDER BY would return all records sorted by the last_name field in descending order, with a secondary sort by first_name in ascending order.


expln wht ths does

INSERT INTO contacts

(contact_id, last_name, first_name)

SELECT employee_id, last_name, first_name

FROM employees

WHERE first_name = 'Joanne'

AND employee_id >= 800;

the AND condition can be used in the INSERT statement.

This SQL Server AND condition example would insert into the contacts table, all employee_id, last_name, and first_namerecords from the employees table where the first_name is 'Joanne' and the employee_id is greater than or equal to 800.


When combining AND condition and the OR condition together, it is important to use parentheses so that the database knows what order to evaluate each condition.e.g.

SELECT employee_id, last_name, first_name

FROM employees

WHERE (last_name = 'Smith')

OR (last_name = 'Anderson AND first_name = 'Sarah')

OR (employee_id > 1000 and state = 'California');



SELECT DISTINCT first_name, last_name

FROM employees

WHERE employee_id >=50

ORDER BY last_name;

This SQL Server DISTINCT clause example would return each unique first_name and last_name combination from the employees table where the employee_id is greater than or equal to 50. The results are sorted in ascending order by last_name.


The SQL Server (Transact-SQL) IN condition is used to help reduce the need to use multiple OR conditions in a SELECT, INSERT, UPDATE, or DELETE statement. e.g.



FROM employees

WHERE last_name IN ('Smith', 'Anderson', 'Johnson');

is th sme as


FROM employees

WHERE last_name = 'Smith'

OR last_name = 'Anderson'

OR last_name = 'Johnson';