Pandas and SQL Flashcards
(101 cards)
<p>What happens if execute the following code?</p>
<p>Create newTable(col1 INT, col2 CHAR(25));</p>
<p>It will raise me error because you forgot to mention the keyword table .</p>
<p>Correct Syntax</p>
<p>Create table newTable(col1 INT, col2 CHAR(25);</p>
<p>What is the difference between unqiue Key and primary key?</p>
<p>There may be many unique key columns in a table but there can be only ONE primary key column.</p>
<p>What happens when I assign a certain column as Unique key?</p>
<p>It makes sure that all the entries in that column are unique and has no duplicates</p>
<p>Create a table with 1)Auto increment 2)uinique primary key 3)char , int and date-time columns .</p>
<p>CREATE TABLE Worker ( WORKER_ID INT NOT NULL PRIMARY KEY AUTO_INCREMENT, FIRST_NAME CHAR(25), LAST_NAME CHAR(25), SALARY INT(15), JOINING_DATE DATETIME, DEPARTMENT CHAR(25) );</p>
<p>What happens if I try to execute the following code?</p>
<p>CREATE TABLE NewTable(S.No INT PRIMARY KEY);</p>
<p>It will raise and error because '.' is not allowed for COLUMN name assingment. '_' Is allowed.So the appropriate word would be 'S_No'</p>
<p>Insert values into table that has columns 'Column1'(INT) and 'Column2'(string)</p>
<p>INSERT INTO Table_Name (Column1, Column2) VALUES (001, 'Monika')</p>
<p>What happens if I execute this ?</p>
<p>INSERT INTO TABLE newtable3(col1,col2,col3) VALUES (1,2,'Hello');</p>
<p>This will raise me an error because we DONT HAVE to specify keyword table in this case .</p>
<p>Correct Syntax:</p>
<p>INSERT INTO newtable3(col1,col2,col3) VALUES (1,2,'Hello');</p>
<p>What is the differnce between create and Insert in terms of syntax?</p>
<p>In case of Create we will have to mention the keyword TABLE</p>
<p>But in case of Insert we SHOULDNT mention the keyword , or else it will give me syntax error</p>
<p>Get all unique elements from duplicates in certain column of sql table</p>
<p>SELECT DISTINCT column1, column2, ...FROM table_name;</p>
<p>Get all elements form a certain column in SQL table</p>
<p>Select column_name from table_name</p>
<p>Get elements from certain column while specifying criteria SQL</p>
<p>SELECT column1, column2, ... FROM table_name WHERE column1>thresh_val;</p>
<p>Get elements that DO NOT satisfy specific condition in said column</p>
<p>SELECT column1, column2, ...FROM table_nameWHERE NOT colum1="Sunny";</p>
<p>Get elements with multiple condtions are applied at the same time</p>
<p>SELECT * FROM Customers WHERE Country='Germany' AND City='Berlin';</p>
<p>Get elments that is union of completely different search filters</p>
<p>SELECT * FROM CustomersWHERE Country='Germany' OR Country='Spain';</p>
<p>Get elements in Ascending order</p>
<p>SELECT * FROM CustomersORDER BY Country ASC;</p>
<p>Get elements in Descedning order</p>
<p>SELECT * FROM CustomersORDER BY Country DESC;</p>
<p>Update specific records of a table based on search conditions</p>
<p>UPDATE Customers SET ContactName = 'Alfred Schmidt', City= 'Frankfurt' WHERE CustomerID = 1;</p>
<p>What happens if I do this?UPDATE CustomersSET ContactName='Juan';</p>
<p>All entries will get updated regradless of any country , age , City etc. It is essential that while updating the correct filters are applied .Or it will corrupt entire data.</p>
<p>What happens if we execute the following code?</p>
<p>UPDATE Customers SET ContactName = 'Amos' AND Country = 'India' WHERE CustomerID = 1;</p>
<p>I will raise me an error , because for set compnonet we shouldnt use AND, instead we should use comma</p>
<p>UPDATE Customers SET ContactName = 'Amos',Country='India' WHERE CustomerID = 1;</p>
<p>How to delete specific elements in table?</p>
<p>DELETE FROM table_name WHERE Name = "Sunny";</p>
<p>What is one important warning to remember while deleting entries in SQL ?</p>
<p>We should careful regarding the condition we set in DELETE query DELETE FROM table_name WHERE condition;If we are not mindful of the condition then we might end up deleting the wrong entries.</p>
<p>How to delete all records from a certain table?</p>
<p>DELETE FROM table_name;</p>
<p>What happens if we execute the follwing line?DELETE Age from Table where Name='Sunny';</p>
<p>It will raise me a syntax error .Because we cant delete a specific column from table . Deletion removes the complete entry ,not just a column.</p>
<p>How to update all entries in Table?</p>
<p>UPDATE CustomersSET ContactName = 'Alfred Schmidt';</p>
What happens if I exectue the following line?Update Student_Table where Name= "Sunny"
It will raise me an error because we havent specified the "set" component in update.
What are important components of update statement?
Udate table , SET values and condition
What are important components of delete statement?
delete table and condition
What happens if I excute the following code?
CREATE TABLE NewTable(Column INT AUTO_INCREMENT PRIMARY KEY);
It wont raise an error . The order of AUTO_INCREMENT and the key assignment doesent matter
What happens if I execute the following line?Update Student_Table set Age = 20 where "Name" = Sunny
It will raise me an error because the field should not be a string and the value its checking (Sunny) should be a string.
How to get top 10 elements of a table
SELECT S.No,NameFROM StudentLIMIT 10;
How to get bottom 10 element
SELECT S.No,NameFROM Student ORDER by S.No DESCLIMIT 10;
What type of statistical attributes can we calculate from numerical fields in data table?
We can calclulate , COUNT , MIN , MAX, SUM AND AVG
Format to calculate the statistical attributes of a table
Select MIN(Age) from Students where City = "Frankfurt"We can use the same for COUNT , MAX, SUM or AVG
Get Elements where the name starts with letter 'a'
SELECT * FROM CustomersWHERE CustomerName LIKE 'a%';
What key word should i use in case I want to find words that follow a certain pattern?
We should use the key words "LIKE"
Get elements that end with letter 'a'
SELECT * FROM CustomersWHERE CustomerName LIKE '%a';
What happens if I use the following syntax?SELECT * FROM CustomersWHERE CustomerName LIKE '*a';
This will raise me and error if I am using mySQL . But that wont be the case if we are using MS Access
Get elements where the middle word is "or"
SELECT * FROM CustomersWHERE CustomerName LIKE '%or%';
Get element that has second word 'r'
SELECT * FROM CustomersWHERE CustomerName LIKE '_r%';
Get elements that has starting letter 'a' and has atleast 3 words in the string
SELECT * FROM CustomersWHERE CustomerName LIKE 'a__%'; Note: There are 2 '_' in the syntax above
Get elements that start with 'a' and end with 'o'
SELECT * FROM CustomersWHERE ContactName LIKE 'a%o';
Get Customers with a CustomerName that does NOT start with "a"
SELECT * FROM CustomersWHERE CustomerName NOT LIKE 'a%';
Get all customers with a City starting with "b", "s", or "p"
SELECT * FROM Customers WHERE CustomerName LIKE '[bsp]%';
Get all customers with a City starting with "a", "b", "c","d"....till "p"
SELECT * FROM CustomersWHERE City LIKE '[a-p]%';
Get all customers with a City NOT starting with "b", "s", or "p"
SELECT * FROM CustomersWHERE City LIKE '[!bsp]%';
Get all customers that are located in "Germany", "France" or "UK"
SELECT * FROM CustomersWHERE Country ="Germany" OR Country ="France" OR Country ="UK"
What is an alternative for getting customers from "Germany", "France" or "UK"
SELECT * FROM CustomersWHERE Country IN ('Germany', 'France', 'UK');
Get all customers that are NOT located in "Germany", "France" or "UK"
SELECT * FROM CustomersWHERE Country NOT IN ('Germany', 'France', 'UK');
Alternative for getting customers that are not located in "Germany", "France" or "UK"
SELECT * FROM CustomersWHERE NOT Country ="Germany" OR NOT Country ="France" OR NOT Country ="UK"
Why do we use IN functionality when we could use OR logic or other logical expressions to get our results
Because we cant always cover ALL checking conditions . Sometimes there might be too many to cover ."IN" can be useful in situations like these.
Get all customers that are from the same countries as the suppliers
SELECT * FROM Customers WHERE Country IN (SELECT Country FROM Suppliers); Note : If we used OR logical expressions in statements like these it would take forever.We might as well search manually
Also to be noted that we are using 2 tables here
Get all products with a price between 10 and 20
SELECT * FROM ProductsWHERE Price BETWEEN 10 AND 20;
Get all products outside a specific range like say less than 10 or greater than 20
SELECT * FROM ProductsWHERE Price NOT BETWEEN 10 AND 20;
Get all products with a price between 10 and 20. AND In addition; do not show products with a CategoryID of 1,2, or 3:
SELECT * FROM ProductsWHERE Price BETWEEN 10 AND 20AND CategoryID NOT IN (1,2,3);
What happens if I execute the following syntax?SELECT * FROM CustomersWHERE Country NOT IN ['Germany', 'France', 'UK'];
It raises me an error , Square brackets are not allowed .It should be ('Germany', 'France', 'UK');
What happens if I try to arrange names in descending order example : Select name from student order by name DESC;
It will arrange in all names in opposite chronological order , Example : Z to A Manjimup , Lousiana , Herbert etc
Get all products with a ProductName BETWEEN Carnarvon Tigers and Chef Anton's Cajun Seasoning
SELECT * FROM ProductsWHERE ProductName BETWEEN "Carnarvon Tigers" AND "Chef Anton's Cajun Seasoning";
Get all products with a ProductName BETWEEN Carnarvon Tigers and Chef Anton's Cajun Seasoning WHILE maintaining chronological order
SELECT * FROM ProductsWHERE ProductName BETWEEN "Carnarvon Tigers" AND "Chef Anton's Cajun Seasoning" ORDER BY ProductName;
Get all products with a ProductName NOT BETWEEN Carnarvon Tigers and Mozzarella di Giovanni
SELECT * FROM ProductsWHERE ProductName NOT BETWEEN 'Carnarvon Tigers' AND 'Mozzarella di Giovanni';
How are primary key and unique key similar?
A primary key is by default a unqiue key but a unique key is not primary key .
What is an important charecteristc of primary key?
A primary key is a key that is assigned to a column and identifies each record of a table unqiuely , that said there CAN BE only one primary key .
What happens if I execute this?
CREATE TABLE newTable4(col1 INT AUTO_INCREMENT);
It will raise me an error , because you CANT assign AUTO_INCREMENT without assining key to that column.
What happens if I execute the following?
CREATE TABLE newTable4(col1 UNIQUE KEY AUTO_INCREMENT INT);
It will raise an error because you need to mention the type of the field first before you can assign tokens such as UNQIUE KEY or AUTO_INCREMENT
What happens if I execute the following code?
SELECT DISTINCT(COL1) FROM table_name
I tiwll raise me an error.You cant use "()"
Correct syntax :
SELECT DISTINCT COL1 FROM table_name
What happens if I execute the following code?
CREATE TABLE table_name(AUTO_INCREMENT PRIMARY KEY)
It will raise and error because we havent specified column name
What happens if I excute the follwing code ?
CREATE TABLE table_name (col_name INT, PRIMARY KEY , AUTOINCERMENT)
I will raise an error because of ',' in between . There should be only spaces
Line to read a CSV file
import pandas as pd
df=pd.read_csv("k.csv")
What does df.head(3) do ?
Prints the first 3 entires
What does df.tail(3) do ?
Fetches the last 3 entries of all entries
What is the line to read excel sheets
excel_data_df = pandas.read_excel('records.xlsx', sheet_name='Employees')
How do you read a file that ISNT CSV but has different delimitter? 8:18
df = pd.read_csv("pk_data.txt".delimiter='\t' )
Line to get names of all columns in the data file
df.columns
Line to print one certain column
dt['Col_Name']
Line to get specific column of top 5 entries
dt['Col'][0:5]
Line to get multple selective columns
dt[['Col1','Col2']]
Line to get all details concerning row at certain position
df.iloc[1]
(Getting entry at position 2 zero indexing applies)
Line to get all details concerning multiple first few row
df.iloc[0:2]
Line to get a specific column OF a certain entry using ONLY INDICES
df.iloc[1,2]
1 is the second entry
And 2 stands for the 3rd column (We have to account for the zero indexing)
Line to iterate through all entries of datasheet
for index,df in df.iterrows():
print(index,df)
/////df stands for data frame/////
Line to iterate through rows and print specific columns of those entries
for index,df in df.iterrows():
print(index,df['Name'])
Line to to fetch data entries that satisfy a certain condition(Search filters).
df.loc[df['Type 1'] == Grass]
df.loc['Type 1'] returns all data strings under Type 1 column and then the condition checks which of them are 'Grass'
Line to fetch data that has multiple filter layers
df.loc[df['Type 1'] == Grass].loc[df['Type 1'] == Fire]]
Line to get standard deviation of a certain column
df['Age'].describe()['std']
What is the use of describe() is pandas
It displays data like count ,mean , min , std, 25percent, 75 and max
Sort values in ascending order
df.sort_values('Power')
Sort names in leographic order
df.sort_values('Name')
Sort elements in descending order
df.sort_values('Name',ascending = False)
Sort first column with ascending and second column with desc
df.sort_values(['Type1','HP'],ascending=[1,0])
Line to Create new column which is function of previous columns
df['Total'] = df['HP']+df['Attack']-df['Damage']
/////It automatically creates a new column for the rest ////
Remove muliple columns
df = df.drop(columns = ['Total1','Total2'])
Getting stats for specific column
df["Age"].describe()
What does descibe function of pandas dataframe return me ?
It returns me a dictionary of various quantites
Line to get a specific column (VIA LABEL) OF a certain entry (VIA INDEX)
first = data.iloc[0]['Age']
We used 0 as index for entry (Meaning first row ) and label as identifier for specific column (Age is the column)
Line to get standard deviation of all columns
df.describe()['std']
Error debugging #1
What happens if you write print(df[Name])
It will raise an error because in df[Name] we should pass a stiring, not just Name(Which might work if it was defined as a string var)
What is the difference between iloc and loc functions ?
loc is used for fetching entires based on boolean conditions
whereas
iloc is used for fetching results based on simple indices.
(Hence the use of i, which probably means index locator)
Line for filtering data of a certain column using indexof column BUT NOT LABEL
df.loc[df.iloc[:,6]
Line for printing specific entires based on comparision filters
df.loc[df['Col1']
Line for printing specific columns AFTER performing a search criteria
df.loc[df['Units']
Line to replace all entries that contain specific element(string or value) with a DIFFERENT element
df.replace(to_replace ="Boston Celtics",value ="Omega Warrior")
Line to replace all entries that contain specific element (string or value) that belong to a set with a DIFFERENT element
df.replace(to_replace =["Boston Celtics", "Texas"], value ="Omega Warrior")