WEEK5 Flashcards

(51 cards)

1
Q

Pandas

A

A Python library used to work with data in a structured way (like working with tables in Excel).

2
Q

Why is pandas used?

A

it makes it easier to store, manipulate and analyze data.

3
Q

pandas series

A

Like a column in a spreadsheet, or a list of values with a label attached to each value. A ONE-DIMENSIONAL array of indexed data.

4
Q

what happens if you create a series without specifying an index?

A

pandas assigns default integer indexes (0, 1, 2, 3, ..)
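
A minimal sketch of this default, using made-up values:

```python
import pandas as pd

# No index is given, so pandas falls back to the positions 0, 1, 2, 3
data = pd.Series([0.25, 0.5, 0.75, 1.0])
default_labels = list(data.index)
print(default_labels)  # [0, 1, 2, 3]
```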

5
Q

.values

A

returns the actual data inside the Series (as a NumPy array)

6
Q

.index

A

returns the index labels of the Series
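
A small sketch of .values and .index on the same Series. The labels "a" to "d" and the numbers are assumptions chosen for illustration:

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0], index=["a", "b", "c", "d"])

values = data.values   # NumPy array holding the data
labels = data.index    # Index object holding the labels

print(values)
print(list(labels))
```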

7
Q

numpy

A

A Python library for working with numbers and arrays in a fast and efficient way.

8
Q

What's the relationship between pandas and NumPy?

A

pandas is built on top of numpy

9
Q

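Does pandas follow the same indexing rules as Python lists and NumPy arrays (data[start:end])?

A

Yes, when slicing by position (the implicit integer index, or iloc): the end is not included. Slicing by explicit label is the exception, because it includes the end label.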

10
Q

does pandas allow non-sequential numeric indexes (2, 5, 3, 7, …)

A

yes

11
Q

where does index matter: pandas or numpy?

A

pandas

12
Q

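What's the difference in slicing between Python and pandas?

A

When you slice by explicit label (for example data['a':'c'], or with loc), pandas includes the end label. Python lists, and pandas slices by position, exclude the end.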
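
The difference can be sketched as follows. The Series contents are made up, and loc / iloc are used to make the two slicing styles explicit:

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

by_label = data.loc["a":"c"]         # label slice: end label "c" is included
by_position = data.iloc[0:2]         # position slice: position 2 is excluded
plain_list = [10, 20, 30, 40][0:2]   # Python list: end is excluded too

print(list(by_label))     # [10, 20, 30]
print(list(by_position))  # [10, 20]
print(plain_list)         # [10, 20]
```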

13
Q

pandas dataframe

A

Like a table: it is made up of multiple Series aligned together. It has row indices and column names.
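
One way to picture this is to build a DataFrame from two Series that share the same labels. The state names and numbers below are made up for illustration:

```python
import pandas as pd

# Made-up numbers, for illustration only
population = pd.Series({"California": 100, "Texas": 200})
area = pd.Series({"California": 40, "Texas": 50})

# Each Series becomes a column; the shared labels become the row index
states = pd.DataFrame({"population": population, "area": area})

print(list(states.index))    # ['California', 'Texas']
print(list(states.columns))  # ['population', 'area']
```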

14
Q

What's special about a pandas Index when it comes to changing it?

A

An Index is immutable: you cannot change its individual values in place.
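
A short sketch of what immutability means in practice. The values are made up; the second half shows that replacing the whole index of a Series is still possible:

```python
import pandas as pd

ind = pd.Index([2, 3, 5, 7, 11])

try:
    ind[1] = 0          # in-place assignment on an Index is not allowed
    changed = True
except TypeError:
    changed = False

# Swapping in a completely new index is a different operation and works
s = pd.Series([1.0, 2.0], index=["a", "b"])
s.index = ["x", "y"]

print(changed)        # False
print(list(s.index))  # ['x', 'y']
```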

15
Q

why can you not change the indexes in pandas?

A
  1. prevents unintended changes
  2. makes data structures more efficient
  3. allows the same index to be used in multiple dataframes without risk of modification
16
Q

what does ‘intersection’ (&) do

A

finds elements that exist in both indA and indB

17
Q

what does union do? (|)

A

combines all elements from both sets, keeping only unique values.
so indA = pd.Index([1, 3, 5, 7, 9])
indB = pd.Index([2, 3, 5, 7, 11])

union: 1,2,3,5,7,9,11

18
Q

what does ‘symmetric difference’ (^) do

A

finds elements that are only in one of the two sets (but not in both)
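
The three set operations can be sketched with the indA and indB values from the union card. This sketch uses the method forms (intersection, union, symmetric_difference) rather than the &, |, ^ operators, because the operators on an Index may behave differently depending on the pandas version:

```python
import pandas as pd

indA = pd.Index([1, 3, 5, 7, 9])
indB = pd.Index([2, 3, 5, 7, 11])

both = indA.intersection(indB)              # in indA and in indB
either = indA.union(indB)                   # in indA or indB, no duplicates
only_one = indA.symmetric_difference(indB)  # in exactly one of the two

print(sorted(both.tolist()))      # [3, 5, 7]
print(sorted(either.tolist()))    # [1, 2, 3, 5, 7, 9, 11]
print(sorted(only_one.tolist()))  # [1, 2, 9, 11]
```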

19
Q

Pandas provides three methods for indexing (loc, iloc, ix)

A

1. Using explicit index (loc): selects based on the actual index label. Works like a dictionary-style lookup.

2. Using implicit index (iloc): selects based on the position number (like a list).

3. Using ix (removed): the ix method combined both explicit and implicit indexing, but it was deprecated and then removed in pandas 1.0.
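
A minimal sketch of loc versus iloc. The integer labels 1, 3, 5 are an assumption chosen so that labels and positions disagree:

```python
import pandas as pd

# Labels 1, 3, 5 deliberately differ from positions 0, 1, 2
data = pd.Series(["a", "b", "c"], index=[1, 3, 5])

by_label = data.loc[1]           # the value whose label is 1
by_position = data.iloc[1]       # the value at position 1
label_slice = data.loc[1:3]      # labels 1 through 3, end label included
position_slice = data.iloc[1:3]  # positions 1 and 2, end excluded

print(by_label, by_position)  # a b
print(list(label_slice))      # ['a', 'b']
print(list(position_slice))   # ['b', 'c']
```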

20
Q

Ufuncs

A

NumPy universal functions: functions that work element-wise on arrays. They are useful for mathematical transformations in data analysis.
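
As an illustration, np.sqrt is one such ufunc. Applied to a Series with made-up values, it works on every element and keeps the index labels:

```python
import numpy as np
import pandas as pd

ser = pd.Series([1, 4, 9], index=["a", "b", "c"])
roots = np.sqrt(ser)   # applied to each element; labels are preserved

print(list(roots))        # [1.0, 2.0, 3.0]
print(list(roots.index))  # ['a', 'b', 'c']
```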

21
Q

When performing operations between two Pandas Series, how does Pandas align the data?

A

Pandas matches data by index labels, not by position.

22
Q

What happens if an index is missing in one of the Pandas Series during an operation?

A

The result for that index will be NaN (Not a Number).

23
Q

If population has ‘Texas’ and area has ‘Texas’, what happens when calculating population / area?

A

The operation works because both Series have the ‘Texas’ index.

24
Q

What happens if population has ‘New York’ but area does not when calculating population / area?

A

The result for ‘New York’ is NaN because one value is missing.

25
What does (population / area).fillna(0) do?
It replaces all NaN results with 0.
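A sketch of the alignment cards above, with made-up numbers. 'Texas' is in both Series; 'New York' and 'California' are each in only one:

```python
import pandas as pd

# Made-up numbers, for illustration only
population = pd.Series({"Texas": 200.0, "New York": 100.0})
area = pd.Series({"Texas": 50.0, "California": 40.0})

density = population / area   # matched by label; unmatched labels give NaN
filled = density.fillna(0)    # replace every NaN with 0

print(density["Texas"])    # 4.0
print(filled["New York"])  # 0.0
```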
26
rng.randint(0, 20, (2, 2))
Generates random integers from 0 to 19 (20 is exclusive). (2, 2) means 2 rows and 2 columns, so this creates a 2x2 NumPy array filled with random integers.
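A runnable sketch, assuming rng is a np.random.RandomState (the card does not say how rng was created, and the seed 42 is an arbitrary choice):

```python
import numpy as np

rng = np.random.RandomState(42)   # assumption: rng is a RandomState
A = rng.randint(0, 20, (2, 2))    # integers from 0 to 19, shape 2 x 2

print(A.shape)  # (2, 2)
```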
27
What are the two ways missing values can be represented in Pandas?
None → used for objects (like strings); NaN (Not a Number) → used for numerical data
28
What happens when None appears in a numerical NumPy array or Series?
In a numerical pandas Series, None is automatically converted to NaN. In a plain NumPy array, None is kept and the dtype becomes object.
29
What is the dtype of this array: np.array([1, None, 3, 4])?
dtype=object because None is not a number.
30
Why does adding None to a NumPy array change the dtype to object?
Because None is not numeric, NumPy treats the array as a collection of Python objects instead of numbers.
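The two behaviors can be compared side by side, using the array from the card above:

```python
import numpy as np
import pandas as pd

arr = np.array([1, None, 3, 4])   # NumPy keeps None as a Python object
ser = pd.Series([1, None, 3, 4])  # pandas converts None to NaN

print(arr.dtype)  # object
print(ser.dtype)  # float64
```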
31
Object-oriented programming
A way of writing computer programs using objects
32
Object
Objects represent real-world entities (like a car, a person, or an account)
33
Classes
Classes define the blueprint or structure for objects
34
There are two alternatives to object-oriented programming (OOP): imperative programming and declarative programming
Imperative programming focuses on step-by-step instructions. Declarative programming focuses on what to do rather than how to do it.
35
an example of imperative programming
A recipe where you tell the computer what to do in order.
36
an example of declarative programming
Telling a restaurant, "I want a pizza": you don’t explain how to make it.
37
The core principles of object-oriented programming 1. Encapsulation
Encapsulation: Groups related variables and functions together. You don’t need to know how a car engine works: just use the steering and pedals. Everything (engine, fuel, wires) is hidden inside.
38
The core principles of object-oriented programming 1. Encapsulation 2. Abstraction
Abstraction: Focuses only on important details, ignoring unnecessary ones. Simplifies complexity: You drive a car without knowing how fuel turns into motion.
39
The core principles of object-oriented programming 1. Encapsulation 2. Abstraction 3. Polymorphism
Polymorphism: The same function works differently for different objects. One object, many forms: Different people drive the same car in their own way.
40
The core principles of object-oriented programming 1. Encapsulation 2. Abstraction 3. Polymorphism 4. Inheritance
New things can borrow features from existing ones. Reuses features: A SportsCar inherits from Car but adds speed features.
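A rough sketch of the four principles, reusing the Car and SportsCar names from the card above. The attributes, method names, and messages are hypothetical, chosen only for illustration:

```python
class Car:
    def __init__(self, name):
        self.name = name   # state stored on the object
        self._fuel = 100   # leading underscore: internal detail by convention

    def drive(self):       # callers use drive() without seeing the internals
        return f"{self.name} drives at normal speed"


class SportsCar(Car):      # inherits name, _fuel and drive from Car
    def drive(self):       # same method name, different behavior
        return f"{self.name} drives fast"


cars = [Car("Sedan"), SportsCar("Roadster")]
messages = [c.drive() for c in cars]
print(messages)
```

Calling drive() on each object in the same loop gives a different result per class, which is the polymorphism part.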
41
why use OOP? 1. modularity 2. reusability 3. scalability 4. efficiency
- Modularity: Code is organized into reusable objects, so it is easier to understand, fix, or change one part without breaking the whole program.
- Reusability: Objects and classes can be reused across programs.
- Scalability: Easier to extend and maintain as programs grow.
- Efficiency: Helps in structuring complex systems effectively.
42
each object has 3 fundamental characteristics
  1. identity: each object has a unique identifier that distinguishes it from other objects
  2. state: the attributes or properties of an object, which can be modified but start out based on the class initialization
  3. behavior: defined by the methods that an object can perform or respond to
43
Which is the cake and which is the recipe: object or class?
Object: actual cake Class: recipe
44
Constructor
A special method that initializes objects when they are created (in Python, __init__)
45
explain the example of a new phone with a constructor
When you first turn it on, it asks for your name, language, and Wi-Fi. That setup is the constructor.
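The phone setup could look like this as a Python class. The class name Phone, its attributes, and the values passed in are all hypothetical:

```python
class Phone:
    def __init__(self, owner, language, wifi):
        # Runs automatically when Phone(...) is called
        self.owner = owner
        self.language = language
        self.wifi = wifi


my_phone = Phone("Sam", "English", "HomeNetwork")
print(my_phone.owner, my_phone.language, my_phone.wifi)
```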
46
Why do we use .groupby() when working with sales data?
To combine similar rows (like same store and country) and calculate total or average sales per group.
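A small sketch with made-up sales rows. The column names store, country, and sales are assumptions:

```python
import pandas as pd

# Made-up sales rows, for illustration only
df = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "country": ["NL", "NL", "NL", "DE"],
    "sales": [10, 20, 5, 15],
})

total_per_store = df.groupby("store")["sales"].sum()
mean_per_group = df.groupby(["store", "country"])["sales"].mean()

print(total_per_store["A"])            # 30
print(mean_per_group[("A", "NL")])     # 15.0
```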
47
What does df represent in Pandas?
df is the variable name for your DataFrame — your table of data.
48
What’s the correct basic syntax to filter a DataFrame?
Use df[condition] — only the condition goes inside the square brackets.
49
Should you ever write df(df[...]) when filtering?
No. df(...) tries to call the DataFrame like a function, which raises an error. Use square brackets with only the condition inside: df[condition].
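A minimal filtering sketch, with a made-up table and a hypothetical threshold of 20:

```python
import pandas as pd

df = pd.DataFrame({"store": ["A", "B", "C"], "sales": [10, 25, 40]})

condition = df["sales"] > 20   # boolean Series: False, True, True
high_sales = df[condition]     # keeps only the rows where condition is True

print(list(high_sales["store"]))  # ['B', 'C']
```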
50
51