Pandas Flashcards

(62 cards)

1
Q

Create a new DataFrame

A

df = pd.DataFrame() (Mind the capital letters!)

2
Q

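axis=0 parameter

A

The row axis (the index). Reductions such as sum() run down the rows and return one result per column; drop() looks for the label among the rows.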

3
Q

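axis=1 parameter

A

The column axis. Reductions such as sum() run across the columns and return one result per row; drop() looks for the label among the columns.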
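
A small sketch of how the two axis values behave. The frame, its column names, and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"W": [1, 2, 3], "Z": [10, 20, 30]})

# axis=0 collapses the rows: one result per column
col_totals = df.sum(axis=0)

# axis=1 collapses the columns: one result per row
row_totals = df.sum(axis=1)

# With drop(), axis says where the label lives: axis=1 means among the columns
without_z = df.drop("Z", axis=1)
```

Here col_totals is indexed by column name and row_totals by row label, which is usually the quickest way to check which axis was used.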

4
Q

Select columns W and Z from the DataFrame

A

df[['W', 'Z']]

5
Q

Create a new column in the DataFrame

A

df['new'] = df['W'] + df['Y']

6
Q

Algorithm to keep only the rows of a DataFrame that come before a given value

A

Find the index of the first matching row using line = df.loc[df['COL'] == 'Limite'].index.min()

df = df[df.index < line]
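
The two steps can be tried on a toy frame. This is a minimal sketch that assumes a default integer index; the VAL column and the sample values are hypothetical:

```python
import pandas as pd

# Hypothetical data: the row holding 'Limite' marks where to cut
df = pd.DataFrame({"COL": ["a", "b", "Limite", "c"], "VAL": [1, 2, 3, 4]})

# Index label of the first row whose COL equals the marker value
line = df.loc[df["COL"] == "Limite"].index.min()

# Keep only the rows before the marker (relies on a sorted integer index)
df = df[df.index < line]
```

If the marker value never appears, index.min() returns NaN and the comparison keeps no rows, so it may be worth checking for that case first.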

7
Q

Keep only selected columns of a DataFrame

A

col_names = ['Data', 'A']

df = df[col_names]

8
Q

Uppercase the column names

A

df.columns = df.columns.str.upper()

9
Q

Replace NaN in a column with a value

A

df['A'] = df['A'].fillna('0,0')

10
Q

Apply a function to a Series

A

df['A'] = df['A'].apply(function)

11
Q

Change a DataFrame column from string to date

A

import pandas as pd

df['A'] = pd.to_datetime(df['A'], format='%d/%m/%y')
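
A short illustration of the conversion, assuming day/month/two-digit-year strings such as 31/12/21. The sample values are made up:

```python
import pandas as pd

# Hypothetical strings in day/month/two-digit-year form
df = pd.DataFrame({"A": ["31/12/21", "01/02/22"]})

# %d = day, %m = month, %y = two-digit year (use %Y for four digits)
df["A"] = pd.to_datetime(df["A"], format="%d/%m/%y")
```

Passing an explicit format avoids pandas guessing whether 01/02 means 1 February or 2 January.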

12
Q

Truncate (cut) the values in a column

A

df['A'] = df['A'].map(str).str.slice(0, 10)

13
Q

Consolidate DataFrames into one final DataFrame

A

frames.append(df_bradesco)  # frames is a plain Python list of DataFrames
df_final = pd.concat(frames, axis=0)
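
A minimal sketch of the collect-then-concatenate pattern, assuming the first line of the answer appends to a plain Python list. The names frames and df_other and all the sample data are hypothetical; only df_bradesco and df_final come from the card:

```python
import pandas as pd

# Hypothetical per-source frames sharing the same columns
df_bradesco = pd.DataFrame({"KEY": [1, 2], "A": ["x", "y"]})
df_other = pd.DataFrame({"KEY": [3], "A": ["z"]})

# Collect the pieces in a list first
frames = []
frames.append(df_bradesco)
frames.append(df_other)

# Stack them row-wise in one call; ignore_index renumbers the rows 0..n-1
df_final = pd.concat(frames, axis=0, ignore_index=True)
```

Concatenating once at the end is generally cheaper than growing a DataFrame piece by piece inside a loop.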

14
Q

Convert a column to string type

A

df['A'] = df['A'].astype(str)

15
Q

Remove duplicate rows from a DataFrame

A

df.drop_duplicates(inplace=True)

16
Q

Left join of two DataFrames

A

df_final = pd.merge(df_final, df_teste, how='left', left_on='KEY', right_on='KEY')
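
A toy example of what a left join keeps. The X and Y columns and all values are hypothetical, and on='KEY' is used as shorthand for identical left_on/right_on names:

```python
import pandas as pd

# Hypothetical frames sharing a KEY column
df_final = pd.DataFrame({"KEY": ["a", "b", "c"], "X": [1, 2, 3]})
df_teste = pd.DataFrame({"KEY": ["a", "c"], "Y": [10.0, 30.0]})

# Every row of the left frame survives; unmatched keys get NaN on the right side
merged = pd.merge(df_final, df_teste, how="left", on="KEY")
```

Row "b" has no partner in df_teste, so its Y value comes back as NaN rather than the row being dropped.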

17
Q

For a given row of a Series, assign a value taken from another Series

A

df.at[chave_base, 'A'] = df_teste.at[chave_teste, 'A']

18
Q

Saving DataFrames to an Excel Workbook

A

from pandas import ExcelWriter

with ExcelWriter('filename.xlsx') as writer:
    df1.to_excel(writer, sheet_name='Sheet1')  # the file is saved when the with block exits

19
Q

Save DataFrame as a dictionary

A

d = df.to_dict()

20
Q

Save DataFrame as a string

A

s = df.to_string()

21
Q

Save DataFrame as a NumPy array

A

m = df.to_numpy()

22
Q

Transpose rows and columns in a DataFrame

A

df = df.T

23
Q

Iterate over columns

A

df.items()

24
Q

Iterate over rows

A

df.iterrows()

25
Filter a DataFrame with like
df = df.filter(like='x')
26
Get first column label in a DataFrame
label = df.columns[0]
27
Get list of column labels in a DataFrame
lis = df.columns.tolist()
28
Get an array of column labels in a DataFrame
a = df.columns.values
29
Select column to series
s = df['colName']
30
Select column to DataFrame
df = df[['colName']]
31
Select all but last column to a DataFrame
df = df[df.columns[:-1]]
32
Swap columns content in a DataFrame
df[['B', 'A']] = df[['A', 'B']]
33
Dropping (deleting) columns
df.drop('col1', axis=1, inplace=True)
34
Apply log to a column
df['log_data'] = np.log(df['col1'])
35
Set column values based on criteria
df['d'] = df['a'].where(df['b'] != 0, other=df['c'])
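
where() keeps the original value wherever the condition is True and takes the replacement elsewhere, which is easy to read backwards. A small sketch with made-up values:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"a": [1, 2, 3], "b": [0, 5, 0], "c": [10, 20, 30]})

# Keep 'a' where b != 0; otherwise fall back to the value in 'c'
df["d"] = df["a"].where(df["b"] != 0, other=df["c"])
```

Only the middle row has a non-zero b, so it is the only row that keeps its value from a.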
36
Find index label for min/max values in column
label_min = df['col1'].idxmin()
label_max = df['col1'].idxmax()
37
Absolute value (modulus) of a column in a DataFrame
df['col'] = df['col'].abs()
38
Convert column to date
s = pd.to_datetime(df['col'])
39
Create a column with a rolling pct change
s = df['col'].pct_change(periods=4)
40
Create a column with a rolling calculation
s = df['col'].rolling(window=4, min_periods=4, center=False).sum()
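
A quick sketch of what the window and min_periods arguments do, using a made-up six-row column:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"col": [1, 2, 3, 4, 5, 6]})

# Sum over a trailing window of 4 rows; with min_periods=4 the first
# three rows do not have enough history and come back as NaN
s = df["col"].rolling(window=4, min_periods=4, center=False).sum()
```

Lowering min_periods would fill those leading rows with partial sums instead of NaN.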
41
Append a column of row sums in a DataFrame
df['Total'] = df.sum(axis=1)
42
Get the integer position of a column index label
i = df.columns.get_loc('col_name')
43
Adding rows to a DataFrame
df = pd.concat([original_df, more_rows_in_df])
44
Dropping rows (by name)
df = df.drop('row_label')
45
Boolean row selection by values greater than or equal to
df = df[df['col2'] >= 0.0]
46
Boolean row selection by values with OR condition
df = df[(df['col3'] >= 0.0) | (df['col1'] < 0.0)]  # parentheses around each comparison are required
47
Boolean row selection by values in list
df = df[df['col'].isin([1, 2, 5, 7, 11])]
48
Boolean row selection by values NOT in list
df = df[~df['col'].isin([1, 2, 5, 7, 11])]
49
Boolean row selection by values containing
df = df[df['col'].str.contains('hello')]
50
Get integer position of rows that meet condition
a = np.where(df['col'] >= 2)[0]  # NumPy array of integer positions
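
Called with only a condition, np.where returns a tuple holding one array per dimension, so the positions themselves sit at index 0. A minimal sketch with a hypothetical string-labelled frame:

```python
import numpy as np
import pandas as pd

# Hypothetical data with non-integer row labels
df = pd.DataFrame({"col": [1, 3, 0, 2]}, index=["w", "x", "y", "z"])

result = np.where(df["col"] >= 2)  # a tuple containing a single array
positions = result[0]              # integer positions, not index labels

# Positions pair with iloc, not loc
rows = df.iloc[positions]
```

Because these are positions, they stay valid even when the index labels are strings or dates.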
51
Find row index duplicates
if df.index.has_duplicates: print(df.index.duplicated())
52
Select a cell by row and column labels
value = df.at['row', 'col']  # .at[] is the fastest label-based scalar lookup
53
Grouping with an aggregating function
s = df.groupby('cat')['col1'].sum()
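
A small sketch of the shape this returns, using made-up categories and values:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"cat": ["a", "b", "a", "b"], "col1": [1, 2, 3, 4]})

# One sum per distinct value of 'cat'; the result is a Series indexed by 'cat'
s = df.groupby("cat")["col1"].sum()
```

Calling s.reset_index() afterwards turns the group labels back into an ordinary column.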
54
Change a string to lower case
s = df['col'].str.lower()
55
Get the length of the strings in a column
s = df['col'].str.len()
56
Append values to a string
df['col'] += 'suffix'
57
Filter a DataFrame with a list of items
df = df.filter(items=['a', 'b'], axis=0)  # axis=0 matches row labels
58
Remove rows with NaN
df.dropna(inplace=True)
59
Definition of the strip method for strings
Returns a copy of the string with leading and trailing characters removed. The optional string argument is the set of characters to strip; when it is omitted, whitespace is stripped.
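
A brief illustration of both forms, plus the pandas equivalent through the .str accessor. The sample strings are made up:

```python
import pandas as pd

# No argument: leading and trailing whitespace is removed
plain = "  hello  ".strip()

# The argument is a set of characters, not a prefix or suffix:
# any run of 'x' or 'y' at either end is removed
chars = "xxhelloxyx".strip("xy")

# Element-wise version for a pandas Series
s = pd.Series(["  a ", "b  "]).str.strip()
```

Characters in the middle of the string are never touched, only the two ends.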
60
61
62