Pandas Flashcards
pandas data structure: Series. What is it?
A Series is a one-dimensional array-like object containing a sequence of values (similar to a NumPy array) and an associated array of index labels.
obj=pd.Series([4,5,-3,2])
obj
0    4
1    5
2   -3
3    2
dtype: int64
output the array values
obj.values
array([ 4, 5, -3, 2], dtype=int64)
output the index of a pandas Series
obj.index
RangeIndex(start=0, stop=4, step=1)
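The cards above can be tried together in one short sketch. It assumes pandas is installed and uses to_numpy(), which returns the same underlying array as .values; the variable names values and index_labels are only illustrative.

```python
import pandas as pd

# Build the Series from the card; no index is given, so pandas
# attaches a default RangeIndex (0, 1, 2, 3).
obj = pd.Series([4, 5, -3, 2])

values = obj.to_numpy()          # the data as a NumPy array
index_labels = list(obj.index)   # the index materialized as a list

print(values)
print(obj.index)
```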
how to assign a different index in a pandas Series
#Specify a different index
obj2=pd.Series([4,5,-3,2],index=['d','c','a','b'])
obj2
get a value by its index label in a pandas Series
pandas has more flexibility in how it uses an index than NumPy.
obj2['c']
5
get the same value by integer position instead of label
Use .iloc for positional access. Plain obj2[1] on a label-indexed Series is deprecated in recent pandas versions.
obj2.iloc[1]
5
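A minimal sketch contrasting the two access styles, assuming the same obj2 as in the cards. It uses the explicit .loc (label) and .iloc (position) accessors, which avoid any ambiguity between labels and positions.

```python
import pandas as pd

obj2 = pd.Series([4, 5, -3, 2], index=['d', 'c', 'a', 'b'])

by_label = obj2.loc['c']      # look up by index label
by_position = obj2.iloc[1]    # look up by integer position (second element)

print(by_label, by_position)
```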
get 2 values using the assigned labels in a pandas Series.
obj2[['a','d']]
#['a','d'] is a list of index labels. It returns a subset of the original Series, which is also a Series.
a   -3
d    4
dtype: int64
you can do NumPy-like operations on a Series.
obj2[obj2>0]
d    4
c    5
b    2
dtype: int64
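The filter works in two steps, which this sketch spells out: the comparison produces a boolean Series, and indexing with that mask keeps only the rows marked True. The names mask and positive are illustrative.

```python
import pandas as pd

obj2 = pd.Series([4, 5, -3, 2], index=['d', 'c', 'a', 'b'])

mask = obj2 > 0        # boolean Series aligned on the same index
positive = obj2[mask]  # keeps only the entries where mask is True

print(mask)
print(positive)
```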
find data type
type(new)
find missing data in pandas
pd.isnull(obj4)
obj3.isnull()
boolean check for data that is not missing
pd.notnull(obj4)
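The cards never show how obj4 was built, so the sketch below makes one up: it is an assumption that obj4 comes from reindexing a small Series so that 'bread' has no value. With that setup, isnull and notnull return complementary boolean masks.

```python
import pandas as pd

# Hypothetical construction of obj4 (not shown in the cards):
# reindexing adds the label 'bread' with a missing value (NaN).
stock = pd.Series({'eggs': 10, 'ham': 20})
obj4 = stock.reindex(['eggs', 'ham', 'bread'])

missing = pd.isnull(obj4)    # True where the value is missing
present = pd.notnull(obj4)   # True where the value is present

print(missing)
print(present)
```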
assign value 300 to bread
obj4['bread']=300
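A small sketch of label assignment, again with a made-up obj4 since the cards do not define it. Assigning to an existing label overwrites the value; assigning to a label that does not exist yet appends a new entry.

```python
import pandas as pd

# Hypothetical starting data; only the 'bread' assignment comes from the card.
obj4 = pd.Series({'eggs': 10, 'ham': 20})

obj4['bread'] = 300   # 'bread' is new, so the Series grows by one entry
obj4['ham'] = 25      # 'ham' exists, so its value is overwritten

print(obj4)
```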
DataFrame
There are many possible data inputs to DataFrame, such as a NumPy array, a dict of lists or tuples, a dict of Series, a dict of dicts, and so on.
We only introduce how to construct a DataFrame from a dict of lists.
create a dataframe
create a DataFrame through a dict of equal length lists or NumPy arrays:
data={'state':['Ohio','Ohio','Ohio','Nevada','Nevada','Nevada'],
'year':[2000,2001,2002,2000,2001,2002],
'pop':[1.5,1.7,3.6,2.4,2.9,3.2]}
frame=pd.DataFrame(data)
frame
    state  year  pop
0    Ohio  2000  1.5
1    Ohio  2001  1.7
2    Ohio  2002  3.6
3  Nevada  2000  2.4
4  Nevada  2001  2.9
5  Nevada  2002  3.2
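A runnable version of the card, as a sketch. Each dict key becomes a column and the lists must have equal length. One detail worth knowing: the column named 'pop' should be read with frame['pop'], because frame.pop is an existing DataFrame method.

```python
import pandas as pd

data = {
    'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],
    'year': [2000, 2001, 2002, 2000, 2001, 2002],
    'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2],
}
frame = pd.DataFrame(data)   # keys become columns, rows get a default index

columns = list(frame.columns)
pop_2002_ohio = frame.loc[2, 'pop']   # row label 2, column 'pop'

print(frame)
print(frame['pop'])   # bracket access; frame.pop would be the method
```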
create another dataframe from dictionary
election = {'state':['New Jersey','Ohio','West Virginia'],
'Winner':['Hillary','Trump','Trump'],
'Margin':[5,7,15]}
election
type(election)
electionresult = pd.DataFrame(election)
#electionresult
electionresult.head()
show the first 2 rows of a DataFrame
electionresult2=pd.DataFrame(electionresult,index=[0,1])
electionresult2
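Besides rebuilding the frame with an explicit index, the first rows can be taken more directly. This sketch shows two common alternatives, head(2) and positional slicing with .iloc; the names first_two and same_rows are illustrative.

```python
import pandas as pd

election = {
    'state': ['New Jersey', 'Ohio', 'West Virginia'],
    'Winner': ['Hillary', 'Trump', 'Trump'],
    'Margin': [5, 7, 15],
}
electionresult = pd.DataFrame(election)

first_two = electionresult.head(2)    # first 2 rows
same_rows = electionresult.iloc[:2]   # same rows via positional slice

print(first_two)
```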
are lists mutable?
You have to understand that Python represents all its data as objects. … Some of these objects, like lists and dictionaries, are mutable, meaning you can change their content without changing their identity. Other objects, like integers, floats, strings and tuples, are immutable and cannot be changed.
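A quick way to see the difference, as a sketch in plain Python: a list keeps the same identity (id) after it is modified in place, while trying to change a tuple element raises a TypeError.

```python
items = [1, 2, 3]
id_before = id(items)
items.append(4)                        # modifies the list in place
same_object = id(items) == id_before   # identity is unchanged

point = (1, 2, 3)
tuple_is_immutable = False
try:
    point[0] = 99                      # tuples do not support item assignment
except TypeError:
    tuple_is_immutable = True

print(items, same_object, tuple_is_immutable)
```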
create a Series
a=pd.Series([1,2,3,4],['a','b','c','d'])
show the data in a series
a.values
array([1, 2, 3, 4])
show the index with pandas
a.index
set and get a value by using labels. #pandas has more flexibility in how it uses an index than NumPy.
a['c']=5
a['c']
5
numpy like operations
obj2[obj2>0]
np.exp(obj2)
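NumPy functions applied to a Series work element by element and return a new Series that keeps the original index labels. A minimal sketch, assuming the same obj2 as in the earlier cards:

```python
import numpy as np
import pandas as pd

obj2 = pd.Series([4, 5, -3, 2], index=['d', 'c', 'a', 'b'])

result = np.exp(obj2)   # e**x for each value; labels are preserved
doubled = obj2 * 2      # arithmetic is also element-wise

print(result)
print(doubled)
```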
create a series from a dictionary
dict1={'eggs':10,'ham':20}
series1=pd.Series(dict1)
series1
eggs    10
ham     20
dtype: int64
how to check whether values are missing in a series
pd.isnull(obj4)
bread False
ham False
dtype: bool
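pd.isnull reports missing values, which is different from asking whether a label exists at all. For that second question, the in operator checks the index labels, much like a dict. The sketch below uses a made-up obj4 that matches the output shown above.

```python
import pandas as pd

# Hypothetical data consistent with the card's output (no missing values).
obj4 = pd.Series({'bread': 300, 'ham': 20})

has_bread = 'bread' in obj4       # membership test on index labels
has_jam = 'jam' in obj4           # label not present
has_300 = 300 in obj4.to_numpy()  # membership test on the values instead

print(has_bread, has_jam, has_300)
```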