pandas setup
# PyTables is published on PyPI as "tables"; openpyxl is needed to read .xlsx files
pip install pandas tables openpyxl
import pandas as pd
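pd.set_option('display.max_rows', None)             # never truncate rows when printing
pd.set_option('display.expand_frame_repr', False)   # do not wrap wide frames across lines
pd.set_option('display.precision', 2)               # bare 'precision' is ambiguous in recent pandas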
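The options above change the display settings for the whole session. When a setting is only needed for one printout, `pd.option_context` can apply it temporarily. The following is a minimal sketch; the Series `s` and its values are made up for illustration.

```python
import pandas as pd

s = pd.Series([3.14159, 2.71828])  # hypothetical sample values

before = pd.get_option("display.precision")

# Options set through the context manager are restored when the block exits.
with pd.option_context("display.precision", 2):
    inside = pd.get_option("display.precision")
    short_repr = repr(s)

after = pd.get_option("display.precision")
print(short_repr)
```

This keeps the global defaults untouched, which can be convenient in shared notebooks.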
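Importing and inspecting data with pandas (DataFrame)
Importing data
This example reads an Excel file; CSV, HDF, and other formats can be loaded in a similar way (see the official documentation for details).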
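df = pd.read_excel('test.xlsx',
    sheet_name="sheet 1",    # worksheet to read
    header=0,                # first remaining row holds the column names
    skiprows=1,              # skip the first line of the sheet
    index_col=0,             # use the first column as the index
    parse_dates=['date'],    # parse the 'date' column as datetimes
    # read_excel takes no encoding argument, so none is passed here
)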
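As a rough illustration of the "similar way" mentioned above, the same `header`, `index_col`, and `parse_dates` arguments also work with `pd.read_csv`. This sketch reads from an in-memory string so that it runs without any file on disk; the column names and values are invented for the example.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file.
csv_text = """id,date,value
1,2021-01-01,10.5
2,2021-01-02,11.0
"""

df_csv = pd.read_csv(
    io.StringIO(csv_text),   # a file path such as 'test.csv' would also work here
    header=0,                # first row holds the column names
    index_col=0,             # use the 'id' column as the index
    parse_dates=["date"],    # parse the 'date' column as datetimes
)
print(df_csv.dtypes)
```

Unlike `read_excel`, `read_csv` does accept an `encoding` argument, which matters when reading real files that are not UTF-8.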
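Inspecting
df.head(5)      # first 5 rows
df.tail(5)      # last 5 rows
df.sample(3)    # 3 randomly chosen rows
df.describe()   # summary statistics (numeric columns by default)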
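A few other quick checks often go together with the calls above. The sketch below uses a small made-up frame named `demo`; `shape`, `dtypes`, and `info()` are standard DataFrame members, and `random_state` makes `sample` reproducible.

```python
import pandas as pd

# Hypothetical data, only for illustration.
demo = pd.DataFrame({
    "name": ["jack", "lili", "amy", "tom", "bob", "eve"],
    "score": [78, 86, 97, 100, 65, 88],
})

print(demo.shape)    # (rows, columns)
print(demo.dtypes)   # data type of each column
demo.info()          # index, columns, non-null counts, memory usage

# Passing random_state returns the same rows on every call.
picked = demo.sample(3, random_state=0)
print(picked)
```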
Selecting data from a DataFrame
Creating test data
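# A plain list of rows keeps columns b and c numeric;
# np.array would coerce every value in this mixed data to a string.
rows = [
    ["jack", 78, 10],
    ["lili", 86, 0],
    ["amy", 97, 20],
    ["tom", 100, 30],
]
df = pd.DataFrame(rows, index=["1", "2", "3", "4"], columns=["a", "b", "c"])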
print(df)
"""
      a    b   c
1  jack   78  10
2  lili   86   0
3   amy   97  20
4   tom  100  30
"""
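Selecting by column name
df[['a']]        # a list of names returns a one-column DataFrame
df['a']          # a single name returns a Series
df.loc[:, 'a']   # same Series as df['a']; its output is shown below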
"""
1    jack
2    lili
3     amy
4     tom
Name: a, dtype: object
"""
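The three expressions above do not all return the same type. A minimal sketch of the difference, using a two-row copy of the test data under the hypothetical name `demo`:

```python
import pandas as pd

demo = pd.DataFrame(
    [["jack", 78, 10], ["lili", 86, 0]],
    index=["1", "2"],
    columns=["a", "b", "c"],
)

as_frame = demo[["a"]]       # list of names: DataFrame with one column
as_series = demo["a"]        # single name: Series
via_loc = demo.loc[:, "a"]   # label-based equivalent: Series

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
```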
df.loc[:, ['a','c']]
"""
      a   c
1  jack  10
2  lili   0
3   amy  20
4   tom  30
"""
df.loc[:, 'a':'c']
"""
      a    b   c
1  jack   78  10
2  lili   86   0
3   amy   97  20
4   tom  100  30
"""
Selecting by row label
df.loc['1']
"""
a    jack
b      78
c      10
Name: 1, dtype: object
"""
df.loc[['1','3']]
"""
      a   b   c
1  jack  78  10
3   amy  97  20
"""
df.loc['1': '3']
"""
      a   b   c
1  jack  78  10
2  lili  86   0
3   amy  97  20
"""
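Selecting by row and column position
df.iloc[0]           # first row, as a Series
df.iloc[:, 1]        # second column, as a Series
df.iloc[0:2, 0:2]    # first two rows and first two columns (end position excluded)
df.iat[1,2]          # scalar at row position 1, column position 2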
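One difference between the label-based and position-based slices shown so far is easy to miss: `loc` slices include the end label, while `iloc` slices exclude the end position, as in ordinary Python slicing. A short sketch, rebuilding the test data under the hypothetical name `demo`:

```python
import pandas as pd

demo = pd.DataFrame(
    [["jack", 78, 10], ["lili", 86, 0], ["amy", 97, 20], ["tom", 100, 30]],
    index=["1", "2", "3", "4"],
    columns=["a", "b", "c"],
)

by_label = demo.loc["1":"3"]      # labels "1", "2", "3": end label included
by_position = demo.iloc[0:2]      # positions 0 and 1: end position excluded
block = demo.iloc[0:2, 0:2]       # 2 x 2 block from the top-left corner

print(by_label.index.tolist())
print(by_position.index.tolist())
```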
Arbitrary access
df.loc[:]
"""
      a    b   c
1  jack   78  10
2  lili   86   0
3   amy   97  20
4   tom  100  30
"""
df.at['1','a']
"""
'jack'
"""
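Row and column selectors can also be combined in a single `loc` call, and `at` / `iat` return single values by label or by position. The following sketch assumes the same test data, rebuilt here under the hypothetical name `demo` so that it runs on its own:

```python
import pandas as pd

demo = pd.DataFrame(
    [["jack", 78, 10], ["lili", 86, 0], ["amy", 97, 20], ["tom", 100, 30]],
    index=["1", "2", "3", "4"],
    columns=["a", "b", "c"],
)

picked = demo.loc[["1", "3"], ["a", "c"]]   # chosen rows and chosen columns
b_slice = demo.loc["2":"4", "b"]            # row label slice, one column: Series

by_label = demo.at["1", "a"]      # scalar lookup by row and column label
by_position = demo.iat[1, 2]      # scalar lookup by row and column position

print(picked)
print(by_label, by_position)
```

For single values, `at` and `iat` are the dedicated scalar accessors; `loc` and `iloc` return the same value but are more general.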