一、数据表的创建 1、导入数据表
df1 = pd.read_excel('pandas_test.xlsx')
2、字典方式创建数据表
(columns参数不仅可以提供字段名,还可以确定字段顺序)
df2 = pd.DataFrame({"name":["Tom","Marry","Luse"],
"age":[18,19,20],
"enjoy":["football","swim","study"],
"city":["beijing","nanjing","tianjing"]
},columns =['name','age','enjoy','city'],index=list("abc"))
3、列表方式创建数据表
df3 = pd.DataFrame([["Tom",18,"football","beijing"],
["Marry",19,"swim","nanjing"],
["Luse",20,"study","tianjing"]]
,columns =['name','age','enjoy','city'],index=list("abc"))
二、数据的输出 1、文件单次覆盖写入
df.to_excel(excel_writer='demo.xlsx', sheet_name='sheet_1')
2、文件连续写入(文件存在会报错)
writer = pd.ExcelWriter("demo.xlsx")
2.1、同一文件不同sheet
df.to_excel(excel_writer=writer, sheet_name='sheet_1')
df2.to_excel(excel_writer=writer, sheet_name='sheet_2')
2.2、同一sheet表格(startcol代表起始行数、startrow代表起始列数)
df.to_excel(excel_writer=writer,startcol=1,startrow=1)
df2.to_excel(excel_writer=writer,startcol=1,startrow=6)
保存关闭
writer.save()
writer.close()
遇到如下问题:
ValueError: Excel does not support datetimes with timezones. Please ensure that datetimes are timezone unaware before writing to Excel.
excel并不支持pandas带时区(timezones)的时间数据导出!所以需要通过该函数去掉时区
1.检查哪些列具有时区datetime64[ns,UTC]
df.dtypes
2.去掉该列的时区
df[‘date’] = df[‘date’].dt.tz_localize(None)
3.导出execl
df.to_excel(r’filename.xlsx’)
|