公众号:尤而小屋 作者:Peter 编辑:Peter
大家好,我是Peter~
本文中主要介绍的是如何来美化Pandas的DataFrame的数据。主要是通过Pandas中的两个方法来实现:
- Styler.applymap:逐个元素,返回带有CSS属性-值对的单个字符串
- Styler.apply:列、表、行的方式,返回具有相同形状的Series或者DataFrame,其中每个值都是带有CSS属性值对的字符串。该方法在作用的时候,通过参数axis来传递,axis=0表示按列作用,axis=1表示按行作用。
官网学习地址:
- https://pandas.pydata.org/pandas-docs/version/1.1.5/user_guide/style.html#Export-to-Excel
- https://pandas.pydata.org/pandas-docs/stable/user_guide/style.html
data:image/s3,"s3://crabby-images/44599/44599c2ff1a7c8f704df3e941c6b991519b5372c" alt=""
Pandas连载
Pandas的文章一直在连载中: data:image/s3,"s3://crabby-images/1b482/1b48213e5a261da638439773fb5c27276b1a1024" alt=""
模拟数据
import pandas as pd
import numpy as np
np.random.seed(20)
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})
df1 = pd.concat([df, pd.DataFrame(np.random.randn(10,4), columns=list("BCDE"))],axis=1)
df1
numpy中的linspace(start,stop,setp) :表示按照步长取数,包含头尾的数据:
data:image/s3,"s3://crabby-images/7bcf4/7bcf4346afbab0d4e84c33525dc54605a346e3aa" alt=""
为了查看空值的效果,特意设置了3个空值nan:
data:image/s3,"s3://crabby-images/820c3/820c3d9bf9fdf41c26ec7b78f6ffc75b7c725237" alt=""
查看和隐藏样式
首先是查看DataFrame的样式设置:
data:image/s3,"s3://crabby-images/cdf0c/cdf0c731a5abba8026fac6a3e9bd1bff7f1ae60f" alt=""
我们可以通过render方法来查看数据框样式的具体值:发现都是默认的CSS样式的代码
data:image/s3,"s3://crabby-images/231c1/231c174bc61ca552368639091f13d58f5da6276c" alt=""
隐藏索引
隐藏index的代码:
data:image/s3,"s3://crabby-images/6518a/6518a5552010a03e4762b842a9e8faf05a9c9068" alt=""
隐藏指定的列
我们发现AC两个列被隐藏了,使用的是参数subset参数:
data:image/s3,"s3://crabby-images/2b155/2b1551146f37a844bb9ddb07402c72ce631549d6" alt=""
下面是各种显示样式的案例讲解:
案例1:正负颜色
小于0显示红色,大于0显示蓝色
def color_change(val):
color = 'red' if val < 0 else 'blue'
return 'color: %s' % color
df1.style.applymap(color_change)
applymap方法是针对整个DataFrame的
data:image/s3,"s3://crabby-images/bd0c0/bd0c078626fbbaae9006fbaf8ef5d547cf267afa" alt=""
案例2:高亮显示数据
可以高亮显示最大、最小和缺失值。手写一个简单的高亮函数:
def highlight_min(x):
is_min = x == x.min()
return ["background-color: yellow" if v else '' for v in is_min]
data:image/s3,"s3://crabby-images/ab4cf/ab4cf9a21bae2c68d5ec52a34925630e327f0b76" alt=""
实例3:使用默认高亮函数
Pandas中已经默认写好了内置的3个高亮函数:
- highlight_max(): 最大
- highlight_min(): 最小
- highlight_null(): 空值
同时我们还可以搭配参数axis来显示行或者列
1、高亮最大值
data:image/s3,"s3://crabby-images/221c8/221c89c2209e539203925cf60317d05eceb19f91" alt=""
2、高亮每列的最小值
data:image/s3,"s3://crabby-images/26e89/26e89947fdec1cef37f1fb04930f0ba28caa6489" alt=""
3、高亮空值:默认是红色
data:image/s3,"s3://crabby-images/3ffd9/3ffd94e18b2fd07028cad52cd45f5f4b8e74c872" alt=""
当然我们是可以改变颜色的,通过使用参数null_color:
data:image/s3,"s3://crabby-images/e03ab/e03abea79c3e31749574da709b4ca5f97bedfac8" alt=""
使用参数axis=1,表示在行的方向上进行操作:
data:image/s3,"s3://crabby-images/94335/94335f655f6f427db35a34b045711c5fb1487edb" alt=""
案例4:链式调用
链式调用指的是在同一个样式操作中同时使用多个函数(方法):
df1.style.applymap(color_change).apply(highlight_min)
如果方法特别多,可以分行写:
df1.style.\
applymap(color_change).\
apply(highlight_min)
(df1.style.
applymap(color_change).
apply(highlight_min))
data:image/s3,"s3://crabby-images/2a89d/2a89d30c5d0a0fc3a292461815aa6258783c1d0f" alt=""
空值高亮和改变颜色参数同时使用,实现链式调用:
data:image/s3,"s3://crabby-images/51550/515501a64a3cb0c7133f908d296b6324bbe5c26d" alt=""
隐藏列属性和自定义参数同时使用:
data:image/s3,"s3://crabby-images/2feee/2feee1022ae46f7a91e766ac35a462f53de9e677" alt=""
实例5:部分数据美化
使用参数subset参数控制作用的行列,传入的参数可以是:
- 列标签
- 列表(numpy数组)
- 元组(row_indexer, column_indexer)
部分列属性
通过参数subset来指定我们想要美化的列属性:
data:image/s3,"s3://crabby-images/ac247/ac247008da2cd4aab8e6242194ca9a5e7c423c4d" alt=""
pd.IndexSlice构造列表
通过pd.IndexSlice来构造
data:image/s3,"s3://crabby-images/d2cf1/d2cf181c8d670d98f16f861154e2596bed603fd8" alt=""
实例6:值的格式化显示(Styler.format)
Styler.format来控制数值的格式化输出,类似于Python中的字符串格式化
整体显示
比如我们想要全部数据只显示两位小数:
data:image/s3,"s3://crabby-images/38110/38110ecd9d3dd3b2c711bd622f51b3be11ef8dee" alt=""
使用字典,格式化特定列
通过字典的形式,列属性名当做键,格式说明当做值,进行格式化设置:
data:image/s3,"s3://crabby-images/957b5/957b54c17a7e542cbbea8d8ebb20ddeb04fcff54" alt=""
使用lambda匿名函数
data:image/s3,"s3://crabby-images/766ee/766ee7a5cd438a0e228821c5b74999d88c13bfee" alt=""
data:image/s3,"s3://crabby-images/87775/8777558df36ca2339d20abe3c8d7ce40b3054988" alt=""
空值显示
通过参数na_rep,将空值进行替代:
data:image/s3,"s3://crabby-images/9c530/9c5300209166aa8ed5ac35de00e9c98097b8401b" alt=""
用“空值”两个字来替代:
data:image/s3,"s3://crabby-images/d7b2e/d7b2e91a7a5077f50a78570ea960f680aaac3eed" alt=""
链式调用同时使用“-”来替代:
data:image/s3,"s3://crabby-images/a613c/a613ca78055ce3e8860890b0527715b3bce9b8f4" alt=""
表情符
使用emoji表情符:
data:image/s3,"s3://crabby-images/95cf5/95cf51fccd53866ff824c830372611ef7bfd9fc7" alt=""
实例7:内置样式
内置函数
在这里使用的是内置的空值高亮函数:
data:image/s3,"s3://crabby-images/60749/6074991f45bd221aaf708f2b1ec3fc56abd031e4" alt=""
热力图制作
我们借助seaborn库来实现:
import seaborn as sns
cm = sns.light_palette("blue", as_cmap=True)
s = df1.style.background_gradient(cmap=cm)
s
data:image/s3,"s3://crabby-images/67771/677712274cdbbe0a71c588ce55b42eeadde4e1bf" alt=""
连续色谱
data:image/s3,"s3://crabby-images/8625f/8625f84913644ed63f2a8faa79a9fe3985fbcad2" alt=""
通过参数low和high来指定色谱的范围:
data:image/s3,"s3://crabby-images/3a6e9/3a6e96c170ece45174566886284e910e88a0039b" alt=""
set_properties使用
df1.style.set_properties(**{"background":"yellow",
"color":"red",
"border-color":"black"})
data:image/s3,"s3://crabby-images/c7839/c7839023f069e020371ab50cbfb8221f7a57bd9f" alt=""
实例8:个性化条形图
默认方式
通过bar方法来操作:
data:image/s3,"s3://crabby-images/0e3ae/0e3ae60360aa92813e60b420e5be0430c420028e" alt=""
参数align
有3种对齐方式:
- left:最小值从单元格的左侧开始
- zero:零值位于单元格的中心
- mid:单元格中的中心在(max-min)/2处
data:image/s3,"s3://crabby-images/d5b30/d5b3054b0f03f3f6d4bd74bd4d2e5b00cbe3719b" alt=""
data:image/s3,"s3://crabby-images/265e1/265e1d083c4b9b9594c30e71f287d4c9b2fb02f4" alt=""
data:image/s3,"s3://crabby-images/b64c9/b64c962ecbb5527549c0d4687321bfb6434f528d" alt=""
实例9:样式共享
假设我们某个一个DataFrame创建了一个样式,然后想在另一个DataFrame中直接使用这个样式,该怎么做呢?
data:image/s3,"s3://crabby-images/ce7bc/ce7bc9502a99bc2fd5487d2ac642bcaae7583e91" alt=""
我们先创建一个样式style1:
data:image/s3,"s3://crabby-images/7349b/7349b63795eb894ee3775f19795308e9f923e218" alt=""
将style1运用到style2中:
data:image/s3,"s3://crabby-images/5b555/5b555561356052028f934138d5d7e649c18d9c74" alt=""
实例10:设置精度set_precision
通常是给数据框中的数据指定精度(小数位):
data:image/s3,"s3://crabby-images/75a6b/75a6b6380dc028d3b4d3958e1c5d1c1994748044" alt=""
data:image/s3,"s3://crabby-images/0d404/0d4044dfdc8f21c5ffad5f402473b86d45ecc040" alt=""
实例11:设置标题
使用set_caption方法
data:image/s3,"s3://crabby-images/1ea0f/1ea0fee9f4568438ad618600b1f1321e8a4f3efa" alt=""
缺失值设置
使用的是set_na_rep函数:
(df1.style.set_na_rep("FAIL")
.format(None, na_rep="PASS", subset=["D"])
.highlight_null("yellow"))
data:image/s3,"s3://crabby-images/b1453/b14539092195a58fea8b97c7b6c2eb703e888dce" alt=""
实例13:综合案例
(df1.style
.set_na_rep('-')
.format({'C':'{:.4f}',
'D':'{:.4f}',
'E':'{:.2%}'},na_rep="-")
.highlight_null("green")
.applymap(color_change,subset=['D'])
.apply(highlight_min,subset=['C'])
.background_gradient(cmap='cubehelix',subset=['B','D'])
.bar(subset=['E'], align='mid', color=['#60BCD4','#6A1B9A'])
.set_caption("Title of Pandas Style")
.hide_index()
.hide_columns(subset=['A']))
data:image/s3,"s3://crabby-images/e19f9/e19f9929a4dd5e8d78192ea28a9713c8b068113a" alt=""
实例14:终极武器
style1 = [
dict(selector="th", props=[("font-size", "125%"),
("text-align", "center"),
("background-color", "#F0F3CF"),
('width',"100px"),
('height','80px')]),
dict(selector="td", props=[("font-size", "105%"),
("text-align", "right"),
('width',"150px"),
('height','50px')]),
dict(selector="caption", props=[("caption-side", "top"),
("font-size","150%"),
("font-weight","bold"),
("text-align", "left"),
('height','50px'),
('color','#E74C3C')])]
style2 = {
'A': [dict(selector='td', props=[('text-align','center'),
("font-weight","bold"),
("text-transform","capitalize")])],
'B': [dict(selector='td', props=[('text-align','left'),
("font-style","italic")])],
'C': [dict(selector='td', props=[('text-decoration','underline'),
('text-decoration-color','red'),
('text-decoration-style','wavy')])]}
(df1.style
.set_na_rep('-')
.format({'C':'{:.4f}','D':'{:.4f}','E':'{:.2%}'},na_rep="-")
.highlight_null("gray")
.applymap(color_change,subset=['B']).highlight_max(subset=['A'])
.background_gradient(cmap='Pastel1',subset=['C','D'])
.bar(subset=['E'], align='mid', color=['#90BCD4','#6A1B9A'])
.set_caption("Advanced use of Pandas Style")
.hide_index()
.hide_columns(subset=['E'])
.set_table_styles(style1).set_table_styles(style2,overwrite=False)
.set_properties(**{'font-family': 'Microsoft Yahei','border-collapse': 'collapse',
'border-top': '1px solid black','border-bottom': '1px solid black'}))
data:image/s3,"s3://crabby-images/8d466/8d46608a12b2809b31f71d9a4c7cbf0c3b1b4290" alt=""
输出到Excel
这是一个还在开发阶段的功能,将DataFrame使用openyxl或者xlswriter作为引擎导出到Excel表格中,官网是这样说的:
data:image/s3,"s3://crabby-images/9817d/9817d610702c4d790fb02abb2c334f6b5f5b4e4e" alt=""
在这里看一个简单的案例:
(df1.style
.applymap(color_change)
.apply(highlight_min)
.to_excel('styled.xlsx', engine='openpyxl'))
data:image/s3,"s3://crabby-images/fb098/fb098c92981b119dc53c816b26a0e4d921d3780c" alt=""
如果我们不想要索引号,添加index=False:
(df1.style
.applymap(color_change)
.apply(highlight_min)
.to_excel('styled.xlsx', engine='openpyxl', index=False))
data:image/s3,"s3://crabby-images/2bda8/2bda85b4f3247a102333a1c34aa46faeb9746a5d" alt=""
|