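The Tokyo 2020 Olympic Games, officially the Games of the XXXII Olympiad, were an international multi-sport event hosted by Tokyo, Japan. They opened on July 23, 2021 and closed on August 8, 2021. Because of the COVID-19 pandemic, the road to staging the Games was difficult and controversial, and that unprecedented, complicated environment also gave the Games a distinctive impact.
This project aims to present the key events of the Tokyo Olympics intuitively, to convey the large volume of medal, athlete, and event data, and to surface insights such as how China and other countries performed and whether they improved or declined. It collects its data with web crawlers and uses pyecharts as the visualization tool. It implements basic charts (column charts, bar charts, pie charts, histograms, scatter plots, liquid-fill charts, treemaps, sunburst charts, Sankey diagrams, radar charts, maps, and geographic coordinate systems) as well as combined charts (bar-line charts, combined radar charts, combined pie charts, pie plus liquid-fill charts, timeline line charts, timeline maps, and timeline geographic coordinate systems).
Building on these visualizations, the project uses the Django framework to build an Olympics visualization website. Forms such as search boxes and drop-down lists provide interaction. Together with the drag and select interactions built into pyecharts charts, they let users control the data themselves and get back the visualization they expect.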
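1 Data Sources
1.1 Tokyo Olympics official website
The primary data source is the official Tokyo 2020 website. Its data mainly covers the medal table, National Olympic Committees, athletes, sports, news, highlights, and replays. Link: https://olympics.com/en/olympic-games/tokyo-2020
1.2 Migu Video Tokyo Olympics data API
The Migu Video data API provides detailed medal-table data and per-date medal data for the Tokyo Olympics, which supplement the data from the official website.
The crawling and preprocessing steps are omitted here. The processed data is available at:
Link: https://pan.baidu.com/s/1Rth8ejouYOhnZnNu4cv0wA Extraction code: yibo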
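2 Visualization Tool
pyecharts is a library for generating Echarts charts; it combines Python with Echarts. Echarts is a data visualization library open-sourced by Baidu that has won over many developers with its good interactivity and carefully designed charts. Python is an expressive language that is well suited to data processing. pyecharts has the following features:
(1) A concise API that is smooth to use and supports chained calls
(2) More than 30 common chart types
(3) Support for the mainstream notebook environments, Jupyter Notebook and JupyterLab
(4) Easy integration with mainstream web frameworks such as Flask and Django
(5) Highly flexible configuration options that make polished charts easy to build
(6) Detailed documentation and examples that help developers get started quickly
(7) More than 400 map files plus native Baidu Maps support for geographic data visualization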
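3 Visualization Process
The visualization of the Tokyo Olympics is split into three parts: the medal table, the athletes, and the National Olympic Committees. For reasons of length, this article covers the medal table only; the athlete and National Olympic Committee visualizations are covered in Tokyo Olympics Visualization (2) and (3).
3.1 Medal Table Visualization
3.1.1 Quantity Visualization
1. Stacked column/bar chart of gold, silver, and bronze counts for the TOP20 countries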
from pyecharts.charts import Bar
import pyecharts.options as opts
import pandas as pd
medals=pd.read_csv("./DataSet/Medals/all-sports_medals.csv")
top20_medals=medals.iloc[:20]
bar=(
Bar()
.add_xaxis([str(x) for x in top20_medals['国家奥委会']])
.add_yaxis('金牌数',[int(x) for x in top20_medals['金牌数']],color="#f58220",stack=1)
.add_yaxis('银牌数',[int(x) for x in top20_medals['银牌数']],color="#d3d7d4",stack=1)
.add_yaxis('铜牌数',[int(x) for x in top20_medals['铜牌数']],color="#ae6642",stack=1)
.set_global_opts(
title_opts=opts.TitleOpts(title='2020东京奥运会奖牌分布'),
xaxis_opts=opts.AxisOpts(
name='国家',axislabel_opts={'rotate':45},
),
yaxis_opts=opts.AxisOpts(
name='数量(个)',name_location='center',
name_gap=30,
),
)
.set_series_opts(
label_opts=opts.LabelOpts(is_show=False)
)
.render('./Visual/[堆叠柱状图]金银铜奖牌分布.html')
)
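Use reversal_axis() to turn the chart into a horizontal bar chart, and reverse the row order so that the countries with more medals appear at the top:
import pandas as pd
from pyecharts.charts import Bar
import pyecharts.options as opts
medals=pd.read_csv("./DataSet/Medals/all-sports_medals.csv")
top20_medals=medals.iloc[:20]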
bar=(
Bar()
.add_xaxis([str(x) for x in top20_medals.sort_index(ascending=False)['国家奥委会']])
.add_yaxis('金牌数',[int(x) for x in top20_medals.sort_index(ascending=False)['金牌数']],color="#f58220",stack=1)
.add_yaxis('银牌数',[int(x) for x in top20_medals.sort_index(ascending=False)['银牌数']],color="#d3d7d4",stack=1)
.add_yaxis('铜牌数',[int(x) for x in top20_medals.sort_index(ascending=False)['铜牌数']],color="#ae6642",stack=1)
.reversal_axis()
.set_global_opts(
title_opts=opts.TitleOpts(title='2020东京奥运会奖牌分布'),
xaxis_opts=opts.AxisOpts(
name='数量(个)',
name_gap=30,
axislabel_opts={'rotate':45},
),
)
.set_series_opts(
label_opts=opts.LabelOpts(is_show=False)
)
.render('./Visual/[堆叠条形图]金银铜奖牌分布.html')
)
2. Gold medals vs. total medals column chart for the TOP20 countries
import pandas as pd
from pyecharts.charts import Bar,Line
import pyecharts.options as opts
from pyecharts.globals import ThemeType
medals=pd.read_csv("./DataSet/Medals/all-sports_medals.csv")
top20_medals=medals.iloc[:20]
bar=(
Bar({"theme": ThemeType.MACARONS})
.add_xaxis([str(x) for x in top20_medals['国家奥委会']])
.add_yaxis('金牌数',[int(x) for x in top20_medals['金牌数']],stack=0,gap='0%')
.add_yaxis('奖牌数',[int(x) for x in top20_medals['总分']],stack=0,gap='0%')
.set_global_opts(
title_opts=opts.TitleOpts(title='金牌数 VS 奖牌数'),
xaxis_opts=opts.AxisOpts(
name='国家',
name_gap=30,
axislabel_opts={'rotate':45},
),
)
.set_series_opts(
label_opts=opts.LabelOpts(is_show=False)
)
.render('./Visual/[堆叠柱状图]金牌数VS奖牌数.html')
)
Use overlap to layer a line chart of the ranking by total medals over the bars, which makes the comparison clearer:
import pandas as pd
from pyecharts.charts import Bar,Line
import pyecharts.options as opts
from pyecharts.globals import ThemeType
medals=pd.read_csv("./DataSet/Medals/all-sports_medals.csv")
top20_medals=medals.iloc[:20]
bar=(
Bar({"theme": ThemeType.MACARONS})
.add_xaxis([str(x) for x in top20_medals['国家奥委会']])
.add_yaxis('金牌数',[int(x) for x in top20_medals['金牌数']],stack=0,gap='0%')
.add_yaxis('奖牌数',[int(x) for x in top20_medals['总分']],stack=0,gap='0%')
.extend_axis(
yaxis=opts.AxisOpts(
axislabel_opts=opts.LabelOpts(formatter="{value}"), interval=5,
)
)
.set_global_opts(
title_opts=opts.TitleOpts(title='金牌数 VS 奖牌数'),
xaxis_opts=opts.AxisOpts(
name_gap=30,
axislabel_opts={'rotate':45},
),
)
.set_series_opts(
label_opts=opts.LabelOpts(is_show=False)
)
)
line=(
Line()
.add_xaxis([str(x) for x in top20_medals['国家奥委会']])
.add_yaxis("", [30-int(x) for x in top20_medals['按总数排名']] , yaxis_index=1)
.set_series_opts(
label_opts=opts.LabelOpts(is_show=False)
)
)
bar.overlap(line)
bar.render('./Visual/[柱状折线图]金牌数VS奖牌数.html')
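The line series above plots `30 - rank` rather than the rank itself. On a value axis, larger numbers are drawn higher, so subtracting the rank from a constant makes better-ranked countries sit higher on the chart. A minimal sketch of that transformation in plain Python, using made-up ranks (the constant 30 comes from the code above; the helper name `invert_ranks` is hypothetical):

```python
def invert_ranks(ranks, ceiling=30):
    """Map rank 1 to the largest value so that better ranks plot higher."""
    return [ceiling - int(r) for r in ranks]

# Made-up "rank by total" values, for illustration only
sample_ranks = [1, 2, 3, 5, 4]
inverted = invert_ranks(sample_ranks)
```

One consequence of this choice is that the secondary axis shows the inverted values, not the ranks themselves, so the axis labels need to be read with that in mind.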
3. Sunburst chart of the sports in which each country leads the gold medal table
sports_dict={
'all-sports':'所有赛事',
'baseball-softball':'棒球/垒球',
'trampoline-gymnastics':'蹦床体操',
'cycling-track':'场地自行车',
'surfing':'冲浪',
'sailing':'帆船',
'golf':'高尔夫',
'cycling-road':'公路自行车',
'artistic-swimming':'花样游泳',
'skateboarding':'滑板',
'fencing':'击剑',
'canoe-slalom':'激流皮划艇',
'artistic-gymnastics':'竞技体操',
'cycling-bmx-racing':'竞速小轮车',
'canoe-sprint':'静水皮划艇',
'weightlifting':'举重',
'karate':'空手道',
'marathon-swimming':'马拉松游泳',
'equestrian':'马术',
'volleyball':'排球',
'table-tennis':'乒乓球',
'rugby-sevens':'七人制橄榄球',
'hockey':'曲棍球',
'boxing':'拳击',
'judo':'柔道',
'rowing':'赛艇',
'3x3-basketball':'三对三篮球',
'beach-volleyball':'沙滩排球',
'cycling-mountain-bike':'山地自行车',
'shooting':'射击',
'archery':'射箭',
'handball':'手球',
'wrestling':'摔跤',
'water-polo':'水球',
'taekwondo':'跆拳道',
'athletics':'田径',
'diving':'跳水',
'triathlon':'铁人三项',
'tennis':'网球',
'modern-pentathlon':'现代五项',
'rhythmic-gymnastics':'艺术体操',
'swimming':'游泳',
'badminton':'羽毛球',
'sport-climbing':'运动攀登',
'cycling-bmx-freestyle':'自由式小轮车',
'football':'足球',
}
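import pandas as pd
import pyecharts.options as opts
from pyecharts.charts import Sunburst
# Pair every sport with the NOC in the first row of that sport's medal table
values=[]
for sport in sports_dict.keys():
    sport_df=pd.read_csv("./DataSet/Medals/"+sport+"_medals.csv")
    values.append([sports_dict[sport],sport_df.iloc[0]['国家奥委会']])
# The step that builds `data` is not shown in the original. The grouping below is an
# assumed reconstruction: one top-level node per NOC, with its leading sports as children.
children={}
for sport_name,country in values:
    children.setdefault(country,[]).append({"name":sport_name,"value":1})
data=[{"name":country,"children":kids} for country,kids in children.items()]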
c = (
Sunburst(init_opts=opts.InitOpts(width="1000px", height="600px"))
.add(
"",
data_pair=data,
highlight_policy="ancestor",
radius=[0, "95%"],
sort_="null",
levels=[
{},
{
"r0": "15%",
"r": "35%",
"itemStyle": {"borderWidth": 2},
"label": {"rotate": "tangential"},
},
{"r0": "35%", "r": "70%", "label": {"align": "right"}},
{
"r0": "70%",
"r": "72%",
"label": {"position": "outside", "padding": 3, "silent": False},
"itemStyle": {"borderWidth": 3},
},
],
)
.set_global_opts(title_opts=opts.TitleOpts(title="国家金牌优势项目分布",pos_left='center'))
.set_series_opts(label_opts=opts.LabelOpts(formatter="{b}"))
.render("./Visual/[旭日图]国家金牌优势项目分布.html")
)
4. Sankey diagram of medals flowing from sports to countries (USA | China | Japan)
import pandas as pd
from pyecharts import options as opts
from pyecharts.charts import Sankey
countries=("美国","中国","日本")
# Medal node name -> column name in the per-sport medal tables
medal_cols={"金牌":"金牌数","银牌":"银牌数","铜牌":"铜牌数"}
# Nodes: every sport, then the three medal types, then the three countries
nodes=[{"name":name} for name in (*sports_dict.values(),*medal_cols,*countries)]
links=[]
# Running totals per (medal, country), used for the medal -> country links
totals={(medal,country):0 for medal in medal_cols for country in countries}
# Note: sports_dict also contains 'all-sports' (the overall table); drop that key
# first if the totals should not include it on top of the individual sports.
for sport,sport_name in sports_dict.items():
    sport_df=pd.read_csv("./DataSet/Medals/"+sport+"_medals.csv")
    for country in countries:
        rows=sport_df[sport_df['国家奥委会']==country]
        if rows.empty:
            continue
        for medal,col in medal_cols.items():
            # Cast to a plain int so the value can be serialized to JSON
            count=int(rows[col].values[0])
            totals[(medal,country)]+=count
            if count>0:
                links.append({"source":sport_name,"target":medal,"value":count})
# Medal -> country links
for country in countries:
    for medal in medal_cols:
        links.append({"source":medal,"target":country,"value":totals[(medal,country)]})
c = (
Sankey(init_opts=opts.InitOpts())
.add(
"",
nodes,
links,
linestyle_opt=opts.LineStyleOpts(opacity=0.2, curve=0.5, color="source"),
label_opts=opts.LabelOpts(position="left"),
)
.set_global_opts(title_opts=opts.TitleOpts(title="项目奖牌汇聚国家(美国|中国|日本)",pos_left='center'))
.render("./Visual/[桑基图]项目奖牌汇聚国家(美国中国日本).html")
)
5. Radar chart of ball-sport strengths: USA | China | Japan
Seven ball sports from the Olympic program are selected:
import pandas as pd
import pyecharts.options as opts
from pyecharts.charts import Radar
radar=(
Radar(init_opts=opts.InitOpts())
.add_schema(
schema=[
opts.RadarIndicatorItem(name="棒球/垒球",max_=5),
opts.RadarIndicatorItem(name="3x3篮球",max_=5),
opts.RadarIndicatorItem(name="排球",max_=5),
opts.RadarIndicatorItem(name="乒乓球",max_=5),
opts.RadarIndicatorItem(name="网球",max_=5),
opts.RadarIndicatorItem(name="羽毛球",max_=5),
opts.RadarIndicatorItem(name="足球",max_=5),
],
center=["50%", "60%"],
splitarea_opt=opts.SplitAreaOpts(
is_show=True, areastyle_opts=opts.AreaStyleOpts(opacity=0.5)
),
textstyle_opts=opts.TextStyleOpts(color="#000"),
)
.add(
series_name="美国",
data=[[3,4,4,1,1,1,2]],
linestyle_opts=opts.LineStyleOpts(color="#5CACEE"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#5CACEE'),
)
.add(
series_name="中国",
data=[[1,2,1,4,1,4,1]],
linestyle_opts=opts.LineStyleOpts(color="#CD0000"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#CD0000'),
)
.add(
series_name="日本",
data=[[4,1,1,3,1,2,1]],
linestyle_opts=opts.LineStyleOpts(color="#faa755"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#faa755'),
)
.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
.set_global_opts(
title_opts=opts.TitleOpts(title="美国|中国|日本球类运动雷达图",pos_left='center'), legend_opts=opts.LegendOpts(pos_left='80%',orient='vertical')
)
.render("./Visual/[雷达图]美国-中国-日本球类运动雷达.html")
)
6. Radar charts of ball sports by gender: USA | China | Japan
Create two more radar charts with Radar (men and women):
radar_m=(
Radar(init_opts=opts.InitOpts())
.add_schema(
schema=[
opts.RadarIndicatorItem(name="棒球/垒球",max_=5),
opts.RadarIndicatorItem(name="3x3篮球",max_=5),
opts.RadarIndicatorItem(name="排球",max_=5),
opts.RadarIndicatorItem(name="乒乓球",max_=5),
opts.RadarIndicatorItem(name="网球",max_=5),
opts.RadarIndicatorItem(name="羽毛球",max_=5),
opts.RadarIndicatorItem(name="足球",max_=5),
],
center=["50%", "60%"],
splitarea_opt=opts.SplitAreaOpts(
is_show=True, areastyle_opts=opts.AreaStyleOpts(opacity=0.5)
),
textstyle_opts=opts.TextStyleOpts(color="#000"),
)
.add(
series_name="美国男子",
data=[[3,1,1,1,1,1,1]],
linestyle_opts=opts.LineStyleOpts(color="#5CACEE"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#5CACEE'),
)
.add(
series_name="中国男子",
data=[[1,1,1,4,1,4,1]],
linestyle_opts=opts.LineStyleOpts(color="#CD0000"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#CD0000'),
)
.add(
series_name="日本男子",
data=[[4,1,1,3,1,1,1]],
linestyle_opts=opts.LineStyleOpts(color="#faa755"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#faa755'),
)
.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
.set_global_opts(
legend_opts=opts.LegendOpts(pos_top='8%')
)
)
radar_w=(
Radar(init_opts=opts.InitOpts())
.add_schema(
schema=[
opts.RadarIndicatorItem(name="棒球/垒球",max_=5),
opts.RadarIndicatorItem(name="3x3篮球",max_=5),
opts.RadarIndicatorItem(name="排球",max_=5),
opts.RadarIndicatorItem(name="乒乓球",max_=5),
opts.RadarIndicatorItem(name="网球",max_=5),
opts.RadarIndicatorItem(name="羽毛球",max_=5),
opts.RadarIndicatorItem(name="足球",max_=5),
],
center=["50%", "60%"],
splitarea_opt=opts.SplitAreaOpts(
is_show=True, areastyle_opts=opts.AreaStyleOpts(opacity=0.5)
),
textstyle_opts=opts.TextStyleOpts(color="#000"),
)
.add(
series_name="美国女子",
data=[[3,4,4,1,1,1,2]],
linestyle_opts=opts.LineStyleOpts(color="#5CACEE"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#5CACEE'),
)
.add(
series_name="中国女子",
data=[[1,2,1,4,1,4,1]],
linestyle_opts=opts.LineStyleOpts(color="#CD0000"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#CD0000'),
)
.add(
series_name="日本女子",
data=[[4,1,1,3,1,1,1]],
linestyle_opts=opts.LineStyleOpts(color="#faa755"),
areastyle_opts=opts.AreaStyleOpts(opacity=0.2,color='#faa755'),
)
.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
.set_global_opts(
legend_opts=opts.LegendOpts(pos_top='8%'),
)
)
Combine the three radar charts above with Page (using the draggable layout mode):
from pyecharts.charts import Page
page=(
Page(layout=Page.DraggablePageLayout)
.add(radar,radar_m,radar_w)
.render("./Visual/[雷达多图]美国-中国-日本球类运动雷达性别组合图.html")
)
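Drag the charts into the layout you like, then click [Save Config] in the top-left corner to get a JSON file.
Use the save_resize_html method of Page to generate the page with the adjusted layout: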
Page.save_resize_html("./Visual/[雷达多图]美国-中国-日本球类运动雷达性别组合图.html", cfg_file="./Visual/chart_config.json", dest="[布局雷达多图]美国-中国-日本球类运动雷达性别组合图.html")
3.1.2 Geographic Visualization
1. Map of medals by country at the Tokyo Olympics
from pyecharts import options as opts
from pyecharts.charts import Map
namemap_df=pd.read_csv("./DataSet/Medals/namemap_medals.csv")
data_list=namemap_df.dropna()[['英文名称','奖牌总数']].values.tolist()
map = (
Map()
.add("", data_list, "world",
is_map_symbol_show=False,
)
.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
.set_global_opts(
title_opts=opts.TitleOpts(title="2020东京奥运会各国奖牌分布图"),
visualmap_opts=opts.VisualMapOpts(max_=120)
)
.render("./Visual/[地图]各国奖牌分布图.html")
)
Switch to a single-hue color range so that differences in medal counts stand out more:
map = (
Map()
.add("", data_list, "world",is_map_symbol_show=False)
.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
.set_global_opts(
title_opts=opts.TitleOpts(title="2020东京奥运会各国奖牌分布图"),
visualmap_opts=opts.VisualMapOpts(max_=120,range_color=['#90d7ec','#2b4490'])
)
.render("./Visual/[地图]各国奖牌分布图.html")
)
The gold, silver, and bronze distribution maps are produced the same way.
from pyecharts import options as opts
from pyecharts.charts import Map
namemap_df=pd.read_csv("./DataSet/Medals/namemap_medals.csv")
data_list=namemap_df.dropna()[['英文名称','金牌']].values.tolist()
map = (
Map()
.add("", data_list, "world",
is_map_symbol_show=False,
)
.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
.set_global_opts(
title_opts=opts.TitleOpts(title="2020东京奥运会各国金牌分布图"),
visualmap_opts=opts.VisualMapOpts(max_=50,range_color=['#fedcbd','#f47920'])
)
.render("./Visual/[地图]各国金牌分布图.html")
)
from pyecharts import options as opts
from pyecharts.charts import Map
namemap_df=pd.read_csv("./DataSet/Medals/namemap_medals.csv")
data_list=namemap_df.dropna()[['英文名称','银牌']].values.tolist()
map = (
Map()
.add("", data_list, "world",
is_map_symbol_show=False,
)
.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
.set_global_opts(
title_opts=opts.TitleOpts(title="2020东京奥运会各国银牌分布图"),
visualmap_opts=opts.VisualMapOpts(max_=50,range_color=['#f6f5ec','#464547'])
)
.render("./Visual/[地图]各国银牌分布图.html")
)
from pyecharts import options as opts
from pyecharts.charts import Map
namemap_df=pd.read_csv("./DataSet/Medals/namemap_medals.csv")
data_list=namemap_df.dropna()[['英文名称','铜牌']].values.tolist()
map = (
Map()
.add("", data_list, "world",
is_map_symbol_show=False,
)
.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
.set_global_opts(
title_opts=opts.TitleOpts(title="2020东京奥运会各国铜牌分布图"),
visualmap_opts=opts.VisualMapOpts(max_=50,range_color=['#ffce7b','#b36d41'])
)
.render("./Visual/[地图]各国铜牌分布图.html")
)
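3.1.3 Trend Visualization
1. China's daily medal count trend
Read the medal data with pandas, filter China's rows, group by date, and count the total medals.
The data is then converted into lists and a DataFrame (the code below uses it as date_medals_df, with cols holding its column names).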
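The preprocessing code is not shown in the original. As a rough sketch of what the grouping step could look like, the snippet below builds a wide table with one row per country and one column per date from a few made-up medal records. The input layout (one row per medal with `国家` and `日期` columns) and the sample values are assumptions for illustration; only the names `date_medals_df`, `cols`, and the `国家` column are taken from the code that follows.

```python
import pandas as pd

# Hypothetical input: one row per medal awarded (layout and values are assumptions)
records = pd.DataFrame({
    "国家": ["中国", "中国", "美国", "日本", "中国", "美国"],
    "日期": ["7月24日", "7月24日", "7月24日", "7月25日", "7月25日", "7月25日"],
})

# Count medals per (country, date), then pivot the dates into columns
date_medals_df = (
    records.groupby(["国家", "日期"]).size()
    .unstack(fill_value=0)   # dates become columns; missing combinations become 0
    .reset_index()           # turn the country index back into a regular column
)
date_medals_df.columns.name = None

# cols[0] is the country column, cols[1:] are the date columns
cols = date_medals_df.columns.tolist()
```

With a table shaped like this, `date_medals_df[d_time][date_medals_df['国家']=='中国']` selects China's count for one date, which is the access pattern used below.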
from pyecharts import options as opts
from pyecharts.charts import Line
from pyecharts.globals import ThemeType
CHN = []
x_data=cols[1:]
for d_time in cols[1:]:
    CHN.append(date_medals_df[d_time][date_medals_df['国家']=='中国'].values.tolist()[0])
l1 = (
Line()
.add_xaxis(x_data)
.add_yaxis(
'中国',
CHN,
label_opts=opts.LabelOpts(is_show=True))
.set_global_opts(
title_opts=opts.TitleOpts(
title='中国每日奖牌数量趋势',
pos_left='center',
),
xaxis_opts=opts.AxisOpts(
axislabel_opts={'rotate':30},
),
yaxis_opts=opts.AxisOpts(
name='奖牌/枚',
is_scale=True,
max_=15),
legend_opts=opts.LegendOpts(is_show=False),
)
.render("./Visual/[折线图]中国每日奖牌数量趋势.html")
)
Add a timeline with the Timeline chart and polish the styling:
from pyecharts import options as opts
from pyecharts.charts import Line,Timeline
from pyecharts.globals import ThemeType,JsCode
background_color_js = (
"new echarts.graphic.LinearGradient(0, 0, 0, 1, "
"[{offset: 0, color: '#c86589'}, {offset: 1, color: '#06a7ff'}], false)"
)
linestyle_dic = {'normal': {
'width': 4,
'shadowColor': '#696969',
'shadowBlur': 10,
'shadowOffsetY': 10,
'shadowOffsetX': 10,
}}
timeline = Timeline(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js),
width='980px', height='600px'))
timeline.add_schema(is_auto_play=True, is_loop_play=True,
is_timeline_show=True, play_interval=500)
CHN = []
x_data = cols[1:]
for d_time in cols[1:]:
    # Extend the series by one day, then draw one frame with the data up to that day
    CHN.append(date_medals_df[d_time][date_medals_df['国家']=='中国'].values.tolist()[0])
    line = (
        Line(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js),
                                     width='980px', height='600px'))
        .add_xaxis(x_data)
        .add_yaxis(
            '',
            CHN,
            symbol_size=10,
            is_smooth=True,
            label_opts=opts.LabelOpts(is_show=True),
            markpoint_opts=opts.MarkPointOpts(
                data=[opts.MarkPointItem(
                    name="",
                    type_='max',
                    value_index=0,
                    symbol='image://./DataSet/Image/中国.png',
                    symbol_size=[40, 25],
                )],
                label_opts=opts.LabelOpts(is_show=False),
            )
        )
        .set_series_opts(linestyle_opts=linestyle_dic, label_opts=opts.LabelOpts(font_size=12, color='red'))
        .set_global_opts(
            title_opts=opts.TitleOpts(
                title='中国奖牌',
                pos_left='center',
                pos_top='2%',
                title_textstyle_opts=opts.TextStyleOpts(color='#DC143C', font_size=20)),
            xaxis_opts=opts.AxisOpts(
                axislabel_opts=opts.LabelOpts(font_size=14, color='red'),
                axisline_opts=opts.AxisLineOpts(
                    is_show=True,
                    linestyle_opts=opts.LineStyleOpts(width=2, color='#DB7093'))),
            yaxis_opts=opts.AxisOpts(
                name='奖牌/枚',
                is_scale=True,
                max_=15,
                name_textstyle_opts=opts.TextStyleOpts(
                    font_size=16, font_weight='bold', color='#DC143C'),
                axislabel_opts=opts.LabelOpts(font_size=13, color='red'),
                splitline_opts=opts.SplitLineOpts(
                    is_show=True,
                    linestyle_opts=opts.LineStyleOpts(type_='dashed')),
                axisline_opts=opts.AxisLineOpts(
                    is_show=True,
                    linestyle_opts=opts.LineStyleOpts(width=2, color='#DB7093'))
            ),
            legend_opts=opts.LegendOpts(is_show=True, pos_right='1%', pos_top='2%',
                                        legend_icon='roundRect', orient='vertical'),
        )
    )
    # One timeline frame per date
    timeline.add(line, '{}'.format(d_time))
timeline.render("./Visual/[时间线折线图]中国每日奖牌数量趋势.html")
2. Daily medal count trend for the TOP3 countries
The daily medal counts of the TOP3 countries are obtained in a similar way:
background_color_js = (
"new echarts.graphic.LinearGradient(0, 0, 0, 1, "
"[{offset: 0, color: '#c86589'}, {offset: 1, color: '#06a7ff'}], false)"
)
linestyle_dic = { 'normal': {
'width': 4,
'shadowColor': '#696969',
'shadowBlur': 10,
'shadowOffsetY': 10,
'shadowOffsetX': 10,
}
}
timeline = Timeline(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js),
width='980px',height='600px'))
timeline.add_schema(is_auto_play=True, is_loop_play=True,
is_timeline_show=True, play_interval=500)
CHN, USA, JPN = [], [], []
x_data=cols[1:]
for d_time in cols[1:]:
    # Extend each country's series by one day, then draw one frame per date
    CHN.append(date_medals_df[d_time][date_medals_df['国家']=='中国'].values.tolist()[0])
    USA.append(date_medals_df[d_time][date_medals_df['国家']=='美国'].values.tolist()[0])
    JPN.append(date_medals_df[d_time][date_medals_df['国家']=='日本'].values.tolist()[0])
    line = Line(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js),
                                        width='980px', height='600px'))
    line.add_xaxis(x_data)
    # The three series differ only in name, data, and symbol size
    for name, series, size in (('中国', CHN, 10), ('美国', USA, 5), ('日本', JPN, 5)):
        line.add_yaxis(
            name,
            series,
            symbol_size=size,
            is_smooth=True,
            label_opts=opts.LabelOpts(is_show=True),
            markpoint_opts=opts.MarkPointOpts(
                data=[opts.MarkPointItem(
                    name="",
                    type_='max',
                    value_index=0,
                    symbol_size=[40, 25],
                )],
                label_opts=opts.LabelOpts(is_show=False),
            )
        )
    line.set_series_opts(linestyle_opts=linestyle_dic)
    line.set_global_opts(
        title_opts=opts.TitleOpts(
            title='中国 VS 美国 VS 日本',
            pos_left='center',
            pos_top='2%',
            title_textstyle_opts=opts.TextStyleOpts(color='#DC143C', font_size=20)),
        xaxis_opts=opts.AxisOpts(
            axislabel_opts=opts.LabelOpts(font_size=14, color='red'),
            axisline_opts=opts.AxisLineOpts(
                is_show=True,
                linestyle_opts=opts.LineStyleOpts(width=2, color='#DB7093'))),
        yaxis_opts=opts.AxisOpts(
            name='奖牌/枚',
            is_scale=True,
            max_=15,
            name_textstyle_opts=opts.TextStyleOpts(font_size=16, font_weight='bold', color='#DC143C'),
            axislabel_opts=opts.LabelOpts(font_size=13),
            splitline_opts=opts.SplitLineOpts(
                is_show=True,
                linestyle_opts=opts.LineStyleOpts(type_='dashed')),
            axisline_opts=opts.AxisLineOpts(
                is_show=True,
                linestyle_opts=opts.LineStyleOpts(width=2, color='#DB7093'))
        ),
        legend_opts=opts.LegendOpts(is_show=True, pos_right='1%', pos_top='2%',
                                    legend_icon='roundRect', orient='vertical'),
    )
    timeline.add(line, '{}'.format(d_time))
timeline.render("./Visual/[时间线折线图]TOP3国家每日奖牌数量趋势.html")
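3. China's cumulative medal count trend
Accumulating China's daily medal counts by date gives the cumulative medal count for each date (date_add_medals_df in the code below).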
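The accumulation step itself is a running sum over the daily counts. A minimal sketch in plain Python with made-up daily counts (the numbers and the names `daily` and `cumulative` are illustrative and not taken from the dataset); with a wide pandas table, the same step would typically be a cumulative sum across the date columns:

```python
from itertools import accumulate

# Made-up daily medal counts for one country, in date order
daily = [3, 2, 0, 4, 1]

# Running total: each entry is the sum of all daily counts up to and including that day
cumulative = list(accumulate(daily))
```

The last cumulative value equals the overall total, which is why the y-axis maximum in the cumulative charts is set much higher than in the daily ones.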
from pyecharts import options as opts
from pyecharts.charts import Line,Timeline
from pyecharts.globals import ThemeType,JsCode
background_color_js = (
"new echarts.graphic.LinearGradient(0, 0, 0, 1, "
"[{offset: 0, color: '#c86589'}, {offset: 1, color: '#06a7ff'}], false)"
)
linestyle_dic = {'normal': {
'width': 4,
'shadowColor': '#696969',
'shadowBlur': 10,
'shadowOffsetY': 10,
'shadowOffsetX': 10,
}
}
timeline = Timeline(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js),
width='980px', height='600px'))
timeline.add_schema(is_auto_play=True, is_loop_play=True,
is_timeline_show=True, play_interval=500)
CHN = []
x_data = cols[1:]
for d_time in cols[1:]:
    # Extend the cumulative series by one day, then draw one frame per date
    CHN.append(date_add_medals_df[d_time][date_add_medals_df['国家']=='中国'].values.tolist()[0])
    line = (
        Line(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js),
                                     width='980px', height='600px'))
        .add_xaxis(x_data)
        .add_yaxis(
            '',
            CHN,
            symbol_size=10,
            is_smooth=True,
            label_opts=opts.LabelOpts(is_show=True),
            markpoint_opts=opts.MarkPointOpts(
                data=[opts.MarkPointItem(
                    name="",
                    type_='max',
                    value_index=0,
                    symbol='image://./DataSet/Image/中国.png',
                    symbol_size=[40, 25],
                )],
                label_opts=opts.LabelOpts(is_show=False),
            )
        )
        .set_series_opts(linestyle_opts=linestyle_dic, label_opts=opts.LabelOpts(font_size=12, color='red'))
        .set_global_opts(
            title_opts=opts.TitleOpts(
                title='中国奖牌',
                pos_left='center',
                pos_top='2%',
                title_textstyle_opts=opts.TextStyleOpts(color='#DC143C', font_size=20)),
            xaxis_opts=opts.AxisOpts(
                axislabel_opts=opts.LabelOpts(font_size=14, color='red'),
                axisline_opts=opts.AxisLineOpts(
                    is_show=True,
                    linestyle_opts=opts.LineStyleOpts(width=2, color='#DB7093'))),
            yaxis_opts=opts.AxisOpts(
                name='奖牌/枚',
                is_scale=True,
                max_=120,
                name_textstyle_opts=opts.TextStyleOpts(
                    font_size=16, font_weight='bold', color='#DC143C'),
                axislabel_opts=opts.LabelOpts(font_size=13, color='red'),
                splitline_opts=opts.SplitLineOpts(
                    is_show=True,
                    linestyle_opts=opts.LineStyleOpts(type_='dashed')),
                axisline_opts=opts.AxisLineOpts(
                    is_show=True,
                    linestyle_opts=opts.LineStyleOpts(width=2, color='#DB7093'))
            ),
            legend_opts=opts.LegendOpts(is_show=True, pos_right='1%', pos_top='2%',
                                        legend_icon='roundRect', orient='vertical'),
        )
    )
    # One timeline frame per date
    timeline.add(line, '{}'.format(d_time))
timeline.render("./Visual/[时间线折线图]中国累计奖牌数量趋势.html")
4. Cumulative medal count trend for the TOP3 countries
from pyecharts import options as opts
from pyecharts.charts import Line,Timeline
from pyecharts.globals import ThemeType,JsCode
background_color_js = (
"new echarts.graphic.LinearGradient(0, 0, 0, 1, "
"[{offset: 0, color: '#c86589'}, {offset: 1, color: '#06a7ff'}], false)"
)
linestyle_dic = { 'normal': {
'width': 4,
'shadowColor': '#696969',
'shadowBlur': 10,
'shadowOffsetY': 10,
'shadowOffsetX': 10,
}
}
timeline = Timeline(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js),
width='980px',height='600px'))
timeline.add_schema(is_auto_play=True, is_loop_play=True,
is_timeline_show=True, play_interval=500)
CHN, USA, JPN = [], [], []
x_data=cols[1:]
for d_time in cols[1:]:
    # Extend each country's cumulative series by one day, then draw one frame per date
    CHN.append(date_add_medals_df[d_time][date_add_medals_df['国家']=='中国'].values.tolist()[0])
    USA.append(date_add_medals_df[d_time][date_add_medals_df['国家']=='美国'].values.tolist()[0])
    JPN.append(date_add_medals_df[d_time][date_add_medals_df['国家']=='日本'].values.tolist()[0])
    line = Line(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js),
                                        width='980px', height='600px'))
    line.add_xaxis(x_data)
    # The three series differ only in name, data, and symbol size
    for name, series, size in (('中国', CHN, 10), ('美国', USA, 5), ('日本', JPN, 5)):
        line.add_yaxis(
            name,
            series,
            symbol_size=size,
            is_smooth=True,
            label_opts=opts.LabelOpts(is_show=True),
            markpoint_opts=opts.MarkPointOpts(
                data=[opts.MarkPointItem(
                    name="",
                    type_='max',
                    value_index=0,
                    symbol_size=[40, 25],
                )],
                label_opts=opts.LabelOpts(is_show=False),
            )
        )
    line.set_series_opts(linestyle_opts=linestyle_dic)
    line.set_global_opts(
        title_opts=opts.TitleOpts(
            title='中国 VS 美国 VS 日本',
            pos_left='center',
            pos_top='2%',
            title_textstyle_opts=opts.TextStyleOpts(color='#DC143C', font_size=20)),
        xaxis_opts=opts.AxisOpts(
            axislabel_opts=opts.LabelOpts(font_size=14, color='red'),
            axisline_opts=opts.AxisLineOpts(
                is_show=True,
                linestyle_opts=opts.LineStyleOpts(width=2, color='#DB7093'))),
        yaxis_opts=opts.AxisOpts(
            name='奖牌/枚',
            is_scale=True,
            max_=120,
            name_textstyle_opts=opts.TextStyleOpts(font_size=16, font_weight='bold', color='#DC143C'),
            axislabel_opts=opts.LabelOpts(font_size=13),
            splitline_opts=opts.SplitLineOpts(
                is_show=True,
                linestyle_opts=opts.LineStyleOpts(type_='dashed')),
            axisline_opts=opts.AxisLineOpts(
                is_show=True,
                linestyle_opts=opts.LineStyleOpts(width=2, color='#DB7093'))
        ),
        legend_opts=opts.LegendOpts(is_show=True, pos_right='1%', pos_top='2%',
                                    legend_icon='roundRect', orient='vertical'),
    )
    timeline.add(line, '{}'.format(d_time))
timeline.render("./Visual/[时间线折线图]TOP3国家累计奖牌数量趋势.html")
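3.1.4 Proportion Visualization
1. Pie chart of China's medals by sport
Read the award-detail data and the award-result data with pandas and join the two tables on ID.
Map the numeric medal type to a medal name (1, 2, and 3 stand for gold, silver, and bronze).
Filter China's rows, group by sport name, count the medals, and convert the result into the list format the chart needs.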
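The three preparation steps are described but not shown. The sketch below illustrates them on two tiny made-up tables; the column names (`ID`, `项目`, `国家`, `奖牌类型`) and all values are assumptions for illustration, not the real dataset schema.

```python
import pandas as pd

# Hypothetical award-detail and award-result tables (schema and values are assumptions)
details = pd.DataFrame({"ID": [1, 2, 3, 4], "项目": ["跳水", "跳水", "射击", "游泳"]})
results = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "国家": ["中国", "中国", "中国", "美国"],
    "奖牌类型": [1, 2, 1, 3],
})

# Step 1: join the two tables on ID
merged = details.merge(results, on="ID")

# Step 2: map the numeric medal type to a medal name
merged["奖牌"] = merged["奖牌类型"].map({1: "金牌", 2: "银牌", 3: "铜牌"})

# Step 3: keep China's rows, count medals per sport, and build [name, count] pairs
china = merged[merged["国家"] == "中国"]
counts = china.groupby("项目").size().sort_values(ascending=False)
pie_data = [[sport, int(n)] for sport, n in counts.items()]
```

The resulting `pie_data` has the same `[name, value]` shape as the literal list passed to `Pie().add(...)` in the code that follows; casting each count to a plain `int` keeps the values JSON-friendly.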
from pyecharts import options as opts
from pyecharts.charts import Pie
from pyecharts.globals import ThemeType
c = (
Pie()
.add("",[['跳水', 12], ['射击', 11], ['举重', 8], ['竞技体操', 8], ['乒乓球', 7], ['游泳', 6], ['羽毛球', 6], ['田径', 5], ['静水皮划艇', 3], ['蹦床体操', 3], ['自由式摔跤', 3], ['赛艇', 3], ['空手道', 2], ['拳击', 2], ['帆船', 2], ['花样游泳', 2], ['跆拳道', 1], ['场地自行车赛', 1], ['古典式摔跤', 1], ['击剑', 1], ['三人篮球', 1]],
center=["50%", "55%"])
.set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
.render("./Visual/[饼图]中国各项目获奖分布.html")
)
Use ThemeType to switch the chart theme to LIGHT:
from pyecharts import options as opts
from pyecharts.charts import Pie
from pyecharts.globals import ThemeType
c = (
Pie(init_opts=opts.InitOpts(theme=ThemeType.LIGHT))
.add("",[['跳水', 12], ['射击', 11], ['举重', 8], ['竞技体操', 8], ['乒乓球', 7], ['游泳', 6], ['羽毛球', 6], ['田径', 5], ['静水皮划艇', 3], ['蹦床体操', 3], ['自由式摔跤', 3], ['赛艇', 3], ['空手道', 2], ['拳击', 2], ['帆船', 2], ['花样游泳', 2], ['跆拳道', 1], ['场地自行车赛', 1], ['古典式摔跤', 1], ['击剑', 1], ['三人篮球', 1]],
center=["50%", "55%"])
.set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
.render("./Visual/[饼图]中国各项目获奖分布.html")
)
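That wraps up the medal table visualization. The code above is meant as a reference and will hopefully be of some help; the colors, sizes, positions, and so on can be adjusted to suit your own needs. The following articles will cover the athlete and National Olympic Committee visualizations. Bye~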