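First, find the page that holds the medal information: https://tiyu.baidu.com/tokyoly/home/tab/奖牌榜/from/pc
Requesting it returns [200], meaning the server responded successfully, so the page can be scraped directly.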
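As a small illustration of what that 200 means, here is a minimal sketch using only the standard library. The helper name `is_success` is hypothetical and not part of the original script. With `requests`, the same integer is available as `r.status_code`, and `r.raise_for_status()` raises an exception for 4xx/5xx responses.

```python
from http import HTTPStatus


def is_success(status_code: int) -> bool:
    """Return True for 2xx codes, the range that signals a successful response."""
    return 200 <= status_code < 300


# Print the standard reason phrase for a few common codes
for code in (200, 403, 404):
    print(code, HTTPStatus(code).phrase, is_success(code))
```

Checking the status before parsing avoids feeding an error page into BeautifulSoup.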
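Once the scraping code is written, import pandas to build the Excel sheet. The file is saved automatically to the current working directory, which is the script's own folder when the script is run from there. The full code is as follows: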
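import requests
from bs4 import BeautifulSoup
import pandas as pd

data = []
countries = []
gold_medals = []

# Fetch the medal table page and parse the returned HTML
r = requests.get("https://tiyu.baidu.com/tokyoly/home/tab/奖牌榜/from/pc")
soup = BeautifulSoup(r.content, 'html.parser')
rank_list = soup.find('div', class_='rank-list')

# Country names: one 'rcountry' div per row, with the name in a span
Every_country_name = rank_list.find_all('div', class_='rcountry')
for i in Every_country_name:
    country = i.find('span', class_='name')
    countries.append(country.get_text(strip=True))

# Gold medal counts: one 'integral' div per row, with the count in 'item-gold'
Every_gold_medal = rank_list.find_all('div', class_='integral')
for j in Every_gold_medal:
    gold_medal = j.find('div', class_='item-gold')
    gold_medals.append(gold_medal.get_text(strip=True))

# Pair each country with its gold medal count
# Keys: '国家' = country, '金牌数' = number of gold medals
for country, gold in zip(countries, gold_medals):
    data.append({'国家': country, '金牌数': gold})
print(data)

# Write the table to an Excel file in the current working directory
# (pandas needs an Excel writer such as openpyxl installed for .xlsx files)
table = pd.DataFrame(data)
table.to_excel("奥运会金牌数.xlsx")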
Opening the Excel file, it looks like this:
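One detail worth noting: the scraped medal counts are strings, so they are written to the sheet as text rather than numbers. Below is a minimal sketch, under the assumption that every count is a plain integer string, of converting them before building the DataFrame. The helper names `to_int_counts` and `sort_by_gold` are hypothetical, and the sample rows are placeholders, not real medal data.

```python
# Hypothetical sample in the same shape as the scraped `data` list;
# the values are placeholders, not real medal counts.
rows = [
    {'国家': 'Country A', '金牌数': '7'},
    {'国家': 'Country B', '金牌数': '12'},
    {'国家': 'Country C', '金牌数': ' 9\n'},
]


def to_int_counts(rows, key='金牌数'):
    """Return new row dicts with the value under `key` converted to int."""
    return [{**row, key: int(row[key].strip())} for row in rows]


def sort_by_gold(rows, key='金牌数'):
    """Return the rows sorted by gold medal count, highest first."""
    return sorted(rows, key=lambda row: row[key], reverse=True)


numeric_rows = sort_by_gold(to_int_counts(rows))
for row in numeric_rows:
    print(row['国家'], row['金牌数'])
```

With integers, sorting compares 12 > 9 > 7 numerically; as strings, '9' would sort ahead of '12'.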
One last thing: China YYDS!