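Hi everyone, I'm 天空之城. Today's little bonus, number 3: using regular expressions in Python to scrape Baidu's global epidemic data. It takes only a few lines of code.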
import re
import requests

# Request headers that make the request look like a normal browser visit
headers = {
    'Referer': 'http://www.voice.baidu.com/',
    'Origin': 'http://www.voice.baidu.com/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'
}
url = 'https://voice.baidu.com/act/newpneumonia/newpneumonia/?from=osari_aladin_banner&city=%E7%BE%8E%E5%9B%BD-%E7%BE%8E%E5%9B%BD'
res = requests.get(url=url, headers=headers).text
# Capture everything between each "city" key and the next "cityCode" key
result = re.findall('"city":"(.*?)","cityCode"', res)
for i in result:
    # The page stores Chinese text as \uXXXX escapes; decode them into readable characters
    am = bytes(i, 'utf-8')
    print(am.decode('unicode-escape'))
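To see what the regex and the `unicode-escape` step do without hitting the network, here is a minimal sketch that runs the same two steps on a made-up string. The `sample` text and the helper name `extract_city_fragments` are my own assumptions for illustration; the real page source is much larger and may be structured differently.

```python
import re

# Hypothetical sample imitating a small piece of the page source (not real data)
sample = r'{"city":"\u7f8e\u56fd","confirmed":"100","cityCode":"0"}'


def extract_city_fragments(page_text):
    """Return the decoded text between each "city" key and the next "cityCode" key."""
    fragments = re.findall('"city":"(.*?)","cityCode"', page_text)
    # Each fragment still contains literal \uXXXX sequences; decoding the bytes
    # with 'unicode-escape' turns them into the actual characters.
    return [bytes(f, 'utf-8').decode('unicode-escape') for f in fragments]


print(extract_city_fragments(sample))
```

Note that the captured group is a raw slice of the page text, so it carries the quotes and field names that sit between the two keys, not just the city name.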
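Screenshot of the scraped data:
Processing the output further, by replacing the English field names with Chinese labels, gives: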
import re
import requests

headers = {
    'Referer': 'http://www.voice.baidu.com/',
    'Origin': 'http://www.voice.baidu.com/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36',
}
url = 'https://voice.baidu.com/act/newpneumonia/newpneumonia/?from=osari_aladin_banner&city=%E7%BE%8E%E5%9B%BD-%E7%BE%8E%E5%9B%BD'
res = requests.get(url=url, headers=headers).text
result = re.findall('"city":"(.*?)","cityCode"', res)
for i in result:
    am = bytes(i, 'utf-8')
    amn = am.decode('unicode-escape')
    # Swap the English field names for Chinese labels. Longer names come before
    # their prefixes (confirmedRelative before confirmed) so they are not cut in half.
    # "crued" is spelled as in the original code, presumably matching the page data.
    ams = (amn.replace("crued", "治愈")
              .replace("confirmedRelative", "确诊相关")
              .replace("died", "死亡")
              .replace("confirmed", "确诊")
              .replace("asymptomaticRelative", "无症状相关")
              .replace("nativeRelative", "本土相关")
              .replace("curConfirm", "现有确诊")  # presumably "currently confirmed"
              .replace("asymptomatic", "无症状"))
    print(ams)
Screenshot of the processed data:
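A long chain of `replace()` calls depends on the order of the calls: if a short name such as `confirmed` is replaced before `confirmedRelative`, the longer name gets broken. As an alternative sketch (not the approach used in this post), the labels can live in a dictionary and be applied in a single regex pass. The names `LABELS` and `translate_keys` are hypothetical, and the dictionary holds only a subset of the labels used above.

```python
import re

# A subset of the field-name labels used in this post; extend as needed
LABELS = {
    "confirmedRelative": "确诊相关",
    "confirmed": "确诊",
    "asymptomaticRelative": "无症状相关",
    "asymptomatic": "无症状",
    "died": "死亡",
    "crued": "治愈",
}

# Try longer names first so "confirmedRelative" wins over its prefix "confirmed"
_KEY_PATTERN = re.compile(
    "|".join(sorted((re.escape(k) for k in LABELS), key=len, reverse=True))
)


def translate_keys(text):
    """Replace every known field name in text with its label, in one pass."""
    return _KEY_PATTERN.sub(lambda m: LABELS[m.group(0)], text)


print(translate_keys('"confirmedRelative":"5","confirmed":"100","died":"3"'))
```

Because the substitution happens in one pass, text that has already been replaced is never scanned again, so the order of entries in the dictionary does not matter.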
Next, let's grab the epidemic data for other countries:
import re
import requests

headers = {
    'Referer': 'http://www.voice.baidu.com/',
    'Origin': 'http://www.voice.baidu.com/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36',
}
url = 'https://voice.baidu.com/act/newpneumonia/newpneumonia/?from=osari_aladin_banner&city=%E7%BE%8E%E5%9B%BD-%E7%BE%8E%E5%9B%BD'
res = requests.get(url=url, headers=headers).text
# This time the capture ends at the "diedPercent" key instead of "cityCode"
result = re.findall('"city":"(.*?)","diedPercent"', res)
for i in result:
    am = bytes(i, 'utf-8')
    amn = am.decode('unicode-escape')
    # "diedPercent" must be replaced before "died"; otherwise it would turn
    # into "死亡Percent" and the longer name would never match.
    ams = (amn.replace("diedPercent", "死亡率")
              .replace("died", "死亡")
              .replace("crued", "治愈")
              .replace("confirmedRelative", "确诊相关")
              .replace("confirmed", "确诊")
              .replace("curedPercent", "治愈率")
              .replace("curConfirm", "现有确诊"))  # presumably "currently confirmed"
    print(ams)
Screenshot of the resulting data for other countries:
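The printed fragments are still raw slices of text. If structured records are more convenient, one option is to wrap each captured fragment back into a small JSON object and let `json.loads` handle the `\uXXXX` escapes. This is only a sketch under a strong assumption: every field between `"city"` and the end key is a plain string value, as in the made-up `sample` below. The function name `fragments_to_records` and its `end_key` parameter are hypothetical, and fragments that do not form valid JSON are simply skipped.

```python
import json
import re

# Hypothetical sample imitating the page source (not real data)
sample = (r'[{"city":"\u65e5\u672c","confirmed":"100","died":"3",'
          r'"crued":"40","diedPercent":"0.03"}]')


def fragments_to_records(page_text, end_key="cityCode"):
    """Turn each "city" ... end_key slice of page_text into a dict."""
    pattern = '"city":"(.*?)","' + re.escape(end_key) + '"'
    records = []
    for fragment in re.findall(pattern, page_text):
        # The fragment starts inside the city value and stops inside the last
        # value before end_key, so adding the outer pieces completes an object.
        candidate = '{"city":"' + fragment + '"}'
        try:
            records.append(json.loads(candidate))
        except json.JSONDecodeError:
            # Assumption violated (for example nested data); skip this fragment
            continue
    return records


for record in fragments_to_records(sample, end_key="diedPercent"):
    print(record["city"], record["confirmed"], record["died"], record["crued"])
```

With dictionaries in hand, the values can be sorted, filtered, or written to a CSV file without any further string replacement.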