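The overall plan is to collect the data first, then use an algorithm to recommend tourist attractions, and finally build the web service.
The first step is to fetch the weather information. The code is as follows: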
import requests

# Placeholder: the original post does not give the weather page URL; set it before running.
url = ""
payload = {}
# Request headers copied from a browser session so the request looks like a normal page visit.
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Cookie': 'Hm_lvt_c758855eca53e5d78186936566552a13=1651050045,1651110482,1651312770; Hm_lpvt_c758855eca53e5d78186936566552a13=1651312840',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',
    'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="100", "Google Chrome";v="100"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
}
# Fetch the weather page; response.text then holds the HTML to parse.
response = requests.request("GET", url, headers=headers, data=payload)
The fetched weather page is parsed with XPath, and the extracted weather data is stored in MySQL. The parsed data is shown below:
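A minimal sketch of the parse-and-store step, under several assumptions: the sample markup, the class names, the `weather` table, and the helper names `parse_weather` and `save_weather` are all hypothetical, since the post does not show the real page layout or schema. To stay runnable without extra services, it uses the XPath subset in the standard library's `xml.etree.ElementTree` (which needs well-formed markup) and `sqlite3` as a stand-in for MySQL. For real pages, an HTML-tolerant parser such as `lxml` and a MySQL driver would be the more likely choice.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup; the real page layout is not shown in the post.
SAMPLE_HTML = """
<ul class="forecast">
  <li><span class="date">2022-04-30</span><span class="wea">Sunny</span><span class="tem">18/27</span></li>
  <li><span class="date">2022-05-01</span><span class="wea">Cloudy</span><span class="tem">17/25</span></li>
</ul>
"""


def parse_weather(markup):
    """Extract (date, weather, temperature) tuples using XPath expressions."""
    root = ET.fromstring(markup.strip())
    rows = []
    for li in root.findall(".//li"):
        rows.append((
            li.findtext("span[@class='date']"),
            li.findtext("span[@class='wea']"),
            li.findtext("span[@class='tem']"),
        ))
    return rows


def save_weather(conn, city, rows):
    """Insert the parsed rows into a (hypothetical) weather table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS weather "
        "(city TEXT, date TEXT, weather TEXT, temperature TEXT)"
    )
    conn.executemany(
        "INSERT INTO weather VALUES (?, ?, ?, ?)",
        [(city, *row) for row in rows],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a MySQL connection
rows = parse_weather(SAMPLE_HTML)
save_weather(conn, "Beijing", rows)
```

The parameterized `INSERT` keeps scraped text from being interpreted as SQL. With a MySQL driver the same pattern applies, although the placeholder style usually differs (`%s` instead of `?`).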
In the same way, we also need to collect the attraction information. The code is as follows:
import requests

# Placeholder: the original post does not give the attraction page URL; set it before running.
url = ""
# Cookies copied from a browser session.
cookies = {
    'BAIDU_SSP_lcr': 'https://www.baidu.com/link?url=eJuu4G72m_bcrR00lMKqEbaWOCyrTWnG_FO2oww5O-NZZ1Ckue5F5OelotsxhBXFtqVJAE44ErbK4HGBFY72U_&wd=&eqid=edad32f500028cbe00000003626916a1',
    '__gads': 'ID=b57a9e27539ca48e-224949db70d200e6:T=1651054246:RT=1651054246:S=ALNI_MZB0TBq2eU0WtGZ7JMYtuvardMQfQ',
    'Hm_lvt_0283262b2e9be756492e6b078db678a7': '1651054248',
    'Hm_lpvt_0283262b2e9be756492e6b078db678a7': '1651054248',
}
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'max-age=0',
    # 'If-Modified-Since': 'Mon, 18 Apr 2022 09:20:31 GMT',
    # 'If-None-Match': '"fe38138d553d81:11e722"',
    'Proxy-Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',
}
# verify=False skips TLS certificate verification; requests will emit an InsecureRequestWarning for HTTPS URLs.
response = requests.get(url, headers=headers, cookies=cookies, verify=False)
The attraction HTML is then parsed and cleaned. The resulting data is shown below:
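The post does not list the cleaning rules, so the following is only a sketch of what such a cleaning pass might look like: it strips leftover tags, decodes HTML entities, collapses whitespace, converts the score to a number, and drops duplicate attractions. The field names (`city`, `name`, `score`), the sample records, and the helper names `clean_text` and `clean_records` are assumptions made for illustration.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")   # leftover HTML tags
WS_RE = re.compile(r"\s+")        # runs of whitespace, including non-breaking spaces


def clean_text(raw):
    """Strip tags, decode entities, and collapse whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)
    return WS_RE.sub(" ", text).strip()


def clean_records(records):
    """Clean each record and keep only the first record per (city, name)."""
    seen = set()
    cleaned = []
    for rec in records:
        city = clean_text(rec["city"])
        name = clean_text(rec["name"])
        if not name or (city, name) in seen:
            continue  # skip empty names and duplicates
        seen.add((city, name))
        try:
            score = float(clean_text(rec.get("score", "")))
        except ValueError:
            score = None  # unparseable score is kept as missing
        cleaned.append({"city": city, "name": name, "score": score})
    return cleaned


# Hypothetical raw records, as they might come out of the HTML parser.
raw_records = [
    {"city": "Beijing", "name": " <b>Forbidden&nbsp;City</b>\n", "score": "4.8"},
    {"city": "Beijing", "name": "Forbidden City", "score": "4.8"},
    {"city": "Beijing", "name": "Summer Palace", "score": "n/a"},
]
cleaned = clean_records(raw_records)
```

Keeping a missing score as `None` instead of dropping the record leaves the decision about incomplete rows to the later recommendation step.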
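With all the data in place, we use a scikit-learn algorithm to recommend attractions intelligently. The home page of the Flask service provides search, recommendations, popular cities, and more. Opening a city's detail page shows the information for every attraction in that city. The overall result is shown in the video below: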
Design and Implementation of a Python-Based System for Collecting Attraction Weather and Review Information
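The post does not say which scikit-learn algorithm drives the recommendations. As one plausible reading, the sketch below ranks attractions by cosine similarity between feature vectors. It is written in plain Python so it runs without dependencies; `sklearn.metrics.pairwise.cosine_similarity` computes the same measure on arrays. The feature layout (`[rating / 5, is_historic, is_nature]`), the sample attractions, and the `recommend` helper are assumptions, not the author's method.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(spots, liked_name, top_n=2):
    """Return the names of the top_n attractions most similar to liked_name."""
    target = next(s for s in spots if s["name"] == liked_name)
    scored = [
        (cosine(target["features"], s["features"]), s["name"])
        for s in spots
        if s["name"] != liked_name
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_n]]


# Hypothetical features: [rating / 5, is_historic, is_nature]
spots = [
    {"name": "Forbidden City", "features": [0.96, 1.0, 0.0]},
    {"name": "Summer Palace", "features": [0.94, 1.0, 1.0]},
    {"name": "Great Wall", "features": [0.98, 1.0, 0.0]},
    {"name": "Olympic Forest Park", "features": [0.86, 0.0, 1.0]},
]
recommendations = recommend(spots, "Forbidden City")
print(recommendations)
```

A Flask view for the recommendation feature could call a function like `recommend` with the attraction the user is viewing and render the returned names in the page template.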