Taobao Reviews
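Open Taobao and go to the reviews of the product you want to scrape. In Google Chrome, open the developer tools, select the Network tab, and find the request list_detail_rate.htm?itemId=624466733121. Click Headers and Preview to see the corresponding review data. The code for scraping one page of reviews is as follows: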
import re
import requests

# Copy these three values from the request's Headers tab in the developer tools.
headers = {
    'Referer': 'copy from the Headers tab',
    'User-Agent': 'copy from the Headers tab',
    'cookie': 'copy from the Headers tab',
}
# Request URL copied from the developer tools; currentPage selects the page of reviews.
url = 'https://rate.tmall.com/list_detail_rate.htm?itemId=624466733121&spuId=1759742463&sellerId=2134802284&order=3&currentPage=1&append=0&content=1&tagId=&posi=&picture=&groupId=&ua=098#E1hvvQvWvRyvUvCkvvvvvjiWPsLhQjiEn2cW0j3mPmPv6jiPPFcwljlRRFchAjtE9vhvHnsGmD/nzYswzbT57/JFzhdwCliIdvhvmpvh6O8h3vCO5UOCvvpvCvvvRvhvCvvvvvvRvpvhvv2MMQvCvvOvCvvvphmUvpvVmvvC9cDPuvhvmvvv9bgexZbWKvhv8vvvvblvpvCgvvvCeZCvmR6vvUEpphvWh9vv9DCvpv1OvvvmlhCvm8+UvpCWv8tTvvakfJClK2kTWlK9Q8oQ+ulAbMoxfJmKHkx/gjc60fJ6EvLv+Exre8tYVVzUafmAdcvrYUkU+b8raAd6QbmD5i3gLwoQ0f06WuOCvvpvvUmmRvhvCvvvvvv=&needFold=0&_ksTS=1638101395614_1797&callback=jsonp1798'
response = requests.get(url=url, headers=headers).text
print(response)
# Pull the text of each review out of the raw response.
contents = re.findall(r'"rateContent":"(.*?)"', response)
for content in contents:
    print(content)
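The regular expression on "rateContent" works on the raw text, but it stops at the first double quote, so a review containing an escaped quote would be cut short. Since the callback=jsonp1798 parameter suggests the response is JSONP, an alternative is to strip the callback wrapper and parse the payload as JSON. The following is only a minimal sketch: the helper names (strip_jsonp, find_values, extract_rate_contents) are hypothetical, and the sample string is made up for illustration, not a real response. It searches for "rateContent" keys recursively so that it does not depend on the exact nesting of the real payload.

```python
import json
import re


def strip_jsonp(text):
    """Return the JSON payload inside a JSONP wrapper such as jsonp1798({...})."""
    match = re.search(r'^[^(]*\((.*)\)\s*;?\s*$', text.strip(), re.DOTALL)
    if match is None:
        raise ValueError("response does not look like JSONP")
    return match.group(1)


def find_values(node, key):
    """Recursively collect every value stored under `key` in parsed JSON."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            else:
                found.extend(find_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found


def extract_rate_contents(jsonp_text):
    """Parse a JSONP response and return all "rateContent" values in order."""
    data = json.loads(strip_jsonp(jsonp_text))
    return find_values(data, "rateContent")


# Made-up sample (an assumption, not a captured response); note the escaped quotes.
sample = r'jsonp1798({"rateDetail": {"rateList": [{"rateContent": "Good \"quality\""}, {"rateContent": "Fast delivery"}]}})'
for content in extract_rate_contents(sample):
    print(content)
```

With a real response, extract_rate_contents(response) could replace the re.findall call, assuming the server actually returns the JSONP wrapper.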
Reference: "python3抓取淘宝评论内容" (Scraping Taobao review content with Python 3)