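Advanced Python Web Scraping - Level 5: Die-Hard Fans
1. Exercise Introduction
Scrape the 15 featured comments on the first page for "只要平凡", the first song listed on Zhang Jie's artist page.
Page link: https://y.qq.com/n/yqq/singer/002azErJ0UcDN6.html
[Before "Load more featured comments" is clicked, the first page shows 15 featured comments. Only the first, second, and last of them were shown as screenshots here.]
2. Python Reference Solution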
'''
Author: Gu Jiakai
Date: 2021-07-20 15:58:16
LastEditTime: 2021-07-20 18:40:53
LastEditors: Gu Jiakai
Description:
FilePath: \第5关-狂热粉丝\习题再练-爬取歌曲精彩评论.py
'''
import requests
url = 'https://c.y.qq.com/base/fcgi-bin/fcg_global_comment_h5.fcg'
params = {
    'g_tk_new_20200303': '5381',
    'g_tk': '5381',
    'loginUin': '0',
    'hostUin': '0',
    'format': 'json',
    'inCharset': 'utf8',
    'outCharset': 'GB2312',
    'notice': '0',
    'platform': 'yqq.json',
    'needNewCode': '0',
    'cid': '205360772',
    'reqtype': '2',
    'biztype': '1',
    'topid': '214172912',
    'cmd': '8',
    'needmusiccrit': '0',
    'pagenum': '0',
    'pagesize': '25',
    'lasthotcommentid': '',
    'domain': 'qq.com',
    'ct': '24',
    'cv': '10101010'
}
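# Request the comment data and parse the JSON response.
res = requests.get(url, params=params)
content = res.json()
# The featured comments are read from hot_comment -> commentlist.
lst = content['hot_comment']['commentlist']
# Number the comments from 1; enumerate avoids shadowing the built-in id().
for number, ele in enumerate(lst, start=1):
    print(str(number) + ':')
    # A literal backslash-n in the comment text is treated as a line break.
    print(ele['rootcommentcontent'].replace('\\n', '\n').strip())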
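The parsing step can also be tried without touching the network. The following is a minimal sketch, assuming the response keeps the shape used in the solution (`hot_comment` -> `commentlist` -> `rootcommentcontent`). The helper name `extract_hot_comments`, its `limit` parameter, and the sample payload are made up for illustration and are not part of the original solution or of any QQ Music API.

```python
def extract_hot_comments(payload, limit=15):
    """Return up to `limit` cleaned comment strings from a parsed response.

    `payload` is assumed to look like the dict returned by res.json()
    in the solution above; missing keys simply yield an empty list.
    """
    comments = (payload.get('hot_comment') or {}).get('commentlist') or []
    cleaned = []
    for item in comments[:limit]:
        text = item.get('rootcommentcontent') or ''
        # Turn a literal backslash-n into a real newline, then trim whitespace.
        cleaned.append(text.replace('\\n', '\n').strip())
    return cleaned


# Hypothetical payload with the same nesting as the real response.
sample = {
    'hot_comment': {
        'commentlist': [
            {'rootcommentcontent': ' first line\\nsecond line '},
            {'rootcommentcontent': 'plain comment'},
        ]
    }
}

for number, text in enumerate(extract_hot_comments(sample), start=1):
    print(f'{number}:')
    print(text)
```

Keeping the parsing in a small function like this makes it possible to check the cleaning logic against a saved or hand-written payload before sending real requests.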
3. Supplementary Exercise
Scrape the song name, album name, duration, and play link for the first five pages of Zhang Jie's songs.
Page link: https://y.qq.com/portal/search.html#page=1&searchid=1&remoteplace=txt.yqq.top&t=song&w=%E5%BC%A0%E6%9D%B0
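The search page link carries its settings in the URL fragment (the part after `#`), and the `w` field is the percent-encoded UTF-8 form of the search keyword. A short sketch with the standard library shows how to read those fields back; the variable names are chosen here for illustration only.

```python
from urllib.parse import parse_qs, quote, urlsplit

page_url = ('https://y.qq.com/portal/search.html#page=1&searchid=1'
            '&remoteplace=txt.yqq.top&t=song&w=%E5%BC%A0%E6%9D%B0')

# Everything after '#' is the fragment; it is formatted like a query string.
fragment = urlsplit(page_url).fragment
fields = parse_qs(fragment)  # percent-decoding uses UTF-8 by default

keyword = fields['w'][0]
page = fields['page'][0]
print(keyword, page)

# Encoding the keyword again gives back the form seen in the link.
print(quote(keyword))
```

When the keyword is passed through the `params` argument of `requests.get`, it can be given as plain text, since requests takes care of the encoding.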
4. Supplementary Exercise: Python Reference Solution
'''
Author: Gu Jiakai
Date: 2021-07-20 18:44:37
LastEditTime: 2021-07-20 19:09:54
LastEditors: Gu Jiakai
Description:
FilePath: \第5关-狂热粉丝\习题再练-爬取张杰前5页歌曲名.py
'''
import requests
from time import gmtime, strftime

url = 'https://c.y.qq.com/soso/fcgi-bin/client_search_cp'
a = []  # one [song, album, duration, link] entry per song
for i in range(1, 6):  # pages 1 through 5
    params = {
        'ct': '24',
        'qqmusic_ver': '1298',
        'new_json': '1',
        'remoteplace': 'txt.yqq.song',
        'searchid': '54847913553742588',
        't': '0',
        'ggr': '1',
        'cr': '1',
        'catZhida': '1',
        'lossless': '0',
        'flag_qc': '0',
        'p': str(i),  # page number
        'n': '10',
        'w': '张杰',  # search keyword
        'g_tk_new_20200303': '5381',
        'g_tk': '5381',
        'loginUin': '0',
        'hostUin': '0',
        'format': 'json',
        'inCharset': 'utf8',
        'outCharset': 'utf-8',
        'notice': '0',
        'platform': 'yqq.json',
        'needNewCode': '0'
    }
    res = requests.get(url, params=params)
    content = res.json()
    lst = content['data']['song']['list']
    for ele in lst:
        song = ele['name']
        album = ele['album']['name']
        # 'interval' is used as the song length in seconds; format it as MM:SS.
        duration = strftime('%M:%S', gmtime(ele['interval']))
        url1 = 'https://y.qq.com/n/yqq/song/' + ele['mid']
        a.append([song, album, duration, url1])
# Print one row per song.
for row in a:
    print(row)
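One detail of the duration formatting is worth a small illustration. `strftime('%M:%S', gmtime(seconds))` works for tracks shorter than an hour, but `%M` only holds the minutes within the current hour, so longer values wrap around. The sketch below compares it with a `divmod`-based alternative; `format_duration` is a hypothetical helper, not something the original solution uses.

```python
from time import gmtime, strftime


def format_duration(seconds):
    """Format a length in seconds as MM:SS without wrapping at one hour."""
    minutes, secs = divmod(int(seconds), 60)
    return f'{minutes:02d}:{secs:02d}'


short_track = 217   # 3 minutes 37 seconds
long_track = 3700   # 61 minutes 40 seconds

print(strftime('%M:%S', gmtime(short_track)), format_duration(short_track))
print(strftime('%M:%S', gmtime(long_track)), format_duration(long_track))
```

For ordinary songs both approaches give the same text, so the original call is adequate for this exercise; the helper only matters if very long tracks appear in the results.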