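Background
I wanted to scrape some stock data to analyze and ran straight into this pitfall. I hadn't touched Python for a while and had forgotten most of the syntax, but luckily the simple things in Python are not hard.
Python: requests.get(…).json() fails to parse the response
Error log: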
Traceback (most recent call last):
File "D:/E/code/python/stock/demo/demo.py", line 66, in <module>
getStock()
File "D:/E/code/python/stock/demo/demo.py", line 45, in getStock
html_data = response.json()
File "D:\E\code\python\stock\lib\site-packages\requests\models.py", line 899, in json
return complexjson.loads(
File "D:\E\program\python3.8\lib\json\__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "D:\E\program\python3.8\lib\json\decoder.py", line 340, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 5 (char 4)
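The same URL returns JSON in my browser, so why not here? It was really frustrating. I searched Baidu for a long time, then Google, and still found no satisfactory solution.
So I inspected the response itself to pin down the problem: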
print(response.url)
print("\n")
print(response.cookies)
print("\n")
print(response.content)
print('\n')
print(response.ok)
Output:
https://xueqiu.com/service/v5/stock/screener/quote/list ...
<RequestsCookieJar[<Cookie acw_tc=2760779516447594541566859ecab3dd6b548269d36d7bc8dc9ff0123144b9 for xueqiu.com/>]>
b'403 Forbidden. Your IP Address: 112.48.x.x .'
False
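So the JSON parsing failure was only a symptom: the server had answered 403 Forbidden with a plain-text body, and response.json() was trying to parse that text. The problem was never with JSON itself.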
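The exact wording of the error can be reproduced offline. The sketch below feeds the body shown in the output above to the standard json module; it makes no request and is only meant to show where "line 1 column 5 (char 4)" comes from.

```python
import json

# The body the server actually returned (IP masked as in the output above).
body = "403 Forbidden. Your IP Address: 112.48.x.x ."

try:
    json.loads(body)
except json.JSONDecodeError as err:
    # "403" on its own is a complete JSON number. The decoder accepts it,
    # skips the following space, and then finds more text at index 4,
    # which it reports as "Extra data".
    error = err
    print(err.msg, err.lineno, err.colno, err.pos)
```

In other words, the decoder read the status code as a number and complained about everything after it.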
Python problem: a crawler using the requests library gets 403
The fix is simply to add a User-Agent header to the request.
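Independently of the header, a small guard in front of response.json() would have surfaced the 403 directly instead of as a JSON error. The following is a minimal sketch, not part of the original script: json_or_raise is a hypothetical helper name, it assumes an object that behaves like a requests.Response, and the SimpleNamespace object is only an offline stand-in for the 403 response shown above.

```python
from types import SimpleNamespace


def json_or_raise(response):
    """Return response.json(), or raise RuntimeError describing the real failure.

    `response` is assumed to behave like a requests.Response:
    attributes ok, status_code, text and a json() method.
    """
    if not response.ok:
        # Report the status and the start of the body, e.g. "403 Forbidden. ..."
        raise RuntimeError(f"HTTP {response.status_code}: {response.text[:200]!r}")
    try:
        return response.json()
    except ValueError as err:
        # JSON decode errors are ValueError subclasses.
        raise RuntimeError(f"body is not JSON: {response.text[:200]!r}") from err


# Offline stand-in for the blocked response (no network access involved).
blocked = SimpleNamespace(
    ok=False,
    status_code=403,
    text="403 Forbidden. Your IP Address: 112.48.x.x .",
)
try:
    json_or_raise(blocked)
except RuntimeError as err:
    message = str(err)
    print(message)
```

In the script itself this would be called as json_or_raise(requests.get(url, params=params, headers=headers)).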
Full source:
import time

import requests

url = 'https://xueqiu.com/service/v5/stock/screener/quote/list'
headers = {
    'Content-Type': 'application/json; charset=utf-8',
    # Placeholder: replace with the User-Agent string your own browser sends.
    'User-Agent': 'xxxx',
}


def getBaidu():
    # Despite the name, this only checks connectivity against httpbin.org.
    rq = requests.get('http://httpbin.org/get')
    print(rq.json())


def getStock():
    # Millisecond timestamp for the "_" query parameter. The original passed
    # the lambda itself instead of calling it, so no number was ever sent.
    now_ms = int(round(time.time() * 1000))
    print(now_ms)
    params = {
        'page': 1,
        'size': 1,
        'order': 'desc',
        'order_by': 'amount',
        'exchange': 'CN',
        'market': 'CN',
        'type': 'sha',
        '_': now_ms,
    }
    response = requests.get(url=url, params=params, headers=headers)
    # Debug output: shows what was requested and what came back.
    print(response.url)
    print(response.cookies)
    print(response.content)
    print(response.ok)
    # Stop on 403 and other HTTP errors instead of parsing an error page as JSON.
    response.raise_for_status()
    html_data = response.json()
    data_list = html_data['data']['list']
    for i in data_list:
        dit = {
            'Symbol': i['symbol'],
            'Name': i['name'],
            'Price': i['current'],
            'Change': i['chg'],
            'Change/%': i['percent'],
            'YTD/%': i['current_year_percent'],
            'Volume': i['volume'],
            'Amount': i['amount'],
            'Turnover rate/%': i['turnover_rate'],
            'P/E (TTM)': i['pe_ttm'],
            'Dividend yield/%': i['dividend_yield'],
            'Market cap': i['market_capital'],
        }
        print(dit)


if __name__ == '__main__':
    getStock()
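As for the User-Agent value, you can look it up yourself. It is not hard: just copy the request information that your browser sends. The request parameters and the rest are basically all there too. I never noticed this back when I was studying web development at school, which is a pity.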
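Copied request information usually arrives as plain "Name: value" lines. A minimal sketch of turning such text into the headers dict used above follows; headers_from_text is a hypothetical helper, and the header values below are placeholders rather than real browser output.

```python
def headers_from_text(raw):
    """Turn 'Name: value' lines copied from the browser into a dict."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and pseudo-headers that start with a colon.
        if not line or line.startswith(":"):
            continue
        # Split on the first colon only, so values such as URLs stay intact.
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return headers


# Placeholder text; paste the lines from your own browser instead.
copied = """
User-Agent: Mozilla/5.0 (placeholder, paste your own)
Accept: application/json
"""
headers = headers_from_text(copied)
print(headers)
```

The resulting dict can be passed as the headers argument of requests.get.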