A Noob's Python Learning Check-in, Scraping Series — Day 2 (Advanced requests)
1. SSL Verification
import requests
response = requests.get('https://www.12306.cn/index/')
print(response.status_code)
Output: if the site's certificate cannot be verified, the request fails with an SSLError.
import requests
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)
Output: the response comes back (status code 200), but requests prints an InsecureRequestWarning urging you to enable certificate verification.
import requests
import urllib3

# Suppress the InsecureRequestWarning emitted when verify=False
# (importing urllib3 directly; requests.packages.urllib3 is a deprecated alias)
urllib3.disable_warnings()
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)
Output:
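Rather than disabling verification, you can also point verify at a local CA bundle so requests validates the server certificate against it. A minimal sketch, assuming a hypothetical bundle path:

```python
import requests

# Hypothetical path to a CA bundle file; replace with a real certificate
# bundle for the site you are scraping.
ca_bundle = '/path/to/ca-bundle.crt'

def fetch(url, ca=ca_bundle):
    # verify accepts True (the default), False, or a path to a CA bundle,
    # in which case requests validates the certificate against that file.
    return requests.get(url, verify=ca)
```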
import logging
import requests
# Route warnings through the logging module so they are silenced
logging.captureWarnings(True)
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)
Output:

2. Proxy Settings
# Use the proxies parameter to set a proxy. First, install SOCKS support:
!pip install "requests[socks]"
A successful install looks like this:
An example of the proxies parameter format (this will not work as-is; swap in a valid proxy of your own):
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
requests.get('http://www.taobao.com', proxies=proxies)
Output: omitted
import requests
proxies = {'https': 'http://user:password@10.10.1.10:3128/',}
requests.get('https://www.taobao.com', proxies=proxies)
Output: omitted
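If the proxy username or password contains characters such as @ or :, they must be percent-encoded before being embedded in the proxy URL, or the URL will be mis-parsed. A small sketch with made-up credentials:

```python
from urllib.parse import quote

# Made-up credentials; note the password contains '@' and ':'
user = 'user'
password = quote('p@ss:word', safe='')

proxy_url = f'http://{user}:{password}@10.10.1.10:3128/'
print(proxy_url)  # http://user:p%40ss%3Aword@10.10.1.10:3128/
```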
import requests
proxy = '123.58.10.36:8080'
proxies = {
    'http': 'http://' + proxy,
    'https': 'https://' + proxy
}
try:
    response = requests.get('http://httpbin.org/get', proxies=proxies)
    print(response.text)
except requests.exceptions.ConnectionError as e:
    print('Error:', e.args)
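The SOCKS support installed at the top of this section is used by giving the proxy URL a socks5:// scheme. A sketch assuming a hypothetical local SOCKS5 proxy on port 1080 (replace with your own):

```python
import requests

# Hypothetical SOCKS5 proxy; requires: pip install "requests[socks]"
proxy = 'socks5://user:password@127.0.0.1:1080'
proxies = {
    'http': proxy,
    'https': proxy,
}

try:
    response = requests.get('http://httpbin.org/get', proxies=proxies, timeout=5)
    print(response.text)
except requests.exceptions.RequestException as e:
    # Covers connection failures and missing SOCKS dependencies alike
    print('Error:', e.args)
```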
3. Timeout Settings
import requests
r = requests.get('https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343&type=blog', timeout=1)
print(r.status_code)
Output:
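When the limit is exceeded, requests raises requests.exceptions.Timeout (with ConnectTimeout and ReadTimeout subclasses), so production code usually wraps the call. A sketch against httpbin's /delay endpoint, which waits before responding:

```python
import requests

try:
    # /delay/3 waits 3 seconds before responding, so a 1-second
    # timeout should trip a ReadTimeout on a normal connection.
    r = requests.get('http://httpbin.org/delay/3', timeout=1)
    print(r.status_code)
except requests.exceptions.Timeout:
    print('The request timed out')
except requests.exceptions.RequestException as e:
    # Other failures (e.g. no network at all) end up here
    print('Error:', e.args)
```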
import requests
# A tuple sets the connect and read timeouts separately: (connect, read)
r = requests.get('https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343&type=blog', timeout=(10, 20))
print(r.status_code)
Output:
import requests
r = requests.get('https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343&type=blog', timeout=None)
print(r.status_code)
Output when setting timeout=None to wait indefinitely:
import requests
r = requests.get('https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343&type=blog')
print(r.status_code)
To wait indefinitely you can also simply omit the parameter, since requests defaults to no timeout. Output:

4. Authentication
import requests
from requests.auth import HTTPBasicAuth
r = requests.get('http://localhost:5000', auth=HTTPBasicAuth('username', 'password'))
print(r.status_code)
Output:
import requests
from requests.auth import HTTPBasicAuth
r = requests.get('https://static3.scrape.cuiqingcai.com/', auth=HTTPBasicAuth('username', 'password'), verify=False)
print(r.status_code)
Output for basic authentication: # Authentication failed. What now?
import requests
from requests.auth import HTTPDigestAuth
url = 'http://httpbin.org/digest-auth/auth/user/pass'
requests.get(url, auth=HTTPDigestAuth('user', 'pass'))
Output for Digest authentication:
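As a side note, requests also accepts a plain (user, password) tuple for auth, which it treats as HTTPBasicAuth; this shorthand is the more common idiom in practice:

```python
import requests

try:
    # The tuple form below is shorthand for HTTPBasicAuth('user', 'pass');
    # httpbin's basic-auth endpoint returns 200 when credentials match.
    r = requests.get('http://httpbin.org/basic-auth/user/pass',
                     auth=('user', 'pass'), timeout=5)
    print(r.status_code)
except requests.exceptions.RequestException as e:
    print('Error:', e.args)
```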