A Beginner's Python Learning Log, Web Scraping Series: Day 2 (Advanced requests)
1. SSL Verification
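import requests
# By default (verify=True) requests validates the server's certificate and
# raises requests.exceptions.SSLError if validation fails.
response = requests.get('https://www.12306.cn/index/')
print(response.status_code)
Output:
![Output screenshot](https://img-blog.csdnimg.cn/07ea67adecf347b98e5cd42b785a7b0f.png)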
import requests
# verify=False skips certificate verification; urllib3 then emits an InsecureRequestWarning.
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)
Output: ![Output screenshot](https://img-blog.csdnimg.cn/ad4713f1d9fa4e4982a362b899aaaf8c.png?x-oss-process=image/watermark,type_ZHJvaWRzYW5zZmFsbGJhY2s,shadow_50,text_Q1NETiBA5qKm5Zyo5YmN5pa577yM5Lul5Yuk5pa5546w,size_17,color_FFFFFF,t_70,g_se,x_16)
import requests
from requests.packages import urllib3
# Silence urllib3 warnings, including the InsecureRequestWarning triggered by verify=False.
urllib3.disable_warnings()
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)
Output: ![Output screenshot](https://img-blog.csdnimg.cn/46beb390aa8f4458a7e8f0a5a3be75ae.png)
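urllib3.disable_warnings() silences warnings for the whole process. As a narrower alternative, the sketch below ignores only InsecureRequestWarning, and only around a single call. The helper names insecure_warnings_suppressed and get_status_unverified, as well as the timeout default, are assumptions made for illustration and are not part of requests.

```python
import contextlib
import warnings

import requests
from urllib3.exceptions import InsecureRequestWarning


@contextlib.contextmanager
def insecure_warnings_suppressed():
    """Ignore InsecureRequestWarning only inside the with-block."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", InsecureRequestWarning)
        yield


def get_status_unverified(url, timeout=10):
    """Fetch url without certificate verification and return the status code."""
    with insecure_warnings_suppressed():
        return requests.get(url, verify=False, timeout=timeout).status_code
```

For example, get_status_unverified('https://www.12306.cn') would print no warning, while a later requests.get(..., verify=False) elsewhere in the program would still warn.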
import logging
import requests
# Route warnings into the logging system instead of printing them.
logging.captureWarnings(True)
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)
Output: ![Output screenshot](https://img-blog.csdnimg.cn/46beb390aa8f4458a7e8f0a5a3be75ae.png)

2. Proxy Settings
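Use the proxies parameter to set a proxy. First, install SOCKS support for requests (only needed for socks5:// proxies; the HTTP proxy examples below work without it):
!pip install "requests[socks]"
A successful installation looks like this: ![Installation screenshot](https://img-blog.csdnimg.cn/24f83dabf9244eeab16874ae41ad6474.png?x-oss-process=image/watermark,type_ZHJvaWRzYW5zZmFsbGJhY2s,shadow_50,text_Q1NETiBA5qKm5Zyo5YmN5pa577yM5Lul5Yuk5pa5546w,size_17,color_FFFFFF,t_70,g_se,x_16)
Example of the proxies parameter format (it will not work as written; replace the addresses with a valid proxy of your own):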
import requests
# Keys are the scheme of the target URL; values are the proxy URLs to use for it.
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
requests.get('http://www.taobao.com', proxies=proxies)
Output: omitted
import requests
# A proxy that requires authentication takes the form http://user:password@host:port/
proxies = {'https': 'http://user:password@10.10.1.10:3128/',}
requests.get('https://www.taobao.com', proxies=proxies)
Output: omitted
import requests
proxy = '123.58.10.36:8080'
# The URL scheme describes how to reach the proxy itself; assuming a plain
# HTTP proxy, both entries use http://.
proxies = {
    'http': 'http://' + proxy,
    'https': 'http://' + proxy,
}
try:
    response = requests.get('http://httpbin.org/get', proxies=proxies)
    print(response.text)
except requests.exceptions.ConnectionError as e:
    print('Error:', e.args)
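The proxy dictionaries above all follow the same pattern, so they can be built by a small helper. The sketch below is one way to do that; build_proxies is a hypothetical name, and the addresses and credentials are placeholders in the same spirit as the examples above. It percent-encodes the credentials so that characters such as @ in a password do not break the proxy URL.

```python
from urllib.parse import quote


def build_proxies(host_port, user=None, password=None, scheme="http"):
    """Return a proxies dict that sends both http and https traffic through one proxy."""
    if user is not None and password is not None:
        # Percent-encode credentials so '@' or ':' inside them cannot corrupt the URL.
        credentials = f"{quote(user, safe='')}:{quote(password, safe='')}@"
    else:
        credentials = ""
    proxy_url = f"{scheme}://{credentials}{host_port}"
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("10.10.1.10:3128", user="user", password="p@ss")
print(proxies["https"])  # http://user:p%40ss@10.10.1.10:3128
```

The result can be passed straight to requests.get(url, proxies=proxies). Passing scheme="socks5" produces a SOCKS proxy URL, which requests only accepts when SOCKS support is installed.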

3. Timeout Settings
import requests
# timeout=1 applies a 1-second limit to the connect phase and to the read phase.
r = requests.get('https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343&type=blog', timeout=1)
print(r.status_code)
Output: ![Output screenshot](https://img-blog.csdnimg.cn/7c09c865818441f194b761b6c418b0db.png)
import requests
# A tuple sets the two limits separately: (connect timeout, read timeout), in seconds.
r = requests.get('https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343&type=blog', timeout=(10, 20))
print(r.status_code)
Output: ![Output screenshot](https://img-blog.csdnimg.cn/7c09c865818441f194b761b6c418b0db.png)
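The examples above only show the successful case. When a limit is exceeded, requests raises an exception instead of returning a response. The sketch below shows one way to tell the two timeout phases apart; get_status_or_none is a hypothetical helper name, and its default limits simply mirror the (10, 20) tuple used above.

```python
import requests


def get_status_or_none(url, connect_timeout=10, read_timeout=20):
    """Return the HTTP status code, or None if the request timed out."""
    try:
        r = requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        # No connection could be established within connect_timeout seconds.
        print('Timed out while connecting')
        return None
    except requests.exceptions.ReadTimeout:
        # Connected, but the server sent no data within read_timeout seconds.
        print('Timed out while waiting for the response')
        return None
    return r.status_code
```

Both exception classes derive from requests.exceptions.Timeout, so a single except requests.exceptions.Timeout clause is enough when the distinction does not matter.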
import requests
# timeout=None disables the timeout, so the call waits indefinitely.
r = requests.get('https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343&type=blog', timeout=None)
print(r.status_code)
Output when waiting indefinitely with timeout=None: ![Output screenshot](https://img-blog.csdnimg.cn/7c09c865818441f194b761b6c418b0db.png)
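import requests
# Omitting timeout has the same effect, because its default value is None.
r = requests.get('https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343&type=blog')
print(r.status_code)
Output when waiting indefinitely by omitting the timeout parameter: ![Output screenshot](https://img-blog.csdnimg.cn/7c09c865818441f194b761b6c418b0db.png)

4. Authentication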
import requests
from requests.auth import HTTPBasicAuth
# HTTPBasicAuth sends the username and password in the Authorization header.
r = requests.get('http://localhost:5000', auth=HTTPBasicAuth('username', 'password'))
print(r.status_code)
Output: ![Output screenshot](https://img-blog.csdnimg.cn/918bcab8699e4a9fb1b68c8b9ce7c9bc.png?x-oss-process=image/watermark,type_ZHJvaWRzYW5zZmFsbGJhY2s,shadow_50,text_Q1NETiBA5qKm5Zyo5YmN5pa577yM5Lul5Yuk5pa5546w,size_16,color_FFFFFF,t_70,g_se,x_16)
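requests also accepts a plain (username, password) tuple for auth and treats it as Basic authentication. The sketch below prepares a request without sending it, so the resulting Authorization header can be inspected without a server running on localhost:5000. The credentials are the same placeholders as above.

```python
import base64

import requests

# Build the request without sending it, so the header can be inspected offline.
prepared = requests.Request(
    'GET', 'http://localhost:5000', auth=('username', 'password')
).prepare()
print(prepared.headers['Authorization'])

# Basic auth is just "username:password" encoded as base64.
expected = 'Basic ' + base64.b64encode(b'username:password').decode('ascii')
print(prepared.headers['Authorization'] == expected)  # True
```

Because base64 is an encoding and not encryption, Basic authentication should only be sent over HTTPS in practice.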
import requests
from requests.auth import HTTPBasicAuth
# verify=False additionally skips certificate verification for this site.
r = requests.get('https://static3.scrape.cuiqingcai.com/', auth=HTTPBasicAuth('username', 'password'), verify=False)
print(r.status_code)
Basic authentication output: ![Output screenshot](https://img-blog.csdnimg.cn/4493beb76b0f49fea0305ece73ad9ede.png)
Authentication failed. So what now?
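One way to deal with a failed authentication is to detect it explicitly. A 401 status code means the server rejected or did not receive valid credentials, so the program can report that and ask for the correct username and password. The sketch below assumes this approach; is_authenticated is a hypothetical helper name, not part of requests.

```python
import requests


def is_authenticated(response):
    """Return False on 401 Unauthorized, True on success; raise for other HTTP errors."""
    if response.status_code == 401:
        return False
    # Any other 4xx or 5xx status raises requests.exceptions.HTTPError.
    response.raise_for_status()
    return True
```

With the response r from the example above, "if not is_authenticated(r): print('Wrong username or password')" makes the failure visible so that the request can be retried with valid credentials.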
import requests
from requests.auth import HTTPDigestAuth
# Digest authentication answers a server challenge instead of sending the password directly.
url = 'http://httpbin.org/digest-auth/auth/user/pass'
requests.get(url, auth=HTTPDigestAuth('user', 'pass'))
Digest Authentication output: ![Output screenshot](https://img-blog.csdnimg.cn/517e30fd78f5404a98ea88fd69e2d3ad.png)