Tuesday, October 18, 2016

[ Python FAQ ] urllib3 - Fixing the warnings Python raises when crawling HTTPS pages

Source From Here
Question
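I wanted to write a crawler that collects Taobao gold coins (淘金币) on a schedule, but when fetching page content with requests I got the following warning: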
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py:791: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
InsecureRequestWarning)

How-To
Based on what I found on Google, the fix is:
1. Before using requests, silence the warning with: requests.packages.urllib3.disable_warnings()
2. Pass verify=False to requests to skip certificate verification, for example: r = requests.get('https://blog.bbzhh.com', verify=False)
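
As a rough sketch of the two steps above, the snippet below limits the suppression to InsecureRequestWarning and to a single call, instead of disabling every urllib3 warning for the whole process. The helper name fetch_unverified and its timeout parameter are illustrative assumptions, not part of the original fix; the import path follows the requests.packages.urllib3 path used above.

```python
import warnings


def fetch_unverified(url, timeout=10):
    """Fetch url without certificate verification, hiding only that warning.

    Hypothetical helper for illustration; it needs the requests package.
    """
    # Imported lazily so this module still loads where requests is absent.
    import requests
    from requests.packages.urllib3.exceptions import InsecureRequestWarning

    with warnings.catch_warnings():
        # Ignore just InsecureRequestWarning, and only inside this block.
        warnings.simplefilter("ignore", InsecureRequestWarning)
        return requests.get(url, verify=False, timeout=timeout)
```

Calling fetch_unverified('https://blog.bbzhh.com') would then return the usual requests response object. Note that warnings.catch_warnings() changes process-wide state, so this sketch is meant for single-threaded scripts.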

The urllib3 documentation explains:
- InsecurePlatformWarning
New in version 1.11.
Certain Python platforms (specifically, versions of Python earlier than 2.7.9) have restrictions in their ssl module that limit the configuration that urllib3 can apply. In particular, this can cause HTTPS requests that would succeed on more featureful platforms to fail, and can cause certain security features to be unavailable.
If you encounter this warning, it is strongly recommended you upgrade to a newer Python version, or that you use pyOpenSSL as described in the OpenSSL / PyOpenSSL section.
For info about disabling warnings, see Disabling Warnings.
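
To see whether a given interpreter is affected, a small diagnostic along these lines may help. It is only a sketch using the standard ssl module: the function name ssl_capabilities is made up for illustration, and the checks approximate the kind of features urllib3 cares about rather than reproducing its exact logic. SNI support is reported as well, since it arrived in the same Python releases as SSLContext.

```python
import ssl
import sys


def ssl_capabilities():
    """Report the ssl features that old Python platforms tend to lack."""
    return {
        "python_version": "%d.%d.%d" % sys.version_info[:3],
        # SNI support; the attribute is missing on old interpreters.
        "has_sni": bool(getattr(ssl, "HAS_SNI", False)),
        # A real SSLContext class, absent before Python 2.7.9.
        "has_ssl_context": hasattr(ssl, "SSLContext"),
        "openssl_version": ssl.OPENSSL_VERSION,
    }


if __name__ == "__main__":
    for name, value in sorted(ssl_capabilities().items()):
        print("%s: %s" % (name, value))
```

If has_ssl_context or has_sni comes back False, upgrading Python (or using pyOpenSSL, as the documentation suggests) is the real fix; hiding the warning only hides the symptom.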

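Update, March 2, 2016:
First, a link to the official documentation:
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning
The reason for this update is that I ran into similar warnings in a virtualenv environment when installing third-party packages with pip: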
/root/env/python27/debug/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:315: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
SNIMissingWarning

/root/env/python27/debug/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:120: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning

In this case, however, modifying the code is not an option, so the solution is in the last section of the official page linked above:
Disabling Warnings
Making unverified HTTPS requests is strongly discouraged. ˙ ͜ʟ˙ But if you understand the ramifications and still want to do it...

Within the code
If you know what you’re doing and would like to disable all urllib3 warnings, you can use disable_warnings():
  import urllib3
  urllib3.disable_warnings()
Alternatively, if you are using Python’s logging module, you can capture the warnings to your own log:
  import logging
  logging.captureWarnings(True)
Capturing the warnings to your own log is much preferred over simply disabling the warnings.
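
To make the logging route concrete, here is a minimal sketch using only the standard library. Captured warnings are sent to the logger named "py.warnings"; the ListHandler class is a hypothetical in-memory handler for the demo, and a plain UserWarning stands in for the warning that urllib3 would emit.

```python
import logging
import warnings


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps formatted records in memory."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


handler = ListHandler()
# logging.captureWarnings() routes warnings to this logger name.
logging.getLogger("py.warnings").addHandler(handler)

logging.captureWarnings(True)
with warnings.catch_warnings():
    # Make sure the demo warning is not filtered out.
    warnings.simplefilter("always")
    # Stand-in for the warning urllib3 would raise.
    warnings.warn("Unverified HTTPS request is being made.", UserWarning)
logging.captureWarnings(False)

for message in handler.messages:
    print(message)
```

In a real program you would attach a FileHandler or your existing log handlers instead of ListHandler, so the warning stays visible in the logs without cluttering stderr.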

Without modifying code
If you are using a program that uses urllib3 and don’t want to change the code, you can suppress warnings by setting the PYTHONWARNINGS environment variable in Python 2.7+ or by using the -W flag with the Python interpreter (see docs), such as:
# PYTHONWARNINGS="ignore:Unverified HTTPS request" ./do-insecure-request.py
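
The effect of the environment variable can be checked with a small experiment. The sketch below starts a child Python process twice, once without and once with PYTHONWARNINGS set, and compares what arrives on stderr. The child script and the run_child helper are assumptions made for the demo; the child emits a plain UserWarning whose text starts the same way as the urllib3 message, since the filter matches on the beginning of the warning text.

```python
import os
import subprocess
import sys

# Child program: emits a warning whose text begins like urllib3's message.
CHILD = 'import warnings; warnings.warn("Unverified HTTPS request is being made.")'


def run_child(pythonwarnings=None):
    """Run the child script and return whatever it wrote to stderr."""
    env = dict(os.environ)
    env.pop("PYTHONWARNINGS", None)  # start from a clean setting
    if pythonwarnings is not None:
        env["PYTHONWARNINGS"] = pythonwarnings
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stderr


noisy = run_child()
quiet = run_child("ignore:Unverified HTTPS request")

print("without PYTHONWARNINGS:", noisy.strip())
print("with PYTHONWARNINGS:", quiet.strip())
```

The first run should show the warning on stderr and the second should not, which is the behavior pip or any other unmodified program gets when launched with the variable set.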

