A Detailed Guide to Python's Requests Module - CFANZ编程社区

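Requests is a module for making HTTP requests. Several similar modules exist, such as urllib, urllib2, httplib, and httplib2, and they all offer broadly the same functionality. So why does Requests stand out? Its official site describes it as HTTP "for Humans". How human-friendly is it really? If you have used urllib or one of the others before, a side-by-side comparison makes the difference obvious.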

I. Importing

Once Requests is installed, importing it is straightforward:

```python
import requests
```

II. Requesting a URL

Here are the most common forms of GET and POST requests.

1. Sending a GET request without parameters:

```python
r = requests.get("http://pythontab.com/justTest")
```

We now have a response object r, from which we can get all the information we need.

The GET request above carries no parameters. What if the request needs some?

2. Sending a GET request with parameters:

```python
payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get("http://pythontab.com/justTest", params=payload)
```

As shown above, GET parameters are passed through the params keyword argument.

We can print the URL that was actually requested to check it:

```python
>>> print(r.url)
http://pythontab.com/justTest?key1=value1&key2=value2
```

The request did go to the correct URL.

A list can also be passed as the value of a parameter:

```python
>>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
>>> r = requests.get("http://pythontab.com/justTest", params=payload)
>>> print(r.url)
http://pythontab.com/justTest?key1=value1&key2=value2&key2=value3
```

That covers the basic forms of a GET request.
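
To check how a params dictionary ends up in the URL without sending anything over the network, one option is to build the request and prepare it locally. The following is a minimal sketch using `requests.Request(...).prepare()`; the URL is the article's placeholder, and the parameter order assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
import requests

# Same payload as above: one scalar value and one list value
payload = {'key1': 'value1', 'key2': ['value2', 'value3']}

# Build the request and prepare it locally; nothing is sent over the network
req = requests.Request('GET', 'http://pythontab.com/justTest', params=payload)
prepared = req.prepare()

# The list is expanded into one key2=... pair per item
print(prepared.url)
```

This is handy for debugging query strings before pointing the code at a real server.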

3. Sending a POST request:

```python
r = requests.post("http://pythontab.com/postTest", data={"key": "value"})
```

As shown above, POST parameters are passed through the data keyword argument.

Here data is a dictionary, which is sent form-encoded. We can also send a JSON string instead:

```python
>>> import json
>>> import requests
>>> payload = {"key": "value"}
>>> r = requests.post("http://pythontab.com/postTest", data=json.dumps(payload))
```

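Sending JSON is so common that newer versions of Requests (2.4.2 and later) added a json keyword argument. It serializes the data for the POST request directly, so the json module is no longer needed:

```python
>>> payload = {"key": "value"}
>>> r = requests.post("http://pythontab.com/postTest", json=payload)
```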
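
The three ways of passing a POST body differ in what is actually sent. The sketch below prepares each variant locally (no network traffic) and prints the resulting Content-Type header and body; the URL is the article's placeholder. It suggests that a string passed through data goes out without any Content-Type header, while json sets one.

```python
import json
import requests

url = 'http://pythontab.com/postTest'
payload = {'key': 'value'}

# Prepare each variant locally; nothing is sent over the network
form = requests.Request('POST', url, data=payload).prepare()
raw = requests.Request('POST', url, data=json.dumps(payload)).prepare()
as_json = requests.Request('POST', url, json=payload).prepare()

# data=dict is form-encoded
print(form.headers.get('Content-Type'), form.body)
# data=str is sent as-is, and no Content-Type header is added
print(raw.headers.get('Content-Type'), raw.body)
# json=dict is serialized and labelled as application/json
print(as_json.headers.get('Content-Type'), as_json.body)
```

If a server insists on a JSON Content-Type, the json argument (or an explicit headers dictionary) is therefore the safer choice.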

What if we want to POST a file? That is what the files parameter is for:

```python
>>> url = 'http://pythontab.com/postTest'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
>>> r.text
```

When posting a file we can also specify extra information such as the filename, content type, and additional headers:

```python
>>> url = 'http://pythontab.com/postTest'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
>>> r = requests.post(url, files=files)
```

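Tip: opening the file in binary mode is strongly recommended. Requests may set the Content-Length header from the number of bytes in the file, and a file opened in text mode can cause errors.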
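
The upload examples above leave the file open after the request. A minimal sketch of one way to tidy that up is to open the file in a with block. Here a small stand-in file is created in a temporary directory (its content is hypothetical), and the request is only prepared locally so the sketch runs without a server.

```python
import os
import tempfile
import requests

# Create a small stand-in file so the sketch runs anywhere (hypothetical content)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'report.xls')
with open(path, 'wb') as f:
    f.write(b'placeholder bytes')

# Open in binary mode; the with block closes the file once the request is built
with open(path, 'rb') as fh:
    files = {'file': ('report.xls', fh, 'application/vnd.ms-excel')}
    prepared = requests.Request(
        'POST', 'http://pythontab.com/postTest', files=files
    ).prepare()

# The body is multipart/form-data and Content-Length counts its bytes
print(prepared.headers['Content-Type'])
print(prepared.headers['Content-Length'], len(prepared.body))
```

With a real server, calling `requests.post(url, files=files)` inside the same with block would follow the same pattern.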

As you can see, sending requests with Requests is simple!

III. Reading the Response

Now let's look at how to read the response after sending a request. We continue with the first example:

```python
>>> import requests
>>> r = requests.get('http://pythontab.com/justTest')
>>> r.text
```

Which encoding was used to decode r.text?

```python
>>> r.encoding
'utf-8'
```

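So the body was decoded as UTF-8. What if I want r.text to be decoded with a different encoding?

```python
>>> r.encoding = 'ISO-8859-1'
```

From now on, every access to r.text decodes the body as ISO-8859-1.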
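
Why the encoding matters can be illustrated without Requests at all, since decoding is ordinary bytes-to-text conversion. The sketch below uses a made-up sample string ("café", not from the article) and decodes the same bytes in two ways.

```python
# The bytes a server might send for the text "café", encoded as UTF-8
raw = "café".encode("utf-8")

# Decoding with the matching encoding recovers the original text
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 yields garbled text instead of an error
as_latin1 = raw.decode("ISO-8859-1")

print(raw)        # b'caf\xc3\xa9'
print(as_utf8)    # café
print(as_latin1)  # cafÃ©
```

Setting r.encoding to the wrong value has the same effect on r.text, so it is worth changing only when the detected encoding is known to be wrong.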

There is also r.content. How does it differ from r.text? r.content returns the body as bytes, which is what we need when requesting an image URL and saving the image. A code snippet:

```python
import requests

def save_image(img_url, img_name="default.jpg"):
    # r.content holds the whole response body as bytes
    r = requests.get(img_url)
    image = r.content
    dest_dir = "D:\\"  # the backslash must be escaped; a bare "D:\" is a syntax error
    print("Saving image to " + dest_dir + img_name + "\n")
    try:
        # "wb" writes the bytes unchanged; the with block closes the file
        with open(dest_dir + img_name, "wb") as jpg:
            jpg.write(image)
    except IOError:
        print("IO Error")
```

r.text, introduced above, returns a string. If the response body is JSON, can we get the parsed data directly? That is what r.json() is for.

We can also read the raw data returned by the server with r.raw.read(). If you really do need the raw data, remember to pass stream=True when making the request:

```python
r = requests.get('https://api.github.com/events', stream=True)
```

We can also read the response status code:

```python
>>> r = requests.get('http://pythontab.com/justTest')
>>> r.status_code
200
```

requests.codes.ok can be used in place of the literal 200:

```python
>>> r.status_code == requests.codes.ok
True
```
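
Besides comparing status codes by hand, a response can raise an exception for error statuses through `raise_for_status()`. The sketch below builds a bare `requests.Response` by hand purely for illustration, so it runs offline; in real code the response would come from `requests.get()` or a similar call.

```python
import requests

# requests.codes maps readable names to numeric status codes
print(requests.codes.ok, requests.codes.not_found)

# Hand-built response, for illustration only; real code gets one from requests.get()
resp = requests.Response()
resp.status_code = 404

try:
    # Raises HTTPError for 4xx and 5xx status codes, does nothing otherwise
    resp.raise_for_status()
    outcome = 'ok'
except requests.exceptions.HTTPError as err:
    outcome = 'failed with status {}'.format(err.response.status_code)

print(outcome)
```

This keeps error handling in one except clause instead of scattering status comparisons through the code.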

IV. Headers

We can print the response headers:

```python
>>> r = requests.get("http://pythontab.com/justTest")
>>> r.headers
```

`r.headers` returns a dictionary-like object, for example:

```python
{
    'content-encoding': 'gzip',
    'transfer-encoding': 'chunked',
    'connection': 'close',
    'server': 'nginx/1.0.4',
    'x-runtime': '147ms',
    'etag': '"e1ca502697e5c9317743dc078f67693a"',
    'content-type': 'application/json'
}
```

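Individual response headers can be read as follows, for example to branch on their values. Header names are matched case-insensitively, so any capitalization works:

```python
r.headers['Content-Type']
```

or

```python
r.headers.get('Content-Type')
```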
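
The case-insensitive lookup comes from the class Requests uses for headers, `requests.structures.CaseInsensitiveDict`. A minimal sketch, with the dictionary built by hand here rather than taken from a live response:

```python
from requests.structures import CaseInsensitiveDict

# r.headers is an instance of this class; built by hand here for illustration
headers = CaseInsensitiveDict({'content-type': 'application/json'})

print(headers['Content-Type'])      # application/json
print(headers.get('CONTENT-TYPE'))  # application/json
print('content-TYPE' in headers)    # True
print(headers.get('x-missing'))     # None
```

Using `.get()` avoids a KeyError when a header may be absent.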

What if we want the request headers, that is, the headers we sent to the server? They are available directly as r.request.headers.

We can also add custom headers to a request through the headers keyword argument:

```python
>>> headers = {'user-agent': 'myagent'}
>>> r = requests.get("http://pythontab.com/justTest", headers=headers)
```

V. Cookies

If a response contains cookies, we can read them like this:

```python
>>> url = 'http://'
>>> r = requests.get(url)
>>> r.cookies['example_cookie_name']
'example_cookie_value'
```

We can also send our own cookies through the cookies keyword argument:

```python
>>> url = 'http://pythontab.com/cookies'
>>> cookies = {'cookies_are': 'working'}
>>> r = requests.get(url, cookies=cookies)
```
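
A plain dictionary sends the cookie with every request it is attached to. When a cookie should be limited to one domain and path, a `requests.cookies.RequestsCookieJar` can be passed instead. The sketch below prepares two requests locally (no network traffic); the second path, `/elsewhere`, is a hypothetical one chosen only to show the cookie being withheld.

```python
import requests

# Restrict the cookie to one domain and one path
jar = requests.cookies.RequestsCookieJar()
jar.set('cookies_are', 'working', domain='pythontab.com', path='/cookies')

# Prepare two requests locally; nothing is sent over the network
match = requests.Request('GET', 'http://pythontab.com/cookies', cookies=jar).prepare()
other = requests.Request('GET', 'http://pythontab.com/elsewhere', cookies=jar).prepare()

# The Cookie header is only added when domain and path match
print(match.headers.get('Cookie'))
print(other.headers.get('Cookie'))
```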

VI. Redirects

Sometimes a server redirects our request. GitHub, for example, redirects HTTP requests to HTTPS. We can use r.history to inspect redirects:

```python
>>> r = requests.get('http://pythontab.com/')
>>> r.url
'http://pythontab.com/'
>>> r.history
[]
```

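In the example above, r.history is empty and r.url is unchanged, so no redirect took place. When a redirect does occur, as with GitHub's HTTP-to-HTTPS redirect, r.url holds the final URL and r.history lists the intermediate responses. What if we want to stay on the URL we asked for, that is, stop Requests from following redirects automatically? Use the allow_redirects parameter:

```python
r = requests.get('http://pythontab.com', allow_redirects=False)
```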
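
To make a redirect chain easier to read, r.history and the final response can be flattened into one list. The helper names below (`redirect_chain`, `make_response`) are hypothetical, and the responses are built by hand to mimic the GitHub case mentioned above, so the sketch runs offline; the 301 status is an assumption for illustration.

```python
import requests

def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops

def make_response(status_code, url):
    # Hypothetical helper: builds a bare Response by hand so the sketch runs offline
    r = requests.Response()
    r.status_code = status_code
    r.url = url
    return r

# Mimic an HTTP-to-HTTPS redirect followed by the final page
final = make_response(200, 'https://github.com/')
final.history = [make_response(301, 'http://github.com/')]

for status, url in redirect_chain(final):
    print(status, url)
```

With allow_redirects=False the history stays empty, and the returned response itself carries the 3xx status code.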

VII. Timeouts

We can use the timeout parameter to set how long a request waits for the server, in seconds:

```python
requests.get('http://pythontab.com', timeout=1)
```
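
When the timeout expires, Requests raises an exception rather than returning a response, so callers usually want to catch it. The following is a minimal sketch; the function name `fetch` and the default values are assumptions, and the tuple form sets the connect and read timeouts separately.

```python
import requests

def fetch(url, timeout=(3.05, 10)):
    """Return the response, or None if the server is too slow.

    A (connect, read) tuple sets the two timeouts separately, in seconds.
    """
    try:
        return requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Raised for both connect and read timeouts
        print('Request to {} timed out'.format(url))
        return None
```

Note that the timeout limits how long Requests waits for the server to respond, not the total time needed to download the whole body.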

VIII. Proxies

We can also route HTTP or HTTPS traffic through a proxy by passing the proxies keyword argument:

```python
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://pythontab.com", proxies=proxies)
```

IX. Sessions

Sometimes we need to log in to a site before we can request its other URLs. This is where a session helps: log in through the site's login API first, let the session keep the resulting cookies, and then use the same session to request other URLs:

```python
s = requests.Session()
login_data = {'form_email': 'youremail@example.com', 'form_password': 'yourpassword'}
# The second positional argument of post() is data
s.post("http://pythontab.com/testLogin", login_data)
r = s.get('http://pythontab.com/notification/')
print(r.text)
```
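
What a session carries over between requests can be inspected without a server. In the sketch below, the cookie name and value (`sessionid`, `abc123`) are hypothetical stand-ins for whatever the login response would set, and the follow-up request is only prepared locally with `Session.prepare_request()`.

```python
import requests

with requests.Session() as s:
    # Headers set on the session are merged into every request it sends
    s.headers.update({'user-agent': 'myagent'})
    # Stand-in for a cookie the login response would normally set
    s.cookies.set('sessionid', 'abc123')

    # Prepare a follow-up request locally; nothing is sent over the network
    req = requests.Request('GET', 'http://pythontab.com/notification/')
    prepared = s.prepare_request(req)

print(prepared.headers['User-Agent'])  # myagent
print(prepared.headers['Cookie'])      # sessionid=abc123
```

Using the session as a context manager also closes its underlying connections when the block ends.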