[Python] Scraping Weibo hot-search data: Chinese text in the response shows up as "\u7814\u7a76\u8bc1\u5b9e\u"

进击的包籽 04-08 15:40 Views: 1
python

Problem description

While scraping Weibo hot-search data, the Chinese text in the response came back as hard-to-read escape sequences. An excerpt:

......[{"title_sub":"\u7814\u7a76\u8bc1\u5b9e\u559d\u5496\u5561\u80fd\u964d\u4f4e\u75db\u98ce\u98ce\u9669","item_log":{"key":"#\u7814\u7a76\u8bc1\u5b9e\u559d\u5496\u5561\u80fd\u964d\u4f4e\u75db\u98ce\u98ce\u9669#"}

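Solution

Import the json module and pass the response body through json.loads when reading the data. The \uXXXX sequences are JSON Unicode escapes, and json.loads decodes them back into the original characters.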
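To see the effect without making a network request, here is a minimal sketch that runs json.loads on the excerpt from the problem description. The trailing `}]` is not in the original excerpt; it is added here as an assumption so that the fragment parses as valid JSON on its own. The variable names `raw`, `items`, and `title` are only for illustration.

```python
import json

# Excerpt of the response body shown above. The closing "}]" is an
# assumption added so the truncated fragment is valid JSON by itself.
# The r'' prefix keeps the backslashes literal, as they arrive on the wire.
raw = r'[{"title_sub":"\u7814\u7a76\u8bc1\u5b9e\u559d\u5496\u5561\u80fd\u964d\u4f4e\u75db\u98ce\u98ce\u9669","item_log":{"key":"#\u7814\u7a76\u8bc1\u5b9e\u559d\u5496\u5561\u80fd\u964d\u4f4e\u75db\u98ce\u98ce\u9669#"}}]'

# json.loads decodes each \uXXXX escape into the character it stands for.
items = json.loads(raw)
title = items[0]["title_sub"]

print(title)  # 研究证实喝咖啡能降低痛风风险
```

The raw text and the decoded string carry the same information; the escapes are just an ASCII-safe way of writing the characters inside JSON.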

demo_code:

import requests
import json

url = "https://m.weibo.cn/api/container/getIndex"

querystring = {"containerid":"231583","page_type":"searchall"}

headers = {
    'sec-ch-ua': "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"100\", \"Google Chrome\";v=\"100\"",
    'x-xsrf-token': "99c11b",
    'sec-ch-ua-mobile': "?0",
    'user-agent': "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36",
    'accept': "application/json, text/plain, */*",
    'mweibo-pwa': "1",
    'x-requested-with': "XMLHttpRequest",
    'sec-ch-ua-platform': "\"Windows\"",
    'sec-fetch-site': "same-origin",
    'sec-fetch-mode': "cors",
    'sec-fetch-dest': "empty",
    }

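# Send the GET request, then parse the JSON body. json.loads turns the
# \uXXXX escape sequences back into the actual Chinese characters.
response = requests.request("GET", url, headers=headers, params=querystring)
data = json.loads(response.content)
print(data)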

Result
[Screenshot of the returned result]
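A related point, in case the parsed data needs to be written back out as JSON text (for example, to a file): json.dumps escapes non-ASCII characters by default, which would bring the \uXXXX form back. Passing ensure_ascii=False keeps the characters readable. The sketch below uses a small hand-made dict as a stand-in for the parsed response; it is not the real Weibo payload.

```python
import json

# Hypothetical stand-in for the parsed response, not the real payload.
data = {"title_sub": "研究证实喝咖啡能降低痛风风险"}

# Default behaviour: non-ASCII characters are escaped as \uXXXX again.
escaped = json.dumps(data)

# ensure_ascii=False keeps the Chinese characters as they are.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings parse back to the same dict, so the choice only affects how the text looks to a human reader.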
over~~~
