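After some thought, and despite a busy schedule, I decided to make time for a Python big-data analysis blog series on the current epidemic. It covers web crawling, visual analysis, GIS map display, sentiment analysis, public-opinion analysis, topic mining, threat-intelligence tracing, knowledge graphs, prediction and early warning, and applications of AI and NLP. I hope this online, remote-teaching series is helpful to you. Keep going!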


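The previous article covered real-time data crawling: fetching real-time data for the whole country and for each region of Guizhou Province, saving it locally, and plotting it with Matplotlib and Seaborn. This article uses PyEcharts to draw maps, line charts, and bar charts. I hope this visualization article is useful to you, and many thanks to the authors listed in the references for sharing their work! If there is something you would like to learn, or you have suggestions, feel free to leave the author a message.
Code download: https://github.com/eastmountyxz/Wuhan-data-analysis
CSDN download: https://download.csdn.net/download/Eastmount/12239638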
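The author's five other Python series are also recommended. Since 2014 the author has mainly written three Python series: fundamentals, web crawling, and data analysis. In 2018, columns on Python image recognition and Python artificial intelligence were added.
- Python fundamentals series: learning and improving Python basics
- Python web crawling series: Python crawlers with Selenium + BeautifulSoup + Requests
- Python data analysis series: knowledge graphs, web data mining, and NLP
- Python image recognition series: Python image processing and image recognition
- Python artificial intelligence series: Python AI and knowledge graphs in practice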
 

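Previous article:
[Python Big Data Analysis] 1. Real-time data crawling, and Matplotlib and Seaborn visual analysis of regions nationwide, cities within a province, and new-case trends
1. Data Crawling and Drawing Line Charts with PyEcharts
The previous article explained in detail how to crawl real-time data from Tencent News (TX). To make visual analysis and data analysis easier, readers are advised to store the data locally or in a database. Here the author gives the crawler code directly; it saves the daily growth figures to a local file.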
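For readers who prefer a database over a CSV file, the following is a minimal sketch using Python's built-in sqlite3 module. The table name `daily`, the helper `save_daily`, and the sample rows are all hypothetical and are not part of the author's code; the numbers are placeholders, not real figures.

```python
import sqlite3

# Hypothetical sample rows: (date, confirm, suspect, dead, heal).
# The numbers are placeholders, not real figures.
rows = [
    ("2020-01-13", 1, 0, 0, 0),
    ("2020-01-14", 2, 1, 0, 0),
    ("2020-01-15", 4, 3, 0, 1),
]

def save_daily(conn, rows):
    """Create the table if needed and store one row per date."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily ("
        "date TEXT PRIMARY KEY, confirm INTEGER, suspect INTEGER, "
        "dead INTEGER, heal INTEGER)"
    )
    # INSERT OR REPLACE keeps a single row per date when the crawler is re-run
    conn.executemany("INSERT OR REPLACE INTO daily VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")   # pass a file path instead to persist the data
save_daily(conn, rows)
save_daily(conn, rows)               # running twice does not duplicate rows
stored = conn.execute("SELECT date, confirm FROM daily ORDER BY date").fetchall()
print(stored)
```

Using the date as the primary key means that re-running the crawler on the same day overwrites rows instead of appending duplicates.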
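Target website for the crawler:
- https://news.qq.com/zt2020/page/feiyan.htm

Recommended reading:
- [Python Big Data Analysis] 1. Real-time data crawling, and Matplotlib and Seaborn visual analysis of regions nationwide, cities within a province, and new-case trends
- Python in practice: crawling real-time data and drawing a 2019-nCoV map - Xu
- Crawling data with Python and drawing a nationwide distribution map - shineych
- 2020 Python Developer Day: experience with implementing crawler frameworks and applying modules - Xu

Step 1: Analyze the website
Use the browser's "Inspect Element" tool to view the page source and the responses shown under "Network".
Step 2: Crawler code
By analyzing the URL, request method, parameters, and response format, we can obtain the JSON data. Note that a timestamp must be appended to the URL. The data is then saved to a local CSV file named after the current date.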
# -*- coding: utf-8 -*-
import time, json, requests
from datetime import datetime

#------------------------------------------------------------------------------
# Step 1: fetch the real-time JSON data
# Reference: Xu's blog https://blog.csdn.net/xufive/article/details/104093197
#------------------------------------------------------------------------------
def catch_daily():
    # The URL carries a millisecond timestamp
    url = 'https://view.inews.qq.com/g2/getOnsInfo?name=wuwei_ww_cn_day_counts&callback=&_=%d' % int(time.time()*1000)
    # The 'data' field is itself a JSON string, so it is decoded a second time
    data = json.loads(requests.get(url=url).json()['data'])
    data.sort(key=lambda x: x['date'])

    date_list = list()        # dates
    confirm_list = list()     # confirmed
    suspect_list = list()     # suspected
    dead_list = list()        # deaths
    heal_list = list()        # recovered
    for item in data:
        month, day = item['date'].split('/')
        date_list.append(datetime.strptime('2020-%s-%s' % (month, day), '%Y-%m-%d'))
        confirm_list.append(int(item['confirm']))
        suspect_list.append(int(item['suspect']))
        dead_list.append(int(item['dead']))
        heal_list.append(int(item['heal']))
    return date_list, confirm_list, suspect_list, dead_list, heal_list

#------------------------------------------------------------------------------
# Step 2: save the data to a CSV file
#------------------------------------------------------------------------------
def load_csv():
    # Fetch the data
    date_list, confirm_list, suspect_list, dead_list, heal_list = catch_daily()
    print(date_list)      # dates
    print(confirm_list)   # confirmed
    print(suspect_list)   # suspected
    print(dead_list)      # deaths
    print(heal_list)      # recovered

    # Name the file after the current date (e.g. 2020-02-13-daily.csv)
    n = time.strftime("%Y-%m-%d") + "-daily.csv"
    with open(n, 'w', encoding='utf-8') as fw:
        fw.write('date,confirm,suspect,dead,heal\n')
        for i in range(len(date_list)):
            date = date_list[i].strftime("%Y-%m-%d")
            fw.write(date + ',' + str(confirm_list[i]) + ',' + str(suspect_list[i]) + ',' + str(dead_list[i]) + ',' + str(heal_list[i]) + '\n')
    print("Over write file!")

# Main entry point
if __name__ == '__main__':
    load_csv()
 
The output is shown in the figure below: the data from January 13 to February 21 has been saved locally.
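The two details that matter most in the crawler, the timestamp in the URL and the doubly encoded JSON, can be tried offline. The sketch below assumes the response is a JSON object whose `data` field is a JSON string of daily records, which is what the crawler code above relies on. The helper names `build_url` and `parse_daily` and the sample payload are made up for illustration.

```python
import json
import time
from datetime import datetime

def build_url(name, now_ms=None):
    """Append a millisecond timestamp to the query string, as the crawler does."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return "https://view.inews.qq.com/g2/getOnsInfo?name=%s&callback=&_=%d" % (name, now_ms)

def parse_daily(response_text):
    """Decode the outer JSON, then the JSON string stored under 'data'."""
    outer = json.loads(response_text)
    records = json.loads(outer["data"])      # second decode: 'data' is a string
    records.sort(key=lambda r: r["date"])
    parsed = []
    for r in records:
        month, day = r["date"].split("/")
        date = datetime.strptime("2020-%s-%s" % (month, day), "%Y-%m-%d")
        parsed.append((date.strftime("%Y-%m-%d"), int(r["confirm"]), int(r["heal"])))
    return parsed

# Made-up sample that mimics the assumed response shape (placeholder numbers)
sample = json.dumps({"data": json.dumps([
    {"date": "01/14", "confirm": "2", "heal": "0"},
    {"date": "01/13", "confirm": "1", "heal": "0"},
])})

url = build_url("wuwei_ww_cn_day_counts", now_ms=1582300000000)
daily = parse_daily(sample)
print(url)
print(daily)
```

Passing `now_ms` explicitly makes the URL reproducible in tests; in real use it can be left out so the current time is used.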

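Step 3: Install the PyEcharts package
Readers who do front-end or website development have probably used the powerful ECharts library. ECharts is a pure JavaScript charting library that runs smoothly on PCs and mobile devices and is compatible with most current browsers. It is built on the lightweight Canvas library ZRender and provides intuitive, vivid, interactive, and highly customizable data visualization charts. ECharts offers the usual line, bar, scatter, pie, and candlestick (K-line) charts; box plots for statistics; maps, heatmaps, and line maps for geographic data; relation graphs and treemaps for relational data; parallel coordinates for multidimensional data; and funnel charts and gauges for BI. Charts can also be combined with one another.
The figure below is a simple ECharts line chart example, with the script on the left and the rendered chart on the right. It looks very good.
http://echarts.baidu.com/echarts2/doc/example/line1.html#helianthus

Since ECharts is so convenient, Python has a third-party package for it: PyEcharts, the library covered in this article. PyEcharts is a library for generating ECharts charts, that is, a bridge between ECharts and Python, and it is recommended to use it together with Django or Flask. Next we install the package with "pip install pyecharts" and then call PyEcharts to draw a line chart.

Recommended earlier articles by the author:
[Echarts Visualization] 1. Getting started: drawing simple maps of China and the Guizhou region
[Python Visualization] Installing pyecharts and drawing maps of China and Guizhou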
Step 4: Draw a line chart with PyEcharts
# -*- coding: utf-8 -*-
# By: Eastmount CSDN xiuzhang
import time
import pandas as pd
import pyecharts.options as opts
from pyecharts.charts import Line

#-------------------------------------------------------------------------------------
# Step 1: read the data
#-------------------------------------------------------------------------------------
n = time.strftime("%Y-%m-%d") + "-daily.csv"
data = pd.read_csv(n)
# .tolist() converts the pandas columns to native Python values
date_list = data['date'].tolist()
confirm_list = data['confirm'].tolist()
suspect_list = data['suspect'].tolist()
dead_list = data['dead'].tolist()
heal_list = data['heal'].tolist()
print(date_list)      # dates
print(confirm_list)   # confirmed
print(suspect_list)   # suspected
print(dead_list)      # deaths
print(heal_list)      # recovered

#-------------------------------------------------------------------------------------
# Step 2: draw the line chart
#-------------------------------------------------------------------------------------
line = (
    Line()
    .add_xaxis(date_list)
    .add_yaxis('Confirmed', confirm_list)
    .add_yaxis('Suspected', suspect_list, is_smooth=True)   # smoothed curve
    .add_yaxis('Deaths', dead_list)
    .add_yaxis('Recovered', heal_list)
    # Hide the numeric labels
    .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
    # Rotate the x-axis labels
    .set_global_opts(
        xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-30)),
        yaxis_opts=opts.AxisOpts(name='People', min_=3),
        title_opts=opts.TitleOpts(title='2019-nCoV curves'))
)
line.render('2019-nCoV-line.html')
 
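The output is shown in the figure below. The numeric labels on the points are hidden, because showing all of them makes the chart cluttered. Clicking a point still shows its exact value; for example, the nationwide total of recovered cases on February 21 was 20673.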

Step 5: Draw an area line chart with average lines
# -*- coding: utf-8 -*-
# By: Eastmount CSDN xiuzhang
import time
import pandas as pd
import pyecharts.options as opts
from pyecharts.charts import Line

#-------------------------------------------------------------------------------------
# Step 1: read the data
#-------------------------------------------------------------------------------------
n = time.strftime("%Y-%m-%d") + "-daily.csv"
data = pd.read_csv(n)
# .tolist() converts the pandas columns to native Python values
date_list = data['date'].tolist()
confirm_list = data['confirm'].tolist()
suspect_list = data['suspect'].tolist()
dead_list = data['dead'].tolist()
heal_list = data['heal'].tolist()
print(date_list)      # dates
print(confirm_list)   # confirmed
print(suspect_list)   # suspected
print(dead_list)      # deaths
print(heal_list)      # recovered

#-------------------------------------------------------------------------------------
# Step 2: draw the area line chart
#-------------------------------------------------------------------------------------
# The same average mark line is attached to every series
average_line = opts.MarkLineOpts(data=[opts.MarkLineItem(type_="average")])
line = (
    Line()
    .add_xaxis(date_list)
    .add_yaxis('Confirmed', confirm_list, is_smooth=True, markline_opts=average_line)
    .add_yaxis('Suspected', suspect_list, is_smooth=True, markline_opts=average_line)
    .add_yaxis('Deaths', dead_list, is_smooth=True, markline_opts=average_line)
    .add_yaxis('Recovered', heal_list, is_smooth=True, markline_opts=average_line)
    # Hide the numeric labels and fill the area under each curve
    .set_series_opts(
        areastyle_opts=opts.AreaStyleOpts(opacity=0.5),
        label_opts=opts.LabelOpts(is_show=False))
    # Rotate the x-axis labels
    .set_global_opts(
        xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-30)),
        yaxis_opts=opts.AxisOpts(name='People', min_=3),
        title_opts=opts.TitleOpts(title='2019-nCoV curves'))
)
line.render('2019-nCoV-line2.html')
 
The output is shown in the figure below, with the filled areas and average lines for the confirmed, suspected, death, and recovered series.

Step 6: Add maximum and minimum markers
# -*- coding: utf-8 -*-
# By: Eastmount CSDN xiuzhang
import time
import pandas as pd
import pyecharts.options as opts
from pyecharts.charts import Line

#-------------------------------------------------------------------------------------
# Step 1: read the data
#-------------------------------------------------------------------------------------
n = time.strftime("%Y-%m-%d") + "-daily.csv"
data = pd.read_csv(n)
# .tolist() converts the pandas columns to native Python values
date_list = data['date'].tolist()
confirm_list = data['confirm'].tolist()
suspect_list = data['suspect'].tolist()
dead_list = data['dead'].tolist()
heal_list = data['heal'].tolist()
print(date_list)      # dates
print(confirm_list)   # confirmed
print(suspect_list)   # suspected
print(dead_list)      # deaths
print(heal_list)      # recovered

#-------------------------------------------------------------------------------------
# Step 2: draw the area line chart
#-------------------------------------------------------------------------------------
# Average line plus maximum and minimum markers, shared by every series
average_line = opts.MarkLineOpts(data=[opts.MarkLineItem(type_="average")])
max_min_points = opts.MarkPointOpts(data=[opts.MarkPointItem(type_="max"),
                                          opts.MarkPointItem(type_="min")])
line = (
    Line()
    .add_xaxis(date_list)
    .add_yaxis('Confirmed', confirm_list, is_smooth=True,
               markline_opts=average_line, markpoint_opts=max_min_points)
    .add_yaxis('Suspected', suspect_list, is_smooth=True,
               markline_opts=average_line, markpoint_opts=max_min_points)
    .add_yaxis('Deaths', dead_list, is_smooth=True,
               markline_opts=average_line, markpoint_opts=max_min_points)
    .add_yaxis('Recovered', heal_list, is_smooth=True,
               markline_opts=average_line, markpoint_opts=max_min_points)
    # Hide the numeric labels and fill the area under each curve
    .set_series_opts(
        areastyle_opts=opts.AreaStyleOpts(opacity=0.5),
        label_opts=opts.LabelOpts(is_show=False))
    # Rotate the x-axis labels
    .set_global_opts(
        xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-30)),
        yaxis_opts=opts.AxisOpts(name='People', min_=3),
        title_opts=opts.TitleOpts(title='2019-nCoV curves'))
)
line.render('2019-nCoV-line3.html')
 
The output is shown in the figure below:
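To sanity-check what the chart marks, the same three statistics can be computed in plain Python. This is a sketch under the assumption that the "average" mark line sits at the arithmetic mean of a series and that the "max" and "min" mark points sit at its largest and smallest values. The function `summarize` and the sample series are hypothetical placeholders, not data from the article.

```python
from statistics import mean

# Placeholder series; in practice use date_list and confirm_list from the CSV file
dates = ["2020-02-17", "2020-02-18", "2020-02-19", "2020-02-20"]
values = [10, 40, 30, 20]

def summarize(dates, values):
    """Return the average plus the (date, value) pairs of the max and min."""
    hi = max(range(len(values)), key=values.__getitem__)   # index of the largest value
    lo = min(range(len(values)), key=values.__getitem__)   # index of the smallest value
    return {
        "average": mean(values),
        "max": (dates[hi], values[hi]),
        "min": (dates[lo], values[lo]),
    }

summary = summarize(dates, values)
print(summary)
```

Comparing these numbers with the values shown on the rendered chart is a quick way to confirm that the CSV columns were read in the expected order.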

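2. Drawing Nationwide Regional Data with PyEcharts
Step 1: Download the data
For details on how to obtain the nationwide data, please read the previous article.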
# -*- coding: utf-8 -*-
#------------------------------------------------------------------------------
# Step 1: fetch the data
#------------------------------------------------------------------------------
import time, json, requests

# Fetch the real-time JSON data
url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5&callback=&_=%d' % int(time.time()*1000)
data = json.loads(requests.get(url=url).json()['data'])
print(data)
print(data.keys())

# List the provinces (34 entries: 湖北 广东 河南 浙江 湖南 安徽 ...)
num = data['areaTree'][0]['children']
print(len(num))
for item in num:
    print(item['name'], end=" ")   # stay on the same line
print("\n")

# Show the data for Hubei Province (the first entry)
hubei = num[0]['children']
for item in hubei:
    print(item)
print("\n")

# Sum the confirmed cases of every city in each province
total_data = {}
for item in num:
    if item['name'] not in total_data:
        total_data.update({item['name']: 0})
    for city_data in item['children']:
        total_data[item['name']] += int(city_data['total']['confirm'])
print(total_data)
# {'湖北': 48206, '广东': 1241, '河南': 1169, '浙江': 1145, '湖南': 968, ..., '澳门': 10, '西藏': 1}

# Suspected cases
total_suspect_data = {}
for item in num:
    if item['name'] not in total_suspect_data:
        total_suspect_data.update({item['name']: 0})
    for city_data in item['children']:
        total_suspect_data[item['name']] += int(city_data['total']['suspect'])
print(total_suspect_data)

# Deaths
total_dead_data = {}
for item in num:
    if item['name'] not in total_dead_data:
        total_dead_data.update({item['name']: 0})
    for city_data in item['children']:
        total_dead_data[item['name']] += int(city_data['total']['dead'])
print(total_dead_data)

# Recovered cases
total_heal_data = {}
for item in num:
    if item['name'] not in total_heal_data:
        total_heal_data.update({item['name']: 0})
    for city_data in item['children']:
        total_heal_data[item['name']] += int(city_data['total']['heal'])
print(total_heal_data)

# Newly confirmed cases (taken from 'today' instead of 'total')
total_new_data = {}
for item in num:
    if item['name'] not in total_new_data:
        total_new_data.update({item['name']: 0})
    for city_data in item['children']:
        total_new_data[item['name']] += int(city_data['today']['confirm'])
print(total_new_data)

#------------------------------------------------------------------------------
# Step 2: save the data to a CSV file
#------------------------------------------------------------------------------
names = list(total_data.keys())            # province names
num1 = list(total_data.values())           # confirmed
num2 = list(total_suspect_data.values())   # suspected (all zeros)
num3 = list(total_dead_data.values())      # deaths
num4 = list(total_heal_data.values())      # recovered
num5 = list(total_new_data.values())       # newly confirmed
print(names)
print(num1)
print(num2)
print(num3)
print(num4)
print(num5)

# Name the file after the current date (e.g. 2020-02-13-china.csv)
n = time.strftime("%Y-%m-%d") + "-china.csv"
with open(n, 'w', encoding='utf-8') as fw:
    fw.write('province,confirm,dead,heal,new_confirm\n')
    for i in range(len(names)):
        fw.write(names[i] + ',' + str(num1[i]) + ',' + str(num3[i]) + ',' + str(num4[i]) + ',' + str(num5[i]) + '\n')
print("Over write file!")
 
The downloaded data is shown in the figure below:
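The per-province totals all follow the same pattern: walk the cities of a province and add up one field. A compact sketch of that pattern follows. The helper `sum_by_province` and the sample structure are hypothetical; the structure only imitates the shape of data['areaTree'][0]['children'] used above, and the names and numbers are placeholders.

```python
import csv
import io

# Made-up fragment shaped like data['areaTree'][0]['children'];
# the names and numbers are placeholders.
provinces = [
    {"name": "Province A", "children": [
        {"name": "City A1", "total": {"confirm": 5, "dead": 1, "heal": 2}, "today": {"confirm": 1}},
        {"name": "City A2", "total": {"confirm": 3, "dead": 0, "heal": 1}, "today": {"confirm": 0}},
    ]},
    {"name": "Province B", "children": [
        {"name": "City B1", "total": {"confirm": 2, "dead": 0, "heal": 2}, "today": {"confirm": 0}},
    ]},
]

def sum_by_province(provinces, period, field):
    """Add up one field ('total' or 'today') over the cities of each province."""
    return {
        prov["name"]: sum(int(city[period][field]) for city in prov["children"])
        for prov in provinces
    }

confirm = sum_by_province(provinces, "total", "confirm")
dead = sum_by_province(provinces, "total", "dead")
heal = sum_by_province(provinces, "total", "heal")
new_confirm = sum_by_province(provinces, "today", "confirm")

# csv.writer takes care of quoting, so a name containing a comma stays intact
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["province", "confirm", "dead", "heal", "new_confirm"])
for name in confirm:
    writer.writerow([name, confirm[name], dead[name], heal[name], new_confirm[name]])
csv_text = buffer.getvalue()
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch free of side effects; replacing `buffer` with an opened file gives the same CSV layout on disk.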

Step 2: Draw the map
# -*- coding: utf-8 -*-
import time
import pandas as pd
from pyecharts.charts import Map
import pyecharts.options as opts

#-------------------------------------------------------------------------------------
# Step 1: read the data
#-------------------------------------------------------------------------------------
n = time.strftime("%Y-%m-%d") + "-china.csv"
data = pd.read_csv(n)
# Build a list of (province, confirmed) pairs; list() makes it printable and reusable
list_data = list(zip(data['province'].tolist(), data['confirm'].tolist()))
print(list_data)
# [('湖北', 48206), ('广东', 1241), ('河南', 1169), ('浙江', 1145), ..., ('澳门', 10), ('西藏', 1)]

#-------------------------------------------------------------------------------------
# Step 2: draw the map of China
# Reference: https://blog.csdn.net/shineych/article/details/104231072 [shineych]
#-------------------------------------------------------------------------------------
def map_cn_disease_dis() -> Map:
    c = (
        Map()
        .add('China', list_data, 'china')
        .set_global_opts(
            title_opts=opts.TitleOpts(title='Map of China (confirmed cases)'),
            visualmap_opts=opts.VisualMapOpts(
                is_show=True,
                split_number=6,
                is_piecewise=True,   # use discrete pieces instead of a continuous range
                pos_top='center',
                pieces=[
                    {'min': 10000, 'color': '#7f1818'},   # no 'max': open-ended
                    {'min': 1000, 'max': 9999},
                    {'min': 500, 'max': 999},
                    {'min': 100, 'max': 499},
                    {'min': 10, 'max': 99},
                    {'min': 0, 'max': 9}],
            ),
        )
    )
    return c

map_cn_disease_dis().render('china-map.html')
 
The output is shown in the figure below:
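When the pieces of a piecewise visual map are written by hand, it is easy to leave a gap between two ranges. The sketch below checks a list of values against a piece list of the same shape as the one passed to VisualMapOpts. It assumes that 'min' and 'max' are inclusive bounds; the helpers `find_piece` and `uncovered` are hypothetical and are not part of PyEcharts.

```python
# A piece list in the same shape as the one passed to VisualMapOpts
pieces = [
    {"min": 10000},
    {"min": 1000, "max": 9999},
    {"min": 500, "max": 999},
    {"min": 100, "max": 499},
    {"min": 10, "max": 99},
    {"min": 0, "max": 9},
]

def find_piece(value, pieces):
    """Return the index of the first piece containing value, or None."""
    for i, piece in enumerate(pieces):
        lower = piece.get("min", float("-inf"))   # missing 'min' means no lower bound
        upper = piece.get("max", float("inf"))    # missing 'max' means no upper bound
        if lower <= value <= upper:
            return i
    return None

def uncovered(values, pieces):
    """List the values that fall into no piece at all."""
    return [v for v in values if find_piece(v, pieces) is None]

sample = [48206, 1241, 968, 10, 7, 1]
bins = [find_piece(v, pieces) for v in sample]
# A deliberately incomplete piece list, to show what a gap looks like
gaps = uncovered(sample, [{"min": 10, "max": 99}, {"min": 0, "max": 5}])
print(bins)
print(gaps)
```

Running `uncovered` on the real province totals before rendering gives an early warning if a region would be left without a color.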

3. Drawing Guizhou Province with PyEcharts
Step 1: Download the data
# -*- coding: utf-8 -*-
# By: Eastmount CSDN xiuzhang
#------------------------------------------------------------------------------
# Step 1: fetch the data
#------------------------------------------------------------------------------
import time, json, requests

# Fetch the real-time JSON data
url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5&callback=&_=%d' % int(time.time()*1000)
data = json.loads(requests.get(url=url).json()['data'])
print(data)
print(data.keys())

# List the provinces (34 entries: 湖北 广东 河南 浙江 湖南 安徽 ...)
num = data['areaTree'][0]['children']
print(len(num))

# Find the index of Guizhou (贵州)
k = 0
for item in num:
    print(item['name'], end=" ")   # stay on the same line
    if item['name'] == "贵州":
        print("")
        print(item['name'], k)
        break
    k = k + 1
print("")   # line break

# Show the data for Guizhou Province
gz = num[k]['children']
for item in gz:
    print(item)
print("\n")

#------------------------------------------------------------------------------
# Step 2: parse the data
#------------------------------------------------------------------------------
# Confirmed cases
total_data = {}
for item in gz:
    if item['name'] not in total_data:
        total_data.update({item['name']: 0})
    total_data[item['name']] = item['total']['confirm']
print('Confirmed cases')
print(total_data)
# {'贵阳': 33, '遵义': 25, '毕节': 22, '黔南州': 17, '六盘水': 10, '铜仁': 10, '黔东南州': 10, '黔西南州': 4, '安顺': 4}

# Suspected cases
total_suspect_data = {}
for item in gz:
    if item['name'] not in total_suspect_data:
        total_suspect_data.update({item['name']: 0})
    total_suspect_data[item['name']] = item['total']['suspect']
print('Suspected cases')
print(total_suspect_data)

# Deaths
total_dead_data = {}
for item in gz:
    if item['name'] not in total_dead_data:
        total_dead_data.update({item['name']: 0})
    total_dead_data[item['name']] = item['total']['dead']
print('Deaths')
print(total_dead_data)

# Recovered cases
total_heal_data = {}
for item in gz:
    if item['name'] not in total_heal_data:
        total_heal_data.update({item['name']: 0})
    total_heal_data[item['name']] = item['total']['heal']
print('Recovered cases')
print(total_heal_data)

# Newly confirmed cases (taken from 'today' instead of 'total')
total_new_data = {}
for item in gz:
    if item['name'] not in total_new_data:
        total_new_data.update({item['name']: 0})
    total_new_data[item['name']] = item['today']['confirm']
print('Newly confirmed cases')
print(total_new_data)

#------------------------------------------------------------------------------
# Step 3: save the data to a CSV file
#------------------------------------------------------------------------------
names = list(total_data.keys())            # city names
num1 = list(total_data.values())           # confirmed
num2 = list(total_suspect_data.values())   # suspected (all zeros)
num3 = list(total_dead_data.values())      # deaths
num4 = list(total_heal_data.values())      # recovered
num5 = list(total_new_data.values())       # newly confirmed
print(names)
print(num1)
print(num2)
print(num3)
print(num4)
print(num5)

# Name the file after the current date (e.g. 2020-02-13-gz.csv)
# The first column keeps the header 'province' although it holds city names,
# because the plotting scripts below read it under that name
n = time.strftime("%Y-%m-%d") + "-gz.csv"
with open(n, 'w', encoding='utf-8') as fw:
    fw.write('province,confirm,dead,heal,new_confirm\n')
    for i in range(len(names)):
        fw.write(names[i] + ',' + str(num1[i]) + ',' + str(num3[i]) + ',' + str(num4[i]) + ',' + str(num5[i]) + '\n')
print("Over write file!")
 

Step 2: Draw the map
# -*- coding: utf-8 -*-
import time
import pandas as pd
from pyecharts.charts import Map
import pyecharts.options as opts

#-------------------------------------------------------------------------------------
# Step 1: read the data
#-------------------------------------------------------------------------------------
n = time.strftime("%Y-%m-%d") + "-gz.csv"
data = pd.read_csv(n)
# Build a list of (city, confirmed) pairs with native Python values
gz_data = list(zip(data['province'].tolist(), data['confirm'].tolist()))
print(gz_data)
for a, b in gz_data:
    print(a, b, type(b))

#-------------------------------------------------------------------------------------
# Step 2: draw the map of Guizhou
#-------------------------------------------------------------------------------------
def map_gz_disease_dis() -> Map:
    c = (
        Map()
        .add('Guizhou Province', gz_data, '贵州')   # '贵州' selects the Guizhou map
        .set_series_opts(label_opts=opts.LabelOpts(is_show=True, formatter='{b}\n{c} cases'))
        .set_global_opts(
            title_opts=opts.TitleOpts(title='Map of Guizhou Province (confirmed cases)'),
            visualmap_opts=opts.VisualMapOpts(
                is_show=True,
                split_number=6,
                is_piecewise=True,   # use discrete pieces instead of a continuous range
                pos_top='center',
                pieces=[
                    {'min': 50},
                    {'min': 30, 'max': 49},
                    {'min': 20, 'max': 29},
                    {'min': 10, 'max': 19},
                    {'min': 1, 'max': 9},
                    {'value': 0, "label": 'No confirmed cases', "color": 'green'}],
            ),
        )
    )
    return c

map_gz_disease_dis().render('guizhou-map.html')
 
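The output is shown in the figure below:

Note that the map may be drawn without any data and show NaN, as in the figure below. This happens because the city names must be given in full, for example "贵阳市" or "黔东南苗族侗族自治州".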
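One way to supply the full names is a small lookup table applied before the data is handed to the map. The sketch below maps the short names that appear in the crawled data to the full official names of Guizhou's prefecture-level divisions. The dictionary `FULL_NAMES` and the helper `to_full_names` are hypothetical, and whether the map expects exactly these strings is an assumption that should be checked against the rendered result.

```python
# Short names as they appear in the crawled data, mapped to full official names.
# Whether the map expects exactly these strings is an assumption to verify.
FULL_NAMES = {
    "贵阳": "贵阳市",
    "遵义": "遵义市",
    "毕节": "毕节市",
    "六盘水": "六盘水市",
    "铜仁": "铜仁市",
    "安顺": "安顺市",
    "黔南州": "黔南布依族苗族自治州",
    "黔东南州": "黔东南苗族侗族自治州",
    "黔西南州": "黔西南布依族苗族自治州",
}

def to_full_names(pairs):
    """Replace short city names; unknown names pass through unchanged."""
    return [(FULL_NAMES.get(name, name), value) for name, value in pairs]

# Sample pairs in the same (name, confirmed) shape as gz_data above
gz_data = [("贵阳", 33), ("黔东南州", 10), ("安顺", 4)]
fixed = to_full_names(gz_data)
print(fixed)
```

Names that are missing from the table are left untouched, so a new or misspelled entry shows up as NaN on the map instead of raising an error.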


Step 3: Draw a bar chart
# -*- coding: utf-8 -*-
import time
import pandas as pd
from pyecharts.charts import Bar
import pyecharts.options as opts

#-------------------------------------------------------------------------------------
# Step 1: read the data
#-------------------------------------------------------------------------------------
n = time.strftime("%Y-%m-%d") + "-gz.csv"
data = pd.read_csv(n)
# .tolist() converts the pandas columns to native Python values
province_list = data['province'].tolist()
confirm_list = data['confirm'].tolist()
dead_list = data['dead'].tolist()
heal_list = data['heal'].tolist()
new_confirm_list = data['new_confirm'].tolist()
print(province_list)      # regions
print(confirm_list)       # confirmed
print(dead_list)          # deaths
print(heal_list)          # recovered
print(new_confirm_list)   # newly confirmed

#-------------------------------------------------------------------------------------
# Step 2: draw the Guizhou bar chart
#-------------------------------------------------------------------------------------
bar = (
    Bar()
    .add_xaxis(province_list)
    .add_yaxis("Confirmed", confirm_list)
    .add_yaxis("Deaths", dead_list)
    .add_yaxis("Recovered", heal_list)
    .add_yaxis("Newly confirmed", new_confirm_list)
    .set_global_opts(title_opts=opts.TitleOpts(title="Guizhou data", subtitle="People"))
)
bar.render("guizhou-bar.html")
 
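The output is shown in the figure below:

4. Drawing Other Charts with PyEcharts
For other chart types, readers are encouraged to practice with their own projects or papers; they are not covered in detail here, although the author may share more on them later. For example, we can apply some simple preprocessing to the data and then visualize it.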
# -*- coding: utf-8 -*-
import time
import pandas as pd
from pyecharts import options as opts
from pyecharts.charts import Bar

#-------------------------------------------------------------------------------------
# Step 1: read the data
#-------------------------------------------------------------------------------------
# Preprocessed data file (see the note on preprocessing above)
n = time.strftime("%Y-%m-%d") + "-china-bj.csv"
data = pd.read_csv(n)
# .tolist() converts the pandas columns to native Python values
province_list = data['province'].tolist()
confirm_list = data['confirm'].tolist()
dead_list = data['dead'].tolist()
heal_list = data['heal'].tolist()
new_confirm_list = data['new_confirm'].tolist()
print(province_list)      # regions
print(confirm_list)       # confirmed
print(dead_list)          # deaths
print(heal_list)          # recovered
print(new_confirm_list)   # newly confirmed

#-------------------------------------------------------------------------------------
# Step 2: draw the nationwide bar chart
#-------------------------------------------------------------------------------------
bar = (
    Bar()
    .add_xaxis(province_list)
    .add_yaxis("Confirmed", confirm_list)
    .add_yaxis("Deaths", dead_list)
    .add_yaxis("Recovered", heal_list)
    .add_yaxis("Newly confirmed", new_confirm_list)
    .set_global_opts(title_opts=opts.TitleOpts(title="Nationwide data", subtitle="People"))
)
bar.render("china-bar.html")
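The article does not show how the preprocessed file was produced, so the following is only one possible preprocessing step, offered as a sketch: sort the provinces by confirmed cases and keep the top few, so that the bar chart stays readable. The helper `top_provinces` is hypothetical, the sample keeps only two columns for brevity, and the figures are the ones printed in the example output earlier in this article.

```python
import csv
import io

# Two-column sample; the figures come from the example output printed earlier
raw = "province,confirm\n西藏,1\n湖北,48206\n澳门,10\n广东,1241\n"

def top_provinces(csv_text, n):
    """Return the n provinces with the most confirmed cases, largest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["confirm"] = int(row["confirm"])   # CSV values are read as strings
    rows.sort(key=lambda r: r["confirm"], reverse=True)
    return [(r["province"], r["confirm"]) for r in rows[:n]]

top3 = top_provinces(raw, 3)
print(top3)
```

The returned pairs can be split into an x-axis list and a y-axis list and passed to a bar chart in the same way as province_list and confirm_list.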
 

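5. Summary
That concludes the second article in this epidemic analysis series; I hope it is helpful to you. It covers two main parts:
- Real-time data crawling
- Visual analysis with PyEcharts
  - Maps of China and of Guizhou Province
  - Bar charts and line charts

Later articles will cover GIS map display, sentiment analysis, public-opinion analysis, topic mining, threat-intelligence tracing, knowledge graphs, prediction and early warning, and applications of AI and NLP. If this article helped you, that is my greatest motivation for writing. The source code has been uploaded to GitHub and can be downloaded directly.

Finally, a salute to Academician Zhong and to everyone working on the front line. The greatest heroes are those who serve their country and their people. The highest aspiration of a Chinese person's life is to establish a heart for heaven and earth, to secure a life for the people, to carry on the lost teachings of past sages, and to open an era of peace for all generations. They carry the well-being of the people on their own shoulders and put themselves at risk to keep the greater cause safe, and they have truly lived up to it. Stay strong, Wuhan! Stay strong, China!

(By: Eastmount, 2020-02-22, 10 pm, in Guiyang, http://blog.csdn.net/eastmount/)
References:
[1] [Python Visualization] Installing pyecharts and drawing maps of China and Guizhou - Yang Xiuzhang
[2] [Echarts Visualization] 1. Getting started: drawing simple maps of China and the Guizhou region - Yang Xiuzhang
[3] https://news.qq.com/zt2020/page/feiyan.htm
[4] [Python Big Data Analysis] 1. Real-time data crawling, and Matplotlib and Seaborn visual analysis of regions nationwide, cities within a province, and new-case trends
[5] Python in practice: crawling real-time data and drawing a 2019-nCoV map - Xu
[6] Crawling data with Python and drawing a nationwide distribution map - shineych
[7] 2020 Python Developer Day: experience with implementing crawler frameworks and applying modules - Xu
[8] Drawing charts with Python pyecharts v1.x (2): line charts, area line charts, scatter charts, radar charts, box plots, word clouds - shineych
[9] pyecharts v1 study notes: line charts and area charts - baili-luoyun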










