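All code blocks were run and debugged in a Jupyter Notebook, and later blocks depend on earlier ones.
 The detailed parameters of every function used here can be looked up in the scikit-learn API documentation. I only note what each line of code does; readers are encouraged to experiment with the parameters themselves. Get hands-on!
1. Introduction
Linear regression is a regression method that outputs a continuous value, whereas logistic regression is a classification method that separates samples into two classes.
 Logistic regression uses the sigmoid function to map the computed value to a probability between 0 and 1, and classifies a sample by comparing that probability with a threshold.
Imports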
import numpy as np
import os
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12
import warnings
warnings.filterwarnings('ignore')
np.random.seed(42)
2. The sigmoid function
t = np.linspace(-10, 10, 100)
sig = 1 / (1 + np.exp(-t))
plt.figure(figsize=(9, 3))
plt.plot([-10, 10], [0, 0], "k-")
plt.plot([-10, 10], [0.5, 0.5], "k:")
plt.plot([-10, 10], [1, 1], "k:")
plt.plot([0, 0], [-1.1, 1.1], "k-")
plt.plot(t, sig, "b-", linewidth=2, label=r"$\sigma(t) = \frac{1}{1 + e^{-t}}$")
plt.xlabel("t")
plt.legend(loc="upper left", fontsize=20)
plt.axis([-10, 10, -0.1, 1.1])
plt.title('Figure 4-21. Logistic function')
plt.show()

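 The curve corresponds to the formula shown in the legend, σ(t) = 1 / (1 + e^(-t)), which maps any real t into the interval (0, 1).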
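For reference, here is a short sketch of how the sigmoid is typically used in logistic regression, together with its derivative. The notation ($\theta$ for the parameter vector, $x$ for a feature vector, $\hat{p}$ for the predicted probability) is assumed here and is not taken from the original post:

```latex
\hat{p} = \sigma(\theta^{\top} x), \qquad
\hat{y} =
\begin{cases}
1 & \text{if } \hat{p} \ge 0.5 \\
0 & \text{otherwise}
\end{cases}

\sigma'(t) = \frac{e^{-t}}{(1 + e^{-t})^{2}} = \sigma(t)\,\bigl(1 - \sigma(t)\bigr)
```

Since $\sigma(0) = 0.5$, the threshold $\hat{p} \ge 0.5$ is equivalent to $\theta^{\top} x \ge 0$.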
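3. The iris dataset
The iris dataset ships with scikit-learn. It contains three classes of iris flowers, and each sample has four features.
 Logistic regression solves binary classification. For a three-class problem, treat the other two classes as a single "other" class and run binary classification three times, once per class.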
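The "three binary classifiers" idea can be sketched roughly as follows. This is a minimal illustration, not the code used later in this post: the names `binary_models`, `scores` and `y_pred` are made up for the example, all four features are used, and `max_iter=1000` is only set to avoid convergence warnings.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X, y = iris["data"], iris["target"]

# Train one binary classifier per class: "this class" (1) vs. "everything else" (0).
binary_models = []
for k in range(3):
    y_k = (y == k).astype(int)
    model = LogisticRegression(max_iter=1000)
    model.fit(X, y_k)
    binary_models.append(model)

# For each sample, collect P(class k) from the k-th binary model,
# then pick the class whose model is most confident.
scores = np.column_stack([m.predict_proba(X)[:, 1] for m in binary_models])
y_pred = scores.argmax(axis=1)
accuracy = (y_pred == y).mean()
print(scores.shape, accuracy)
```

Each column of `scores` comes from a different model, so the rows do not have to sum to 1; only their relative order is used.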
Ⅰ. Load the iris dataset
from sklearn import datasets
iris = datasets.load_iris()
list(iris.keys())  # list the available attributes; here we mainly use data (X) and target (y)
"""
['data',
 'target',
 'frame',
 'target_names',
 'DESCR',
 'feature_names',
 'filename']
"""
Ⅱ. View the detailed description of the iris dataset
print(iris.DESCR)  # print the full description of the iris dataset
"""
.. _iris_dataset:
Iris plants dataset
--------------------
**Data Set Characteristics:**
    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
                
    :Summary Statistics:
    ============== ==== ==== ======= ===== ====================
                    Min  Max   Mean    SD   Class Correlation
    ============== ==== ==== ======= ===== ====================
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
    ============== ==== ==== ======= ===== ====================
    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
    :Date: July, 1988
The famous Iris database, first used by Sir R.A. Fisher. The dataset is taken
from Fisher's paper. Note that it's the same as in R, but not as in the UCI
Machine Learning Repository, which has two wrong data points.
This is perhaps the best known database to be found in the
pattern recognition literature.  Fisher's paper is a classic in the field and
is referenced frequently to this day.  (See Duda & Hart, for example.)  The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant.  One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.
.. topic:: References
   - Fisher, R.A. "The use of multiple measurements in taxonomic problems"
     Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
     Mathematical Statistics" (John Wiley, NY, 1950).
   - Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis.
     (Q327.D83) John Wiley & Sons.  ISBN 0-471-22361-1.  See page 218.
   - Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
     Structure and Classification Rule for Recognition in Partially Exposed
     Environments".  IEEE Transactions on Pattern Analysis and Machine
     Intelligence, Vol. PAMI-2, No. 1, 67-71.
   - Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule".  IEEE Transactions
     on Information Theory, May 1972, 431-433.
   - See also: 1988 MLC Proceedings, 54-64.  Cheeseman et al"s AUTOCLASS II
     conceptual clustering system finds 3 classes in the data.
   - Many, many more ...
"""
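Ⅲ. Pick one class and group the other two together
X = iris['data'][:, 3:]  # use a single feature: petal width
y = (iris['target'] == 2).astype(int)  # Iris-Virginica -> 1, the other two classes -> 0 (np.int was removed in NumPy 1.24, so use int)
y  # the leading 0s are the other two classes; the trailing 1s are Iris-Virginica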
"""
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
"""
Ⅳ. Train the model and visualize
① Model training
from sklearn.linear_model import LogisticRegression  # import the logistic regression class
log_res = LogisticRegression()  # instantiate
log_res.fit(X, y)  # fit the model on the data
"""
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='warn',
          n_jobs=None, penalty='l2', random_state=None, solver='warn',
          tol=0.0001, verbose=0, warm_start=False)
"""
X_new = np.linspace(0, 3, 1000).reshape(-1, 1)  # 1000 evenly spaced values from 0 to 3
y_proba = log_res.predict_proba(X_new)  # predicted probabilities
y_proba  # one row per sample: column 0 is P(not Iris-Virginica), column 1 is P(Iris-Virginica)
"""
array([[0.98554411, 0.01445589],
       [0.98543168, 0.01456832],
       [0.98531838, 0.01468162],
       ...,
       [0.02618938, 0.97381062],
       [0.02598963, 0.97401037],
       [0.02579136, 0.97420864]])
"""
② Visualization
plt.figure(figsize=(12, 5))  # one figure for the whole plot
decision_boundary = X_new[y_proba[:, 1] >= 0.5][0, 0]  # first x where P(Iris-Virginica) >= 0.5, taken as a scalar
plt.plot([decision_boundary, decision_boundary], [-1, 2], 'k:', linewidth=2)  # vertical line at the boundary
plt.plot(X_new, y_proba[:, 1], 'g-', label='Iris-Virginica')  # probability of being Iris-Virginica
plt.plot(X_new, y_proba[:, 0], 'b--', label='Not Iris-Virginica')  # probability of not being Iris-Virginica
plt.arrow(decision_boundary, 0.08, -0.3, 0, head_width=0.05, head_length=0.1, fc='b', ec='b')  # arrow toward the "not Virginica" side
plt.arrow(decision_boundary, 0.92, 0.3, 0, head_width=0.05, head_length=0.1, fc='g', ec='g')  # arrow toward the "Virginica" side
plt.text(decision_boundary + 0.02, 0.15, 'Decision Boundary', fontsize=16, color='k', ha='center')  # label the decision boundary
plt.xlabel('Petal width (cm)', fontsize=16)  # x-axis: petal width
plt.ylabel('y_proba', fontsize=16)  # y-axis: predicted probability
plt.axis([0, 3, -0.02, 1.02])  # x and y axis ranges
plt.legend(loc='center left', fontsize=16)  # legend at center left
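Both the column order of `predict_proba` and the position of the decision boundary can be checked directly on a fitted model. The sketch below refits the same one-feature model under a different name (`clf`, chosen for this example) so that it runs on its own; the boundary is computed from the fitted coefficient and intercept rather than read off the grid.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, 3:]                 # petal width only
y = (iris["target"] == 2).astype(int)   # 1 = Iris-Virginica, 0 = other

clf = LogisticRegression().fit(X, y)

# classes_ gives the column order of predict_proba: column i is P(class classes_[i]).
print(clf.classes_)

# The model predicts P(y=1) = sigmoid(w * x + b), so P = 0.5 exactly where w * x + b = 0.
w = clf.coef_[0, 0]
b = clf.intercept_[0]
boundary = -b / w
p_at_boundary = clf.predict_proba([[boundary]])[0, 1]
print(boundary, p_at_boundary)
```

The grid-based value in the plot is the first grid point at or above this analytic boundary, so the two should agree up to the grid spacing.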

 For unfamiliar functions, such as arrow for drawing arrows, consult the built-in help.
print (help(plt.arrow))
"""
Help on function arrow in module matplotlib.pyplot:
arrow(x, y, dx, dy, **kwargs)
    Add an arrow to the axes.
    
    This draws an arrow from ``(x, y)`` to ``(x+dx, y+dy)``.
    
    Parameters
    ----------
    x, y : float
        The x/y-coordinate of the arrow base.
    dx, dy : float
        The length of the arrow along x/y-direction.
    
    Returns
    -------
    arrow : `.FancyArrow`
        The created `.FancyArrow` object.
    
    Other Parameters
    ----------------
    **kwargs
        Optional kwargs (inherited from `.FancyArrow` patch) control the
        arrow construction and properties:
    
    Constructor arguments
      *width*: float (default: 0.001)
        width of full arrow tail
    
      *length_includes_head*: bool (default: False)
        True if head is to be counted in calculating the length.
    
      *head_width*: float or None (default: 3*width)
        total width of the full arrow head
    
      *head_length*: float or None (default: 1.5 * head_width)
        length of arrow head
    
      *shape*: ['full', 'left', 'right'] (default: 'full')
        draw the left-half, right-half, or full arrow
    
      *overhang*: float (default: 0)
        fraction that the arrow is swept back (0 overhang means
        triangular shape). Can be negative or greater than one.
    
      *head_starts_at_zero*: bool (default: False)
        if True, the head starts being drawn at coordinate 0
        instead of ending at coordinate 0.
    
    Other valid kwargs (inherited from :class:`Patch`) are:
      agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array 
      alpha: float or None
      animated: bool
      antialiased: unknown
      capstyle: {'butt', 'round', 'projecting'}
      clip_box: `.Bbox`
      clip_on: bool
      clip_path: [(`~matplotlib.path.Path`, `.Transform`) | `.Patch` | None] 
      color: color
      contains: callable
      edgecolor: color or None or 'auto'
      facecolor: color or None
      figure: `.Figure`
      fill: bool
      gid: str
      hatch: {'/', '\\', '|', '-', '+', 'x', 'o', 'O', '.', '*'}
      in_layout: bool
      joinstyle: {'miter', 'round', 'bevel'}
      label: object
      linestyle: {'-', '--', '-.', ':', '', (offset, on-off-seq), ...}
      linewidth: float or None for default 
      path_effects: `.AbstractPathEffect`
      picker: None or bool or float or callable
      rasterized: bool or None
      sketch_params: (scale: float, length: float, randomness: float) 
      snap: bool or None
      transform: `.Transform`
      url: str
      visible: bool
      zorder: float
    
    Notes
    -----
    The resulting arrow is affected by the axes aspect ratio and limits.
    This may produce an arrow whose head is not square with its stem. To
    create an arrow whose head is square with its stem,
    use :meth:`annotate` for example:
    
    >>> ax.annotate("", xy=(0.5, 0.5), xytext=(0, 0),
    ...             arrowprops=dict(arrowstyle="->"))
None
"""
Ⅴ. Cartesian product (grid) example
x0,x1 = np.meshgrid(np.linspace(1,2,2).reshape(-1,1),np.linspace(10,20,3).reshape(-1,1))  # Cartesian product of the two axes
x0
"""
array([[1., 2.],
       [1., 2.],
       [1., 2.]])
"""
x1
"""
array([[10., 10.],
       [15., 15.],
       [20., 20.]])
"""
np.c_[x0.ravel(),x1.ravel()]  # flatten both grids and stack them as two columns
"""
array([[ 1., 10.],
       [ 2., 10.],
       [ 1., 15.],
       [ 2., 15.],
       [ 1., 20.],
       [ 2., 20.]])
"""
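As the output shows, the Cartesian product works like laying out a grid: np.linspace(1, 2, 2) takes 2 values between 1 and 2 for the x0 axis, and np.linspace(10, 20, 3) takes 3 values between 10 and 20 for the x1 axis.
 After concatenation we get every combination of these values, one pair per row.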
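To make the ordering of the rows explicit, the same pairs can be built with a plain double loop and compared with the `meshgrid` result. This is only an illustrative check; `xs`, `ys`, `grid` and `pairs` are names chosen for the example.

```python
import itertools
import numpy as np

xs = np.linspace(1, 2, 2)     # [1., 2.]
ys = np.linspace(10, 20, 3)   # [10., 15., 20.]

x0, x1 = np.meshgrid(xs, ys)
grid = np.c_[x0.ravel(), x1.ravel()]

# Same pairs built with a double loop: ys is the outer loop, xs the inner one.
pairs = np.array([(x, y) for y, x in itertools.product(ys, xs)])

print(grid.shape, np.array_equal(grid, pairs))
```

The first column cycles fastest, which is why reshaping a prediction vector back to `x0.shape` lines up with the grid.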
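Ⅵ. Classification decision boundary
X = iris['data'][:, (2, 3)]  # switch to two features: petal length and petal width
log_res.fit(X, y)  # refit on the two-feature data so the model matches the 2-D grid below
X[:,0].min(),X[:,0].max()  # range of feature 0 (petal length), used to choose the plot limits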
"""
(1.0, 6.9)
"""
X[:,1].min(),X[:,1].max()  # range of feature 1 (petal width), used to choose the plot limits
"""
(0.1, 2.5)
"""
x0,x1 = np.meshgrid(np.linspace(2.9,7,500).reshape(-1,1),np.linspace(0.8,2.7,200).reshape(-1,1))
X_new = np.c_[x0.ravel(),x1.ravel()]  # flatten both grids and stack them as two columns
X_new  # the grid points used as test data
"""
array([[2.9       , 0.8       ],
       [2.90821643, 0.8       ],
       [2.91643287, 0.8       ],
       ...,
       [6.98356713, 2.7       ],
       [6.99178357, 2.7       ],
       [7.        , 2.7       ]])
"""
X_new.shape#100000=500*200
"""
(100000, 2)
"""
y_proba = log_res.predict_proba(X_new)  # predicted probabilities for every grid point
x0.shape  # the z values plotted later must have this same shape
"""
(200, 500)
"""
x1.shape  # the z values plotted later must have this same shape
"""
(200, 500)
"""
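plt.figure(figsize=(10,4))  # figure size
plt.plot(X[y==0,0],X[y==0,1],'bs')  # samples that are not Iris-Virginica
plt.plot(X[y==1,0],X[y==1,1],'g^')  # Iris-Virginica samples
zz = y_proba[:,1].reshape(x0.shape)  # P(Iris-Virginica) reshaped to the grid, used as the z values
contour = plt.contour(x0,x1,zz,cmap=plt.cm.brg)  # contour lines of the probability
plt.clabel(contour,inline = 1)  # label each contour line with its probability
plt.axis([2.9,7,0.8,2.7])  # x and y axis ranges
plt.text(3.5,1.5,'NOT Vir',fontsize = 16,color = 'b')  # region label
plt.text(6.5,2.3,'Vir',fontsize = 16,color = 'g')  # region label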

4. Softmax
How can we classify multiple classes directly? This is where softmax comes in.
 
 Softmax first exponentiates the scores, which widens the gaps between them, and then normalizes the results so that they sum to 1.
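Written out, the softmax function has the standard form below. The symbols are assumed for this sketch: $s_k$ is the score the model computes for class $k$, $K$ is the number of classes, and $\hat{p}_k$ is the resulting probability.

```latex
\hat{p}_k = \frac{e^{s_k}}{\sum_{j=1}^{K} e^{s_j}}, \qquad k = 1, \dots, K
```

Every $\hat{p}_k$ is positive and the $K$ values sum to 1, so they can be read as class probabilities.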
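 The loss function is the negative logarithm of the probability assigned to the true class. That probability lies between 0 and 1, so recall the shape of the log function on that interval: a probability near 0 gives a very large loss, and a probability near 1 gives a loss near 0.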
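A small numeric sketch of both steps, using NumPy only. The scores below are hypothetical values picked for illustration, and `softmax` is a helper written for this example (subtracting the maximum is a common trick to avoid overflow and does not change the result).

```python
import numpy as np

def softmax(scores):
    """Exponentiate the scores, then normalize so they sum to 1."""
    shifted = scores - np.max(scores)   # subtracting the max avoids overflow
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

scores = np.array([1.0, 2.0, 4.0])   # hypothetical class scores
probs = softmax(scores)

# Loss for a single sample: -log of the probability assigned to the true class.
loss_if_class_2 = -np.log(probs[2])   # true class received the largest probability
loss_if_class_0 = -np.log(probs[0])   # true class received the smallest probability
print(probs, loss_if_class_2, loss_if_class_0)
```

The raw scores differ by a factor of 4 between the smallest and largest, while the probabilities differ by a factor of e^3, which is the "widening the gaps" effect described above.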
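X = iris['data'][:,(2,3)]  # features: petal length and petal width
y = iris['target']  # labels: all three classes
softmax_reg = LogisticRegression(multi_class = 'multinomial',solver='lbfgs')  # multinomial (softmax) mode with the lbfgs solver; recent scikit-learn versions deprecate multi_class and already use multinomial by default here
softmax_reg.fit(X,y)  # train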
"""
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='multinomial',
          n_jobs=None, penalty='l2', random_state=None, solver='lbfgs',
          tol=0.0001, verbose=0, warm_start=False)
"""
softmax_reg.predict([[5,2]])  # the input must be 2-D: one sample with two features
"""
array([2])
"""
softmax_reg.predict_proba([[5,2]])  # one probability per class; three values confirm this is a three-class task
"""
array([[2.43559894e-04, 2.14859516e-01, 7.84896924e-01]])
"""
# Plot the predicted regions and the probability contours
x0, x1 = np.meshgrid(
        np.linspace(0, 8, 500).reshape(-1, 1),
        np.linspace(0, 3.5, 200).reshape(-1, 1),
    )
X_new = np.c_[x0.ravel(), x1.ravel()]
y_proba = softmax_reg.predict_proba(X_new)
y_predict = softmax_reg.predict(X_new)
zz1 = y_proba[:, 1].reshape(x0.shape)
zz = y_predict.reshape(x0.shape)
plt.figure(figsize=(10, 4))
plt.plot(X[y==2, 0], X[y==2, 1], "g^", label="Iris-Virginica")
plt.plot(X[y==1, 0], X[y==1, 1], "bs", label="Iris-Versicolor")
plt.plot(X[y==0, 0], X[y==0, 1], "yo", label="Iris-Setosa")
from matplotlib.colors import ListedColormap
custom_cmap = ListedColormap(['#fafab0','#9898ff','#a0faa0'])
plt.contourf(x0, x1, zz, cmap=custom_cmap)
contour = plt.contour(x0, x1, zz1, cmap=plt.cm.brg)
plt.clabel(contour, inline=1, fontsize=12)
plt.xlabel("Petal length", fontsize=14)
plt.ylabel("Petal width", fontsize=14)
plt.legend(loc="center left", fontsize=14)
plt.axis([0, 7, 0, 3.5])
plt.show()
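As a check on how the softmax model produces its probabilities, the output of `predict_proba` can be reproduced by applying softmax to the raw class scores from `decision_function`. This sketch refits the model under the name `clf` so that it runs on its own, and it assumes a scikit-learn version in which `LogisticRegression` with the `lbfgs` solver fits a multinomial model by default for three classes; `max_iter=1000` is only there to avoid convergence warnings.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, [2, 3]]   # petal length, petal width
y = iris["target"]

clf = LogisticRegression(solver="lbfgs", max_iter=1000).fit(X, y)

sample = np.array([[5.0, 2.0]])
scores = clf.decision_function(sample)   # one raw score per class

# Softmax by hand: exponentiate (after subtracting the row max), then normalize.
exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
manual_proba = exp_scores / exp_scores.sum(axis=1, keepdims=True)

print(manual_proba)
print(clf.predict_proba(sample))
```

The predicted class is simply the column with the largest probability.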











