
ML with sklearn: An introduction to the LogisticRegression function in sklearn.linear_model and a detailed guide to its usage


Introduction to the LogisticRegression function in sklearn.linear_model and how to use it


# class LogisticRegression -- found at: sklearn.linear_model._logistic
class LogisticRegression(BaseEstimator, LinearClassifierMixin, SparseCoefMixin):
    """
    Logistic Regression (aka logit, MaxEnt) classifier.

    In the multiclass case, the training algorithm uses the one-vs-rest (OvR)
    scheme if the 'multi_class' option is set to 'ovr', and uses the
    cross-entropy loss if the 'multi_class' option is set to 'multinomial'.
    (Currently the 'multinomial' option is supported only by the 'lbfgs',
    'sag', 'saga' and 'newton-cg' solvers.)

    This class implements regularized logistic regression using the
    'liblinear' library, 'newton-cg', 'sag', 'saga' and 'lbfgs' solvers.
    **Note that regularization is applied by default**. It can handle both
    dense and sparse input. Use C-ordered arrays or CSR matrices containing
    64-bit floats for optimal performance; any other input format will be
    converted (and copied).

    The 'newton-cg', 'sag', and 'lbfgs' solvers support only L2
    regularization with primal formulation, or no regularization. The
    'liblinear' solver supports both L1 and L2 regularization, with a dual
    formulation only for the L2 penalty. The Elastic-Net regularization is
    only supported by the 'saga' solver.

    Read more in the :ref:`User Guide <logistic_regression>`.

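To make the solver/penalty pairings above concrete, here is a minimal sketch; the synthetic dataset and the specific hyperparameter values (max_iter, l1_ratio) are illustrative assumptions, not part of the original docstring.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# L2 (the default) is accepted by every solver, e.g. 'lbfgs'.
LogisticRegression(penalty='l2', solver='lbfgs').fit(X, y)

# L1 requires 'liblinear' or 'saga'.
LogisticRegression(penalty='l1', solver='saga', max_iter=5000).fit(X, y)

# Elastic-Net is only supported by 'saga' and needs l1_ratio.
LogisticRegression(penalty='elasticnet', solver='saga',
                   l1_ratio=0.5, max_iter=5000).fit(X, y)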

    Parameters
    ----------
    penalty : {'l1', 'l2', 'elasticnet', 'none'}, default='l2'
        Used to specify the norm used in the penalization. The 'newton-cg',
        'sag' and 'lbfgs' solvers support only l2 penalties. 'elasticnet' is
        only supported by the 'saga' solver. If 'none' (not supported by the
        liblinear solver), no regularization is applied.

        .. versionadded:: 0.19
           l1 penalty with SAGA solver (allowing 'multinomial' + L1)

    dual : bool, default=False
        Dual or primal formulation. Dual formulation is only implemented for
        l2 penalty with liblinear solver. Prefer dual=False when
        n_samples > n_features.

    tol : float, default=1e-4
        Tolerance for stopping criteria.

    C : float, default=1.0
        Inverse of regularization strength; must be a positive float. Like in
        support vector machines, smaller values specify stronger
        regularization.

    fit_intercept : bool, default=True
        Specifies if a constant (a.k.a. bias or intercept) should be added to
        the decision function.

    intercept_scaling : float, default=1
        Useful only when the solver 'liblinear' is used and
        self.fit_intercept is set to True. In this case, x becomes
        [x, self.intercept_scaling], i.e. a "synthetic" feature with constant
        value equal to intercept_scaling is appended to the instance vector.
        The intercept becomes ``intercept_scaling * synthetic_feature_weight``.

        Note! the synthetic feature weight is subject to l1/l2 regularization
        as all other features. To lessen the effect of regularization on the
        synthetic feature weight (and therefore on the intercept)
        intercept_scaling has to be increased.

    class_weight : dict or 'balanced', default=None
        Weights associated with classes in the form ``{class_label: weight}``.
        If not given, all classes are supposed to have weight one.

        The "balanced" mode uses the values of y to automatically adjust
        weights inversely proportional to class frequencies in the input data
        as ``n_samples / (n_classes * np.bincount(y))``.

        Note that these weights will be multiplied with sample_weight (passed
        through the fit method) if sample_weight is specified.

        .. versionadded:: 0.17
           *class_weight='balanced'*

    random_state : int, RandomState instance, default=None
        Used when ``solver`` == 'sag', 'saga' or 'liblinear' to shuffle the
        data. See :term:`Glossary <random_state>` for details.

    solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, \
            default='lbfgs'
        Algorithm to use in the optimization problem.

        - For small datasets, 'liblinear' is a good choice, whereas 'sag' and
          'saga' are faster for large ones.
        - For multiclass problems, only 'newton-cg', 'sag', 'saga' and
          'lbfgs' handle multinomial loss; 'liblinear' is limited to
          one-versus-rest schemes.
        - 'newton-cg', 'lbfgs', 'sag' and 'saga' handle L2 or no penalty
        - 'liblinear' and 'saga' also handle L1 penalty
        - 'saga' also supports 'elasticnet' penalty
        - 'liblinear' does not support setting ``penalty='none'``

        Note that 'sag' and 'saga' fast convergence is only guaranteed on
        features with approximately the same scale. You can preprocess the
        data with a scaler from sklearn.preprocessing.

        .. versionadded:: 0.17
           Stochastic Average Gradient descent solver.
        .. versionadded:: 0.19
           SAGA solver.
        .. versionchanged:: 0.22
            The default solver changed from 'liblinear' to 'lbfgs' in 0.22.

    max_iter : int, default=100
        Maximum number of iterations taken for the solvers to converge.

    multi_class : {'auto', 'ovr', 'multinomial'}, default='auto'
        If the option chosen is 'ovr', then a binary problem is fit for each
        label. For 'multinomial' the loss minimised is the multinomial loss
        fit across the entire probability distribution, *even when the data
        is binary*. 'multinomial' is unavailable when solver='liblinear'.
        'auto' selects 'ovr' if the data is binary, or if solver='liblinear',
        and otherwise selects 'multinomial'.

        .. versionadded:: 0.18
           Stochastic Average Gradient descent solver for 'multinomial' case.
        .. versionchanged:: 0.22
            Default changed from 'ovr' to 'auto' in 0.22.

    verbose : int, default=0
        For the liblinear and lbfgs solvers set verbose to any positive
        number for verbosity.

    warm_start : bool, default=False
        When set to True, reuse the solution of the previous call to fit as
        initialization, otherwise, just erase the previous solution.
        Useless for liblinear solver. See :term:`the Glossary <warm_start>`.

        .. versionadded:: 0.17
           *warm_start* to support *lbfgs*, *newton-cg*, *sag*, *saga* solvers.

    n_jobs : int, default=None
        Number of CPU cores used when parallelizing over classes if
        multi_class='ovr'". This parameter is ignored when the ``solver`` is
        set to 'liblinear' regardless of whether 'multi_class' is specified
        or not. ``None`` means 1 unless in a :obj:`joblib.parallel_backend`
        context. ``-1`` means using all processors.
        See :term:`Glossary <n_jobs>` for more details.

    l1_ratio : float, default=None
        The Elastic-Net mixing parameter, with ``0 <= l1_ratio <= 1``. Only
        used if ``penalty='elasticnet'``. Setting ``l1_ratio=0`` is
        equivalent to using ``penalty='l2'``, while setting ``l1_ratio=1`` is
        equivalent to using ``penalty='l1'``. For ``0 < l1_ratio <1``, the
        penalty is a combination of L1 and L2.

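The following short sketch ties several of these parameters together: scaling the features first (per the note on 'sag'/'saga' convergence) and re-weighting skewed classes with class_weight='balanced'. The dataset and the particular values (C=0.5, the 90/10 class split) are illustrative assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

clf = make_pipeline(
    StandardScaler(),                 # puts features on approximately the same scale
    LogisticRegression(C=0.5,         # stronger regularization than the default C=1.0
                       class_weight='balanced',
                       solver='saga',
                       max_iter=1000,
                       random_state=0))
clf.fit(X, y)

# The 'balanced' weights are n_samples / (n_classes * np.bincount(y)):
print(len(y) / (2 * np.bincount(y)))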
    Attributes
    ----------
    classes_ : ndarray of shape (n_classes, )
        A list of class labels known to the classifier.

    coef_ : ndarray of shape (1, n_features) or (n_classes, n_features)
        Coefficient of the features in the decision function.

        `coef_` is of shape (1, n_features) when the given problem is binary.
        In particular, when `multi_class='multinomial'`, `coef_` corresponds
        to outcome 1 (True) and `-coef_` corresponds to outcome 0 (False).

    intercept_ : ndarray of shape (1,) or (n_classes,)
        Intercept (a.k.a. bias) added to the decision function.

        If `fit_intercept` is set to False, the intercept is set to zero.
        `intercept_` is of shape (1,) when the given problem is binary.
        In particular, when `multi_class='multinomial'`, `intercept_`
        corresponds to outcome 1 (True) and `-intercept_` corresponds to
        outcome 0 (False).

    n_iter_ : ndarray of shape (n_classes,) or (1, )
        Actual number of iterations for all classes. If binary or
        multinomial, it returns only 1 element. For liblinear solver, only
        the maximum number of iteration across all classes is given.

        .. versionchanged:: 0.20

            In SciPy <= 1.0.0 the number of lbfgs iterations may exceed
            ``max_iter``. ``n_iter_`` will now report at most ``max_iter``.

    See Also
    --------
    SGDClassifier : Incrementally trained logistic regression (when given
        the parameter ``loss="log"``).
    LogisticRegressionCV : Logistic regression with built-in cross
        validation.

    Notes
    -----
    The underlying C implementation uses a random number generator to
    select features when fitting the model. It is thus not uncommon
    to have slightly different results for the same input data. If
    that happens, try with a smaller tol parameter.

    Predict output may not match that of standalone liblinear in certain
    cases. See :ref:`differences from liblinear <liblinear_differences>`
    in the narrative documentation.

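As a quick check of the attribute shapes documented above, here is a small sketch on the three-class iris data (the same setup the Examples section uses below); max_iter=1000 is an illustrative choice to ensure convergence.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)           # 3 classes, 4 features
clf = LogisticRegression(max_iter=1000, random_state=0).fit(X, y)

print(clf.classes_)          # [0 1 2]
print(clf.coef_.shape)       # (3, 4): (n_classes, n_features) for multiclass
print(clf.intercept_.shape)  # (3,)
print(clf.n_iter_)           # a single element for the multinomial case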

    References
    ----------

    L-BFGS-B -- Software for Large-scale Bound-constrained Optimization
        Ciyou Zhu, Richard Byrd, Jorge Nocedal and Jose Luis Morales.
        http://users.iems.northwestern.edu/~nocedal/lbfgsb.html

    LIBLINEAR -- A Library for Large Linear Classification
        https://www.csie.ntu.edu.tw/~cjlin/liblinear/

    SAG -- Mark Schmidt, Nicolas Le Roux, and Francis Bach
        Minimizing Finite Sums with the Stochastic Average Gradient
        https://hal.inria.fr/hal-00860051/document

    SAGA -- Defazio, A., Bach F. & Lacoste-Julien S. (2014).
        SAGA: A Fast Incremental Gradient Method With Support
        for Non-Strongly Convex Composite Objectives
        https://arxiv.org/abs/1407.0202

    Hsiang-Fu Yu, Fang-Lan Huang, Chih-Jen Lin (2011). Dual coordinate
        descent methods for logistic regression and maximum entropy models.
        Machine Learning 85(1-2):41-75.
        https://www.csie.ntu.edu.tw/~cjlin/papers/maxent_dual.pdf
    Examples
    --------
    >>> from sklearn.datasets import load_iris
    >>> from sklearn.linear_model import LogisticRegression
    >>> X, y = load_iris(return_X_y=True)
    >>> clf = LogisticRegression(random_state=0).fit(X, y)
    >>> clf.predict(X[:2, :])
    array([0, 0])
    >>> clf.predict_proba(X[:2, :])
    array([[9.8...e-01, 1.8...e-02, 1.4...e-08],
           [9.7...e-01, 2.8...e-02, ...e-08]])
    >>> clf.score(X, y)
    0.97...
    """
    @_deprecate_positional_args
    def __init__(self, penalty='l2', *, dual=False, tol=1e-4, C=1.0,
                 fit_intercept=True, intercept_scaling=1, class_weight=None,
                 random_state=None, solver='lbfgs', max_iter=100,
                 multi_class='auto', verbose=0, warm_start=False, n_jobs=None,
                 l1_ratio=None):
        self.penalty = penalty
        self.dual = dual
        self.tol = tol
        self.C = C
        self.fit_intercept = fit_intercept
        self.intercept_scaling = intercept_scaling
        self.class_weight = class_weight
        self.random_state = random_state
        self.solver = solver
        self.max_iter = max_iter
        self.multi_class = multi_class
        self.verbose = verbose
        self.warm_start = warm_start
        self.n_jobs = n_jobs
        self.l1_ratio = l1_ratio
    def fit(self, X, y, sample_weight=None):
        """
        Fit the model according to the given training data.

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape (n_samples, n_features)
            Training vector, where n_samples is the number of samples and
            n_features is the number of features.

        y : array-like of shape (n_samples,)
            Target vector relative to X.

        sample_weight : array-like of shape (n_samples,) default=None
            Array of weights that are assigned to individual samples.
            If not provided, then each sample is given unit weight.

            .. versionadded:: 0.17
               *sample_weight* support to LogisticRegression.

        Returns
        -------
        self
            Fitted estimator.

        Notes
        -----
        The SAGA solver supports both float64 and float32 bit arrays.
        """
        solver = _check_solver(self.solver, self.penalty, self.dual)

        if not isinstance(self.C, numbers.Number) or self.C < 0:
            raise ValueError(
                "Penalty term must be positive; got (C=%r)" % self.C)
        if self.penalty == 'elasticnet':
            if (not isinstance(self.l1_ratio, numbers.Number) or
                    self.l1_ratio < 0 or self.l1_ratio > 1):
                raise ValueError(
                    "l1_ratio must be between 0 and 1;"
                    " got (l1_ratio=%r)" % self.l1_ratio)
        elif self.l1_ratio is not None:
            warnings.warn("l1_ratio parameter is only used when penalty is "
                          "'elasticnet'. Got "
                          "(penalty={})".format(self.penalty))
        if self.penalty == 'none':
            if self.C != 1.0:  # default values
                warnings.warn(
                    "Setting penalty='none' will ignore the C and l1_ratio "
                    "parameters"
                )
                # Note that check for l1_ratio is done right above
            C_ = np.inf
            penalty = 'l2'
        else:
            C_ = self.C
            penalty = self.penalty
        if not isinstance(self.max_iter, numbers.Number) or self.max_iter < 0:
            raise ValueError("Maximum number of iteration must be positive;"
                             " got (max_iter=%r)" % self.max_iter)
        if not isinstance(self.tol, numbers.Number) or self.tol < 0:
            raise ValueError("Tolerance for stopping criteria must be "
                             "positive; got (tol=%r)" % self.tol)

        if solver == 'lbfgs':
            _dtype = np.float64
        else:
            _dtype = [np.float64, np.float32]

        X, y = self._validate_data(X, y, accept_sparse='csr', dtype=_dtype,
                                   order="C",
                                   accept_large_sparse=solver != 'liblinear')
        check_classification_targets(y)
        self.classes_ = np.unique(y)

        multi_class = _check_multi_class(self.multi_class, solver,
                                         len(self.classes_))

        if solver == 'liblinear':
            if effective_n_jobs(self.n_jobs) != 1:
                warnings.warn("'n_jobs' > 1 does not have any effect when"
                              " 'solver' is set to 'liblinear'. Got 'n_jobs'"
                              " = {}.".format(effective_n_jobs(self.n_jobs)))
            self.coef_, self.intercept_, n_iter_ = _fit_liblinear(
                X, y, self.C, self.fit_intercept, self.intercept_scaling,
                self.class_weight, self.penalty, self.dual, self.verbose,
                self.max_iter, self.tol, self.random_state,
                sample_weight=sample_weight)
            self.n_iter_ = np.array([n_iter_])
            return self

        if solver in ['sag', 'saga']:
            max_squared_sum = row_norms(X, squared=True).max()
        else:
            max_squared_sum = None

        n_classes = len(self.classes_)
        classes_ = self.classes_
        if n_classes < 2:
            raise ValueError("This solver needs samples of at least 2 classes"
                             " in the data, but the data contains only one"
                             " class: %r" % classes_[0])

        if len(self.classes_) == 2:
            n_classes = 1
            classes_ = classes_[1:]

        if self.warm_start:
            warm_start_coef = getattr(self, 'coef_', None)
        else:
            warm_start_coef = None
        if warm_start_coef is not None and self.fit_intercept:
            warm_start_coef = np.append(warm_start_coef,
                                        self.intercept_[:, np.newaxis],
                                        axis=1)

        self.coef_ = list()
        self.intercept_ = np.zeros(n_classes)

        # Hack so that we iterate only once for the multinomial case.
        if multi_class == 'multinomial':
            classes_ = [None]
            warm_start_coef = [warm_start_coef]
        if warm_start_coef is None:
            warm_start_coef = [None] * n_classes

        path_func = delayed(_logistic_regression_path)

        # The SAG solver releases the GIL so it's more efficient to use
        # threads for this solver.
        if solver in ['sag', 'saga']:
            prefer = 'threads'
        else:
            prefer = 'processes'
        fold_coefs_ = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
                               **_joblib_parallel_args(prefer=prefer))(
            path_func(X, y, pos_class=class_, Cs=[C_],
                      l1_ratio=self.l1_ratio, fit_intercept=self.fit_intercept,
                      tol=self.tol, verbose=self.verbose, solver=solver,
                      multi_class=multi_class, max_iter=self.max_iter,
                      class_weight=self.class_weight, check_input=False,
                      random_state=self.random_state, coef=warm_start_coef_,
                      penalty=penalty, max_squared_sum=max_squared_sum,
                      sample_weight=sample_weight)
            for class_, warm_start_coef_ in zip(classes_, warm_start_coef))

        fold_coefs_, _, n_iter_ = zip(*fold_coefs_)
        self.n_iter_ = np.asarray(n_iter_, dtype=np.int32)[:, 0]

        n_features = X.shape[1]
        if multi_class == 'multinomial':
            self.coef_ = fold_coefs_[0][0]
        else:
            self.coef_ = np.asarray(fold_coefs_)
            self.coef_ = self.coef_.reshape(n_classes, n_features +
                                            int(self.fit_intercept))

        if self.fit_intercept:
            self.intercept_ = self.coef_[:, -1]
            self.coef_ = self.coef_[:, :-1]

        return self
    def predict_proba(self, X):
        """
        Probability estimates.

        The returned estimates for all classes are ordered by the
        label of classes.

        For a multi_class problem, if multi_class is set to be "multinomial"
        the softmax function is used to find the predicted probability of
        each class.
        Else use a one-vs-rest approach, i.e calculate the probability
        of each class assuming it to be positive using the logistic function
        and normalize these values across all the classes.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Vector to be scored, where `n_samples` is the number of samples
            and `n_features` is the number of features.

        Returns
        -------
        T : array-like of shape (n_samples, n_classes)
            Returns the probability of the sample for each class in the
            model, where classes are ordered as they are in ``self.classes_``.
        """
        check_is_fitted(self)

        ovr = (self.multi_class in ["ovr", "warn"] or
               (self.multi_class == 'auto' and (self.classes_.size <= 2 or
                                                self.solver == 'liblinear')))
        if ovr:
            return super()._predict_proba_lr(X)
        else:
            decision = self.decision_function(X)
            if decision.ndim == 1:
                # Workaround for multi_class="multinomial" and binary outcomes
                # which requires softmax prediction with only a 1D decision.
                decision_2d = np.c_[-decision, decision]
            else:
                decision_2d = decision
            return softmax(decision_2d, copy=False)
    def predict_log_proba(self, X):
        """
        Predict logarithm of probability estimates.

        The returned estimates for all classes are ordered by the
        label of classes.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Vector to be scored, where `n_samples` is the number of samples
            and `n_features` is the number of features.

        Returns
        -------
        T : array-like of shape (n_samples, n_classes)
            Returns the log-probability of the sample for each class in the
            model, where classes are ordered as they are in ``self.classes_``.
        """
        return np.log(self.predict_proba(X))

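Putting fit, predict_proba and predict_log_proba together, here is a small end-to-end sketch; the dataset and parameter choices are illustrative assumptions.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000, random_state=0).fit(X, y)

proba = clf.predict_proba(X[:3])
log_proba = clf.predict_log_proba(X[:3])

# Each row of predict_proba sums to 1, and predict_log_proba is its log.
print(np.allclose(proba.sum(axis=1), 1.0))    # True
print(np.allclose(log_proba, np.log(proba)))  # True
# predict() returns the class with the highest predicted probability.
print(np.array_equal(clf.predict(X[:3]),
                     clf.classes_[proba.argmax(axis=1)]))  # True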
