TensorFlow Basic Exercises: Linear Models

Published: 2025/3/18

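TensorFlow is a general-purpose platform for numerical computation, and it makes training linear models straightforward. The exercises below use TensorFlow to work through the Deep Learning course exercises taught by Andrew Ng, with complete source code for each.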

Linear regression
Multivariate linear regression
Logistic regression

Linear regression

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Created on Wed Sep 6 19:46:04 2017

@author: Administrator
"""
# @author: ranjiewen
# @date: 2017-9-6
# @description: compare scikit-learn and tensorflow, using linear regression data from deep learning course by Andrew Ng.
# @ref: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex2/ex2.html
# Note: this script targets the TensorFlow 1.x graph API (tf.Session, tf.train.*).

import tensorflow as tf
import numpy as np
from sklearn import linear_model

# Read x and y
#x_data = np.loadtxt("ex2x.dat")
#y_data = np.loadtxt("ex2y.dat")

# Synthetic stand-in data; everything is kept in float32 to match the TensorFlow variables
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3 + np.random.rand(100).astype(np.float32)

# We use scikit-learn first to get a sense of the coefficients
reg = linear_model.LinearRegression()
reg.fit(x_data.reshape(-1, 1), y_data)

print("Coefficient of scikit-learn linear regression: k=%f, b=%f" % (reg.coef_[0], reg.intercept_))

# Then we apply tensorflow to achieve similar results
# The structure of tensorflow code can be divided into two parts:

# First part: set up the computation graph
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

loss = tf.reduce_mean(tf.square(y - y_data)) / 2
# The gradient-descent step size alpha has to be chosen carefully:
# too large and training fails to converge, too small and it takes far too long.
# The number of iterations also needs some experimentation.
optimizer = tf.train.GradientDescentOptimizer(0.07)  # Try 0.1 and you will see it fail to converge
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()  # replaces the deprecated tf.initialize_all_variables()

# Second part: launch the graph
sess = tf.Session()
sess.run(init)

for step in range(1500):
    sess.run(train)
    if step % 100 == 0:
        print(step, sess.run(W), sess.run(b))

print("Coefficient of tensorflow linear regression: k=%f, b=%f" % (sess.run(W)[0], sess.run(b)[0]))

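Thoughts: with TensorFlow, the gradient-descent step size alpha has to be set very carefully. If the step is too large, the iteration overshoots and fails to converge; if it is too small, convergence takes far too long. The number of iterations also needs careful experimentation.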
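The effect of the step size can be reproduced without TensorFlow. The following is a minimal NumPy sketch, not the article's code: the helper `gradient_descent`, the feature range 2 to 8, and the three step sizes are assumptions chosen for illustration. It minimizes the same loss, mean squared error divided by 2, with plain gradient descent.

```python
import numpy as np

def gradient_descent(x, y, alpha, steps):
    """Fit y ~ k*x + b by gradient descent on loss = mean((k*x + b - y)**2) / 2."""
    k, b = 0.0, 0.0
    # Divergent runs overflow to inf/nan; silence the resulting NumPy warnings.
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(steps):
            residual = k * x + b - y
            k -= alpha * np.mean(residual * x)  # d(loss)/dk
            b -= alpha * np.mean(residual)      # d(loss)/db
        loss = np.mean((k * x + b - y) ** 2) / 2
    return k, b, loss

rng = np.random.default_rng(0)
x = rng.uniform(2.0, 8.0, size=50)                 # hypothetical feature range
y = 0.1 * x + 0.3 + rng.normal(0.0, 0.05, size=50)

losses = {}
for alpha in (0.001, 0.05, 0.1):
    k, b, loss = gradient_descent(x, y, alpha, steps=1500)
    losses[alpha] = loss
    print("alpha=%g  k=%s  b=%s  loss=%s" % (alpha, k, b, loss))
```

In this setup the smallest step is still well short of the minimum after 1500 iterations, the middle one converges, and the largest one blows up to a non-finite loss. Where the threshold lies depends on the scale of the features, which is why the step size has to be tuned per dataset.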

Multivariate linear regression

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 6 19:53:24 2017

@author: Administrator
"""
# Note: this script targets the TensorFlow 1.x graph API (tf.Session, tf.train.*).
import numpy as np
import tensorflow as tf
from sklearn import linear_model
from sklearn import preprocessing

# Read x and y
#x_data = np.loadtxt("ex3x.dat").astype(np.float32)
#y_data = np.loadtxt("ex3y.dat").astype(np.float32)

# Synthetic stand-in data: two feature columns with different offsets.
# np.column_stack builds a plain (100, 2) array, replacing the deprecated numpy.mat.
x_data = np.column_stack([np.random.rand(100), np.random.rand(100) + 10]).astype(np.float32)
y_data = (5.3 + np.random.rand(100)).astype(np.float32)

# We evaluate the x and y by sklearn to get a sense of the coefficients.
reg = linear_model.LinearRegression()
reg.fit(x_data, y_data)
print("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_))

# Now we use tensorflow to get similar results.

# Before we put x_data into tensorflow, we need to standardize it
# in order to achieve better performance in gradient descent;
# without standardization, convergence is intolerably slow.
# Reason: if a feature has a variance that is orders of magnitude larger than the others,
# it might dominate the objective function
# and make the estimator unable to learn from the other features correctly.
# In the course data one variable is the living area and the other is the number of rooms,
# so their magnitudes differ greatly.
scaler = preprocessing.StandardScaler().fit(x_data)
print(scaler.mean_, scaler.scale_)
x_data_standard = scaler.transform(x_data).astype(np.float32)  # keep float32 for tf.matmul

W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1, 1]))
y = tf.matmul(x_data_standard, W) + b

loss = tf.reduce_mean(tf.square(y - y_data.reshape(-1, 1))) / 2
optimizer = tf.train.GradientDescentOptimizer(0.3)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()  # replaces the deprecated tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)
for step in range(100):
    sess.run(train)
    if step % 10 == 0:
        print(step, sess.run(W).flatten(), sess.run(b).flatten())

print("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
# Map the coefficients back to the raw (unstandardized) feature scale
print("Coefficients of tensorflow (raw input): K=%s, b=%s" % (sess.run(W).flatten() / scaler.scale_, sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W))))

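Approach: for gradient descent, it matters a great deal whether the variables are standardized. In this example one variable is the living area and the other is the number of rooms, so their magnitudes differ greatly. Without normalization, the area dominates both the objective function and the gradient, and convergence becomes extremely slow.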
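Both points, the poor conditioning of raw features and the mapping of standardized coefficients back to the raw scale, can be checked numerically. The sketch below uses only NumPy; the synthetic "area" and "rooms" data and the helper names `fit_with_intercept` and `hessian_condition` are illustrative assumptions, not part of the course data or of the script above. It solves the least-squares problem exactly instead of running gradient descent, so that the back-transformation can be compared against a direct fit.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical features on very different scales (area vs. number of rooms).
x = np.column_stack([rng.uniform(800, 4500, 200), rng.integers(1, 6, 200)])
y = 100.0 * x[:, 0] + 5000.0 * x[:, 1] + 20000.0 + rng.normal(0, 1000, 200)

mean, scale = x.mean(axis=0), x.std(axis=0)
x_std = (x - mean) / scale

def fit_with_intercept(features, targets):
    """Exact least-squares fit; returns (weights, intercept)."""
    design = np.column_stack([features, np.ones(len(features))])
    solution, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return solution[:-1], solution[-1]

def hessian_condition(features):
    """Condition number of the Hessian of the mean-squared-error loss."""
    design = np.column_stack([features, np.ones(len(features))])
    return np.linalg.cond(design.T @ design / len(features))

# Coefficients fitted on standardized data, mapped back to the raw scale
w_std, b_std = fit_with_intercept(x_std, y)
w_raw = w_std / scale
b_raw = b_std - np.dot(mean / scale, w_std)

# Coefficients fitted directly on the raw data, for comparison
w_direct, b_direct = fit_with_intercept(x, y)

cond_raw, cond_std = hessian_condition(x), hessian_condition(x_std)
print("condition number raw:", cond_raw, "standardized:", cond_std)
print("back-transformed:", w_raw, b_raw)
print("direct fit:      ", w_direct, b_direct)
```

The condition number of the raw problem is several orders of magnitude larger than that of the standardized one, which is what makes gradient descent so slow on raw inputs. The back-transformation is the same one used in the final print statement of the script: divide each weight by its feature's scale and subtract the sum of mean/scale times weight from the intercept.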

Logistic regression

Data download: Exercise: Logistic Regression and Newton's Method

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 6 20:13:15 2017

Data download: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex4/ex4.html

@author: Administrator
"""
# Note: this script targets the TensorFlow 1.x graph API (tf.Session, tf.train.*).
import tensorflow as tf
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing

# Read x and y
x_data = np.loadtxt("ex4Data/ex4x.dat").astype(np.float32)
y_data = np.loadtxt("ex4Data/ex4y.dat").astype(np.float32)

scaler = preprocessing.StandardScaler().fit(x_data)
x_data_standard = scaler.transform(x_data).astype(np.float32)  # keep float32 for tf.matmul

# We evaluate the x and y by sklearn to get a sense of the coefficients.
# Set C to a large positive number to minimize the regularization effect.
reg = LogisticRegression(C=999999999, solver="newton-cg")
reg.fit(x_data, y_data)
print("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_[0]))

# Now we use tensorflow to get similar results.
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1, 1]))
# Sigmoid of the linear score. The minus sign must apply to the whole score (XW + b);
# the original "-tf.matmul(x_data_standard, W) + b" flipped the sign of b.
y = 1 / (1 + tf.exp(-(tf.matmul(x_data_standard, W) + b)))
# Negative mean log-likelihood (cross-entropy)
loss = tf.reduce_mean(- y_data.reshape(-1, 1) * tf.log(y) - (1 - y_data.reshape(-1, 1)) * tf.log(1 - y))

optimizer = tf.train.GradientDescentOptimizer(1.3)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()  # replaces the deprecated tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)
for step in range(100):
    sess.run(train)
    if step % 10 == 0:
        print(step, sess.run(W).flatten(), sess.run(b).flatten())

print("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
print("Coefficients of tensorflow (raw input): K=%s, b=%s" % (sess.run(W).flatten() / scaler.scale_, sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W))))

# Problem solved and we are happy. But...
# I'd like to implement the logistic regression from a multi-class viewpoint instead of binary.
# In the machine learning domain, it is called softmax regression.
# In the economics and statistics domain, it is called the multinomial logit (MNL) model,
# proposed by Daniel McFadden, who shared the 2000 Nobel Memorial Prize in Economic Sciences.

print("------------------------------------------------")
print("We solve this binary classification problem again from the viewpoint of multinomial classification")
print("------------------------------------------------")

# As a tradition, sklearn first
reg = LogisticRegression(C=9999999999, solver="newton-cg", multi_class="multinomial")
reg.fit(x_data, y_data)
print("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_[0]))
print("Slightly different at first glance. What about multiplying them by 2?")

# Then try tensorflow
W = tf.Variable(tf.zeros([2, 2]))  # first 2 is the number of features, second 2 is the number of classes
b = tf.Variable(tf.zeros([1, 2]))
V = tf.matmul(x_data_standard, W) + b
# tensorflow provides a utility function to calculate the probability that observation n chooses alternative i;
# it is equivalent to exp(V) divided by the row-wise sum of exp(V).
y = tf.nn.softmax(V)

# Encode the y label in one-hot manner
lb = preprocessing.LabelBinarizer()
lb.fit(y_data)
y_data_trans = lb.transform(y_data)
# Only necessary for the binary case, where LabelBinarizer returns a single column
y_data_trans = np.concatenate((1 - y_data_trans, y_data_trans), axis=1).astype(np.float32)
loss = tf.reduce_mean(-tf.reduce_sum(y_data_trans * tf.log(y), axis=1))
optimizer = tf.train.GradientDescentOptimizer(1.3)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)
for step in range(100):
    sess.run(train)
    if step % 10 == 0:
        print(step, sess.run(W).flatten(), sess.run(b).flatten())

print("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
# W has shape [features, classes], so the per-feature scale must divide along the rows
print("Coefficients of tensorflow (raw input): K=%s, b=%s" % ((sess.run(W) / scaler.scale_.reshape(-1, 1)).flatten(), sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W))))

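Thoughts:
For logistic regression, the loss function is somewhat more complex than for the linear regression model. First, the output of the linear model is passed through the sigmoid function, which turns it into a probability between 0 and 1. Next, we write down the probability (likelihood) of each sample; the probability of all samples together is then the product of the per-sample probabilities. To make differentiation easier, we take the logarithm of this joint probability, which preserves monotonicity while turning the product into a sum (the derivative of a sum is much simpler than that of a product). The objective of maximum log-likelihood estimation is to maximize the probability of all samples. Machine learning conventionally calls the objective a loss, so the loss is defined as the negative of the log-likelihood, which turns the task into a minimization problem.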
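The chain of steps above can be written out as follows. The notation is an assumption made here for illustration: $x_i$ is the feature vector of sample $i$, $y_i \in \{0, 1\}$ its label, $p_i$ the predicted probability, and $n$ the number of samples; $W$ and $b$ are the weights and intercept as in the code.

```latex
\begin{aligned}
p_i &= \sigma(x_i^\top W + b) = \frac{1}{1 + e^{-(x_i^\top W + b)}} \\
L(W, b) &= \prod_{i=1}^{n} p_i^{\,y_i} (1 - p_i)^{1 - y_i} \\
\log L(W, b) &= \sum_{i=1}^{n} \bigl[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \bigr] \\
\mathrm{loss}(W, b) &= -\frac{1}{n} \log L(W, b)
  = \frac{1}{n} \sum_{i=1}^{n} \bigl[ -y_i \log p_i - (1 - y_i) \log (1 - p_i) \bigr]
\end{aligned}
```

The factor $1/n$ corresponds to the mean taken in the code; it rescales the loss but does not change where the minimum lies.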
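When we speak of logistic regression, we usually mean the binary classification problem. The same idea, however, extends very easily to multi-class problems, where the machine learning community usually calls it the softmax regression model. The author of this article has a background in statistics and econometrics, and therefore usually refers to it as the MNL model.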
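For two classes the softmax model reduces to the sigmoid model, because the probability of class 1 depends only on the difference between the two class scores. The NumPy sketch below checks this; the coefficient values and the helper names are hypothetical and chosen only for illustration. It uses the symmetric parameterization in which the two class columns are the negatives of each other, so each softmax coefficient is half of the corresponding binary one.

```python
import numpy as np

def softmax(v):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(v - v.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))

# Hypothetical binary logistic-regression coefficients
w_binary = np.array([[1.5], [-0.7]])
b_binary = 0.4

# Symmetric two-class softmax: column 0 is class 0, column 1 is class 1
w_soft = np.hstack([-w_binary / 2, w_binary / 2])
b_soft = np.array([[-b_binary / 2, b_binary / 2]])

p_sigmoid = sigmoid(x @ w_binary + b_binary).ravel()
p_softmax = softmax(x @ w_soft + b_soft)[:, 1]

print(np.allclose(p_sigmoid, p_softmax))  # prints True
```

This is why the multinomial coefficients look different from the binary ones at first glance and match after being multiplied by 2: the score difference between the two classes is twice the class-1 score under the symmetric parameterization.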

