
[TensorFlow] Basic Steps for Using TensorFlow: Linear Regression as an Exercise

Published: 2024/9/27


Preparation

Load the required libraries

from __future__ import print_function

import math

from IPython import display
from matplotlib import cm
from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from sklearn import metrics
import tensorflow as tf
from tensorflow.python.data import Dataset

tf.logging.set_verbosity(tf.logging.ERROR)
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format

Load the dataset

california_housing_dataframe = pd.read_csv("https://download.mlcc.google.cn/mledu-datasets/california_housing_train.csv", sep=",")

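Randomize the data so that no pathological ordering remains (such ordering could hurt the performance of stochastic gradient descent). In addition, scale median_house_value to units of thousands, so that the model can learn from this data more easily with learning rates in the commonly used range.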

california_housing_dataframe = california_housing_dataframe.reindex(
    np.random.permutation(california_housing_dataframe.index))
california_housing_dataframe["median_house_value"] /= 1000.0
california_housing_dataframe

The last expression displays the shuffled DataFrame, with median_house_value now expressed in thousands.

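Inspect the data

Before using the data, call california_housing_dataframe.describe() to get a quick summary of useful statistics for each column: count, mean, standard deviation, maximum, minimum, and various quantiles.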

california_housing_dataframe.describe()

The output is a table of summary statistics, one column per feature.

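Build the first model

The goal of this exercise is to predict median_house_value, using total_rooms as the input feature.
To train the model, we use the LinearRegressor interface provided by the TensorFlow Estimator API. This API takes care of much of the low-level model plumbing and exposes convenient methods for training, evaluation, and inference.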

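Define features and configure feature columns

To import training data into TensorFlow, we need to specify the type of data each feature contains. Two main types of data are used:

Categorical data: textual data. The housing dataset used here contains no categorical features; typical examples would be a home style or the words in a real-estate ad.
Numerical data: data that is a number (integer or float).
The input here is the numerical feature total_rooms. The code below extracts the total_rooms data from california_housing_dataframe and defines a feature column with numeric_column, which marks the data as numeric: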

# Define the input feature: total_rooms.
my_feature = california_housing_dataframe[["total_rooms"]]

# Configure a numeric feature column for total_rooms.
feature_columns = [tf.feature_column.numeric_column("total_rooms")]

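Note: the total_rooms data is a one-dimensional array (a list of the total number of rooms in each block). This is the default shape for numeric_column, so we do not have to pass it as an argument.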

Define the target

Next, define the target, median_house_value, which can likewise be extracted from california_housing_dataframe:

# Define the label.
targets = california_housing_dataframe["median_house_value"]

Configure the LinearRegressor

Next, we configure a linear regression model using LinearRegressor and train it with GradientDescentOptimizer, which implements mini-batch stochastic gradient descent (SGD). The learning_rate argument controls the size of the gradient step.
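To make the role of learning_rate concrete, here is a small self-contained sketch of mini-batch SGD for a one-feature linear model y = w * x + b, written in plain Python rather than TensorFlow. The toy data, the batch size of 2, and the learning rate of 0.005 are made up for illustration and are not the values used in this post.

```python
import random


def sgd_step(w, b, batch, learning_rate):
    """One gradient step on mean squared error for the model y = w * x + b."""
    n = len(batch)
    # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    # Move against the gradient; learning_rate scales the step size.
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Toy data that follows y = 2x exactly (hypothetical, for illustration only).
data = [(float(x), 2.0 * x) for x in range(1, 11)]

random.seed(0)
w, b = 0.0, 0.0
for _ in range(5000):
    batch = random.sample(data, 2)  # a mini-batch of 2 random examples
    w, b = sgd_step(w, b, batch, learning_rate=0.005)

print(round(w, 2), round(b, 2))
```

A learning rate that is too large for the scale of the features makes these updates overshoot, which is one reason the post rescales the target and uses a very small learning_rate for the raw total_rooms values.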

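Note: to be safe, we can also apply gradient clipping to the optimizer via clip_gradients_by_norm. Gradient clipping ensures that the magnitude of the gradients does not become too large during training, which can cause gradient descent to fail.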
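The idea behind norm-based clipping can be sketched in a few lines of plain Python. This sketch assumes clipping by the joint (global) L2 norm of all gradients; the post does not state the exact rule that clip_gradients_by_norm applies, so treat the function below as an illustration and not as the library's implementation.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Rescale gradients so that their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for g in grads))
    if global_norm <= clip_norm:
        # Already small enough: return the gradients unchanged.
        return list(grads)
    # Shrink every gradient by the same factor, preserving the direction.
    scale = clip_norm / global_norm
    return [g * scale for g in grads]


print(clip_by_global_norm([3.0, 4.0], 5.0))    # norm is 5, within the limit
print(clip_by_global_norm([30.0, 40.0], 5.0))  # norm is 50, gets scaled down
```

Because all components are scaled by the same factor, clipping changes only the length of the update, not its direction.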

# Use gradient descent as the optimizer for training the model.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

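Define the input function

To feed data into the LinearRegressor, we need to define an input function that tells TensorFlow how to preprocess the data and how to batch, shuffle, and repeat it during model training.
First, the pandas feature data is converted into a dict of NumPy arrays. Then the TensorFlow Dataset API is used to build a Dataset object from the data, split it into batches of size batch_size, and repeat it for the specified number of epochs (num_epochs).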

Note: if the default value num_epochs=None is passed to repeat(), the input data is repeated indefinitely.

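Next, if shuffle is set to True, the data is shuffled so that it is passed to the model in random order during training. The buffer_size argument specifies the size of the dataset from which shuffle randomly samples.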
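The effect of buffer_size is easier to see in a small simulation. The following plain-Python generator is a sketch of buffer-based shuffling as described above, not TensorFlow's actual implementation; the function name and the seed parameter are hypothetical.

```python
import random


def shuffle_with_buffer(items, buffer_size, seed=None):
    """Yield items in randomized order using a fixed-size buffer.

    Assumes buffer_size >= 1.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer.
            buffer.append(item)
        else:
            # Buffer is full: emit a random element and put the new item in its slot.
            index = rng.randrange(buffer_size)
            yield buffer[index]
            buffer[index] = item
    # Input exhausted: emit whatever is left, in random order.
    rng.shuffle(buffer)
    yield from buffer


print(list(shuffle_with_buffer(range(10), buffer_size=1)))          # order unchanged
print(list(shuffle_with_buffer(range(10), buffer_size=4, seed=0)))  # locally shuffled
```

With a small buffer an element can only move a limited distance from its original position, so a buffer at least as large as the dataset is needed for a uniform shuffle.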

Finally, the input function constructs an iterator for the dataset and returns the next batch of data to the LinearRegressor.
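The post describes the input function but does not list its code. A minimal sketch consistent with the description, assuming the TensorFlow 1.x API used elsewhere in this post, might look like the following. The name my_input_fn matches the call in the training step; the default argument values and the buffer_size of 10000 are assumptions, and the imports sit inside the function only to keep the snippet self-contained.

```python
def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):
    """Sketch of an input function for a single-feature linear regressor.

    Assumes TensorFlow 1.x; returns a (features, labels) tuple for the next batch.
    """
    # Imported here to keep the sketch self-contained.
    import numpy as np
    from tensorflow.python.data import Dataset

    # Convert the pandas data into a dict of NumPy arrays.
    features = {key: np.array(value) for key, value in dict(features).items()}

    # Build the dataset, then configure batching and repeating.
    ds = Dataset.from_tensor_slices((features, targets))
    ds = ds.batch(batch_size).repeat(num_epochs)

    # Shuffle the data if requested (buffer size is an assumed value).
    if shuffle:
        ds = ds.shuffle(buffer_size=10000)

    # Return the next batch of data.
    features, labels = ds.make_one_shot_iterator().get_next()
    return features, labels
```

Note that make_one_shot_iterator() is a TensorFlow 1.x method, in line with the tf.logging and tf.contrib calls used earlier in this post.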

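Train the model

Call train() on linear_regressor to train the model. my_input_fn is wrapped in a lambda so that my_feature and targets can be passed in as arguments (see the TensorFlow input function tutorial for details). To start, we train for 100 steps.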

_ = linear_regressor.train(
    input_fn=lambda: my_input_fn(my_feature, targets),
    steps=100
)

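Evaluate the model

Make one round of predictions on the training data to see how well the model fit that data during training.
Note: training error measures how well the model fits the training data, but it does not measure how well the model generalizes to new data.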
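The post stops before naming an error metric. One common choice for regression is root mean squared error (RMSE), which is expressed in the same units as the target; the sketch below assumes that metric and uses made-up prediction and target values.

```python
import math


def root_mean_squared_error(predictions, targets):
    """Square root of the mean squared difference between predictions and targets."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values in thousands of dollars, for illustration only.
example_predictions = [210.0, 140.0, 330.0]
example_targets = [200.0, 150.0, 300.0]
print(root_mean_squared_error(example_predictions, example_targets))
```

Computed on the training data, this value is the training error discussed above; the same function applied to held-out data would give an estimate of how well the model generalizes.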

…I will stop copying here; to be continued.
