
Cannot Train After Saving the Model: Model Building to Deployment in Practice

Published: 2024/9/30


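Introduction

In industry, models are commonly deployed with tensorflow-serving, while the deep learning framework used to build them varies from person to person. After a model has been trained with a given framework, it therefore has to be converted to the pb format so that it can be deployed with tensorflow-serving. I ran into many problems during deployment, so this article summarizes the whole process. It first shows how to build models with different deep learning frameworks, then converts the trained model to the pb format, then deploys it with a container plus tensorflow-serving, and finally discusses inference over http and grpc in practice and compares their performance.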

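Building and saving deep learning models

Popular deep learning frameworks include tensorflow, keras, pytorch, cntk, mxnet and theano. For reasons of space, this article only covers model building with tensorflow and keras, mainly because these are the two frameworks our team uses. Note that tensorflow 2.0+ has made keras the framework's default API. tensorflow 2.0 differs considerably from tensorflow 1.x, whereas model building changes little across keras versions, and building a model with keras inside tensorflow 2.0 is essentially the same as building it with native keras: you only need to replace keras with tensorflow.keras. A model is built with keras and tensorflow 2.x as follows: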

    import tensorflow.keras as keras
    from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense, Dropout
    from tensorflow.keras.models import Model
    from tensorflow.keras import utils

    # To build the model with native keras, replace tensorflow.keras with keras.

    # 1. load data
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

    # 2. preprocess data: scale to [0, 1] and add a channel dimension
    x_train = x_train / 255.0
    x_test = x_test / 255.0
    x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
    x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1)
    print("train samples: {}, test samples: {}".format(x_train.shape[0], x_test.shape[0]))
    print("input shape: {}".format(x_train.shape[1:]))
    y_train = utils.to_categorical(y_train)
    y_test = utils.to_categorical(y_test)
    print("label shape:{}".format(y_train.shape[1:]))
    print(y_test[:10])

    # 3. build model
    input_data = Input(shape=(28, 28, 1), name='input')
    x = Conv2D(32, kernel_size=(3, 3), activation="relu", name="conv1")(input_data)
    x = Conv2D(64, kernel_size=(3, 3), activation="relu", name="conv2")(x)
    x = Flatten()(x)
    x = Dense(128, activation="relu")(x)
    x = Dropout(0.5)(x)
    output = Dense(10, activation="softmax", name="output")(x)
    model = Model(inputs=input_data, outputs=output)

    # 4. compile and train
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=32, epochs=2)

    # 5. evaluate: model.evaluate returns [loss, accuracy]
    accuracy = model.evaluate(x_test, y_test)
    print("loss:%.3f, accuracy:%.3f" % (accuracy[0], accuracy[1]))

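There are many tutorials on building and training models with tensorflow, and interested readers can search for them online. Once the model is trained, it has to be converted to a pb model that tfserving can deploy. Converting a model built with tensorflow 2.x to pb is very easy; the code is as follows: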

    import tensorflow as tf

    # `model` is the model trained above; an h5 model that was saved earlier can
    # also be loaded with tf.keras.models.load_model.
    tf.keras.models.save_model(model, "./tf2x_save_model")
    # The export contains saved_model.pb and a variables folder. The variables
    # folder holds variables.data-00000-of-00001 and variables.index.
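Before handing the export to tfserving, it can help to confirm that the directory really has the layout described in the comment above. The following is a minimal sketch that uses only the Python standard library. The function name check_saved_model_dir is hypothetical, and the check is limited to the file names mentioned in this article.

```python
from pathlib import Path


def check_saved_model_dir(export_dir):
    """Return a list of problems found in a SavedModel export directory.

    Hypothetical helper: it only looks for the files named in the article
    (saved_model.pb, variables/variables.index, variables/variables.data-*).
    """
    root = Path(export_dir)
    problems = []
    if not (root / "saved_model.pb").is_file():
        problems.append("missing saved_model.pb")
    variables = root / "variables"
    if not variables.is_dir():
        # Some conversion interfaces may not create this folder, so a caller
        # may choose to treat this entry as a warning.
        problems.append("missing variables/ folder")
    else:
        if not (variables / "variables.index").is_file():
            problems.append("missing variables/variables.index")
        if not list(variables.glob("variables.data-*")):
            problems.append("missing variables/variables.data-* shard")
    return problems
```

Calling check_saved_model_dir("./tf2x_save_model") after the export returns an empty list when all expected files are present.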

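A model built with native keras cannot be trained with tensorflow 2.x, so tensorflow 1.x has to be used to convert the h5 file. There are many ways to convert h5 to a pb model, but even with tensorflow 1.x the conversion sometimes fails, and sometimes it succeeds but the inference results differ from the local ones. Such problems are usually caused by version mismatches or incorrect parameter settings. Several commonly used methods are listed here.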

Method 1:

    tf.contrib.saved_model.save_keras_model(model, save_path)

Method 2:

    with keras.backend.get_session() as sess:
        tf.saved_model.simple_save(
            sess,
            save_path,
            inputs={'input': model.input},
            outputs={t.name: t for t in model.outputs})

Method 3:

    from keras import backend as K
    from tensorflow.python import saved_model
    from tensorflow.python.saved_model.signature_def_utils_impl import predict_signature_def

    builder = saved_model.builder.SavedModelBuilder(save_path)
    signature = predict_signature_def(
        inputs={"input": model.input},
        outputs={"output": model.output})
    sess = K.get_session()
    builder.add_meta_graph_and_variables(
        sess=sess,
        tags=[saved_model.tag_constants.SERVING],
        signature_def_map={"mnist": signature})
    builder.save()

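All three methods can convert an h5 model to a pb model. I recommend trying method 3 first, because it lets you specify more parameters. These parameters show up in the model's metadata (how to fetch the metadata is described below), which gives better extensibility and flexibility. Next, we deploy the model and verify it.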

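Deploying the model with tfserving

The model can be deployed with the officially built image (listed in the references). The image repository contains different versions of tensorflow-serving (one per tag), which can be pulled as needed. This article uses tensorflow-serving 2.2.0 for deployment and inference. Pulling the image requires docker; for installing docker, see the references.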

    # pull the tfserving 2.2.0 image
    docker pull tensorflow/serving:2.2.0

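Prepare the trained pb model. Suppose it is stored in /data1/tfserving/models/m/1/. This directory contains saved_model.pb and a variables folder, and variables holds variables.data-00000-of-00001 and variables.index. The variables folder is not always generated, depending on the interface used to convert the model. In the path, m is the model name and 1 is the model version number. The startup command is as follows: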

    docker run -p 8501:8501 -p 8500:8500 \
        --mount type=bind,source=/data1/tfserving/models/m,target=/models/m \
        -e MODEL_NAME=m -t tensorflow/serving:2.2.0

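The command exposes ports 8500 and 8501 so that the service can be reached from outside. source is the path where the model is stored on the host, target is the path where the model is stored inside the container, and MODEL_NAME is m, which must match the model name in the model path. After a successful start, the startup log also indicates that 8500 is the grpc port and 8501 is the http port.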
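When several models have to be started this way, the command above can be assembled from its parts so that the model name in the mount target and in MODEL_NAME cannot drift apart. This is a minimal sketch; build_serving_command is a hypothetical helper, and it uses only the docker flags that already appear in the command above.

```python
import shlex


def build_serving_command(model_name, host_model_dir, tag="2.2.0",
                          rest_port=8501, grpc_port=8500):
    """Build the `docker run` argument list used in this article.

    host_model_dir is the host directory that contains the numeric version
    folders (for example .../models/m, which holds 1/).
    """
    mount = "type=bind,source={},target=/models/{}".format(host_model_dir, model_name)
    return [
        "docker", "run",
        "-p", "{}:8501".format(rest_port),   # http port
        "-p", "{}:8500".format(grpc_port),   # grpc port
        "--mount", mount,
        "-e", "MODEL_NAME={}".format(model_name),
        "-t", "tensorflow/serving:{}".format(tag),
    ]


cmd = build_serving_command("m", "/data1/tfserving/models/m")
print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed directly to subprocess.run(cmd), which avoids shell quoting issues in paths.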

Model inference

Once a model is deployed with tensorflow-serving, both http and grpc inference are supported. The use of the two inference interfaces is described below.

http inference

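When running inference against a deployed model, the model may well have been provided by another team, so its input and output formats are not known directly. In that case, the model's metadata can be fetched through the interface and the data prepared accordingly. The metadata is fetched over http as follows: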

    import requests

    root_url = "http://127.0.0.1:8501"
    url = "%s/v1/models/m/metadata" % root_url
    resp = requests.get(url)

The returned metadata is as follows:

    {
      "model_spec": {"name": "m", "signature_name": "", "version": "1"},
      "metadata": {
        "signature_def": {
          "signature_def": {
            "serving_default": {
              "inputs": {
                "input": {
                  "dtype": "DT_FLOAT",
                  "tensor_shape": {
                    "dim": [
                      {"size": "-1", "name": ""},
                      {"size": "28", "name": ""},
                      {"size": "28", "name": ""},
                      {"size": "1", "name": ""}
                    ],
                    "unknown_rank": false
                  },
                  "name": "serving_default_conv1_input:0"
                }
              },
              "outputs": {
                "output": {
                  "dtype": "DT_FLOAT",
                  "tensor_shape": {
                    "dim": [
                      {"size": "-1", "name": ""},
                      {"size": "10", "name": ""}
                    ],
                    "unknown_rank": false
                  },
                  "name": "StatefulPartitionedCall:0"
                }
              },
              "method_name": "tensorflow/serving/predict"
            }
          }
        }
      }
    }
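The shapes can be pulled out of this response programmatically instead of being read by eye. The sketch below assumes the response has the nesting shown above; signature_shapes is a hypothetical helper name, and the example dictionary is a trimmed copy of the response that keeps only the fields the function reads.

```python
def signature_shapes(metadata, signature="serving_default"):
    """Extract {name: shape} for the inputs and outputs of one signature."""
    sig = metadata["metadata"]["signature_def"]["signature_def"][signature]

    def shapes(section):
        # Dimension sizes arrive as strings, so convert them to int.
        return {
            name: tuple(int(d["size"]) for d in spec["tensor_shape"]["dim"])
            for name, spec in sig[section].items()
        }

    return shapes("inputs"), shapes("outputs")


# Trimmed copy of the response shown above (only the fields that are used).
example = {
    "metadata": {"signature_def": {"signature_def": {"serving_default": {
        "inputs": {"input": {"tensor_shape": {"dim": [
            {"size": "-1"}, {"size": "28"}, {"size": "28"}, {"size": "1"}]}}},
        "outputs": {"output": {"tensor_shape": {"dim": [
            {"size": "-1"}, {"size": "10"}]}}},
    }}}}
}
inputs, outputs = signature_shapes(example)
print(inputs)   # {'input': (-1, 28, 28, 1)}
print(outputs)  # {'output': (-1, 10)}
```

With the request from the previous snippet, the same function would be called as signature_shapes(resp.json()).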

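The metadata returned above describes the model name, the model version, and the input and output data formats: the input shape is (-1, 28, 28, 1) and the output shape is (-1, 10). Once the input and output formats are known, inference can be run. The inference code is as follows: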

    import json
    import time

    import numpy as np
    import requests

    # 1. generate url
    root_url = "http://127.0.0.1:8501"
    version = 1
    url = "%s/v1/models/m/versions/%s:predict" % (root_url, version)

    # 2. generate data
    np.random.seed(0)
    input_data = np.random.rand(2, 28, 28, 1).astype(np.float32)
    print(input_data.shape)
    data = {"instances": input_data.tolist()}

    # 3. post data for inference
    start = time.time()
    resp = requests.post(url, json=data)

    # 4. parse result
    if resp.status_code == 200:
        result = json.loads(resp.text)
        print("predictions", result.get("predictions"))
        print("time_used:", time.time() - start)
    else:
        # no valid result was returned
        print("request failed:", resp.status_code)
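Because the input format often has to be double-checked, it can be useful to validate the payload before it is sent. The following is a sketch under stated assumptions: build_predict_request is a hypothetical helper, the expected per-sample shape (28, 28, 1) comes from the metadata of this mnist model, and the URL pattern is the one used in the code above.

```python
import json


def build_predict_request(root_url, model_name, version, instances,
                          sample_shape=(28, 28, 1)):
    """Return (url, body) for a predict call; instances is a nested list."""

    def shape_of(value):
        # Walk down the first element of each level to infer the shape.
        dims = []
        while isinstance(value, list):
            dims.append(len(value))
            if not value:
                break
            value = value[0]
        return tuple(dims)

    for i, sample in enumerate(instances):
        if shape_of(sample) != tuple(sample_shape):
            raise ValueError("instance %d has shape %s, expected %s"
                             % (i, shape_of(sample), tuple(sample_shape)))
    url = "%s/v1/models/%s/versions/%s:predict" % (root_url, model_name, version)
    return url, json.dumps({"instances": instances})
```

The returned body is a JSON string, so it would be sent with requests.post(url, data=body) rather than with the json= argument.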

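The inference code above randomly generates two samples of shape (28, 28, 1) and sends them over http to the deployed model. The result contains predictions for both samples. Each prediction is a 1*10 vector giving the probabilities of the digits 0-9, as follows: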

[[0.0078, 0.0064, 0.1581, 0.0823, 0.0073, 0.0228, 0.00423, 0.0189, 0.6845, 0.0074], [0.0071, 0.0017, 0.2068, 0.0241, 0.0093, 0.0063, 0.0059, 0.0021, 0.7179, 0.0187]]
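To turn each probability vector into a predicted digit, take the index of the largest value. The sketch below applies this to the two rows shown above; predicted_digits is a hypothetical helper name. Since the inputs here are random noise, the resulting digit has no real meaning and only illustrates the post-processing step.

```python
predictions = [
    [0.0078, 0.0064, 0.1581, 0.0823, 0.0073, 0.0228, 0.00423, 0.0189, 0.6845, 0.0074],
    [0.0071, 0.0017, 0.2068, 0.0241, 0.0093, 0.0063, 0.0059, 0.0021, 0.7179, 0.0187],
]


def predicted_digits(rows):
    """Return (digit, probability) for the most likely class of each row."""
    return [max(enumerate(row), key=lambda pair: pair[1]) for row in rows]


print(predicted_digits(predictions))  # [(8, 0.6845), (8, 0.7179)]
```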

grpc inference

    import grpc
    import numpy as np
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2
    from tensorflow_serving.apis import prediction_service_pb2_grpc

    # 1. generate data
    np.random.seed(0)
    input_data = np.random.rand(2, 28, 28, 1).astype(np.float32)

    # 2. grpc inference
    root_url = "127.0.0.1:8500"  # grpc port exposed by the container
    channel = grpc.insecure_channel(root_url)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    request = predict_pb2.PredictRequest()
    request.model_spec.name = "m"  # model name
    request.model_spec.signature_name = ""  # signature name
    # "input" is the name of the model input (see the metadata); it can be
    # customized when the model is built
    request.inputs["input"].CopyFrom(
        tf.make_tensor_proto(input_data.tolist(), shape=list(input_data.shape)))
    result_future = stub.Predict.future(request, 10.0)  # 10 secs timeout
    response = result_future.result()

    # 3. parse result
    # "output" is the name of the model output; it can be checked in the metadata
    result = np.reshape(response.outputs["output"].float_val, (input_data.shape[0], 10))

Because the random seed for the generated test data is fixed, the http and grpc tests produce exactly the same results. After parsing with response.outputs["output"].float_val, the grpc output is a one-dimensional array, which has to be converted with reshape.
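The reshape step simply cuts the flat sequence into consecutive rows, one per sample, because the values are laid out in row-major order. A minimal pure-Python sketch of the same operation follows; reshape_rows is a hypothetical helper, and the integer list merely stands in for float_val.

```python
def reshape_rows(flat_values, num_classes):
    """Split a flat sequence into rows of num_classes values (row-major)."""
    flat_values = list(flat_values)
    if len(flat_values) % num_classes != 0:
        raise ValueError("length %d is not a multiple of %d"
                         % (len(flat_values), num_classes))
    return [flat_values[i:i + num_classes]
            for i in range(0, len(flat_values), num_classes)]


flat = list(range(20))          # stands in for float_val of a 2 x 10 output
rows = reshape_rows(flat, 10)
print(len(rows), len(rows[0]))  # 2 10
```

The length check is the useful part in practice: if the flat output is not a multiple of the class count, the output name or the assumed shape is probably wrong.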

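In the deployment stage, tensorflow-serving 1.14.0 was used to deploy pb models generated by tensorflow 1.x, and tensorflow-serving 2.1.0 was used to deploy models generated by tensorflow 2.x. Experiments showed that pb models generated by tensorflow 1.15 and above can be deployed with tensorflow-serving 2.x, while models from tensorflow 1.14 and below can be deployed with tensorflow-serving 1.14.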
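The compatibility rule observed above can be written down as a small lookup, which is handy when a deployment script has to pick an image automatically. This sketch only encodes the rule reported in this article, not an official compatibility matrix; serving_family_for is a hypothetical name.

```python
def serving_family_for(tf_version):
    """Map a tensorflow version string to the tensorflow-serving line that
    this article reports as working: "1.14" or "2.x"."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    if (major, minor) >= (1, 15):
        return "2.x"
    return "1.14"


for version in ("1.12.0", "1.14.0", "1.15.0", "2.2.0"):
    print(version, "->", serving_family_for(version))
```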

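Efficiency comparison of the two interfaces

tensorflow-serving provides both http and grpc, and their speeds differ. This article tested them with the mnist model: the model was started with docker, the request script ran on the host, and each test used 100 samples. The results are as follows: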

Test run	http inference average time (s)	grpc inference average time (s)
1	1.400	1.747
2	1.376	1.725
3	1.365	1.772
4	1.341	1.757
5	1.333	1.637
Average	1.363	1.728
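The article does not show the script used for these measurements. A timing harness of the following shape is one way such averages could be collected; average_latency is a hypothetical helper, and the lambda is only a placeholder for the real http or grpc call.

```python
import time


def average_latency(call, repeats=5):
    """Run call() `repeats` times; return (per-run times, mean) in seconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        times.append(time.perf_counter() - start)
    return times, sum(times) / len(times)


# Placeholder workload; a real test would send the http or grpc request here.
times, mean = average_latency(lambda: sum(range(10000)), repeats=5)
print(["%.6f" % t for t in times], "%.6f" % mean)
```

time.perf_counter is used instead of time.time because it is a monotonic, high-resolution clock intended for measuring short intervals.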

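In addition, we also tested the inference speed of http and grpc with an ssd object detection model. The model input shape is (320, 320, 3), the number of output objects is 100, and each request sends one image. The results are as follows: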

Test run	http inference average time (s)	grpc inference average time (s)
1	0.1473	0.0428
2	0.1425	0.0314
3	0.1456	0.0312
4	0.1413	0.0675
5	0.1496	0.0315
6	0.1462	0.03245
Average	0.146	0.036

Comparing the two tables shows that with a smaller model and input size, http inference takes less time, while with a larger model and input size, grpc has a clear advantage. When the input data is large, serialization and deserialization take considerable time over http, and grpc is faster than http at serialization. When the data is small, ...
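One contributing factor to the serialization cost is payload size: http requests carry the tensor as JSON text, while grpc carries it in a binary protobuf encoding. The sketch below is only an illustration and not a measurement of tensorflow-serving itself. It assumes the (320, 320, 3) input is sent as floats and compares the JSON text size with a plain 4-bytes-per-value float32 packing.

```python
import json
import random
import struct

random.seed(0)
# One (320, 320, 3) image flattened into a list of floats (assumed dtype).
values = [random.random() for _ in range(320 * 320 * 3)]

json_bytes = len(json.dumps(values).encode("utf-8"))
# Pack every value as a little-endian float32: 4 bytes per value.
binary_bytes = len(struct.pack("<%df" % len(values), *values))

print("json:", json_bytes, "bytes")
print("float32 binary:", binary_bytes, "bytes")
print("ratio: %.1fx" % (json_bytes / binary_bytes))
```

Besides being larger, the JSON form also has to be converted between text and numbers on both sides, which adds to the time spent per request.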

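Summary

This article covered the full workflow from model building to model conversion, and on to model deployment and inference. In practice, the biggest problems were converting the h5 model to a pb model and the model inference stage. In many cases, with different versions of tensorflow and keras, the same code may fail to convert h5 to pb correctly, or the converted model cannot be deployed with tfserving (or, if deployed, does not return correct results). In the inference stage, both http and grpc can be used, but the input and output data formats often have to be confirmed repeatedly, which makes quick integration difficult.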

References:

https://hub.docker.com/r/tensorflow/serving
https://blog.csdn.net/qq_42693848/article/details/101153124
