A Tale of Two Frameworks
If you're like me, you have a favourite framework you gravitate towards in every project. For me, it's Tensorflow, particularly since they better integrated Keras in tf2.0. But every time another feature is released in PyTorch, the grass looks a little bit greener on the other side. So this tutorial is for those people with a strong understanding of one framework, and a curiosity about the other.
In this tutorial, I’ll walk you through the same example in both frameworks, including side-by-side comparisons of a few best practices such as:
Data generators to cope with large datasets
Creating a model from scratch
Setting up a loss function and optimizer
Training loop with Tensorboard logging and checkpointing
Improving results by fine-tuning a pre-trained model
Evaluating your model on a test set
Dataset
I’ll be using a familiar dataset, Cats v Dogs, because this guide isn’t about solving a tricky problem (you’ve probably already got your own problem in mind), it’s about creating a general, minimal example that you can easily adapt. Mostly though, I’m using this dataset because if I’m going to spend vast amounts of time looking at images I’d rather they be of cute animals. If you want to follow along exactly, download the data from here.
My dataset is stored in a subdirectory ('data') of the folder containing my training script ('folder'), with the following structure:
folder
├── data/
│   ├── test/
│   │   ├── 1.jpg
│   │   └── ...
│   └── train/
│       ├── cat.0.jpg
│       ├── dog.0.jpg
│       └── ...
Setting up the data generator
My dataset isn't very large (25,000 fairly small images, of which I'll only be using 1000 as a minimal example), so I can load it all into memory. But datasets too large to load into memory are becoming more common, so it's important to have a pipeline that can deal with those situations. A data generator is a great option which allows you to generate the data in real time, run preprocessing and augmentation in batches, and feed it right into the model. This can lead to huge efficiencies during training, since it allows data to be prepped on the CPU while the GPU is running training.
Tensorflow data generator
For my Tensorflow data generator, I'm going to inherit from tf.keras.utils.Sequence, so that I can capitalise on perks like multiprocessing. You'll notice I'm calling a function 'augment' in this code; you can find the code for that here, or make your own function where the input is an image and the output is an augmented version of that image, with fixed size (im_size), scaled between -1 and 1.
The arguments carry important information such as the directory containing the data (data_dir), the batch size, the size the images will be rescaled to (for this purpose they'll have the same height and width), the number of images to use (setting this to a number less than the total number of images is helpful for testing the network and debugging), and whether the data should be shuffled each epoch.
The class needs a few methods in order to function correctly:
__init__ is the initialising method; it's called when the class is instantiated. Here it locates our image names, puts them in a list, and shuffles it.
on_epoch_end is triggered at the end of each epoch; here it just shuffles the data.
Each time the training loop requests new data from the generator, an index will be incremented from 0 to an upper limit defined by __len__. Best practice is to set this upper value to the number of batches in each epoch, so that each image is seen once each epoch.
__getitem__ is called each time data is requested; it takes the aforementioned index, gets a batch_size list of image names based on that index, and fetches them.
The two other methods in the class aren’t strictly required:
__get_data is a private method called by __getitem__ to fetch the images and augment them. You could just put this code in __getitem__ but this layout makes the code more modular.
load_val loads all of the validation images in one go. This kind of defeats the purpose of having a data generator to deal with large datasets, but unfortunately one of the training methods I'll be using ('fit') does not accept a generator as a validation dataset; hopefully this will be fixed in future releases.
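The original code accompanies the article; as a rough sketch of what such a class might look like, assuming an 'augment' helper that takes an image path and returns a resized image scaled to [-1, 1], and a cat/dog labelling convention read from the filename:

```python
import math
import os
import random

import numpy as np
import tensorflow as tf

# `augment` is the augmentation function described above
# (image path in, fixed-size image scaled to [-1, 1] out)

class DataGenerator(tf.keras.utils.Sequence):
    def __init__(self, data_dir, batch_size=32, im_size=128,
                 num_images=1000, shuffle=True):
        self.batch_size = batch_size
        self.im_size = im_size
        self.shuffle = shuffle
        # Locate the image names, put them in a list, and shuffle it
        self.im_paths = [os.path.join(data_dir, f)
                         for f in sorted(os.listdir(data_dir))][:num_images]
        self.on_epoch_end()

    def on_epoch_end(self):
        # Reshuffle at the end of every epoch
        if self.shuffle:
            random.shuffle(self.im_paths)

    def __len__(self):
        # Batches per epoch, so each image is seen once per epoch
        return math.floor(len(self.im_paths) / self.batch_size)

    def __getitem__(self, index):
        # Fetch the batch_size list of image names for this index
        batch = self.im_paths[index * self.batch_size:
                              (index + 1) * self.batch_size]
        return self.__get_data(batch)

    def __get_data(self, batch):
        # Read and augment a batch of images; labels come from filenames
        X = np.empty((len(batch), self.im_size, self.im_size, 3))
        y = np.empty((len(batch), 1))
        for i, path in enumerate(batch):
            X[i] = augment(path, self.im_size)
            y[i] = 1.0 if 'dog' in os.path.basename(path) else 0.0
        return X, y

    def load_val(self):
        # Load every validation image in one go, since 'fit' won't
        # accept a generator as validation data
        return self.__get_data(self.im_paths)
```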
Then it’s a pretty simple matter to create an instance of the generator in the training script, and read in all the validation images.
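Something like the following, where the train/validation directory split is an illustrative assumption rather than the article's exact layout:

```python
train_generator = DataGenerator('data/train', batch_size=32, im_size=128,
                                num_images=850, shuffle=True)
val_generator = DataGenerator('data/val', batch_size=32, im_size=128,
                              num_images=150, shuffle=False)
# Read all of the validation images into memory at once
X_val, y_val = val_generator.load_val()
```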
PyTorch data generator
The PyTorch data generator is fairly similar to the Tensorflow generator. However, in this case, inheriting from torch.utils.data.Dataset allows us to use multiprocessing, analogous to the inheritance of tf.keras.utils.Sequence in the previous section. There are a lot of other similarities too: we're using the augment function, and we're using similar arguments, including batch size, image size, number of images, and shuffle.
The generator involves three of the same methods:
__init__ is the initialising method; here it shuffles the image filenames (which it has been passed) and sets up the augmentation parameters.
__len__ operates in the same way as above
__getitem__ reads one image and augments it. Note that a key difference between this generator and the previous one is that here the generator yields only one image and label; PyTorch manages the batching of the images.
An important thing to note here is the normalization applied to the image if the model type is 'mobilenet'. That's because the network we'll use for 'mobilenet' is a pretrained torchvision model, which was trained using images normalised in this fashion. Therefore, when using this model we need to normalise in the same way.
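A minimal sketch of such a dataset class. The 'augment' helper is again assumed (here with a validation flag that disables the random transforms), and the normalisation constants are ImageNet's standard ones; rescaling from [-1, 1] back to [0, 1] first is my assumption about how the two conventions are reconciled:

```python
import os
import random

import torch
from torch.utils.data import Dataset

class CatDogDataset(Dataset):
    def __init__(self, im_paths, im_size=128, model_type='simple',
                 validation=False, shuffle=True):
        # Shuffle the filenames we've been passed and store the
        # augmentation parameters
        self.im_paths = list(im_paths)
        if shuffle:
            random.shuffle(self.im_paths)
        self.im_size = im_size
        self.model_type = model_type
        self.validation = validation  # validation images skip augmentation

    def __len__(self):
        return len(self.im_paths)

    def __getitem__(self, index):
        # Yield a single image and label; the DataLoader handles batching
        path = self.im_paths[index]
        image = augment(path, self.im_size, validation=self.validation)
        if self.model_type == 'mobilenet':
            # Torchvision pretrained models expect ImageNet normalisation:
            # rescale from [-1, 1] to [0, 1], then standardise per channel
            image = (image + 1.0) / 2.0
            image = (image - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
        label = 1.0 if 'dog' in os.path.basename(path) else 0.0
        # Channels-first float tensor for PyTorch
        image = torch.tensor(image.transpose(2, 0, 1), dtype=torch.float32)
        return image, torch.tensor(label, dtype=torch.float32)
```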
Creating the PyTorch generator in the training pipeline requires a little extra work. First we set up some parameters, including the number of threads used to load data in parallel. Then we instantiate the class, and pass it to the DataLoader class, which also takes the parameters we set up. We create a second generator for validation, where we pass the validation flag to make sure that images won’t undergo augmentation.
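Roughly as follows, where train_paths and val_paths are assumed lists of image filenames:

```python
from torch.utils.data import DataLoader

# num_workers controls how many workers load data in parallel
params = {'batch_size': 32, 'shuffle': True, 'num_workers': 4}

train_loader = DataLoader(CatDogDataset(train_paths), **params)

# Second generator for validation: the validation flag disables augmentation
val_loader = DataLoader(CatDogDataset(val_paths, validation=True,
                                      shuffle=False),
                        batch_size=32, shuffle=False, num_workers=4)
```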
Creating a simple model
Now let’s see what it looks like to create a simple CNN. In both frameworks I’m going to set up a CNN with 4 convolutional layers, separated by max pooling, followed by dropout at 50%, and then two linear layers. We’re not going for performance here, just demonstration.
Both frameworks allow you to create the layers you need from the ground up, which means you have a fair amount of customisability. However, unless you have a very good reason to want to create your own custom layer, I encourage you to save yourself the trouble and use the user-friendly wrappers both frameworks provide.
Tensorflow simple model
Tensorflow has recently properly integrated Keras, the highly popular wrapper that simplifies creation and training of deep neural networks. This is what it looks like to create our simple CNN in Tensorflow Keras.
Using the Sequential model type groups a stack of layers together. The order that the layers are stacked within Sequential denotes the order of the layers in the network. In contrast to what we’ll see in PyTorch, all layers, including those without trainable parameters (like MaxPooling and activation functions) are included in the model. That’s because this one function serves to both declare the structure of the model, and define the flow of data in the forward (and backward) pass.
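A sketch of that model in Keras; the filter counts and dense-layer width are illustrative choices, not necessarily the article's exact ones:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Conv2D, Dense, Dropout, Flatten,
                                     MaxPooling2D)

im_size = 128  # must match the generator's output size

model = Sequential([
    # Four convolutional layers, each followed by max pooling
    Conv2D(16, 3, activation='relu', input_shape=(im_size, im_size, 3)),
    MaxPooling2D(),
    Conv2D(32, 3, activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, activation='relu'),
    MaxPooling2D(),
    Conv2D(128, 3, activation='relu'),
    MaxPooling2D(),
    Dropout(0.5),      # dropout at 50%
    Flatten(),
    Dense(256, activation='relu'),
    Dense(1),          # linear activation: the output is a logit
])
```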
PyTorch simple model
In PyTorch, the model is defined as a class that inherits from nn.Module, with an __init__ that contains the layers, and a method forward() that defines how the data will pass through the network, and returns the output of the network.
Keep in mind that any layers that have parameters that need to be trained (like convolutional layers) need to be registered in __init__. Layers with no trainable parameters (like max pooling and activation functions) can be registered either in __init__ or forward().
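An equivalent sketch in PyTorch, mirroring the layer sizes assumed above (the flattened size in fc1 assumes 128×128 inputs):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers with trainable parameters are registered here
        self.conv1 = nn.Conv2d(3, 16, 3)
        self.conv2 = nn.Conv2d(16, 32, 3)
        self.conv3 = nn.Conv2d(32, 64, 3)
        self.conv4 = nn.Conv2d(64, 128, 3)
        self.dropout = nn.Dropout(0.5)
        self.fc1 = nn.Linear(128 * 6 * 6, 256)  # 6x6 after four conv/pool stages
        self.fc2 = nn.Linear(256, 1)

    def forward(self, x):
        # Max pooling and activations have no parameters, so they can live here
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = F.max_pool2d(F.relu(self.conv3(x)), 2)
        x = F.max_pool2d(F.relu(self.conv4(x)), 2)
        x = self.dropout(x)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        return self.fc2(x)  # logit output

model = SimpleCNN()
```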
This may be a bit more complex than how we connected our network in Tensorflow, but the separation of layers and connectivity lends PyTorch quite a bit of flexibility that isn’t nearly as easy to achieve in Tensorflow.
Defining a loss function and optimizer
The loss function compares the output of the model to the target value, and estimates how far apart they are. The loss function you use will depend on your application; I'm using Binary Cross Entropy with Logit Loss because I'm training a binary classifier. The "logit loss" part is because the output from my model has a linear activation function, which in DL framework terms means that the input to my loss function is a "logit", the term used for a classification output before it passes through a sigmoid or softmax layer. It is more computationally efficient to calculate the sigmoid/softmax together with the cross-entropy, which the BCE with logit loss layer does.
The optimizer is used to update the parameters of the model in order to reduce the loss. There are lots of optimizers to choose from, but I’m using Adam.
Tensorflow loss function and optimizer
In Tensorflow, binary cross-entropy with and without logits is defined through the same function.
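For example (the learning rate here is an arbitrary choice):

```python
import tensorflow as tf

# The same function handles both cases; from_logits=True tells it to
# expect raw (pre-sigmoid) outputs
loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
```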
PyTorch loss function and optimizer
In PyTorch, binary cross-entropy with logits loss is a separate function from the one without logits loss. Also, the optimizer takes the model parameters as input, as well as the learning rate. Therefore, if you're not training all of the parameters (i.e. if you're fine-tuning a model), make sure to only pass in the parameters that you are training.
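For example (again, the learning rate is arbitrary); the filter on requires_grad makes this safe for the fine-tuning case:

```python
import torch

criterion = torch.nn.BCEWithLogitsLoss()  # distinct from nn.BCELoss
# Pass only the parameters that will actually be trained
optimizer = torch.optim.Adam(
    filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)
```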
The training loop (with logging)
Finally, we get to the good stuff: training our network. We're also going to be adding two separate functions to the training loop, one for logging the progress of our training to Tensorboard, and another for model checkpointing.
Tensorboard is used to log the loss and accuracy of the model during training. You can also add other capabilities, such as logging images (which is particularly handy if you're training an image generator) and histograms (great for keeping track of gradients).
Tensorboard provides a great method for inspecting results while training, and comparing different models.

Model checkpointing saves the model or weights in the chosen folder at regular intervals during training (every epoch by default). We will only overwrite the weights each step if the validation accuracy is higher than that of the current checkpoint.
There are many other functions that you can (and should) add, such as learning rate scheduling and early stopping.
Tensorflow training loop (option 1)
In Tensorflow there are multiple ways of training the network. The first is the simplest: it takes advantage of the Keras 'fit' method, using your data generator as the training data input (note that prior to Tensorflow 2.0 you would have had to use 'fit_generator' to take a generator as input, but this has been deprecated in recent releases). Unfortunately, the validation data cannot be passed in as a generator.
Prior to calling ‘fit’ we need to compile our model with the optimizer and loss. We also set some parameters for multiprocessing to speed up the training loop. The other thing to note here is the use of callbacks, which is how we’re defining the Tensorboard and model checkpointing behaviour mentioned earlier.
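Roughly, assuming the generator, model, loss, and optimizer defined earlier (the checkpoint path, epoch count, and worker count are illustrative):

```python
import tensorflow as tf

model.compile(optimizer=optimizer, loss=loss_fn,
              # threshold=0.0 because the model outputs logits
              metrics=[tf.keras.metrics.BinaryAccuracy(threshold=0.0)])

callbacks = [
    tf.keras.callbacks.TensorBoard(log_dir='logs'),
    # Overwrite the saved weights only when validation accuracy improves
    tf.keras.callbacks.ModelCheckpoint('checkpoints/best.h5',
                                       monitor='val_binary_accuracy',
                                       save_best_only=True,
                                       save_weights_only=True),
]

model.fit(train_generator,
          validation_data=(X_val, y_val),  # not a generator, as noted above
          epochs=10,
          callbacks=callbacks,
          workers=4,
          use_multiprocessing=True)
```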
This training code, applied to 850 cat/dog training images, with 150 validation images, on a MacBook Pro with no GPU, gives the following results:
Tensorflow training loop (option 2)
Alternatively, we can define the training loop more explicitly. Specifically, we define a for-loop that iterates over epochs, then another loop over the dataset in batches. First up, we’re defining a number of metrics (train/validation accuracy and loss) which get updated during the train and test step functions.
We then define the train and validation functions. In the train function, we open a GradientTape() scope, in which we call the model to run the forward pass and compute the loss. Then we retrieve the gradients and use the optimizer to update the weights based on the gradients. The difference in the validation function is that we only run the data through the model to calculate the loss and accuracy, logging them both.
An important thing to note here is the use of the tf.function decorator above both the train and test step functions. Tensorflow 2.0 onwards operates in eager mode by default, which is great for line-by-line execution and therefore debugging, but it makes for slower function execution. This decorator converts a Python function to a static Tensorflow graph, which runs faster.
We also need to set up Tensorboard logging manually, and check the validation accuracy to monitor when to save out the model weights.
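Putting those pieces together, a condensed sketch of the explicit loop (log and checkpoint paths, epoch count, and the manual on_epoch_end call are my assumptions):

```python
import tensorflow as tf

epochs = 10
train_loss = tf.keras.metrics.Mean()
train_acc = tf.keras.metrics.BinaryAccuracy(threshold=0.0)
val_loss = tf.keras.metrics.Mean()
val_acc = tf.keras.metrics.BinaryAccuracy(threshold=0.0)
writer = tf.summary.create_file_writer('logs')

@tf.function  # compile the step to a static graph for speed
def train_step(images, labels):
    with tf.GradientTape() as tape:
        logits = model(images, training=True)
        loss = loss_fn(labels, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    train_loss(loss)
    train_acc(labels, logits)

@tf.function
def val_step(images, labels):
    # Forward pass only: just compute and log loss and accuracy
    logits = model(images, training=False)
    val_loss(loss_fn(labels, logits))
    val_acc(labels, logits)

best_acc = 0.0
for epoch in range(epochs):
    for i in range(len(train_generator)):
        train_step(*train_generator[i])
    train_generator.on_epoch_end()  # reshuffle ('fit' would do this for us)
    val_step(X_val, y_val)

    with writer.as_default():  # manual Tensorboard logging
        tf.summary.scalar('train_loss', train_loss.result(), step=epoch)
        tf.summary.scalar('val_accuracy', val_acc.result(), step=epoch)

    if val_acc.result() > best_acc:  # checkpoint only on improvement
        best_acc = float(val_acc.result())
        model.save_weights('checkpoints/best.h5')

    for metric in (train_loss, train_acc, val_loss, val_acc):
        metric.reset_states()
```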
This training code, applied to 850 cat/dog training images, with 150 validation images, on a MacBook Pro with no GPU, gives the following results:
PyTorch training loop
The PyTorch loop follows the same logic as the Tensorflow loop. One of the main differences to note is how elegantly and intuitively the backward pass is run, by calling the backward method on the loss; parameters are then updated by calling the step method on the optimizer. It's important to note the use of the no_grad scope in the validation step, which temporarily sets all of the "requires_grad" flags in the model parameters to False.
Also note the use of ‘model.train()’ and ‘model.eval()’, which are used to switch between modes for models that contain modules which have different training and evaluation behavior, such as batch normalization.
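A condensed sketch of the loop, assuming the model, criterion, optimizer, and data loaders from above (the checkpoint path and epoch count are illustrative):

```python
import torch
from torch.utils.tensorboard import SummaryWriter

epochs = 10
writer = SummaryWriter('logs')
best_acc = 0.0

for epoch in range(epochs):
    model.train()  # training-mode behaviour (dropout, batch norm, ...)
    running_loss = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images).squeeze(1), labels)
        loss.backward()   # backward pass
        optimizer.step()  # parameter update
        running_loss += loss.item()

    model.eval()  # evaluation-mode behaviour
    correct, total = 0, 0
    with torch.no_grad():  # temporarily disable gradient tracking
        for images, labels in val_loader:
            logits = model(images).squeeze(1)
            preds = (torch.sigmoid(logits) > 0.5).float()
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    val_acc = correct / total

    writer.add_scalar('train_loss', running_loss / len(train_loader), epoch)
    writer.add_scalar('val_accuracy', val_acc, epoch)
    if val_acc > best_acc:  # checkpoint only on improvement
        best_acc = val_acc
        torch.save(model.state_dict(), 'checkpoints/best.pt')
```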
This training code, applied to 850 cat/dog training images, with 150 validation images, on a MacBook Pro with no GPU, gives the following results:
This is a notably longer time per epoch compared with Tensorflow. Monitoring the use of my CPU cores during execution shows they're being under-utilised compared to the Tensorflow implementation, despite the setting of the num_workers parameter in the data loader. This is currently a known issue in PyTorch; please comment if you know of a fix!
Improving the results with a pre-trained model
One of the great aspects of Tensorflow and PyTorch as deep learning frameworks is the ability to capitalise on in-built pre-trained models. Both frameworks include many of the most popular models pretrained on ImageNet for you to use for free. Better yet, it’s quite easy to start using these networks, and replace the classification layers with something that better fits your problem.
Tensorflow pre-trained model
One particularly easy way of using a pre-trained model in Tensorflow is through Keras Applications, which are canned architectures with pre-trained weights. Note that this is not the only way of using pre-trained models, but it’s probably the easiest. If your model of choice isn’t listed here you can check out TFHub or TF Model Garden.
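For instance, a frozen feature extractor with a fresh classification head; the choice of MobileNetV2 and the head layout are assumptions on my part:

```python
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(input_shape=(128, 128, 3),
                                         include_top=False,
                                         weights='imagenet')
base.trainable = False  # freeze the pretrained feature extractor

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),  # logit output, as before
])
# Compile and fit exactly as in the training loops above
```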
This training code, applied to 850 cat/dog training images, with 150 validation images, on a MacBook Pro with no GPU, gives the following results:
This is much better than the simple model! Note that the training time is lower than the simple model due to the smaller number of parameters actually being trained (even though the whole model is much larger).
PyTorch pre-trained model
The Pytorch equivalent of Keras Applications is Torchvision. Torchvision pre-trained networks require their inputs to be normalised in a particular way, see here for details. As mentioned earlier, I normalised the images in the data augmentation stage.
We don't want to train the feature-extraction stage of the network, so we turn off the 'requires_grad' flag for all layers of the network before replacing the second classification layer with our own (trainable) linear layer. We pass only this layer to the optimizer.
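A sketch of those steps, again assuming MobileNetV2 as the architecture:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.mobilenet_v2(pretrained=True)

# Turn off requires_grad for every pretrained layer
for param in model.parameters():
    param.requires_grad = False

# Replace the second classification layer with a trainable linear layer
model.classifier[1] = nn.Linear(model.last_channel, 1)

# Pass only the new layer's parameters to the optimizer
optimizer = torch.optim.Adam(model.classifier[1].parameters(), lr=1e-4)
```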
This training code, applied to 850 cat/dog training images, with 150 validation images, on a MacBook Pro with no GPU, gives the following results:
Model evaluation
It's been a long read, but we're almost there; we just need to evaluate the models. The following code assumes you've already created the model layout, and therefore just need to load the weights from file.
Tensorflow model evaluation
The Tensorflow method ‘load_weights’ used on the predefined model structure loads and applies the trained parameters of the model found in the selected checkpoint file. The following code grabs one batch of images from the test set and runs them through the model.
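Roughly as follows, with the checkpoint path matching the one assumed in the training sketches; the test images are unlabelled, so only predictions are produced:

```python
import tensorflow as tf

model.load_weights('checkpoints/best.h5')

test_generator = DataGenerator('data/test', batch_size=32, im_size=128,
                               num_images=32, shuffle=False)
X_test, _ = test_generator[0]  # one batch of test images
logits = model.predict(X_test)
preds = (tf.sigmoid(logits).numpy() > 0.5).astype(int)  # 1 = dog, 0 = cat
```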
PyTorch model evaluation
The Pytorch function ‘load_state_dict’ applies the state of the parameters of ‘model’ found in the selected checkpoint file. The following code grabs one batch of images from the test set and runs them through the model.
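And similarly here, where test_loader is an assumed DataLoader over the test images:

```python
import torch

model.load_state_dict(torch.load('checkpoints/best.pt'))
model.eval()

images, _ = next(iter(test_loader))  # one batch from the test set
with torch.no_grad():
    logits = model(images).squeeze(1)
    preds = (torch.sigmoid(logits) > 0.5).long()  # 1 = dog, 0 = cat
```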
Summary
I hope that this tutorial has given you a better understanding of the use of the counterpart to your preferred framework. Both frameworks have developed to the point that they’re simultaneously easy to use for beginners and highly customisable when required. To see the code in full, check out the code on GitHub.
Translated from: https://towardsdatascience.com/a-tale-of-two-frameworks-985fa7fcec