Ocular Disease Recognition Using Convolutional Neural Networks

About this project

This project is part of the Algorithms for Massive Data course organized by the University of Milan, which I recently had the chance to attend. The task is to develop a deep learning model able to recognize eye diseases from eye-fundus images using the TensorFlow library. An important requirement is to make the training process scalable, that is, to create a data pipeline able to handle massive amounts of data points. In this article, I summarize my findings on convolutional neural networks and on methods of building efficient data pipelines using the TensorFlow dataset object. The entire code, with reproducible experiments, is available in my GitHub repository: https://github.com/GrzegorzMeller/AlgorithmsForMassiveData

Introduction

Early ocular disease detection is an economical and effective way to prevent blindness caused by diabetes, glaucoma, cataract, age-related macular degeneration (AMD), and many other diseases. According to the World Health Organization (WHO), at least 2.2 billion people around the world currently have vision impairments, of whom at least 1 billion have an impairment that could have been prevented [1]. Rapid and automatic detection of diseases is critical and urgent for reducing the ophthalmologist's workload and preventing vision damage in patients. Given high-quality medical eye-fundus images, computer vision and deep learning can detect ocular diseases automatically. In this article, I show different experiments and approaches towards building an advanced classification model based on convolutional neural networks written with the TensorFlow library.

Dataset

Ocular Disease Intelligent Recognition (ODIR) is a structured ophthalmic database of 5,000 patients, containing age, color fundus photographs of the left and right eyes, and doctors' diagnostic keywords. This dataset is meant to represent a "real-life" set of patient information collected by Shanggong Medical Technology Co., Ltd. from different hospitals and medical centers in China. In these institutions, fundus images are captured by various cameras on the market, such as Canon, Zeiss, and Kowa, resulting in varied image resolutions. Annotations were labeled by trained human readers with quality-control management [2]. Patients are classified into eight labels: normal (N), diabetes (D), glaucoma (G), cataract (C), AMD (A), hypertension (H), myopia (M), and other diseases/abnormalities (O).

After preliminary data exploration, I found the following main challenges in the ODIR dataset:

· Highly unbalanced data. Most images are classified as normal (1,140 examples), while specific diseases such as hypertension have only 100 occurrences in the dataset.

· The dataset contains multi-label diseases, because each eye can have not just a single disease but a combination of several.

· Images labeled as "other diseases/abnormalities" (O) are associated with more than 10 different diseases, which stretches the variability even further.

· Very large and varied image resolutions. Most images are around 2976x2976 or 2592x1728 pixels.

All these issues take a significant toll on accuracy and other metrics.

Data Pre-Processing

Firstly, all images are resized. Initially, I wanted to resize images "on the fly" using the TensorFlow dataset object, so that images were resized while training the model; I thought this would avoid a time-consuming one-off resizing pass. Unfortunately, it was not a good decision: executing a single epoch could take up to 15 minutes. So I created a separate function that resizes the images before the TensorFlow dataset object is created. As a result, the data are resized only once and saved to a different directory, which let me experiment with different training approaches with much faster training runs. Initially, all images were resized to 32x32 pixels, but I quickly realized that compressing to such a small size, even though it speeds up training significantly, loses a lot of important image information, so accuracy was very low. After several experiments I found that 250x250 pixels was the best compromise between training speed and accuracy, so I kept this size for all images in all further experiments.

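A minimal sketch of this one-off resizing step could look as follows; only the 250x250 target size comes from the text, while the directory names and file handling are illustrative assumptions:

```python
import os
from PIL import Image

SRC_DIR = "ODIR/train_raw"      # assumed location of the original fundus images
DST_DIR = "ODIR/train_250"      # resized copies are written here; originals are kept
TARGET_SIZE = (250, 250)        # size chosen in the article

os.makedirs(DST_DIR, exist_ok=True)
for fname in os.listdir(SRC_DIR):
    if not fname.lower().endswith((".jpg", ".jpeg", ".png")):
        continue
    img = Image.open(os.path.join(SRC_DIR, fname)).convert("RGB")
    img = img.resize(TARGET_SIZE, Image.BILINEAR)   # one-off resize instead of resizing every epoch
    img.save(os.path.join(DST_DIR, fname))
```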

Secondly, the images are labeled. There is a problem with the image annotations in the data.csv file: the labels relate to both eyes (left and right) at once, whereas each eye can have a different disease. For example, if the left eye has a cataract and the right eye has a normal fundus, the label would be cataract, saying nothing about the diagnosis of the right eye. Fortunately, the diagnostic keywords relate to a single eye. The dataset was created so that the model receives both left and right eye images as input and returns the overall (for both eyes) cumulated diagnosis, neglecting the fact that one eye can be healthy. In my opinion this does not make sense from the perspective of a final user of such a model; it is better to get predictions separately for each eye, to know, for example, which eye should be treated. So I enriched the dataset by creating a mapping from the diagnostic keywords to the disease labels. This way, each eye is assigned its proper label. A fragment of this mapping, in the form of a dictionary, is presented in Fig. 1. Label information is added by renaming the image files, more specifically by appending to the file name one or more letters corresponding to the specific diseases. I applied this solution because this way I do not need to store any additional data frame with the labels. Renaming files is a very fast operation, and in the official TensorFlow documentation TensorFlow datasets are created simply from files, with the label information retrieved from the file name [3]. Moreover, some images whose annotations relate not to a specific disease but to the low quality of the image, such as "lens dust" or "optic disk photographically invisible", are removed from the dataset, as they do not play a decisive role in determining the patient's disease.

Fig. 1: Fragment of the dictionary mapping specific diagnostic keywords to disease labels
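
Since Fig. 1 is only a screenshot, the snippet below is a hedged reconstruction of the idea: a dictionary fragment mapping diagnostic keywords to one-letter disease codes, and a renaming step that appends those codes to the image file name. The keyword strings and the file-name scheme are illustrative assumptions, not the exact ones from the repository:

```python
import os

# Illustrative fragment of the keyword -> label mapping (assumed wording)
KEYWORD_TO_LABEL = {
    "normal fundus": "N",
    "cataract": "C",
    "pathological myopia": "M",
    "moderate non proliferative retinopathy": "D",
    "glaucoma": "G",
}

def label_image(image_dir, fname, keywords):
    """Append one-letter disease codes to the file name, e.g. 1_left.jpg -> 1_left_NC.jpg."""
    codes = sorted({KEYWORD_TO_LABEL[k] for k in keywords if k in KEYWORD_TO_LABEL})
    if not codes:
        return  # keyword describes image quality only (e.g. "lens dust"); such images are dropped
    stem, ext = os.path.splitext(fname)
    new_name = f"{stem}_{''.join(codes)}{ext}"
    os.rename(os.path.join(image_dir, fname), os.path.join(image_dir, new_name))
```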

Thirdly, the validation set is created by randomly selecting 30% of all available images. I chose 30% because this dataset is relatively small (only 7,000 images in total), but I wanted the validation set to be representative enough, so that evaluating the model would not be biased by many image variants or classes having no representation in the validation set. The ODIR dataset provides testing images, but unfortunately no labeling information is given for them in the data.csv file, so I could not use the available testing images to evaluate the model.

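A possible sketch of that split, simply moving a random 30% of the labeled files into a separate validation directory (the directory names and the fixed seed are assumptions):

```python
import os
import random
import shutil

TRAIN_DIR = "ODIR/train_250"    # assumed directory of resized, renamed images
VAL_DIR = "ODIR/val_250"
os.makedirs(VAL_DIR, exist_ok=True)

random.seed(42)                 # fixed seed so the split is reproducible
files = [f for f in os.listdir(TRAIN_DIR) if f.lower().endswith(".jpg")]
val_files = random.sample(files, int(0.3 * len(files)))   # 30% of all available images
for fname in val_files:
    shutil.move(os.path.join(TRAIN_DIR, fname), os.path.join(VAL_DIR, fname))
```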

Next, data augmentation of the minority classes was applied to the training set to balance the dataset: random zoom, random rotation, left-right flips, and top-bottom flips. At first, I used the TensorFlow dataset object to apply data augmentation "on the fly" while training the model [4], in order to keep the solution as scalable as possible. Unfortunately, it lacks many features, such as random rotation, so I performed the data augmentation before creating the TensorFlow dataset object, using other image-processing libraries such as OpenCV. I also considered enhancing all images with contrast-limited adaptive histogram equalization (CLAHE) to increase the visibility of local details, but since it added a lot of extra noise to the images (especially to the background, which is originally black), I decided not to follow that direction. Examples of data augmentation using my function written with the PIL and OpenCV libraries are presented in Fig. 2.

Fig. 2: Exemplary data augmentation results
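
Fig. 2 is a screenshot as well, so here is a minimal offline-augmentation sketch in the same spirit, using PIL for random rotation, random zoom, and flips; the parameter ranges are assumptions, not the values used in the original function:

```python
import random
from PIL import Image

def augment(img: Image.Image) -> Image.Image:
    """Return one randomly augmented copy of a fundus image."""
    # random rotation (a transform the on-the-fly pipeline lacked, per the text)
    img = img.rotate(random.uniform(-25, 25), resample=Image.BILINEAR)
    # random zoom: crop a central region and resize back to the original size
    w, h = img.size
    zoom = random.uniform(0.85, 1.0)
    cw, ch = int(w * zoom), int(h * zoom)
    left, top = (w - cw) // 2, (h - ch) // 2
    img = img.crop((left, top, left + cw, top + ch)).resize((w, h), Image.BILINEAR)
    # random horizontal / vertical flips
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_TOP_BOTTOM)
    return img
```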

Finally, the TensorFlow dataset object is created. It is built very similarly to the one presented in the official TensorFlow documentation for loading images [5]. Since the library is complicated and not easy to use for TensorFlow beginners, I would like to share a summary of my findings on building scalable and fast input pipelines. The tf.data API enables you to build complex input pipelines from simple, reusable pieces; for example, the pipeline for an image model might aggregate data from files in a distributed file system. The tf.data API introduces a tf.data.Dataset abstraction that represents a sequence of elements, in which each element consists of one or more components. In my image pipeline, an element is a single training example, with a pair of tensor components representing the image and its label [6]. Around the idea of mini-batches, TensorFlow builds an iterative learning process: a portion of the data (not the entire dataset) is fed to the model, the model is trained, and the process is repeated with another portion; these portions are called batches. The batch size defines how many examples are extracted at each training step, and after each step the weights are updated. I selected a batch size of 32 in order to avoid overfitting: with a small batch size, the weights keep updating regularly and often. The downside of a small batch size is that training takes much longer than with a bigger one. One important element of tf.data is the ability to shuffle the dataset. In shuffling, the dataset fills a buffer with elements, then randomly samples elements from this buffer, replacing the selected elements with new ones [7]. This prevents batches from being repeatedly filled with images of the same class, which would not be beneficial for training the model.

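A hedged sketch of such an input pipeline, following the pattern of the official image-loading guide: the multi-hot label is parsed from the letters appended to the file name (the hypothetical scheme from the renaming sketch above), and the dataset is shuffled, batched into mini-batches of 32, and prefetched:

```python
import tensorflow as tf

CLASS_LETTERS = tf.constant(list("NDGCAHMO"))   # the eight ODIR disease codes
IMG_SIZE = 250
BATCH_SIZE = 32

def parse_image(path):
    # multi-hot label from the letters appended to the file name, e.g. ".../1_left_NC.jpg"
    stem = tf.strings.regex_replace(tf.strings.split(path, "/")[-1], r"\.jpg$", "")
    codes = tf.strings.bytes_split(tf.strings.split(stem, "_")[-1])
    label = tf.reduce_max(
        tf.cast(tf.equal(tf.expand_dims(codes, 1), CLASS_LETTERS), tf.float32), axis=0)
    image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE]) / 255.0  # safety net; images are pre-resized
    return image, label

train_ds = (tf.data.Dataset.list_files("ODIR/train_250/*.jpg")
            .map(parse_image, num_parallel_calls=tf.data.AUTOTUNE)
            .shuffle(buffer_size=1000)          # fill a buffer and sample elements from it
            .batch(BATCH_SIZE)
            .prefetch(tf.data.AUTOTUNE))
```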

Building a Convolutional Neural Network

In deep learning, a convolutional neural network (CNN) is a class of deep neural networks most commonly applied to analyzing visual imagery [8]. The input layer takes 250x250 RGB images. The first 2D convolution layer slides a window of 5x5 pixels over the input image to extract features and save them in a multi-dimensional array; in my example the number of filters in the first layer equals 32, giving a (250, 250, 32) cube.

After each convolution layer, a rectified linear activation function (ReLU) is applied. The activation function decides whether a neuron should be activated based on the weighted sum of its inputs. ReLU returns the input value directly, or 0.0 if the input is 0.0 or less, i.e. f(x) = max(0, x). Because rectified linear units are nearly linear, they preserve many of the properties that make linear models easy to optimize with gradient-based methods; they also preserve many of the properties that make linear models generalize well [9].

To progressively reduce the spatial size of the input representation and minimize the number of parameters and computations in the network, a max-pooling layer is added. In short, for each region covered by a filter of a specific size, in my example (5, 5), it takes the maximum value of that region and creates a new output matrix in which each element is the maximum of the corresponding region in the original input.

To avoid overfitting, two dropout layers with a rate of 45% were added. Several batch normalization layers were also added to the model. Batch normalization is a technique for improving the speed, performance, and stability of artificial neural networks [10]. It shifts the distribution of the neuron outputs so that they better fit the activation function.

Finally, the "cube" is flattened. No fully connected hidden layers are implemented, to keep the network simple and training fast. The last layer is a dense layer with 8 units, because 8 is the number of labels (diseases) present in the dataset. Since we are facing multi-label classification (a data sample can belong to multiple classes), the sigmoid activation function is applied to the last layer. The sigmoid function converts each score of the final node to a value between 0 and 1, independently of the other scores (in contrast to functions such as softmax), which is why sigmoid works best for multi-label classification problems. Since we are using the sigmoid activation function, we must use the binary cross-entropy loss. The selected optimizer is Adam with a low learning rate of 0.0001, because of the overfitting problems I was facing during training. The entire architecture of my CNN is presented in Fig. 3.

Fig. 3: Model summary
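
Because Fig. 3 is only a screenshot of model.summary(), the following Keras sketch reconstructs an architecture consistent with the description above. The first convolution (32 filters, 5x5 window), the (5, 5) pooling, the two 45% dropouts, the batch normalization layers, the 8-unit sigmoid output, the binary cross-entropy loss, and Adam with a 0.0001 learning rate come from the text; the second convolution block and its filter count are assumptions:

```python
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(250, 250, 3)),
    layers.Conv2D(32, (5, 5), padding="same", activation="relu"),
    layers.MaxPooling2D((5, 5)),
    layers.BatchNormalization(),
    layers.Conv2D(64, (5, 5), padding="same", activation="relu"),  # assumed filter count
    layers.MaxPooling2D((5, 5)),
    layers.BatchNormalization(),
    layers.Dropout(0.45),
    layers.Flatten(),                        # no fully connected hidden layers, as described
    layers.Dropout(0.45),
    layers.Dense(8, activation="sigmoid"),   # one sigmoid output per disease label (multi-label)
])
model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy",
              metrics=["accuracy"])
```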

Experiments and Results

For simplicity, I wanted to start my research with easy proof-of-concept experiments on smaller, less challenging datasets, to test whether all previous assumptions were correct. Thus, I started by training a simple model to detect whether an eye has a normal fundus or a cataract, training only on images labeled as N (normal) or C (cataract). The results were very satisfactory: using a relatively simple network, my model reached 93% validation accuracy in 12 epochs. This already shows that CNNs can automatically detect eye cataracts! In each subsequent experiment, I added images of another class to the dataset. The fourth experiment was performed on the entire ODIR dataset, achieving almost 50% validation accuracy. The results of the experiments are presented in Table 1. As we can clearly see, the overall model scores low because it is hard to train it to detect diabetes correctly, since an eye with diabetes looks almost the same as an eye with a normal fundus. Detecting myopia or cataract is a much easier task because those images differ a lot both from each other and from the normal fundus. An illustration of selected diseases is presented in Fig. 4.

Table 1: Experiment results. Legend: N — normal, C — cataract, M — myopia, A — AMD, D — diabetes, ALL — model trained on the entire ODIR dataset

Fig. 4: Illustration of different eye diseases. Diabetes is clearly the most challenging to detect, while cataract is the easiest, as it differs the most from the normal fundus.

For all experiments, the same neural network architecture was used. The only differences were the number of epochs each experiment needed to reach the presented results (some needed early stopping, others needed more epochs to learn). Also, for the experiments that did not include the entire dataset, the softmax activation function and the categorical cross-entropy loss were used, since those are multi-class rather than multi-label classification problems.

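The article does not show this switch as code, but a small hypothetical helper illustrates how the output layer and loss differ between the proof-of-concept experiments and the full multi-label model:

```python
from tensorflow.keras import layers

def build_head(num_classes: int, multi_label: bool):
    """Return the output layer and loss used for the different experiments."""
    if multi_label:
        # full ODIR setting: an eye may carry several disease labels at once
        return layers.Dense(num_classes, activation="sigmoid"), "binary_crossentropy"
    # proof-of-concept settings (e.g. N vs C): exactly one class per image
    return layers.Dense(num_classes, activation="softmax"), "categorical_crossentropy"
```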

Final Considerations on Model Scalability

Nowadays, in the world of Big Data, it is crucial to evaluate every IT project on its scalability and reproducibility. From the beginning of this project I put a lot of emphasis on the idea that, even though it is a research project, the model could be re-trained in the future on more eye-disease data points, and it would certainly achieve much better results with more images to train on. So the main goal was to build a universal data pipeline able to handle many more data points. This goal was mostly achieved by using the TensorFlow library, especially the dataset object, which supports ETL processes (Extract, Transform, Load) on large datasets. Unfortunately, some transformations, namely image resizing and augmenting the minority classes, had to be done before creating the TensorFlow dataset object. Perhaps in the future it will be possible to resize images "on the fly" faster, and more augmentation functions, such as the random rotation mentioned earlier, will be added. If more data points become available in the future, it might not be necessary to perform any augmentation at all, as sufficient image variation would already be present. Compared with other popular datasets used in deep learning projects, ODIR is a small one; that is why the data points had to be augmented and oversampled in order to achieve sensible results.

Summary

In this project, I have shown that it is possible to detect various eye diseases using convolutional neural networks. The most satisfying result is detecting cataracts with 93% accuracy. Examining all the diseases at once gave significantly lower results. With the ODIR dataset, providing the training model with all the important variations of a specific disease was not always possible, which affects the final metrics. I am confident, though, that a bigger dataset would increase the accuracy of the predictions and finally automate the process of detecting ocular diseases.

Translated from: https://towardsdatascience.com/ocular-disease-recognition-using-convolutional-neural-networks-c04d63a7a2da
