Handwritten Recognition: Resizing Strokes Instead of Images


A straightforward algorithm to dealing with handwritten symbol recognition problems in Machine Learning


The recognition of handwritten symbols is one of the most popular and representative examples of a machine-learning problem, even more so when learning deep neural-network concepts. For this reason, in addition to its obvious practical applications, it is a widely researched topic.


Recently, I developed a simple web-based calculator that evaluates handwritten numerical expressions (handcalc.io) using a deep neural network kernel, without the use of any existing external machine learning or image processing libraries. My initial approach was to train the model using existing databases found online, but this proved to be ineffective, considering there were significant discrepancies between these and the symbols written from within the web-app.


In this article I dive in more detail into the motivation and process behind implementing a so-called stroke-scaling algorithm. This was crucial both for creating an effective training dataset for the neural network and as a pre-processing step that makes the web-app's user inputs compatible with the prediction model.


Recognising symbols outside the validation/testing datasets

The process of training the neural network model initially seemed to be a straightforward task, considering the availability of existing handwritten-symbol databases (e.g. the MNIST database). Unfortunately, although my first trained models achieved accuracies above 95% on their validation sets, they proved to be virtually useless when dealing with symbols coming from outside the dataset.


The reason is visually evident: the symbol-data belonging to existing datasets usually originates from existing handwritten texts, where, in general, the stroke is rather thick compared to the symbol's proportions (Fig. 1). Each datapoint is a rather non-sparse matrix containing a swarm of agglomerated pixels of different intensities.


Fig. 1: Sample symbol ‘5’ from the MNIST dataset

The data-points coming from the web-application, however, were sparse, noiseless, and contained thin strokes. The pixels are filled or unfilled, and thus embody a “purer” version of the symbols meant by the writer (Fig. 2).


Fig. 2: Sample symbol ‘5’ from the drawing board within the web-application

Both versions are clearly incompatible, considering their nature is inherently different, and the initial model was therefore doomed to be unsatisfactory for the intended application.


Facing the status quo, virtually the only viable solution was to create a new dataset from scratch, compatible with the type of data-points the users would generate when using the application (the dataset can be found here). For this, I created a simple image-processing library that allowed me to extract the data from photographed pieces of paper with thin handwritten symbols (the library, together with the custom neural network library can be found here).


Although the library contains many functionalities, it was the stroke-scaling algorithm that really ensured consistency across all data-points, whether it was during the creation of the model’s datasets, or during the image pre-processing within the web-app.


The stroke-scaling algorithm

The main idea behind this algorithm is to scale images without altering the information about the stroke itself. When we draw a symbol, our mind only thinks of the pure shape of the stroke. The same symbol written with pens of different thicknesses should "contain" the same information, since the same thing was meant. This algorithm seeks to scale symbol-images while preserving the stroke's information.


In the case of scaling up the original image (Fig. 3), every one-pixel-wide line may only change its length, not its thickness.


Fig. 3: Unscaled symbol (14x14 pixels)

The scaled-up image may look thinner, but in reality it contains the same information, as if one had drawn the symbol with the same hand movement, just on a canvas of higher resolution (Fig. 4):


Fig. 4: Scaled symbol (28x28 pixels)

During the scaling process, the following steps take place: first, every filled pixel of the original image is mapped once onto a blank canvas of the desired size; second, an interpolated line is created between every pair of touching pixels, except when two pixels touch each other diagonally on a corner. As a consequence, lines may only change length, not width. Since each pixel is always mapped, in the case of downscaling the thin lines do not disappear and remain at least one pixel wide.

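The two steps can be sketched as standalone Python, stripped down for illustration. Note that this is not the project's actual code: it rounds instead of flooring, only checks the right and down neighbours (so every adjacent pair is interpolated exactly once), and omits the corner test discussed later.

```python
import numpy as np

def scale_stroke(img, out_h, out_w):
    """Simplified stroke-preserving scaling: map each filled pixel once,
    then draw a linearly interpolated line between touching pixels."""
    in_h, in_w = img.shape
    out = np.zeros((out_h, out_w), dtype=bool)
    sy = (out_h - 1) / (in_h - 1) if in_h > 1 else 0.0
    sx = (out_w - 1) / (in_w - 1) if in_w > 1 else 0.0
    for y in range(in_h):
        for x in range(in_w):
            if not img[y, x]:
                continue
            ys, xs = round(y * sy), round(x * sx)
            out[ys, xs] = True  # step 1: map the pixel once
            # step 2: interpolate towards the right and down neighbours
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < in_h and nx < in_w and img[ny, nx]:
                    nys, nxs = round(ny * sy), round(nx * sx)
                    t_max = max(abs(nys - ys), abs(nxs - xs))
                    for t in range(1, t_max):
                        out[ys + round(t / t_max * (nys - ys)),
                            xs + round(t / t_max * (nxs - xs))] = True
    return out

# A one-pixel-wide horizontal line stays one pixel wide after upscaling
line = np.zeros((4, 4), dtype=bool)
line[1, :] = True
scaled = scale_stroke(line, 8, 8)
```

Scaling the 4x4 line to 8x8 yields a continuous line that is still one pixel wide, whereas a naive nearest-neighbour resize would have made it two pixels thick.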

Algorithm implementation

To keep the code pragmatic, the functions presented hereunder are methods within more complex class implementations, and thus not standalone programs. This allows for simplicity, avoids passing too many arguments to the functions, and keeps the reader's focus on the key areas.


Originally, I implemented the code in Python for the preprocessing of data-points during the creation of the datasets, and in JavaScript for the pre-processing of user-inputs within the web-application. However, I also included an implementation in C++, offering a variant that prioritises performance.


All three implementations display different levels of abstraction: the JavaScript version deals with images using custom objects like grids, fields, and coordinates, offering a more intuitive approach; the Python implementation deals with the images as 2D NumPy arrays; and the C++ implementation uses basic 1D arrays, which offers better time and memory performance, at the cost of less readable source code.


It is worth mentioning that the algorithm uses a method to recognise whether pixels touching diagonally are part of a corner (and should therefore skip the interpolation step). To understand the details behind this implementation, please visit the full version of each algorithm (linked at the title and caption of each version).

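The linked repositories contain the actual corner check; what follows is only a hedged sketch of the idea, with names and exact logic of my own invention rather than the project's: two pixels touching diagonally count as a corner when they are already connected through a filled orthogonal neighbour, so an interpolated line between them would be redundant.

```python
import numpy as np

def is_corner(img, x, y, position):
    """Hypothetical sketch of the corner test. position is (dy, dx), as in
    the implementations below. A diagonal neighbour that already shares a
    filled orthogonal pixel with (x, y) is part of a corner, so the
    interpolation step between the two should be skipped."""
    dy, dx = position
    if abs(dx) + abs(dy) != 2:
        return False  # orthogonal neighbours are never corners
    h, w = img.shape
    # The two orthogonal pixels that would connect (x, y) and (x+dx, y+dy)
    via_horizontal = 0 <= x + dx < w and img[y, x + dx]
    via_vertical = 0 <= y + dy < h and img[y + dy, x]
    return bool(via_horizontal or via_vertical)

corner = np.zeros((2, 2), dtype=bool)
corner[0, 0] = corner[1, 0] = corner[1, 1] = True  # an L-shaped corner
diag = np.eye(2, dtype=bool)  # two pixels touching only diagonally
```

With these inputs, the L-shape is flagged as a corner (no interpolation) while the purely diagonal pair is not.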

JavaScript (full implementation found here): As mentioned above, this implementation deals with a rather sophisticated abstraction of the image canvas. It uses the grid object, which essentially contains its dimensions as attributes and a 1D array of fields - another custom object - each of which owns an immutable 2D coordinate and a boolean holding the information of whether the field is filled or not.


Apart from the specifics of how the objects in question are created, accessed, or mutated, the algorithm remains essentially unchanged with respect to the Python and C++ implementations, and the naming conventions provide an intuitive understanding.


```javascript
scale(xFields, yFields) {
  // Scales a grid to fit the given dimensions. Only the stroke is scaled, meaning that
  // each pixel will only be mapped once to the destination. Its position will be scaled,
  // but its thickness won't. Between non-corner filled pixels an interpolated line is
  // created to keep the stroke continuous.

  // Destination canvas (object containing an array of fields with a coordinate and a boolean)
  let scaledGrid = new Grid(xFields, yFields);

  // Dimensions of original grid/image
  const shapeX = this.grid.xFields;
  const shapeY = this.grid.yFields;

  // Scaled dimensions without extra width, in case there are filled pixels on the edge
  // (they don't change width)
  const xFieldsAugmented = shapeX !== 1 ? Math.ceil(((xFields - 1) * shapeX) / (shapeX - 1)) : 0;
  const yFieldsAugmented = shapeY !== 1 ? Math.ceil(((yFields - 1) * shapeY) / (shapeY - 1)) : 0;

  // Scaling ratios
  const scalingX = xFieldsAugmented / shapeX;
  const scalingY = yFieldsAugmented / shapeY;

  // Relative positions of surrounding fields with respect to the current field
  const positions = [
    [-1, 1],
    [0, 1],
    [1, 1],
    [1, 0],
    [-1, -1],
  ];

  // Iteration through every field in the original grid.
  // If any of the surrounding fields in the original grid is filled, linear interpolation
  // between the current pixel and the surrounding one is performed, filling fields in between
  for (let y = 0; y < shapeY; y++) {
    for (let x = 0; x < shapeX; x++) {
      // Filled fields in the original grid get mapped into the destination (scaled) grid
      if (this.grid.getField(x, y).isFilled) {
        const xScaled = Math.floor(x * scalingX);
        const yScaled = Math.floor(y * scalingY);
        scaledGrid.getField(xScaled, yScaled).isFilled = this.grid.getField(x, y).isFilled;
        // Every position is checked
        for (let position of positions) {
          // Calculates the adjacent pixel, checking it's not out of bounds
          const xNext = 0 <= x + position[1] && x + position[1] < shapeX ? x + position[1] : x;
          const yNext = 0 <= y + position[0] && y + position[0] < shapeY ? y + position[0] : y;
          // Interpolation happens only if the next pixel is filled AND they're not in a corner
          // (to avoid lines between diagonally touching pixels in a corner)
          if (this.grid.getField(xNext, yNext).isFilled && !this._isCorner(x, y, position)) {
            const xScaledNext = Math.floor(xNext * scalingX);
            const yScaledNext = Math.floor(yNext * scalingY);
            // Linear interpolation between the mappings of the current and adjacent pixels
            const tMax = Math.max(Math.abs(xScaledNext - xScaled), Math.abs(yScaledNext - yScaled));
            for (let t = 1; t < tMax; t++) {
              const xP = Math.floor(xScaled + (t / tMax) * (xScaledNext - xScaled));
              const yP = Math.floor(yScaled + (t / tMax) * (yScaledNext - yScaled));
              scaledGrid.getField(xP, yP).isFilled = this.grid.getField(x, y).isFilled;
            }
          }
        }
      }
    }
  }
  return this.grid.replaceFields(scaledGrid);
}
```

Python (full implementation found here): For performance and clear semantics, the natural way to approach this is by using NumPy arrays. They allow 2D indexing and are faster and more memory-efficient than lists, since they manage space similarly to classic C++ arrays, using adjacent memory slots.


This implementation is somewhat less abstract than the JavaScript one, and belongs to a more complex Python subclass, but it should be intelligible enough despite being taken out of its full context.


```python
def scale(self, xFields, yFields):
    '''Scales an image to the specified dimensions, without keeping the
    ratio. Only filled pixels are taken into consideration, and the spaces
    in between are interpolated, so the stroke keeps its width.'''
    # Destination canvas (numpy array containing 0's or 1's)
    scaledData = np.zeros((yFields, xFields))
    # Dimensions of original grid/image
    shapeX = self.imageData.data.shape[1]
    shapeY = self.imageData.data.shape[0]
    # Scaled dimensions without extra width, in case there are filled pixels
    # on the edge (they don't change width)
    xFieldsAugmented = math.ceil((xFields - 1) * shapeX / (shapeX - 1)) if shapeX != 1 else 0
    yFieldsAugmented = math.ceil((yFields - 1) * shapeY / (shapeY - 1)) if shapeY != 1 else 0
    # Scaling ratios
    scalingX = xFieldsAugmented / shapeX
    scalingY = yFieldsAugmented / shapeY
    # Relative positions of surrounding fields with respect to the current field
    positions = [[-1, 1], [0, 1], [1, 1], [1, 0], [-1, -1]]
    # Iteration through every field in the original grid.
    # If any of the surrounding fields in the original grid is filled, linear
    # interpolation between the current pixel and the surrounding one is
    # performed, filling fields in between
    for y in range(shapeY):
        for x in range(shapeX):
            # Filled fields in the original grid get mapped into the destination (scaled) grid
            if self.imageData.data[y][x]:
                xScaled = math.floor(x * scalingX)
                yScaled = math.floor(y * scalingY)
                scaledData[yScaled][xScaled] = self.imageData.data[y][x]
                # Every position is checked
                for position in positions:
                    # Calculates the adjacent pixel, checking it's not out of bounds
                    xNext = x + position[1] if 0 <= x + position[1] < shapeX else x
                    yNext = y + position[0] if 0 <= y + position[0] < shapeY else y
                    # Interpolation happens only if the next pixel is filled AND they're
                    # not in a corner (to avoid lines between diagonally touching pixels)
                    if self.imageData.data[yNext][xNext] and not self._isCorner(x, y, position):
                        xScaledNext = math.floor(xNext * scalingX)
                        yScaledNext = math.floor(yNext * scalingY)
                        # Linear interpolation between the mappings of the current and adjacent pixels
                        tMax = max(abs(xScaledNext - xScaled), abs(yScaledNext - yScaled))
                        for t in range(1, tMax):
                            xP = math.floor(xScaled + (t / tMax) * (xScaledNext - xScaled))
                            yP = math.floor(yScaled + (t / tMax) * (yScaledNext - yScaled))
                            scaledData[yP][xP] = self.imageData.data[y][x]
    self.imageData.data = scaledData
    return self.imageData
```

C++ (full implementation found here): This is the most austere variant of the three implementations and, for that reason, the one that puts the most emphasis on performance, taking advantage of the low-level control the language offers.


Recalling the Python implementation, using 2D arrays for the images may sound like the most intuitive approach for this third version. However, since the information is allocated dynamically, nested arrays would not occupy adjacent memory slots: the outer array would hold pointers to inner arrays scattered, in general, throughout memory. This would significantly increase fetching times and render the algorithm rather inefficient. For this reason, both the original and the scaled images are stored in 1D arrays of fixed size. Accessing the array might be slightly more cumbersome, but it is worth the effort considering the indisputable performance improvement.

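The row-major arithmetic this relies on, where element (y, x) of a width-w image sits at offset y * w + x in the flat buffer, can be sanity-checked quickly (a NumPy illustration, not part of the project code):

```python
import numpy as np

# A 3x4 "image" as a 2D array and as the flat, row-major buffer a C++
# 1D array would hold
img = np.arange(12).reshape(3, 4)
flat = img.ravel()  # same memory order as a classic C array

width = img.shape[1]
y, x = 2, 1
value = flat[y * width + x]  # same element as img[y, x]
```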

```cpp
void scaleStroke(const unsigned int xFieldsScaled, const unsigned int yFieldsScaled) {
  // Scales a grid to fit the given dimensions. Only the stroke is scaled, meaning that
  // each pixel will only be mapped once to the destination. Its position will be scaled,
  // but its thickness won't. Between non-corner filled pixels an interpolated line is
  // created to keep the stroke continuous.

  // Destination canvas as a one-dimensional array (to optimize memory allocation and speed)
  bool* scaledData = new bool[xFieldsScaled * yFieldsScaled];

  // Initialization of the destination canvas
  for (unsigned int y = 0; y < yFieldsScaled; ++y) {
    for (unsigned int x = 0; x < xFieldsScaled; ++x) {
      scaledData[y * xFieldsScaled + x] = false;
    }
  }

  // Scaled dimensions without extra width, in case there are filled pixels on the edge
  // (they don't change width)
  const int xFieldsAugmented = xFields != 1 ? ceil(1.0 * (xFieldsScaled - 1) * xFields / (xFields - 1)) : 0;
  const int yFieldsAugmented = yFields != 1 ? ceil(1.0 * (yFieldsScaled - 1) * yFields / (yFields - 1)) : 0;

  // Scaling ratios
  const double scalingX = 1.0 * xFieldsAugmented / xFields;
  const double scalingY = 1.0 * yFieldsAugmented / yFields;

  // Relative positions of surrounding fields with respect to the current field
  const int positions[5][2] = {{-1, 1}, {0, 1}, {1, 1}, {1, 0}, {-1, -1}};

  // Iteration through every field in the original grid.
  // If any of the surrounding fields in the original grid is filled, linear interpolation
  // between the current pixel and the surrounding one is performed, filling fields in between
  for (unsigned int y = 0; y < yFields; ++y) {
    for (unsigned int x = 0; x < xFields; ++x) {
      // Filled fields in the original grid get mapped into the destination (scaled) grid
      if (gridArray[y * xFields + x]) {
        const int xScaled = x * scalingX;
        const int yScaled = y * scalingY;
        scaledData[yScaled * xFieldsScaled + xScaled] = true;
        // Every position is checked
        for (unsigned int i = 0; i < 5; ++i) {
          // Calculates the adjacent pixel, checking it's not out of bounds
          // (signed arithmetic avoids unsigned wrap-around at the edges)
          const int xTry = (int)x + positions[i][1];
          const int yTry = (int)y + positions[i][0];
          const int xNext = (0 <= xTry && xTry < (int)xFields) ? xTry : (int)x;
          const int yNext = (0 <= yTry && yTry < (int)yFields) ? yTry : (int)y;
          // Interpolation happens only if the next pixel is filled AND they're not in a
          // corner (to avoid lines between diagonally touching pixels in a corner)
          if (gridArray[yNext * xFields + xNext] && !isCorner(x, y, positions[i])) {
            const int xScaledNext = xNext * scalingX;
            const int yScaledNext = yNext * scalingY;
            // Linear interpolation between the mappings of the current and adjacent pixels
            const int tMax = max(abs(xScaledNext - xScaled), abs(yScaledNext - yScaled));
            for (int t = 1; t < tMax; ++t) {
              const int xP = xScaled + (1.0 * t / tMax) * (xScaledNext - xScaled);
              const int yP = yScaled + (1.0 * t / tMax) * (yScaledNext - yScaled);
              scaledData[yP * xFieldsScaled + xP] = true;
            }
          }
        }
      }
    }
  }

  // Dynamic memory deallocation
  delete[] gridArray;
  // Update of object attributes
  gridArray = scaledData;
  xFields = xFieldsScaled;
  yFields = yFieldsScaled;
}
```

Pre-requisites and caveats

This algorithm was conceived to cater to specific requirements, and therefore works best when certain characteristics are fulfilled. To avoid pitfalls and ensure it is employed successfully, the images to be processed should fulfil the following assumptions:


  • Pixels must be booleans (filled/unfilled):


    The algorithm is not made to work with pixels of different intensities. When dealing with generic images, achieving the desired, compatible format will require desaturation and normalization to make sure pixels are either black or white. Additionally, applying a denoising algorithm beforehand could prove beneficial.

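A minimal sketch of such a pre-processing step (plain NumPy; the averaging-based desaturation and the 0.5 threshold are arbitrary choices for illustration, not the project's actual pipeline):

```python
import numpy as np

def binarize(rgb, threshold=0.5):
    """Desaturate an RGB image to grayscale, normalize to [0, 1], and
    threshold it into the boolean (filled/unfilled) format the algorithm
    expects. Dark pixels count as 'filled'."""
    gray = rgb.astype(float).mean(axis=2)                  # naive desaturation
    gray = gray / gray.max() if gray.max() > 0 else gray   # normalization
    return gray < threshold

# Example: one dark "ink" pixel on white paper
page = np.full((2, 2, 3), 255, dtype=np.uint8)
page[0, 0] = (10, 10, 10)
mask = binarize(page)
```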

  • The symbol contained should be made out of thin strokes:


    Since the algorithm maps all pixels to the resized canvas and interpolates a line between adjacent ones, lines of two or more pixels in width will separate into independent, adjacent lines, and the interpolation step will create undesirable connecting lines between the pixels of both. To avoid this, it is imperative to use thin strokes when creating datasets, making sure the symbols' proportions are significantly larger than their stroke width. Relative to the image's resolution, the stroke should be at most one pixel thick. Alternatively, one could include a simple, custom pre-processing algorithm to make strokes thinner.

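One simple way to verify this assumption on a dataset (a hypothetical helper of my own, not part of the linked libraries) is to look for fully filled 2x2 blocks, which indicate that the stroke is wider than one pixel somewhere:

```python
import numpy as np

def has_thick_stroke(img):
    """Return True if any 2x2 block of pixels is fully filled, i.e. the
    stroke is wider than one pixel somewhere and should be thinned before
    the stroke-scaling step. (Heuristic: a diagonal thick stroke may slip
    through without forming a full 2x2 block.)"""
    if img.shape[0] < 2 or img.shape[1] < 2:
        return False
    blocks = img[:-1, :-1] & img[1:, :-1] & img[:-1, 1:] & img[1:, 1:]
    return bool(blocks.any())

thin = np.zeros((4, 4), dtype=bool)
thin[1, :] = True          # one-pixel-wide horizontal line
thick = thin.copy()
thick[2, :] = True         # two-pixel-wide line
```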

  • The scaling ratio should not be too big:

    The algorithm uses linear interpolation between mapped pixels. For this reason, resizing an image to a much bigger canvas will make the symbol look unnaturally angular. To overcome this flaw, one could easily change the type of interpolation used in the algorithm (e.g. polynomial interpolation, spline interpolation, etc.), which is left for the reader to implement, if necessary.


Conclusion

This article proposed an alternative algorithm for resizing images, which allows thin structures to preserve their stroke width. Whether the reader deals with deep-learning problems around handwriting recognition, or simply needs an effective resizing algorithm of this kind, these implementations will hopefully simplify the creation of more effective custom datasets, or the processing of existing ones to make them compatible with each other.


Should the reader be interested in deep-learning problems involving handwritten symbols, I highly recommend visiting the related project links, which may well provide useful insight:


handCalc | Written Calculator (Project Repository): https://github.com/michheusser/handCalc


Image Processing and Training of Neural Network: https://github.com/michheusser/neural-network-training


Dataset Creation: https://www.kaggle.com/michelheusser/handwritten-digits-and-operators


Original article: https://medium.com/@michheusser/handwritten-recognition-resizing-strokes-instead-of-images-b787af9935fc
