Using Python to do your homework

Published: 2023/12/15

Applying OpenCV and Tesseract to do your math homework

Python can be put to almost endless uses, and repetitive tasks in particular are easy to automate with it. Here we show how Python can be used to automatically answer the problems on a math worksheet.

First we take a look at the math questions:

Nothing too difficult, but the sheer number of questions makes it tiring to solve and fill in every single one. Instead, let us try it in Python!

We start by importing the relevant packages. In fact, we need exactly three. The first package enables us to read the questions, meaning it converts an image to text. The package is called pytesseract. It is important to note that getting it running takes a bit more work than only pip install …, because pytesseract is a wrapper around the Tesseract OCR engine, which has to be installed separately. Here is a link to a good tutorial regarding this problem.

The second package is necessary for finding where exactly the solution should be written. This means we have to tell the machine that the answer of every equation should be written in the black squares next to the equation. In order to find and identify these squares, OpenCV is needed.

Last but not least we import a package which is able to handle strings or regular expression operations, called “re” for short.

import re

import cv2
import pytesseract as tess

# Point pytesseract at the locally installed Tesseract executable
path = r"C:\Users\PaulM\AppData\Local\Tesseract-OCR\tesseract.exe"
tess.pytesseract.tesseract_cmd = path

01 Reading the questions

We start by loading the picture and applying the image_to_string function from pytesseract.

# raw_path (not shown in the original) is the folder that holds the worksheet image
png = r"{}\questions.png".format(raw_path)
text = tess.image_to_string(png)

Looking at the results below, it seems at first glance that everything worked successfully. Our result is one big string in which each equation is delimited by a line break, denoted by the \n symbol.

However, a bit more cleaning is still necessary before doing the calculations. First we have to remove all spaces between the numbers and then figure out which parts of this string represent actual questions. This is done with the following three lines of code:

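# str.replace returns a new string, so keep the result and split it into lines
parsed_text = text.replace(" ", "").split("\n")
pattern = re.compile("[0-9]+x[0-9]+")
equations = [x for x in parsed_text if bool(re.match(pattern, x))]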

The last line of the code above filters the lines of the long string shown above and keeps only those that start with a certain pattern. Specifically, it looks for one or more digits (denoted as [0-9]+), then the letter x, and then again one or more digits. The result is a list which contains all equations.
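
To make the filtering step concrete, here is a minimal, self-contained sketch run on a made-up OCR string. The sample text is an assumption (the real string comes from image_to_string), and the sketch uses fullmatch rather than match, a stricter variant that also rejects lines with trailing characters.

```python
import re

# Hypothetical OCR output; the real string comes from image_to_string
sample_text = "Name: ____\n3 x 4\n12 x 11\n\n7 x 8\nPage 1\n"

# Remove the spaces, then split the text into one entry per line
lines = sample_text.replace(" ", "").split("\n")

# Keep only lines that consist of digits, the letter x, and digits
pattern = re.compile(r"[0-9]+x[0-9]+")
equations = [line for line in lines if pattern.fullmatch(line)]

print(equations)  # ['3x4', '12x11', '7x8']
```

Lines such as the name field, the empty line, and the page footer are dropped because they do not fit the digits-x-digits shape.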

The last step is probably the easiest, namely to calculate the solutions of all the equations. For this we build a small function, which is then used within a list comprehension to solve the equations.

def multiplication(equation):
    # Split "AxB" at the x and multiply the two numbers
    split_equation = equation.split("x")
    num1 = int(split_equation[0])
    num2 = int(split_equation[1])
    return str(num1 * num2)

solutions = [multiplication(x) for x in equations]

Applying this function gives the solutions to all questions. In total we end up with a list of length 40, which is exactly the number of questions on the sheet.
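
As a small worked example, the sketch below runs the same idea on three made-up equations. The list of equations is hypothetical and only stands in for the 40 strings read from the worksheet. The products are returned as strings because they are later drawn onto the image as text.

```python
def multiplication(equation):
    """Turn a string such as '12x11' into its product, returned as a string."""
    left, right = equation.split("x")
    return str(int(left) * int(right))

# Hypothetical equations standing in for the list extracted from the worksheet
equations = ["3x4", "12x11", "7x8"]
solutions = [multiplication(e) for e in equations]

print(solutions)  # ['12', '132', '56']
```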

02 Processing the image

The next step is to fill the solutions back in on the worksheet. This sounds easier than it actually is. To fill in the answers on the sheet, we have to find the specific location on the png where we want each solution to be written. In our example, we would like to find the coordinates of the black answer box which is next to every equation.

We start by reading in the image using OpenCV. Next, we transform the picture into a gray-scale format. This is done in order to compress information. Since we would like to identify a certain shape on an image, colors are not important to us and we can move from a tensor to a matrix.

raw_img = cv2.imread(png)
img = cv2.imread(png, cv2.IMREAD_GRAYSCALE)

Let’s take a look at what the gray-scale output looks like. The GIF below nicely shows that we now have a large matrix containing all the pixels of the picture. We can see that most of the picture is covered with white pixels (white is encoded as the integer 255). Furthermore, we can even make out the equation, the equal sign, and the answer box by looking at where the pixel values change.

The pixels representing the answer box are of particular interest to us, since we would like the answer to be placed within it. Before continuing to identify the box, some pre-processing is necessary: we enhance the contrast between the box and the white background so that the shapes are easier to identify, a process called thresholding. An example which shows why this is needed is given below. On the left side we have the number three as it appears in the initial image, whereas on the right side we have the same number after applying the thresholding.

As can be seen below, a written number is not entirely black. Especially on the sides the strength of the ink fades out. In order to make it easier for the computer to identify clear shapes, like a square for example, we turn every pixel below a certain threshold black and the rest white.

img = cv2.imread(png, cv2.IMREAD_GRAYSCALE)
_, threshold = cv2.threshold(img, 170, 255, cv2.THRESH_BINARY)

The code above shows how this step was implemented. The first line reads in the png we imported at the beginning and directly transforms it into a gray-scale picture. The second line then applies the thresholding to the gray-scale image. This is done by specifying the image, the threshold value (in our case 170, which was obtained by trial and error), the maximum value (in our case 255, since we would like the pixels to turn white if they exceed the threshold), and the way OpenCV should apply the thresholding. Binary thresholding means that there will be a clear cut: every pixel with a value at or below the threshold is set to zero, and every pixel above the threshold is set to the maximum value, in our case 255.
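
The rule behind binary thresholding can be sketched in plain Python on a tiny patch of pixel values. This is only an illustration of the rule, not how OpenCV implements it, and the function name binary_threshold and the sample patch are made up for this example.

```python
def binary_threshold(pixels, thresh=170, maxval=255):
    """Set every pixel above thresh to maxval and every other pixel to 0."""
    return [[maxval if p > thresh else 0 for p in row] for row in pixels]

# Hypothetical 2x3 patch of gray values, e.g. the fading edge of a printed digit
patch = [
    [255, 200, 171],
    [170, 90, 0],
]

print(binary_threshold(patch))  # [[255, 255, 255], [0, 0, 0]]
```

Note that a pixel with exactly the threshold value (170) ends up black, since only values strictly above the threshold are set to the maximum.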

The next step is then to identify the squares within our image. This is done by a handy function called findContours.

contours, _ = cv2.findContours(threshold, cv2.RETR_TREE,
                               cv2.CHAIN_APPROX_SIMPLE)

We see that the function takes three inputs (it has more arguments than that, but these three are the relevant ones for our problem). The first input is our thresholded image. The second input is not of great importance to us, since it states which kind of hierarchy should be used when storing the contours. The third input defines how a shape should be saved.

The image below shows this last point visually. Even though we have two white squares, there are two different ways in which we could save the information needed to replicate them: we could either save every single boundary pixel, as is done in the left picture, or save only the corner coordinates. Needless to say, the right one uses significantly less memory. Exactly this method is specified by passing cv2.CHAIN_APPROX_SIMPLE in the command above.

Source: https://docs.opencv.org/3.4/d4/d73/tutorial_py_contours_begin.html
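
To get a feeling for the memory difference, the following sketch counts how many points each representation needs for one axis-aligned rectangle. It is a plain-Python illustration with made-up helper names and a made-up box size of 60 by 30 pixels, not a call into OpenCV.

```python
def boundary_points(width, height):
    """All pixel coordinates on the border of an axis-aligned rectangle."""
    points = set()
    for x in range(width):
        points.add((x, 0))
        points.add((x, height - 1))
    for y in range(height):
        points.add((0, y))
        points.add((width - 1, y))
    return points


def corner_points(width, height):
    """Only the four corners of the same rectangle."""
    return {(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)}


# Hypothetical answer box of 60 x 30 pixels
every_pixel = boundary_points(60, 30)
corners = corner_points(60, 30)

print(len(every_pixel), len(corners))  # 176 4
```

For a straight-edged shape, the four corners carry the same information as the 176 boundary pixels.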

03 Inserting the solutions

After storing the information for all kinds of shapes from the picture, we would like to restrict the shapes to the squares we are looking for. As outlined above, we stored every shape by storing the coordinates of the corner points of its contour. Since we are interested in squares, only contours which have exactly four corner points are relevant for our problem.

rectangles = [x for x in contours[1:] if (len(x)==4)]
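
The filter can be tried out on a few hand-written contours. The sketch below assumes the nested layout in which OpenCV returns contour points (each point wrapped as [[x, y]]). The three shapes are invented for illustration, and treating the first contour as the outer page frame is an assumption about why the code above skips it.

```python
# Hypothetical contours in OpenCV's layout: each point is wrapped as [[x, y]]
page_frame = [[[0, 0]], [[0, 99]], [[99, 99]], [[99, 0]]]
answer_box = [[[10, 10]], [[10, 20]], [[30, 20]], [[30, 10]]]
digit = [[[50, 10]], [[52, 14]], [[55, 12]], [[57, 18]], [[53, 20]]]

contours = [page_frame, answer_box, digit]

# Skip the first contour and keep only shapes with exactly four corner points
rectangles = [c for c in contours[1:] if len(c) == 4]

print(len(rectangles))  # 1
print(rectangles[0][0])  # [[10, 10]]
```

The digit is dropped because its outline needs five points, while the answer box survives with its four corners.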

One not very intuitive feature of the OpenCV function findContours is that, in our case, it detects the contours from right to left and from bottom to top. This creates a bit of a problem, given that our solutions are stored in a different order, namely from top to bottom and left to right. In order to align these two lists, we alter the rectangles list we created above through the following code:

right_side = list(reversed(rectangles[0::2]))
left_side = list(reversed(rectangles[1::2]))
sorted_list = left_side + right_side
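
The effect of the slicing is easiest to see with labels instead of real contours. The sketch below assumes a sheet with two columns and three rows, detected from bottom to top and from right to left. The labels (L1 is the top box of the left column, R3 the bottom box of the right column) are made up for illustration.

```python
# Hypothetical detection order: bottom row first, right box before left box
rectangles = ["R3", "L3", "R2", "L2", "R1", "L1"]

# Every second entry belongs to the same column; reversing puts the top box first
right_side = list(reversed(rectangles[0::2]))
left_side = list(reversed(rectangles[1::2]))
sorted_list = left_side + right_side

print(sorted_list)  # ['L1', 'L2', 'L3', 'R1', 'R2', 'R3']
```

The result runs down the left column first and then down the right column, which is the order in which the solutions are stored.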

Now the solutions and the rectangle information are both in the same order. Last but not least, we have to write the solutions into the rectangles. This is done by extracting the bottom-left x and y coordinates of each rectangle, shown in the image below as the red circle.

The actual writing of the solution for each question is done by a function called putText. The inputs of the function are relatively straightforward: we pass the image, the text, the coordinates, a font, a font scale and a color.

font = cv2.FONT_HERSHEY_COMPLEX
for i, j in zip(solutions, sorted_list):
    # The second corner point of each contour is its bottom-left corner
    x = j[1][0][0]
    y = j[1][0][1]
    cv2.putText(img, i, (x, y), font, 0.7, (0))

cv2.imshow("Threshold", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
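
The indexing j[1][0] relies on the bottom-left corner being the second point of every contour. As a hedged alternative, the sketch below derives the bottom-left corner from the coordinates themselves, so it does not depend on the point order. The helper name bottom_left and the sample box are made up, and the approach assumes an axis-aligned rectangle in image coordinates, where y grows downwards.

```python
def bottom_left(contour):
    """Bottom-left corner of an axis-aligned box: smallest x, largest y."""
    xs = [point[0][0] for point in contour]
    ys = [point[0][1] for point in contour]
    return min(xs), max(ys)


# Hypothetical answer box in OpenCV's contour layout
answer_box = [[[10, 10]], [[10, 20]], [[30, 20]], [[30, 10]]]

# Same result as the index-based lookup used above
x, y = bottom_left(answer_box)
print((x, y))  # (10, 20)
print((answer_box[1][0][0], answer_box[1][0][1]))  # (10, 20)
```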

Finally, we can display our results, which look very promising. Solving these questions by hand might have been quicker than using Python, but it would have been considerably less fun!

Translated from: https://medium.com/swlh/using-python-to-do-your-homework-4453264ba517
