
A Synthetic Data Generator for Text Recognition

Published: 2025/4/16


https://github.com/Belval/TextRecognitionDataGenerator

A synthetic data generator for text recognition

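Notes:

Its functionality is similar to the text image generation tool described in the previous blog post.

After installing the required dependencies, you can run the demo by following the instructions.

You can generate text from your own corpus, and you can also add your own backgrounds.

For example, for train ticket information, you can put all possible station names, train numbers, and other fixed fields into the text file, then randomly generate the sample data you need.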

python run.py -l cn --output_dir MY_samples -i texts/city.txt -c 1000 -b 3 -w 5
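The corpus file passed with -i has to be prepared beforehand. The following is a minimal sketch of how such a file might be built for the train ticket case, assuming the tool treats each line of the file as one candidate string. The station names, train numbers, and the function names make_ticket_lines and write_corpus are hypothetical placeholders, not part of the tool.

```python
import random
from pathlib import Path

# Hypothetical field values; replace them with the real station names
# and train numbers you want to appear in the samples.
STATIONS = ["北京南", "上海虹桥", "广州南", "深圳北"]
TRAIN_NUMBERS = ["G1", "G102", "D2281", "K528"]


def make_ticket_lines(count, seed=None):
    """Return `count` random 'train origin destination' strings."""
    rng = random.Random(seed)
    lines = []
    for _ in range(count):
        # sample() guarantees that origin and destination differ.
        origin, destination = rng.sample(STATIONS, 2)
        lines.append(f"{rng.choice(TRAIN_NUMBERS)} {origin} {destination}")
    return lines


def write_corpus(path, lines):
    """Write one string per line, UTF-8 encoded so Chinese text survives."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Calling write_corpus("texts/city.txt", make_ticket_lines(1000)) would produce a file that could then be passed with -i as in the command above.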

In addition, I am still working out how to change the Chinese font (Heiti, Songti, ...).

[Figure: generated samples]

TextRecognitionDataGenerator

A synthetic data generator for text recognition

What is it for?

Generating text image samples to train OCR software. Now supporting non-Latin text!

What do I need to make it work?

I use Arch Linux, so I cannot tell if it works on Windows yet.

Python 3.X
OpenCV 3.2 (it probably works with 2.4)
Pillow
NumPy
Requests
BeautifulSoup
tqdm

You can simply use pip install -r requirements.txt too.

New

Specify text color range using -tc min,max
Explicit alignment when using -al with fixed width (0: Left, 1: Center, 2: Right)
Fixed width using -wd
Generate random strings with letters, numbers and symbols (Thank you @FHainzl)
Save the labels in a file instead of in the file name (Thank you @FHainzl)
Add support for Simplified and Traditional Chinese
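
To illustrate what the alignment values mean when the image width is fixed, here is a minimal sketch of how the horizontal paste position could be derived. The function aligned_x_offset is a hypothetical helper written for this illustration; it is not taken from the tool's source.

```python
def aligned_x_offset(text_width, image_width, alignment):
    """Return the x coordinate at which the rendered text would be pasted.

    `alignment` follows the option values listed above:
    0 = left, 1 = center, 2 = right.
    """
    if text_width > image_width:
        raise ValueError("text is wider than the fixed image width")
    if alignment == 0:
        return 0
    if alignment == 1:
        return (image_width - text_width) // 2
    if alignment == 2:
        return image_width - text_width
    raise ValueError(f"unknown alignment: {alignment}")
```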

How does it work?

python run.py -w 5 -f 64

You get 1000 randomly generated images with random text on them.

What if you want random skewing? Add -k and -rk (python run.py -w 5 -f 64 -k 5 -rk).

But scanned documents usually aren't that clear, are they? Add -bl and -rbl to get Gaussian blur on the generated image with a user-defined radius (the original examples used radii 0, 1, 2, and 4).
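
The skew and blur steps can be approximated with Pillow. This is only a sketch of the idea, not the tool's actual code. It assumes that the randomized variants draw the angle from [-max_angle, max_angle] and the radius from [0, max_radius]; the function name skew_and_blur and its parameters are hypothetical.

```python
import random

from PIL import Image, ImageFilter


def skew_and_blur(image, max_angle=5, max_radius=2, randomize=True, rng=None):
    """Rotate a grayscale image slightly, then apply Gaussian blur.

    With randomize=True the angle and radius are drawn at random (an
    assumption about how the random options behave); otherwise the
    given maximum values are used directly.
    """
    rng = rng or random.Random()
    angle = rng.uniform(-max_angle, max_angle) if randomize else max_angle
    radius = rng.uniform(0, max_radius) if randomize else max_radius
    # expand=True grows the canvas so rotated corners are not cropped;
    # fillcolor=255 paints the newly exposed area white.
    rotated = image.rotate(angle, expand=True, fillcolor=255)
    return rotated.filter(ImageFilter.GaussianBlur(radius=radius))
```

For example, skew_and_blur(Image.new("L", (200, 64), color=255)) returns a new image and leaves the input untouched.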

Maybe you want another background? Add -b to define one of the four available backgrounds: Gaussian noise (0), plain white (1), quasicrystal (2) or picture (3).

When using the picture background (3), a picture from the pictures/ folder is randomly selected and the text is written on it.
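
As a rough illustration of the two simplest background types, the sketch below builds a Gaussian noise background and a plain white one as NumPy arrays. The function names and the mean and standard deviation values are assumptions chosen for this example; the tool's own parameters may differ.

```python
import numpy as np


def gaussian_noise_background(height, width, mean=235.0, std=10.0, seed=None):
    """Return a light grayscale background with Gaussian pixel noise."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=mean, scale=std, size=(height, width))
    # Clip to the valid 8-bit range before converting to uint8.
    return np.clip(noise, 0, 255).astype(np.uint8)


def plain_white_background(height, width):
    """Return a grayscale background where every pixel is 255."""
    return np.full((height, width), 255, dtype=np.uint8)
```

Either array can be turned into a Pillow image with Image.fromarray before the text is drawn on it.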

Or maybe you are working on an OCR for handwritten text? Add -hw! (Experimental)

It uses a TensorFlow model trained using this excellent project by Grzego.

The project does not require TensorFlow to run if you aren't using this feature.

You can also add distortion to the generated text with -d and -do.

The text is chosen at random from a dictionary file (found in the dicts folder) and drawn on a white background made with Gaussian noise. The resulting image is saved as [text]_[index].jpg
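
The word selection and file naming described above can be sketched in a few lines. The helpers random_text and output_name are hypothetical names used only for this illustration, and the sketch assumes the words are joined with single spaces.

```python
import random


def random_text(words, word_count, rng=None):
    """Join `word_count` words picked at random (with repetition)."""
    rng = rng or random.Random()
    return " ".join(rng.choice(words) for _ in range(word_count))


def output_name(text, index, extension="jpg"):
    """Build a file name following the [text]_[index].jpg pattern."""
    return f"{text}_{index}.{extension}"
```

Because the label is embedded in the file name, text containing characters that are not valid in file names would need separate handling, which is one reason to save labels in a file instead.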

There are a lot of parameters that you can tune to get the results you want, so I recommend checking out python run.py -h for more information.

How to create images with Chinese (both simplified and traditional) text

It is simple! Just run python run.py -l cn -c 1000 -w 5.

Unfortunately I do not speak Chinese so you may have to edit texts/cn.txt to include some meaningful words instead of random glyphs.

Here are examples of what I could make with it:

[Figures: Traditional and Simplified Chinese samples]

Can I add my own font?

Yes, the script picks a font at random from the fonts directory.

fonts/latin	English, French, Spanish, German
fonts/cn	Chinese

Simply add / remove fonts until you get the desired output.

If you want to add a new non-Latin language, the amount of work is minimal.

Create a new folder named with your language's two-letter code

Add a .ttf font in it

Edit run.py to add an if statement in load_fonts()

Add a text file in dicts with the same two-letter code

Run the tool as you normally would but add -l with your two-letter code

It only supports .ttf for now.
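
The font lookup described above can be pictured with the sketch below. It is not the load_fonts() function from run.py, which the steps above say uses an if statement per language. It is a hypothetical alternative that derives the folder from the two-letter code, written only to show the idea of collecting .ttf files and picking one at random. The names list_fonts_for_language and pick_random_font are assumptions.

```python
import random
from pathlib import Path


def list_fonts_for_language(lang, fonts_root="fonts"):
    """Return the sorted paths of all .ttf files in fonts_root/lang."""
    folder = Path(fonts_root) / lang
    if not folder.is_dir():
        raise FileNotFoundError(f"no font folder for language '{lang}': {folder}")
    # Only .ttf files are kept, matching the stated limitation.
    return sorted(str(p) for p in folder.iterdir() if p.suffix.lower() == ".ttf")


def pick_random_font(lang, fonts_root="fonts", rng=None):
    """Pick one font path at random for the given language code."""
    fonts = list_fonts_for_language(lang, fonts_root)
    if not fonts:
        raise ValueError(f"no .ttf fonts found for language '{lang}'")
    return (rng or random.Random()).choice(fonts)
```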

Benchmarks

Intel Core i7-4710HQ @ 2.50GHz + SSD (-c 1000 -w 1)
- -t 1: 363 img/s
- -t 2: 694 img/s
- -t 4: 1300 img/s
- -t 8: 1500 img/s
AMD Ryzen 7 1700 @ 4.0GHz + SSD (-c 1000 -w 1)
- -t 1: 558 img/s
- -t 2: 1045 img/s
- -t 4: 2107 img/s
- -t 8: 3297 img/s
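
To read these figures as scaling behavior, the sketch below computes the speedup over the single-thread run and the per-thread efficiency (speedup divided by thread count) from the numbers listed above. The function name scaling_table is a hypothetical helper; only the throughput values come from the benchmarks.

```python
# Throughput in img/s per thread count, copied from the benchmarks above.
BENCHMARKS = {
    "Intel Core i7-4710HQ": {1: 363, 2: 694, 4: 1300, 8: 1500},
    "AMD Ryzen 7 1700": {1: 558, 2: 1045, 4: 2107, 8: 3297},
}


def scaling_table(rates):
    """Map each thread count to (speedup over -t 1, efficiency per thread)."""
    baseline = rates[1]
    table = {}
    for threads in sorted(rates):
        speedup = rates[threads] / baseline
        table[threads] = (speedup, speedup / threads)
    return table


for machine, rates in BENCHMARKS.items():
    print(machine)
    for threads, (speedup, efficiency) in scaling_table(rates).items():
        print(f"  -t {threads}: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```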

Contributing

Create an issue describing the feature you'll be working on

Code said feature

Create a pull request

Feature requests & issues

If anything is missing, unclear, or simply not working, open an issue on the repository.

What is left to do?

Better background generation
Better handwritten text generation
More customization parameters (mostly regarding background)
