A Brief Look at OCR: Tesseract
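Optical character recognition (OCR) is the process of scanning text documents and then analyzing the resulting image files to extract the text and layout information. OCR is a specialized technology, used mostly by people in the printing industry, and it can quickly convert paper documents into electronic ones. For Chinese OCR, the stronger domestic products at present are 清華文通 (Tsinghua Wintone), 漢王 (Hanwang), and 尚書 (Shangshu); each has its own strengths, and none of them is cheap. OCR developed earlier abroad: large companies such as IBM, Microsoft, and HP, even when they have not released a standalone OCR product, have R&D teams that mastered the core technology long ago and built OCR into their own software. We programmers usually do not need anything that advanced; being able to integrate basic OCR into a project is enough. Over the past couple of days I looked through many free OCR programs and libraries and decided to write them up. Today I start with Tesseract; next time I will discuss the OCR API in OneNote 2010. A brief history of OCR technology is linked from the original post.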
Test code download
When reposting, please credit the source: http://www.cnblogs.com/brooks-dotnet/archive/2010/10/05/1844203.html
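1. Tesseract overview
The Tesseract OCR engine was first developed at HP Labs starting in 1985, and by 1995 it had become one of the three most accurate recognition engines in the OCR industry. Soon afterwards, however, HP decided to leave the OCR business, and Tesseract was shelved.
Some years later HP realized that, rather than leaving Tesseract on the shelf, it would be better to contribute it to the open-source community and give it a new life. In 2005 Tesseract was handed over to the Information Science Research Institute at the University of Nevada, Las Vegas, which turned to Google to improve it, fix bugs, and optimize it.
Tesseract is now published as an open-source project on Google Code (the project home page is linked from the original post). Its latest version, 3.0, supports Chinese OCR and ships with a command-line tool. This time we test Tesseract 3.0. Because a command line is not very friendly to end users, I wrapped it in a simple WPF program so that Chinese OCR becomes convenient.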
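1.1 First, download the command-line tool, the source code, and the Chinese language pack from the Tesseract project home page:
[screenshot]
1.2 After unpacking, the command-line tool contains the following files (1.jpg and 1.txt are not part of the download):
[screenshot]
1.3 To run Chinese OCR, copy the Simplified Chinese language pack into the tessdata directory:
[screenshot]
1.4 In a DOS prompt, change to the Tesseract command-line directory and look at the command format of tesseract.exe:
[screenshot]
imagename is the image to run OCR on; outputbase is the output file, a text file (.txt) by default; lang is the language pack to use; configfile is a configuration file.
1.5 Now for a test. Prepare an image in JPG format; here I put it in the same directory as Tesseract:
[screenshot]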
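Type tesseract.exe 1.jpg 1 -l chi_sim and press Enter; the OCR finishes within a few seconds.
Note the command format: imagename must include the .jpg extension, while the output file and the language pack are given without an extension.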
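The command format above can be captured in a small helper, which is handy when the arguments are assembled in code rather than typed by hand. This is only a sketch: the class name TesseractCommand and the method BuildArguments are hypothetical, and the quoting of paths is my own addition for paths that contain spaces.

```csharp
using System;

// Hypothetical helper (not part of Tesseract): builds the argument string
// "imagename outputbase -l lang" described in the text.
static class TesseractCommand
{
    // Wraps a path in double quotes so that spaces do not split the argument.
    static string Quote(string path)
    {
        return "\"" + path + "\"";
    }

    public static string BuildArguments(string imagePath, string outputPath, string lang)
    {
        // The image keeps its extension. The output base must not carry ".txt",
        // because Tesseract appends that extension itself.
        string outputBase = outputPath;
        if (outputBase.EndsWith(".txt", StringComparison.OrdinalIgnoreCase))
        {
            outputBase = outputBase.Substring(0, outputBase.Length - ".txt".Length);
        }
        return Quote(imagePath) + " " + Quote(outputBase) + " -l " + lang;
    }
}
```

For the test above, TesseractCommand.BuildArguments("1.jpg", "1.txt", "chi_sim") yields the same three parts as the hand-typed command, with both file names quoted.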
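OCR result:
[screenshot]
As you can see, the result is not ideal: the Chinese recognition is passable, but most of the English text and digits come out garbled. Still, for a veteran OCR engine this is already quite good. I look forward to further upgrades from Google.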
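2. Wrapping the Tesseract command line with WPF
2.1 Because a command line is easy to mistype and unfriendly to end users, I wrote a small WPF program that wraps the Tesseract command line:
[screenshot]
The left side selects and previews the image; the right side selects the output directory and shows the OCR result. Both local and network images can be previewed.
2.2 To let the image preview zoom and pan, I originally planned to use Microsoft's Zoom It API, but it does not support WPF, so I used a third-party class instead: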
using System;
using System.Windows.Controls;
using System.Windows.Input;
using System.Windows.Media.Animation;
using System.Windows;
using System.Windows.Media;
namespace PanAndZoom
{
    public class PanAndZoomViewer : ContentControl
    {
        public double DefaultZoomFactor { get; set; }
        private FrameworkElement source;
        private Point ScreenStartPoint = new Point(0, 0);
        private TranslateTransform translateTransform;
        private ScaleTransform zoomTransform;
        private TransformGroup transformGroup;
        private Point startOffset;
        public PanAndZoomViewer()
        {
            this.DefaultZoomFactor = 1.4;
        }
        public override void OnApplyTemplate()
        {
            base.OnApplyTemplate();
            Setup(this);
        }
        void Setup(FrameworkElement control)
        {
            this.source = VisualTreeHelper.GetChild(this, 0) as FrameworkElement;
            this.translateTransform = new TranslateTransform();
            this.zoomTransform = new ScaleTransform();
            this.transformGroup = new TransformGroup();
            this.transformGroup.Children.Add(this.zoomTransform);
            this.transformGroup.Children.Add(this.translateTransform);
            this.source.RenderTransform = this.transformGroup;
            this.Focusable = true;
            this.KeyDown += new KeyEventHandler(source_KeyDown);
            this.MouseMove += new MouseEventHandler(control_MouseMove);
            this.MouseDown += new MouseButtonEventHandler(source_MouseDown);
            this.MouseUp += new MouseButtonEventHandler(source_MouseUp);
            this.MouseWheel += new MouseWheelEventHandler(source_MouseWheel);
        }
        void source_KeyDown(object sender, KeyEventArgs e)
        {
            // hit escape to reset everything
            if (e.Key == Key.Escape) Reset();
        }
        void source_MouseWheel(object sender, MouseWheelEventArgs e)
        {
            // zoom into the content.  Calculate the zoom factor based on the direction of the mouse wheel.
            double zoomFactor = this.DefaultZoomFactor;
            if (e.Delta <= 0) zoomFactor = 1.0 / this.DefaultZoomFactor;
            // DoZoom requires both the logical and physical location of the mouse pointer
            var physicalPoint = e.GetPosition(this);
            DoZoom(zoomFactor, this.transformGroup.Inverse.Transform(physicalPoint), physicalPoint);
        }
        void source_MouseUp(object sender, MouseButtonEventArgs e)
        {
            if (this.IsMouseCaptured)
            {
                // we're done.  reset the cursor and release the mouse pointer
                this.Cursor = Cursors.Arrow;
                this.ReleaseMouseCapture();
            }
        }
        void source_MouseDown(object sender, MouseButtonEventArgs e)
        {
            // Save starting point, used later when determining how much to scroll.
            this.ScreenStartPoint = e.GetPosition(this);
            this.startOffset = new Point(this.translateTransform.X, this.translateTransform.Y);
            this.CaptureMouse();
            this.Cursor = Cursors.ScrollAll;
        }
        void control_MouseMove(object sender, MouseEventArgs e)
        {
            if (this.IsMouseCaptured)
            {
                // if the mouse is captured then move the content by changing the translate transform.
                // use the Pan Animation to animate to the new location based on the delta between the
                // starting point of the mouse and the current point.
                var physicalPoint = e.GetPosition(this);
                this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreatePanAnimation(physicalPoint.X - this.ScreenStartPoint.X + this.startOffset.X), HandoffBehavior.Compose);
                this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreatePanAnimation(physicalPoint.Y - this.ScreenStartPoint.Y + this.startOffset.Y), HandoffBehavior.Compose);
            }
        }
        /// <summary>Helper to create the panning animation for x,y coordinates.</summary>
        /// <param name="toValue">New value of the coordinate.</param>
        /// <returns>Double animation</returns>
        private DoubleAnimation CreatePanAnimation(double toValue)
        {
            var da = new DoubleAnimation(toValue, new Duration(TimeSpan.FromMilliseconds(300)));
            da.AccelerationRatio = 0.1;
            da.DecelerationRatio = 0.9;
            da.FillBehavior = FillBehavior.HoldEnd;
            da.Freeze();
            return da;
        }
        /// <summary>Helper to create the zoom double animation for scaling.</summary>
        /// <param name="toValue">Value to animate to.</param>
        /// <returns>Double animation.</returns>
        private DoubleAnimation CreateZoomAnimation(double toValue)
        {
            var da = new DoubleAnimation(toValue, new Duration(TimeSpan.FromMilliseconds(500)));
            da.AccelerationRatio = 0.1;
            da.DecelerationRatio = 0.9;
            da.FillBehavior = FillBehavior.HoldEnd;
            da.Freeze();
            return da;
        }
        /// <summary>Zoom into or out of the content.</summary>
        /// <param name="deltaZoom">Factor to multiply the zoom level by. </param>
        /// <param name="mousePosition">Logical mouse position relative to the original content.</param>
        /// <param name="physicalPosition">Actual mouse position on the screen (relative to the parent window)</param>
        public void DoZoom(double deltaZoom, Point mousePosition, Point physicalPosition)
        {
            double currentZoom = this.zoomTransform.ScaleX;
            currentZoom *= deltaZoom;
            this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreateZoomAnimation(-1 * (mousePosition.X * currentZoom - physicalPosition.X)));
            this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreateZoomAnimation(-1 * (mousePosition.Y * currentZoom - physicalPosition.Y)));
            this.zoomTransform.BeginAnimation(ScaleTransform.ScaleXProperty, CreateZoomAnimation(currentZoom));
            this.zoomTransform.BeginAnimation(ScaleTransform.ScaleYProperty, CreateZoomAnimation(currentZoom));
        }
        /// <summary>Reset to default zoom level and centered content.</summary>
        public void Reset()
        {
            this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreateZoomAnimation(0));
            this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreateZoomAnimation(0));
            this.zoomTransform.BeginAnimation(ScaleTransform.ScaleXProperty, CreateZoomAnimation(1));
            this.zoomTransform.BeginAnimation(ScaleTransform.ScaleYProperty, CreateZoomAnimation(1));
        }
    }
}
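The arithmetic inside DoZoom is easier to follow when it is separated from the animation calls. The sketch below restates it for a single axis, without any WPF types. The class ZoomMath and its method are hypothetical names of mine; the only assumption carried over from the viewer is that the scale is applied before the translation, which is the order in which the two transforms are added to the TransformGroup.

```csharp
using System;

// Hypothetical, WPF-free restatement of the zoom-about-cursor arithmetic.
static class ZoomMath
{
    // Returns the translation for one axis that keeps the content point under
    // the cursor fixed on screen while the scale changes from oldZoom to newZoom.
    public static double TranslateAfterZoom(double cursor, double oldTranslate, double oldZoom, double newZoom)
    {
        // Screen position = logical * zoom + translate, so the logical
        // coordinate is recovered by undoing the translation and then the scale.
        double logical = (cursor - oldTranslate) / oldZoom;
        // Solve cursor = logical * newZoom + newTranslate for newTranslate.
        // This equals -1 * (logical * newZoom - cursor), the expression in DoZoom.
        return cursor - logical * newZoom;
    }
}
```

With the cursor at 100, no translation, and a zoom step from 1.0 to the default factor 1.4, the content has to shift left by about 40 so that logical point 100 stays under the pointer.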
2.3 Besides the mouse, sliders can also be used to adjust the image preview:
<!-- Label text: 長度 = length (bound to Width), 寬度 = width (bound to Height), 透明度 = opacity, 拉伸方式 = stretch mode -->
<WrapPanel Grid.Row="2" Grid.Column="0">
    <Label Name="lab長度" Content="長度:" Margin="3" />
    <Slider Name="sl長度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="400" Value="{Binding ElementName=img圖片, Path=Width, Mode=TwoWay}" />
    <Label Name="lab寬度" Content="寬度:" Margin="3" />
    <Slider Name="sl寬度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="400" Value="{Binding ElementName=img圖片, Path=Height, Mode=TwoWay}" />
    <Label Name="lab透明度" Content="透明度:" Margin="3" />
    <Slider Name="sl透明度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="1" Value="{Binding ElementName=img圖片, Path=Opacity, Mode=TwoWay}" />
    <Label Name="lab拉伸方式" Content="拉伸方式:" Margin="3" />
    <ComboBox Name="txt拉伸方式" Margin="3" MinWidth="85">
        <ComboBoxItem Content="Fill" />
        <ComboBoxItem Content="None" IsSelected="True" />
        <ComboBoxItem Content="Uniform" />
        <ComboBoxItem Content="UniformToFill" />
    </ComboBox>
</WrapPanel>
<local:PanAndZoomViewer Grid.Row="3" Grid.Column="0" Height="300" Margin="3">
    <Image Name="img圖片" Stretch="{Binding ElementName=txt拉伸方式, Path=Text, Mode=TwoWay}" />
</local:PanAndZoomViewer>
2.4 The Tesseract command line cannot run OCR on a network image directly, so the image is downloaded first:
// Requires: using System.IO; using System.Net; using System.Configuration;
// The fields client (presumably a WebClient) and __OutputFileName are declared elsewhere in the window class.
private void fnStartDownload(string v_strImgPath, string v_strOutputDir, out string v_strTmpPath)
{
    // Everything after the last '/' of the URL is taken as the file name.
    int n = v_strImgPath.LastIndexOf('/');
    string fileName = v_strImgPath.Substring(n + 1);
    // Output base for Tesseract: output directory plus the file name without its extension.
    this.__OutputFileName = Path.Combine(v_strOutputDir, Path.GetFileNameWithoutExtension(fileName));
    // Create the temporary download directory if it does not exist yet.
    string Dir = System.Configuration.ConfigurationManager.AppSettings["tmpPath"];
    if (!Directory.Exists(Dir))
    {
        Directory.CreateDirectory(Dir);
    }
    v_strTmpPath = Path.Combine(Dir, fileName);
    // Save the image locally so that Tesseract can read it from disk.
    // (The unused WebRequest object of the original listing has been dropped.)
    client.DownloadFile(v_strImgPath, v_strTmpPath);
    //Stream str = client.OpenRead(v_strImgPath);
    //StreamReader reader = new StreamReader(str);
    //byte[] mbyte = new byte[Int32.Parse(System.Configuration.ConfigurationManager.AppSettings["MaxDownloadImgLength"])];
    //int allmybyte = (int)mbyte.Length;
    //int startmbyte = 0;
    //while (allmybyte > 0)
    //{
    //    int m = str.Read(mbyte, startmbyte, allmybyte);
    //    if (m == 0)
    //    {
    //        break;
    //    }
    //    startmbyte += m;
    //    allmybyte -= m;
    //}
    //FileStream fstr = new FileStream(v_strTmpPath, FileMode.Create, FileAccess.Write);
    //fstr.Write(mbyte, 0, startmbyte);
    //str.Close();
    //fstr.Close();
}
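Taking the file name as everything after the last '/' works for plain image URLs, but a query string such as ?size=large would end up inside the file name. One possible alternative, sketched here under my own assumptions, parses the URL with System.Uri first. The class DownloadPaths and the method TempPathFor are hypothetical names, not part of the test project.

```csharp
using System;
using System.IO;

// Hypothetical helper: derives the local temp path for a downloaded image.
static class DownloadPaths
{
    public static string TempPathFor(string imageUrl, string tempDir)
    {
        var uri = new Uri(imageUrl);
        // AbsolutePath excludes the query string and fragment; unescape it so
        // that "%20" becomes a space in the local file name.
        string urlPath = Uri.UnescapeDataString(uri.AbsolutePath);
        string fileName = Path.GetFileName(urlPath);
        if (string.IsNullOrEmpty(fileName))
        {
            throw new ArgumentException("The URL does not end in a file name.", "imageUrl");
        }
        return Path.Combine(tempDir, fileName);
    }
}
```

For example, DownloadPaths.TempPathFor("http://example.com/img/1.jpg?size=large", "tmp") gives the same path as Path.Combine("tmp", "1.jpg"); the host example.com is only a placeholder.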
2.5 Use Process to call the Tesseract command line:
// Requires: using System.Diagnostics;
private void fnOCR(string v_strTesseractPath, string v_strSourceImgPath, string v_strOutputPath, string v_strLangPath)
{
    using (Process process = new Process())
    {
        process.StartInfo.FileName = v_strTesseractPath;
        // Quote both paths so that spaces in them do not split the arguments.
        process.StartInfo.Arguments = "\"" + v_strSourceImgPath + "\" \"" + v_strOutputPath + "\" -l " + v_strLangPath;
        process.StartInfo.UseShellExecute = false;
        process.StartInfo.CreateNoWindow = true;
        process.StartInfo.RedirectStandardOutput = true;
        process.Start();
        // Drain the redirected output before waiting; a full pipe buffer could otherwise block the child process.
        process.StandardOutput.ReadToEnd();
        process.WaitForExit();
    }
}
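Once the process has exited, the recognized text sits in the output file, that is, the output base with .txt appended, and the program has to load it to show the result. A minimal sketch of that step follows. The class OcrResult and its Read method are hypothetical names of mine, and the sketch relies on Tesseract writing its text output as UTF-8.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper: loads the text file that Tesseract wrote.
static class OcrResult
{
    public static string Read(string outputBase)
    {
        // Tesseract appends ".txt" to the output base given on the command line.
        string resultPath = outputBase + ".txt";
        if (!File.Exists(resultPath))
        {
            throw new FileNotFoundException("Tesseract produced no output file.", resultPath);
        }
        // Read as UTF-8 so that Chinese characters are decoded correctly.
        return File.ReadAllText(resultPath, Encoding.UTF8);
    }
}
```

Checking for the file explicitly gives a clearer error than an empty result box when the OCR run failed, for instance because the language pack was missing.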
2.6 Testing a local image:
[screenshot]
2.7 Testing a network image:
[screenshot]
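Summary:
This time we took a brief look at how to use Tesseract. For an open-source, free OCR engine, Chinese support is a rare thing to have. Its recognition quality is not ideal, but for small and medium-sized projects with modest requirements it is good enough. The original post also links to a list of free OCR tools for anyone who wants to explore further. Next time I will test the OCR feature in OneNote 2010 and show how to call its API from a project.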
Reposted from: https://www.cnblogs.com/Crackers/p/4142290.html