Lucene 4.3.1 Spell Checking with SpellChecker

Published: 2025/3/21

org.apache.lucene.search.spell
Class SpellChecker

java.lang.Object
    org.apache.lucene.search.spell.SpellChecker

SpellChecker is Lucene's spell-checking class.

Usage example:

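    SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
    // To index a field of a user index:
    spellchecker.indexDictionary(new LuceneDictionary(my_lucene_reader, a_field));
    // To index a file containing words:
    spellchecker.indexDictionary(new PlainTextDictionary(new File("myfile.txt")));
    String[] suggestions = spellchecker.suggestSimilar("misspelt", 5);

The complete example later in this post passes three arguments to indexDictionary (the Dictionary, an IndexWriterConfig, and a full-merge flag), which is the form to use with Lucene 4.3.1.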

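SpellChecker has three constructors. Each one creates a SpellChecker from a given Directory instance, which holds the spell index used by all later operations.

PlainTextDictionary implements the Dictionary interface and provides three constructors, which take a File, an InputStream, and a Reader respectively.

The example above builds a PlainTextDictionary from a text file. The file contains one word per line, for example: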

word1
word2
word3
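A file in this format can be produced with the standard library alone. The following is a minimal sketch and not part of the original post: the class name DictionaryFileSketch and its two helper methods are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class DictionaryFileSketch {

    /** Writes the words to the target file, one word per line, encoded as UTF-8 (assumed). */
    static Path writeDictionary(Path target, List<String> words) throws IOException {
        return Files.write(target, words, StandardCharsets.UTF_8);
    }

    /** Reads the file back, trimming each line and skipping blank lines. */
    static List<String> readDictionary(Path source) throws IOException {
        List<String> words = new ArrayList<>();
        for (String line : Files.readAllLines(source, StandardCharsets.UTF_8)) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }
}
```

The written file could then be handed to new PlainTextDictionary(new File(...)) in the same way as myfile.txt in the usage example.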

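Other Dictionary implementations: FileDictionary, HighFrequencyDictionary, LuceneDictionary

SpellChecker methods:

String[] suggestSimilar(String word, int numSug)

Parameters:

word - the word to check

numSug - the number of suggestions to return

Other overloads: String[] suggestSimilar(...). These accept further options such as an accuracy threshold; see the official documentation for details.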
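To make the idea of an accuracy threshold concrete, here is a small standard-library sketch. It does not call Lucene. It assumes a similarity score defined as 1 minus the Levenshtein edit distance divided by the longer string's length, and drops candidates scoring below the threshold. The class AccuracySketch and its methods are hypothetical names, and Lucene's own scoring may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of an accuracy threshold; not Lucene code.
final class AccuracySketch {

    /** Levenshtein edit distance computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]; 1 means the strings are identical. */
    static float similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0f;
        }
        return 1.0f - (float) editDistance(a, b) / longest;
    }

    /** Keeps the candidates whose similarity to the word is at least the threshold. */
    static List<String> filter(String word, List<String> candidates, float accuracy) {
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (similarity(word, candidate) >= accuracy) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

Under this measure, the query "開園中國" used later in this post differs from the dictionary word "開源中國" in one of four characters, so the pair scores 0.75 and survives any threshold up to that value.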

Complete code example:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.search.spell.PlainTextDictionary;
    import org.apache.lucene.search.spell.SpellChecker;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;
    import org.wltea.analyzer.lucene.IKAnalyzer;

    import java.io.File;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    public class SpellCheckerTest {
        // Path to the one-word-per-line dictionary file; adjust for your machine.
        private static String filepath = "C:\\Users\\Mr_Tank_\\Desktop\\BaseTest\\dictionaryfile.txt";
        private Document document;
        private Directory directory;
        private IndexWriter indexWriter;
        private SpellChecker spellchecker;
        private IndexReader indexReader;
        private IndexSearcher indexSearcher;

        // A fresh config is built for every writer.
        private IndexWriterConfig getConfig() {
            return new IndexWriterConfig(Version.LUCENE_43, new IKAnalyzer(true));
        }

        private IndexWriter getIndexWriter() {
            // Reuse one in-memory directory so that successive createIndex
            // calls add to the same index instead of replacing it.
            if (directory == null) {
                directory = new RAMDirectory();
            }
            try {
                return new IndexWriter(directory, getConfig());
            } catch (IOException e) {
                e.printStackTrace();
                return null;
            }
        }

        /**
         * Create index for test
         *
         * @param content text to index in the "content" field
         */
        public void createIndex(String content) {
            indexWriter = getIndexWriter();
            document = new Document();
            document.add(new TextField("content", content, Field.Store.YES));
            try {
                indexWriter.addDocument(document);
                indexWriter.commit();
                indexWriter.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        public ScoreDoc[] gethits(String content) {
            try {
                indexReader = DirectoryReader.open(directory);
                indexSearcher = new IndexSearcher(indexReader);
                QueryParser parser = new QueryParser(Version.LUCENE_43, "content", new IKAnalyzer(true));
                Query query = parser.parse(content);
                TopDocs td = indexSearcher.search(query, 1000);
                return td.scoreDocs;
            } catch (Exception e) {
                e.printStackTrace();
                return null;
            }
        }

        /**
         * @param scoreDocs hits returned by gethits
         * @return the matching documents; empty if there were no hits
         * @throws IOException
         */
        public List<Document> getDocumentList(ScoreDoc[] scoreDocs) throws IOException {
            List<Document> documentList = new ArrayList<Document>();
            for (int i = 0; i < scoreDocs.length; i++) {
                documentList.add(indexSearcher.doc(scoreDocs[i].doc));
            }
            return documentList;
        }

        public String[] search(String word, int numSug) {
            // The spell index gets its own directory, separate from the content index.
            Directory spellIndexDirectory = new RAMDirectory();
            try {
                spellchecker = new SpellChecker(spellIndexDirectory);
                spellchecker.indexDictionary(new PlainTextDictionary(new File(filepath)), getConfig(), true);
                return getSuggestions(spellchecker, word, numSug);
            } catch (IOException e) {
                e.printStackTrace();
                return null;
            }
        }

        private String[] getSuggestions(SpellChecker spellchecker, String word, int numSug) throws IOException {
            return spellchecker.suggestSimilar(word, numSug);
        }

        public static void main(String[] args) throws IOException {
            SpellCheckerTest spellCheckerTest = new SpellCheckerTest();
            spellCheckerTest.createIndex("開源中國-找到您想要的開源項目，分享和交流");
            spellCheckerTest.createIndex("CSDN-全球最大中文IT社區");
            // Deliberately misspelled query: 園 instead of 源.
            String word = "開園中國";

            /*
            ScoreDoc[] scoreDocs = spellCheckerTest.gethits(word);
            List<Document> documentList = spellCheckerTest.getDocumentList(scoreDocs);
            if (documentList.size() >= 1) {
                for (Document d : documentList) {
                    System.out.println("Search result: " + d.get("content"));
                }
            }
            */

            String[] suggest = spellCheckerTest.search(word, 5);
            if (suggest != null && suggest.length >= 1) {
                for (String s : suggest) {
                    System.out.println("Did you mean: " + s);
                }
            } else {
                System.out.println("Spelling is correct");
            }
        }
    }

dictionaryfile.txt：

中華人民共和國
開源中國
開源社區
Lucene
拼寫檢查
Lucene4.3.1

Reposted from: https://my.oschina.net/tanweijie/blog/194046
