
3.5 Lucene Index Structure Design Explained by Example

Published: 2025/3/15 31 豆豆


In section 3.2 we ran a small program that builds a Lucene index. This section uses that program as an example to walk through how Lucene builds an index.

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    /**
     * @author csl
     * @description
     * Required jars: lucene-core, lucene-analyzers-common, lucene-queryparser
     * Purpose: build a simple index
     */
    public class Indexer {
        public static Version luceneVersion = Version.LATEST;

        /**
         * Build the index.
         */
        public static void createIndex() {
            IndexWriter writer = null;
            try {
                // 1. Create the Directory
                // Directory directory = new RAMDirectory(); // in-memory directory
                Directory directory = FSDirectory.open(Paths.get("index")); // directory on disk
                // 2. Create the IndexWriter
                IndexWriterConfig iwConfig = new IndexWriterConfig(new StandardAnalyzer());
                writer = new IndexWriter(directory, iwConfig);
                // 3. Create the Document object
                Document document = null;
                // 4. Add Field objects to the Document
                File f = new File("raw"); // location of the source files to index
                for (File file : f.listFiles()) {
                    document = new Document();
                    // Store the path of the file being indexed (the original passed f.getName(),
                    // which is the name of the source directory for every document)
                    document.add(new StringField("path", file.getPath(), Field.Store.YES));
                    System.out.println(file.getName());
                    document.add(new StringField("name", file.getName(), Field.Store.YES));
                    // The stream is closed automatically once the document has been added
                    try (InputStream stream = Files.newInputStream(file.toPath())) {
                        // TextField content is tokenized
                        document.add(new TextField("content",
                                new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))));
                        // document.add(new TextField("content", new FileReader(file))); // enough if you do not need UTF-8 decoding
                        // 5. Add the Document to the index
                        writer.addDocument(document);
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                // 6. Close the writer when finished; it is null if creation failed
                if (writer != null) {
                    try {
                        writer.close();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }
        }

        public static void main(String[] args) throws IOException {
            createIndex();
        }
    }

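Creating an index takes six steps:

1. Create the index directory.

    Directory directory = new RAMDirectory();                    // option 1: in memory
    Directory directory = FSDirectory.open(Paths.get("index"));  // option 2: on disk

There are two ways to create the index directory:

RAMDirectory: creates an in-memory directory. It is fast, but the index data is lost when the program exits.
FSDirectory: creates a file-system directory. The index data is stored on disk and is not lost when the program exits.

The rest of this section uses FSDirectory to explain the basics of working with Lucene.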

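2. Create the IndexWriter.

    IndexWriterConfig iwConfig = new IndexWriterConfig(new StandardAnalyzer());
    IndexWriter writer = new IndexWriter(directory, iwConfig);

The IndexWriter object is used to create and maintain the index.

The IndexWriterConfig object holds the initial configuration of the IndexWriter: the analyzer, how the index is maintained, the size of the RAM buffer used to buffer documents, and so on.

See the IndexWriterConfig documentation to tailor the configuration to your needs.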

3. Create the Document.

    Document doc = new Document();

A Document is the basic unit that Lucene indexes. It is comparable to a row (record) in a relational database table.

4. Add Fields.

    document = new Document();
    // Store the path of the file being indexed (the original passed f.getName(), the source directory's name)
    document.add(new StringField("path", file.getPath(), Field.Store.YES));
    System.out.println(file.getName());
    document.add(new StringField("name", file.getName(), Field.Store.YES));
    InputStream stream = Files.newInputStream(Paths.get(file.toString()));
    // TextField content is tokenized
    document.add(new TextField("content", new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))));
    // document.add(new TextField("content", new FileReader(file))); // enough if you do not need UTF-8 decoding

A Field is the smallest unit that Lucene indexes, comparable to an attribute (column) in a relational table. A Document can contain multiple Fields. To add a Field to a Document, call the Document's add() method.
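To make the containment relationship concrete, here is a minimal sketch in plain Java. ToyField and ToyDocument are hypothetical stand-ins invented for this illustration, not Lucene classes; they only mirror the idea that a document accumulates named fields through add() and that a stored value can be looked up by field name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a field: just a name and a value.
final class ToyField {
    final String name;
    final String value;

    ToyField(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical stand-in for a document: an ordered collection of fields.
final class ToyDocument {
    private final List<ToyField> fields = new ArrayList<>();

    // A document grows one field at a time.
    void add(ToyField field) {
        fields.add(field);
    }

    // Returns the value of the first field with the given name, or null if none exists.
    String get(String name) {
        for (ToyField f : fields) {
            if (f.name.equals(name)) {
                return f.value;
            }
        }
        return null;
    }

    int size() {
        return fields.size();
    }
}
```

In this toy model, one file from the example program would become one ToyDocument holding a "path" field, a "name" field, and a "content" field.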

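Lucene provides several Field types, such as IntField, LongField, StringField, and TextField. The example program uses StringField and TextField. The difference between these two is worth understanding, because it determines how the inverted index is built:

StringField: the field is indexed but not tokenized, and the whole field value is treated as a single token. It suits values that should not be split, such as URLs, file paths, and phone numbers. See the StringField documentation.
TextField: the field is both indexed and tokenized. Lucene tokenizes the field value and builds the inverted index from the resulting terms. See the TextField documentation.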
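The effect of tokenization on the inverted index can be sketched with a toy index in plain Java. This is an illustration under loud assumptions: ToyInvertedIndex and its methods are hypothetical, splitting on non-alphanumeric characters is only a crude stand-in for a real analyzer, and the data structure is not how Lucene actually stores its postings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Toy inverted index for a single field: each term maps to the ids of the documents containing it.
final class ToyInvertedIndex {
    private final Map<String, List<Integer>> postings = new TreeMap<>();

    // StringField-like behavior: the whole value becomes exactly one term, unchanged.
    void addUntokenized(int docId, String value) {
        addTerm(value, docId);
    }

    // TextField-like behavior: the value is lowercased and split into word terms first.
    void addTokenized(int docId, String value) {
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                addTerm(token, docId);
            }
        }
    }

    private void addTerm(String term, int docId) {
        List<Integer> ids = postings.computeIfAbsent(term, k -> new ArrayList<>());
        if (!ids.contains(docId)) {
            ids.add(docId);
        }
    }

    // Exact term lookup; returns an empty list when the term is not in the index.
    List<Integer> lookup(String term) {
        return postings.getOrDefault(term, Collections.emptyList());
    }
}
```

With one index per field, adding "Lucene Notes.txt" through addUntokenized makes only the full string findable, while adding "Lucene builds an inverted index" through addTokenized makes each individual word findable but not the sentence as a whole. That is the practical reason file names and paths go into a StringField and file contents go into a TextField.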

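5. Add the Document.

Call the addDocument method on the IndexWriter object to add the document to the index.

6. Close the IndexWriter object.

Once all documents have been added to the index, close the IndexWriter object.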
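Because IndexWriter implements java.io.Closeable, closing it can also be left to a try-with-resources statement instead of an explicit finally block. The sketch below shows the pattern with a hypothetical stand-in resource (ToyWriter and CloseDemo are invented for this illustration and are not Lucene classes), so that the order of events is visible: the resource is closed whether or not adding a document fails.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer-like resource that records what happens to it.
final class ToyWriter implements Closeable {
    final List<String> log = new ArrayList<>();

    void addDocument(String doc) throws IOException {
        if (doc.isEmpty()) {
            throw new IOException("empty document");
        }
        log.add("added " + doc);
    }

    @Override
    public void close() {
        log.add("closed");
    }
}

final class CloseDemo {
    // Adds every document; the writer is closed on success and on failure alike.
    static List<String> indexAll(List<String> docs) {
        ToyWriter writer = new ToyWriter();
        try (ToyWriter w = writer) {
            for (String doc : docs) {
                w.addDocument(doc);
            }
        } catch (IOException e) {
            // The resource has already been closed by the time this block runs.
            writer.log.add("failed: " + e.getMessage());
        }
        return writer.log;
    }
}
```

Applied to the example program, the same shape would be a try statement that declares the IndexWriter as its resource, which removes the need to check for null before calling close().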

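P.S. This blog post uses an anthology as an analogy to illustrate vividly how IndexWriter, Document, and Field relate to each other. It is worth a read: example

That concludes this walkthrough of Lucene's indexing steps.

Reposted from: https://www.cnblogs.com/itcsl/p/6828652.html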
