
Inserting data into ES: notes on importing a million records into Elasticsearch with Java

Published: 2025/3/21 java 52 豆豆


Environment

SpringBoot-2.1.8 + Elasticsearch-6.2.2

Version issues

The Spring website offers:

elasticsearch
spring-data-elasticsearch
spring-boot-starter-data-elasticsearch

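These are the three integration options to choose from. In my setup, Spring Boot 2.1.8 supports Elasticsearch up to version 6.2.2.

Connection methods

REST API, port 9200: this client connects in RESTful style over HTTP. It comes in two flavors, the Java Low Level REST Client and the Java High Level REST Client.
Transport, port 9300: this client connects directly to the ES nodes over TCP (given the addresses of several cluster nodes, it load-balances requests across that set of addresses).
Spring's wrappers: spring-data-elasticsearch and spring-boot-starter-data-elasticsearch.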
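As a rough illustration of what the REST route on port 9200 carries during a batch import, the sketch below builds a request body for Elasticsearch's _bulk endpoint: one action line followed by one source line per document, each terminated by a newline. The class and method names (BulkBodySketch, buildBulkBody, quote) are hypothetical and not from the original post, and the hand-rolled quoting is an assumption made to keep the example free of dependencies. A real client library would do this for you.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: builds the newline-delimited JSON body for the REST _bulk
// endpoint. BulkBodySketch, quote and buildBulkBody are hypothetical names.
class BulkBodySketch {

    // Escapes backslashes and double quotes only. This is not a complete
    // JSON encoder; use a real JSON library outside of a sketch.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // docs maps a document id to its already-serialized, single-line JSON source.
    static String buildBulkBody(String index, String type, Map<String, String> docs) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            // Action line: says where the following source line should be indexed.
            body.append("{\"index\":{\"_index\":").append(quote(index))
                .append(",\"_type\":").append(quote(type))
                .append(",\"_id\":").append(quote(e.getKey())).append("}}\n");
            // Source line: the document itself, followed by a newline.
            body.append(e.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"context\":\"111111\"}");
        docs.put("2", "{\"context\":\"222222\"}");
        // The resulting text would be POSTed to http://<host>:9200/_bulk
        // with the header Content-Type: application/x-ndjson.
        System.out.print(buildBulkBody("doclog", "docs", docs));
    }
}
```

The index name doclog and type docs are the ones used in this post. The _type field in the action line applies to the 6.x series used here.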

Test code

import org.junit.Test;
import org.junit.runner.RunWith;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.core.query.IndexQuery;
import org.springframework.test.context.junit4.SpringRunner;
import org.xinhua.cbcloud.pojo.DocLog;
import org.xinhua.cbcloud.repository.DocLogRepository;
import org.xinhua.cbcloud.util.IDUtil;

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

@RunWith(SpringRunner.class)
@SpringBootTest
public class LyfeifeiApplicationTests {

    private Logger logger = LoggerFactory.getLogger(LyfeifeiApplicationTests.class);

    @Autowired
    ElasticsearchTemplate elasticsearchTemplate;

    @Autowired
    DocLogRepository docLogRepository;

    @Test
    public void createIndex() {
        // Create the index
        System.out.println(elasticsearchTemplate.createIndex(DocLog.class));
    }

    @Test
    public void deleteIndex() throws Exception {
        System.out.println(elasticsearchTemplate.deleteIndex(DocLog.class));
    }

    @Test
    public void insert() throws Exception {
        logger.info("Start");

        // Generate one million test documents in memory
        List<DocLog> docLogs = new ArrayList<>();
        for (int i = 0; i < 1000000; i++) {
            DocLog docLog = new DocLog();
            docLog.setId(IDUtil.getId());
            docLog.setDocId(String.valueOf(IDUtil.getId()));
            docLog.setMessageId("1569482824826#b807cc4a-6258-4f61-a445-c97006d9512c");
            docLog.setContext("111111");
            docLogs.add(docLog);
        }

        // Index queue
        List<IndexQuery> indexQueries = new LinkedList<>();
        for (DocLog docLog : docLogs) {
            IndexQuery indexQuery = new IndexQuery();
            // The Elasticsearch _id is generated separately from docLog's own id
            indexQuery.setId(String.valueOf(IDUtil.getId()));
            indexQuery.setObject(docLog);
            indexQuery.setIndexName("doclog");
            indexQuery.setType("docs");
            indexQueries.add(indexQuery);

            // Flush once the queue holds 5000 entries. Checking the queue size
            // avoids sending a one-element batch on the first iteration.
            if (indexQueries.size() >= 5000) {
                elasticsearchTemplate.bulkIndex(indexQueries);
                indexQueries.clear();
            }
        }

        // Flush whatever is left over
        if (indexQueries.size() > 0) {
            elasticsearchTemplate.bulkIndex(indexQueries);
        }
        logger.info("Done");
    }
}
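The batching step in the test can be looked at separately from Elasticsearch. Below is a minimal sketch, assuming a generic sink in place of the bulk call. The names BatchFlushSketch and flushInBatches are hypothetical and not part of the original code. Flushing when the buffer reaches the batch size, rather than testing a separate counter, keeps every batch except possibly the last at exactly the batch size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch only: BatchFlushSketch and flushInBatches are hypothetical names.
class BatchFlushSketch {

    // Sends items to the sink in chunks of at most batchSize and
    // returns the number of flushes performed.
    static <T> int flushInBatches(Iterable<T> items, int batchSize, Consumer<List<T>> sink) {
        List<T> buffer = new ArrayList<>(batchSize);
        int flushes = 0;
        for (T item : items) {
            buffer.add(item);
            if (buffer.size() == batchSize) {
                // Hand over a copy so clearing the buffer cannot affect the sink.
                sink.accept(new ArrayList<>(buffer));
                buffer.clear();
                flushes++;
            }
        }
        if (!buffer.isEmpty()) {
            // Remainder that did not fill a whole batch.
            sink.accept(new ArrayList<>(buffer));
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            items.add(i);
        }
        List<Integer> sizes = new ArrayList<>();
        int flushes = flushInBatches(items, 5, batch -> sizes.add(batch.size()));
        System.out.println(flushes + " flushes, sizes " + sizes);
    }
}
```

With 1,000,000 documents and a batch size of 5000, this scheme performs exactly 200 flushes and leaves no remainder.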

Test results

Final notes

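The ES instance here is a single node, not a cluster. Generating 1,000,000 records and importing them into ES took under two minutes, which seems acceptable. None of this is especially hard; ES's real strength is in query and retrieval. The bulkIndex method, called here on ElasticsearchTemplate, is the batch-import method used. I also tried importing with multiple threads, but both the single-document and the batch variants failed. I suspect an internal lock is the cause, but I have not looked into it in depth yet.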
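The failed multi-threaded attempt is not shown in the post, so its cause stays open. For reference only, here is one way a multi-threaded batch import is often structured. Each task receives its own copy of a slice, so no two threads share a mutable list, and any worker failure is rethrown to the caller. The names ParallelBulkSketch and submitBatches are hypothetical, and the sink stands in for the real bulk call. Whether that call can safely be invoked from several threads at once is an assumption to verify, not something this sketch establishes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Sketch only: ParallelBulkSketch and submitBatches are hypothetical names.
class ParallelBulkSketch {

    // Splits items into batches and passes each batch to the sink on a worker
    // thread. The sink must be safe to call concurrently (assumption).
    static <T> void submitBatches(List<T> items, int batchSize, int threads,
                                  Consumer<List<T>> sink) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int from = 0; from < items.size(); from += batchSize) {
                int to = Math.min(from + batchSize, items.size());
                // Copy the slice so no two tasks share a mutable list.
                List<T> batch = new ArrayList<>(items.subList(from, to));
                pending.add(pool.submit(() -> sink.accept(batch)));
            }
            for (Future<?> f : pending) {
                f.get(); // waits for the task and rethrows any worker failure
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 23; i++) {
            items.add(i);
        }
        AtomicInteger indexed = new AtomicInteger();
        submitBatches(items, 5, 4, batch -> indexed.addAndGet(batch.size()));
        System.out.println("indexed " + indexed.get());
    }
}
```

One common failure mode in hand-written versions is several threads appending to and clearing the same list. The per-task copy above rules that out, but it does not explain what went wrong in the attempt described here.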
