How Elasticsearch determines the data storage path for an index

Published: 2025/3/15 · 豆豆
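In Elasticsearch, a node's configuration can set path.data as the directory where the node stores its data, and path.data accepts multiple values, so one node can have several data paths. How does Elasticsearch decide which of those paths a new shard is written to? This post records what I found.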

The Elasticsearch index creation process

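  • When the cluster master receives a create-index request, it runs through the index-creation steps and finally submits the completed index creation as an update to the ClusterState.
  • The master publishes the updated ClusterState to all nodes.
  • Each node that has to create a shard reads its locally available path.data entries and picks one of them according to a set of rules.
  • The node creates the base shard directory and saves the basic shard information.

How the target directory is chosen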

Source code

The work is done mainly by the selectNewPathForShard method of ShardPath:

    for (NodeEnvironment.NodePath nodePath : env.nodePaths()) {
        totFreeSpace = totFreeSpace.add(BigInteger.valueOf(nodePath.fileStore.getUsableSpace()));
    }

    // TODO: this is a hack!! We should instead keep track of incoming (relocated) shards since we know
    // how large they will be once they're done copying, instead of a silly guess for such cases:

    // Very rough heuristic of how much disk space we expect the shard will use over its lifetime, the max of current average
    // shard size across the cluster and 5% of the total available free space on this node:
    BigInteger estShardSizeInBytes = BigInteger.valueOf(avgShardSizeInBytes).max(totFreeSpace.divide(BigInteger.valueOf(20)));

    // TODO - do we need something more extensible? Yet, this does the job for now...
    final NodeEnvironment.NodePath[] paths = env.nodePaths();

    // If no better path is chosen, use the one with the most space by default
    NodeEnvironment.NodePath bestPath = getPathWithMostFreeSpace(env);

    if (paths.length != 1) {
        Map<NodeEnvironment.NodePath, Long> pathToShardCount = env.shardCountPerPath(shardId.getIndex());

        // Compute how much space there is on each path
        final Map<NodeEnvironment.NodePath, BigInteger> pathsToSpace = new HashMap<>(paths.length);
        for (NodeEnvironment.NodePath nodePath : paths) {
            FileStore fileStore = nodePath.fileStore;
            BigInteger usableBytes = BigInteger.valueOf(fileStore.getUsableSpace());
            pathsToSpace.put(nodePath, usableBytes);
        }

        bestPath = Arrays.stream(paths)
            // Filter out paths that have enough space
            .filter((path) -> pathsToSpace.get(path).subtract(estShardSizeInBytes).compareTo(BigInteger.ZERO) > 0)
            // Sort by the number of shards for this index
            .sorted((p1, p2) -> {
                int cmp = Long.compare(pathToShardCount.getOrDefault(p1, 0L),
                    pathToShardCount.getOrDefault(p2, 0L));
                if (cmp == 0) {
                    // if the number of shards is equal, tie-break with the number of total shards
                    cmp = Integer.compare(dataPathToShardCount.getOrDefault(p1.path, 0),
                        dataPathToShardCount.getOrDefault(p2.path, 0));
                    if (cmp == 0) {
                        // if the number of shards is equal, tie-break with the usable bytes
                        cmp = pathsToSpace.get(p2).compareTo(pathsToSpace.get(p1));
                    }
                }
                return cmp;
            })
            // Return the first result
            .findFirst()
            // Or the existing best path if there aren't any that fit the criteria
            .orElse(bestPath);
    }

    statePath = bestPath.resolve(shardId);
    dataPath = statePath;
    } // closes an enclosing block whose opening is not part of this excerpt
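The size estimate near the top of the excerpt is easy to misread, so here is a minimal standalone sketch of just that arithmetic. The class ShardSizeEstimate and its estimate method are hypothetical names made up for illustration and are not part of Elasticsearch; only the formula (the larger of the average shard size and 1/20 of the total free space) is taken from the excerpt.

```java
import java.math.BigInteger;

// Hypothetical illustration class, not Elasticsearch code.
class ShardSizeEstimate {

    // Sums the usable bytes of every data path, then returns
    // max(avgShardSizeInBytes, totalFreeSpace / 20), i.e. at least 5% of the free space.
    static BigInteger estimate(long avgShardSizeInBytes, long[] usableBytesPerPath) {
        BigInteger totFreeSpace = BigInteger.ZERO;
        for (long usable : usableBytesPerPath) {
            totFreeSpace = totFreeSpace.add(BigInteger.valueOf(usable));
        }
        return BigInteger.valueOf(avgShardSizeInBytes)
            .max(totFreeSpace.divide(BigInteger.valueOf(20)));
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        long[] usable = {100 * gb, 300 * gb}; // two data paths, 400 GB free in total

        // Average shard is small: 5% of 400 GB wins.
        System.out.println(estimate(2 * gb, usable).divide(BigInteger.valueOf(gb)));  // prints 20
        // Average shard is large: the average wins.
        System.out.println(estimate(50 * gb, usable).divide(BigInteger.valueOf(gb))); // prints 50
    }
}
```

So the 5% figure is a lower bound on the guessed shard size, computed from the free space of all paths together, and a path is later kept as a candidate only if its own usable space exceeds that guess.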

Walkthrough

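  1. First, check whether a custom data path is configured. That check happens before the excerpt above; the excerpt is the branch that chooses among the node's path.data directories.
  2. Add up the usable space of all the node's paths, then estimate the size of the new shard as the larger of the cluster's average shard size and 5% of that total free space.
  3. Get all the paths.
  4. Set the default best path to the path that currently has the most free space.
  5. If there is more than one path, go through all of them and filter out every path whose usable space does not exceed the estimated shard size. If no path is left, return the path from step 4; otherwise continue with step 6.
  6. Sort the remaining paths by the following rules and take the first one. First compare the number of shards of this index on each path, preferring the path that holds the fewest;
     if that is equal, compare the total number of shards on each path (across all indices), preferring the path that holds the fewest;
     if that is also equal, compare usable space, preferring the path with the most.
  7. Resolve the shard's directory under the chosen path and create the directories and related information.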
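The filtering and tie-breaking rules can be tried out without an Elasticsearch node. The following is a minimal sketch under stated assumptions: PathSelectionSketch, Candidate and select are hypothetical names, the shard counts are passed in directly instead of being looked up from maps, and the "most free space" fallback is a simple stand-in for getPathWithMostFreeSpace, whose implementation is not shown in the source above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class, not Elasticsearch code.
class PathSelectionSketch {

    // Stand-in for a data path together with the numbers the real method looks up.
    static final class Candidate {
        final String path;
        final long usableBytes;       // free space on this path
        final long shardsOfThisIndex; // shards of the index being created that already live here
        final int totalShards;        // shards of all indices on this path

        Candidate(String path, long usableBytes, long shardsOfThisIndex, int totalShards) {
            this.path = path;
            this.usableBytes = usableBytes;
            this.shardsOfThisIndex = shardsOfThisIndex;
            this.totalShards = totalShards;
        }
    }

    // Expects a non-empty list of candidates.
    static Candidate select(List<Candidate> candidates, long estShardSizeInBytes) {
        // Default: the path with the most free space (assumed fallback behaviour).
        Candidate fallback = candidates.stream()
            .max(Comparator.comparingLong((Candidate c) -> c.usableBytes))
            .get();
        if (candidates.size() == 1) {
            return fallback;
        }
        return candidates.stream()
            // Keep only paths with more usable space than the estimated shard size.
            .filter(c -> c.usableBytes > estShardSizeInBytes)
            // Fewest shards of this index, then fewest shards overall, then most free space.
            .sorted(Comparator.comparingLong((Candidate c) -> c.shardsOfThisIndex)
                .thenComparingInt(c -> c.totalShards)
                .thenComparing(Comparator.comparingLong((Candidate c) -> c.usableBytes).reversed()))
            .findFirst()
            // Nothing fits: fall back to the path with the most free space.
            .orElse(fallback);
    }

    public static void main(String[] args) {
        // Sizes are in arbitrary units to keep the numbers readable.
        List<Candidate> paths = Arrays.asList(
            new Candidate("/data1", 500, 2, 10),
            new Candidate("/data2", 200, 1, 12),
            new Candidate("/data3", 5, 0, 0));

        // /data3 is filtered out (5 <= 20); /data2 holds fewer shards of this index than /data1.
        System.out.println(select(paths, 20).path);  // prints /data2
        // No path has more than 600 free, so the fallback with the most space is used.
        System.out.println(select(paths, 600).path); // prints /data1
    }
}
```

Note that free space only decides the outcome as the last tie-breaker or as the fallback, so an emptier disk does not automatically receive the next shard.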