Building a full index of a large MySQL table with Solr's DataImportHandler: fixing the out-of-memory problem

Published: 2025/4/5 · Category: Database

The fix given in the official Solr documentation is:

DataImportHandler is designed to stream row one-by-one. It passes a fetch size value (default: 500) to Statement#setFetchSize which some drivers do not honor. For MySQL, add batchSize property to dataSource configuration with value -1. This will pass Integer.MIN_VALUE to the driver as the fetch size and keep it from going out of memory for large tables.

Should look like:

    <dataSource type="JdbcDataSource"
                name="ds-2"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost:8889/mysqldatabase"
                batchSize="-1"
                user="root"
                password="root"/>

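Explanation: DataImportHandler is designed to stream rows one at a time. It uses Statement#setFetchSize to set how many rows are fetched per round trip, 500 by default. Some drivers, however, do not honor the fetch size. For MySQL, set the batchSize attribute to -1 in the dataSource configuration. DataImportHandler then passes Integer.MIN_VALUE (-2^31 = -2147483648, 0x80000000) to the driver as the fetch size, which keeps a large table from causing an out-of-memory error.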
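The mapping described above can be sketched in a few lines of Java. This is only an illustration of the rule the FAQ states (batchSize of -1 becomes Integer.MIN_VALUE, anything else is passed through), not Solr's actual source; the class name FetchSizeMapping and the method effectiveFetchSize are hypothetical.

```java
// Illustrative sketch only: FetchSizeMapping and effectiveFetchSize are
// hypothetical names, not part of Solr's API.
class FetchSizeMapping {

    // Default fetch size named in the FAQ quoted above.
    static final int DEFAULT_BATCH_SIZE = 500;

    // Maps a configured batchSize to the value handed to Statement#setFetchSize.
    // -1 is the special value that becomes Integer.MIN_VALUE; other values pass through.
    static int effectiveFetchSize(int batchSize) {
        return batchSize == -1 ? Integer.MIN_VALUE : batchSize;
    }

    public static void main(String[] args) {
        System.out.println(effectiveFetchSize(DEFAULT_BATCH_SIZE)); // prints 500
        System.out.println(effectiveFetchSize(-1));                 // prints -2147483648
        System.out.println(Integer.MIN_VALUE == 0x80000000);        // prints true
    }
}
```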

The explanation in the official MySQL documentation is:

ResultSet

By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate and, due to the design of the MySQL network protocol, is easier to implement. If you are working with ResultSets that have a large number of rows or large values and cannot allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.

To enable this functionality, create a Statement instance in the following manner:

    stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
                                java.sql.ResultSet.CONCUR_READ_ONLY);
    stmt.setFetchSize(Integer.MIN_VALUE);

The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this, any result sets created with the statement will be retrieved row-by-row.

There are some caveats with this approach. You must read all of the rows in the result set (or close it) before you can issue any other queries on the connection, or an exception will be thrown.

The earliest the locks these statements hold can be released (whether they be MyISAM table-level locks or row-level locks in some other storage engine such as InnoDB) is when the statement completes.

If the statement is within scope of a transaction, then locks are released when the transaction completes (which implies that the statement needs to complete first). As with most other databases, statements are not complete until all the results pending on the statement are read or the active result set for the statement is closed.

Therefore, if using streaming results, process them as quickly as possible if you want to maintain concurrent access to the tables referenced by the statement producing the result set.

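In short: a forward-only, read-only ResultSet combined with a fetch size of Integer.MIN_VALUE is the signal that tells the driver to stream results row by row. Once this is set, every ResultSet created by that Statement is retrieved one row at a time.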
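As a minimal sketch of the pattern quoted from the MySQL documentation, the following Java wraps the two calls in a helper and reads a streamed result set to the end. The class name StreamingRead, the helper names, and the command-line arguments are hypothetical; it assumes a MySQL JDBC driver is on the classpath and is not meant as a definitive implementation.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative sketch only: StreamingRead and its methods are hypothetical names.
class StreamingRead {

    // Creates a forward-only, read-only Statement with fetch size Integer.MIN_VALUE,
    // the combination the MySQL driver treats as a request to stream row by row.
    static Statement newStreamingStatement(Connection conn) throws SQLException {
        Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                              ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE);
        return stmt;
    }

    // Reads every row of the query and returns the row count. try-with-resources
    // closes the result set and statement, so the connection can be reused afterwards.
    static long countRows(Connection conn, String sql) throws SQLException {
        long rows = 0;
        try (Statement stmt = newStreamingStatement(conn);
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                rows++; // process each row here; keep the work short, since locks are held
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 4) {
            System.err.println("usage: StreamingRead <jdbc-url> <user> <password> <sql>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            System.out.println(countRows(conn, args[3]));
        }
    }
}
```

Closing the ResultSet before issuing another query on the same connection matters here, because of the caveat in the quoted documentation: an unfinished streaming result set blocks further queries on that connection.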

References:

[1] https://wiki.apache.org/solr/DataImportHandlerFaq

[2] http://dev.mysql.com/doc/connector-j/en/connector-j-reference-implementation-notes.html

Reposted from: https://www.cnblogs.com/davidwang456/p/4800911.html
