OMG! Another Case of Frequent Full GC
We were migrating users' installed-app data from MySQL to MongoDB. The MySQL layout is simple: one row per user per installed app, mapped to the data model AppFromMySQL. For MongoDB we wanted to make better use of its strengths, so the corresponding data model is UserAppMongo, which looks like this as JSON:

    {
      "id": "201811040001",
      "userId": "12",
      "appMongoList": [
        {
          "appName": "Alipay",
          "packageName": "com.alipay",
          "iconUrl": "http://s3.domain.com/12/12/com.alipay.jpg"
        },
        {
          "appName": "Taobao",
          "packageName": "com.alibaba.taobao",
          "iconUrl": "http://s3.domain.com/12/12/com.alibaba.taobao.jpg"
        }
      ]
    }

Reproducing the problem
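As usual, to make the problem easy to reproduce, here is a condensed version of the code:

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;
    import java.util.Random;

    class AppMongo {
        private String appName;
        private String packageName;
        private int versionCode;
        private Date installTime;
        private String iconUrl;
        private String downloadUrl;
        private String remark;
        private Long size;
        private String developer;
    }

    // Installed-app info to be saved to MongoDB. Stored this way, user_id in the
    // installed_apps collection can carry a unique constraint, and queries are
    // much faster than over the flat rows in the RDBMS.
    class UserAppMongo {
        private String id;
        private Long userId;
        private List<AppMongo> appMongoList;

        // Setters used by the sample
        void setId(String id) { this.id = id; }
        void setUserId(Long userId) { this.userId = userId; }
        void setAppMongoList(List<AppMongo> appMongoList) { this.appMongoList = appMongoList; }
    }

    // A user's installed app as stored in the relational database
    class AppFromMySQL {
        private int id;
        private Long userId;
        private String packageName;
        private int versionCode;
        private Date installTime;
        private String appName;
        private String iconUrl;
        private String downloadUrl;
        private String remark;
        private Long size;
        private String developer;

        AppFromMySQL(int id, Long userId, String packageName, int versionCode,
                     Date installTime, String appName) {
            this.id = id;
            this.userId = userId;
            this.packageName = packageName;
            this.versionCode = versionCode;
            this.installTime = installTime;
            this.appName = appName;
        }

        // Setters used by the sample
        void setIconUrl(String iconUrl) { this.iconUrl = iconUrl; }
        void setDownloadUrl(String downloadUrl) { this.downloadUrl = downloadUrl; }
        void setRemark(String remark) { this.remark = remark; }
        void setSize(Long size) { this.size = size; }
        void setDeveloper(String developer) { this.developer = developer; }
    }

    public class FullGCSample {

        public static void main(String[] args) throws Exception {
            for (int pageNo = 0; pageNo < 10000; pageNo++) {
                List<Long> userList = getUserIdByPage(pageNo);
                List<UserAppMongo> userAppMongoList = new ArrayList<>(userList.size());
                for (Long userId : userList) {
                    List<AppFromMySQL> appFromMySQLList = getUserInstalledAppList(userId);
                    UserAppMongo userAppMongo = new UserAppMongo();
                    userAppMongo.setId(System.nanoTime() + ""); // test code: fake a pseudo-unique ID
                    userAppMongo.setUserId(userId);
                    userAppMongo.setAppMongoList(appFromMySQL2AppMongo(appFromMySQLList));
                    userAppMongoList.add(userAppMongo);
                }
                // save List<UserAppMongo> to mongodb
                save2MongoDB(userAppMongoList);
            }
        }

        private static void save2MongoDB(List<UserAppMongo> userAppMongoList) throws Exception {
            // Simulate a write to MongoDB that takes 5 ms
            Thread.sleep(5);
        }

        private static List<AppMongo> appFromMySQL2AppMongo(List<AppFromMySQL> list) {
            List<AppMongo> appMongoList = new ArrayList<>();
            for (AppFromMySQL app : list) {
                AppMongo appMongo = new AppMongo();
                // TODO bean copy
                appMongoList.add(appMongo);
            }
            return appMongoList;
        }

        private static List<AppFromMySQL> getUserInstalledAppList(Long userId) {
            List<AppFromMySQL> appFromMySQLList = new ArrayList<>();
            // Assume each user has between 50 and 200 apps installed
            int size = 50 + new Random().nextInt(150);
            for (int i = 0; i < size; i++) {
                AppFromMySQL appFromMySQL = new AppFromMySQL(
                        i, (long) i, "com.afei.android" + i, i, new Date(), "appName" + i);
                appFromMySQL.setIconUrl(String.valueOf(i));
                appFromMySQL.setDownloadUrl(String.valueOf(i));
                appFromMySQL.setRemark(String.valueOf(i));
                appFromMySQL.setSize((long) i);
                appFromMySQL.setDeveloper(String.valueOf(i));
                appFromMySQLList.add(appFromMySQL);
            }
            return appFromMySQLList;
        }

        private static List<Long> getUserIdByPage(int pageNo) {
            List<Long> userList = new ArrayList<>();
            // One page of user IDs (2000 per page in this loop)
            for (int i = 0; i < 2000; i++) {
                userList.add((long) i);
            }
            return userList;
        }
    }

The JVM options used are as follows (this is a migration program, so there is no need to configure CMS or G1; the default PS collector is enough):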
    -Xmx400m -Xms400m -Xmn150m -verbose:gc -XX:+PrintGCDetails

After starting the program, jstat -gcutil 57408 2s reports:

      S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
     29.81  82.88 100.00  39.35  61.05  61.52     40   16.274     7    6.756   23.030
     91.43  21.01 100.00  39.26  61.05  61.52     45   17.791     8    7.327   25.118
      0.00  90.53   0.00  88.47  61.05  61.52     47   18.694     9    7.327   26.021
     23.00   0.00 100.00  19.10  61.05  61.52     52   19.655    10    9.227   28.882
     93.29   0.00   0.00  90.25  61.05  61.52     56   21.326    11    9.227   30.553
     94.21   0.00   0.00  82.39  61.05  61.52     60   22.435    12   10.253   32.688
     93.23  93.23 100.00  71.09  61.05  61.52     64   23.223    12   11.027   34.250

There are two fairly serious problems here:
1. The Old generation fills up too fast;
2. Full GC (FGC) happens too often.
In fact, the second problem is caused by the first.
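Both symptoms can also be watched from inside the JVM. The sketch below uses the standard java.lang.management beans to report collection counts and Old occupancy, roughly what the YGC/FGC and O columns of jstat show. GcSnapshot is a hypothetical helper class, and matching the pool by name assumes HotSpot's pool naming.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class GcSnapshot {

    /** Sum of the collection counts reported by all collectors of this JVM. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Old occupancy as a percentage of its committed size, or -1 if no such pool is found. */
    static double oldGenPercent() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: HotSpot names such as "PS Old Gen", "G1 Old Gen", "Tenured Gen"
            if (name.contains("Old Gen") || name.contains("Tenured")) {
                MemoryUsage usage = pool.getUsage();
                if (usage == null || usage.getCommitted() <= 0) {
                    return 0.0;
                }
                return 100.0 * usage.getUsed() / usage.getCommitted();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("old gen used: %.2f%%%n", oldGenPercent());
    }
}
```

Calling oldGenPercent() periodically from the migration loop would show the same sawtooth that the O column shows above, without attaching an external tool.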
Analyzing the problem
This case is a bit unusual: although FGC is frequent, Old drops back down after every FGC. In this situation a dump file from jmap -dump, or an object histogram from jmap -histo, is unlikely to help, because the snapshot will most probably be taken while Old occupancy is low, and such a result has little diagnostic value:

    [afei@node1 ~]# jstat -gcutil 121165 100
      S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
      0.00   0.00  40.00  15.71  58.25  51.76    287    7.891    63    2.921   10.812
     96.58   0.00  18.00  34.05  58.25  51.76    289    7.937    63    2.921   10.858
     96.84   0.00   0.00  70.73  58.25  51.76    291    8.001    63    2.921   10.923
      0.00   0.00   0.00  27.31  58.25  51.76    291    8.033    64    2.978   11.010
      0.00  99.47   0.00  45.80  58.25  51.76    293    8.077    64    2.978   11.055
      0.00  96.84   0.00  83.17  58.25  51.76    295    8.144    65    2.978   11.121
     96.91   0.00   0.00  21.68  58.25  51.76    296    8.157    65    3.026   11.183

So is there another way to produce a dump file while Old occupancy is high, or even right before an FGC? There is. Two options do exactly this: -XX:+HeapDumpAfterFullGC and -XX:+HeapDumpBeforeFullGC. As the names suggest, they write a dump file after and before an FGC respectively. Note that the trigger must be a real Full GC, not a concurrent collection such as those of CMS or G1. After adding -XX:+HeapDumpBeforeFullGC and running again, the GC log looks like this, with the dump file written just before the FGC:
    [GC (Allocation Failure) [PSYoungGen: 94016K->42816K(102400K)] 236438K->227942K(358400K), 0.0661795 secs] [Times: user=0.62 sys=0.88, real=0.07 secs]
    [GC (Allocation Failure) [PSYoungGen: 94016K->42752K(102400K)] 279142K->270606K(358400K), 0.0711319 secs] [Times: user=0.60 sys=1.01, real=0.07 secs]
    [Heap Dump (before full gc): Dumping heap to java_pid121598.hprof ...
    Heap dump file created [366886452 bytes in 1.878 secs]
    , 1.8782650 secs][Full GC (Ergonomics) [PSYoungGen: 42752K->0K(102400K)] [ParOldGen: 227854K->41341K(256000K)] 270606K->41341K(358400K), [Metaspace: 2828K->2828K(1056768K)], 0.1720676 secs] [Times: user=3.72 sys=0.07, real=0.17 secs]

Analyzing the dump file gives the result below; two of the classes near the top are UserAppMongo and AppMongo:
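[Figure: heap dump histogram]

Selecting the top object, UserAppMongo, and choosing "List Objects" -> "with outgoing references" gives the view below. It shows that a UserAppMongo holds a List<AppMongo> (appMongoList), which is backed by an Object array, and that each AppMongo in turn consists of fields such as appName, packageName and installTime. So the top entries in the Histogram view (UserAppMongo, Object[], ArrayList, AppMongo) all belong to the same thing, the UserAppMongo object graph:

[Figure: outgoing references]

The migration program is simple, with only a few lines of core code. Starting from the suspect object UserAppMongo and reviewing the code, we quickly came to suspect this section: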
    List<Long> userList = getUserIdByPage(pageNo);
    List<UserAppMongo> userAppMongoList = new ArrayList<>(userList.size());
    for (Long userId : userList) {
        List<AppFromMySQL> appFromMySQLList = getUserInstalledAppList(userId);
        UserAppMongo userAppMongo = new UserAppMongo();
        userAppMongo.setId(System.nanoTime() + "");
        userAppMongo.setUserId(userId);
        userAppMongo.setAppMongoList(appFromMySQL2AppMongo(appFromMySQLList));
        userAppMongoList.add(userAppMongo);
    }
    // save List<UserAppMongo> to mongodb
    save2MongoDB(userAppMongoList);

The logic of this code is:
1. Fetch a batch of user IDs;
2. Iterate over those IDs, fetch each user's installed apps and convert them into the data model MongoDB needs;
3. Save the whole batch to MongoDB.
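Looking closely at this code, we find that while a page is being processed, a total of pageSize*n*2 objects stay alive until they have been saved to MongoDB, and they are only released once the loop moves on to the next page. Here pageSize is the number of users per page (see getUserIdByPage) and n is the average number of apps installed per user; the factor of 2 is there because half of the objects are MySQL model objects and the other half are MongoDB model objects. Assuming 1,000 users per page and an average of 100 installed apps per user, 1000*100*2 = 200,000 objects remain resident while each page is processed and cannot be collected.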
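The estimate above is plain multiplication, and a minimal sketch makes it easy to try other page sizes. ResidentObjectEstimate and residentObjects are hypothetical names; the count covers only the per-app model objects from the text and ignores strings, dates and list wrappers.

```java
class ResidentObjectEstimate {

    /**
     * Objects kept alive while one page is processed, following the text:
     * pageSize users * appsPerUser apps * 2 data models (MySQL and MongoDB).
     */
    static long residentObjects(long pageSize, long appsPerUser) {
        return pageSize * appsPerUser * 2;
    }

    public static void main(String[] args) {
        System.out.println(residentObjects(1000, 100)); // 200000
        System.out.println(residentObjects(2000, 100)); // 400000
    }
}
```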
How to fix it
Once the root cause is understood, the problem is fairly easy to solve, and there are several ways to do it.
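- Method 1: enlarge the Young generation
Method 1 is to enlarge the Young generation, or more precisely Eden, until it is big enough to hold the 200,000 objects. If the migration program later changes pageSize to 2000, Eden has to be enlarged again until it can hold 400,000 objects.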
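How much Eden is enough depends on how many bytes those objects occupy, which the text does not state. As a rough sketch, the class below multiplies the object count by an assumed average size and compares the result with the Eden size the running JVM reports. The class name, the method names and the 100-byte figure are illustrative assumptions, not measurements from the migration program.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class EdenSizing {

    /** Bytes needed to hold objectCount objects of the assumed average size. */
    static long requiredBytes(long objectCount, long assumedBytesPerObject) {
        return objectCount * assumedBytesPerObject;
    }

    /** Currently committed Eden size in bytes, or -1 if no Eden pool is reported. */
    static long edenCommittedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: HotSpot pool names such as "PS Eden Space" or "G1 Eden Space"
            if (pool.getName().contains("Eden")) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1 : usage.getCommitted();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long needed = requiredBytes(200_000, 100); // 100 bytes per object is a made-up figure
        long eden = edenCommittedBytes();
        System.out.printf("needed=%d bytes, eden=%d bytes, fits=%b%n",
                needed, eden, eden >= needed);
    }
}
```

To get a realistic per-object figure, one option is to divide the shallow heap of the top classes in the heap dump histogram by their instance counts.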
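- Method 2: optimize the code
With method 1 the JVM options stay coupled to the value of pageSize, which is restrictive. Can we instead make the maximum number of resident objects fixed no matter how large pageSize is? We can; see the optimized code below: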
    List<Long> userList = getUserIdByPage(pageNo);
    List<UserAppMongo> userAppMongoList = new ArrayList<>(userList.size());
    for (Long userId : userList) {
        List<AppFromMySQL> appFromMySQLList = getUserInstalledAppList(userId);
        UserAppMongo userAppMongo = new UserAppMongo();
        userAppMongo.setId(System.nanoTime() + "");
        userAppMongo.setUserId(userId);
        userAppMongo.setAppMongoList(appFromMySQL2AppMongo(appFromMySQLList));
        userAppMongoList.add(userAppMongo);
        // Core optimization: flush as soon as the batch reaches the threshold
        if (userAppMongoList.size() >= threshold) {
            save2MongoDB(userAppMongoList);
            userAppMongoList.clear();
        }
    }
    // save the remaining List<UserAppMongo> to mongodb, if any
    if (!userAppMongoList.isEmpty()) {
        save2MongoDB(userAppMongoList);
    }

Notes:
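Pick any reasonable value for threshold in the core optimization. Then, however large pageSize is in getUserIdByPage(), the only extra non-collectible objects on the heap are a few more userIds.
With threshold set to 500, the number of resident, non-collectible objects before the loop moves on to the next page is 500*100*2 = 100,000, where 100 is the average number of apps installed per user.
After this change, whether getUserIdByPage() fetches users with a pageSize of 1000, 5000 or 20000, the JVM options need no adjustment and the program runs very stably. The output of jstat -gcutil 56436 2s is shown below: after running for a while there has been no FGC, and the growth of Old is acceptable: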
      S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
     35.87   0.00  54.00   3.64  61.16  61.52     52    3.894     0    0.000    3.894
      0.00  50.37  48.00   3.89  61.16  61.52     67    4.392     0    0.000    4.392
     12.41   0.00  46.00   4.14  61.16  61.52     80    4.990     0    0.000    4.990
      1.66  14.04 100.00   4.38  61.16  61.52     89    5.636     0    0.000    5.636
      0.00  27.05  24.00   4.63  61.16  61.52    103    6.146     0    0.000    6.146
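The flush-at-threshold pattern from method 2 can also be packaged as a small reusable helper, so the batching logic is not repeated in every migration loop. The following is only a sketch: BatchBuffer, its methods and the sink callback are hypothetical names, and in the migration program the sink would be a call to save2MongoDB.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Buffers items and hands them to a sink whenever the buffer reaches the threshold. */
class BatchBuffer<T> {
    private final int threshold;
    private final Consumer<List<T>> sink;
    private final List<T> buffer;

    BatchBuffer(int threshold, Consumer<List<T>> sink) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
        this.sink = sink;
        this.buffer = new ArrayList<>(threshold);
    }

    /** Adds one item and flushes automatically once the threshold is reached. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= threshold) {
            flush();
        }
    }

    /** Sends whatever is buffered to the sink; call once more after the last item. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // pass a copy so the sink may keep the list
        buffer.clear();
    }

    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        BatchBuffer<Long> users = new BatchBuffer<>(500, batch -> batchSizes.add(batch.size()));
        for (long userId = 0; userId < 1200; userId++) {
            users.add(userId);
        }
        users.flush(); // flush the remainder
        System.out.println(batchSizes); // [500, 500, 200]
    }
}
```

Because the buffer never grows beyond threshold items, the number of objects it keeps reachable is bounded independently of pageSize, which is the property method 2 relies on.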