Linux Kernel Panic: Troubleshooting Approaches
Published: 2025/6/15
豆豆
For a system administrator, few things are more frustrating than a server that will not come back up after a reboot. Connect a monitor and you will often find the words "Kernel Panic" on the screen. Peace found a detailed article on the subject online and is posting it here to share. The article follows.

Linux has no blue screen, but kernel errors can still be a headache. Sometimes the system is fine after a reboot and then goes down again after running for a while, and you cannot simply reboot every time something goes wrong. I gathered material from around the web and organized it here; I hope you will share your own views and experience in the comments.

What is a Kernel Panic?

wiki: A kernel panic is an action taken by an operating system upon detecting an internal fatal error from which it cannot safely recover. The term is largely specific to Unix and Unix-like systems; for Microsoft Windows operating systems the equivalent term is “Bug check” (or, colloquially, “Blue Screen of Death”).

The kernel routines that handle panics (in AT&T-derived and BSD Unix source code, a routine known as panic()) are generally designed to output an error message to the console, dump an image of kernel memory to disk for post-mortem debugging and then either wait for the system to be manually rebooted, or initiate an automatic reboot. The information provided is of highly technical nature and aims to assist a system administrator or software developer in diagnosing the problem.

Attempts by the operating system to read an invalid or non-permitted memory address are a common source of kernel panics. A panic may also occur as a result of a hardware failure or a bug in the operating system. In many cases, the operating system could continue operation after memory violations have occurred. However, the system is in an unstable state and rather than risking security breaches and data corruption, the operating system stops to prevent further damage and facilitate diagnosis of the error.

The kernel panic was introduced in an early version of Unix and demonstrated a major difference between the design philosophies of Unix and its predecessor Multics.
Multics developer Tom van Vleck recalls a discussion of this change with Unix developer Dennis Ritchie: I remarked to Dennis that easily half the code I was writing in Multics was error recovery code. He said, “We left all that stuff out. If there’s an error, we have this routine called panic, and when it is called, the machine crashes, and you holler down the hall, ‘Hey, reboot it.’”[1]

The original panic() function was essentially unchanged from Fifth Edition UNIX to the VAX-based UNIX 32V and output only an error message with no other information, then dropped the system into an endless idle loop. As the Unix codebase was enhanced, the panic() function was also enhanced to dump various forms of debugging information to the console.

"Panic" means exactly what it says: the Linux kernel no longer knows how to proceed, so it prints as much as it can of the information available to it at that moment.

There are two main types of kernel panic:
1. hard panic (the "Aieee" output)
2. soft panic (the "Oops" output)

Common Linux kernel panic messages:
Kernel panic - not syncing: Fatal exception in interrupt
Kernel panic - not syncing: Attempted to kill the idle task!
Kernel panic - not syncing: killing interrupt handler!
Kernel panic - not syncing: Attempted to kill init!

What causes a Linux Kernel Panic?

Only driver modules loaded into kernel space can directly cause a kernel panic. While the system is still healthy, you can run lsmod to see which modules are currently loaded.
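Since a panic message often names the module involved, it can help to check that name against the list of loaded modules recorded while the system was still healthy. The following is only a minimal sketch: is_module_listed is a hypothetical helper name, and it assumes input in the /proc/modules layout, where the module name is the first field on each line.

```shell
#!/bin/sh
# Sketch only: is_module_listed is a hypothetical helper, not a standard tool.
# It reads /proc/modules-style text on stdin (module name in the first field)
# and succeeds if the module named in $1 appears in that list.
is_module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# On a live Linux system, assuming /proc/modules is readable:
#   is_module_listed r8168 < /proc/modules && echo "r8168 is loaded"
```

Saving a copy of the module list while the machine is running normally gives you something to compare against once a panic message appears.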
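In addition, components built into the kernel (such as the memory map) can also cause a panic.

Because hard panics and soft panics are fundamentally different, we discuss them separately.

hard panic

A kernel panic is generally considered to have occurred when the following symptoms appear:
Depending on the state of the panic, the kernel records whatever information it can before the system locks up. Because a kernel panic is a very serious error, there is no guarantee of how much the system manages to record. Below is some of the key information to gather; it matters a great deal, so collect as much of it as you can. Of course, if the kernel panics during boot, there is no telling how much useful information can be collected.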
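The following commands keep the console from blanking or powering down, so that the panic output stays visible on screen:
- setterm -blank 0
- setterm -powerdown 0
- setvesablank off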
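With KDB compiled into the kernel, a panic drops the kernel into a shell environment instead of locking up. That lets us collect information related to the panic, which helps a great deal in locating the root cause.

Note that KDB requires a base kernel version, for example 2.4.18 rather than 2.4.18-5, because KDB only works with base kernels.

soft panic

Symptoms: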
When a soft panic occurs, the kernel produces dump data containing kernel symbol information, which is recorded in /var/log/messages. To start troubleshooting, use the ksymoops tool to turn the kernel symbol information into meaningful data.

To generate the ksymoops file:
- Save the stack trace text found in /var/log/messages to a new file. Make sure the timestamps are removed, otherwise ksymoops will fail.
- Run the ksymoops program (install it if it is missing).
- For detailed ksymoops usage, see the ksymoops(8) manual page.
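The timestamp-removal step above can be sketched as a small filter. This is only an illustration under stated assumptions: strip_syslog_prefix is a hypothetical helper name, and the pattern assumes the classic syslog layout "Mon DD HH:MM:SS host kernel: text"; a system that logs in a different format would need a different expression.

```shell
#!/bin/sh
# Sketch only: strip_syslog_prefix is a hypothetical helper name.
# Assumes classic syslog lines such as:
#   Jun 15 12:34:56 myhost kernel: Oops: 0002
# Lines that do not match this layout are passed through unchanged.
strip_syslog_prefix() {
    sed -E 's/^[A-Z][a-z]{2} +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2} [^ ]+ kernel: ?//'
}

# Possible use (the output file name oops.txt is an assumption):
#   grep 'kernel:' /var/log/messages | strip_syslog_prefix > oops.txt
# Then feed oops.txt to ksymoops as described in ksymoops(8).
```

Keeping the filter separate from the ksymoops call makes it easy to inspect the cleaned file before decoding it.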
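kernel.sysrq=1    # enable Magic SysRq (a sysctl setting); otherwise the keyboard and mouse will not respond

Hold [ALT]+[SysRq]+[COMMAND], where SysRq is the Print Screen key and COMMAND is one of the following:
b - reboot immediately
e - send SIGTERM to all processes except init
o - power off
s - sync all filesystems
u - attempt to remount the filesystems read-only

Configure this ahead of time, just in case.

Many users who hit "Kernel panic - not syncing: Fatal exception in interrupt" while installing Linux find that the network card driver is the cause. Fix: set the BIOS option "Onboard Lan" to "Disabled", reboot, and boot from the optical drive. Once the system is installed, go back into the BIOS, set "Onboard Lan" to "Enabled", and then download and install the matching network card driver.

Suppose the following error appears:
init() r8168 …
… …
… : Kernel panic: Fatal exception

r8168 is the network card model. Disable the network card in the BIOS and install the system from the optical drive, then download the network card driver and install it:
# tar vjxf r8168-8.014.00.tar.bz2
# make clean modules       (as root or with sudo)
# make install
# depmod -a
# modprobe r8168

Once the system is installed, reboot into the BIOS and turn the network card back on.

Another user saw "alc880" in the kernel panic output, which is a sound card type. Disabling the sound card and rebooting fixed the problem.

A related case is a system that installs successfully but then fails to boot with "Kernel panic - not syncing: Fatal exception". This is often caused by the onboard sound card, the onboard network card, or the CPU's Hyper-Threading feature. The way to handle this class of problem is to read the information in the error output, identify the hardware it points to, and disable that hardware. After the system boots, install the appropriate driver and then re-enable the hardware.

In addition, for "Kernel Panic - not syncing: attempted to kill init" and "Kernel Panic - not syncing: attempted to kill idle task", swapping the memory modules between slots or reseating them sometimes solves the problem.

Happy learning, happy sharing!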
Reposted from: https://blog.51cto.com/hepeace/1033079