Reading Notes on 《Linux Shell脚本攻略》 (Linux Shell Scripting Cookbook), Chapter 5: 一网情深
Published: 2025/3/17
1. Downloading from a website: wget
[root@stone ~]# wget www.baidu.com
--2013-05-20 10:21:08--  http://www.baidu.com/
Resolving www.baidu.com... 115.239.210.26, 115.239.210.27
Connecting to www.baidu.com|115.239.210.26|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10480 (10K) [text/html]
Saving to: `index.html'

100%[===================================================================================================>] 10,480      --.-K/s   in 0.05s

2013-05-20 10:21:08 (199 KB/s) - `index.html' saved [10480/10480]

[root@stone ~]# wget -O www.baidu.com www.baidu.com
--2013-05-20 10:25:28--  http://www.baidu.com/
Resolving www.baidu.com... 115.239.210.26, 115.239.210.27
Connecting to www.baidu.com|115.239.210.26|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10460 (10K) [text/html]
Saving to: `www.baidu.com'

100%[===================================================================================================>] 10,460      --.-K/s   in 0.05s

2013-05-20 10:25:28 (191 KB/s) - `www.baidu.com' saved [10460/10460]

# -O specifies the output file name
# -t specifies the number of retries
# -o specifies the log file
# --limit-rate specifies the maximum download speed
# -Q or --quota specifies the maximum download quota
# -c URL resumes a download from the point where it was interrupted
# --mirror downloads all the pages of a website
# -r downloads recursively
# -l depth specifies the recursion depth; used together with -r
# --user and --password specify the username and password
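The options listed above can be combined in a single call. The following is only a sketch: the function name fetch_with_wget, the log file name wget.log, and the values 5 and 200k are assumptions chosen for illustration, not something taken from the book.

```shell
# Hypothetical helper: download one URL with retries, a bandwidth cap,
# resume support, and wget's messages written to a log file.
fetch_with_wget() {
    url=$1
    # -t 5              retry up to five times
    # --limit-rate=200k cap the download speed at roughly 200 KB/s
    # -c                continue a partially downloaded file
    # -o wget.log       write wget's messages to wget.log instead of the terminal
    wget -t 5 --limit-rate=200k -c -o wget.log "$url"
}
```

It would be called as fetch_with_wget followed by the URL to download; running it again after an interruption lets -c pick up where the previous run stopped.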
2. Downloading a web page as formatted plain text: lynx
[root@stone ~]# lynx -dump www.baidu.com > index.html
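3. Getting started with curl
Downloading
[root@stone ~]# curl -C - -O http://mirrors.163.com/centos/RPM-GPG-KEY-CentOS-6
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1706  100  1706    0     0    285      0  0:00:05  0:00:05 --:--:--   373
[root@stone ~]# ll RPM-GPG-KEY-CentOS-6
-rw-r--r-- 1 root root 1706 May 20 11:09 RPM-GPG-KEY-CentOS-6

# -C - resumes an interrupted transfer (curl works out the offset itself)
# -C offset resumes the transfer from the given offset
# -O saves the download to a file named after the last part of the URL
# -o filename saves the download to the specified file
# --silent, -s downloads silently, without progress information
# --limit-rate limits the download speed
# --max-filesize specifies the maximum size of a file that may be downloaded
# -u username:password specifies the username and password
# -u username specifies only the username; the password is entered at the prompt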
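As with wget, several of these curl options are often used together. The sketch below assumes a hypothetical helper named fetch_with_curl and illustrative limits (200k and 10485760 bytes, that is 10 MB); adjust them to the situation.

```shell
# Hypothetical helper: quiet, rate-limited, size-capped, resumable download.
fetch_with_curl() {
    url=$1
    outfile=$2
    # -s                      no progress meter
    # --limit-rate 200k       cap the transfer speed
    # --max-filesize 10485760 refuse files larger than 10 MB (value in bytes)
    # -C -                    resume if $outfile already holds part of the file
    # -o "$outfile"           save under the given name
    curl -s --limit-rate 200k --max-filesize 10485760 -C - -o "$outfile" "$url"
}
```

A call would look like fetch_with_curl followed by the URL and the name of the file to write.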
Sending HTTP requests
[root@stone ~]# curl -I www.baidu.com
HTTP/1.1 200 OK
Date: Mon, 20 May 2013 03:18:12 GMT
Server: BWS/1.0
Content-Length: 10460
Content-Type: text/html;charset=utf-8
Cache-Control: private
Set-Cookie: BDSVRTM=1; path=/
Set-Cookie: H_PS_PSSID=1420_2447_1944_1788_2249; path=/; domain=.baidu.com
Set-Cookie: BAIDUID=78C44F4DC793B800B02746B241A9C08C:FG=1; expires=Mon, 20-May-43 03:18:12 GMT; path=/; domain=.baidu.com
Expires: Mon, 20 May 2013 03:18:12 GMT
P3P: CP=" OTI DSP COR IVA OUR IND COM "
Connection: Keep-Alive

# -I prints only the HTTP response headers
[root@stone ~]# curl http://book.sarathlakshman.com/lsc/mlogs/submit.php -d "host=test-host&user=slynux"
<html>
You have entered :
<p>HOST : test-host</p>
<p>USER : slynux</p>
<html>

# -d (--data) sends a POST request and reads the site's HTML response
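Scripts usually need only the status code from a header dump like the one printed by curl -I. A minimal sketch follows, assuming the headers arrive on standard input; the function name status_from_headers is made up for this example, and it reads only the first header block, so it does not follow redirects.

```shell
# Hypothetical helper: print the status code found on the first line of an
# HTTP header block read from standard input (for example "HTTP/1.1 200 OK").
status_from_headers() {
    # The status line has the form "<version> <code> <reason>",
    # so the code is the second whitespace-separated field.
    awk 'NR == 1 { print $2 }'
}
```

It is meant to sit at the end of a pipeline that starts with curl -s -I and the URL; for the header dump shown above it would print 200.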
4. Building an image crawler and downloader
[root@stone bin]# curl -s www.e-acic.com | egrep -o "<img src=[^>]*>"
<img src="/p_w_picpaths/ac_06.jpg" />
<img src="/p_w_picpaths/ac_14.jpg" />
<img src="/p_w_picpaths/ac_14.jpg" />
<img src="/p_w_picpaths/ac_14.jpg" />
<img src="/p_w_picpaths/ac_14.jpg" />
<img src="/p_w_picpaths/ac_14.jpg" />
<img src="/p_w_picpaths/ac_14.jpg" />
<img src="/p_w_picpaths/ac_14.jpg" />
<img src="/p_w_picpaths/ac_14.jpg" />
<img src="/uploads/2013/01/311556055003.jpg" />
<img src="/uploads/2013/01/301405454704.png" />
<img src="/uploads/2013/01/301500445838.jpg" />
<img src="/p_w_picpaths/wenzi1.png" />
<img src="/uploads/2013/01/301458396818.jpg" />
<img src="/uploads/2013/01/280942174858.png" />
<img src="/uploads/2013/05/171009004173.jpg" />
<img src="/uploads/2013/05/171009045981.png" />
<img src="p_w_picpaths/index_28.png" />
<img src="p_w_picpaths/index_30.png" />
<img src="p_w_picpaths/index_33.png" />
<img src="p_w_picpaths/index_34.png" />
[root@stone bin]# curl -s www.e-acic.com | egrep -o "<img src=[^>]*>" | sed 's/<img src=\"\([^"]*\).*/\1/g'
/p_w_picpaths/ac_06.jpg
/p_w_picpaths/ac_14.jpg
/p_w_picpaths/ac_14.jpg
/p_w_picpaths/ac_14.jpg
/p_w_picpaths/ac_14.jpg
/p_w_picpaths/ac_14.jpg
/p_w_picpaths/ac_14.jpg
/p_w_picpaths/ac_14.jpg
/p_w_picpaths/ac_14.jpg
/uploads/2013/01/311556055003.jpg
/uploads/2013/01/301405454704.png
/uploads/2013/01/301500445838.jpg
/p_w_picpaths/wenzi1.png
/uploads/2013/01/301458396818.jpg
/uploads/2013/01/280942174858.png
/uploads/2013/05/171009004173.jpg
/uploads/2013/05/171009045981.png
p_w_picpaths/index_28.png
p_w_picpaths/index_30.png
p_w_picpaths/index_33.png
p_w_picpaths/index_34.png
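The extraction pipeline above prints the same path once for every time the image appears on the page. A small sketch of the same grep and sed steps with duplicates removed is given below; the function name extract_img_srcs is an assumption, and like the original it only recognises tags written exactly as <img src="...">.

```shell
# Hypothetical helper: read HTML on standard input and print each distinct
# image path found in an <img src="..."> tag, one per line.
extract_img_srcs() {
    grep -Eo '<img src=[^>]*>' |            # isolate every <img src=...> tag
        sed 's/<img src="\([^"]*\).*/\1/' | # keep only the quoted path
        LC_ALL=C sort -u                    # drop repeated paths
}
```

Used as the last stage after curl -s and the page URL, it would list each image path once, so the downloader does not fetch the same file repeatedly.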
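[root@stone ~]# cat bin/img_download.sh
#!/bin/bash
# Download every image referenced by an <img src=...> tag on a page.
if [ $# -ne 3 ]; then
    echo "Usage: $0 url -d directory"
    exit 1
fi

# Parse the arguments; "-d directory" may come before or after the URL.
while [ $# -gt 0 ]; do
    case $1 in
        -d) shift; directory=$1; shift ;;
        *)  url=${url:-$1}; shift ;;
    esac
done

mkdir -p "$directory"
# Scheme and host only; the URL must start with http:// or https://.
baseurl=$(echo "$url" | egrep -o "https?://[a-zA-Z0-9.-]+")

curl -s "$url" | egrep -o "<img src=[^>]*>" | sed 's/<img src=\"\([^"]*\).*/\1/g' > /tmp/$$.list
# Prefix root-relative paths with the base URL.
# Double quotes are required here so that $baseurl is expanded.
sed -i "s|^/|$baseurl/|" /tmp/$$.list

cd "$directory" || exit 1
while read -r filename; do
    curl -C - -O "$filename"
done < /tmp/$$.list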
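The script above only rewrites paths that begin with a slash, while the sample output also contains paths without a leading slash (for example p_w_picpaths/index_28.png). One possible way to handle all three cases is sketched below. The function name absolutize is hypothetical, and treating slash-less paths as relative to the site root is a simplification that is only correct for pages served from the root directory.

```shell
# Hypothetical helper: read image paths on standard input and print absolute URLs.
# $1 is the base URL: scheme and host, without a trailing slash.
absolutize() {
    base=$1
    while IFS= read -r path; do
        case $path in
            http://*|https://*) printf '%s\n' "$path" ;;            # already absolute
            /*)                 printf '%s%s\n' "$base" "$path" ;;  # root-relative
            *)                  printf '%s/%s\n' "$base" "$path" ;; # assumed relative to the site root
        esac
    done
}
```

The list of extracted paths can be piped through absolutize with the base URL as its argument before the download loop reads it.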
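5. Finding broken links on a website
[root@stone ~]# cat bin/find_broken.sh
#!/bin/bash
# Report the links on a site that do not answer with "HTTP/... OK".
if [ $# -ne 1 ]; then
    echo -e "Usage: $0 URL\n"
    exit 1
fi

echo Broken links:

mkdir -p /tmp/broken.lynx
cd /tmp/broken.lynx || exit 1

# lynx -traversal crawls the site and writes several files,
# including reject.dat, to the current directory.
lynx -traversal "$1" > /dev/null
count=0
sort -u reject.dat > links.txt

while read -r link; do
    output=$(curl -I "$link" -s | grep "HTTP/.*OK")
    if [[ -z $output ]]; then
        echo "$link"
        count=$((count + 1))
    fi
done < links.txt

[ $count -eq 0 ] && echo No broken links found.

# Note: the script hangs when it reaches lynx -traversal $1 > /dev/null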
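Matching the text "OK" in the status line misses servers that answer without a reason phrase, so a check based on the numeric status code may be more robust. The following sketch is an alternative to the grep test, not the book's method; the function name is_broken and the 10-second limit are assumptions, and because -I sends a HEAD request, a server that rejects HEAD would be reported as broken.

```shell
# Hypothetical helper: succeed (return 0) when the link looks broken.
is_broken() {
    url=$1
    # -o /dev/null discards the headers, -w '%{http_code}' prints only the
    # status code, and --max-time 10 gives up on a single link after 10 seconds.
    code=$(curl -s -o /dev/null -I --max-time 10 -w '%{http_code}' "$url")
    case $code in
        000|4??|5??) return 0 ;;  # no HTTP response at all, or an error status
        *)           return 1 ;;
    esac
}
```

Inside the loop it would replace the grep test, for example by echoing the link and incrementing the counter whenever is_broken succeeds. The time limit applies to each curl call only; it does not address the lynx -traversal step.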
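6. Tracking changes to a website
[root@stone ~]# cat bin/change.sh
#!/bin/bash
# Compare the current copy of a page with the copy saved on the previous run.
if [ $# -ne 1 ]; then
    echo -e "Usage: $0 URL\n"
    exit 1
fi

first_time=0

if [ ! -e "last.html" ]; then
    first_time=1
fi

curl --silent "$1" -o recent.html

if [ $first_time -ne 1 ]; then
    changes=$(diff -u last.html recent.html)
    if [ -n "$changes" ]; then
        echo -e "Changes:\n"
        echo "$changes"
    else
        echo -e "\nWebsite has no changes"
    fi
else
    echo "[First run] Archiving..."
fi

cp recent.html last.html

# Note: last.html and recent.html are kept in the current directory, so each website must be checked from its own directory.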
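The comparison step can be separated from the download so that the snapshot file names are passed in rather than fixed. This is a sketch under that assumption; the function name report_changes and its two parameters are hypothetical.

```shell
# Hypothetical helper: compare a fresh copy of a page with the saved one,
# report the result, then make the fresh copy the new saved copy.
# $1 = saved copy from the previous run, $2 = copy just downloaded.
report_changes() {
    last=$1
    recent=$2
    if [ ! -e "$last" ]; then
        echo "[First run] Archiving..."
    elif diff -u "$last" "$recent"; then
        # diff returns 0 only when the files are identical; otherwise it has
        # already printed the differences and nothing more needs to be said.
        echo "Website has no changes"
    fi
    cp "$recent" "$last"
}
```

Calling it with a different pair of file names for each website would let several sites be tracked from one directory instead of one directory per site.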
Reposted from: https://blog.51cto.com/stonebox/1341928