TianyaDL_4thread: Tianya post downloader (4-thread version)

Published: 2023/12/20
# -*- coding: cp936 -*-
'''
author: 郎芭
QQ: 149737748
'''
import urllib2, time, re
import thread
from bs4 import BeautifulSoup

start_time = time.time()
za = '<div.*</div>'

# Locks used to chain the four worker threads so results are written in order.
# Every lock starts out acquired; each worker releases its own lock after writing.
a = thread.allocate_lock()
a.acquire()  # keeps part 2 blocked until part 1 is written
b = thread.allocate_lock()
b.acquire()  # part 3
c = thread.allocate_lock()
c.acquire()  # part 4
d = thread.allocate_lock()
d.acquire()  # keeps the main thread waiting until part 4 is written


def runa(qi, zhi, wurl, x, y):
    result = ''
    soup = bsp(wurl)
    lzname = soup.find('div', {'class': 'atl-menu clearfix js-bbs-act'})['js_activityusername']
    for i in xrange(int(qi), int(zhi) + 1):
        newurl = 'http://bbs.tianya.cn/post-%s-%s-%s.shtml' % (x, y, i)
        txt = pagecollect(newurl, lzname)
        if txt:
            print u'The page %s is completed!\r' % i,
        else:
            print u'The page %s is None! \r' % i,
        result += txt
    # Part 1 is written first; releasing the lock then unblocks part 2.
    writf(result, title)
    a.release()  # unlock


def runb(qi, zhi, wurl, x, y):
    result = ''
    soup = bsp(wurl)
    lzname = soup.find('div', {'class': 'atl-menu clearfix js-bbs-act'})['js_activityusername']
    for i in xrange(int(qi), int(zhi) + 1):
        newurl = 'http://bbs.tianya.cn/post-%s-%s-%s.shtml' % (x, y, i)
        txt = pagecollect(newurl, lzname)
        if txt:
            print u'The page %s is completed!\r' % i,
        else:
            print u'The page %s is None! \r' % i,
        result += txt
    a.acquire()  # blocks here until the previous part has been written and released
    writf(result, title)
    b.release()


def runc(qi, zhi, wurl, x, y):
    result = ''
    soup = bsp(wurl)
    lzname = soup.find('div', {'class': 'atl-menu clearfix js-bbs-act'})['js_activityusername']
    for i in xrange(int(qi), int(zhi) + 1):
        newurl = 'http://bbs.tianya.cn/post-%s-%s-%s.shtml' % (x, y, i)
        txt = pagecollect(newurl, lzname)
        if txt:
            print u'The page %s is completed!\r' % i,
        else:
            print u'The page %s is None! \r' % i,
        result += txt
    b.acquire()  # blocks here until the previous part has been written and released
    writf(result, title)
    c.release()


def rund(qi, zhi, wurl, x, y):
    result = ''
    soup = bsp(wurl)
    lzname = soup.find('div', {'class': 'atl-menu clearfix js-bbs-act'})['js_activityusername']
    for i in xrange(int(qi), int(zhi) + 1):
        newurl = 'http://bbs.tianya.cn/post-%s-%s-%s.shtml' % (x, y, i)
        txt = pagecollect(newurl, lzname)
        if txt:
            print u'The page %s is completed!\r' % i,
        else:
            print u'The page %s is None! \r' % i,
        result += txt
    c.acquire()  # blocks here until the previous part has been written and released
    writf(result, title)
    d.release()


def writf(result, title):  # append the text to <title>.txt in the current directory
    fname = '%s.txt' % (title)
    ff = open(fname, 'a')
    ff.write(result)
    ff.close()


def pagecollect(url, lzname):  # collect the original poster's content on one page
    soup = bsp(url)
    txt = []
    lzpost = soup.findAll('div', {'_host': lzname})
    for i in xrange(len(lzpost)):
        post = lzpost[i].find('div', {'class': 'atl-content'}).text.encode('utf-8')
        txt.append(re.sub(za, '', post))
    return ''.join(txt)


def bsp(url):
    turl = urllib2.urlopen(url, timeout=10).read()
    rsp = BeautifulSoup(turl)
    return rsp


def pagenum(wurl):  # get numeric fields 1, 2, 3 of the URL and the total page count
    soup = bsp(wurl)
    surl = wurl.split('-')
    z = re.search(r'(\d+)', surl[3]).group(0)
    fom = soup.find('form', {'action': '', 'method': 'get'})['onsubmit'].split(',')
    zong = re.search(r'(\d+)', fom[3]).group(0)
    return surl[1], surl[2], z, zong


if __name__ == '__main__':
    wurl = raw_input('>>')
    x, y, z, zong = pagenum(wurl)
    soup = bsp(wurl)
    title = re.sub('_.*', '', soup.title.text)
    print title
    print x, '-', y, '-', z, 'Pages:', zong
    z = int(z)
    if int(zong) - z >= 12:  # use four threads when at least 12 pages remain
        fen = (int(zong) - z) / 4
        fen = int(fen)
        thread.start_new_thread(runa, (z, z + 1 * fen, wurl, x, y))
        thread.start_new_thread(runb, (z + 1 * fen + 1, z + 2 * fen, wurl, x, y))
        thread.start_new_thread(runc, (z + 2 * fen + 1, z + 3 * fen, wurl, x, y))
        thread.start_new_thread(rund, (z + 3 * fen + 1, zong, wurl, x, y))
        d.acquire()  # wait for part 4; in the single-threaded branch nothing releases d
    else:
        runa(z, zong, wurl, x, y)
    print 'Used %.2fs ' % (time.time() - start_time)
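The script above is Python 2 and chains four locks so that each part is written only after the previous one. As a rough illustration of the same idea in Python 3, the sketch below splits the page range with the same arithmetic and relies on `ThreadPoolExecutor.map`, which returns results in input order, instead of hand-chained locks. This is a swapped-in technique, not the author's method. The names `split_pages`, `fetch_chunk`, `download_in_order`, and `fake_fetch_page` are hypothetical, and `fake_fetch_page` merely stands in for `pagecollect` so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def split_pages(first, last, parts=4):
    """Split the inclusive range [first, last] into contiguous chunks.

    Mirrors the script's arithmetic: step = (last - first) // parts,
    and the final chunk absorbs whatever remains.
    """
    step = (last - first) // parts
    chunks = []
    start = first
    for k in range(1, parts):
        end = first + k * step
        chunks.append((start, end))
        start = end + 1
    chunks.append((start, last))
    return chunks


def fetch_chunk(chunk, fetch_page):
    """Fetch every page in one chunk and join the text."""
    first, last = chunk
    return "".join(fetch_page(i) for i in range(first, last + 1))


def download_in_order(first, last, fetch_page, parts=4):
    """Fetch the chunks concurrently and return the text in page order."""
    chunks = split_pages(first, last, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # map() yields results in the order of `chunks`,
        # regardless of which worker finishes first.
        results = pool.map(lambda c: fetch_chunk(c, fetch_page), chunks)
        return "".join(results)


def fake_fetch_page(i):
    # Hypothetical stand-in for pagecollect(); performs no network access.
    return "page %d\n" % i


if __name__ == "__main__":
    print(split_pages(1, 13))
    print(download_in_order(1, 13, fake_fetch_page), end="")
```

With pages 1 to 13, `split_pages(1, 13)` returns `[(1, 4), (5, 7), (8, 10), (11, 13)]`, the same ranges the script passes to `runa` through `rund`. Because the joined text is returned in order, a single write to the output file would be enough and no locks are needed.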


Reposted from: https://www.cnblogs.com/langba/archive/2013/06/01/3111939.html
