python 模拟浏览器登录获取cookie_使用cookielib模拟浏览器在python中获取url
我正在使用cookielib,有時在瀏覽器中打開一個url,通過瀏覽器發(fā)出許多其他請求來下載許多其他文件。我可以使用cookielib或任何其他python庫復(fù)制相同的行為嗎?在
我必須從python腳本發(fā)出超過1個GET請求。當(dāng)我打開網(wǎng)頁時,通過分析網(wǎng)絡(luò)請求,我得到了瀏覽器發(fā)出的所有請求的url。在
我在看是否有什么方法我可以只做一個請求,它像瀏覽器一樣自己獲取所有相關(guān)的請求。在
我對js或css不是很感興趣,而是主要的html。在
我嘗試了下面的代碼,但它無法下載整個頁面cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
response = opener.open('https://applicant.keybank.com/psp/hrsappl/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_HM_PRE&Action=A&SiteId=1')
html = response.read()
但是當(dāng)我按順序獲取另外3個GET url時,它能夠在第三個GET響應(yīng)中給我所需的html。我通過檢查瀏覽器的network標(biāo)簽得到了這些url
^{pr2}$
下面是我正在進(jìn)行的其他抓取的完整代碼response = opener.open('https://applicant.keybank.com/psc/hrsappl/EMPLOYEE/EMPL/s/WEBLIB_PT_NAV.ISCRIPT1.FieldFormula.IScript_UniHeader_Frame?c=NNTCgkqGs001AcPaisqGbYpTu%2fbGx4jx&Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&PortalIsPagelet=true&NoCrumbs=yes')
response.read()
response = opener.open('https://applicant.keybank.com/psc/hrsappl/EMPLOYEE/EMPL/s/WEBLIB_PTPPB.ISCRIPT2.FieldFormula.IScript_TemplatePageletBuilder?PTPPB_PAGELET_ID=KC_LNAV_APPLICANT&target=KCNV_KC_LNAV_APPLICANT_TMPL&Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&PortalIsPagelet=true&NoCrumbs=yes&PortalTargetFrame=TargetContent')
response.read()
response = opener.open('https://hronline.keybank.com/psc/hrshrm/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalCRefLabel=Careers&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&NoCrumbs=yes&PortalKeyStruct=yes')
required_html = response.read()
總結(jié)
以上是生活随笔為你收集整理的python 模拟浏览器登录获取cookie_使用cookielib模拟浏览器在python中获取url的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: mysql extended_expla
- 下一篇: python随机生成车牌_使用Pytho