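Scraping usable proxies
Lately my crawlers keep getting their IP banned, which is really annoying, so let's go scrape some proxy IPs.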
# -*- coding:utf-8 -*-
# time :2019/4/6 23:41
# author: 毛利
from lxml import etree
import time
import requests
from threading import Thread
from multiprocessing import Process, Queue
import pymongo

# Each entry describes one proxy-list site: the pages to fetch,
# the XPath that selects the table rows, and the XPath of each field within a row.
parse_list = [
    {
        'urls': ['https://www.kuaidaili.com/free/inha/{}/'.format(i) for i in range(1, 20)],
        # position()>1 just means: start from the second element (skip the header row)
        'pattern': '//div[@id="list"]//tr[position()>1]',
        'position': {'ip': './td[1]', 'port': './td[2]', 'address': './td[5]'}
    },
    {
        'urls': ['https://www.xicidaili.com/wt/{}'.format(i) for i in range(1, 20)],
        'pattern': '//table[@id="ip_list"]//tr[position()>1]',
        'position': {'ip': './td[2]', 'port': './td[3]', 'address': './td[4]/a'}
    # ...
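
Each `parse_list` entry pairs a row-level XPath (`pattern`) with per-field XPaths (`position`). The original post's parsing code is not shown here, so the following is only a minimal sketch of how one such rule might be applied with `lxml`. The function name `extract_proxies`, the constant names, and the sample HTML are all hypothetical; the sample stands in for a real response body, so nothing is fetched over the network.

```python
from lxml import etree

# One rule in the same shape as an entry of parse_list (copied from the first site).
RULE = {
    'pattern': '//div[@id="list"]//tr[position()>1]',
    'position': {'ip': './td[1]', 'port': './td[2]', 'address': './td[5]'},
}

# Hypothetical sample page; a real page would come from requests.get(url).text.
SAMPLE_HTML = """
<html><body><div id="list"><table>
<tr><th>IP</th><th>PORT</th><th>Anonymity</th><th>Type</th><th>Location</th></tr>
<tr><td>10.0.0.1</td><td>8080</td><td>high</td><td>HTTP</td><td>Example City</td></tr>
<tr><td>10.0.0.2</td><td>3128</td><td>high</td><td>HTTP</td><td>Sample Town</td></tr>
</table></div></body></html>
"""


def extract_proxies(html, rule):
    """Apply one rule to a page and return a list of dicts, one per table row."""
    tree = etree.HTML(html)
    proxies = []
    # 'pattern' selects the data rows; the header row is skipped by position()>1.
    for row in tree.xpath(rule['pattern']):
        item = {}
        # 'position' maps each field name to an XPath relative to the row.
        for field, path in rule['position'].items():
            cells = row.xpath(path)
            # string(.) collects the text of the cell, including nested tags.
            item[field] = cells[0].xpath('string(.)').strip() if cells else ''
        proxies.append(item)
    return proxies


if __name__ == '__main__':
    for proxy in extract_proxies(SAMPLE_HTML, RULE):
        print(proxy)
```

Keeping the XPaths in data rather than in code means a new site only needs another dictionary in the list, as long as its proxies are laid out as table rows.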