Locate the elements: extract the href value of every a tag, together with the name that goes with it.
from lxml import etree

# Parse the response body (resp comes from the earlier requests.get call)
tree = etree.HTML(resp.content)
node_list = tree.xpath('/html/body/div[2]/div[2]/div[3]/ul/li')
sub_url_list = []
for node in node_list:
    hrefs = node.xpath('./a/@href')
    titles = node.xpath('./a/b/text()')
    # Skip <li> items that lack either a link or a title
    if hrefs and titles:
        sub_url_list.append((hrefs[0], titles[0]))
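To see what the XPath expressions above pick out without hitting the network, here is a minimal sketch that runs the same extraction on an inline HTML fragment. The fragment, its paths, and the helper name `extract_links` are made up for illustration; the real page structure may differ.

```python
from lxml import etree

# Hypothetical HTML fragment standing in for the real list page.
SAMPLE_HTML = """
<html><body><ul>
  <li><a href="/desk/1.htm"><b>First wallpaper</b></a></li>
  <li><a href="/desk/2.htm"><b>Second wallpaper</b></a></li>
  <li>no link here</li>
  <li><a href="/desk/3.htm">link without a bold title</a></li>
</ul></body></html>
"""


def extract_links(html):
    """Return (href, title) pairs for every <li> that has both."""
    tree = etree.HTML(html)
    pairs = []
    for node in tree.xpath("//ul/li"):
        hrefs = node.xpath("./a/@href")      # list of href strings, possibly empty
        titles = node.xpath("./a/b/text()")  # list of title strings, possibly empty
        if hrefs and titles:
            pairs.append((hrefs[0], titles[0]))
    return pairs


links = extract_links(SAMPLE_HTML)
for href, title in links:
    print(href, title)
```

Items without a link, or without a `<b>` title inside the link, are skipped instead of raising an IndexError.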
4.3 Visiting the detail pages
base_url = 'http://www.netbian.com/'
for sub_url, title in sub_url_list:
    s_page = base_url + sub_url
    s_resp = requests.get(s_page)
    # Note: every iteration writes to the same file, so only the last page is kept
    with open('s.html', 'wb') as f:
        f.write(s_resp.content)
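Two details in the loop above are worth a second look: plain string concatenation can produce a double slash when sub_url already starts with "/", and a fixed file name means each page overwrites the previous one. The sketch below shows one possible way to handle both with the standard library. The helper names `detail_url` and `safe_filename` are hypothetical and not part of the original code, and the sample paths are assumed.

```python
import re
from urllib.parse import urljoin

BASE_URL = "http://www.netbian.com/"


def detail_url(sub_url):
    """Join the site root and a relative link without doubling slashes."""
    return urljoin(BASE_URL, sub_url)


def safe_filename(title):
    """Turn a page title into a file name, replacing awkward characters."""
    # Collapse runs of whitespace and characters that are invalid in file names.
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_")
    return (cleaned or "page") + ".html"


# Assumed sample data in the same (sub_url, title) shape as sub_url_list.
for sub_url, title in [("/desk/1.htm", "a/b: c"), ("desk/2.htm", "???")]:
    print(detail_url(sub_url), safe_filename(title))
```

Inside the download loop, `open(safe_filename(title), 'wb')` would then give each detail page its own file.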