Preface

There isn't much to say up front; just read along and treat it as a small case-study exercise.

First, import the libraries:

from bs4 import BeautifulSoup
from urllib.request import urlretrieve
import requests
import os
import time
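Only requests, BeautifulSoup (the beautifulsoup4 package) and the lxml parser are third-party here; if they are missing, installing those three packages with pip should be all the setup the script needs.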

Main code (part 1)

if __name__ == '__main__':
    list_url = []
    # Crawl the first two list pages; page 1 uses a different URL pattern.
    for num in range(1, 3):
        if num == 1:
            url = 'http://www.shuaia.net/index.html'
        else:
            url = 'http://www.shuaia.net/index_%d.html' % num
        headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"}
        req = requests.get(url=url, headers=headers)
        req.encoding = 'utf-8'
        html = req.text
        bf = BeautifulSoup(html, 'lxml')
        # Each 'item-img' element carries the image title (alt) and its detail-page link.
        targets_url = bf.find_all(class_='item-img')
        for each in targets_url:
            list_url.append(each.img.get('alt') + '=' + each.get('href'))
    print('Link collection done')
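If you want to sanity-check what was collected before downloading anything, you could print a few entries right after the message above. This is a minimal sketch, not part of the original script; the count of 5 is arbitrary.

    # Each entry is 'title=detail-page URL'; part (2) splits it on '='.
    for entry in list_url[:5]:
        print(entry)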

Main code (part 2)

    # Continues inside the if __name__ == '__main__': block from part (1).
    for each_img in list_url:
        # Entries were stored as 'title=detail-page URL', so split them back apart.
        img_info = each_img.split('=')
        target_url = img_info[1]
        filename = img_info[0] + '.jpg'
        print('Downloading: ' + filename)
        headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"}
        img_req = requests.get(url=target_url, headers=headers)
        img_req.encoding = 'utf-8'
        img_html = img_req.text
        img_bf_1 = BeautifulSoup(img_html, 'lxml')
        # The full-size image sits inside the 'wr-single-content-list' div on the detail page.
        img_url = img_bf_1.find_all('div', class_='wr-single-content-list')
        img_bf_2 = BeautifulSoup(str(img_url), 'lxml')
        img_url = 'http://www.shuaia.net' + img_bf_2.div.img.get('src')
        # Create the output directory on the first run.
        if 'images' not in os.listdir():
            os.makedirs('images')
        urlretrieve(url=img_url, filename='images/' + filename)
        # Be polite to the server: pause one second between downloads.
        time.sleep(1)
    print('Download finished!')
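As written, a single failed download stops the whole run. If you would rather skip the problem image and keep going, one option (a sketch of an optional tweak, not part of the original code) is to wrap the download step in a try/except inside the loop:

        # Hypothetical tweak: skip this image instead of crashing if the download fails.
        try:
            urlretrieve(url=img_url, filename='images/' + filename)
        except Exception as e:
            print('Failed to download ' + filename + ': ' + str(e))
            continue
        time.sleep(1)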

How does it feel? Could you implement it yourself? Everyone is welcome to discuss and learn together.