[Teach a man to fish] Learning to download images from 尤物丧志 with Python
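Introduction
I taught myself a bit of Python so that I could download and collect images from 尤物丧志.
I'm a beginner showing off in front of experts, so go easy on me.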
Python 3.8.6
In principle any Python 3.x release should work.
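Save the code as "youwu.py", or download "youwu.txt" and rename it to "youwu.py".
With a Python 3 environment available, open a command line in that folder,
run "python youwu.py", and enter the URL when prompted. A folder named after the album title is created automatically in the current directory.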
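For anyone curious how the folder name can come from the page title, here is a minimal sketch. It reuses the same character-class idea as the script (keep word characters and CJK ideographs, drop everything else). The helper names title_to_folder_name and ensure_folder are my own, made up for illustration; they are not part of youwu.py.

```python
import os
import re


def title_to_folder_name(raw_title):
    """Keep only word characters and CJK ideographs, then join the pieces.

    Hypothetical helper: spaces, dashes, slashes and other punctuation that
    are awkward in folder names are simply dropped.
    """
    parts = re.findall(r'[\w\u4e00-\u9fff]+', raw_title)
    return ''.join(parts)


def ensure_folder(base_dir, raw_title):
    """Create the album folder under base_dir if needed and return its path."""
    folder = os.path.join(base_dir, title_to_folder_name(raw_title))
    # exist_ok=True makes a second run on the same album harmless.
    os.makedirs(folder, exist_ok=True)
    return folder
```

Calling ensure_folder(os.getcwd(), some_title) would then give a folder in the current directory, which matches the behaviour described above.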
D:pygetpics>python youwu.py
Enter the URL:
https://youwu.lol/albums/3e97acbf9e2e3eb3f2625a0100bf23be
Starting download of page 1!
未分类性感作品嫣嫣子GameOver尤物丧志
6 pages in total
10 images in total
1_1.jpg
Image download timed out https://images2.imgbox.com/58/aa/h9mBOAD7_o.jpg
1_2.jpg Image download timed out https://images2.imgbox.com/5a/6a/lWk7dOUu_o.jpg
1_3.jpg 1_4.jpg 1_5.jpg
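As the sample run shows, individual images do time out now and then. One possible way to deal with that is to retry a few times before giving up. The sketch below is only an illustration under my own assumptions: fetch_with_retry and save_image are hypothetical names, the retry count and delay are arbitrary, and the 35 second timeout simply mirrors the value the script passes to socket.setdefaulttimeout. The opener parameter exists so the function can be tried out without touching the network.

```python
import socket
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, retries=3, timeout=35, delay=1.0,
                     opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if every attempt failed."""
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except (socket.timeout, urllib.error.URLError) as exc:
            print(f"attempt {attempt}/{retries} failed for {url}: {exc}")
            if attempt < retries:
                time.sleep(delay)  # short pause before the next try
    return None


def save_image(url, path, **kwargs):
    """Download url to path; return True on success, False otherwise."""
    data = fetch_with_retry(url, **kwargs)
    if data is None:
        return False
    with open(path, "wb") as fh:
        fh.write(data)
    return True
```

A file name such as "1_2.jpg" in the log is page number, underscore, image index, so a call might look like save_image(imgurl, "1_2.jpg", retries=3).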
Save the following code as youwu.py:
# -*- coding: utf-8 -*-
import urllib
import urllib.error
import urllib.request
import os
import re
import time
import socket
import sys
from urllib.parse import urlparse

# Module-level state shared across the pages of one album.
domain = ""
title = ""
page = 1
lastpage = 0


def builddir(folderpath):
    # Create the target folder (and any missing parent folders).
    folder_name = folderpath
    os.makedirs(folder_name)


def downpic(url):
    global domain
    global title
    global page
    global lastpage
    domain = get_url_host(url)
    ua_headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}
    # Optional proxy; uncomment and adjust if needed.
    # proxy_handler = urllib.request.ProxyHandler({'http': 'http://127.0.0.1:10800','https': 'https://127.0.0.1:10800'})
    # opener = urllib.request.build_opener(proxy_handler)
    # urllib.request.install_opener(opener)
    request = urllib.request.Request(url, headers=ua_headers)
    socket.setdefaulttimeout(35)
    html = ""
    try:
        response = urllib.request.urlopen(request)
        html = response.read().decode('utf8')
    except urllib.error.URLError as e:
        if isinstance(e.reason, socket.timeout):
            print('Request timed out; please check your network connection.')
            sys.exit()
    if html == "":
        print('Failed to fetch the page content.')
        sys.exit()
    if title == "":
        # NOTE: the original pattern was lost when the post's HTML was stripped;
        # restore it before running.
        titlereg = re.compile(r'...')
        titlestr = re.findall(titlereg, html)
        # Keep only word characters and CJK ideographs for the folder name.
        titreg = re.compile(r'[\w\u4e00-\u9fff]+')
        titarr = re.findall(titreg, titlestr[0])
        title = ''.join(titarr)
        print(title)
    if lastpage == 0:
        # NOTE: pattern truncated in the source; surviving fragment: ]*?>[^<]*?s*?
        lastpagereg = re.compile(r'...')
        lastpagearr = re.findall(lastpagereg, html)
        lastpage = lastpagearr[0]
        print(str(lastpage) + " pages in total")
    if page > 1:
        print("Downloading page " + str(page) + "/" + str(lastpage))
    # NOTE: the two patterns below (image container and image URL) were also
    # lost when the post's HTML was stripped; restore them before running.
    conreg = re.compile(r'...')
    imgdiv = re.findall(conreg, html)
    imgre = re.compile(r'...')
    imglist = re.findall(imgre, imgdiv[0])
    if imglist:
        print(str(len(imglist)) + " images in total")
        x = 1
        for imgurl in imglist:
            # Files are named <page>_<index>.jpg, e.g. 1_3.jpg
            picdown(imgurl, str(page) + '_' + str(x) + '.jpg')
            x += 1
