Hi~ I'm Xiaoyu (小鱼), a computer-science senior who has been knocking around the programming world for a while without ever becoming a big shot.
Preface
You have probably heard the Python saying: don't reinvent the wheel.
Fair enough, but that raises a few questions:
1. First, do you actually know which wheels have already been built, and which of them fit the program you are writing? There are 400-odd well-known packages alone, never mind the nameless ones still being made.
2. Second, people may not reinvent the wheel, but they happily rebuild the whole car: plenty of experts write hundreds of lines of code to reproduce a feature Excel already ships with.
3. Third, many people only scrape a few images, videos or weather forecasts for their own amusement. And then what? Once you have the data, what do you do with it? Some clothing brand sells fast: then what? Some film has a huge box office: then what?

To make learning a bit more fun, and to hand you a box of small parts and small wheels, here are some snippets I have tested on Python 3.6.4:
1. Scrape images from Zhihu
2. Let two chatbots talk to each other
3. Use AI to tell whether a Tang poem was written by Li Bai or Du Fu
4. Generate random lottery numbers (pick 7 of 35)
5. Auto-write a self-criticism letter
6. Screen recorder
7. Make an animated GIF
8. News aggregation
9. Rate a young lady's looks
10. Bulk WeChat greeting script

1. Scraping images from Zhihu
from selenium import webdriver
import time
import re
import urllib.request

driver = webdriver.Chrome()
driver.maximize_window()
driver.get("https://www.zhihu.com/question/29134042")

# Scroll down and click the "load more" button up to 10 times to load more answers
i = 0
while i < 10:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)
    try:
        driver.find_element_by_css_selector('button.QuestionMainAction').click()
        print("page" + str(i))
        time.sleep(1)
    except:
        break
    i = i + 1

# Pull every img src out of the rendered page and download each one
result_raw = driver.page_source
content_list = re.findall("img src=\"(.+?)\" ", str(result_raw))
n = 0
while n < len(content_list):
    i = time.time()
    local = (r"%s.jpg" % (i))
    urllib.request.urlretrieve(content_list[n], local)
    print("编号:" + str(i))
    n = n + 1
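A quick note: find_element_by_css_selector comes from the Selenium 3 API that was current when this was written. On Selenium 4 the *_by_* helpers were removed, so the click above would roughly become:
from selenium.webdriver.common.by import By
driver.find_element(By.CSS_SELECTOR, 'button.QuestionMainAction').click()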
2. Two chatbots talking to each other
from time import sleep
import requests

s = input("请主人输入话题:")
while True:
    # Ask the Turing robot first
    resp = requests.post("http://www.tuling123.com/openapi/api",
                         data={"key": "4fede3c4384846b9a7d0456a5e1e2943", "info": s, })
    resp = resp.json()
    sleep(1)
    print('小鱼:', resp['text'])
    s = resp['text']
    # Feed that answer to the Qingyunke robot, then loop its reply back again
    resp = requests.get("http://api.qingyunke.com/api.php", {'key': 'free', 'appid': 0, 'msg': s})
    resp.encoding = 'utf8'
    resp = resp.json()
    sleep(1)
    print('里里:', resp['content'])
    s = resp['content']
There is also a 小i (xiaoi) robot online that is said to be rather smart; we can chat with it by calling its web endpoint directly:
import urllib.request
import urllib.parse
import re

while True:
    x = input("主人:")
    x = urllib.parse.quote(x)
    link = urllib.request.urlopen(
        "http://nlp.xiaoi.com/robot/webrobot?&callback=__webrobot_processMsg&data=%7B%22sessionId%22%3A%22ff725c236e5245a3ac825b2dd88a7501%22%2C%22robotId%22%3A%22webbot%22%2C%22userId%22%3A%227cd29df3450745fbbdcf1a462e6c58e6%22%2C%22body%22%3A%7B%22content%22%3A%22" + x + "%22%7D%2C%22type%22%3A%22txt%22%7D")
    html_doc = link.read().decode()
    reply_list = re.findall(r'\"content\":\"(.+?)\\r\\n\"', html_doc)
    print("小i:" + reply_list[-1])
3. Telling whether a Tang poem was written by Li Bai or Du Fu
import jieba
from nltk.classify import NaiveBayesClassifier

# Collect Li Bai's poems beforehand and save them in libai.txt.
text1 = open(r"libai.txt", "rb").read()
list1 = jieba.cut(text1)
result1 = " ".join(list1)
# Collect Du Fu's poems beforehand and save them in dufu.txt.
text2 = open(r"dufu.txt", "rb").read()
list2 = jieba.cut(text2)
result2 = " ".join(list2)
# Prepare the data
libai = result1
dufu = result2
# Feature extraction: every character becomes a boolean feature
def word_feats(words):
    return dict([(word, True) for word in words])
libai_features = [(word_feats(lb), 'lb') for lb in libai]
dufu_features = [(word_feats(df), 'df') for df in dufu]
train_set = libai_features + dufu_features
# Train the classifier
classifier = NaiveBayesClassifier.train(train_set)
# Classify the test input
sentence = input("请输入一句你喜欢的诗:")
print("\n")
seg_list = jieba.cut(sentence)
result1 = " ".join(seg_list)
words = result1.split(" ")
# Tally the per-word votes
lb = 0
df = 0
for word in words:
    classResult = classifier.classify(word_feats(word))
    if classResult == 'lb':
        lb = lb + 1
    if classResult == 'df':
        df = df + 1
# Show the proportions
x = lb / len(words)
y = df / len(words)
print('李白的可能性:%.2f%%' % (x * 100))
print('杜甫的可能性:%.2f%%' % (y * 100))
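If you are curious how well this actually works, nltk can score the classifier on a held-out slice of the same data. A minimal sketch, reusing the libai_features and dufu_features built above (the 90/10 split is just an arbitrary choice for illustration):
from nltk.classify import accuracy
# Hold out the last 10% of each poet's features as a quick test set
split_lb = int(len(libai_features) * 0.9)
split_df = int(len(dufu_features) * 0.9)
train = libai_features[:split_lb] + dufu_features[:split_df]
test = libai_features[split_lb:] + dufu_features[split_df:]
clf = NaiveBayesClassifier.train(train)
print("held-out accuracy: %.2f" % accuracy(clf, test))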
4. Generating random lottery numbers (pick 7 of 35)
import random

# Shuffle 1..35 and take the first seven numbers
temp = [i + 1 for i in range(35)]
random.shuffle(temp)
i = 0
nums = []
while i < 7:
    nums.append(temp[i])
    i = i + 1
nums.sort()
# Print the first six numbers in red and the last one in blue (ANSI colour codes)
print('\033[0;31;;1m')
print(*nums[0:6], end="")
print('\033[0;34;;1m', end=" ")
print(nums[-1])
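By the way, the standard library can do the drawing in one call: random.sample picks 7 distinct numbers directly, which is equivalent to the shuffle-and-slice above:
import random
nums = sorted(random.sample(range(1, 36), 7))  # 7 distinct numbers from 1..35
print(*nums)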
5. Auto-writing a self-criticism letter
import random
import xlrd

# test.xlsx holds 60+ rows of stock self-criticism sentences, one per row
# (note: xlrd 2.x dropped .xlsx support, so this needs xlrd 1.x)
ExcelFile = xlrd.open_workbook(r'test.xlsx')
sheet = ExcelFile.sheet_by_name('Sheet1')
i = []
x = input("请输入具体事件:")
y = int(input("老师要求的字数:"))
# Keep drawing random rows until the text reaches roughly 1.2x the required length
while len(str(i)) < y * 1.2:
    s = random.randint(1, 60)
    rows = sheet.row_values(s)
    i.append(*rows)
print(" " * 8 + "检讨书" + "\n" + "老师:")
print("我不应该" + str(x) + ",", *i)
print("再次请老师原谅!")
'''
Sample run (the prompts and the generated letter are in Chinese):
请输入具体事件:抽烟
老师要求的字数:200
????????检讨书
老师:
我不应该抽烟, 学校一开学就三令五申,一再强调校规校纪,提醒学生不要违反校规,可我却没有把学校和老师的话放在心上,没有重视老师说的话,没有重视学校颁布的重要事项,当成了耳旁风,这些都是不应该的。同时也真诚地希望老师能继续关心和支持我,并却对我的问题酌情处理。 无论在学习还是在别的方面我都会用校规来严格要求自己,我会把握这次机会。 但事实证明,仅仅是热情投入、刻苦努力、钻研学业是不够的,还要有清醒的政治头脑、大局意识和纪律观念,否则就会在学习上迷失方向,使国家和学校受损失。
再次请老师原谅!
'''
6. Screen recorder / screen grabber
from time import sleep
from PIL import ImageGrab

m = int(input("请输入想抓屏几分钟:"))
m = m * 60
n = 1
# Grab the screen repeatedly and save each frame as a numbered JPEG
# (this captures m*60 frames, not exactly m minutes of footage)
while n < m:
    sleep(0.02)
    im = ImageGrab.grab()
    local = (r"%s.jpg" % (n))
    im.save(local, 'jpeg')
    n = n + 1
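The script above only dumps numbered JPEG frames; it does not produce a video file. If you have OpenCV installed, stitching the frames into a video only takes a few more lines. A rough sketch, assuming the numbered frames from the script above are the only .jpg files in the folder and 10 fps is an acceptable playback rate:
import cv2
import glob
# Sort the frames by their numeric filename, then encode them at 10 fps
frames = sorted(glob.glob('*.jpg'), key=lambda p: int(p.split('.')[0]))
h, w = cv2.imread(frames[0]).shape[:2]
writer = cv2.VideoWriter('screen.avi', cv2.VideoWriter_fourcc(*'XVID'), 10, (w, h))
for f in frames:
    writer.write(cv2.imread(f))
writer.release()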
7. Making an animated GIF
from PIL import Image

# The first frame is the base image; the rest are appended as extra frames
im = Image.open("1.jpg")
images = []
images.append(Image.open('2.jpg'))
images.append(Image.open('3.jpg'))
# duration is the per-frame display time in milliseconds; loop is the repeat count (0 = loop forever)
im.save('gif.gif', save_all=True, append_images=images, loop=1, duration=1, comment=b"aaabb")
8. News aggregation
This is a kind of application you rarely see these days; at least I have never used one: a Usenet news reader. The program collects information from configured sources (here a Usenet newsgroup and a web page) and writes it to configured destinations (here in two forms: plain text and an HTML file). It works a bit like today's blog readers or RSS aggregators.
from nntplib import NNTP
from datetime import datetime, timedelta
from email import message_from_string
from urllib.request import urlopen
import textwrap
import re

def wrap(string, max=70):
    '''Wrap a string into lines of at most `max` characters.'''
    return '\n'.join(textwrap.wrap(string, max)) + '\n'

class NewsAgent:
    '''Collects news items from its sources and hands them to its destinations.'''
    def __init__(self):
        self.sources = []
        self.destinations = []
    def addSource(self, source):
        self.sources.append(source)
    def addDestination(self, dest):
        self.destinations.append(dest)
    def distribute(self):
        items = []
        for source in self.sources:
            items.extend(source.getItems())
        for dest in self.destinations:
            dest.receiveItems(items)

class NewsItem:
    def __init__(self, title, body):
        self.title = title
        self.body = body

class NNTPSource:
    '''Fetches articles posted to a newsgroup within the last `window` days.'''
    def __init__(self, servername, group, window):
        self.servername = servername
        self.group = group
        self.window = window
    def getItems(self):
        start = datetime.now() - timedelta(days=self.window)
        server = NNTP(self.servername)
        ids = server.newnews(self.group, start)[1]
        for id in ids:
            # article() returns (response, info); info.lines is a list of bytes
            lines = [line.decode('latin-1') for line in server.article(id)[1].lines]
            message = message_from_string('\n'.join(lines))
            title = message['subject']
            body = message.get_payload()
            if message.is_multipart():
                body = body[0].get_payload()
            yield NewsItem(title, body)
        server.quit()

class SimpleWebSource:
    '''Extracts titles and bodies from a web page with two regular expressions.'''
    def __init__(self, url, titlePattern, bodyPattern):
        self.url = url
        self.titlePattern = re.compile(titlePattern)
        self.bodyPattern = re.compile(bodyPattern)
    def getItems(self):
        text = urlopen(self.url).read().decode('utf-8', 'ignore')
        titles = self.titlePattern.findall(text)
        bodies = self.bodyPattern.findall(text)
        for title, body in zip(titles, bodies):
            yield NewsItem(title, wrap(body))

class PlainDestination:
    '''Prints the items to the terminal as plain text.'''
    def receiveItems(self, items):
        for item in items:
            print(item.title)
            print('-' * len(item.title))
            print(item.body)

class HTMLDestination:
    '''Writes the items to an HTML file with a linked table of contents.'''
    def __init__(self, filename):
        self.filename = filename
    def receiveItems(self, items):
        out = open(self.filename, 'w')
        print('''
        <html>
        <head>
          <title>Today's News</title>
        </head>
        <body>
        <h1>Today's News</h1>
        ''', file=out)
        print('<ul>', file=out)
        id = 0
        for item in items:
            id += 1
            print('<li><a href="#%i">%s</a></li>' % (id, item.title), file=out)
        print('</ul>', file=out)
        id = 0
        for item in items:
            id += 1
            print('<h2><a name="%i">%s</a></h2>' % (id, item.title), file=out)
            print('<pre>%s</pre>' % item.body, file=out)
        print('''
        </body>
        </html>
        ''', file=out)
        out.close()

def runDefaultSetup():
    agent = NewsAgent()
    # A web source: the BBC text-only news page, parsed with two regexes
    bbc_url = 'http://news.bbc.co.uk/text_only.stm'
    bbc_title = r'(?s)a href="[^"]*">\s*<b>\s*(.*?)\s*</b>'
    bbc_body = r'(?s)</a>\s*<br/>\s*(.*?)\s*<'
    bbc = SimpleWebSource(bbc_url, bbc_title, bbc_body)
    agent.addSource(bbc)
    # An NNTP source: one newsgroup, articles from the last day
    clpa_server = 'news2.neva.ru'
    clpa_group = 'alt.sex.telephone'
    clpa_window = 1
    clpa = NNTPSource(clpa_server, clpa_group, clpa_window)
    agent.addSource(clpa)
    # Two destinations: the terminal and an HTML file
    agent.addDestination(PlainDestination())
    agent.addDestination(HTMLDestination('news.html'))
    agent.distribute()

if __name__ == '__main__':
    runDefaultSetup()
Looking at the program as a whole, the heart of it is NewsAgent: it stores the news sources and the output destinations, then calls the source classes (NNTPSource and SimpleWebSource) to fetch items and the destination classes (PlainDestination and HTMLDestination) to write them out. NNTPSource is for fetching articles from a news server, while SimpleWebSource scrapes data from a URL. PlainDestination prints the collected content to the terminal, and HTMLDestination writes it to an HTML file.
With that in mind, the main program simply wires sources and output destinations into a NewsAgent.
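Because a source only needs a getItems() method and a destination only needs receiveItems(), extending the agent is just a matter of writing one more small class. For example, a hypothetical extra destination that dumps everything into a single text file might look like this (the class name and file name are made up for illustration):
class TextFileDestination:
    def __init__(self, filename):
        self.filename = filename
    def receiveItems(self, items):
        # Write every item as a title, an underline, and the body
        with open(self.filename, 'w') as out:
            for item in items:
                out.write(item.title + '\n')
                out.write('-' * len(item.title) + '\n')
                out.write(item.body + '\n\n')

# then in runDefaultSetup():
# agent.addDestination(TextFileDestination('news.txt'))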
9. Rating a young lady's looks
import requests
import re
import json
from urllib.request import urlretrieve
import time
import os
from face_level import face  # local helper module (not shown here) that scores a face image

url = "https://www.douyu.com/g_yz"
header = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.108 Safari/537.36'
}
result = requests.get(url, headers=header).text
# (.*?) matches lazily; re.S lets . match newlines as well
reg = r"window\.\$DATA = (.*?);.*?var pageType = 'list2';"
data = re.findall(reg, result, re.S)
dy_data = json.loads(data[0])

path = "./image"
os.makedirs(path, exist_ok=True)
for item in dy_data['list']:
    name = item['rn']
    image_url = item['rs1']
    print("正在下载:%s的图片" % name)
    filepath = path + "/" + name + ".png"
    urlretrieve(image_url, filepath)
    time.sleep(0.2)
    level = face(filepath)
    print("颜值打分%s" % level)
10. Bulk WeChat greeting script
# Bulk WeChat greeting script
import itchat
import requests
import time
import random
from itchat.content import *

# Friends we have already replied to
replied = []

# Fetch a random greeting from a public greetings API
def GetRandomGreeting():
    res = requests.get("http://www.xjihe.com/api/life/greetings?festival=新年&page=10",
                       headers={'apiKey': 'sQS2ylErlfm9Ao2oNPqw6TqMYbJjbs4g'})
    results = res.json()['result']
    return results[random.randrange(len(results))]['words']

# Reply to the sender with a greeting, using their remark name if set
def SendGreeting(msg):
    global replied
    friend = itchat.search_friends(userName=msg['FromUserName'])
    if friend['RemarkName']:
        itchat.send((friend['RemarkName'] + ',' + GetRandomGreeting()), msg['FromUserName'])
    else:
        itchat.send((friend['NickName'] + ',' + GetRandomGreeting()), msg['FromUserName'])
    replied.append(msg['FromUserName'])

# Text messages: reply once if the message mentions 年 (New Year)
@itchat.msg_register([TEXT])
def text_reply(msg):
    if '年' in msg['Text'] and msg['FromUserName'] not in replied:
        SendGreeting(msg)

# Other message types
@itchat.msg_register([PICTURE, RECORDING, VIDEO, SHARING])
def others_reply(msg):
    if msg['FromUserName'] not in replied:
        SendGreeting(msg)

if __name__ == '__main__':
    # itchat.auto_login(enableCmdQR=True, hotReload=True)
    # on Windows, the plain call below pops up a QR code image to scan
    itchat.auto_login()
    itchat.run()
Summary
After working through a book, it helps to find small projects like these to practice on. First, you get a sense of achievement while learning; second, you put what you have learned to use and find the gaps; third, these little projects are harmless fun that makes studying less dry; and finally, they genuinely raise your skill level.
These projects are small, but do not underestimate them: they are the foundation stones on the road to learning Python, and whether you can build a skyscraper on top depends on how solid these basics are.
I have set up a beginners' chat group where you can ask questions, and you can grab the source code there too, along with 100 practice projects and their code.
Group: 954526228 (the password to get in is: 小鱼)
Come grow with me! The group also has: ① 2000+ Python e-books (most of the mainstream and classic titles) ② the Python standard library reference (complete Chinese edition) ③ project source code (forty or fifty fun, classic practice projects) ④ videos on Python basics, crawlers, web development and data analysis (good for beginners) ⑤ a Python learning roadmap (so you stop studying haphazardly)