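Chapter 1 Variables and Data Types
1.1 Variables
Variable naming rules
- Names may contain letters, digits, and underscores, but must not start with a digit
- Names must not contain spaces
- Names must not be Python reserved words (keywords)
- Keep names short and descriptive
- Avoid the lowercase letter l and the uppercase letter O, which are easily mistaken for the digits 1 and 0
- By convention (PEP 8), variable names are lowercase with words separated by underscores (snake_case), unlike the lowerCamelCase style common in some other languages.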
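The first three rules can be checked mechanically. The sketch below is only an illustration; the helper name `is_valid_variable_name` is made up for this example, while `str.isidentifier()` and `keyword.iskeyword()` are standard-library calls.

```python
import keyword


def is_valid_variable_name(name: str) -> bool:
    """Return True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


# A few candidates: a digit-leading name, a name with a space, and a keyword all fail.
for candidate in ["user_name", "2nd_place", "first name", "class", "MAX_SIZE"]:
    print(f"{candidate!r} -> {is_valid_variable_name(candidate)}")
```

The style rules (short, descriptive, snake_case) are conventions and are not enforced by the interpreter, so they cannot be checked this way.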
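1.2 Constants
By convention, constants are named in all uppercase; Python itself does not prevent them from being reassigned.
1.3 Strings
- Common operations
  - Changing case
In [49]: name = 'hello world'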
In [50]: name.title()
Out[50]: 'Hello World'
In [51]: name.upper()
Out[51]: 'HELLO WORLD'
In [52]: name.lower()
Out[52]: 'hello world'
In [53]: name.capitalize()
Out[53]: 'Hello world'
  - Concatenating strings
In [41]: str_pre = "hello"
In [42]: str_after = " world"
In [43]: str_concat = str_pre+str_after
In [44]: str_concat
Out[44]: 'hello world'
  - Stripping whitespace and other characters (strip removes the given characters from both ends only)
In [46]: str_trip ='000000000helloworld00000000000'
In [47]: str_trip.strip('0')
Out[47]: 'helloworld'
  - Searching and replacing
name = "hello world"
In [29]: name.count('l')
Out[29]: 3
In [30]: name.capitalize()
Out[30]: 'Hello world'
In [33]: name.center(20,'-')
Out[33]: '----hello world-----'
In [35]: name.find('p')
Out[35]: -1
In [36]: name.index('l')
Out[36]: 2
In [37]: name.index('p')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-37-733edee1b9a6> in <module>
----> 1 name.index('p')
ValueError: substring not found
In [38]: name.replace('world','word')
Out[38]: 'hello word'
In [39]: 'word' in name
Out[39]: False
In [40]: 'world' in name
Out[40]: True
  - Splitting strings
In [26]: string_split = "I have a dream!"
In [27]: string_split.split()
Out[27]: ['I', 'have', 'a', 'dream!']
  - Joining strings
In [20]: print(','.join('6666'))
6,6,6,6
In [22]: print(','.join(['alice','bluce','candy','duke']))
alice,bluce,candy,duke
  - Slicing strings
In [13]: string_split = "hello world"
In [14]: string_split[:]
Out[14]: 'hello world'
In [15]: string_split[0:]
Out[15]: 'hello world'
In [16]: string_split[1:]
Out[16]: 'ello world'
In [17]: string_split[:-1]
Out[17]: 'hello worl'
In [18]: string_split[::-1]
Out[18]: 'dlrow olleh'
  - The string module
In [8]: import string
In [9]: string.ascii_uppercase
Out[9]: 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
In [10]: string.ascii_lowercase
Out[10]: 'abcdefghijklmnopqrstuvwxyz'
In [11]: string.ascii_letters
Out[11]: 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
In [12]: string.digits
Out[12]: '0123456789'
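1.4 Numbers
1.5 Comments
1.6 Standard Data Types
- Number
- String
- List
- Tuple
- Set
- Dictionary
These six standard types are divided into mutable and immutable types:
- Mutable: List, Dictionary, Set
- Immutable: Number, String, Tuple
Methods of mutable types usually modify the object in place and return None (for example list.sort()), although some also return a value (for example list.pop()). Methods of immutable types return a new object and leave the original unchanged.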
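A minimal sketch of that difference, using `list.sort()` as the in-place method and `str.upper()` as the method that returns a new object. The variable names are chosen only for this illustration.

```python
numbers = [3, 1, 2]
result = numbers.sort()   # list is mutable: sorted in place, the method returns None

text = "hello"
shouted = text.upper()    # str is immutable: a new string is returned

print(result)    # None
print(numbers)   # [1, 2, 3]
print(text)      # hello
print(shouted)   # HELLO
```

A common mistake that follows from this is writing `numbers = numbers.sort()`, which replaces the list with None.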
1.7 Input and Output
- Output with print()
1. % placeholders
name = 'World'
print('hello %s'%(name))
2. str.format()
print('my name is {name},age is {age}'.format(name='justin',age=18))
3. f-strings (recommended)
name = 'World'
print(f'hello,{name}')
4. Printing without a trailing newline
for i in range(0,4):
    print(i,end='')
- Input with input()
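Chapter 2 Common Data Structures
2.1 list
- Definition: a list is an ordered sequence of elements, and the elements may be of different types. List variables are usually given plural names.
- Common methods
  - Adding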
In [113]: range(1,11)
Out[113]: range(1, 11)
In [114]: type(range(1,11))
Out[114]: range
In [111]: numbers = list(range(1,11))
In [112]: numbers
Out[112]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
In [62]: cellphones = []
In [63]: cellphones.append('Apple')
In [64]: cellphones.append('HuaWei')
In [65]: cellphones.append('小米')
In [66]: cellphones.append('Oppo')
In [67]: cellphones
Out[67]: ['Apple', 'HuaWei', '小米', 'Oppo']
In [68]: cellphones.insert(0,'Nokia')
In [69]: cellphones
Out[69]: ['Nokia', 'Apple', 'HuaWei', '小米', 'Oppo']
  - Deleting
In [89]: cellphones
Out[89]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [90]: del cellphones[0]
In [91]: cellphones
Out[91]: ['HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [92]: cellphones.insert(0,'Apple')
In [93]: cellphones
Out[93]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [94]: cellphones.pop()
Out[94]: 'Xiaomi'
In [95]: cellphones
Out[95]: ['Apple', 'HuaWei', 'Nokia', 'Oppo']
In [96]: cellphones.append('Xiaomi')
In [97]: cellphones
Out[97]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [98]: cellphones.append(cellphones.pop())
In [99]: cellphones
Out[99]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [100]: cellphones.insert(0,cellphones.pop(0))
In [101]: cellphones
Out[101]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [102]: cellphones.remove('Apple')
In [103]: cellphones
Out[103]: ['HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [104]: cellphones.insert(0,'Apple')
In [105]: cellphones.insert(0,'Apple')
In [106]: cellphones
Out[106]: ['Apple', 'Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [107]: cellphones.remove(['Apple','Apple'])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-107-a2c7e0df81fd> in <module>
----> 1 cellphones.remove(['Apple','Apple'])
ValueError: list.remove(x): x not in list
In [108]: cellphones.remove('Apple')
In [109]: cellphones
Out[109]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
  - Updating
In [72]: cellphones[3]='Xiaomi'
In [73]: cellphones
Out[73]: ['Nokia', 'Apple', 'HuaWei', 'Xiaomi', 'Oppo']
  - Accessing
In [70]: cellphones[0]
Out[70]: 'Nokia'
In [71]: cellphones[1]
Out[71]: 'Apple'
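  - Reversing a list
cellphones[::-1] returns a reversed copy and leaves the list unchanged, whereas cellphones.reverse() reverses the list in place. The prompt numbers below skip In [74] and In [75]; the outputs are consistent with an in-place cellphones.reverse() having been run in that gap.
In [73]: cellphones
Out[73]: ['Nokia', 'Apple', 'HuaWei', 'Xiaomi', 'Oppo']
In [76]: cellphones[::-1]
Out[76]: ['Nokia', 'Apple', 'HuaWei', 'Xiaomi', 'Oppo']
In [77]: cellphones
Out[77]: ['Oppo', 'Xiaomi', 'HuaWei', 'Apple', 'Nokia']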
  - Slicing: works the same as for strings
  - Length
In [78]: len(cellphones)
Out[78]: 5
In [79]: cellphones
Out[79]: ['Oppo', 'Xiaomi', 'HuaWei', 'Apple', 'Nokia']
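  - Sorting: list.sort() sorts in place, while sorted() returns a new sorted list and leaves the original unchanged (In [80] and In [81] below were run before the in-place sort)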
In [83]: cellphones.sort()
In [84]: cellphones
Out[84]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [85]: cellphones.sort(reverse=True)
In [86]: cellphones
Out[86]: ['Xiaomi', 'Oppo', 'Nokia', 'HuaWei', 'Apple']
In [87]: cellphones.sort(reverse=False)
In [88]: cellphones
Out[88]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [80]: sorted(cellphones)
Out[80]: ['Apple', 'HuaWei', 'Nokia', 'Oppo', 'Xiaomi']
In [81]: cellphones
Out[81]: ['Oppo', 'Xiaomi', 'HuaWei', 'Apple', 'Nokia']
  - Iterating
In [115]: for cellphone in cellphones:
     ...:     print(cellphone)
     ...:
Apple
HuaWei
Nokia
Oppo
Xiaomi
  - Copying a list
In [124]: my_food = ['bread','milk','orange']
In [125]: friend_food = my_food[:]
In [126]: friend_food
Out[126]: ['bread', 'milk', 'orange']
In [127]: my_food.append('apple')
In [128]: friend_food.append('pear')
In [129]: my_food
Out[129]: ['bread', 'milk', 'orange', 'apple']
In [130]: friend_food
Out[130]: ['bread', 'milk', 'orange', 'pear']
In [131]: friend_food_copy = friend_food
In [132]: friend_food
Out[132]: ['bread', 'milk', 'orange', 'pear']
In [133]: friend_food_copy.pop()
Out[133]: 'pear'
In [134]: friend_food
Out[134]: ['bread', 'milk', 'orange']
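Plain assignment such as friend_food_copy = friend_food does not copy the list: both names refer to the same list object, so a change made through either name is visible through the other. Use slicing (my_food[:]) or list.copy() to get an independent shallow copy.
  - List comprehensions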
In [116]: numbers = [i for i in range(1,11)]
In [117]: numbers
Out[117]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
In [118]: numbers = [i for i in range(1,11) if i%2==0]
In [119]: numbers
Out[119]: [2, 4, 6, 8, 10]
In [120]: list_e = [(e, f * f) for e in range(3) for f in range(5, 15, 5)]
In [121]: list_e
Out[121]: [(0, 25), (0, 100), (1, 25), (1, 100), (2, 25), (2, 100)]
In [122]: list_g = [[x for x in range(g - 3, g)] for g in range(22) if g % 3 == 0 and g != 0]
In [123]: list_g
Out[123]:
[[0, 1, 2],
[3, 4, 5],
[6, 7, 8],
[9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 19, 20]]
  - Shallow and deep copies
In [1]: a = [1,2,3]
In [2]: b = a
In [3]: b
Out[3]: [1, 2, 3]
In [4]: a[2]=4
In [5]: a
Out[5]: [1, 2, 4]
In [6]: b
Out[6]: [1, 2, 4]
In [40]: b = a.copy()
In [41]: a
Out[41]: [1, 2, 3, [4, 5]]
In [42]: b
Out[42]: [1, 2, 3, [4, 5]]
In [44]: a[0]=10
In [45]: a
Out[45]: [10, 2, 3, [4, 5]]
In [46]: b
Out[46]: [1, 2, 3, [4, 5]]
In [47]: a[3][0]=6
In [48]: a
Out[48]: [10, 2, 3, [6, 5]]
In [49]: b
Out[49]: [1, 2, 3, [6, 5]]
In [1]: list1 = [1,2,3,[4,5]]
In [2]: import copy
In [3]: list2 = copy.copy(list1)
In [4]: list1
Out[4]: [1, 2, 3, [4, 5]]
In [5]: list2
Out[5]: [1, 2, 3, [4, 5]]
In [6]: list1[0]=10
In [7]: list1
Out[7]: [10, 2, 3, [4, 5]]
In [8]: list2
Out[8]: [1, 2, 3, [4, 5]]
In [9]: list1[3][0]=6
In [10]: list1
Out[10]: [10, 2, 3, [6, 5]]
In [11]: list2
Out[11]: [1, 2, 3, [6, 5]]
In [13]: list1 =[1,2,3,[4,5]]
In [14]: list2 = copy.deepcopy(list1)
In [15]: list2
Out[15]: [1, 2, 3, [4, 5]]
In [16]: list1[0]=10
In [17]: list1
Out[17]: [10, 2, 3, [4, 5]]
In [18]: list2
Out[18]: [1, 2, 3, [4, 5]]
In [19]: list1[3][0]=6
In [20]: list1
Out[20]: [10, 2, 3, [6, 5]]
In [21]: list2
Out[21]: [1, 2, 3, [4, 5]]
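- When to use: lists suit workloads dominated by indexed lookup and in-place updates, and appending or popping at the end is cheap. Inserting or deleting at the front or in the middle is O(n), because the later elements must be shifted, so a list is a poor fit when that happens frequently.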
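When frequent insertion or removal at the front is needed, the standard library's `collections.deque` is one common alternative. This is a sketch of that option rather than something the notes themselves prescribe; the phone names are reused from the examples above.

```python
from collections import deque

queue = deque(["Apple", "HuaWei", "Nokia"])
queue.appendleft("Oppo")    # O(1) insertion at the front
queue.append("Xiaomi")      # O(1) insertion at the back
first = queue.popleft()     # O(1) removal from the front

print(first)          # Oppo
print(list(queue))    # ['Apple', 'HuaWei', 'Nokia', 'Xiaomi']
```

The trade-off is that indexing into the middle of a deque is slower than indexing into a list.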
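2.2 tuple
- Definition: a tuple is an ordered, immutable sequence. It behaves like a list whose elements cannot be added, removed, or reassigned once it has been created.
- Common methods
  - Creating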
In [143]: tuple1 = ()
In [144]: tuple2 = (1,)
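  - Deleting: a tuple has no method for removing elements, but the whole tuple can be deleted
In [141]: del tuple1
  - Updating: a tuple is immutable, so its elements cannot be changed, but tuples can be concatenated to form a new tuple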
In [226]: (1,2,3)+(4,)
Out[226]: (1, 2, 3, 4)
  - Accessing by index
In [224]: tuple1 = (1,2,3,4,5,6)
In [225]: tuple1[0]
  - Built-in functions that work on tuples
In [152]: max(tuple1)
Out[152]: 3
In [153]: min(tuple2)
Out[153]: 4
In [154]: tuple([1,2,3,4,5,6])
Out[154]: (1, 2, 3, 4, 5, 6)
  - The zip function
In [96]: a = [1,2,3]
In [97]: b = [1,4,9]
In [98]: for x,y in zip(a,b):
    ...:     print(f'x ={x} y={y}')
    ...:
x =1 y=1
x =2 y=4
x =3 y=9
print([*zip(a,b)])
  - Tuple operators
In [145]: len((1,2,3))
Out[145]: 3
In [146]: (1,2,3)+(4,5,6)
Out[146]: (1, 2, 3, 4, 5, 6)
In [147]: ('hi',)*4
Out[147]: ('hi', 'hi', 'hi', 'hi')
In [148]: 3 in (1,2,3)
Out[148]: True
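  - Iterating over a tuple works the same as for a list. Note that the object created below with a parenthesized comprehension is a generator, not a tuple; the for loop is written the same way in both cases.
In [138]: tuple1 = (i for i in range(1,11))
In [139]: tuple1
Out[139]: <generator object <genexpr> at 0x0000022BC1B08DD0>
In [140]: for i in tuple1:
     ...:     print(i)
     ...: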
1
2
3
4
5
6
7
8
9
10
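  - "Tuple comprehensions"
In [138]: tuple1 = (i for i in range(1,11))
In [139]: tuple1
Out[139]: <generator object <genexpr> at 0x0000022BC1B08DD0>
Python has no tuple comprehension: a comprehension written in parentheses produces a generator object, as the output above shows. To build a tuple, pass the generator to tuple(), for example tuple(i for i in range(1, 11)); any clause that works in a list comprehension can be used this way.
- When to use: a tuple uses less memory than a list holding the same elements. If the data will not change after it is created, a tuple is a good choice. Tuples are also commonly used for packing and unpacking values.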
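A short sketch of building a tuple from a generator expression and of packing and unpacking. The variable names are illustrative only.

```python
# Build a real tuple by passing a generator expression to tuple().
squares = tuple(i * i for i in range(1, 6))
print(squares)            # (1, 4, 9, 16, 25)

point = (3, 4)            # packing two values into a tuple
x, y = point              # unpacking into two names
x, y = y, x               # swapping relies on the same mechanism
first, *rest = squares    # extended unpacking; rest is a list
print(x, y)               # 4 3
print(first, rest)        # 1 [4, 9, 16, 25]

# Tuples are immutable: item assignment raises TypeError.
try:
    squares[0] = 0
    assignment_failed = False
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True
```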
2.3 dict
2.4 set
Chapter 3 Control Flow
3.1 if
- Examples
if True:
    pass
if True:
    pass  # do something
else:
    pass  # do something else
if True:
    pass  # do something
elif True:
    pass  # do something else
else:
    pass  # fall back to a default
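- The following values are all falsy
  - None
  - False
  - Zero of any numeric type
    - 0 (int)
    - 0.0 (float)
  - "" (empty string)
  - [] (empty list)
  - () (empty tuple)
  - {} (empty dict)
  - set() (empty set)
In [209]: bool(None),bool(False),bool(0),bool([]),bool(()),bool({}),bool(""),bool(set())
Out[209]: (False, False, False, False, False, False, False, False)
Python 3 has no separate long integer type: int has arbitrary precision.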
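Two small illustrations of the points above: relying on the falsiness of empty containers in a condition, and the arbitrary precision of `int`. The function name `describe` is made up for this sketch.

```python
def describe(items):
    """Return a short description, treating any empty container as falsy."""
    if not items:          # true for [], (), {}, set(), "" and None
        return "empty"
    return f"{len(items)} item(s)"


print(describe([]))            # empty
print(describe(["a", "b"]))    # 2 item(s)

# int grows as needed, so there is no overflow into a separate long type.
big = 2 ** 100
print(big)                     # 1267650600228229401496703205376
print(type(big).__name__)      # int
```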
3.2 while
3.3 for
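Chapter 4 Functions
4.1 Calling Functions
4.2 Defining Functions
4.3 Function Parameters
Arguments (the values passed in a call)
Parameters (the names listed in the function definition)
- Positional arguments
In [245]: def pos_arg(name,age):
     ...:     print(f'name is {name} age is {age}')
     ...: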
In [247]: pos_arg("Justin",18)
name is Justin age is 18
- Default parameter values
In [251]: def def_arg(number,n=2):
     ...:     print(number**n)
     ...:
In [252]: def_arg(2,3)
8
In [253]: def_arg(2)
4
- Keyword arguments
In [254]: def key_arg(name,age):
     ...:     print(f'name is {name} age is {age}')
     ...:
In [255]: key_arg(age ='20',name = 'Justin')
name is Justin age is 20
- Variable-length arguments
  - Arbitrary keyword arguments (collected into a dict)
In [272]: def fun1(hight,weight,**args):
     ...:     info = {}
     ...:     info['hight'] = hight
     ...:     info['weight'] = weight
     ...:     for key,value in args.items():
     ...:         info[key] = value
     ...:     return info
     ...:
In [273]: fun1(180,130,name="Justin",age=18)
Out[273]: {'hight': 180, 'weight': 130, 'name': 'Justin', 'age': 18}
In [274]: fun1(180,130,name="Justin",age=18,university='清华大学')
Out[274]:
{'hight': 180,
'weight': 130,
'name': 'Justin',
'age': 18,
'university': '清华大学'}
  - Arbitrary positional arguments (collected into a tuple)
In [268]: def fun(*args):
     ...:     total = 0  # named total to avoid shadowing the built-in sum()
     ...:     for i in args:
     ...:         total += i
     ...:     return total
     ...:
In [265]: fun(1,2,3,4,5,6)
Out[265]: 21
In [266]: fun(1,2)
Out[266]: 3
In [267]: fun(1,2,3)
Out[267]: 6
4.4 Commonly Used Functions
- Higher-order functions
  - map
In [275]: a,*b = map(int,input().strip().split())
1 2 3 4 5 6 7
In [276]: a
Out[276]: 1
In [277]: b
Out[277]: [2, 3, 4, 5, 6, 7]
  - reduce
In [279]: from functools import reduce
     ...: def add(x,y):
     ...:     return x+y
     ...:
In [280]: reduce(add,[1,3,5,7,9])
Out[280]: 25
  - filter
In [281]: def not_empty(s):
     ...:     return s and s.strip()
     ...:
In [282]: list(filter(not_empty,['A', '', 'B', None, 'C', ' ']))
Out[282]: ['A', 'B', 'C']
  - sorted
list3 = [-1,2,-100,3,-4,5]
print(sorted(list3,key=abs))
list3
- Anonymous functions (lambda)
list(map(lambda x:x*x,[1,2,3,4,5,6,7,8,9]))
- Built-in functions
4.5 Storing Functions in Modules
4.6 Packing and Unpacking in Functions
- In function definitions
In [133]: def add(*arg):
     ...:     sum1 = 0
     ...:     for i in arg:
     ...:         sum1 +=i
     ...:     return sum1
     ...:
In [134]: add(1)
Out[134]: 1
In [135]: add(1,2)
Out[135]: 3
In [136]: add(1,2,3,4,5,6)
Out[136]: 21
In [137]: def fun1(**args):
     ...:     for key,value in args.items():
     ...:         print(f'key={key},value={value}')
     ...:
In [138]: fun1(a=1,b=2)
key=a,value=1
key=b,value=2
- In function calls
def myfun(a, b):
    print(a + b)
>>> n = [1, 2]
>>> myfun(*n)
3
>>> m = (1, 2)
>>> myfun(*m)
3
>>> mydict = {'a':1, 'b': 2}
>>> myfun(**mydict)
3
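>>> myfun(*mydict)
ab
Unpacking a dict with a single * passes its keys, so the call above is myfun('a', 'b'). Dicts preserve insertion order in Python 3.7 and later, so the result is 'ab'.
4.7 Using *args and **kwargs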
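A common use of *args and **kwargs together is to accept any arguments and forward them unchanged to another function. The sketch below assumes two helper names, `describe_call` and `forward`, that exist only for this illustration.

```python
def describe_call(*args, **kwargs):
    """Return the positional and keyword arguments exactly as they were received."""
    return args, kwargs


def forward(func, *args, **kwargs):
    """Pack whatever was passed, then unpack it again when calling func."""
    return func(*args, **kwargs)


args, kwargs = describe_call(1, 2, name="Justin", age=18)
print(args)      # (1, 2)
print(kwargs)    # {'name': 'Justin', 'age': 18}

print(forward(pow, 2, 10))            # 1024
print(forward(int, "ff", base=16))    # 255
```

Inside the function, args is always a tuple and kwargs is always a dict; the names themselves are only a convention.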
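4.8 Higher-Order Functions
4.9 Decorators
Chapter 5 Object-Oriented Programming
5.1 Classes
5.2 Inheritance
5.3 Encapsulation
5.4 Duck Typing
5.5 Inner Classes
5.6 Decorators
5.7 Factory Methods
Chapter 6 Files and Directories
6.1 Common Directory Operations
- Creating directories
import os
os.makedirs('tmp/python/fileop',exist_ok=True)
exist_ok=True suppresses the error that would otherwise be raised if the directory already exists
- Deleting files or directories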
import os
os.remove('sdf.py')
import shutil
shutil.rmtree('tmp', ignore_errors=True)
- Copying and moving files or directories
from shutil import copyfile
copyfile('d:/tools/first.py', 'e:/first.py')
from shutil import copytree
copytree('d:/tools/aaa', 'e:/new/bbb')
from shutil import move
move(source, dst)  # source and dst are placeholders for the actual paths
- Renaming files or directories
import os
os.rename('d:/tools/aaa','d:/tools/bbb')
os.rename('d:/tools/first.py','d:/tools/second.py')
- Working with path names
In [317]: import os
In [318]: os.path.basename(r'E:\oracle11gclient_X64\client\doc\index.htm')
Out[318]: 'index.htm'
In [319]: os.path.dirname(r'E:\oracle11gclient_X64\client\doc\index.htm')
Out[319]: 'E:\\oracle11gclient_X64\\client\\doc'
In [320]: os.path.join('temp','test',os.path.basename(r'E:\oracle11gclient_X64\client\doc\index.htm'))
Out[320]: 'temp\\test\\index.htm'
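The same path operations can also be written with the standard library's pathlib. This sketch uses `PureWindowsPath` so that it gives the same result on any operating system; that choice is an assumption made for the example, and on Windows a plain `Path` would normally be used.

```python
from pathlib import PureWindowsPath

path = PureWindowsPath(r'E:\oracle11gclient_X64\client\doc\index.htm')

print(path.name)      # index.htm   (like os.path.basename)
print(path.parent)    # E:\oracle11gclient_X64\client\doc   (like os.path.dirname)
print(path.suffix)    # .htm

# Joining path components (like os.path.join)
joined = PureWindowsPath('temp', 'test', path.name)
print(joined)         # temp\test\index.htm
```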
- Checking whether a file or directory exists
import os
os.path.exists('d:/systems/cmd.exe')
os.path.exists('d:/systems')
os.path.isfile('d:/systems/cmd.exe')
os.path.isdir('d:/systems')
- File size and modification time
>>> os.path.getsize('file1')
3669
>>> os.path.getmtime('file1')
1272478234.0
>>> import time
>>> time.ctime(os.path.getmtime('/etc/passwd'))
'Wed Apr 28 13:10:34 2010'
>>>
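import os
def getFileSize(filePath, size=0):
    # Walk the whole tree and add up the size of every file found.
    for root, dirs, files in os.walk(filePath):
        for f in files:
            size += os.path.getsize(os.path.join(root, f))
            print(f)
    return size
The size of a directory is obtained by walking the tree and summing the sizes of all the files in it.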
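The same idea can be sketched with pathlib. The function name `directory_size` and the temporary files are assumptions made so the example can run anywhere.

```python
import tempfile
from pathlib import Path


def directory_size(root) -> int:
    """Sum the sizes, in bytes, of all regular files below root."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


# Build a small throwaway tree: 10 bytes at the top level, 5 bytes in a subdirectory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    (base / "a.txt").write_bytes(b"x" * 10)
    (base / "sub" / "b.txt").write_bytes(b"y" * 5)
    total = directory_size(base)

print(total)  # 15
```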
- Current working directory
In [286]: import os
In [287]: cwd = os.getcwd()
In [290]: os.chdir('E:\\')
In [291]: print(os.getcwd())
E:\
- Recursively walking all files under a directory
import os
targetDir = r'E:\test_makedirs'
files = []
dirs = []
for (dirpath, dirnames, filenames) in os.walk(targetDir):
    files += filenames
    dirs += dirnames
print(files)
print(dirs)
import os
targetDir = r'd:\tmp\util\dist\check'
for (dirpath, dirnames, filenames) in os.walk(targetDir):
    for fn in filenames:
        fpath = os.path.join(dirpath, fn)  # full path of each file
- Listing the files and subdirectories directly inside a directory
import os
from os.path import isfile, join,isdir
targetDir = r'd:\tmp\util\dist\check'
print([f for f in os.listdir(targetDir) if isfile(join(targetDir, f))])
print([f for f in os.listdir(targetDir) if isdir(join(targetDir, f))])
- Finding files that match a pattern, such as a given extension
import glob
exes = glob.glob(r'd:\tmp\*.txt')
print(exes)
- Worked example
In [353]: list1
Out[353]: ['a', 'b', 'c', 'd']
In [354]: for i in list1:
     ...:     path = os.path.join('E:\\test_makedirs',i,i*2)
     ...:     os.makedirs(path,exist_ok=True)
     ...:     with open(os.path.join(path,i*3+'.txt'),'w') as f:
     ...:         f.write(i*100)
6.2 Reading Files
- Reading the whole file
targetFile = r'E:\666\file_open.txt'
with open(targetFile) as f:
    content = f.read()
    print(content.rstrip())
- Reading line by line
targetFile = r'E:\666\file_open.txt'
with open(targetFile) as f:
    for line in f:
        print(line)  # each line keeps its trailing newline, so print adds a blank line
targetFile = r'E:\666\file_open.txt'
with open(targetFile) as f:
    for line in f.readlines():
        print(line)
6.3 Writing Files
- Creating a new file
filename = 'programming.txt'
with open(filename, 'w',encoding='utf8') as file_object:
    file_object.write("I love programming.\n")
    file_object.write("I love creating new games.\n")
Modes: r read-only (the default), w write (an existing file is truncated first), a append, r+ read and write
- Appending to a file
filename = 'programming.txt'
with open(filename, 'a',encoding='utf8') as file_object:
    file_object.write("I also love finding meaning in large datasets.\n")
    file_object.write("I love creating apps that can run in a browser.\n")
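The difference between the w and a modes can be seen in a small round trip. This is a sketch that writes into a temporary directory, an assumption made so that the example does not touch real files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, 'programming.txt')

    with open(filename, 'w', encoding='utf8') as f:
        f.write("first\n")
    with open(filename, 'w', encoding='utf8') as f:   # 'w' truncates: "first" is lost
        f.write("second\n")
    with open(filename, 'a', encoding='utf8') as f:   # 'a' keeps existing content
        f.write("third\n")

    with open(filename, encoding='utf8') as f:
        lines = f.read().splitlines()

print(lines)  # ['second', 'third']
```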
Chapter 7 Exceptions
7.1 Exception Handling
try:
    answer = int(first_number) / int(second_number)
except ZeroDivisionError:
    print("You can't divide by 0!")
else:
    print(answer)
An except block may contain only pass, which means the exception is silently ignored.
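A runnable variant of the snippet above, wrapped in a function and extended with a second except clause for non-numeric input. The function name `safe_divide` and the ValueError message are assumptions added for this sketch.

```python
def safe_divide(first_number, second_number):
    """Divide two values given as strings, reporting the two expected failures."""
    try:
        answer = int(first_number) / int(second_number)
    except ZeroDivisionError:
        return "You can't divide by 0!"
    except ValueError:
        # int() raises ValueError when the text is not a whole number.
        return "Please enter whole numbers."
    else:
        # Runs only when the try block raised nothing.
        return answer


print(safe_divide("10", "4"))    # 2.5
print(safe_divide("1", "0"))     # You can't divide by 0!
print(safe_divide("a", "1"))     # Please enter whole numbers.
```

Catching specific exception types, rather than using a bare except with pass, keeps unexpected errors visible.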
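7.2 Logging
- Common log levels
Level | Description |
---|---|
DEBUG | Detailed information, typically of interest only when diagnosing problems |
INFO | Confirmation that the program is working as expected |
WARNING | Something unexpected happened or a problem may occur soon; the program is still working |
ERROR | A more serious problem: the program could not perform some function |
CRITICAL | A very serious error: the program itself may be unable to continue running |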
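Levels act as a threshold: messages below the configured level are dropped. The sketch below logs to an in-memory stream so the result can be inspected; the logger name `level_demo` and the messages are made up for the example.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("level_demo")   # hypothetical logger name
logger.setLevel(logging.WARNING)           # threshold: WARNING and above
logger.propagate = False                   # keep output out of the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

logger.debug("not recorded")
logger.info("not recorded either")
logger.warning("disk almost full")
logger.error("could not save file")

print(stream.getvalue())
# WARNING disk almost full
# ERROR could not save file
```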
- Worked example
import logging
logging.basicConfig(filename='program1.log', format='%(asctime)s %(message)s', level=logging.INFO)
logging.info('Logging app started')
logging.warning('An example logging message.')
logging.warning('Another log message')
"""
2022-02-23 15:08:34,021 Logging app started
2022-02-23 15:08:34,021 An example logging message.
2022-02-23 15:08:34,021 Another log message
"""
7.3 Storing Data
- Serialization: json.dump()
- Deserialization: json.load()
- Worked example
import json
def get_stored_username():
    """Return the stored username if there is one (deserialization)."""
    filename = 'username.json'
    try:
        with open(filename) as f_obj:
            username = json.load(f_obj)
    except FileNotFoundError:
        return None
    else:
        return username
def get_new_username():
    """Prompt for a username and store it (serialization)."""
    username = input("what is your name? ")
    filename = 'username.json'
    with open(filename, 'w') as f_obj:
        json.dump(username, f_obj)
    return username
def greet_user():
    """Greet the user by name."""
    username = get_stored_username()
    if username:
        print("Welcome back, " + username + "!")
    else:
        username = get_new_username()
        print("We'll remember you when you come back, " + username + "!")
greet_user()
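- Summary: ways to serialize objects in Python
  - json
  - pickle (widely used for saving machine-learning models)
  - shelve
  - marshal
  - joblib (a third-party library, also widely used for saving models; it is typically more efficient than pickle for objects that hold large NumPy arrays, and it also offers helpers for parallel execution)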
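A brief sketch contrasting pickle with json on the same data. pickle round-trips arbitrary Python objects such as tuples and sets, while json is limited to JSON types. The sample record is made up for the example.

```python
import json
import pickle

record = {"name": "Justin", "scores": (90, 89), "tags": {"a", "b"}}

# pickle preserves Python-specific types such as tuple and set.
restored = pickle.loads(pickle.dumps(record))
print(restored == record)    # True

# json cannot serialize a set and raises TypeError.
json_error = None
try:
    json.dumps(record)
except TypeError as exc:
    json_error = str(exc)
print(json_error is not None)    # True

# json turns tuples into lists on the way back.
as_json = json.loads(json.dumps({"name": "Justin", "scores": (90, 89)}))
print(as_json)    # {'name': 'Justin', 'scores': [90, 89]}
```

pickle data should only be loaded from trusted sources, because unpickling can execute arbitrary code.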
Chapter 8 Testing Code
8.1 Basic Concepts
8.2 Testing a Function
- The function under test
def get_formatted_name(first, last, middle=''):
    if middle:
        full_name = first + ' ' + middle + ' ' + last
    else:
        full_name = first + ' ' + last
    return full_name.title()
- Unit test
import unittest
from NameFunction import get_formatted_name
class NamesTestCase(unittest.TestCase):
    """Tests for NameFunction.py"""
    def test_first_last_name(self):
        formatted_name = get_formatted_name('james', 'hardon')
        self.assertEqual(formatted_name, 'James Hardon')
    def test_first_last_middle_name(self):
        """Check that a middle name is handled correctly."""
        formatted_name = get_formatted_name(
            'wolfgang', 'mozart', 'amadeus')
        self.assertEqual(formatted_name, 'Wolfgang Amadeus Mozart')
if __name__ == '__main__':
    unittest.main()
8.3 Testing a Class
- The class under test
class AnonymousSurvey():
    """Collect anonymous answers to a survey question."""
    def __init__(self, question):
        """Store a question and prepare to store responses."""
        self.question = question
        self.responses = []
    def show_question(self):
        """Show the survey question."""
        print(self.question)
    def store_response(self, new_response):
        """Store a single response to the survey."""
        self.responses.append(new_response)
    def show_results(self):
        """Show all the responses that have been collected."""
        print("Survey results:")
        for response in self.responses:
            print('- ' + response)
- The test class
import unittest
from AnonymousSurvey import AnonymousSurvey
class SurveyTestCase(unittest.TestCase):
    """Tests for the AnonymousSurvey class."""
    def setUp(self):
        """Create a survey and a set of responses for use in all test methods."""
        question = "What language did you first learn to speak?"
        self.my_survey = AnonymousSurvey(question)
        self.responses = ['English', 'Spanish', 'Mandarin']
    def test_store_single_response(self):
        """Check that a single response is stored properly."""
        self.my_survey.store_response(self.responses[0])
        self.assertIn(self.responses[0], self.my_survey.responses)
    def test_store_three_responses(self):
        """Check that three individual responses are stored properly."""
        for response in self.responses:
            self.my_survey.store_response(response)
        for response in self.responses:
            self.assertIn(response, self.my_survey.responses)
if __name__ == '__main__':
    unittest.main()
unittest.TestCase provides a setUp() method, so the code that creates these objects is written only once and the objects are available in every test method. If a TestCase class defines setUp(), it is run before each method whose name starts with test_, so every test starts with freshly created objects.
Chapter 9 Multithreading
https://github.com/Adopat/Python-tutorial/tree/master/10.%E5%A4%9A%E7%BA%BF%E7%A8%8B
Chapter 10 Multiprocessing
https://github.com/Adopat/Python-tutorial/tree/master/11.%E5%A4%9A%E8%BF%9B%E7%A8%8B
Chapter 11 NumPy
Chapter 12 Pandas
12.1 Basic Concepts
12.2 Common Operations
- Exploring the data
df.head()
df.info()
df.index
df.columns
df.columns[1]
df.shape[0]
df.shape[1]
df.A.dtype or df.dtypes['A']
df.A or df['A']
df.A.value_counts()
df.A.value_counts().count()
In [442]: test_dict
Out[442]:
{'id': [1, 2, 3, 4, 5, 6],
'name': ['Alice', 'Bob', 'Cindy', 'Eric', 'Helen', 'Grace '],
'math': [90, 89, 99, 78, 97, 93],
'english': [89, 94, 80, 94, 94, 90],
'gender': ['male', 'male', 'fmale', 'fmale', 'fmale', 'fmale']}
In [443]: import pandas as pd
In [444]: df = pd.DataFrame(test_dict)
In [445]: df
Out[445]:
id name math english gender
0 1 Alice 90 89 male
1 2 Bob 89 94 male
2 3 Cindy 99 80 fmale
3 4 Eric 78 94 fmale
4 5 Helen 97 94 fmale
5 6 Grace 93 90 fmale
In [446]: df.head()
Out[446]:
id name math english gender
0 1 Alice 90 89 male
1 2 Bob 89 94 male
2 3 Cindy 99 80 fmale
3 4 Eric 78 94 fmale
4 5 Helen 97 94 fmale
In [447]: df.tail()
Out[447]:
id name math english gender
1 2 Bob 89 94 male
2 3 Cindy 99 80 fmale
3 4 Eric 78 94 fmale
4 5 Helen 97 94 fmale
5 6 Grace 93 90 fmale
In [449]: df.columns
Out[449]: Index(['id', 'name', 'math', 'english', 'gender'], dtype='object')
In [450]: df.index
Out[450]: RangeIndex(start=0, stop=6, step=1)
In [451]: df.shape[0]
Out[451]: 6
In [452]: df.shape[1]
Out[452]: 5
In [453]: df.gender
Out[453]:
0 male
1 male
2 fmale
3 fmale
4 fmale
5 fmale
Name: gender, dtype: object
In [454]: df.gender.value_counts()
Out[454]:
fmale 4
male 2
Name: gender, dtype: int64
In [456]: df.describe()
Out[456]:
id math english
count 6.000000 6.000000 6.000000
mean 3.500000 91.000000 90.166667
std 1.870829 7.456541 5.455884
min 1.000000 78.000000 80.000000
25% 2.250000 89.250000 89.250000
50% 3.500000 91.500000 92.000000
75% 4.750000 96.000000 94.000000
max 6.000000 99.000000 94.000000
In [457]: df.gender.dtype
Out[457]: dtype('O')
In [458]: df.dtypes['math']
Out[458]: dtype('int64')
In [459]: df.dtypes['gender']
Out[459]: dtype('O')
In [460]: df.gender.value_counts().count()
Out[460]: 2
In [475]: df.describe(include='all')
Out[475]:
id name math english gender
count 6.000000 6 6.000000 6.000000 6
unique NaN 6 NaN NaN 2
top NaN Helen NaN NaN fmale
freq NaN 1 NaN NaN 4
mean 3.500000 NaN 91.000000 90.166667 NaN
std 1.870829 NaN 7.456541 5.455884 NaN
min 1.000000 NaN 78.000000 80.000000 NaN
25% 2.250000 NaN 89.250000 89.250000 NaN
50% 3.500000 NaN 91.500000 92.000000 NaN
75% 4.750000 NaN 96.000000 94.000000 NaN
max 6.000000 NaN 99.000000 94.000000 NaN
In [476]: df.math.describe()
Out[476]:
count 6.000000
mean 91.000000
std 7.456541
min 78.000000
25% 89.250000
50% 91.500000
75% 96.000000
max 99.000000
Name: math, dtype: float64
In [472]: df.gender.nunique()
Out[472]: 2
In [462]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 5 columns):
 # Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 6 non-null int64
1 name 6 non-null object
2 math 6 non-null int64
3 english 6 non-null int64
4 gender 6 non-null object
dtypes: int64(3), object(2)
memory usage: 368.0+ bytes
In [470]: df[['name','gender']]
Out[470]:
name gender
0 Alice male
1 Bob male
2 Cindy fmale
3 Eric fmale
4 Helen fmale
5 Grace fmale
In [471]: df['name']
Out[471]:
0 Alice
1 Bob
2 Cindy
3 Eric
4 Helen
5 Grace
Name: name, dtype: object
In [478]: df.columns[1]
Out[478]: 'name'
- Filtering and sorting
In [483]: df
Out[483]:
id name math english gender
0 1 Alice 90 89 male
1 2 Bob 89 94 male
2 3 Cindy 99 80 fmale
3 4 Eric 78 94 fmale
4 5 Helen 97 94 fmale
5 6 Grace 93 90 fmale
In [485]: df.query('english>90')
Out[485]:
id name math english gender
1 2 Bob 89 94 male
3 4 Eric 78 94 fmale
4 5 Helen 97 94 fmale
In [486]: df.query('english>90').gender.nunique()
Out[486]: 2
In [487]: df.query('english>90').gender.value_counts()
Out[487]:
fmale 2
male 1
Name: gender, dtype: int64
In [489]: df.query('english>90&math>90')
Out[489]:
id name math english gender
4 5 Helen 97 94 fmale
In [492]: df[(df['english']>90) & (df['math']>90)]
Out[492]:
id name math english gender
4 5 Helen 97 94 fmale
In [495]: df.sort_values(by='math')
Out[495]:
id name math english gender
3 4 Eric 78 94 fmale
1 2 Bob 89 94 male
0 1 Alice 90 89 male
5 6 Grace 93 90 fmale
4 5 Helen 97 94 fmale
2 3 Cindy 99 80 fmale
In [496]: df.sort_values(by='math',ascending=False)
Out[496]:
id name math english gender
2 3 Cindy 99 80 fmale
4 5 Helen 97 94 fmale
5 6 Grace 93 90 fmale
0 1 Alice 90 89 male
1 2 Bob 89 94 male
3 4 Eric 78 94 fmale
In [497]: df.sort_values(by='math',ascending=True)
Out[497]:
id name math english gender
3 4 Eric 78 94 fmale
1 2 Bob 89 94 male
0 1 Alice 90 89 male
5 6 Grace 93 90 fmale
4 5 Helen 97 94 fmale
2 3 Cindy 99 80 fmale
In [498]: df.math.sort_values()
Out[498]:
3 78
1 89
0 90
5 93
4 97
2 99
Name: math, dtype: int64
In [500]: df[df['name'].isin(['Alice','Bob','Grace'])]
Out[500]:
id name math english gender
0 1 Alice 90 89 male
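1 2 Bob 89 94 male
Grace is missing from this result because the stored value is 'Grace ' with a trailing space, which does not match 'Grace'.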
In [502]: df.query("name in ['Alice','Bob']")
Out[502]:
id name math english gender
0 1 Alice 90 89 male
1 2 Bob 89 94 male
df1 = df.copy()
df1.set_index('id',inplace=True)
- Grouping
In [512]: df
Out[512]:
id name math english gender
0 1 Alice 90 89 male
1 2 Bob 89 94 male
2 3 Cindy 99 80 fmale
3 4 Eric 78 94 fmale
4 5 Helen 97 94 fmale
5 6 Grace 93 90 fmale
In [513]: df.groupby('gender')['math'].max()
Out[513]:
gender
fmale 99
male 90
Name: math, dtype: int64
In [514]: df.groupby('gender').math.max()
Out[514]:
gender
fmale 99
male 90
Name: math, dtype: int64
In [515]: df.groupby('gender').agg({'math':'max'})
Out[515]:
math
gender
fmale 99
male 90
In [516]: df.groupby('gender').math.agg(['max','min','mean'])
Out[516]:
max min mean
gender
fmale 99 78 91.75
male 90 89 89.50
In [519]: df.groupby('gender').describe()
Out[519]:
id math english
count mean std min 25% 50% 75% max count mean std min 25% 50% 75% max count mean std min 25% 50% 75% max
gender
fmale 4.0 4.5 1.290994 3.0 3.75 4.5 5.25 6.0 4.0 91.75 9.500000 78.0 89.25 95.0 97.50 99.0 4.0 89.5 6.608076 80.0 87.50 92.0 94.00 94.0
male 2.0 1.5 0.707107 1.0 1.25 1.5 1.75 2.0 2.0 89.50 0.707107 89.0 89.25 89.5 89.75 90.0 2.0 91.5 3.535534 89.0 90.25 91.5 92.75 94.0
In [520]: df.groupby('gender')['math'].agg(['max','min','mean'])
Out[520]:
max min mean
gender
fmale 99 78 91.75
male 90 89 89.50
- apply
In [5]: df
Out[5]:
id name math english gender
0 1 Alice 90 89 male
1 2 Bob 89 94 male
2 3 Cindy 99 80 fmale
3 4 Eric 78 94 fmale
4 5 Helen 97 94 fmale
5 6 Grace 93 90 fmale
In [6]: def transform_gender(gender):
   ...:     if gender == 'male':
   ...:         return '男'
   ...:     elif gender == 'fmale':
   ...:         return '女'
   ...:     else:
   ...:         return None
   ...:
In [7]: df1 = df.copy()
In [8]: df1['性别']=df.gender.apply(transform_gender)
In [9]: df1
Out[9]:
id name math english gender 性别
0 1 Alice 90 89 male 男
1 2 Bob 89 94 male 男
2 3 Cindy 99 80 fmale 女
3 4 Eric 78 94 fmale 女
4 5 Helen 97 94 fmale 女
5 6 Grace 93 90 fmale 女
In [18]: df2.applymap(lambda x :x*10 if (type(x) is int) else x)
Out[18]:
id name math english gender
0 10 Alice 900 890 male
1 20 Bob 890 940 male
2 30 Cindy 990 800 fmale
3 40 Eric 780 940 fmale
4 50 Helen 970 940 fmale
5 60 Grace 930 900 fmale
- Concatenating and merging
In [27]: raw_data_1 = {
...: 'subject_id': ['1', '2', '3', '4', '5'],
...: 'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
...: 'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']}
...:
...: raw_data_2 = {
...: 'subject_id': ['4', '5', '6', '7', '8'],
...: 'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
...: 'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']}
...:
...: raw_data_3 = {
...: 'subject_id': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
...: 'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]}
In [28]: data1 = pd.DataFrame(raw_data_1)
In [29]: data2 = pd.DataFrame(raw_data_2)
In [30]: data3 = pd.DataFrame(raw_data_3)
In [31]: data1_data2 = pd.concat([data1,data2])
In [32]: data1_data2
Out[32]:
subject_id first_name last_name
0 1 Alex Anderson
1 2 Amy Ackerman
2 3 Allen Ali
3 4 Alice Aoni
4 5 Ayoung Atiches
0 4 Billy Bonder
1 5 Brian Black
2 6 Bran Balwner
3 7 Bryce Brice
4 8 Betty Btisan
In [33]: data1_data2 = pd.concat([data1,data2],ignore_index=True)
In [34]: data1_data2
Out[34]:
subject_id first_name last_name
0 1 Alex Anderson
1 2 Amy Ackerman
2 3 Allen Ali
3 4 Alice Aoni
4 5 Ayoung Atiches
5 4 Billy Bonder
6 5 Brian Black
7 6 Bran Balwner
8 7 Bryce Brice
9 8 Betty Btisan
In [63]: df_3 = pd.merge(df_1,df_2,how='left',left_on='lkey',right_on='rkey')
In [64]: df_3
Out[64]:
lkey value rkey height
0 foo 1 foo 5
1 foo 1 foo 8
2 bar 2 bar 6
3 baz 3 baz 7
4 foo 5 foo 5
5 foo 5 foo 8
In [65]: df_1
Out[65]:
lkey value
0 foo 1
1 bar 2
2 baz 3
3 foo 5
In [66]: df_2
Out[66]:
rkey height
0 foo 5
1 bar 6
2 baz 7
3 foo 8
In [68]: df_4 = df_1.merge(df_2,how='left',left_on='lkey',right_on='rkey')
In [69]: df_4
Out[69]:
lkey value rkey height
0 foo 1 foo 5
1 foo 1 foo 8
2 bar 2 bar 6
3 baz 3 baz 7
4 foo 5 foo 5
5 foo 5 foo 8
In [71]: df_2.rename(columns={'height':'value'},inplace=True)
In [72]: df_2
Out[72]:
rkey value
0 foo 5
1 bar 6
2 baz 7
3 foo 8
In [73]: df_4 = df_1.merge(df_2,how='left',left_on='lkey',right_on='rkey')
In [74]: df_4
Out[74]:
lkey value_x rkey value_y
0 foo 1 foo 5
1 foo 1 foo 8
2 bar 2 bar 6
3 baz 3 baz 7
4 foo 5 foo 5
5 foo 5 foo 8
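- Summary of operations used frequently at work
1. Counting missing values
df.isna().sum()
df.isna().sum()/df.shape[0]
df['feature'].isna().sum()
df['feature'].isna().sum()/df.shape[0]
2. loc and iloc
df.loc[a, b]
df.loc[:, :]
df.loc[[index], :]
df.loc[index:, 'feature_1':'feature_5']
df.loc[index]
3. Conditional assignment (use loc rather than chained indexing)
df.loc[df.feature == 1, 'feature'] = 0
4. merge
pd.merge(df_1, df_2, how='', left_on='', right_on='')
5. concat
pd.concat([df_1, df_2], axis=1)
6. for index, row in df.iterrows(): iterates over each row of the DataFrame
7. Deleting a column (del df.A does not work; use the bracket form)
del df['A']
8. Changing the data type
df = df.astype('float32')
9. Getting the top n values
df.feature_name.nlargest(2)
df.feature_name.nsmallest(2)
10. CASE WHEN-style logic with apply, where fun is a user-defined function
df['a']=df.A.apply(lambda x:fun(x))
11. Handling CSV files that are too large to read in one go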
data = pd.read_csv(path, sep=',', iterator=True)
loop = True
chunkSize = 1000
chunks = []
index=0
while loop:
    try:
        print(index)
        chunk = data.get_chunk(chunkSize)
        chunks.append(chunk)
        index+=1
    except StopIteration:
        loop = False
        print("Iteration is stopped.")
print('Merging chunks')
data = pd.concat(chunks, ignore_index=True)
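The same chunked read can be sketched more compactly with the chunksize parameter of read_csv, which returns an iterator of DataFrames. The CSV text below is a small in-memory stand-in, an assumption made so the example runs without a real file.

```python
import io

import pandas as pd

# A tiny CSV held in memory stands in for a large file on disk.
csv_text = "id,math\n1,90\n2,89\n3,99\n4,78\n5,97\n"

chunks = []
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    chunks.append(chunk)          # each chunk is a DataFrame of up to 2 rows

data = pd.concat(chunks, ignore_index=True)

print(len(chunks))    # 3
print(data.shape)     # (5, 2)
```

When the file really is too large for memory, each chunk would normally be filtered or aggregated inside the loop instead of being kept whole.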
14. Checking whether the data contains infinite values
np.isinf(data[i]).any()
15. Counting how often each value of a feature occurs
data[feature_1].value_counts()
16. Removing thousands separators and converting the data type
df['feature_1'] = ['100,000', '200,000', '300,000', '400,000']
df['feature_1'] = df['feature_1'].apply(lambda x: str(x).replace(',', '')).astype('float')
17. Padding strings to a fixed width with a custom fill character
df['feature_2'] = ['405', '8094', '100', '22']
df['feature_2'].str.pad(width=6, side='left', fillchar='*')
18. Filtering a DataFrame by condition
df.loc[df['feature_1'] > 100000]
df.loc[(df['feature_1'] > 100000) & (df['feature_2'] != '100')]
df.query()
In [78]: df
Out[78]:
id name math english gender
0 1 Alice 90 89 male
1 2 Bob 89 94 male
2 3 Cindy 99 80 fmale
3 4 Eric 78 94 fmale
4 5 Helen 97 94 fmale
5 6 Grace 93 90 fmale
In [79]: df.loc[df.math>90,'name']
Out[79]:
2 Cindy
4 Helen
5 Grace
Name: name, dtype: object
19. Removing duplicates from a DataFrame
df['feature_3'] = [200, 300, 400, 200]
df.drop_duplicates(subset = ['feature_3'], keep='first')
20. Converting a dict to a DataFrame
dict_a = {'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}
df2 = pd.DataFrame.from_dict(dict_a, orient='index', columns=['a'])
df2.reset_index().rename(columns={'index': 'abcd', 'a': 'ABCD'})
21. apply with a condition over several columns
def function(a, b):
    if a >= 100000 and b == '100':
        return 1
    else:
        return 0
df['test'] = df.apply(lambda x: function(x.feature_1, x.feature_2), axis = 1)
22. Ranking within groups, similar to window functions in SQL
df.groupby([feature_1])[feature_2].rank(ascending=True, method='first')
-- equivalent to row_number() over(partition by feature_1 order by feature_2 asc): 1,2,3,4,5,6
df.groupby([feature_1])[feature_2].rank(ascending=True, method='min')
-- equivalent to rank() over(partition by feature_1 order by feature_2 asc): 1,2,3,3,3,6, with gaps after ties
df.groupby([feature_1])[feature_2].rank(ascending=True, method='dense')
-- equivalent to dense_rank() over(partition by feature_1 order by feature_2 asc): 1,2,3,3,3,4, with consecutive ranks
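The correspondence between pandas rank methods and the SQL window functions can be checked on a small example with ties. The frame `df_rank` and its column names are made up for this sketch; for comparison, method='max' assigns tied rows the highest rank in the tie.

```python
import pandas as pd

# One group with a three-way tie on score 30.
df_rank = pd.DataFrame({'grp': ['a'] * 6, 'score': [10, 20, 30, 30, 30, 40]})
grouped = df_rank.groupby('grp')['score']

df_rank['row_number'] = grouped.rank(method='first').astype(int)   # 1,2,3,4,5,6
df_rank['rank'] = grouped.rank(method='min').astype(int)           # 1,2,3,3,3,6
df_rank['dense_rank'] = grouped.rank(method='dense').astype(int)   # 1,2,3,3,3,4
df_rank['max_rank'] = grouped.rank(method='max').astype(int)       # 1,2,5,5,5,6

print(df_rank)
```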
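23. pandas: merge rows
import pandas as pd
import numpy as np
import os
os.chdir(r'C:/Users/young/Desktop')
# Column names are kept as in the Excel file: '姓名' = name, '爱好' = hobby, '性别' = gender
df = pd.read_excel('多行合并.xlsx')
def concat_func(x):
    # Join the distinct values of each column within one group
    return pd.Series({
        '爱好': ','.join(x['爱好'].unique()),
        '性别': ','.join(x['性别'].unique())
    })
result = df.groupby(df['姓名']).apply(concat_func).reset_index()
result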
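Because the snippet above depends on a local Excel file, here is a self-contained sketch of the same row-merging idea. It swaps `groupby(...).apply` for `groupby(...).agg`, which is an alternative rather than the approach used above. The data, the English column names, and the helper `join_unique` are made up.

```python
import pandas as pd

# Hypothetical data in place of the Excel file; column names are in English here.
people = pd.DataFrame({
    "name": ["Ann", "Ann", "Ben", "Ben", "Ben"],
    "hobby": ["chess", "tennis", "chess", "chess", "golf"],
    "gender": ["F", "F", "M", "M", "M"],
})


def join_unique(values):
    """Join the distinct values of one column, keeping first-seen order."""
    return ",".join(values.unique())


# One output row per name; each listed column is collapsed with join_unique.
merged_rows = (
    people.groupby("name", as_index=False)
          .agg({"hobby": join_unique, "gender": join_unique})
)
print(merged_rows)
```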
24. String operations: .str.contains works like the LIKE operator in SQL
df[df['colB'].str.contains('a|b')]
df[~df['colA'].isin(['A','B'])]
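The two filters above can be tried on a small frame. This sketch uses made-up data in a frame named `items`. It also shows that `str.contains` treats its pattern as a regular expression by default, so `|` means "or" unless `regex=False` is passed.

```python
import pandas as pd

items = pd.DataFrame({
    "colA": ["A", "B", "C", "D"],
    "colB": ["apple", "berry", "cherry", "fig"],
})

# Regex alternation: rows whose colB contains "a" OR "b".
has_a_or_b = items[items["colB"].str.contains("a|b")]

# regex=False treats the pattern literally, so "|" is just a character.
literal = items[items["colB"].str.contains("a|b", regex=False)]

# "~" negates the mask: rows whose colA is NOT in the list (like SQL NOT IN).
not_ab = items[~items["colA"].isin(["A", "B"])]
print(has_a_or_b["colB"].tolist(), len(literal), not_ab["colA"].tolist())
```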
25. Concatenate columns with .str.cat; the columns being joined must have the same (string) dtype
data['合并2']=data['姓名'].str.cat(data['性别'],sep=',').str.cat(data['身份'],sep=',')
26. Fill missing values
data['类别'] = data['类别'].fillna('others')
27. Drop missing values: how='any' drops a row/column that contains any missing value, axis=0 drops rows, and how='all' drops only rows/columns that are entirely empty
data.dropna(subset=['品牌'],how='any',axis=0,inplace=True)
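The difference between `how='any'`, `how='all'`, and `subset` is easiest to see on a tiny frame. The frame `products` and its columns below are hypothetical.

```python
import numpy as np
import pandas as pd

products = pd.DataFrame({
    "brand": ["acme", np.nan, "zen", np.nan],
    "price": [10.0, 20.0, np.nan, np.nan],
})

# how='any': drop a row if ANY inspected column is missing.
any_dropped = products.dropna(how="any", axis=0)
# how='all': drop a row only if ALL inspected columns are missing.
all_dropped = products.dropna(how="all", axis=0)
# subset limits which columns are inspected.
brand_only = products.dropna(subset=["brand"], how="any", axis=0)
print(len(any_dropped), len(all_dropped), len(brand_only))
```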
28. Drop duplicate rows
data.drop_duplicates(inplace=True)
29. Convert to a datetime type
pd.to_datetime(df['打卡时间'].str[:19],format="%Y-%m-%d %H:%M:%S")
30. pandas: insert a column at a given position
df.insert(loc=0, column='#', value=df.index)
SQL vs. pandas comparison
- resample: time-series analysis
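The notes only name `resample` here, so the following is a minimal sketch of what it does rather than anything taken from the original: it groups a time-indexed series into fixed time buckets and aggregates each bucket. The series `readings` is made up.

```python
import pandas as pd

# Hypothetical series: one reading every 6 hours over two days.
index = pd.date_range("2022-01-01", periods=8, freq=pd.Timedelta(hours=6))
readings = pd.Series([1, 2, 3, 4, 5, 6, 7, 8], index=index)

# Downsample to daily buckets and aggregate each bucket.
daily_sum = readings.resample("D").sum()
daily_mean = readings.resample("D").mean()
print(daily_sum.tolist(), daily_mean.tolist())
```

In SQL terms this is close to a `GROUP BY` on the date truncated to the day.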
Chapter 13 GUI Programming
13.1 Common GUI toolkits
13.2 PySimpleGUI
Chapter 14 Miscellaneous
14.1 Date and time operations
- Common packages for working with time
- Common operations
- Get the current time
In [393]: import time
In [394]: time.time()
Out[394]: 1645579485.5731077
In [395]: from datetime import datetime
In [396]: datetime.now()
Out[396]: datetime.datetime(2022, 2, 23, 9, 25, 24, 717509)
In [397]: str(datetime.now())
Out[397]: '2022-02-23 09:25:35.981446'
- Time formatting
- Time to string
In [398]: time.localtime()
Out[398]: time.struct_time(tm_year=2022, tm_mon=2, tm_mday=23, tm_hour=9, tm_min=26, tm_sec=8, tm_wday=2, tm_yday=54, tm_isdst=0)
In [399]: time.strftime('%Y-%m-%d %H:%M:%S',time.localtime())
Out[399]: '2022-02-23 09:27:03'
In [400]: datetime.now().strftime('%Y-%m-%d %H:%M:%S')
Out[400]: '2022-02-23 09:27:47'
In [402]: time.strftime('%Y-%m-%d %H:%M:%S',time.localtime(time.time()))
Out[402]: '2022-02-23 09:51:14'
- String to time
In [404]: time.strptime('2022-02-01','%Y-%m-%d')
Out[404]: time.struct_time(tm_year=2022, tm_mon=2, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=32, tm_isdst=-1)
In [406]: int(time.mktime(time.strptime('2022-02-01','%Y-%m-%d')))
Out[406]: 1643644800
- Convert an ISO-format string to local time
import dateutil.parser
dt = dateutil.parser.isoparse('2008-09-03T20:56:35.450686+00:00')
localdt = dt.astimezone(tz=None)
localdt.strftime('%Y-%m-%d %H:%M:%S')
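For ISO strings in this exact shape, the standard library can do the same conversion without `dateutil`. The sketch below assumes Python 3.7 or later, where `datetime.fromisoformat` is available. The fixed UTC+8 offset is only an example, chosen so the output does not depend on the machine's time zone.

```python
from datetime import datetime, timedelta, timezone

iso_text = "2008-09-03T20:56:35.450686+00:00"

# fromisoformat (Python 3.7+) parses this offset-aware ISO 8601 form.
dt = datetime.fromisoformat(iso_text)

# astimezone() with no argument converts to the machine's local time zone.
local_dt = dt.astimezone()

# A fixed offset gives the same result on every machine.
utc8 = dt.astimezone(timezone(timedelta(hours=8)))
print(local_dt.strftime("%Y-%m-%d %H:%M:%S"))
print(utc8.strftime("%Y-%m-%d %H:%M:%S"))
```

`dateutil.parser.isoparse` accepts more ISO 8601 variants than `fromisoformat` does on older Python versions, so it remains the safer choice for arbitrary input.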
- Get the year, month, day, and weekday
>>> from datetime import datetime
>>> datetime.now()
datetime.datetime(2018, 6, 30, 23, 3, 54, 238947)
>>> datetime.now().year
2018
>>> datetime.now().month
6
>>> datetime.now().day
30
>>> datetime.now().hour
23
>>> datetime.now().minute
7
>>> datetime.now().second
58
>>> datetime.now().microsecond
151169
>>> datetime.now().weekday()
5
- Shift a given date forward or backward
thatDay = "2018-6-24"
from datetime import datetime,timedelta
theDay = datetime.strptime(thatDay, "%Y-%m-%d").date()
target = theDay + timedelta(days=120)
print(target)
print(target.weekday())
target = theDay - timedelta(days=120)
print(target)
print(target.weekday())
- Get the Monday of the week containing a given date
thatDay = "2022-10-30"
from datetime import datetime,timedelta
theDay = datetime.strptime(thatDay, "%Y-%m-%d").date()
weekMonday = theDay - timedelta(days=theDay.weekday())
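The calculation above works because `weekday()` returns 0 for Monday through 6 for Sunday, so subtracting that many days lands on Monday. Wrapped in a function (the name `monday_of_week` is illustrative), it can be sketched as:

```python
from datetime import date, datetime, timedelta


def monday_of_week(day_text):
    """Return the Monday of the week that contains the given YYYY-MM-DD date."""
    the_day = datetime.strptime(day_text, "%Y-%m-%d").date()
    # weekday() is 0 for Monday, so this steps back to the start of the week.
    return the_day - timedelta(days=the_day.weekday())


monday = monday_of_week("2022-10-30")
sunday = monday + timedelta(days=6)
print(monday, sunday)
```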
- Get the number of days in a month
from calendar import monthrange
mr = monthrange(2011, 2)
print(mr[1])
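`monthrange` returns a pair: the weekday of the first day of the month (0 for Monday) and the number of days in the month, which is why the code above reads index 1. A short sketch with a hypothetical helper `days_in_month`:

```python
from calendar import monthrange


def days_in_month(year, month):
    """Return how many days the given month has."""
    first_weekday, day_count = monthrange(year, month)
    return day_count


# February in a common year and in a leap year.
print(days_in_month(2011, 2), days_in_month(2024, 2))
```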
14.2 Regular expressions
Regex reference material, online regex tester
14.3 Reading and writing Excel
- Common packages for reading and writing Excel
- openpyxl
import openpyxl
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet['A1'] = 'data'
workbook.save('test.xlsx')
- xlrd / xlwt / xlutils
import xlrd
import xlwt
from xlutils.copy import copy
# Write a new .xls file with xlwt
wb = xlwt.Workbook()
sh = wb.add_sheet('Sheet1')
sh.write(0,0,'data')
wb.save('myexcel.xls')
# Read a cell with xlrd
workbook = xlrd.open_workbook('myexcel.xls')
worksheet = workbook.sheet_by_index(0)
data = worksheet.cell_value(0,0)
# Modify an existing file: copy the xlrd book into a writable xlwt book
book = xlrd.open_workbook('myexcel.xls')
new_book = copy(book)
worksheet = new_book.get_sheet(0)
worksheet.write(0,0,'new data')
new_book.save('myexcel.xls')
- xlsxwriter
import xlsxwriter as xw
workbook = xw.Workbook('myexcel.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write('A1',1)
workbook.close()
- win32com
import win32com.client
excel = win32com.client.Dispatch("Excel.Application")
workbook = excel.Workbooks.Open(r"d:\tmp\income1.xlsx")
sheet = workbook.Sheets('2017')
sheet.Cells(1,1).Value = "Hello"
workbook.Save()
workbook.Close()
excel.Quit()
# Release the COM references
sheet = None
workbook = None
excel = None
import time
def byCom():
    t1 = time.time()
    import win32com.client
    excel = win32com.client.Dispatch("Excel.Application")
    workbook = excel.Workbooks.Open(r"h:\tmp\ruijia\数据.xlsx")
    sheet = workbook.Sheets(2)
    print(sheet.Cells(2,15).Value)
    print(sheet.UsedRange.Rows.Count)
    t2 = time.time()
    print(f'Open: took {t2 - t1} s')
    total = 0
    # COM rows and columns are 1-based; row 1 is the header
    for row in range(2,sheet.UsedRange.Rows.Count+1):
        value = sheet.Cells(row,15).Value
        if type(value) not in [int,float]:
            continue
        total += value
    print(total)
    t3 = time.time()
    print(f'Read data: took {t3 - t2} s')
def byXlrd():
    t1 = time.time()
    import xlrd
    # Note: xlrd 2.0+ reads only .xls files; opening .xlsx needs xlrd < 2.0
    srcBook = xlrd.open_workbook("数据.xlsx")
    sheet = srcBook.sheet_by_index(1)
    print(sheet.cell_value(rowx=1,colx=14))
    print(sheet.nrows)
    t2 = time.time()
    print(f'Open: took {t2 - t1} s')
    total = 0
    # xlrd rows and columns are 0-based; row 0 is the header
    for row in range(1,sheet.nrows):
        value = sheet.cell_value(row, 14)
        if type(value) == str:
            continue
        total += value
    print(total)
    t3 = time.time()
    print(f'Read data: took {t3 - t2} s')
byCom()
byXlrd()
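If you only read or modify a small amount of data in a large Excel file, the Excel COM interface is much faster.
If you need to read a large amount of data from a large Excel file, however, do not use the COM interface; it is very slow.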
- xlwings
import xlwings as xw
workbook = xw.Book(r'path/myexcel.xlsx')
data_range = workbook.sheets('Sheet1').range('A1')
data_range.value = [1,2,3]
workbook.save()
- pandas
import pandas as pd
df = pd.read_excel(path, sheet_name='xxx', header=1)
df.to_excel(path, sheet_name='xxx')
- DataNitro
Cell('A1').value = 'data'
CellRange('A1:B2').value = 'data'
- Comparison of the approaches to reading and writing Excel: see the comparison reference material
14.4 Calling other programs
14.5 Socket programming
pass
14.6 Hashing and encryption
- Common hash functions
Hash algorithm | Digest length |
---|---|
MD5 | 16 bytes |
SHA1 | 20 bytes |
SHA224 | 28 bytes |
SHA256 | 32 bytes |
SHA384 | 48 bytes |
SHA512 | 64 bytes |
A hash function takes data (a byte string) of any length and computes a result of fixed length.
The data being hashed is usually called the source data, and the computed result is called the hash value or digest.
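The fixed-length property can be checked directly with `hashlib`. This sketch hashes a one-byte input and a one-megabyte input with each algorithm from the table and records the digest length in bytes. The variable names are illustrative.

```python
import hashlib

data_short = b"a"
data_long = b"a" * 1_000_000

digest_lengths = {}
for name in ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]:
    short_digest = hashlib.new(name, data_short).digest()
    long_digest = hashlib.new(name, data_long).digest()
    # The inputs differ in size by a factor of a million; the digests do not.
    assert len(short_digest) == len(long_digest)
    digest_lengths[name] = len(short_digest)

print(digest_lengths)
```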
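- Properties of hash computation
  - The same source data hashed with the same algorithm always yields the same hash value.
  - However large the source data is, a given hash algorithm always produces a hash value of the same length.
  - Different source data hashed with the same algorithm can yield the same hash value. This is called a collision, and how likely it is to happen is the collision rate.
  - In general, the longer an algorithm's output, the lower its collision rate and the longer the computation usually takes. Even for MD5 the rate of accidental collisions is very small, small enough to be almost negligible: roughly 1.47 × 10^-29.
  - Hashing is irreversible: the source data cannot be computed back from the hash value.
- Use cases for hash functions
  - Verifying copied or downloaded files
  - Verifying that information is valid (login, username and password checks)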
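The file-verification use case can be sketched as follows: hash the file in blocks so that large files do not have to fit in memory, then compare digests. The helper name `file_sha256`, the block size, and the temporary files are all assumptions made for the example, and SHA-256 is just one reasonable choice of algorithm.

```python
import hashlib
import os
import tempfile


def file_sha256(path, block_size=65536):
    """Return the SHA-256 hex digest of a file, reading it block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter(callable, sentinel) keeps reading until read() returns b"".
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "original.bin")
    copy_path = os.path.join(folder, "copy.bin")
    for path in (original, copy_path):
        with open(path, "wb") as handle:
            handle.write(b"payload" * 1000)
    same = file_sha256(original) == file_sha256(copy_path)
    with open(copy_path, "ab") as handle:
        handle.write(b"!")  # simulate a corrupted copy
    changed = file_sha256(original) != file_sha256(copy_path)

print(same, changed)
```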
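- Hashing in Python
import hashlib
m = hashlib.md5()
# update() takes bytes, so the text is encoded first
m.update("Zhang San, tuition paid|13ty8ffbs2v".encode())
# digest() returns the raw bytes; hexdigest() returns them as a hex string
resultBytes = m.digest()
resultHex = m.hexdigest()
print(resultHex)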
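The string hashed above looks like a message followed by an extra token. If that token is meant to be a secret shared between sender and receiver, which is an assumption, the standard library's `hmac` module is the purpose-built tool for that kind of validity check. In this sketch the key, the messages, and the function names `sign` and `verify` are all hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret known to both the sender and the receiver.
SECRET_KEY = b"hypothetical-shared-secret"


def sign(message: bytes) -> str:
    """Return a keyed SHA-256 digest of the message as a hex string."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def verify(message: bytes, signature: str) -> bool:
    """Check the signature using a constant-time comparison."""
    return hmac.compare_digest(sign(message), signature)


message = "Zhang San, tuition paid".encode()
signature = sign(message)
print(verify(message, signature))
print(verify("Li Si, tuition paid".encode(), signature))
```

A receiver who knows the key can detect a tampered message, while someone without the key cannot produce a valid signature for a new one.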
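- Common encryption algorithms
  - Symmetric encryption: AES, RC4, DES, 3DES, IDEA
  - Asymmetric encryption: RSA (Rivest–Shamir–Adleman)
- Properties of encryption algorithms
  - Encryption and decryption are reversible; hashing is irreversible.
  - A hash algorithm produces a small hash value even for very large data, whereas encrypting large source data produces correspondingly large encrypted data.
- Use cases for encryption algorithms
- Encryption in Python
from cryptography.fernet import Fernet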
key = Fernet.generate_key()
f = Fernet(key)
src = "dfdkslfjdlsjdkljg"
srcBytes = src.encode()
token = f.encrypt(srcBytes)
print(token)
sb = f.decrypt(token)
print(sb.decode())
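The properties listed for encryption can be observed with the same `Fernet` class: the ciphertext grows with the plaintext, the right key recovers the data, and a different key is rejected. This is a sketch with made-up payloads, not a recommendation on key handling.

```python
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()
cipher = Fernet(key)

small_token = cipher.encrypt(b"x" * 10)
large_token = cipher.encrypt(b"x" * 10_000)
# Unlike a hash value, the ciphertext grows with the plaintext.
grows = len(large_token) > len(small_token)

# Reversible: the right key recovers the original bytes.
round_trip = cipher.decrypt(large_token) == b"x" * 10_000

# A different key cannot decrypt the token.
other = Fernet(Fernet.generate_key())
try:
    other.decrypt(small_token)
    wrong_key_rejected = False
except InvalidToken:
    wrong_key_rejected = True

print(grows, round_trip, wrong_key_rejected)
```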
14.7 Image processing
14.8 Building executables