Python Data Structures: Lists, Dictionaries, Tuples, and Sets

2025-01-07 Technical Tutorial

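Lists

In Python, a list is an ordered collection type.
A list can hold any kind of object: numbers, strings, other lists, dictionaries, or tuples. Unlike strings, lists are mutable and support in-place modification.
A Python list is:

- an ordered collection of arbitrary objects
- accessed by offset (index)
- variable in length, heterogeneous, and arbitrarily nestable
- a mutable sequence
- an array of object references

List operations

Most list operations work the same way as they do for strings:

- list1 + list2: concatenates the two lists in order
- list1 * 3: repeats list1 three times
- for i in list1: print(i): prints the elements in order
- 3 in list1: tests whether any element of the list equals 3
- list1.index(1): returns the position of the first element equal to 1
- list1.count(1): returns how many elements equal 1
- list1[x:y]: builds a new list from the elements at index x up to, but not including, index y
- len(list1): returns the number of elements in list1

Basic list operations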

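Creating a list (the examples name the variable lst rather than list, so the built-in list type stays usable):

>>> lst = []
>>> lst = [1, 2, '3', []]
>>> lst
[1, 2, '3', []]

Reading values from a list:

>>> lst[1]
2
>>> lst[0:3]
[1, 2, '3']

Repeating the contents of a list:

>>> lst * 3
[1, 2, '3', [], 1, 2, '3', [], 1, 2, '3', []]

Using the in operator to test whether an object is in the list:

>>> 3 in lst
False
>>> [] in lst
True

Printing in a loop:

>>> for i in lst:
...     print(i, end=' ')
...
1 2 3 []

Building a list with a list comprehension:

>>> lst = [i * 4 for i in 'ASDF']
>>> lst
['AAAA', 'SSSS', 'DDDD', 'FFFF']

Matrices (nested lists):

>>> matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> matrix
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> matrix[0][1]
2
>>> matrix[1][2]
6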
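As a small extension of the matrix example, the sketch below reads one column and transposes the nested list. The names matrix, second_column, and transposed are illustrative only and are not part of the original tutorial.

```python
# A 3x3 matrix stored as a list of row lists
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Take the element at index 1 from every row to get the second column
second_column = [row[1] for row in matrix]

# zip(*matrix) yields the columns as tuples; convert each one back to a list
transposed = [list(column) for column in zip(*matrix)]

print(second_column)  # [2, 5, 8]
print(transposed)     # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```

Because the rows are ordinary lists, matrix[i][j] first selects row i and then element j of that row.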

Modifying a list in place:

>>> food = ['spam', 'eggs', 'milk']
>>> food[1]
'eggs'
>>> food[1] = 'Eggs'
>>> food[:]
['spam', 'Eggs', 'milk']

List methods

Appending to a list:

>>> food.append('cake')
>>> food
['spam', 'Eggs', 'milk', 'cake']

Sorting a list in place:

>>> food.sort()
>>> food
['Eggs', 'cake', 'milk', 'spam']

Merging lists:

>>> list1 = [1, 2, 3]
>>> list2 = [4, 5, 6]
>>> list1.extend(list2)
>>> list1
[1, 2, 3, 4, 5, 6]

Removing and returning the last element:

>>> list1.pop()
6
>>> list1
[1, 2, 3, 4, 5]

Reversing a list in place:

>>> list1
[1, 2, 3, 4, 5]
>>> list1.reverse()
>>> list1
[5, 4, 3, 2, 1]

Finding the index of an element:

>>> lst = [1, 2, 3, 4, 5]
>>> lst.index(3)
2

Inserting into a list:

>>> lst.insert(2, 10)
>>> lst
[1, 2, 10, 3, 4, 5]

Deleting an element from a list:

>>> lst
[1, 2, 10, 3, 4, 5]
>>> del lst[2]
>>> lst
[1, 2, 3, 4, 5]

Sorting with sorted():
By default, strings are compared by character code, so uppercase letters sort before lowercase ones. Pass key=str.lower to compare case-insensitively, and reverse=True to sort in descending order. sorted() returns a new list and leaves the original unchanged.

>>> lst = ['abc', 'aDd', 'ace']
>>> sorted(lst)
['aDd', 'abc', 'ace']
>>> lst
['abc', 'aDd', 'ace']
>>> sorted(lst, key=str.lower, reverse=True)
['aDd', 'ace', 'abc']
>>> sorted(lst, key=str.lower)
['abc', 'ace', 'aDd']
>>> sorted([x.lower() for x in lst])
['abc', 'ace', 'add']
>>> sorted([x.lower() for x in lst], reverse=True)
['add', 'ace', 'abc']

Practical list usage

Unpacking values:

>>> info = ['myname', 18, [1997, 9, 28]]
>>> _name, _age, _birth = info
>>> _name
'myname'
>>> _age
18
>>> _birth
[1997, 9, 28]
>>> _name, _age, (_birth_y, _birth_m, _birth_d) = info
>>> _birth_y
1997
>>> _birth_m, _birth_d
(9, 28)
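The same nested unpacking also works in the header of a for loop. The sketch below assumes a hypothetical list of records shaped like info above; the record values and the birth_years name are made up for illustration.

```python
# Hypothetical records: (name, age, (year, month, day))
records = [
    ('wanger', 25, (1997, 9, 28)),
    ('xiaoming', 30, (1992, 1, 15)),
]

birth_years = {}
# Each record is unpacked into names, including the nested date tuple
for name, age, (year, month, day) in records:
    birth_years[name] = year

print(birth_years)  # {'wanger': 1997, 'xiaoming': 1992}
```

Unpacking raises ValueError if a record does not have exactly the expected shape, which makes malformed data show up early.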

When the number of values is not fixed, use * to collect the remainder:

>>> a = ['adc', 122, 2215, 'asd@asd']
>>> a_name, *a_phone, a_mail = a
>>> a_name
'adc'
>>> a_phone
[122, 2215]

Keeping only the last N elements:
A deque from the collections module can be created with a maximum length (maxlen). Once it is full, appending on the right discards the leftmost element, and appending on the left discards the rightmost element.

>>> from collections import deque
>>> q = deque(maxlen=3)
>>> q.append(1)
>>> q.append(2)
>>> q.append(3)
>>> q
deque([1, 2, 3], maxlen=3)
>>> q.append(4)
>>> q
deque([2, 3, 4], maxlen=3)
>>> q.appendleft('5')
>>> q
deque(['5', 2, 3], maxlen=3)

Finding the largest and smallest values:
The nlargest and nsmallest functions in the heapq module return the N largest or smallest items of a list. For a single value, max and min are enough, and sum adds up the numbers in a list.

>>> from heapq import nlargest, nsmallest
>>> num = [1, 4, 6, 7, 8, 8, 34, 64, 23, 7, 45, 34]
>>> nlargest(3, num)
[64, 45, 34]
>>> nlargest(2, num)
[64, 45]
>>> nsmallest(2, num)
[1, 4]
>>> nsmallest(4, num)
[1, 4, 6, 7]
>>> num
[1, 4, 6, 7, 8, 8, 34, 64, 23, 7, 45, 34]
>>> max(num)
64
>>> min(num)
1
>>> sum(num)
241

Naming slices:
A slice object gives a range of indexes a readable name.

>>> a_info = ['wanger', 'wangerxiao', 25, 'computer']
>>> _name = slice(0, 2)
>>> _age = slice(2, 3)
>>> _job = slice(3, 4)
>>> a_info[_name]
['wanger', 'wangerxiao']
>>> a_info[_age]
[25]
>>> a_info[_job]
['computer']

Counting repeated elements:
This uses the Counter class from the collections module.

>>> a = [1, 2, 3, 4, 5, 6, 2, 4, 2, 5, 6]
>>> from collections import Counter
>>> count_word = Counter(a)
>>> count_word
Counter({2: 3, 4: 2, 5: 2, 6: 2, 1: 1, 3: 1})
>>> count_word.most_common(3)
[(2, 3), (4, 2), (5, 2)]
>>> count_word.most_common(2)
[(2, 3), (4, 2)]

Dictionaries

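In Python, a dictionary is a mapping type rather than a sequence: items are looked up by key, not by position. (Since Python 3.7 a dictionary preserves insertion order, but it still cannot be indexed by offset.)
Every value in a dictionary has its own unique key, and the value is retrieved through that key.
The main properties of a Python dictionary are:

- accessed by key rather than by offset
- a collection of arbitrary objects with no positional ordering
- variable in length, heterogeneous, and arbitrarily nestable
- a mutable mapping type
- a table of object references

Points to note when using dictionaries:

- Sequence operations do not apply: concatenation and slicing cannot be used
- Assigning to a new key adds an entry
- Keys do not have to be strings: any hashable (in practice, immutable) object works, which excludes lists and dictionaries

Basic dictionary operations:

Assigning a dictionary (the examples name the variable d rather than dict, so the built-in dict type stays usable):

>>> d = {'a': 97, 'b': 98}
>>> len(d)
2
>>> print("ascii code of 'a' is {},ascii code of 'b' is {}".format(d['a'], d['b']))
ascii code of 'a' is 97,ascii code of 'b' is 98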

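Testing whether a particular key exists in a dictionary:

>>> d = {'a': 97, 'b': 98}
>>> 'a' in d
True
>>> 'b' in d
True

Modifying in place:

# Change the value of a particular key
>>> food = {'eggs': 3, 'ham': 1, 'spam': 4}
>>> food['ham'] = 2
>>> food
{'eggs': 3, 'ham': 2, 'spam': 4}
# Add a new key with its value
>>> food['branch'] = ['bacon', 'bake']
>>> food
{'eggs': 3, 'ham': 2, 'spam': 4, 'branch': ['bacon', 'bake']}
# Delete one entry
>>> del food['eggs']
>>> food
{'ham': 2, 'spam': 4, 'branch': ['bacon', 'bake']}
# Remove every entry from a dictionary
>>> d.clear()
>>> d
{}
# Delete the dictionary itself
>>> del d

Dictionary methods

get() looks up a key and returns None, or a default you supply, when the key is missing:

>>> d = {'a': 97, 'b': 98}
>>> food.get('ham')
2
>>> d.get('b')
98
>>> d.get('0')
>>> d.get('0', 'none')
'none'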
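A common use of the default argument is counting, where a missing key should be treated as zero. The following is a minimal sketch; the words list and the counts name are illustrative assumptions.

```python
# Illustrative input data
words = ['spam', 'eggs', 'spam', 'ham', 'spam']

counts = {}
for word in words:
    # get() returns 0 the first time a word is seen, so no KeyError is raised
    counts[word] = counts.get(word, 0) + 1

print(counts)  # {'spam': 3, 'eggs': 1, 'ham': 1}
```

Indexing with counts[word] on a missing key would raise KeyError instead, which is why get() with a default is the more convenient choice here.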

Ways to create a dictionary:
1. The most basic way, a literal:

>>> d = {'name': 'wanger', 'age': 25}

2. Assigning key by key:

>>> d = {}
>>> d['name'] = 'wanger'
>>> d['age'] = 25
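Beyond the two ways listed, the built-in dict constructor and dictionary comprehensions build the same kind of object. This sketch reuses the name/age data from the examples; the variable names are illustrative.

```python
# Keyword arguments: keys must be valid identifiers and become strings
by_keywords = dict(name='wanger', age=25)

# Pair up a list of keys with a list of values
keys = ['name', 'age']
values = ['wanger', 25]
by_zip = dict(zip(keys, values))

# Dictionary comprehension: compute keys and values from an iterable
squares = {n: n * n for n in range(1, 4)}

print(by_keywords == by_zip)  # True
print(squares)                # {1: 1, 2: 4, 3: 9}
```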

Comparing dictionary entries:
Functions such as max, min, and sorted iterate over a dictionary's keys, not its values. To compare by value, use zip to pair each value with its key (value first). The resulting pairs can also be sorted with sorted.

>>> d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> max(d)
'd'
>>> min(d)
'a'
>>> max(zip(d.values(), d.keys()))
(4, 'd')
>>> min(zip(d.values(), d.keys()))
(1, 'a')
>>> sorted(zip(d.values(), d.keys()))
[(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
>>> sorted(zip(d.values(), d.keys()), reverse=True)
[(4, 'd'), (3, 'c'), (2, 'b'), (1, 'a')]

Sorting a list of dictionaries:
Use sorted with its key parameter to choose what to sort by. The itemgetter function from the operator module builds that key function.

>>> rows = [{'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
...         {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
...         {'fname': 'John', 'lname': 'Clesse', 'uid': 1001},
...         {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}]
>>> from operator import itemgetter
>>> rows_fname = sorted(rows, key=itemgetter('fname'))
>>> rows_fname
[{'fname': 'Big', 'lname': 'Jones', 'uid': 1004}, {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}, {'fname': 'David', 'lname': 'Beazley', 'uid': 1002}, {'fname': 'John', 'lname': 'Clesse', 'uid': 1001}]
>>> rows_uid = sorted(rows, key=itemgetter('uid'))
>>> rows_uid
[{'fname': 'John', 'lname': 'Clesse', 'uid': 1001}, {'fname': 'David', 'lname': 'Beazley', 'uid': 1002}, {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}, {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}]

Tuples

Introduction to tuples

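Tuples are very similar to lists, except that they cannot be changed in place. A Python tuple is:

- an ordered collection of arbitrary objects
- accessed by offset
- an immutable sequence type
- fixed in length, heterogeneous, and arbitrarily nestable
- an array of object references

Creating tuples

A tuple with a single element must be written with a trailing comma (,). Tuples can be nested inside tuples. The examples name the variable t rather than tuple, because rebinding tuple would make the later tuple(lst) call fail.

>>> t = ()
>>> t = (1,)
>>> type(t)
<class 'tuple'>
# The parentheses are optional here
>>> t = (1, 2, '3', (4, 5))
>>> t
(1, 2, '3', (4, 5))
>>> t = 1, 2, '3', (4, 5)
>>> t
(1, 2, '3', (4, 5))

Converting a list to a tuple

>>> lst = [1, 2, 3, 4]
>>> sd = tuple(lst)
>>> sd
(1, 2, 3, 4)

Tuple methods

Sorting a tuple:
sorted() accepts a tuple but returns the result as a list.

>>> t = (1, 5, 3, 6, 4, 2)
>>> sorted(t)
[1, 2, 3, 4, 5, 6]

Finding the position of an element:

>>> t
(1, 5, 3, 6, 4, 2)
>>> t.index(3)
2

Counting occurrences of an element:

>>> t
(1, 5, 3, 6, 4, 2)
>>> t.count(3)
1

Slicing a tuple:

>>> t[0]
1
>>> t[2:]
(3, 6, 4, 2)
>>> t[2:3]
(3,)

Tuples and lists behave alike: any list operation that does not modify the list in place also works on tuples.

>>> (1, 2) + (3, 4)
(1, 2, 3, 4)
>>> (1, 2) * 4
(1, 2, 1, 2, 1, 2, 1, 2)
>>> len(t)
6

Sets

Introduction to sets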

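A set is an unordered collection of unique elements.
A set object is an unordered group of hashable values, so anything stored in a set could also serve as a dictionary key. Sets support membership tests with in and not in. Because a set is unordered, it cannot be indexed or sliced, and it has no keys for retrieving individual elements.

Set properties

- Elements of a set are unique, just like the keys of a dictionary
- Elements of a set must be hashable (in practice, immutable) objects

Creating sets

>>> s = set('a')
>>> a = set({'k1': 1, 'k2': 2})
>>> b = set(['y', 'e', 'd', 'o'])
>>> c = {'a', 'b', 'c'}
>>> d = {('a', 'b', 'c')}

Basic set operations

Set difference

# Elements that are in a but not in b
>>> a = {11, 22, 33}
>>> b = {11, 23, 45}
>>> a.difference(b)
{33, 22}
# The same difference, but a is updated in place and the method returns None
>>> a = {11, 22, 33}
>>> print(a.difference_update(b))
None
>>> a
{33, 22}

Removing elements from a set:

# discard() silently ignores an element that is not present
>>> a = {11, 22, 33}
>>> a.discard(11)
>>> a.discard(44)
>>> a
{33, 22}
# remove() raises KeyError when the element is not present
>>> a = {11, 22, 33}
>>> a.remove(11)
>>> a.remove(44)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 44
>>> a
{33, 22}
# pop() removes and returns an arbitrary element (a set has no "last" element)
>>> a = {11, 22, 33}
>>> a.pop()
33
>>> a
{11, 22}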
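The difference between discard, remove, and pop can be sketched in a script as follows. The names missing and popped are illustrative; which element pop() returns is not guaranteed, so the code does not rely on it.

```python
a = {11, 22, 33}

# discard() is a no-op when the element is absent
a.discard(44)

# remove() raises KeyError for an absent element; the key is in exc.args
missing = None
try:
    a.remove(44)
except KeyError as exc:
    missing = exc.args[0]

# pop() removes and returns some element; which one is an implementation detail
popped = a.pop()

print(missing)      # 44
print(popped in a)  # False, the popped element is gone
print(len(a))       # 2
```

Use discard when absence is acceptable and remove when a missing element indicates a bug that should surface as an exception.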

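Intersection:

# Return the intersection as a new set
>>> a = {1, 2, 3, 4}
>>> b = {6, 5, 4, 3}
>>> print(a.intersection(b))
{3, 4}
# Store the intersection back into a (the method returns None)
>>> print(a.intersection_update(b))
None
>>> a
{3, 4}

Set tests:

>>> a = {3, 4}
>>> b = {6, 5, 4, 3}
# isdisjoint() is True when a and b share no elements, False otherwise
>>> a.isdisjoint(b)
False
# Is a a subset of b?
>>> a.issubset(b)
True
# Is a a superset of b?
>>> a.issuperset(b)
False

Symmetric difference:

>>> a = {1, 2, 3, 4}
>>> b = {3, 4, 5, 6}
# Elements that are in exactly one of the two sets
>>> print(a.symmetric_difference(b))
{1, 2, 5, 6}
# The same result, stored back into a (the method returns None)
>>> print(a.symmetric_difference_update(b))
None
>>> a
{1, 2, 5, 6}

Union:

>>> a = {1, 2, 5, 6}
>>> b = {3, 4, 5, 6}
>>> print(a.union(b))
{1, 2, 3, 4, 5, 6}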
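Each of these methods also has an operator form when both operands are sets. The sketch below shows the correspondence; the result variable names are illustrative.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

union = a | b            # same as a.union(b)
intersection = a & b     # same as a.intersection(b)
difference = a - b       # same as a.difference(b)
symmetric = a ^ b        # same as a.symmetric_difference(b)

print(union)         # {1, 2, 3, 4, 5, 6}
print(intersection)  # {3, 4}
print(difference)    # {1, 2}
print(symmetric)     # {1, 2, 5, 6}
```

The methods accept any iterable as their argument, whereas the operators require both sides to be sets.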

Updating a set:

>>> a = {1, 2, 5, 6}
>>> b = {3, 4, 5, 6}
# Merge the elements of b into a, in place
>>> a.update(b)
>>> a
{1, 2, 3, 4, 5, 6}
# Add more elements to a from any iterable
>>> a.update([7, 8])
>>> a
{1, 2, 3, 4, 5, 6, 7, 8}

Converting a set:
A set can be converted to a list, a tuple, or a string.

>>> a = set(range(5))
>>> li = list(a)
>>> tu = tuple(a)
>>> st = str(a)
>>> print(li)
[0, 1, 2, 3, 4]
>>> print(tu)
(0, 1, 2, 3, 4)
>>> print(st)
{0, 1, 2, 3, 4}

Python files

Introduction to files

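In Python, a file object acts as a link to a file on the operating system.
File objects are used differently from the strings, lists, and other objects covered so far: they control input to and output from a file.
Python uses the open function to work with files.

Accessing files

Files are opened with the open function.
The basic form is: open(<file_address>[, access_mode])
The file address is given as a string. Windows paths use the backslash (\), so a raw string (prefix r) is handy to stop backslashes from being treated as escape characters.
For example:
open(r'C:\mydir\myfile')
The access mode is an optional argument; the default is r (read).
Every access mode also has a variant with b, which opens the file in binary mode.

File read/write modes

Mode  Description
r     Open for reading only. This is the default mode.
rb    Open in binary format for reading only.
r+    Open for reading and writing.
rb+   Open in binary format for reading and writing.
w     Open for writing only. An existing file is overwritten; a missing file is created.
wb    Open in binary format for writing only. An existing file is overwritten; a missing file is created.
w+    Open for reading and writing. An existing file is overwritten; a missing file is created.
wb+   Open in binary format for reading and writing. An existing file is overwritten; a missing file is created.
a     Open for appending. New content is added to the end of an existing file; a missing file is created.
ab    Open in binary format for appending.
a+    Open for reading and appending. New content is added to the end of an existing file; a missing file is created.
ab+   Open in binary format for reading and appending. An existing file is appended to; a missing file is created.

Using files

- Iteration is the best tool for reading lines, for example with a for loop.
- In text mode, content is read as strings, not as other objects; whatever you read from a file arrives as a string.
- Calling close is usually optional, but when you are done with a file, close() releases the link to it.
- Files are buffered and seekable. flush() or close() writes out whatever is in the buffer, and seek() moves to a given position.
- A file is referenced through a variable, just like any other object.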
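The first and third notes can be sketched together: iterate over the file with a for loop, and let a with block close it. This example writes to a temporary directory so it runs anywhere; the file name demo.txt and the variable names are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    # The with block closes (and therefore flushes) the file on exit
    with open(path, 'w') as f:
        f.write('hello,world\nwanger\nasdfgghh')

    lines = []
    with open(path) as f:
        # A file object is its own line iterator; each line keeps its '\n'
        for line in f:
            lines.append(line.rstrip('\n'))

print(lines)  # ['hello,world', 'wanger', 'asdfgghh']
```

Iterating reads one line at a time, so it also works for files too large to load with read() or readlines().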

Example

>>> file1 = open(r'D:\ruanjian\1.txt', 'w')
>>> file1.write('hello,world')
11
>>> file1.close()
>>> file1 = open(r'D:\ruanjian\1.txt')
>>> file1.read()
'hello,world'
# tell() returns the file pointer position; after a full read it is at the end
>>> file1.tell()
11
>>> file1.close()
>>> file1 = open(r'D:\ruanjian\1.txt')
>>> file1.seek(6)
6
>>> file1.read(5)
'world'
>>> file1.close()

Reading files

To read only the first five characters:

>>> file1 = open(r'D:\ruanjian\1.txt')
>>> file1.read(5)
'hello'
>>> file1.tell()
5
>>> file1.close()

To read line by line, use the readline and readlines methods:

>>> file1 = open(r'D:\ruanjian\1.txt')
>>> file1.readline()
'hello,world\n'
>>> file1.readline()
'wanger\n'
>>> file1.readline()
'asdfgghh'
>>> file1.readline()
''
>>> file1.close()
>>> file1 = open(r'D:\ruanjian\1.txt')
>>> file1.readlines()
['hello,world\n', 'wanger\n', 'asdfgghh']
>>> file1.close()

Writing files

To write to a file, use w mode. If the file already exists, its contents are overwritten; if it does not exist, a new file is created.

Basic writing

>>> file = open(r'D:\ruanjian\1.txt', 'w')
>>> file.write('hello,world')
11
>>> file.write('|wanger')
7
>>> file.flush()
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt')
>>> file.read()
'hello,world|wanger'
>>> file.close()

Here flush() writes the buffered content to disk. close() does the same thing before it closes the file.

Writing from a list:
writelines writes the elements of a list one after another. It does not add line breaks: if the strings do not end with a newline, the output will not contain any.

>>> lines = ['hello,world!\n', 'wanger\n', 'asdfgh\n']
>>> file = open(r'D:\ruanjian\1.txt', 'w')
>>> file.writelines(lines)
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt')
>>> file.read()
'hello,world!\nwanger\nasdfgh\n'
>>> file.close()
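Since writelines adds nothing between elements, strings without a trailing newline run together. The sketch below uses io.StringIO as an in-memory stand-in for a text file (an assumption made so the example needs no real path) and shows one way to add the newlines while writing.

```python
import io

names = ['hello,world!', 'wanger', 'asdfgh']  # no trailing newlines

# Written as-is, the elements are simply concatenated
buffer = io.StringIO()
buffer.writelines(names)
run_together = buffer.getvalue()

# A generator expression appends '\n' to each element on the way in
buffer = io.StringIO()
buffer.writelines(name + '\n' for name in names)
one_per_line = buffer.getvalue()

print(repr(run_together))  # 'hello,world!wangerasdfgh'
print(repr(one_per_line))  # 'hello,world!\nwanger\nasdfgh\n'
```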

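Writing at a specific position
After a mistake, you can move the pointer back to the start and write again. seek takes an offset and an optional second argument, whence. With whence omitted or 0, the offset is counted from the start of the file. With whence 1, it is relative to the current position; in text mode only seek(0, 1) is allowed, which leaves the pointer where it is and returns the position, just like tell. With whence 2, it is relative to the end of the file; in text mode only seek(0, 2) is allowed, which moves the pointer to the end.

>>> file = open(r'D:\ruanjian\1.txt', 'w')
>>> file.write('heelo')
5
>>> file.seek(0)
0
>>> file.write('hello')
5
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt')
>>> file.read()
'hello'
>>> file.close()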
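Non-zero offsets relative to the current position or the end are available in binary mode. The sketch below demonstrates the three whence values on an in-memory io.BytesIO object, used here as an assumed stand-in for a file opened with 'rb'; the variable names are illustrative.

```python
import io

data = io.BytesIO(b'hello,world')

# whence omitted (io.SEEK_SET, 0): offset counted from the start
data.seek(6)
first_two = data.read(2)          # pointer is now at 8

# io.SEEK_CUR (1): offset relative to the current position
data.seek(-2, io.SEEK_CUR)
position_after_cur = data.tell()  # back at 6

# io.SEEK_END (2): offset relative to the end of the data
data.seek(-5, io.SEEK_END)
last_five = data.read()

print(first_two)           # b'wo'
print(position_after_cur)  # 6
print(last_five)           # b'world'
```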

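Writing at the end
As shown earlier, w mode deletes everything in an existing file before writing. To add to the end without deleting the existing content, use a mode.

>>> file = open(r'D:\ruanjian\1.txt')
>>> file.read()
'hello'
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt', 'a')
>>> file.write('my name is wanger')
17
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt')
>>> file.read()
'hellomy name is wanger'
>>> file.close()

Among the modes, r+, w+, and a+ all allow both reading and writing.
r+ mode can only open an existing file. The existing content is kept, and the file can be read, written, and modified. On opening, the pointer is at the start of the file.
w+ mode creates the file if it does not exist; if it does exist, the existing content is overwritten.
a+ mode can open an existing file or create a new one. On opening, the pointer is at the end of the file. The pointer can be moved anywhere for reading, but every write first moves it to the end and writes there.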
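The behaviour of the three modes can be checked with a short script. This is a minimal sketch that writes to a temporary directory; the file names and result variable names are assumptions for illustration.

```python
import os
import tempfile

r_plus_missing_failed = False

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'modes.txt')  # hypothetical file name
    with open(path, 'w') as f:
        f.write('hello')

    # r+: content kept, pointer at the start, so the write overwrites in place
    with open(path, 'r+') as f:
        f.write('J')
        f.seek(0)
        after_r_plus = f.read()

    # a+: content kept; even after seek(0), the write lands at the end
    with open(path, 'a+') as f:
        f.seek(0)
        f.write('!')
        f.seek(0)
        after_a_plus = f.read()

    # w+: existing content is deleted as soon as the file is opened
    with open(path, 'w+') as f:
        on_w_plus_open = f.read()
        f.write('new')
        f.seek(0)
        after_w_plus = f.read()

    # r+ cannot create a file
    try:
        open(os.path.join(tmp_dir, 'missing.txt'), 'r+')
    except FileNotFoundError:
        r_plus_missing_failed = True

print(after_r_plus)   # Jello
print(after_a_plus)   # Jello!
print(after_w_plus)   # new
```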

Mode  Existing file     Missing file   Pointer on open  Pointer on write
r+    content kept      error raised   start of file    current position
w+    content deleted   file created   start of file    current position
a+    content kept      file created   end of file      end of file

File access: binary mode
In Python 2.x this mode makes little difference, because a 2.x str is already a sequence of bytes. In Python 3.x a str is Unicode text, so text mode accepts only str and binary mode accepts only bytes.

>>> cha = '啊'
>>> cha_b = cha.encode()
>>> file = open(r'D:\ruanjian\1.txt', 'w')
>>> file.write(cha)
1
>>> file.write(cha_b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: write() argument must be str, not bytes
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt')
>>> file.read()
'啊'
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt', 'wb')
>>> file.write(cha)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'
>>> file.write(cha_b)
3
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt', 'rb')
>>> file.read()
b'\xe5\x95\x8a'
>>> file.close()

Storing and loading native Python objects
A text file only accepts strings, so objects such as lists and dictionaries have to be converted to strings before they are written:

>>> file = open(r'D:\ruanjian\1.txt', 'w')
>>> file.write({'a': 97})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: write() argument must be str, not dict
>>> file.write(str({'a': 97}))
9
>>> file.write(str([1, 2]))
6
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt')
>>> file.read()
"{'a': 97}[1, 2]"
>>> file.close()

To get the original data type back when loading, use the pickle module. It serializes objects to bytes, so the file must be opened in binary mode:

>>> import pickle
>>> file = open(r'D:\ruanjian\1.txt', 'wb')
>>> a = {'a': 97}
>>> pickle.dump(a, file)
>>> file.close()
>>> file = open(r'D:\ruanjian\1.txt', 'rb')
>>> a_ = pickle.load(file)
>>> a_
{'a': 97}
>>> file.close()

Printing to a file

To send printed output straight to a file:

>>> with open(r'D:\ruanjian\1.txt', 'w') as f:
...     print('hello,world!', file=f)
...
>>> with open(r'D:\ruanjian\1.txt') as f:
...     f.read()
...
'hello,world!\n'

Checking whether a file exists and writing only if it does not

Because w mode clears an existing file before writing, and sometimes we do not want to overwrite the original, we can check first:

>>> import os
>>> if not os.path.exists(r'D:\ruanjian\1.txt'):
...     with open(r'D:\ruanjian\1.txt', 'wt') as f:
...         f.write('hello,world')
... else:
...     print('file already exists')
...
file already exists

In Python 3.x, the x mode does the same check for us: it raises an error if the file exists and creates the file if it does not.

>>> with open(r'D:\ruanjian\2.txt', 'xt') as f:
...     f.write('hello\n')
...
6
>>> with open(r'D:\ruanjian\2.txt', 'xt') as f:
...     f.write('hello\n')
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileExistsError: [Errno 17] File exists: 'D:\\ruanjian\\2.txt'

Reading and writing compressed files

Files can also be stored compressed, using the gzip or bz2 module. Both modules open files in binary mode by default, so pass wt, rt, and so on to select text mode. To read text back, open with rt and call read().
The compression level can be set with compresslevel, and the encoding, errors, and newline arguments of open are accepted as well.

>>> import gzip
>>> import bz2
>>> with gzip.open(r'D:\ruanjian\1.gz', 'wt') as f:
...     f.write('text')
...
4
>>> with gzip.open(r'D:\ruanjian\1.gz', 'rt') as f:
...     f.read()
...
'text'
>>> with bz2.open(r'D:\ruanjian\1.bz2', 'wt') as f:
...     f.write('hello,world')
...
11
>>> with bz2.open(r'D:\ruanjian\1.bz2', 'rt') as f:
...     f.read()
...
'hello,world'

Listing the files in a directory

This uses functions from the os module. Only a few simple ones are shown here:

>>> import os
>>> os.getcwd()
'/root/blog'
>>> os.listdir('.')
['_config.yml', 'node_modules', '.gitignore', 'source', 'db.json', 'themes', 'package.json', 'public', 'scaffolds', '.deploy_git']
# Keep only the entries that are regular files
>>> files = [file for file in os.listdir('.') if os.path.isfile(os.path.join('.', file))]
>>> files
['_config.yml', '.gitignore', 'db.json', 'package.json']
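The same filtering can be tried on a throwaway directory. The sketch below creates a temporary directory with two files and a subdirectory (all names are hypothetical) and then narrows the listing further by extension.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create two files and one subdirectory to list
    for name in ('a.txt', 'b.log'):
        with open(os.path.join(tmp_dir, name), 'w') as f:
            f.write('demo')
    os.mkdir(os.path.join(tmp_dir, 'subdir'))

    # listdir returns names only, in arbitrary order, so sort for stable output
    entries = sorted(os.listdir(tmp_dir))

    # isfile needs the full path, hence os.path.join
    files = sorted(
        name for name in os.listdir(tmp_dir)
        if os.path.isfile(os.path.join(tmp_dir, name))
    )

    # Further filter by file extension
    txt_files = [name for name in files if name.endswith('.txt')]

print(entries)    # ['a.txt', 'b.log', 'subdir']
print(files)      # ['a.txt', 'b.log']
print(txt_files)  # ['a.txt']
```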
