python数据结构之数字和字符串

2025-01-04 技术教程

python数据类型：Number（数字）String(字符串)List(列表)Dictonary（字典）Tuple(元组)sets(集合)

其中数字、字符串、元组是不可变的，列表、字典是可变的。
对不可变类型的变量重新赋值，实际上是重新创建一个不可变类型的对象，并将原来的变量重新指向新创建的对象（如果没有其他变量引用原有对象的话（即引用计数为0），原有对象就会被回收）。

数字int:整数
1.正负数
2.十六进制(表示方式为0x或者0X开头。例如：0xff)
3.八进制(表示方式为0o或者0O开头。例如：0o632457)
4.二进制 (表示方式为0b或者0B开头。例如：0b101100)fraction:分数float:浮点数complex:复数bool:布尔型(特殊的数值类型，只有True和False两个值)进制转换整数转其他进制

使用bin(i),oct(i),hex(i)函数可以将十进制数分别转换为二进制，八进制，十六进制

>>> s=10>>> bin(s)'0b1010'>>> oct(s)'0o12'>>> hex(s)'0xa'

使用int(str,base)可以将非十进制的数转换成整数
，其中str是文本形式的数字，base可以为2,8,16数字，分别代表二进制，八进制，十六进制，最高到36位，最低为2

>>> int('0b10010',2)18>>> int('0o52415',8)21773>>> int('0x134ab',16)79019>>> int('s',32)28>>> int('yz',36)1259

当然也可以进行16进制转二进制八进制，八进制可以转其他进制

>>> hex(0b1001)'0x9'>>> hex(0o1234)'0x29c'>>> oct(0b101)'0o5'>>> oct(0xff)'0o377'>>> bin(0xff)'0b11111111'>>> bin(0o7777)'0b111111111111'各类运算符算数运算符：+,-,*,/,%.//,**比较运算符：==，!=,>,<,<=,>=赋值运算符：=,+=,-=,*=,/=,%=,//=.**=位运算符：&,|,^,~,<<,>>逻辑运算符：and,or,not成员运算符：in,not in身份运算符：is,is not

>>> a=12>>> f=~a>>> f-13>>> bin(f)'-0b1101'>>> bin(a)'0b1100'>>> bin(a<<1)'0b11000'>>> bin(a>>1)'0b110'>>> list=[1,2,3,4,5]>>> a=3>>> print (a in list)True>>> print (a not in list)False>>> a=['1,2,3,4,5']>>> b=a>>> print (b is a )True>>> print (b is not a )False>>> b=a[:]>>> print (b is a) #这是因为字符串是不可变的False>>> print (id(a))42473480>>> print (id(b))42485000运算符优先级** （优先级最高的是幂运算）~,+,- (加和减都是一元运算符)*，/，%，//+,-<<,>>&^,|<=,>=,<,>==,!==,+=,-=,*=,/=,%=,//=,**=数学函数的应用power：幂函数，功能与运算符**一样

>>> pow(2,3)8sqrt：取当前数的平方根

>>> import math>>> math.sqrt(4)2.0max：最大值

>>>max(2,3,4,5,1,9,6)9min：最小值

>>> min(2,3,4,5,1,9,6)1abs与fabs：取绝对值，fabs取出的是浮点数

>>> abs(-1)1>>> math.fabs(-1)1.0round：四舍五入（当小数为5的时候会向靠近偶数的一端进）

>>> round(3.5)4>>> round(2.5)2>>> round(2.54)3>>> round(2.45)2ceil：向上取整

>>> math.ceil(1.7)2>>> math.ceil(1.3)2floor：向下取整

>>> math.floor(1.7)1>>> math.floor(1.3)1cmp：python2中的比较函数，当前面数值大返回-1，一样大返回0，后面数值大返回1

>>> cmp(1,2)-1>>> cmp(1,1)0>>> cmp(2,1)1随机数函数
- 取0-1之间的随机小数：

>>> import random>>> random.random()0.18001643527271916

- 取自定义数里的随机数：(可以取多个元素)

>>> random.choice([1,2,3,4,5])2>>> random.choice([1,2,3,4,5])3>>> random.sample([1,2,3,4,5,6,7,8,9],2)[3, 7]>>> random.sample([1,2,3,4,5,6,7,8,9],3)[4, 9, 3]

- 随机打乱顺序：

>>> a=[1,2,3,4,5,8]>>> random.shuffle(a)>>> a[1, 8, 2, 3, 4, 5]

- 获取N位随机数：(二进制)

>>> random.getrandbits(6)55>>> random.getrandbits(6)48>>> random.getrandbits(7)104modf：把浮点数的整数位和小数位单独取出来

>>> math.modf(1.4)(0.3999999999999999, 1.0)>>> math.modf(1.5)(0.5, 1.0)>>> math.modf(2.8)(0.7999999999999998, 2.0)>>> math.modf(3.1)(0.10000000000000009, 3.0)log：指数函数。默认e为底数，结果为浮点数。也可以自定义底数

>>> math.log(4,2)2.0>>> math.log2(4)2.0>>> math.log10(100)2.0>>> math.log(100,10)2.0格式化输出：格式化输出保留有效数字,格式化输出的是字符串

>>> s=format(2.345,'0.2f')>>> s>>> type (s)<class 'str'>>>> round(2.5)2>>> format(2.5,'0.0f')'2'Decimal模块：在使用浮点数的时候，因为计算机是使用二进制表示，所以会出现精度问题，可以使用Deciamal模块来解决精度问题

>>> a=4.2>>> b=2.1>>> a+b6.300000000000001>>> from decimal import Decimal>>> a=Decimal('2.1')>>> b=Decimal('4.2')>>> a+bDecimal('6.3')格式化输出——format：使用format进行进制转换

>>> a=20>>> bin(a)'0b10100'>>> oct(a)'0o24'>>> hex(a)'0x14'>>> format(a,'b')'10100'>>> format(a,'o')'24'>>> format(a,'x')'14'字符串

字符串(python2默认使用ascii编码，使用Unicode编码须在字符串前加u,python3使用unicode编码）
a='str'
a=u'str'

字符串表示方法单引号：'str' '1'双引号："str""1"三引号：'''...str...''' """...str..."""转义字符：“str1 \tadded tab\nstr2”

Raw字符串：r"C:\user\administrator"(无法进行转义操作)

字符串操作字符串合并

>>> 'abc'+'def''abcdef'>>> 'hello' *5'hellohellohellohellohello'>>> print ('-'*50)-------------------------------------------------->>> "aa""bb"'aabb'>>> 'ab''cd''abcd'字符串取值

a="text">>> for c in a:... print (c,end='')...text>>> for c in a:... print (c,end='-')...t-e-x-t->>> 'x' in aTrue>>> text='this_is_str'>>> text[0:4]'this'>>> text[5:7]'is'>>> text[:4]'this'>>> text[-3:]'str'>>> text[-12:-7]'this'>>> text[::2]'ti_***'>>> text[8:1:-2]'ss_i'字符串编码转换

>>> ord('d')100>>> chr(99)'c'>>> ord('王')29579>>> chr(29579)'王'字符串大小写转换这里利用ascii编码进行大小写转换

>>> Text="" #初始化Text>>> text="aSdFgHjK" >>> for i in text:... i_code=ord(i)... if 97<=i_code and i_code<=122:... Text+=chr(i_code-32)... else:... Text+=i...>>> Text'ASDFGHJK'>>> for x in text:... x_code=ord(x)... if 65<=x_code and x_code<=90:... Text+=chr(x_code+32)... else:... Text+=x...>>> Text'ASDFGHJKasdfghjk'这里利用字符串的方法进行转换

>>> str='asdFGHzxcVBN'>>> str.replace('asd','ASD')'ASDFGHzxcVBN'除此之外，还可以使用字符串的大小写方法进行大小写转换 ascii编码对照表
二进制十进制十六进制图形0010 00003220（空格）0010 00013321!0010 00103422"0010 00113523#0010 01003624$0010 01013725%0010 01103826&0010 01113927''0010 10004028(0010 10014129)0010 1010422A*0010 1011432B+0010 1100442C,0010 1101452D-0010 1110462E.0010 1111472F/0011 0000483000011 0001493110011 0010503220011 0011513330011 0100523440011 0101533550011 0110543660011 0111553770011 1000563880011 1001573990011 1010583A:0011 1011593B;0011 1100603C<0011 1101613D=0011 1110623E>0011 1111633F?0100 00006440@0100 00016541A0100 00106642B0100 00116743C0100 01006844D0100 01016945E0100 01107046F0100 01117147G0100 10007248H0100 10017349I0100 1010744AJ0100 1011754BK0100 1100764CL0100 1101774DM0100 1110784EN0100 1111794FO0101 00008050P0101 00018151Q0101 00108252R0101 00118353S0101 01008454T0101 01018555U0101 01108656V0101 01118757W0101 10008858X0101 10018959Y0101 1010905AZ0101 1011915B[0101 1100925C\0101 1101935D]0101 1110945E^0101 1111955F_0110 00009660`0110 00019761a0110 00109862b0110 00119963c0110 010010064d0110 010110165e0110 011010266f0110 011110367g0110 100010468h0110 100110569i0110 10101066Aj0110 10111076Bk0110 11001086Cl0110 11011096Dm0110 11101106En0110 11111116Fo0111 000011270p0111 000111371q0111 001011472r0111 001111573s0111 010011674t0111 010111775u0111 011011876v0111 011111977w0111 100012078x0111 100112179y0111 10101227Az0111 10111237B{0111 11001247C\0111 11011257D}0111 11101267E~字符串方法字符串大小写相关的方法capitalize()：字符串首字母大写

>>> str='hello world'>>> str.capitalize()'Hello world'title()：字符串中单词的首字母大写

>>> str.title()'Hello World'upper()：字符串转换成大写

>>> str.upper()'HELLO WORLD'lower()：字符串转换成小写

>>> str.lower()'hello world'swapcase()：字符串大小写互转

>>> str='HellO wORld'>>> str.swapcase()'hELLo WorLD'字符串排版相关的方法center()：居中对齐

>>> str='hello'>>> str.center(11)' helloo '>>> str.center(11,'_')'___helloo__'ljust()：居左对齐

>>> str.ljust(11,'_')'helloo_____'>>> str.ljust(11)'helloo rjust()：居右对齐

>>> str.rjust(11)' hello'>>> str.rjust(11,'_')'_____hello'expandtabs()：修改tab空格的个数

>>> str='hello\tworld'>>> print (str)hello world>>> str.expandtabs(9)'hello world'>>> str.expandtabs(4)'hello world'zfill()：将字符串扩充到指定长度，前面使用0填充

>>> str.zfill(20)'000000000hello\tworld'>>> 'sad'.zfill(10)'0000000sad'strip()：删除字符串两边(左边lstrip或右边rstrip)的指定字符（默认为空格和换行符）

>>> str=' hello world '>>> str.strip()'hello world'>>> str.lstrip()'hello world '>>> str.rstrip()' hello world'>>> str='hello,world'>>> str.strip('h')'ello,world'>>> str.strip('[held]')'o,wor'字符串查找相关的方法startswith(prefix[,start[,end]])/endswith(suffix[,start[,end]]) 判断是否以特定字符串开头或者结尾

>>> str='hello,world'>>> str.startswith('hello')True>>> str.startswith('hello',0,5)True>>> str.startswith('hello',1,5)False>>> str.endswith('rld',8)True>>> str.endswith('rld',9)False>>> str.endswith('rld',8,11)Truecount(sub[,start[,end]])：相应字符串在文本中的个数

>>> str='hello,world'>>> str.count('l')3>>> str.count('ll')1find/rfind()：分别从字符串前后开始查找第一个匹配到的字符串的位置,找不到就返回-1

str='hello,world'>>> str.find('l')2>>> str.rfind('l')9index/rindex()：与find方法类似，但是找不到会报错

>>> str.index('l')2>>> str.rindex('l')9>>> str.index('a')Traceback (most recent call last):File "<stdin>", line 1, in <module>ValueError: substring not found>>> str.find('a')replace(old,new[,count])：替换字符串，count代表替换个数

>>> str.replace('l','L')'heLLo,worLd'>>> str.replace('l','L',1)'heLlo,world'格式判断相关方法isalpha() ：判断是否是字母isdigit()：判断是否是数字isalnum()：判断是否是数字和字母islower()：判断是否有字母，且字母为小写字母isupper()：判断是否有字幕，且字母为大写字母isspace()：判断是不是只有空格和换行符号istitle()：判断字符串每个单词的首字母是否大写isdecimal()：判断是不是数字isnumeric()：判断是不是数字isidentifier()：判断字符能否成为标识符isprintable()：判断字符是否全部能打印的

isdigit、isdecimal、isnumeric三者的区别
isdigit()
True: Unicode数字，byte数字（单字节），全角数字（双字节），罗马数字
False: 汉字数字
Error: 无
isdecimal()
True: Unicode数字，，全角数字（双字节）
False: 罗马数字，汉字数字
Error: byte数字（单字节）
isnumeric()
True: Unicode数字，全角数字（双字节），罗马数字，汉字数字
False: 无
Error: byte数字（单字节）
字符串分隔split([sep[,maxsplit]])/rsplit([sep[,maxsplit]])：分别从左右按照sep字符串分隔，最多分隔maxsplit此，默认无数次
>> str='asd,fgh,jkl'>> str.split(',')['asd', 'fgh', 'jkl']>> str.rsplit(',',1)['asd,fgh', 'jkl']splitlines()以\n或者\r或者\n\r分隔
>> str='asd\nfgh\njkl'>> str.splitlines()['asd', 'fgh', 'jkl']partition(sep)：将分隔符也作为一个元素列出来
>> 'http://www.baidu.com'.partition('://')('http', '://', 'www.baidu.com')字符串其他方法join()：以特定的分隔符将字符串分隔
>> str='asdfg'>> '-'.join(str)'a-s-d-f-g'

字符串格式化输出

python字符串格式化输出的三种方式

使用字符串格式格式化操作符——百分号%使用字符串方法 format使用 f-strings进行字符串格式化使用%进行格式化

这种格式化表达式类似于C语言

格式化操作符（%）说明s获取传入对象的str方法的返回值，并将其格式化到指定位置r与s一样，但输出方式是repr方式，而不是strc整数：将数字转换成其unicode对应的值，10进制范围为 0<=i<=1114111（py27则只支持0-255）；字符：将字符添加到指定位置d有符号十进制（整数），将整数、浮点数转换成十进制表示，并将其格式化到指定位置i有符号整数u无符号整数o将整数转换成八进制表示，并将其格式化到指定位置x将整数转换成十六进制表示，并将其格式化到指定位置X与x一样，A-F是大写e浮点指数，将整数、浮点数转换成科学计数法，并将其格式化到指定位置（小写e）E与e一样，E为大写f将整数、浮点数转换成浮点数表示，并将其格式化到指定位置（默认保留小数点后6位）F浮点数十进制g浮点e或f，自动调整将整数、浮点数转换成浮点型或科学计数法表示G浮点E或F，自动调整将整数、浮点数转换成浮点型或科学计数法表示%当字符串中存在格式化标志时，需要用 %%表示一个百分号

注：Python中百分号格式化是不存在自动将整数转换成二进制表示的方式

举例

>>> "%s|%r|%c" %("this is str","this is repr","C")"this is str|'this is repr'|C">>> "%d|%i|%o|%x|%X|" %(3,5,12,13,14)'3|5|14|d|E|'>>> "%e|%E|%f|%F|%g|%G" %(1.5E3,1.5e3,13.5,13.5,1.5e13,13.5E15)'1.500000e+03|1.500000E+03|13.500000|13.500000|1.5e+13|1.35E+16'>>> "%(string)-10s"%({'string':'1'})'1 >>> "%(float)+10.2F"%({'float':3.1})' +3.10'>>> "%(float)-10.2f"%({'float':3.1})'3.10 '使用format方法

语法：{}.format(value)
参数:
（value):可以是整数，浮点数，字符串，字符甚至变量。
Returntype：返回一个格式化字符串，其值在占位符位置作为参数传递。

#位置参数>>> username='wanger'>>> password=123456>>> print ("{}'s password is {}".format(username,password))wanger's password is 123456 >>> username='wanger'>>> password=123456>>> print ("{1}'s password is {0}".format(password,username))wanger's password is 123456 #下标参数>>> si=['KB','MB','GB','TB','PB','EB','ZB','YB']>>> '1000{0[0]}=1{0[1]}'.format(si)'1000KB=1MB'#浮点数精度>>> '{:.4f}'.format(3.1415926)'3.1416'>>> '{:>10.4f}'.format(3.1415926)' 3.1416'>>> 'this is a test {t[0]}'.format(t='hello')'this is a test h'>>> 'this is a test {t[1]}'.format(t='hello')'this is a test e'#使用模块作为参数>>> import sys>>> sys.platform'win32'>>> "{0.platform}".format(sys)'win32'>>> 'my laptop platform is {s}'.format(s=sys.platform)'my laptop platform is win32'>>> 'my laptop platform is (s.platform)'.format(s=sys)'my laptop platform is (s.platform)'#关键字参数>>> 'my name is {name} ,age is {age}'.format(name='wanger',age='25')'my name is wanger ,age is 25

当占位符{}为空时，Python将按顺序替换通过str.format（）传递的值。
str.format（）方法中存在的值本质上是元组数据类型，元组中包含的每个单独值都可以通过索引号调用，索引号以索引号0开头。
第三段代码的变量si是一个列表，{0}就代表format()方法的第一个参数，那么{0[0]}就代表列表的第一个元素，{0[1]}就代表列表的第二个元素
这个例子说明格式说明符可以通过利用（类似） Python 的语法
访问到对象的元素或属性。这就叫做复合字段名 (compound field names) 。

以下复合字段名都是“ 有效的 ” 。
• 使用列表作为参数，并且通过下标索引来访问其元素（跟上
一例类似）
• 使用字典作为参数，并且通过键来访问其值
• 使用模块作为参数，并且通过名字来访问其变量及函数
• 使用类的实例作为参数，并且通过名字来访问其方法和属性
• 以上方法的任意组合

format_spec参数

表达式：format_spec ::= [[fill]align][sign][#][0][width][,][.precision][type]

fill和align以及后面的width相当于str方法中的center，ljust,rjust

>>> '{:+^15}'.format('start')'+++++start+++++'>>> '{:+^15}'.format('end')'++++++end++++++'>>> '{:*<15}'.format('end')'end************'>>> '{:*>15}'.format('start')'**********start'>>> '{:=+20}'.format(10)'+ 10'>>> print("{:=10}\n{:=+20}\n{:-^10}\n{:=-13}".format(10,10,'-',-15)) 10+ 10----------- 15#只有在数字显示里，显示二进制数，八进制数，十六进制数的时候，需要显示前面的0b,0o,0x的时候才会用到

>>> "{0:8b},{0:8o},{0:8x}".format(10)' 1010, 12, a'>>> "{0:b},{0:o},{0:x}".format(10)'1010,12,a'>>> ("{0:#8b},{0:#8o},{0:#8x}".format(10))' 0b1010, 0o12, 0xa',符号是表示数字时每三位中间加，

>>> '{:,}'.format(100000000000)'100,000,000,000'0是固定宽度前面补0.precision ::= integer(精度显示)

>>> '{:010.5}'.format(3.1415926)'00003.1416'type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%" (跟之前使用%表示的相等)
- 当为字符时：使用s，默认就是s
- 当为整数时：b,o,x和X是二进制、八进制、十六进制，c是数字按Unicode转换成字符，d是正常十进制，默认就是d。也可以使用n来代替d

>>> "{0:d},{0:b},{0:o},{0:x},{0:X}".format(10)'10,1010,12,a,A'

- 为浮点数时：e和E是指数，f和F是浮点数。g和G是同一的，也可以使用n来代替g, %是显示百分比

>>> "{0:e},{0:F},{0:g},{0:n},{0:%}".format(10.3)'1.030000e+01,10.300000,10.3,10.3,1030.000000%'使用f-strings方法进行格式化

f-strings也称为“格式化字符串文字”，f字符串是f在开头有一个字符串文字，其中以 {} 包含的表达式会进行值替换。表达式在运行时进行评估，然后使用format协议进行格式化。其中以 {} 包含的表达式会进行值替换。

特点代码简洁，没有多余的引号括号{}里面的变量，可以是字符串类型，也可以是整型、浮点型，或者是复杂类型，比如数组、词典等，会自动转换成成字符串形式。括号{}里面还可以是函数，比如 f'数组a的长度为:{len(a)}'。一句话，只要是位于 {} 中的，都会当做 python 代码来执行。但里面只能写表达式，不能写执行语句如{a=2}之类的。f-string在本质上并不是字符串常量，而是一个在运行时运算求值的表达式，速度非常快

简单举例

>>> name='wanger'>>> age=25>>> f"hello,I'm {name},my age {age} ""hello,I'm wanger,my age 25 "#也可以使用大写F>>> F"hello, I'm {name},my age {age} ""hello, I'm wanger,my age 25 "

当然也可以进行简单的计算

>>> f"{2*3}"'6'

也可以调用函数

>>> def test(input):... return input.lower()...>>> name='WangEr'>>> f"{test(name)} is funny"'wanger is funny'

还可以选择直接调用方法

>>> name='WangEr'>>> f"{name.lower()} is funny"'wanger is funny'

在使用字典的时候。如果要为字典的键使用单引号，请记住确保对包含键的f字符串使用双引号。

comedian = {'name': 'wanger', 'age': 25}f"The comedian is {comedian['name']}, aged {comedian['age']}."'The comedian is wanger, aged 25.'使用字符串的场景

使用多个界定符分隔字符串
split只能使用单一字符串，如果要使用多个分隔符的话，就要用到正则表达式模块了

>>> str='asd,dfg;zxc ert uio'>>> import re>>> re.split(r'[;,\s]\s*',str)['asd', 'dfg', 'zxc', 'ert', 'uio']

[]表示里面字符任意匹配。[;, ]表示；或者，或者空格，\s*表示任意个前面字符

字符串开头或结尾匹配
比如要看一个地址是否是http://或者ftp://开头
或者查看文件后缀是不是TXT格式
可以这样查看

>>> url='http://www.baidu.com'>>> ftp='ftp://www.baidu.com'>>> url.startswith(('http://','ftp://'))True>>> txt='ziyuan.txt'>>> txt.endswith('txt')True>>> url[0:7]=="http://" or url[0:6]=="ftp://"True>>> txt[7:10]=="txt"True用shell通配符
我们还可以使用shell通配符来检查文件的结尾，这需要用到fnmatch模块
fnmatch不区分大小写，fnmatchcase是区分大小写的

>>> from fnmatch import fnmatch,fnmatchcase>>> fnmatch('log.txt','*.txt')True>>> fnmatch('log.TXT','*.txt')True>>> fnmatchcase('log.TXT','*.txt')False>>> fnmatchcase('log.TXT','*.TXT')True

匹配和搜索特定格式的文本
普通匹配可以使用find方法，如果是特定格式的话还是会用到正则模块

>>> date1='2018/10/24'>>> date2='2018/12/21'>>> date3='2018-12-05'>>> def isdate(date):... if re.match(r'\d+/\d+/\d+',date):... print ('match OK')... else:... print ('not match')>>> isdate(date1)match OK>>> isdate(date2)match OK>>> isdate(date3)not match

在正则模块re中\d表示单个数字，+表示一个或多个前面的字段

搜索和替换特定的文本格式
普通的匹配可以使用replace方法，如果匹配特定格式，还是要用正则模块re

>>> import re>>> date='today is 13/12/2018'>>> re.sub(r'(\d+)/(\d+)/(\d+)',r'\3,-\2-\1',date)'today is 2018-12-13'>>> datepat=re.compile(r'(\d+)/(\d+)/(\d+)') #为了防止每次都要定义匹配模式，可以在这里定义一个匹配的变量，以后匹配直接调用这个变量>>> datepat.sub(r'\3,-\2-\1',date)'today is 2018-12-13'>>> date='yestory is 12/12/2018,today is 13/12/2018'>>> datepat.subn(r'\3,-\2-\1',date)('yestory is 2018-12-12,today is 2018-12-13', 2)

\1,\2,\3分别代表前面匹配模式中的第一个括号匹配到的，第二个括号匹配到的，第三个括号匹配到的，使用subn方法可以看到匹配到几次

忽略大小写的搜索替换
如果要忽略大小写还是要用到re模块，需要用到的是re的IGNORECASE方法

>>> import re>>> re.findall('replace','Replace,replace,REPLACE')['replace']>>> re.findall('replace','Replace,replace,REPLACE',flags=re.IGNORECASE)['Replace', 'replace', 'REPLACE']>>> str='Replace is the same as REPLACE'>>> re.sub('replace','WORD',str,flags=re.IGNORECASE)'WORD is the same as WORD'最短匹配模式
用正则表达式匹配某个文本模式，而他找到的是最长匹配，如果要匹配最短字符，可以用下面的方法

>>> strpat=re.compile(r'\"(.*)\"')>>> text='this is my "name"'>>> strpat.findall(text)['name']>>> text='this is my "name" and this is my "age"'>>> strpat.findall(text)['name" and this is my "age']>>> strpat=re.compile(r'\"(.*?)"')>>> strpat.findall(text)['name', 'age']