Common Regular Expression Matching Patterns - 觅稀奇 MeXiQi.COM

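1. Matching Chinese characters

Example code:

import re

s = '人生苦短，我学python！'

# In Python 3, str is already Unicode, so no encode/decode step is needed.

# Match Chinese characters

# Method 1: compare each character against the bounds of the range
s_ch = ''
for i in s:
    if '\u4e00' <= i <= '\u9fa5':
        s_ch += i
print(s_ch)

# Method 2: precompiled pattern
aa = re.compile('[\u4e00-\u9fa5]')
bb = aa.findall(s)
print(bb)
cc = ''.join(bb)
print(cc)

# Method 3: module-level re.findall
dd = re.findall('[\u4e00-\u9fa5]', s)
print(dd)
ff = ''.join(dd)
print(ff)

Output:

人生苦短我学
['人', '生', '苦', '短', '我', '学']
人生苦短我学
['人', '生', '苦', '短', '我', '学']
人生苦短我学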
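
The character-by-character results above can also be collected as whole runs by adding a `+` quantifier. The following is a minimal sketch, assuming "Chinese" means the same U+4E00 to U+9FA5 range used above; the helper names `extract_chinese_runs` and `contains_chinese` are made up for illustration.

```python
import re

# Assumption: same range as the article (U+4E00 to U+9FA5).
# CJK extension blocks are not covered by this range.
CHINESE_RUN = re.compile(r'[\u4e00-\u9fa5]+')

def extract_chinese_runs(text):
    """Return each maximal run of consecutive Chinese characters."""
    return CHINESE_RUN.findall(text)

def contains_chinese(text):
    """Return True if the text has at least one character in the range."""
    return CHINESE_RUN.search(text) is not None

print(extract_chinese_runs('人生苦短，我学python！'))
print(contains_chinese('python'))
```

The full-width comma splits the text into two runs because it lies outside the range.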

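2. Matching double-byte characters (including Chinese characters)

Example code:

import re

s = '人生苦短，我学python！'

# Match double-byte characters (including Chinese characters),
# that is, every character outside the range \x00-\xff

# Method 1: precompiled pattern
aa = re.compile(r'[^\x00-\xff]')
bb = aa.findall(s)
print(bb)
cc = ''.join(bb)
print(cc)

# Method 2: module-level re.findall
dd = re.findall(r'[^\x00-\xff]', s)
print(dd)
ff = ''.join(dd)
print(ff)

Output:

['人', '生', '苦', '短', '，', '我', '学', '！']
人生苦短，我学！
['人', '生', '苦', '短', '，', '我', '学', '！']
人生苦短，我学！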
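
"Double-byte" is a term from legacy encodings. On a Python 3 `str`, the pattern `[^\x00-\xff]` tests code points, so it matches anything above U+00FF. The sketch below compares it with the Chinese-only range from section 1; the helper `keep` and the sample strings are illustrative assumptions.

```python
import re

# Any character above U+00FF (outside Latin-1), as in the pattern above.
NON_LATIN1 = re.compile(r'[^\x00-\xff]')
# Only the Chinese range from section 1, for comparison.
CHINESE = re.compile(r'[\u4e00-\u9fa5]')

def keep(pattern, text):
    """Join every character of text that the pattern matches."""
    return ''.join(pattern.findall(text))

sample = '人生苦短，我学python！'
print(keep(NON_LATIN1, sample))  # ideographs plus full-width punctuation
print(keep(CHINESE, sample))     # ideographs only

# U+00E9 (e with acute) is inside Latin-1; U+03A9 (Greek Omega) is not.
print(keep(NON_LATIN1, 'caf\u00e9 \u03a9'))
```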

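3. Matching email addresses

Example code:

import re

s = "人生苦短，我学python！my email is: 123456789@qq.com"

# Match email addresses
# Local part: one or more allowed characters, optionally dot-separated.
# Domain: one or more dot-terminated labels followed by a final label.
pattern = r"[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*@(?:[\w](?:[\w-]*[\w])?\.)+[\w](?:[\w-]*[\w])?"

# Method 1: precompiled pattern
aa = re.compile(pattern)
bb = aa.findall(s)
print(bb)

# Method 2: module-level re.findall
cc = re.findall(pattern, s)
print(cc)

Output:

['123456789@qq.com']
['123456789@qq.com']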
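
One thing to watch for in Python 3 is that `\w` matches any Unicode word character, including Chinese, so an address that directly follows Chinese text can absorb it. The sketch below shows the effect of the `re.ASCII` flag. It assumes the email pattern with its `+` quantifiers in place; `is_email` is a hypothetical helper that validates a whole string with `fullmatch`.

```python
import re

# Assumption: the article's email pattern, with "+" quantifiers in place.
EMAIL = (r"[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*"
         r"@(?:[\w](?:[\w-]*[\w])?\.)+[\w](?:[\w-]*[\w])?")

UNICODE_EMAIL = re.compile(EMAIL)          # \w matches any Unicode word character
ASCII_EMAIL = re.compile(EMAIL, re.ASCII)  # \w restricted to [a-zA-Z0-9_]

def is_email(text):
    """Hypothetical helper: True if the whole string is a single address."""
    return ASCII_EMAIL.fullmatch(text) is not None

text = '联系我abc@qq.com'
print(UNICODE_EMAIL.findall(text))  # the Chinese prefix is part of the match
print(ASCII_EMAIL.findall(text))    # only the ASCII address is matched
print(is_email('123456789@qq.com'), is_email('a@b'))
```

`findall` suits extraction from free text, while `fullmatch` suits validating a single input field.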

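4. Matching URLs

Example code:

import re

s = "人生苦短，我学python！my website is: https://www.百度.com"

# Match URLs: a scheme made of letters, then "://", then everything up to
# the next whitespace character. The class is [a-zA-Z], not [a-zA-z]:
# the range A-z would also include the characters [ \ ] ^ _ `
pattern = r"[a-zA-Z]+://[^\s]*"

# Method 1: precompiled pattern
aa = re.compile(pattern)
bb = aa.findall(s)
print(bb)

# Method 2: module-level re.findall
cc = re.findall(pattern, s)
print(cc)

Output:

['https://www.百度.com']
['https://www.百度.com']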
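
The character class deserves a closer look: `A-z` spans the ASCII characters between `Z` and `a` as well, so an underscore can become part of the "scheme". The sketch below contrasts the two classes and then splits a matched URL with the standard library's `urlsplit`. The input strings and the names `LOOSE` and `STRICT` are illustrative assumptions.

```python
import re
from urllib.parse import urlsplit

LOOSE = re.compile(r'[a-zA-z]+://[^\s]*')   # A-z also covers [ \ ] ^ _ `
STRICT = re.compile(r'[a-zA-Z]+://[^\s]*')  # letters only in the scheme

print(LOOSE.findall('x_y://foo'))   # underscore accepted as part of the scheme
print(STRICT.findall('x_y://foo'))  # match starts after the underscore

# Once a URL is extracted, urlsplit breaks it into components.
url = STRICT.findall('my website is: https://www.百度.com')[0]
parts = urlsplit(url)
print(parts.scheme, parts.netloc)
```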

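5. Matching a website's title

Example code:

import re

import requests

url = 'https://pz.wendu.com/'

# Download the page and take its HTML as text
response = requests.get(url)
data = response.text
# print(data)

# Non-greedy capture of the text between <title> and </title>;
# [0] takes the first match and raises IndexError if the page has no title
res = re.findall(r'<title>(.*?)</title>', data)[0]
print(res)

Output: the text of the page's <title> element; the exact string depends on what the site serves at the time of the request.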
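
Real pages do not always write the tag in lowercase or keep the title on one line, and some have no title at all. The following is a small sketch of a more tolerant extraction that runs without network access; `extract_title` and the sample HTML are hypothetical and only for illustration.

```python
import re

# IGNORECASE accepts <TITLE>; DOTALL lets "." span line breaks in the title.
TITLE = re.compile(r'<title[^>]*>(.*?)</title>', re.IGNORECASE | re.DOTALL)

def extract_title(html):
    """Return the stripped <title> text, or None when no title is found."""
    match = TITLE.search(html)
    return match.group(1).strip() if match else None

# Hypothetical HTML used only for illustration (no request is made).
page = '<html><head><TITLE>\n  Example page\n</TITLE></head><body></body></html>'
print(extract_title(page))
print(extract_title('<html><body>no title here</body></html>'))
```

Returning `None` avoids the `IndexError` that indexing an empty `findall` result with `[0]` would raise.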

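6. Matching Chinese landline phone numbers

Example code:

import re

s = "人生苦短，我学python！my phone is: 0101-8758521"

# Match landline numbers: a 3-digit area code followed by 8 digits,
# or a 4-digit area code followed by 7 or 8 digits
pattern = r"\d{3}-\d{8}|\d{4}-\d{7,8}"

# Method 1: precompiled pattern
aa = re.compile(pattern)
bb = aa.findall(s)
print(bb)

# Method 2: module-level re.findall
cc = re.findall(pattern, s)
print(cc)

Output:

['0101-8758521']
['0101-8758521']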
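
Because the pattern is not anchored, it can also match inside a longer run of digits. The sketch below assumes such partial matches are unwanted and adds a negative lookbehind and a negative lookahead around the same alternatives. The names `LOOSE` and `STRICT` and the sample text are illustrative.

```python
import re

LOOSE = re.compile(r'\d{3}-\d{8}|\d{4}-\d{7,8}')
# (?<!\d) and (?!\d): the match must not touch another digit on either side.
STRICT = re.compile(r'(?<!\d)(?:\d{3}-\d{8}|\d{4}-\d{7,8})(?!\d)')

text = 'a 010-12345678 b 0101-8758521 c 123456-123456789'
print(LOOSE.findall(text))   # also picks a fragment out of the last token
print(STRICT.findall(text))  # only the two well-formed numbers
```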

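7. Matching mobile phone numbers

Example code:

import re

# In s1 the phone number is an 11-digit run; in s2 it sits inside a 12-digit run
s1 = 'num:12345678900,name:dgw,phone:19876543210,age:25'
s2 = 'num:12345678900,name:dgw,phone:119876543210,age:25'

# (?<=\D) requires a non-digit immediately before the number
aa = re.compile(r'(?<=\D)1[3456789]\d{9}', re.S)
bb = aa.findall(s1)
print(bb)

# Same pattern on s2: no match, because the candidate is preceded by a digit
cc = re.compile(r'(?<=\D)1[3456789]\d{9}', re.S)
dd = cc.findall(s2)
print(dd)

# Without the lookbehind, the pattern matches inside the 12-digit run
ee = re.compile(r'1[3456789]\d{9}', re.S)
ff = ee.findall(s2)
print(ff)

# (?<=\d) requires a digit immediately before the number
gg = re.compile(r'(?<=\d)1[3456789]\d{9}', re.S)
hh = gg.findall(s2)
print(hh)

Output:

['19876543210']
[]
['19876543210']
['19876543210']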
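
The lookbehind `(?<=\D)` needs an actual non-digit character before the number, so it cannot match a number at the very start of a string, and nothing in the pattern checks what follows the eleventh digit. The following is a sketch of a variant using negative lookarounds on both sides, assuming the goal is to accept only standalone 11-digit numbers; `find_mobiles` is a hypothetical helper name.

```python
import re

# (?<!\d): not preceded by a digit (also succeeds at the start of the string).
# (?!\d):  not followed by a digit.
MOBILE = re.compile(r'(?<!\d)1[3-9]\d{9}(?!\d)')

def find_mobiles(text):
    """Return standalone 11-digit numbers starting with 13 through 19."""
    return MOBILE.findall(text)

print(find_mobiles('phone:19876543210,age:25'))   # standalone number
print(find_mobiles('phone:119876543210,age:25'))  # extra leading digit
print(find_mobiles('198765432109'))               # extra trailing digit
print(find_mobiles('19876543210'))                # number at start of string
# The article's lookbehind variant on the same start-of-string input:
print(re.findall(r'(?<=\D)1[3456789]\d{9}', '19876543210'))
```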

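8. Checking whether a string contains a number

Example code:

import re

def has_number(string):
    # One digit anywhere in the string is enough
    pattern = re.compile(r'\d')
    return bool(pattern.search(string))

# Test
print(has_number('hello123'))  # True
print(has_number('hello'))     # False

Here \d matches any single digit (\d+ would match a run of one or more digits). The search method returns a match object for the first match, or None if there is none, so bool(...) yields True exactly when the string contains a digit.

Output:

True
False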
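
In Python 3, `\d` in a `str` pattern matches any Unicode decimal digit, not just `0-9`. If only ASCII digits should count, the `re.ASCII` flag narrows it. The following is a minimal sketch under that assumption; `has_ascii_number` is a hypothetical variant of the function above.

```python
import re

ANY_DIGIT = re.compile(r'\d')              # any Unicode decimal digit
ASCII_DIGIT = re.compile(r'\d', re.ASCII)  # only 0-9

def has_ascii_number(string):
    """Return True if the string contains at least one of the digits 0-9."""
    return ASCII_DIGIT.search(string) is not None

fullwidth = '\uff11\uff12\uff13'  # full-width digits 1, 2, 3
print(bool(ANY_DIGIT.search(fullwidth)))  # full-width digits count as \d
print(has_ascii_number(fullwidth))        # but not under re.ASCII
print(has_ascii_number('hello123'))
```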

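9. Checking whether a string contains non-standard characters

Example code:

import re

def has_nonstandard_char(string):
    # Here "non-standard" means any one of the characters &, @ or #
    pattern = re.compile(r'[&@#]')
    return bool(pattern.search(string))

# Test
print(has_nonstandard_char('hello#world'))  # True
print(has_nonstandard_char('hello,world'))  # False

Here [&@#] is a character set that matches any one of &, @ or #. The search method returns a match object for the first match, or None if there is none, so bool(...) yields True exactly when the string contains one of these characters.

Output:

True
False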
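
The set of characters is hard-coded above. If it needs to vary, the character class can be built from a string, with `re.escape` keeping characters such as `]`, `^` and `-` literal. The following is a sketch under that assumption; `make_char_checker` is a hypothetical helper and expects a non-empty string of characters.

```python
import re

def make_char_checker(chars):
    """Return a function that reports whether a string contains any of chars.

    Assumption: chars is a non-empty string; an empty class "[]" is not valid.
    """
    # re.escape keeps ']', '^', '-' and similar characters literal in the class.
    pattern = re.compile('[' + re.escape(chars) + ']')
    return lambda string: pattern.search(string) is not None

has_nonstandard_char = make_char_checker('&@#')
print(has_nonstandard_char('hello#world'))
print(has_nonstandard_char('hello,world'))

# Characters that are special inside a class are handled as literals too.
has_bracket_or_dash = make_char_checker('^-]')
print(has_bracket_or_dash('a-b'), has_bracket_or_dash('abc'))
```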