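Python: how can I replace only a specified space (the N-th one) in a string with a comma, rather than all of them?
Background:
I'm working on a small natural language processing project, spelling correction. I need to extract the word sequence from an English sentence, which means stripping out its commas and periods. Once that sequence has been processed, though, I need to restore the original sentence, i.e. put the commas and periods back. How can I do this? Would a regex work?
The specific problem:
Example: take the English sentence: I live python, love you. The code that extracts the word sequence looks like this: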
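line = 'I live python, love you.'
text = line.replace(', ', ' ')  # replace each comma-plus-space with a single space
words = [x for x in text.strip().strip('.').split(' ')]  # drop the trailing period, then split on spaces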
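This gives the sequence: I live python love you (as a list)
Then, after spelling correction: I love python love you
This is where the problem comes in.
I want to turn it back into the original sentence. How do I put the ',' and the '.' back?
Since only one sentence is processed at a time, the period can simply be appended at the end. But how do I add the comma?
My idea:
When I replace the commas with spaces, I would record how many commas were replaced and which space (by position) each one became. At the end I would use a regex to find that N-th space and replace it with a comma. Is this feasible? I'm not very fluent with regexes, so any pointers would be appreciated.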
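The idea described above can be sketched roughly as follows. This is only an illustration, assuming words are separated by single spaces and that spelling correction does not change the number of words; the helper names `comma_space_indices` and `replace_nth_space` are made up for this example.

```python
import re

def comma_space_indices(line):
    """Return the 1-based indices of the spaces that directly follow a comma."""
    indices = []
    for k, m in enumerate(re.finditer(" ", line), start=1):
        if m.start() > 0 and line[m.start() - 1] == ",":
            indices.append(k)
    return indices

def replace_nth_space(text, n, replacement=","):
    """Replace only the n-th space (1-based) in text; leave the others alone."""
    # Collect the start offset of every space in the string.
    positions = [m.start() for m in re.finditer(" ", text)]
    if not 1 <= n <= len(positions):
        return text  # no such space: return the input unchanged
    i = positions[n - 1]
    return text[:i] + replacement + text[i + 1:]

line = 'I live python, love you.'
remembered = comma_space_indices(line)   # which spaces had a comma in front
corrected = "I love python love you"     # word sequence after correction
for n in remembered:
    # ", " keeps one space in place, so later indices stay valid.
    corrected = replace_nth_space(corrected, n, ", ")
restored = corrected + "."
print(restored)
```

Replacing the space with ", " rather than a bare "," keeps the space count unchanged, which is what lets several remembered indices be applied one after another.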
-
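This does exactly what you asked: it separates the words from the symbols. words is the list of words, which you can modify however you like, and a replacement can be of any length. The symbols can be any punctuation. Once the words are modified, the two lists are merged back together.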
import re
s = "The hunt for the puma began in a small village where a woman picking blackberries saw 'a large cat' only five yards away from her. It immediately ran away when she saw it, and experts confirmed that a puma will not attack a human being unless it is cornered. The search proved difficult, for the puma was often observed at one place in the morning and at another place twenty miles away in the evening. Wherever it went, it left behind it a trail of dead deer and small animals like rabbits. Paw prints were seen in a number of places and puma fur was found clinging to bushes. Several people complained of \"cat-like noises\" at night and a businessman on a fishing trip saw the puma up a tree. The experts were now fully convinced that the animal was a puma, but where had it come from? As no pumas had been reported missing from any zoo in the country, this one must have been in the possession of a private collector and somehow managed to escape. The hunt went on for several weeks, but the puma was not caught. It is disturbing to think that a dangerous wild animal is still at large in the quiet countryside."
symb = re.split(r'[a-zA-Z]+', s)  # everything between the words (punctuation and spaces)
words = re.split(r'[^a-zA-Z]+', s)  # the words themselves
words[2] = "changed"  # make your modifications here
result = "".join(list(map(lambda x: x[0] + x[1], zip(symb, words))))
print(result)
Output:
The hunt changed the puma began in a small village where a woman picking blackberries saw 'a large cat' only five yards away from her. It immediately ran away when she saw it, and experts confirmed that a puma will not attack a human being unless it is cornered. The search proved difficult, for the puma was often observed at one place in the morning and at another place twenty miles away in the evening. Wherever it went, it left behind it a trail of dead deer and small animals like rabbits. Paw prints were seen in a number of places and puma fur was found clinging to bushes. Several people complained of "cat-like noises" at night and a businessman on a fishing trip saw the puma up a tree. The experts were now fully convinced that the animal was a puma, but where had it come from? As no pumas had been reported missing from any zoo in the country, this one must have been in the possession of a private collector and somehow managed to escape. The hunt went on for several weeks, but the puma was not caught. It is disturbing to think that a dangerous wild animal is still at large in the quiet countryside.
As you can see, the third word, "for", was replaced with "changed", and the punctuation is unchanged.
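One caveat: pairing the two lists with `zip` relies on the text starting with a letter. If it starts with punctuation (a quote, say), the word list gains a leading empty string and the pieces are stitched together in the wrong order. A possible variation, offered as an alternative sketch rather than as the answer's own method, is a single `re.split` with a capturing group, which keeps words and separators in one list in their original order. The sample sentence and the replacement below are made up for illustration.

```python
import re

s = "'Helo', she said, and waved."

# A capturing group makes re.split keep the matched words in the result,
# so parts alternates: separator, word, separator, word, ..., separator.
parts = re.split(r"([A-Za-z]+)", s)

# Words sit at the odd indices; separators (possibly empty) at the even ones.
words = parts[1::2]
words[0] = "Hello"  # stand-in for a spelling correction
parts[1::2] = words  # write the (same number of) words back in place

result = "".join(parts)
print(result)
```

Because the separators never leave the list, the leading quote needs no special handling.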
-
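Like this:
line = 'I live python, love you.'
line = line.replace(',', ' , ')  # put a space on both sides of each comma
line = line.replace('.', ' .')  # put a space before each period
words = line.split()
Correct the words here, then join them back up:
res = " ".join(words)
res = res.replace(' ,', ',')  # remove the space before each comma, keeping the one after it
res = res.replace(' .', '.')  # remove the space before each period
res is the desired result.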
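With this approach the commas and periods are tokens in `words` too, so the correction step should leave them alone. A minimal end-to-end sketch, assuming a hypothetical `correct_word` function standing in for the real spelling corrector:

```python
def correct_word(word):
    # Hypothetical stand-in for a real spelling corrector.
    fixes = {"live": "love"}
    return fixes.get(word, word)

line = "I live python, love you."

# Turn punctuation into separate tokens, then split on whitespace.
tokens = line.replace(",", " , ").replace(".", " .").split()

# Only alphabetic tokens are corrected; "," and "." pass through untouched.
tokens = [correct_word(t) if t.isalpha() else t for t in tokens]

# Rejoin and pull each punctuation mark back onto the preceding word.
res = " ".join(tokens).replace(" ,", ",").replace(" .", ".")
print(res)
```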