Judging by the code sample you added, the question you are trying to answer is how to replace ' ' with ', ' in every row of a pandas DataFrame.
Here is one way to do that:
import pandas as pd
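# Read the file without a header row, so every line becomes a data row
sampletxt = pd.read_csv('teste.csv', header=None)
# Replace each run of whitespace with ', ' in every cell
output = sampletxt.replace(r'\s+', ', ', regex=True)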
print(output)
Example:
In [24]: l
Out[24]:
['input phrase of the file to exemplify',
'input phrase of the file to exemplify 2',
'input phrase of the file to exemplify 4']
In [25]: sampletxt = pd.DataFrame(l)
In [26]: sampletxt
Out[26]:
0
0 input phrase of the file to exemplify
1 input phrase of the file to exemplify 2
2 input phrase of the file to exemplify 4
In [27]: output = sampletxt.replace(r'\s+', ', ', regex=True)
In [28]: output
Out[28]:
0
0 input, phrase, of, the, file, to, exemplify
1 input, phrase, of, the, file, to, exemplify, 2
2 input, phrase, of, the, file, to, exemplify, 4
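The session above can also be written as a standalone script. This is only a sketch: the rows list is made-up sample data standing in for the contents of the CSV file, and the column-wise variant with Series.str.replace is an alternative I am adding, not something from the question.

```python
import pandas as pd

# Hypothetical sample rows standing in for the lines of the CSV file.
rows = [
    "input phrase of the file to exemplify",
    "input phrase  of the file to exemplify 2",  # note the double space
]
sampletxt = pd.DataFrame(rows)

# Whole-frame replace: with regex=True the pattern is applied inside
# every string cell, so each run of whitespace becomes ', '.
output = sampletxt.replace(r"\s+", ", ", regex=True)

# Column-wise alternative using the string accessor on column 0.
column_output = sampletxt[0].str.replace(r"\s+", ", ", regex=True)

print(output)
```

The whole-frame form is convenient when every column holds text; the column-wise form limits the change to one column.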
OLD answer
You can also use re.sub(..), as shown below:
In [3]: import re
In [4]: st = "input phrase of the file to exemplify"
In [5]: re.sub(' ',', ', st)
Out[5]: 'input, phrase, of, the, file, to, exemplify'
Note that re.sub(...) is slower than str.replace(..):
In [6]: timeit re.sub(' ',', ', st)
100000 loops, best of 3: 1.74 µs per loop
In [7]: timeit st.replace(' ',', ')
1000000 loops, best of 3: 257 ns per loop
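If you want to repeat this comparison outside IPython, the standard timeit module can be used directly. A minimal sketch follows; the loop count n is an arbitrary choice, and the absolute numbers will vary by machine and Python version.

```python
import re
import timeit

st = "input phrase of the file to exemplify"

# Run each approach the same number of times and report the mean per call.
n = 100_000
t_sub = timeit.timeit(lambda: re.sub(" ", ", ", st), number=n)
t_replace = timeit.timeit(lambda: st.replace(" ", ", "), number=n)

print(f"re.sub:      {t_sub / n * 1e9:.0f} ns per call")
print(f"str.replace: {t_replace / n * 1e9:.0f} ns per call")
```

The lambda wrapper adds a small constant overhead to both measurements, so treat the results as a relative comparison rather than exact timings.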
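If two words are separated by more than one space, str.replace(' ', ', '), the approach used in all the other answers, produces the wrong output, and so does re.sub with a single-space pattern. For example: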
In [15]: st
Out[15]: 'input phrase of the file to  exemplify'
In [16]: re.sub(' ',', ', st)
Out[16]: 'input, phrase, of, the, file, to, , exemplify'
In [17]: st.replace(' ',', ')
Out[17]: 'input, phrase, of, the, file, to, , exemplify'
To fix this, use a regular expression that matches one or more whitespace characters, as follows:
In [22]: st
Out[22]: 'input phrase of the file to  exemplify'
In [23]: re.sub(r'\s+', ', ', st)
Out[23]: 'input, phrase, of, the, file, to, exemplify'
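The fix above can be wrapped in a small helper. This is a sketch under assumptions: the names spaces_to_commas and spaces_to_commas_noregex are hypothetical, and stripping leading and trailing whitespace first is my own addition so that the string does not start or end with a stray ', '. The second function shows a regex-free alternative based on str.split() with no arguments, which also treats any run of whitespace as one separator.

```python
import re


def spaces_to_commas(text: str) -> str:
    """Replace every run of whitespace in text with ', '."""
    # strip() is an added assumption: it avoids a leading or trailing ', '.
    return re.sub(r"\s+", ", ", text.strip())


def spaces_to_commas_noregex(text: str) -> str:
    """Same idea without re: split on whitespace runs, then rejoin."""
    return ", ".join(text.split())


st = "input phrase of the file to  exemplify"  # double space before 'exemplify'
print(spaces_to_commas(st))
print(spaces_to_commas_noregex(st))
```

For ordinary spaces and tabs the two functions give the same result; which one to use is mostly a matter of taste.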
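The replace() function shown in all the answers is what you are looking for. Note, however, that if there are multiple spaces between words you may get undesirable results. For example, 'a  b c'.replace(' ', ',') returns 'a,,b,c'. If that is not a problem for you, then you're fine. – Reti43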