Adding a comma (vírgula) between each word

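I have a text file with more than a thousand lines, and for a particular process I need the words separated by commas. I would like help developing this algorithm in Python, since I am just getting started with the language.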

INPUT

input phrase of the file to exemplify

OUTPUT

input, phrase, of, the, file, to, exemplify

I was thinking of something like this:

import pandas as pd 

sampletxt = pd.read_csv('teste.csv' , header = None) 
output = sampletxt.replace(" ", ", ") 

print output

2017-10-11 Rivaldo Hater

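The replace() function, shown in all the answers, is what you are looking for. Note, however, that you may get undesirable results if there are multiple spaces between words. For example, 'a  b c'.replace(' ', ', ') returns 'a, , b, c'. If that is not an issue for you, you are fine. – Reti43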

Based on the code sample you added, the question you are trying to answer is how to replace ' ' with ', ' in every row of a pandas DataFrame.

Here is one way to do it:

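import pandas as pd

# Read each line of the file into a one-column DataFrame (no header row)
sampletxt = pd.read_csv('teste.csv', header=None)
# Replace every run of whitespace inside each cell with ', '
output = sampletxt.replace(r'\s+', ', ', regex=True)
print(output)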

Example:

In [24]: l 
Out[24]: 
['input phrase of the file to exemplify', 
'input phrase of the file to exemplify 2', 
'input phrase of the file to exemplify 4'] 

In [25]: sampletxt = pd.DataFrame(l) 

In [26]: sampletxt 
Out[26]: 
                                         0
0    input phrase of the file to exemplify
1  input phrase of the file to exemplify 2
2  input phrase of the file to exemplify 4

In [27]: output = sampletxt.replace(r'\s+', ', ', regex=True)

In [28]: output 
Out[28]: 
                                                0
0     input, phrase, of, the, file, to, exemplify
1  input, phrase, of, the, file, to, exemplify, 2
2  input, phrase, of, the, file, to, exemplify, 4

OLD ANSWER

You can also use re.sub(..), as follows:

In [3]: import re 

In [4]: st = "input phrase of the file to exemplify" 

In [5]: re.sub(' ',', ', st) 
Out[5]: 'input, phrase, of, the, file, to, exemplify'

Note, however, that re.sub(...) is slower than str.replace(..), as these timings show:

In [6]: timeit re.sub(' ',', ', st) 
100000 loops, best of 3: 1.74 µs per loop 

In [7]: timeit st.replace(' ',', ') 
1000000 loops, best of 3: 257 ns per loop
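The %timeit figures above come from an IPython session. Outside IPython, the same comparison can be sketched with the standard timeit module. This is only a rough measurement under assumed settings (the loop count n is arbitrary), and the absolute numbers will vary with the machine and the Python version.

```python
import re
import timeit

st = "input phrase of the file to exemplify"

# For input with single spaces, both calls build the same string.
assert re.sub(" ", ", ", st) == st.replace(" ", ", ")

n = 100_000  # arbitrary number of repetitions, chosen for illustration
t_sub = timeit.timeit(lambda: re.sub(" ", ", ", st), number=n)
t_replace = timeit.timeit(lambda: st.replace(" ", ", "), number=n)

# Report the average cost of one call in nanoseconds.
print(f"re.sub:      {t_sub / n * 1e9:.0f} ns per call")
print(f"str.replace: {t_replace / n * 1e9:.0f} ns per call")
```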

If more than one space separates two words, str.replace(' ', ', '), which all the answers use, gives the wrong output. For example:

In [15]: st 
Out[15]: 'input phrase of the file to  exemplify' 

In [16]: re.sub(' ',', ', st) 
Out[16]: 'input, phrase, of, the, file, to, , exemplify' 

In [17]: st.replace(' ',', ') 
Out[17]: 'input, phrase, of, the, file, to, , exemplify'

To fix this, use a regular expression (regex) that matches one or more whitespace characters, as follows:

In [22]: st 
Out[22]: 'input phrase of the file to  exemplify' 

In [23]: re.sub(r'\s+', ', ', st) 
Out[23]: 'input, phrase, of, the, file, to, exemplify'
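One caveat when applying this pattern to lines read from a file: \s+ also matches tabs and the newline at the end of the line, so an unstripped line ends up with a trailing ', '. A minimal sketch of one way around that follows; comma_separate and WHITESPACE are hypothetical names introduced here for illustration only.

```python
import re

# Compile the pattern once so it can be reused for every line.
WHITESPACE = re.compile(r"\s+")

def comma_separate(line):
    """Replace each run of whitespace in line with ', ' (hypothetical helper)."""
    # strip() first, so a trailing newline does not become a trailing ', '.
    return WHITESPACE.sub(", ", line.strip())

line = "input phrase of the file to  exemplify\n"
print(repr(re.sub(r"\s+", ", ", line)))  # without strip(): ends with ', '
print(repr(comma_separate(line)))        # with strip(): no trailing separator
```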

2017-10-11 21:16:33 MedAli

Great explanation, thank you. –

the_list = entrada.split(" ")  # split the input string into a list of words on " "
saida = ", ".join(the_list)  # join all elements with ", " (join is a method of the separator string)

2017-10-11 21:08:33 Eqomatic

split() splits on whitespace by default. However, there is a difference between split() and split(' '), and the former is probably preferable. – Reti43
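The difference shows up as soon as two words are separated by more than one space. A small sketch, using a made-up sample string:

```python
text = "input  phrase of the file"  # note the two spaces after "input"

# split(" ") cuts at every single space, so the double space yields an empty string.
by_space = text.split(" ")
# split() with no argument treats any run of whitespace as one separator.
by_whitespace = text.split()

print(by_space)       # ['input', '', 'phrase', 'of', 'the', 'file']
print(by_whitespace)  # ['input', 'phrase', 'of', 'the', 'file']

# Joining the split() result gives clean comma-separated output.
joined = ", ".join(by_whitespace)
print(joined)         # input, phrase, of, the, file
```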

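With a few thousand lines, I imagine it would be a bit slow to split and join every line.. :/ – peyo

I would like to adapt this for use with my text file. –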

Your line is probably just a string, so you can use:

line.replace(" ",", ")

2017-10-11 21:09:26

Complexity-wise, you should replace the spaces with commas directly instead of traversing the phrase several times.

the_list = entrada.replace(' ', ', ')

2017-10-11 21:11:07

First, you need to read your input one line at a time. Then you simply use str.replace() like this:

sampletxt = "input phrase of the file to exemplify" 
output = sampletxt.replace(" ", ", ")

And you are done.
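Applied to a whole file, reading one line at a time might look like the sketch below. The function name add_commas_to_file and the file names are assumptions made for illustration. Note that it swaps str.replace() for split() plus join(), so that repeated spaces and the trailing newline do not produce stray commas.

```python
import os
import tempfile

def add_commas_to_file(src_path, dst_path):
    """Write src_path to dst_path with the words on each line separated by ', '."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # the file object yields one line at a time
            words = line.split()  # splits on any whitespace, drops the newline
            dst.write(", ".join(words) + "\n")

# Demonstration on a temporary file, so the sketch runs without existing data.
with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "teste.txt")   # hypothetical input name
    dst_file = os.path.join(tmp, "saida.txt")   # hypothetical output name
    with open(src_file, "w", encoding="utf-8") as f:
        f.write("input phrase of the file to exemplify\n")
    add_commas_to_file(src_file, dst_file)
    with open(dst_file, encoding="utf-8") as f:
        result = f.read()

print(result)
```

Because only one line is held in memory at a time, the same loop should cope with files far larger than a thousand lines.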

2017-10-11 21:12:22 peyo
