2013-12-10

Python: replace a word based on a condition

On standard input, I provide the following file:

#123  595739778  "neutral"  Won the match #getin
#164  595730008  "neutral"  Good girl

Data file #2 looks like this:

labels 1 0 -1
-1 0.272653 0.139626 0.587721
1 0.0977782 0.0748234 0.827398

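I want to look at the first value on each row of the data #2 file: if it is -1, replace the label with negative; if it is 1, with positive; and if it is 0, with neutral.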

Here are my problems:

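  1. Starting the data #2 file at its second line.
  2. I am having trouble with the replacement. I would like to replace it as shown below, but it raises an error saying it expects 1 more argument, even though I already pass 2 arguments.
  3. If I do it like the following (note the print statement):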

    if binary == "-1": 
        senti = str.replace(senti.strip('"'),"negative") 
    elif binary == "1": 
        senti = str.replace(senti.strip('"'),"positive") 
    elif binary == "0": 
        senti = str.replace(senti.strip('"'),"neutral") 
    print id, "\t", num, "\t", senti, "\t", sent 
    

    but if I do it like this instead (note where the print is), then it does not get out of the 'if condition':

    if binary == "-1": 
        senti = str.replace(senti.strip('"'),"negative") 
    elif binary == "1": 
        senti = str.replace(senti.strip('"'),"positive") 
    elif binary == "0": 
        senti = str.replace(senti.strip('"'),"neutral") 
    

    print id, "\t", num, "\t", senti, "\t", sent

How should I print it? The output I get:

#123  595739778  "neutral"  Won the match #getin
#164  595730008  "neutral"  Good girl

Expected output (replace should only substitute negative, positive, or neutral according to the data #2 file):

    #123  595739778  negative  Won the match #getin 
    #164  595730008  positive  Good girl 

Error:

Traceback (most recent call last):
  File "./combine.py", line 17, in <module>
    senti = str.replace(senti.strip('"'),"negative")
TypeError: replace() takes at least 2 arguments (1 given)

Here is my code:

for line in sys.stdin: 
    (id,num,senti,sent) = re.split("\t+",line.strip()) 
    tweet = re.split("\s+", sent.strip().lower()) 
    f = open("data#2.txt","r") 
    for line1 in f: 
     (binary,rest,rest1,test2) = re.split("\s", line1.strip()) 
     if binary == "-1": 
      senti = str.replace(senti.strip('"'),"negative") 
     elif binary == "1": 
      senti = str.replace(senti.strip('"'),"positive") 
     elif binary == "0": 
      senti = str.replace(senti.strip('"'),"neutral") 
     print id, "\t", num, "\t", senti, "\t", sent 

Can you post the error you are getting? – qmorgan

@qmorgan check my edit – fscore

Answers

You are actually missing an argument to replace. Because it is a method of the string itself, you can call it either way:

In [72]: str.replace('one','o','1') 
Out[72]: '1ne' 

In [73]: 'one'.replace('o','1') 
Out[73]: '1ne' 
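
To make the cause of the TypeError concrete, here is a minimal sketch. It is written for Python 3 (so `print` is a function), and the names `label`, `error`, and `fixed` are made up for illustration. When `replace` is called through the `str` class, the first argument is taken as the string to operate on, so the call in the question supplies only `old` and leaves `new` missing.

```python
label = '"neutral"'
error = None

# Called through the class: label.strip('"') is consumed as the string itself,
# "negative" becomes `old`, and `new` is missing, so a TypeError is raised.
try:
    str.replace(label.strip('"'), "negative")
except TypeError as exc:
    error = str(exc)

# Called on the instance: pass both `old` and `new` explicitly.
fixed = label.strip('"').replace("neutral", "negative")
print(fixed)
```

Note that `old` has to be text that actually occurs in the string; if it does not, `replace` returns the string unchanged.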

In your code, you would probably want something like:

if binary == "-1":
    # replace(old, new): old must be text that is actually present in senti
    senti = senti.strip('"').replace("neutral","negative")

To skip the first line of the data #2 file, one option is:

f = open("data#2.txt","r") 
for line1 in f.readlines()[1:]: # skip the first line 
    #rest of your code here 
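
Another possible way to skip the header, sketched here for Python 3, is to consume one line with the built-in `next` before looping, which avoids reading the whole file into a list. The `io.StringIO` object is only a stand-in for `open("data#2.txt")` so the snippet runs on its own, and `binaries` is a made-up name.

```python
import io

# Stand-in for open("data#2.txt"); the contents mirror the sample in the question.
f = io.StringIO(
    "labels 1 0 -1\n"
    "-1 0.272653 0.139626 0.587721\n"
    "1 0.0977782 0.0748234 0.827398\n"
)

next(f, None)  # consume the header line; the default avoids StopIteration on an empty file

binaries = []
for line1 in f:
    binary = line1.split()[0]  # first whitespace-separated field of the row
    binaries.append(binary)

print(binaries)
```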

Edit: after our chat conversation, I think what you want is more like the following:

import re
import sys

# load the label rows once, skipping the header line
f = open("data#2.txt","r")
datalines = f.readlines()[1:]
f.close()

count = 0

for line in sys.stdin:
    if count == len(datalines): break # kill the loop if we've reached the end
    (tweetid,num,senti,tweets) = re.split(r"\t+",line.strip())
    tweet = re.split(r"\s+", tweets.strip().lower())
    # grab the right index from our list
    (binary,rest,rest1,test2) = re.split(r"\s", datalines[count].strip())
    if binary == "-1":
        sentiment = "negative"
    elif binary == "1":
        sentiment = "positive"
    elif binary == "0":
        sentiment = "neutral"
    print tweetid, "\t", num, "\t", sentiment, "\t", tweets
    count += 1 # add to our counter
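
The same pairing idea can also be sketched without a manual counter. This is an illustrative variant, not the code from the chat: it targets Python 3, the names `LABELS` and `relabel` are hypothetical, and it assumes the tweet fields are separated by single tabs. `zip` stops at the shorter input, which plays the role of the `count == len(datalines)` check.

```python
# Map the first column of the label file to the word to print.
LABELS = {"-1": "negative", "1": "positive", "0": "neutral"}

def relabel(tweet_lines, label_lines):
    """Pair each tweet line with its label row and rewrite the sentiment field."""
    rows = iter(label_lines)
    next(rows, None)  # skip the header row of the label file
    out = []
    for tweet_line, label_line in zip(tweet_lines, rows):
        # Assumption: exactly three tabs separate the four tweet fields.
        tweetid, num, _old, text = tweet_line.rstrip("\n").split("\t", 3)
        binary = label_line.split()[0]
        out.append("\t".join([tweetid, num, LABELS[binary], text]))
    return out

tweets = [
    '#123\t595739778\t"neutral"\tWon the match #getin\n',
    '#164\t595730008\t"neutral"\tGood girl\n',
]
labels = [
    "labels 1 0 -1\n",
    "-1 0.272653 0.139626 0.587721\n",
    "1 0.0977782 0.0748234 0.827398\n",
]

result = relabel(tweets, labels)
for row in result:
    print(row)
```

In a real script, `sys.stdin` and an open file object could be passed in place of the two lists, since both are iterated line by line.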

Hi, that worked, but check #3 in my edit – fscore

I can't understand what you are saying here. Can you rephrase it? – qmorgan

Please check my edit – fscore