2016-03-04 88 views
-3
import re 

fp1 = open('stopwords.txt','r') 
stop = fp1.readline() 
#print(stop) 

def passstopwords(getstopwords): 
    stopword = getstopwords 
    #print(stopword) 
    fp = open('read1.txt', 'r') 
    line = fp.readline 
    while line: 
     line = fp.readline() 
     print(getstopwords) 
     line = re.sub(getstopwords, r'', line) 
     print(line) 
    fp.close() 
    return; 

passstopwords(stop) 

The output I get is the same line, unchanged. However, if I write 'somestring' instead of getstopwords, it works fine.

Replacing words in a string using Python
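A likely cause, assuming stopwords.txt holds one word per line: `fp1.readline()` returns the first line together with its trailing newline, so the pattern handed to `re.sub` is something like `Lorem\n`, which only matches when the word sits at the very end of a line. The sketch below uses made-up sample strings (not the asker's actual files) to show the difference that stripping makes.

```python
import re

# Stand-in for what fp1.readline() would return if the first line of
# stopwords.txt were "Lorem" (assumption): the newline is kept.
stop = "Lorem\n"
line = "Lorem Ipsum is simply dummy text\n"

# The pattern "Lorem\n" needs a newline right after "Lorem", so nothing matches.
unchanged = re.sub(stop, '', line)

# Stripping the surrounding whitespace gives the pattern "Lorem", which matches.
changed = re.sub(stop.strip(), '', line)

print(repr(unchanged))
print(repr(changed))
```

A hard-coded 'somestring' has no trailing newline, which would explain why the literal works while the value read from the file does not.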

+0

Please fix the indentation of the function. – Kasramvd

+1

Yes, please. The script as given won't run because it isn't indented correctly. Also, what input are you using? –

+1

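Please use better parameter names. It will make your code easier to understand. 'getstopwords' reads like a function I would call to get the stop words. If it holds a pattern, use 'stop_word_pattern'. –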

Answers

0

My input file is SAMPLE.TXT, which contains the following:

Lorem Ipsum is simply dummy text of the printing and typesetting industry. 
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, 
when an unknown printer took a galley of type and scrambled it to make a type 
specimen book. It has survived not only five centuries, but also the leap 
into electronic typesetting, remaining essentially unchanged. 
It was popularised in the 1960s with the release of Letraset sheets con 

The stopwords.txt file contains:

Lorem 
simply 
book 
printing 

The code is:

import re

# Read the stop words, one per line (each entry still ends with a newline).
with open('stopwords.txt', 'r') as fp1:
    lisOfStopWords = fp1.readlines()

def passstopwords(lisOfStopWords):
    # Strip the newlines and join the words into one alternation pattern.
    stopWordPattern = "|".join([x.strip() for x in lisOfStopWords])
    print("Stopwords:" + stopWordPattern)
    with open('SAMPLE.TXT', 'r') as fp:
        for line in fp:
            print("ORIGINAL:" + line.strip())
            # Remove every occurrence of any stop word from the line.
            line = re.sub(stopWordPattern, '', line)
            print("REPLACED:" + line)

passstopwords(lisOfStopWords)

The output is:

Stopwords:Lorem|simply|book|printing 
ORIGINAL:Lorem Ipsum is simply dummy text of the printing and typesetting industry. 
REPLACED: Ipsum is dummy text of the and typesetting industry. 

ORIGINAL:Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, 
REPLACED: Ipsum has been the industry's standard dummy text ever since the 1500s, 

ORIGINAL:when an unknown printer took a galley of type and scrambled it to make a type 
REPLACED:when an unknown printer took a galley of type and scrambled it to make a type 

ORIGINAL:specimen book. It has survived not only five centuries, but also the leap 
REPLACED:specimen . It has survived not only five centuries, but also the leap 

ORIGINAL:into electronic typesetting, remaining essentially unchanged. 
REPLACED:into electronic typesetting, remaining essentially unchanged. 

ORIGINAL:It was popularised in the 1960s with the release of Letraset sheets con 
REPLACED:It was popularised in the 1960s with the release of Letraset sheets con 

As you can see, every match of Lorem|simply|book|printing is replaced.
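One caveat with a plain alternation is that it also matches inside longer words (for example "book" inside "notebook") and treats regex metacharacters in a stop word as syntax. A minimal sketch of a stricter variant follows, assuming whole-word matching is what is wanted; the helper names `build_stop_word_pattern` and `remove_stop_words` are hypothetical and not part of the code above.

```python
import re

def build_stop_word_pattern(stop_words):
    """Compile a pattern that matches any stop word as a whole word."""
    # Drop surrounding whitespace and skip blank entries.
    words = [w.strip() for w in stop_words if w.strip()]
    # re.escape keeps characters such as '.' or '+' literal.
    alternation = "|".join(re.escape(w) for w in words)
    # \b anchors each alternative at word boundaries.
    return re.compile(r"\b(?:" + alternation + r")\b")

def remove_stop_words(text, pattern):
    """Return text with every whole-word stop word removed."""
    return pattern.sub("", text)

# Entries shaped like the result of readlines(), including a blank line.
pattern = build_stop_word_pattern(["Lorem\n", "simply\n", "book\n", "printing\n", "\n"])
cleaned = remove_stop_words("Lorem Ipsum is simply a notebook.", pattern)
print(repr(cleaned))
```

Here "notebook" is left alone because there is no word boundary before "book". The spaces around removed words stay in place, just as in the output above.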

+0

Thanks. But the problem is that I pass the value through a variable named "stopword" –

+0

@NameetNameet Could you please paste your complete code? – Kordi

+0

I have edited my earlier program. –
