2015-01-26 88 views
2

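Put inverted commas around strings using index, Python

I have a large number of strings in a text file, and I want to put inverted commas (double quotes) around each value, as shown below.

The text file contains many lines like this one:

{created_at : July 07, 2014, article : The Turkish government has drawn a roadmap for the return of militants of the banned PKK, who took up arms against the Turkish state in order to carve out a separate state in southeastern Turkey.}

I want to insert inverted commas around the date and the article content, using Python's index methods, like this:

{created_at : "July 07, 2014", article : "The Turkish government has drawn a roadmap for the return of militants of the banned PKK, who took up arms against the Turkish state in order to carve out a separate state in southeastern Turkey."}

But the result I get is {created_at : "July 07", 2014, article : "The Turkish government has drawn a roadmap for the return of militants of the banned PKK, who took up arms against the Turkish state in order to carve out a separate state in southeastern Turkey} .. so the quotes are placed in the wrong positions.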

Here is my code, which reads the file and writes the result:

f = open("textfile.txt", "r") 
for item in f: 
    first_comma_pos = item.find(",") 
    print first_comma_pos 
    first_colon_pos = item.find(" : ") 
    print first_colon_pos 
    second_comma_pos = item.find(",", first_comma_pos) 
    second_colon_pos = item.find(" : ", second_comma_pos) 
    print second_colon_pos 
    item = (item[:first_colon_pos+3] + 
     '"' + item[first_colon_pos+3:second_comma_pos] + '"' + 
     item[second_comma_pos:second_colon_pos+3] + 
     '"' + item[second_colon_pos+3:-1] + '"\n') 
    print item 
    saveFile= open("result.txt", "a") 
    saveFile.write(item) 
    saveFile.write('\n') 
    saveFile.close() 
+2

... and the question is ...? – 2015-01-26 19:22:01

+3

There are two problems with your question: 1) you don't say what the problem is, and 2) this may be an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). – Roberto 2015-01-26 19:22:33

+0

Updated the question. I don't get any error, but my code puts the inverted commas in the wrong positions, as shown in the question. – 2015-01-26 19:26:56

Answers

2

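You were pretty close, but there are 2 flaws:

  • The find call for the second comma starts at the position of the first comma itself, because you did not add one to the index. It therefore finds the first comma again.
  • Your closing " ended up outside the }, so it was out of place.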
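
The first flaw follows from how `str.find` treats its start argument: the search includes the start index itself. A minimal illustration on a shortened sample line (the variable names and the placeholder article text are made up for this sketch):

```python
line = "{created_at : July 07, 2014, article : Some article text, possibly with commas.}"

first_comma = line.find(",")
# The search range includes the start index, so this finds the same comma again.
same_comma = line.find(",", first_comma)
# Starting one character later moves on to the comma after the year.
second_comma = line.find(",", first_comma + 1)

print(first_comma, same_comma, second_comma)
print(repr(line[first_comma + 1:second_comma]))
```

The slice between the two distinct commas is the year, which is why the original code closed the quote after the day instead of after the year.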

Edited code

f = open("textfile.txt", "r") 
for item in f: 
    first_comma_pos = item.find(",") 
    print item 
    print first_comma_pos 
    first_colon_pos = item.find(" : ") 
    print first_colon_pos 
    second_comma_pos = item.find(",", first_comma_pos+1) # Note change 
    second_colon_pos = item.find(" : ", second_comma_pos) 
    print second_colon_pos 
    item = (item[:first_colon_pos+3] + 
     '"' + item[first_colon_pos+3:second_comma_pos] + '"' + 
     item[second_comma_pos:second_colon_pos+3] + 
     '"' + item[second_colon_pos+3:-2] + '"}\n') # Note change 
    print item 
    saveFile= open("result.txt", "a") 
    saveFile.write(item) 
    saveFile.write('\n') 
    saveFile.close() 

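Output

{created_at : "July 07, 2014", article : "The Turkish government has drawn a roadmap for the return of militants of the banned PKK, who took up arms against the Turkish state in order to carve out a separate state in southeastern Turkey."}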
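
The same index-based fix can also be wrapped in a function so that it does not depend on every line ending in exactly `}` plus a newline. This is only a sketch: `quote_fields` is a made-up name, the sample text is a placeholder, and it assumes each line has the shape `{created_at : <month day>, <year>, article : <text>}` with exactly one comma inside the date.

```python
def quote_fields(line):
    """Quote the date and article values of one '{created_at : ..., article : ...}' line."""
    body = line.strip()
    if body.endswith("}"):
        body = body[:-1]  # drop the closing brace; it is re-added below
    sep = " : "
    first_colon = body.find(sep)
    first_comma = body.find(",")
    # +1 so the search starts after the comma inside the date
    second_comma = body.find(",", first_comma + 1)
    second_colon = body.find(sep, second_comma)
    if min(first_colon, first_comma, second_comma, second_colon) < 0:
        raise ValueError("unexpected line format: %r" % line)
    date_start = first_colon + len(sep)
    text_start = second_colon + len(sep)
    return (body[:date_start] + '"' + body[date_start:second_comma] + '"'
            + body[second_comma:text_start] + '"' + body[text_start:] + '"}')


sample = "{created_at : July 07, 2014, article : Some article text, possibly with commas.}\n"
print(quote_fields(sample))
```

To process a whole file, it is usually enough to open the input and the output once each in a `with` block and write `quote_fields(line) + "\n"` for every line, rather than reopening result.txt on each iteration.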

+0

Nice work helping to correct his script ... probably of more educational value than my answer (+1) – 2015-01-26 19:43:42

2

Pretty hacky, but:

fix_json.py

import re, json
s = """{created_at : July 07, 2014, article : The Turkish government has drawn a roadmap for the return of militants of the banned PKK, who took up arms against the Turkish state in order to carve out a separate state in southeastern Turkey.}"""
parts0 = s.split(":")
data = {}
pairs = list(zip(parts0, parts0[1:]))
for i, (lhs, rhs) in enumerate(pairs):
    #: assume that the word directly preceding the ":" is the key
    #: word defined by regex below
    key = re.sub("[^a-zA-Z_]", "", lhs.rsplit(",", 1)[-1])
    if i < len(pairs) - 1:
        #: the text after the last comma is the next key, so drop it
        value = rhs.rsplit(",", 1)[0]
    else:
        #: the last value runs to the closing brace and may contain commas
        value = rhs.strip().rstrip("}")
    data[key] = value.strip()

print(json.dumps(data))

... and it makes some assumptions about your data, based on your example.

+1

This solution makes me very sad. – 2015-01-26 19:37:18

+2

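I don't disagree ... but given the information provided, I think this is what the OP is looking for .... The real answer is to adjust the other script so that it outputs valid data (JSON or any other serialized format) instead of writing your own serialization routine. – 2015-01-26 19:38:56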

+2

@JonKiparsky Don't be sad, here is [something](http://xkcd.com/) to cheer you up! – 2015-01-26 19:41:28
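
The suggestion made earlier in this comment thread, having the script that produces the file write valid JSON in the first place, might look like the sketch below. The `record` dictionary is a hypothetical stand-in for whatever data the producing script holds; that script is not shown in the question.

```python
import json

# Hypothetical record; the field names follow the sample line in the question.
record = {
    "created_at": "July 07, 2014",
    "article": "Some article text, possibly with commas.",
}

# json.dumps takes care of quoting and escaping, giving one JSON object per line.
json_line = json.dumps(record)
print(json_line)

# Reading the line back needs no manual index arithmetic.
restored = json.loads(json_line)
```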

2

If the data is always in that format, you can tokenize the pieces from the right, for example:

s = """{created_at : July 07, 2014, article : The Turkish government has drawn a roadmap for the return of militants of the banned PKK, who took up arms against the Turkish state in order to carve out a separate state in southeastern Turkey.}""" 

created_at, a_sep, article_text = s.strip('{}').rpartition('article :') 
start, c_sep, created_date = created_at.rpartition('created_at :') 
new_string = '{{{} "{}", {} "{}"}}'.format(
    c_sep, 
    created_date.strip(' ,'), 
    a_sep, 
    article_text.strip() 
) 

# {created_at : "July 07, 2014", article : "The Turkish government has drawn a roadmap for the return of militants of the banned PKK, who took up arms against the Turkish state in order to carve out a separate state in southeastern Turkey."}
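
Wrapped in a function, the right-to-left partitioning above can be applied to each line of a file. A sketch under the same assumption that the keys are always `created_at :` and `article :`, in that order; `quote_record` and the placeholder sample text are made up for illustration.

```python
def quote_record(line):
    """Quote both values by partitioning from the right on the known key names."""
    body = line.strip().strip("{}")
    rest, a_sep, article_text = body.rpartition("article :")
    _, c_sep, created_date = rest.rpartition("created_at :")
    # rpartition returns an empty separator when the key is not found
    if not a_sep or not c_sep:
        raise ValueError("unexpected line format: %r" % line)
    return '{{{} "{}", {} "{}"}}'.format(
        c_sep, created_date.strip(" ,"), a_sep, article_text.strip()
    )


lines = [
    "{created_at : July 07, 2014, article : Some article text, possibly with commas.}\n",
]
for line in lines:
    print(quote_record(line))
```

Because the split is keyed on the field names rather than on commas, commas inside the date or the article text do not matter here.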