Selecting specific information from a text file and converting it into arrays/lists in Python

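I have a project where I have to write code in Python, but as a beginner I find it very hard. I have never programmed in Python before and only started learning it by googling yesterday, so I thought maybe you could help, because I can't even get started on this problem.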

I am given an initial text file, let's call it input.txt, which contains data in the following form:

Thomas Hales 
12 2345 
45 6780 

Peter Lebones 
10 15430 
11 1230 
23 3450 
John White 
2 12130 
11 32410 
15 4520

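There are names, with numbers listed under them. For the purpose of this problem, the numbers in the left column are just identification numbers. The numbers in the right column are the amounts of money the people invested in a bank.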

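I am supposed to take all the data in the text file, manipulate it, and then create a new text file called output.txt (all of this done by a script run with Python). For the example above, it should contain this: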

Thomas Hales 45 
Peter Lebones 10 
John White 11

This is how far I got (but it doesn't work, and it is a mess; I wrote it with the help of someone else who didn't really know what he was doing either):

import sys 
import subprocess 
import re 
import string 


try: 
    fread=open(sys.argv[1]).readlines() 
except IOError: 
    print "There is no file like that!" 
    sys.exit() 
except IndexError: 
    print "There is no argumentum given" 
alpha = string.ascii_letters 
writeout=open("result.txt","w") 
inputarray=fread.readlines() 
for ... in inputarray: # not sure what goes in the "..." part 
    array=inputarray.split('\n') 
for i in range(len(array)-1): 
    if array[i].isalpha(): 
    writeout.write(array[i]+" ") 

fread.close() 
writeout.close()

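So basically, I am given a text file. For each person I should pick the highest investment and find the number in the left column that belongs to it. Then the script should produce an output.txt file containing each person's name and the "ID number" of their highest investment.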

Source

2013-04-12 user1966576

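I am assuming that when a line starts with a digit, we have an investment; otherwise it is a name.

Every time you find a name, write out the previous name and the identifier of its highest investment: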

import sys

with open(sys.argv[1]) as inputfile, open("result.txt", "w") as outputfile:
    name = None
    investment_id = max_investment = 0
    for line in inputfile:
        if not line.strip():
            continue  # skip empty lines

        if not line[:1].isdigit():  # name
            if name and investment_id:
                # write previous name
                outputfile.write('{} {}\n'.format(name, investment_id))
            name = line.strip()
            investment_id = max_investment = 0

        else:
            id, investment = [int(i) for i in line.split()]
            if investment > max_investment:
                max_investment = investment
                investment_id = id

    if name and investment_id:
        # write last name
        outputfile.write('{} {}\n'.format(name, investment_id))

For the sample input, this writes:

Thomas Hales 45 
Peter Lebones 10 
John White 11
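
Since the question title asks about getting the data into lists, here is a minimal sketch of a two-step variant: first collect each person's (id, amount) pairs into a list, then pick the largest. It is written for Python 3, the names parse_records and best_ids are made up for this illustration, and it relies on the same assumption that a line starting with a digit is an investment.

```python
def parse_records(lines):
    """Group (id, amount) pairs under the name line that precedes them."""
    records = []  # list of (name, [(id, amount), ...]) in file order
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if line[0].isdigit():
            if not records:
                continue  # ignore numbers that appear before any name
            ident, amount = (int(part) for part in line.split())
            records[-1][1].append((ident, amount))
        else:
            records.append((line, []))
    return records


def best_ids(records):
    """Return (name, id of the largest investment) for each person with data."""
    result = []
    for name, pairs in records:
        if pairs:
            ident, _amount = max(pairs, key=lambda pair: pair[1])
            result.append((name, ident))
    return result


# The sample data from the question, inlined so the sketch runs on its own.
sample = """Thomas Hales
12 2345
45 6780

Peter Lebones
10 15430
11 1230
23 3450
John White
2 12130
11 32410
15 4520
""".splitlines()

for name, ident in best_ids(parse_records(sample)):
    print(name, ident)
```

Keeping the parsed lists around makes it easy to answer other questions later (totals, smallest investment, and so on) without re-reading the file.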

Source

2013-04-12 15:10:21

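Thank you very much for your help. The only thing now is that I tried running it with execfile("filename.py", 'input.txt'), but it fails with a TypeError complaining about the str argument. I know it would normally be better to run it as python filename.py 'input.txt', but I am on Windows 7, and yesterday I tried opening files with python file.py and nothing worked. The file is in the same path as python.exe, so I don't know what is wrong. – user1966576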

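Hmm, I just read that you can't pass arguments with execfile... but I can't get the script to run the traditional python script.py arg1 way, and subprocess doesn't work either... I can run the file with execfile, but it says an argument is missing... – user1966576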

'subprocess' should work just fine on Windows. Make sure you use the *full* path to the file. –
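
As a rough illustration of the comment above, here is a minimal Python 3 sketch of launching a script with one argument through subprocess, using full paths for both the interpreter and the script. The throwaway script and its temporary location are invented for this example; in practice script_path would be the full path to your own file.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in script for this sketch; it just echoes its first argument.
script_source = "import sys\nprint(sys.argv[1])\n"

with tempfile.TemporaryDirectory() as folder:
    script_path = os.path.join(folder, "script.py")  # full path to the script
    with open(script_path, "w") as handle:
        handle.write(script_source)

    # sys.executable is the full path of the interpreter running this code,
    # so nothing depends on python being on the PATH.
    output = subprocess.check_output(
        [sys.executable, script_path, "input.txt"],
        universal_newlines=True,
    )

print(output.strip())
```

Passing the command as a list avoids quoting problems with paths that contain spaces.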

Perhaps this basic recipe for processing a file line by line will help you get started on the right foot.

import sys

file_name = sys.argv[1]

# This takes care of closing the file after we're done processing it.
with open(file_name) as file_handle:

    # Once we have a file handle, we can iterate over it.
    for line in file_handle:

        # This is where your real programming logic will go.
        # For now, we're just printing the input line.
        print line,

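I suspect you will also find split() useful, since it lets you break apart the lines of numbers. For example, you could try this to experiment with how it works: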

parts = line.split() 
print parts
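
To make that experiment concrete, here is a small sketch in Python 3 syntax showing what split() returns for the kinds of lines in input.txt and how the pieces can be turned into integers. The variable names are only for illustration.

```python
line = "45 6780 \n"           # a data line as read from the file
parts = line.split()           # no argument: split on any run of whitespace
print(parts)                   # ['45', '6780']

# The pieces are strings, so convert them before comparing amounts.
ident, amount = (int(p) for p in parts)
print(ident, amount)           # 45 6780

name_parts = "Thomas Hales \n".split()
print(name_parts)              # ['Thomas', 'Hales']
```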

Source

2013-04-12 15:13:16 FMc

Python's re module can give you a good launching pad: it simply breaks the lines into pieces that you can iterate over.

>>> results = re.findall(r"(\w+) (\w+)", buff, re.S)
>>> results
[('Thomas', 'Hales'), ('12', '2345'), ('45', '6780'), ('63', '3210'), ('Peter', 'Lebones'), ('10', '15430'), ('23', '3450'), ('John', 'White'), ('2', '12130'), ('11', '32410'), ('15', '4520')]
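
One possible way to build on this, sketched here for Python 3, is to match one line at a time so that names and investments can be told apart. The NUMBER_LINE pattern and the short buff string are assumptions made for this example, not part of the answer above.

```python
import re

# Hypothetical pattern for this sketch: an id, whitespace, then an amount.
NUMBER_LINE = re.compile(r"^\s*(\d+)\s+(\d+)\s*$")

buff = "Thomas Hales \n12 2345 \n45 6780 \n"

parsed = []
for line in buff.splitlines():
    match = NUMBER_LINE.match(line)
    if match:
        # Both groups are digit strings, so int() is safe here.
        parsed.append(("investment", int(match.group(1)), int(match.group(2))))
    elif line.strip():
        parsed.append(("name", line.strip()))

print(parsed)
```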

Source

2013-04-12 15:13:56 pyInTheSky

with open("input.txt", "r") as inp, open("output.txt", "w") as out:
    data = inp.readlines()
    # Assumes every record is exactly four lines: a name and three investments.
    for i in xrange(0, len(data), 4):
        name = data[i].strip()
        maxi = 0
        true_code = 0
        for item in data[i+1: i+4]:
            code, bal = item.strip().split(" ")
            code, bal = int(code), int(bal)
            if bal >= maxi:
                maxi = bal
                true_code = code
        out.write("%s %s\n" % (name, true_code))

Source

2013-04-12 15:24:17 Zangetsu

This gives a "need more than 1 value to unpack" error, but thanks. – user1966576

That is probably because there is more than one space character between the code and the balance, e.g. 11__1230 instead of 11_1230 – Zangetsu
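
A quick sketch (Python 3 syntax) of how split(" ") and split() differ on such lines. Note that the message quoted above is what Python 2 raises when only one item is available to unpack, which is what a blank line produces with split(" "), so a blank line in the input may be the actual trigger.

```python
line = "11  1230\n"    # two spaces between the code and the balance

print(line.strip().split(" "))    # ['11', '', '1230'] -> three items
print(line.strip().split())       # ['11', '1230']     -> two items

# A blank line yields too few items either way, so skip it before unpacking.
blank = "\n"
print(blank.strip().split(" "))   # ['']
print(blank.strip().split())      # []
```

Calling split() without an argument and skipping empty lines avoids both problems.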
