2013-02-07 115 views
-3

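Extracting columns from a text file

I want to extract data from a file named data.txt based on the contents of another file named list.txt. I need to extract $11 from data.txt whenever $1 and $2 of list.txt are both present in data.txt. $2 of list.txt and $4 of data.txt are the same field.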

contents of list.txt 

2aas p0877 
asds k9876 
651a kl098 

contents of data.txt 

2aas F DNK_ECTHA Q9XT6 12-208 192.0 250.0 198.0 104.00 78.80 99.0 108.0 97 5 
asds G DNK_DROME k9876 12-209 192.0 250.0 197.0 100.00 78.80 87.0 100.0 97 6 
1ot3 H DNK_DROME Q9bt6 11-208 142.0 256.0 194.0 106.00 78.80 97.0 100.0 97 5 
651a H DNK_ECTHA kl098 10-208 192.0 259.0 197.0 100.00 78.80 98.0 100.0 99 5 
2aas H pyp_DROME p0877 12-208 192.0 250.0 130.0 102.00 78.80 67.0 103.0 97 9 

desired output 

2aas p0877 67.0 
asds k9876 87.0 
651a kl098 98.0 
+4

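It would help if you showed what you have already tried. Then we can give more targeted advice, and it will look less like you are just asking others to do your work for you. –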

+0

python + awk = noway – Denis

Answers

1

I am assuming data.txt contains a list of data that you want to "query" using the entries in list.txt.

Here is a quick and dirty way to do it using Python:

# Create a data dict using data.txt 
with open("data.txt") as f: 
    # create generator of entries using non-empty lines in file 
    entries = (line.split() for line in f if line.strip()) 
    # create dict using ($1,$4) as key and $11 as value 
    data = dict(((d[0], d[3]), d[10]) for d in entries) 

# for each entry in list.txt, print out matching data 
with open("list.txt") as f: 
    entries = (tuple(line.split()) for line in f if line.strip()) 
    for e in entries: 
        if e in data: 
            # print $1, $2 and the matching $11 (Python 3 print function)
            print(e[0], e[1], data[e])

Running it in the same directory as the files:

$ python extract.py 
2aas p0877 67.0 
asds k9876 87.0 
651a kl098 98.0 

Or, for an awk solution:

$ awk 'FILENAME==ARGV[1] {pair[$1" "$4] = $11; next} ($1" "$2 in pair) {printf("%s\t%s\t%s\n", $1, $2, pair[$1" "$2])}' data.txt list.txt 
2aas p0877 67.0 
asds k9876 87.0 
651a kl098 98.0
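
If you want to reuse or test the lookup without touching the files, the same idea can be wrapped in a function. This is only a sketch: the name extract_column and the value_index parameter are made up for illustration, and the sample lines are copied from the question. One assumption differs from the script above: if the same ($1, $4) pair occurs more than once in the data, this version keeps the first value rather than the last.

```python
# Sample input copied from the question (only the rows needed for the demo).
DATA = """\
2aas F DNK_ECTHA Q9XT6 12-208 192.0 250.0 198.0 104.00 78.80 99.0 108.0 97 5
asds G DNK_DROME k9876 12-209 192.0 250.0 197.0 100.00 78.80 87.0 100.0 97 6
1ot3 H DNK_DROME Q9bt6 11-208 142.0 256.0 194.0 106.00 78.80 97.0 100.0 97 5
651a H DNK_ECTHA kl098 10-208 192.0 259.0 197.0 100.00 78.80 98.0 100.0 99 5
2aas H pyp_DROME p0877 12-208 192.0 250.0 130.0 102.00 78.80 67.0 103.0 97 9
"""

LIST = """\
2aas p0877
asds k9876
651a kl098
"""


def extract_column(data_lines, list_lines, value_index=10):
    """Return (col1, col2, value) tuples for list entries found in the data.

    Hypothetical helper: value_index is the 0-based column to extract,
    so the default of 10 corresponds to awk's $11.
    """
    lookup = {}
    for line in data_lines:
        fields = line.split()
        # skip blank or short lines that lack $4 or the requested column
        if len(fields) > max(3, value_index):
            # assumption: keep the first value seen for a repeated ($1, $4) key
            lookup.setdefault((fields[0], fields[3]), fields[value_index])

    rows = []
    for line in list_lines:
        key = tuple(line.split())
        if key in lookup:
            rows.append(key + (lookup[key],))
    return rows


for row in extract_column(DATA.splitlines(), LIST.splitlines()):
    print(*row)
```

With real files you would pass the open file objects instead of the split strings, since both iterate line by line.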