2016-06-12 73 views
-2

Search for strings (from file1.txt) in file2.txt

file1.txt contains user names, i.e.

tony 
peter 
john 
... 

file2.txt contains the user details, with exactly one line of details per user, i.e.

alice 20160102 1101 abc 
john 20120212 1110 zjc9 
mary 20140405 0100 few3 
peter 20140405 0001 io90 
tango 19090114 0011 n4-8 
tony 20150405 1001 ewdf 
zoe 20000211 0111 jn09 
... 

I want to extract from file2.txt the short list of users named in file1.txt, i.e.

john 20120212 1110 zjc9 
peter 20140405 0001 io90 
tony 20150405 1001 ewdf 

How can I do this with Python?

+0

If you start each line with four spaces, it gets rendered as code - or you can use the {} button in the Markdown editor to format the highlighted code. – AlBlue

+4

SO is neither a code-writing nor a tutorial service. Please learn [ask]. – jonrsharpe

+0

Please read about [Python files](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files) and [strings](http://www.learnpython.org/en/Basic_String_Operations). If you run into an error while programming, ask about it. – ravigadila

Answers

0

You can use .split(' '), assuming there is always a space between the name and the other information in file2.txt.

Here is an example:

UserList = []

with open("file1.txt", "r") as fuser:
    for UserLine in fuser:
        name = UserLine.strip()  # drop the trailing newline and any surrounding spaces
        if name:
            UserList.append(name)

InfoUserList = []
InfoList = []

with open("file2.txt", "r") as finfo:
    for InfoLine in finfo:
        line1 = InfoLine.split(' ')
        InfoList.append(InfoLine.rstrip())  # keep the full line without its newline
        InfoUserList.append(line1[0])  # take just the user name to compare it later

# Print the details line of every user listed in file1.txt
for user in UserList:
    for i in range(len(InfoUserList)):
        if user == InfoUserList[i]:
            print(InfoList[i])
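
The nested loop above compares every requested user against every line of file2.txt. As a minimal sketch of the same idea, the names can be put in a set so that each details line is checked once. The function name `shortlist` and the inline sample lists are assumptions made for illustration; they are not part of the original answer.

```python
def shortlist(user_lines, detail_lines):
    """Return the detail lines whose first field is one of the requested user names."""
    # Build a set of names for fast membership tests, ignoring blank lines.
    wanted = {line.strip() for line in user_lines if line.strip()}
    matches = []
    for line in detail_lines:
        fields = line.split()  # split on any run of whitespace
        if fields and fields[0] in wanted:
            matches.append(line.rstrip())
    return matches


# Inline copies of the sample data from the question (hypothetical stand-ins for the files).
users = ["tony \n", "peter \n", "john \n"]
details = [
    "alice 20160102 1101 abc \n",
    "john 20120212 1110 zjc9 \n",
    "mary 20140405 0100 few3 \n",
    "peter 20140405 0001 io90 \n",
    "tango 19090114 0011 n4-8 \n",
    "tony 20150405 1001 ewdf \n",
    "zoe 20000211 0111 jn09 \n",
]

for row in shortlist(users, details):
    print(row)
```

Because the function only iterates over its arguments, open file objects should work in place of the lists, for example `shortlist(open("file1.txt"), open("file2.txt"))`. The matches come out in file2.txt order, which is the order shown in the question.
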
0
import pandas as pd

# df1.txt and df2.txt are assumed to be copies of file1.txt and file2.txt
df1 = pd.read_csv('df1.txt', header=None)
df2 = pd.read_csv('df2.txt', header=None)
df1[0] = df1[0].str.strip()  # remove the whitespace around each user name
df2 = df2[0].str.strip().str.split(' ', expand=True)  # trim each line, then split it into one column per field
df = df1.merge(df2)  # inner join on the shared column 0 (the user name)

Out[26]:
       0         1     2     3
0   tony  20150405  1001  ewdf
1  peter  20140405  0001  io90
2   john  20120212  1110  zjc9
0

You can use pandas:

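import pandas as pd

# Read both files as space-separated tables without a header row
file1 = pd.read_csv('file1.txt', sep=' ', header=None)
file2 = pd.read_csv('file2.txt', sep=' ', header=None)

# Keep the rows of file2 whose first column (the user name) appears in file1
shortlist = file2.loc[file2[0].isin(file1[0])]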

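It gives the following result (the third column is parsed as an integer, so 0001 is displayed as 1; passing dtype=str to read_csv would preserve the leading zeros):

       0         1     2     3
1   john  20120212  1110  zjc9
3  peter  20140405     1  io90
5   tony  20150405  1001  ewdf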

The result above is a DataFrame; to convert it back to an array, just use shortlist.values.
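
As a hedged variation on the pandas approach, the sketch below reads every column as text so that values such as 0001 keep their leading zeros, and then converts the filtered rows to plain Python lists. The inline strings `FILE1` and `FILE2` are assumptions standing in for the real files, and the variable names are illustrative only.

```python
import io

import pandas as pd

# Inline copies of the sample data from the question (hypothetical stand-ins for the files).
FILE1 = "tony\npeter\njohn\n"
FILE2 = """alice 20160102 1101 abc
john 20120212 1110 zjc9
mary 20140405 0100 few3
peter 20140405 0001 io90
tango 19090114 0011 n4-8
tony 20150405 1001 ewdf
zoe 20000211 0111 jn09
"""

# dtype=str keeps every field as text, so "0001" is not turned into the integer 1.
names = pd.read_csv(io.StringIO(FILE1), header=None, dtype=str)[0].str.strip()
details = pd.read_csv(io.StringIO(FILE2), sep=r"\s+", header=None, dtype=str)

# Keep the rows whose first column is one of the requested names.
shortlist = details[details[0].isin(names)]

# Convert the DataFrame to a list of rows.
rows = shortlist.to_numpy().tolist()
print(shortlist.to_string(header=False, index=False))
```

With real files, the `io.StringIO(...)` arguments would be replaced by the file paths. Splitting on `\s+` rather than a single space is a design choice that tolerates repeated or trailing spaces in the data.
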
