2016-04-03

How can I merge two files using a Bash or Python script?

Let's say I have two files like this:

File 1:

F: user1 password1 
F: user2 password2 
F: user3 password3 

File 2:

server1 24000 
server2 24000 
server3 24000 
server4 24000 

I want to combine them to get an output file like this:

OutputFile:

C: server1 24000 user1 password1 
C: server2 24000 user1 password1 
C: server3 24000 user1 password1 
C: server4 24000 user1 password1 
C: server1 24000 user2 password2 
C: server2 24000 user2 password2 
C: server3 24000 user2 password2 
C: server4 24000 user2 password2 
C: server1 24000 user3 password3 
C: server2 24000 user3 password3 
C: server3 24000 user3 password3 
C: server4 24000 user3 password3 

On Windows I wrote this batch file to get the result I want, but I don't know how to do the same in Bash (the Bourne Again Shell) or in a Python script.

Batch file:

@echo off 
set file1=file1.txt 
set file2=file2.txt 
Set Output=Output_CCCam.cfg 
If Exist %Output% Del %Output% 
for /f "tokens=2 delims=:" %%a in ('Type "%file1%"') do (
    for /f "delims=" %%b in ('Type "%file2%"') do (
     >>%Output% echo C: %%b %%a 
    ) 
) 
Start "" Notepad %Output% 
+1

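I removed the 'batch-file' tag because it does not apply to your question. It refers to batch files in the specific context of the MS-DOS, Windows, or OS/2 operating systems. Please don't use tags just because they contain a similar-sounding name or phrase. Tags here have specific meanings. If you aren't sure, read the tag's description. If you're still not sure, don't use it; someone will add it for you if necessary. –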

+1

Do you have to do this in Bash, or would something like Python or AWK be acceptable? – pkacprzak

+0

@pkacprzak Python is acceptable! Thank you! I'll add the 'Python' tag along with 'Linux'. – Hackoo

Answers

3

A Bash solution

#!/bin/bash 

while IFS= read -r line1; do 
    while IFS= read -r line2; do 
     printf "C: %s %s\n" "$line2" "${line1/#F: }" 
    done < file2 
done < file1 

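This loops over file1 and, for each of its lines, loops over every line of file2. The printf line assembles the output, and the parameter expansion on line1 removes the leading "F: ".

Result: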

C: server1 24000 user1 password1 
C: server2 24000 user1 password1 
C: server3 24000 user1 password1 
C: server4 24000 user1 password1 
C: server1 24000 user2 password2 
C: server2 24000 user2 password2 
C: server3 24000 user2 password2 
C: server4 24000 user2 password2 
C: server1 24000 user3 password3 
C: server2 24000 user3 password3 
C: server3 24000 user3 password3 
C: server4 24000 user3 password3 
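
For readers more at home in Python, the expansion `${line1/#F: }` corresponds to removing a prefix only when it sits at the very start of the string. The following is a minimal sketch of that idea, not part of the original answer; the helper name `strip_marker` and the sample lines are made up for illustration.

```python
def strip_marker(line, marker="F: "):
    """Drop `marker` only if the line starts with it, like ${line/#F: } in Bash."""
    if line.startswith(marker):
        return line[len(marker):]
    return line


# Hypothetical sample data standing in for file1 and file2.
users = ["F: user1 password1", "F: user2 password2"]
servers = ["server1 24000", "server2 24000"]

# Same nesting as the shell loops: outer loop over users, inner loop over servers.
combined = [f"C: {server} {strip_marker(user)}"
            for user in users
            for server in servers]

for line in combined:
    print(line)
```

A marker that appears later in the line is left alone, which is the point of anchoring the match at the start.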

A solution with join and sed

This works as well:

join -j 50 -o 2.1,1.1 -t '~' file1 file2 | sed 's/~F: / /;s/^/C: /'

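This is a slight abuse of join. -j 50 tells join to match on field number 50, which does not exist and is therefore considered equal for all lines, resulting in the Cartesian product of the two files: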

$ join -j 50 file1 file2 
F: user1 password1 server1 24000 
F: user1 password1 server2 24000 
F: user1 password1 server3 24000 
F: user1 password1 server4 24000 
F: user2 password2 server1 24000 
F: user2 password2 server2 24000 
F: user2 password2 server3 24000 
F: user2 password2 server4 24000 
F: user3 password3 server1 24000 
F: user3 password3 server2 24000 
F: user3 password3 server3 24000 
F: user3 password3 server4 24000 
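
The same Cartesian product can be sketched in Python with `itertools.product` from the standard library. This only illustrates the concept and is not how join works internally; the sample lists are invented.

```python
from itertools import product

# Hypothetical sample data standing in for file1 and file2.
users = ["F: user1 password1", "F: user2 password2", "F: user3 password3"]
servers = ["server1 24000", "server2 24000"]

# product() varies its last argument fastest, so every server is paired
# with user1 before user2 appears, the same order as the join output.
pairs = list(product(users, servers))

for user, server in pairs:
    print(user, server)
```

With three user lines and two server lines this yields 3 x 2 = 6 pairs.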

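To get the fields of each line into the right order, we specify the output format with -o 2.1,1.1. Because the default field separator is whitespace, we use -t '~' to specify a character that does not occur in the input as the new delimiter: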

$ join -j 50 -o 2.1,1.1 -t '~' file1 file2 
server1 24000~F: user1 password1 
server2 24000~F: user1 password1 
server3 24000~F: user1 password1 
server4 24000~F: user1 password1 
server1 24000~F: user2 password2 
server2 24000~F: user2 password2 
server3 24000~F: user2 password2 
server4 24000~F: user2 password2 
server1 24000~F: user3 password3 
server2 24000~F: user3 password3 
server3 24000~F: user3 password3 
server4 24000~F: user3 password3 

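Finally, on each line we use sed to replace ~F: and the space that follows it with a single space, and to prepend C: to the line:

$ join -j 50 -o 2.1,1.1 -t '~' file1 file2 | sed 's/~F: / /;s/^/C: /'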
C: server1 24000 user1 password1 
C: server2 24000 user1 password1 
C: server3 24000 user1 password1 
C: server4 24000 user1 password1 
C: server1 24000 user2 password2 
C: server2 24000 user2 password2 
C: server3 24000 user2 password2 
C: server4 24000 user2 password2 
C: server1 24000 user3 password3 
C: server2 24000 user3 password3 
C: server3 24000 user3 password3 
C: server4 24000 user3 password3 

If the order of the lines doesn't matter, this can be shortened slightly to

$ join -j 50 file2 file1 | sed 's/F: //;s/^/C:/'
C: server1 24000 user1 password1 
C: server1 24000 user2 password2 
C: server1 24000 user3 password3 
C: server2 24000 user1 password1 
C: server2 24000 user2 password2 
C: server2 24000 user3 password3 
C: server3 24000 user1 password1 
C: server3 24000 user2 password2 
C: server3 24000 user3 password3 
C: server4 24000 user1 password1 
C: server4 24000 user2 password2 
C: server4 24000 user3 password3 
1

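I originally suggested using the paste utility. However, as Benjamin W. pointed out, the question asks for permutations (every line of one file paired with every line of the other), despite using the word "combine".

paste alone cannot produce these permutations, let alone remove the unwanted tokens, a requirement that is only apparent from the code snippet the question author provided. A Python 3 script that does what was asked follows.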

#!/usr/bin/env python3


def merge_lines(line_list_a, line_list_b):
    # For every line of a, emit one output line per line of b:
    # "C:", the fields of b, then the fields of a without the leading "F:".
    # Longer identifiers are used on purpose; readability matters more here
    # than a short line.
    return [' '.join(['C:'] + line_of_b.split() + line_of_a.split()[1:])
            for line_of_a in line_list_a
            for line_of_b in line_list_b]


def main():
    with open('file1.txt') as file_1, open('file2.txt') as file_2:
        merged = merge_lines(file_1.readlines(), file_2.readlines())
    with open('output.txt', 'w') as output_file:
        output_file.write('\n'.join(merged))
        output_file.write('\n')  # In text mode, Python converts '\n' to the system's line separator.


if __name__ == '__main__':
    main()
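
The difference between a paste-style merge and what the question asks for can be seen in a small sketch. Here `zip` stands in for line-by-line merging; this is an analogy, not how paste is implemented, and the sample lists are invented.

```python
# Hypothetical sample data, already stripped of the "F:" marker.
users = ["user1 password1", "user2 password2", "user3 password3"]
servers = ["server1 24000", "server2 24000", "server3 24000", "server4 24000"]

# Line-by-line merge: line i of one input next to line i of the other.
# zip stops at the shorter input, so server4 never appears here
# (paste itself would still print it, with the user column left empty).
side_by_side = [f"{server} {user}" for server, user in zip(servers, users)]

# What the question wants: every server paired with every user.
all_pairs = [f"{server} {user}" for user in users for server in servers]

print(len(side_by_side))
print(len(all_pairs))
```

The first list has one entry per matched line number, while the second has one entry per (user, server) pair.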
1

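This solution may not win a prize for the fewest characters typed, but I think it is fairly straightforward to understand. I assume your files are small enough to fit comfortably in memory all at once.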

#! /bin/bash 

## File names of the files we want to join. 

file_1st='file_1st.txt' 
file_2nd='file_2nd.txt' 

## Declare array variables to hold the lines of the data contained in the files. 

declare -a file_data_1st 
declare -a file_data_2nd 

## Read both files into memory. The `-t` option trims trailing newline 
## characters. The arrays will now contain the trimmed lines of each file. 

mapfile -t file_data_1st < "${file_1st}" 
mapfile -t file_data_2nd < "${file_2nd}" 

## Now iterate over the lines of the first file and, inside that loop, over the
## lines of the second file. Split both lines into whitespace-separated words
## and then re-assemble the output line as desired. This is a little more
## general than actually needed here (you don't really have to split the lines
## from the second file).

for line_1st in "${file_data_1st[@]}"
do
    # read -a splits on whitespace without the glob expansion that an
    # unquoted words=(${line}) assignment would perform.
    read -r -a words_1st <<< "${line_1st}"
    for line_2nd in "${file_data_2nd[@]}"
    do
        read -r -a words_2nd <<< "${line_2nd}"
        echo "C: ${words_2nd[0]} ${words_2nd[1]} ${words_1st[1]} ${words_1st[2]}"
    done
done
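
If the files were too large for that assumption to hold, only one of them would need to be kept in memory. A rough Python sketch of that idea follows; `combine` is a hypothetical helper, and `io.StringIO` objects stand in for the real files.

```python
import io


def combine(user_lines, server_lines):
    """Yield output lines lazily; only the server lines are held in memory."""
    servers = [line.split() for line in server_lines if line.strip()]
    for user_line in user_lines:       # consumed one line at a time
        fields = user_line.split()
        if not fields:
            continue                   # skip blank lines
        credentials = fields[1:]       # drop the leading "F:" token
        for server in servers:
            yield " ".join(["C:"] + server + credentials)


# In-memory stand-ins for the two input files.
file1 = io.StringIO("F: user1 password1\nF: user2 password2\n")
file2 = io.StringIO("server1 24000\nserver2 24000\n")

result = list(combine(file1, file2))
for line in result:
    print(line)
```

Because `combine` is a generator, its output could also be written to a file line by line instead of being collected into a list first.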
1

In Python, as I asked about in the comments:

filename1 = 'file1.txt'
filename2 = 'file2.txt'

# Drop the leading "F:" token from each user line.
with open(filename1, 'r') as fp:
    user_data = [line.split()[1:] for line in fp]

# Build real lists rather than map() objects: in Python 3 a map object is a
# one-shot iterator and would be exhausted after the first inner loop below.
with open(filename2, 'r') as fp:
    server_data = [line.split() for line in fp]

output_filename = 'file3.txt'

with open(output_filename, 'w') as fp:
    for user_row in user_data:
        for server_row in server_data:
            fp.write("C: %s %s\n" % (" ".join(server_row), " ".join(user_row)))
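
One detail matters when nested loops like these run under Python 3: `map()` returns a one-shot iterator there, whereas in Python 2 it returned a list. An iterator used as the inner sequence is empty from the second outer pass onward. A small sketch with invented sample data shows the effect.

```python
# Hypothetical sample data.
server_lines = ["server1 24000", "server2 24000"]
user_rows = [["user1", "password1"], ["user2", "password2"]]

# A map object can be traversed only once.
lazy_servers = map(str.split, server_lines)
lazy_count = 0
for user_row in user_rows:
    for server_row in lazy_servers:   # empty on the second outer iteration
        lazy_count += 1

# A list can be traversed as often as needed.
server_rows = list(map(str.split, server_lines))
list_count = 0
for user_row in user_rows:
    for server_row in server_rows:
        list_count += 1

print(lazy_count, list_count)
```

Wrapping the `map()` call in `list()`, or using a list comprehension, avoids silently losing every combination after the first user.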