2015-05-28 68 views

Reading multiple pieces of data from a text file

I am trying to read two pieces of data from a single text file. The file looks like this:

PaxHeader/data-science000755 777777 777777 00000000262 12525446741 015207 xustar00armourp000000 000000 18 gid=1050026054 
17 uid=488147323 
20 ctime=1431779590 
20 atime=1431779720 
38 LIBARCHIVE.creationtime=1431719347 
23 SCHILY.dev=16777218 
24 SCHILY.ino=110226037 
18 SCHILY.nlink=4 
data-science/000755 Äâ{Ä>ñ F00000000000 12525446741 013547 5ustar00armourp000000 000000 data-science/PaxHeader/merged-sensor-files.csv000644 777777 777777 00000000214 12525446724 021646 xustar00armourp000000 000000 18 gid=1050026054 
17 uid=488147323 
20 ctime=1431779590 
20 atime=1431779720 
23 SCHILY.dev=16777218 
24 SCHILY.ino=110226038 
18 SCHILY.nlink=1 
data-science/merged-sensor-files.csv000644 Äâ{Ä>ñ F00016452751 12525446724 020164 0ustar00armourp000000 000000 MTU, Time, Power, Cost, Voltage 
MTU1,05/11/2015 19:59:06,4.102,0.62,122.4 
MTU1,05/11/2015 19:59:05,4.089,0.62,122.3 
MTU1,05/11/2015 19:59:04,4.089,0.62,122.3 
MTU1,05/11/2015 19:59:06,4.089,0.62,122.3 
MTU1,05/11/2015 19:59:04,4.097,0.62,122.4 
MTU1,05/11/2015 19:59:03,4.097,0.62,122.4 
MTU1,05/11/2015 19:59:02,4.111,0.62,122.5 
MTU1,05/11/2015 19:59:03,4.111,0.62,122.5 
MTU1,05/11/2015 19:59:02,4.104,0.62,122.5 
MTU1,05/11/2015 19:59:01,4.090,0.62,122.4 
MTU1,05/11/2015 19:59:00,4.093,0.62,122.4 
MTU1,05/11/2015 19:58:59,4.112,0.62,122.5 
data-science/PaxHeader/weather.json000644 777777 777777 00000000214 12525446741 017610 xustar00armourp000000 000000 18 gid=1050026054 
17 uid=488147323 
20 ctime=1431779590 
20 atime=1431779720 
23 SCHILY.dev=16777218 
24 SCHILY.ino=110226039 
18 SCHILY.nlink=1 
data-science/weather.json000644 Äâ{Ä>ñ F00000000766 12525446741 016112 0ustar00armourp000000 000000 {"1431388800":"75.4","1431392400":"73.2","1431396000":"72.1","1431399600":"71.0", "1431403200":"70.7","1431406800":"69.6","1431410400":"69.0","1431414000":"68.8","1431417600":"69.2","1431421200":"67.9","1431424800":"68.6","1431428400":"68.7","1431432000":"72.1","1431435600":"76.2","1431439200":"80.1","1431442800":"80.7","1431446400":"80.9","1431450000":"83.3","1431453600":"84.5","1431457200":"85.1","1431460800":"87.0","1431464400":"84.2","1431468000":"84.4","1431471600":"83.0","1431475200":"81.1"} 

So basically I want to get values like the following

MTU, Time, Power, Cost, Voltage 
    MTU1,05/11/2015 19:59:06,4.102,0.62,122.4 

as a separate pandas DataFrame, and then the part below as a dictionary:

{"1431388800":"75.4","1431392400":"73.2","1431396000":"72.1","1431399600":"71.0", "1431403200":"70.7","1431406800":"69.6","1431410400":"69.0","1431414000":"68.8","1431417600":"69.2","1431421200":"67.9","1431424800":"68.6","1431428400":"68.7","1431432000":"72.1","1431435600":"76.2","1431439200":"80.1","1431442800":"80.7","1431446400":"80.9","1431450000":"83.3","1431453600":"84.5","1431457200":"85.1","1431460800":"87.0","1431464400":"84.2","1431468000":"84.4","1431471600":"83.0","1431475200":"81.1"} 

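I could manually cut and paste these two parts into separate files and read those, but I would like to automate this with regular expressions. I think I know how to work that part out, but when I read the whole file as text, I see the values below.

So I did this: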

f=open("file",'r').read() 
print(f) 

'PaxHeader/data-science\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00000755 \x00777777 \x00777777 \x0000000000262 12 

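That is the first few lines of the file. I am not sure why I see so many \x00. Is it because of whitespace, or because of some unrecognized characters?

Any ideas on how to get the desired result?

Thanks

Answers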

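n = 0  # zero-based index of the line you want
with open("file", "r") as handler:
    lines = handler.readlines()  # read every line into a list
print(lines[n])

n is the (zero-based) number of the line you want.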
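
The ustar and PaxHeader markers in the dump suggest that the file is really a tar archive rather than plain text. If so, the \x00 characters are the NUL padding that tar uses in its header fields and 512-byte blocks, and line numbers are not needed at all. The following is only a sketch under that assumption: it builds a small stand-in archive in memory (the member names mirror the dump, the contents are shortened), and the helper name read_members is made up for illustration. It uses the standard csv module so that it runs without extra packages.

```python
import csv
import io
import json
import tarfile

# Build a tiny stand-in archive in memory so the sketch runs on its own.
# Member names mirror the dump in the question; the contents are shortened.
csv_bytes = (
    b"MTU, Time, Power, Cost, Voltage\n"
    b"MTU1,05/11/2015 19:59:06,4.102,0.62,122.4\n"
    b"MTU1,05/11/2015 19:59:05,4.089,0.62,122.3\n"
)
json_bytes = b'{"1431388800":"75.4","1431392400":"73.2"}'

buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w", format=tarfile.PAX_FORMAT) as archive:
    for name, payload in [
        ("data-science/merged-sensor-files.csv", csv_bytes),
        ("data-science/weather.json", json_bytes),
    ]:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

raw = buffer.getvalue()
# Header fields and data blocks are padded with NUL bytes; these are what
# appear as \x00 when the archive is read as if it were plain text.
nul_count = raw.count(b"\x00")


def read_members(fileobj):
    """Hypothetical helper: return (csv_rows, weather_dict) from a tar archive."""
    rows, weather = [], {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directory entries
            data = archive.extractfile(member).read().decode("utf-8")
            if member.name.endswith(".csv"):
                # skipinitialspace copes with the "MTU, Time, ..." header spacing
                rows = list(csv.DictReader(io.StringIO(data), skipinitialspace=True))
            elif member.name.endswith(".json"):
                weather = json.loads(data)
    return rows, weather


rows, weather = read_members(io.BytesIO(raw))
print(rows[0]["Power"], weather["1431388800"])
```

For the real file, tarfile.open("file") would take the place of the in-memory buffer. With pandas installed, the object returned by extractfile can presumably be passed to pandas.read_csv to get the DataFrame directly, since read_csv accepts file-like objects.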

If the lines you need have a distinctive feature, you can also find their line numbers by searching for that feature. – stardust

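I don't understand what you mean. I can't keep typing in line numbers, because the file is very large. I only posted a small piece of the text. – user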

Is there any way to identify the lines you need? – stardust
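
One way to identify the lines by a distinctive feature, as suggested in the comments, is with regular expressions. This is a rough sketch under several assumptions that the thread does not confirm: the data rows always start a line with MTU followed by digits and a comma, the CSV header is exactly the one shown in the question, and the JSON object is flat (no nested braces) and opens with {". The text variable is a shortened stand-in for what open("file").read() returns, with \x00 padding modelled on the output in the question.

```python
import json
import re

# Shortened stand-in for the file contents read as text, NUL padding included.
pieces = [
    "data-science/merged-sensor-files.csv", "\x00" * 20, "0ustar\x0000", "\x00" * 10,
    "MTU, Time, Power, Cost, Voltage\n",
    "MTU1,05/11/2015 19:59:06,4.102,0.62,122.4\n",
    "MTU1,05/11/2015 19:59:05,4.089,0.62,122.3\n",
    "\x00" * 10,
    "data-science/weather.json", "\x00" * 20, "0ustar\x0000", "\x00" * 10,
    '{"1431388800":"75.4","1431392400":"73.2"}',
    "\x00" * 10,
]
text = "".join(pieces)

# Feature 1: the literal CSV header text (assumed to be fixed).
header = re.search(r"MTU, Time, Power, Cost, Voltage", text).group(0)

# Feature 2: data rows begin a line with "MTU<digits>," and stop at a
# newline or at the NUL padding that follows the last row.
rows = re.findall(r"^MTU\d+,[^\n\x00]*", text, flags=re.MULTILINE)

# Feature 3: a flat JSON object that opens with {" and contains no nested braces.
weather = json.loads(re.search(r'\{"[^{}\x00]*\}', text).group(0))

# Pair each row's fields with the column names from the header.
columns = [name.strip() for name in header.split(",")]
records = [dict(zip(columns, row.split(","))) for row in rows]
print(records[0], weather)
```

The list of dictionaries in records could then be handed to pandas.DataFrame to build the frame. Pattern matching like this is fragile when the surrounding bytes are binary, so a format-aware reader is likely the more robust choice if the file turns out to be an archive.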