Splitting data in a text file using Python and pandas
I have the following data from a CFD simulation:
Average value for X = 0.5080000265E-0003 to 0.2489200234E-0001
Z = -.3141592741E+0001
Time = 0.7000032425E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
0.7559523918E-0002 0.2565098906E+0006
0.1058333274E-0001 0.2564848125E+0006
0.1360714249E-0001 0.2564597656E+0006
0.1663095318E-0001 0.2564346563E+0006
0.1965476200E-0001 0.2564095625E+0006
... ...
... ...
0.1259419441E+0001 0.2549983125E+0006
0.1262443304E+0001 0.2549983125E+0006
0.1265467167E+0001 0.2549983125E+0006
0.1268491030E+0001 0.2549982656E+0006
Time = 0.7010014057E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
0.7559523918E-0002 0.2565098906E+0006
0.1058333274E-0001 0.2564848125E+0006
... ...
... ...
0.1259419441E+0001 0.2549983125E+0006
0.1262443304E+0001 0.2549983125E+0006
0.1265467167E+0001 0.2549983125E+0006
0.1268491030E+0001 0.2549982656E+0006
Time = 0.7020006657E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.1058333274E-0001 0.2564848125E+0006
... ...
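As you can see from the example above, the data is split into several vertical sections, each marked by a time-step header labeled Time. Within each section, Y does not change but P_g does. To plot the data, I need the P_g values from each section placed side by side, one column per section. For example, this is how I need to rearrange the data: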
Y 0.7000032425E+1 0.7020006657E+1 ...
0.1511904760E-0002 0.2565604063E+0006 0.2549982656E+0006 ...
0.4535714164E-0002 0.2565349844E+0006 0.2549982656E+0006 ...
0.7559523918E-0002 0.2565098906E+0006 0.2549982656E+0006 ...
0.1058333274E-0001 0.2564848125E+0006 0.2549982656E+0006 ...
0.1360714249E-0001 0.2564597656E+0006 0.2549982656E+0006 ...
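The rearrangement described above can be sketched without pandas, which may help to check the grouping logic on its own. This is only an illustration under stated assumptions: the name parse_sections and the SAMPLE string are hypothetical, SAMPLE holds a few rows copied from the listing above, and every section is assumed to list the same Y values in the same order.

```python
# A few lines copied from the listing above (two Time sections, two rows each).
SAMPLE = """\
Average value for X = 0.5080000265E-0003 to 0.2489200234E-0001
Z = -.3141592741E+0001
Time = 0.7000032425E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
Time = 0.7010014057E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
"""


def parse_sections(lines):
    """Return (y_values, columns), where columns maps each time to its P_g list.

    Hypothetical helper: assumes every section repeats the same Y values.
    """
    y_values, columns = [], {}
    current = None  # time of the section being read; None before the first one
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "Time":
            current = float(parts[2])
            columns[current] = []
            continue
        if current is None or len(parts) != 2:
            continue  # file header lines, blank lines
        try:
            y, p_g = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # the "Y P_g" header or any other non-numeric row
        if len(columns) == 1:
            y_values.append(y)  # take Y from the first section only
        columns[current].append(p_g)
    return y_values, columns


y_values, columns = parse_sections(SAMPLE.splitlines())
times = list(columns)

# Print the wide layout: one row per Y, one column per time step.
print("Y", *times)
for i, y in enumerate(y_values):
    print(y, *(columns[t][i] for t in times))
```

For the real file, the lines could come from an open file object instead of SAMPLE.splitlines().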
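Using pandas, I can read the data from the text file and create a new data frame with the Y values as the index (rows) and the Time values as the columns: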
import pandas as pd
# Read in data from text file
# -------------------------------------------------------------------------
# data frame from text file contents, skip first 4 rows, separate by variable
# white space, no header
df = pd.read_table('ROP_s_SD.dat', skiprows=4, sep=r'\s+', header=None)
# Time data
# -------------------------------------------------------------------------
# data frame of the rows that contain the Time string
dftime = df.loc[df[0].str.contains('Time')]
t = dftime[2].tolist() # time list
idx = dftime.index # index of rows containing Time string
# Y data
# -------------------------------------------------------------------------
# grab values for y to create index for new data frame
ido = idx[0] + 2 # index of first y value (skips the Time and "Y P_g" rows)
idf = idx[1] # index of the next Time row, one past the last y value
y = [] # empty list to store y values
for i in range(ido, idf): # iterate through first section of y values
    v = df.loc[i, 0] # get y value from data frame
    y.append(float(v)) # add y value to y list
# New data frame
# ------------------------------------------------------------------------
# empty data frame with y as index and t as columns
dfnew = pd.DataFrame(None, index=y, columns=t)
print('dfnew is \n', dfnew.head())
The head of the empty data frame, dfnew.head(), looks like this:
7.000032 7.010014 7.020007 7.030043 7.040020 7.050035 7.060043
0.001512 NaN NaN NaN NaN NaN NaN NaN
0.004536 NaN NaN NaN NaN NaN NaN NaN
0.007560 NaN NaN NaN NaN NaN NaN NaN
0.010583 NaN NaN NaN NaN NaN NaN NaN
0.013607 NaN NaN NaN NaN NaN NaN NaN
7.070004 7.080036 7.090022 ... 7.650011 7.660032 7.670026
0.001512 NaN NaN NaN ... NaN NaN NaN
0.004536 NaN NaN NaN ... NaN NaN NaN
0.007560 NaN NaN NaN ... NaN NaN NaN
0.010583 NaN NaN NaN ... NaN NaN NaN
0.013607 NaN NaN NaN ... NaN NaN NaN
7.680044 7.690029 7.700008 7.710012 7.720014 7.730019 7.740026
0.001512 NaN NaN NaN NaN NaN NaN NaN
0.004536 NaN NaN NaN NaN NaN NaN NaN
0.007560 NaN NaN NaN NaN NaN NaN NaN
0.010583 NaN NaN NaN NaN NaN NaN NaN
0.013607 NaN NaN NaN NaN NaN NaN NaN
[5 rows x 75 columns]
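The NaN values in each column should contain the P_g values from that particular Time section. How can I add the P_g values from each section to their respective columns?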
The text file I am reading can be downloaded here.
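One possible way to fill the columns, offered as a sketch and not as the accepted answer: label every data row with the time of its section, then pivot from long to wide form. The column names c0, c1, c2 and the SAMPLE string are assumptions made for illustration; SAMPLE holds a few rows copied from the listing above with the file header lines left out. For the actual file, the skiprows=4 used in the question would presumably still be needed.

```python
import io

import pandas as pd

# Two short sections copied from the question; file header lines are left out.
SAMPLE = """\
Time = 0.7000032425E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
Time = 0.7010014057E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
"""

# Read everything as strings. "Time = value" rows fill three columns,
# data rows fill two, and the missing third field becomes NaN.
raw = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", header=None,
                  names=["c0", "c1", "c2"], dtype=str)

# Carry the time of each "Time" row forward over the rows of its section.
is_time = raw["c0"] == "Time"
time = pd.to_numeric(raw["c2"], errors="coerce").where(is_time).ffill()

# Non-numeric rows ("Time ...", "Y P_g") turn into NaN and are dropped.
long = pd.DataFrame({
    "Y": pd.to_numeric(raw["c0"], errors="coerce"),
    "Time": time,
    "P_g": pd.to_numeric(raw["c1"], errors="coerce"),
}).dropna()

# One row per Y value, one column per time step.
dfnew = long.pivot(index="Y", columns="Time", values="P_g")
print(dfnew)
```

The pivot relies on the Y values being identical in every section, which the question states is the case. If some section were missing a Y value, the corresponding cell would simply stay NaN.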
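This works great! Thank you. If you have time, an example of plotting each row as a line would be helpful. The x-axis should be the time t and the y-axis should be the pressure P_g. – wigging 2015-02-12 17:48:39
Do you really want 420 individual lines? That might not be the best way to look at it... – Ajean 2015-02-12 19:29:16
@Gavin I added some plotting code. 420 individual lines would get pretty nasty, so I did it in 2D. – Ajean 2015-02-12 19:57:50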