Create a pandas DataFrame from 3 text files

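Here is my situation: I have 3 matrices from Matlab (X, Y, Z), each of size (126,321). X holds the x coordinates, Y the y coordinates, and Z is the efficiency of a machine as a function of the coordinates X and Y. I want to use the matrix Z in Python, so I saved Z to a text file, after first transposing it and rotating it by 90° (because the matrix in Matlab is not laid out the same way as the one in the plot). Then I saved the vector of x coordinates to one text file and the vector of y coordinates to another.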

So I have 3 text files:
- text1.txt with dimensions (126,321) (this is Z)
- text2.txt with a single line of 126 values
- text3.txt with a single line of 321 values

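What I want to do is create a pandas DataFrame with the data from text1, the values from text2 as the index, and the values from text3 as the header.

I wrote the following code: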

Efficiency=pd.read_csv('text1.txt',sep=';',header=None,index_col=False) 
x=pd.read_csv('text3.txt',sep=';',header=None,index_col=False) 
y=pd.read_csv('text2.txt',sep=';',header=None,index_col=False) 
Efficiency.columns=x 
Efficiency.index=y

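But the last two lines do not work. I also tried going through numpy, but the result was no better.

So if you have any explanation or solution, please tell me!

Thanks a lot.

Source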

2017-08-23 Nathan

Have a look at the pandas concat function: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html –

import numpy as np
import pandas as pd

df1=pd.DataFrame(np.random.randint(0,100,126)) 

df2=pd.DataFrame(np.random.randint(322,1000,321))# The problem is that at least two column names are equal and thus it throws an error

You can use this to look for duplicate values. It should work the same way for your data.

duplicates=df2.duplicated() 
print(df2[duplicates]) 

    0 
22 828 
30 575 
41 341 
55 713 
75 341 
80 353 
92 759 
117 520 
118 330 
126 828 
130 547 
134 927 
142 451 
150 778 
155 417

....
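As a smaller, deterministic version of the duplicate check above, here is a minimal sketch. The values in `x` are made up for illustration and are not the asker's coordinates.

```python
import pandas as pd

# Hypothetical x coordinates; 341 appears twice on purpose.
x = pd.Series([828, 575, 341, 713, 341])

# duplicated() marks every repeat after the first occurrence.
repeated_values = x[x.duplicated()].tolist()

# keep=False marks all members of a duplicated group, including the first one.
all_members = x[x.duplicated(keep=False)].index.tolist()

# An Index built from these values is allowed, but it is not unique.
labels_unique = pd.Index(x).is_unique

print(repeated_values, all_members, labels_unique)
```

Duplicate labels are not forbidden in pandas, but selecting by such a label returns several columns instead of one, which is usually what causes trouble later.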

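Because dropping values or changing values is not an option for you, a convenient approach is to use a MultiIndex, where the x values are in the first level and the second level holds the numbers from 0 up to your number of columns.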

mcols=pd.MultiIndex.from_arrays([np.random.randint(322,1000,321),np.linspace(0,320,321)]) 

df3=pd.DataFrame(np.random.randint(0,100,size=(126,321)))# These random numbers simulate your (126,321) DataFrame 


df4=pd.DataFrame(df3.values,index=df1.iloc[:,0],columns=mcols)# pass the index as a 1-D column, not as a whole DataFrame 
print(df4)

.....

868 679 757 464 420 381 843 549 978 450 ... 578 \ 
    0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 ... 311.0 
47  7 73 78 98 41 62 48 65 35 26 ...  85 
68 54 40 61 75 24  9 15 25  1 35 ...  63 
89 44 30 48 95 27 11 52 41 87 31 ...  73 
57 61 46 11 88 21 58 80 42 99 65 ...  23 
37 70 88 32 95 46 66 93 37 88 95 ...  64 
38 14 19 63 73  0 53 71  4 20 63 ...  88 
60 71 87 18 30 94 30 32  9 32 82 ...  36 
15 87  8 57 68 24 95 26 47 29 29 ...  5 
77 70 54 82 31 85 27 13 13 66 16 ...  3 
10  1 28 64  2 75 22 20  9 93  0 ...  89 
60 26 62 81 13  8 18 40 15 13 47 ...  44 
35 24 42 16 68 45 73 96 81  3 44 ...  16 
81 63 30 19 81 99 81  9  9 34 37 ...  53

.....
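To make the MultiIndex idea easier to follow than with random data, here is a minimal sketch on a 3 x 4 table. The names `y_vals`, `x_vals`, `z` and the level names `x` and `pos` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: 3 y values and 4 x values, with 0.5 repeated.
y_vals = [10.0, 20.0, 30.0]
x_vals = [0.5, 1.5, 0.5, 2.5]
z = np.arange(12).reshape(3, 4)

# Level 0 holds the x values, level 1 the column position,
# so every (x, position) pair is unique even when x repeats.
cols = pd.MultiIndex.from_arrays(
    [x_vals, np.arange(len(x_vals))], names=["x", "pos"]
)
eff = pd.DataFrame(z, index=y_vals, columns=cols)

# Selecting by the first level returns every column that shares that x value.
same_x = eff[0.5]

# A full (x, position) tuple addresses exactly one cell.
single = eff.loc[20.0, (1.5, 1)]

print(same_x)
print(single)
```

The second level costs nothing in storage terms and keeps the original x values untouched, which seems to be the point of the suggestion above.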

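With reference to Shihe Zhang's answer: you can set the index and the column names directly, without reindexing and without a MultiIndex, using: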

df4=pd.DataFrame(df3.values,index=df1.iloc[:,0],columns=df2.iloc[:,0])
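Putting this together with the files from the question, a minimal end-to-end sketch could look like the following. It assumes semicolon-separated files in which the coordinate files each hold a single line, as the question describes. The file contents are invented and shrunk to 2 x 3, and `io.StringIO` stands in for the real files on disk.

```python
import io
import pandas as pd

# Hypothetical file contents, shrunk to 2 rows x 3 columns for illustration.
text1 = io.StringIO("0.1;0.2;0.3\n0.4;0.5;0.6\n")  # Z (efficiency)
text2 = io.StringIO("10;20\n")                      # y coordinates, one line
text3 = io.StringIO("1;2;3\n")                      # x coordinates, one line

efficiency = pd.read_csv(text1, sep=";", header=None)
y = pd.read_csv(text2, sep=";", header=None)
x = pd.read_csv(text3, sep=";", header=None)

# Each coordinate file is parsed as a 1 x N DataFrame (two-dimensional).
# Take row 0 and convert it to a flat array before using it as labels.
efficiency.index = y.iloc[0].to_numpy()
efficiency.columns = x.iloc[0].to_numpy()

print(efficiency)
```

If the coordinate files instead held one value per line, the labels would sit in the first column, and `.iloc[:, 0]` as in the line above would be the matching selection.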

Source

2017-08-23 07:37:41 2Obe

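Just create df1, df2 and df3 with pd.read_csv() – 2Obe

I did that, but I get the error message: Buffer has wrong number of dimensions (expected 1, got 2) – Nathan

In the end I used the following code: 'df4 = pd.DataFrame(df3, index=df1.loc[:, 0], columns=df2.loc[:, 0])', and it works. Thanks! – Nathan

What you need is to make the single row of x and the single row of y the index. To change the index, use reindex.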

Efficiency.reindex(index=x.iloc[0], columns=y.iloc[0])

Note:

reindex produces a new object unless the new index is equivalent to the current one and copy=False.
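One thing worth checking before relying on reindex here: it conforms the data to labels that already exist and does not rename them. A minimal sketch with a made-up 2 x 2 frame shows the difference from assigning new labels:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]])  # default row labels are 0 and 1

# reindex looks up existing labels; 10 and 20 are not present,
# so the resulting rows are filled with NaN.
reindexed = df.reindex(index=[10, 20])

# Assigning to .index relabels the rows and keeps the data.
relabeled = df.copy()
relabeled.index = [10, 20]

print(reindexed)
print(relabeled)
```

So for a frame that still carries the default 0..N-1 labels, assigning to `.index` and `.columns` is likely the operation the question is after.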

Source

2017-08-24 06:31:37
