numpy/scipy equivalent of R's ecdf(x)(x) function?

What is the Python equivalent of R's ecdf(x)(x) function, in either numpy or scipy? Is ecdf(x)(x) basically the same as:

import numpy as np 
def ecdf(x): 
    # normalize X to sum to 1 
    x = x/np.sum(x) 
    return np.cumsum(x)

Or is something else required?

EDIT: How can one control the number of bins used by ecdf?

Source

2013-04-03 user248237dfsf

[This](http://stackoverflow.com/questions/3209362/how-to-plot-empirical-cdf-in-matplotlib-in-python) should help. – agstudy

Try these links:

statsmodels.ECDF

ECDF in python without step function?

Source

2013-04-03 16:18:27 yasouser

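Although I think 'statsmodels' is what the OP is looking for, it would be nice if you added example code to make the answer self-contained. – root

Isn't this possible with scipy? – Rhubarb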

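There is a good example of a user-written ECDF function in John Stachurski's Python lectures. His lecture series is aimed at graduate students in computational economics, but it is my go-to resource for anyone learning general scientific computing in Python.

Edit: This is now a year old, but I still want to answer the "Edit" part of the question, in case you (or others) still find it useful.

There really aren't any "bins" with ECDFs as there are with histograms. If G is the empirical distribution function formed from the data vector Z, then G(x) is literally the number of entries of Z that are <= x, divided by len(Z). No "binning" is needed to determine this. In that sense the ECDF retains all possible information about a dataset (since it must keep the entire dataset for its calculations), whereas a histogram actually loses some information about the dataset through binning. For this reason I prefer to work with ECDFs rather than histograms when possible. Fun bonus: if you need to create a small-footprint ECDF-like object from very large streaming data, you should look into the "Data Skeletons" paper by McDermott et al.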
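The definition above, G(x) = (number of entries of Z that are <= x) / len(Z), can be sketched directly with numpy. This is only a minimal illustration, not the implementation from the lectures mentioned above; the names make_ecdf, G, and z are made up for the example, and it assumes a non-empty one-dimensional sample.

```python
import numpy as np

def make_ecdf(sample):
    """Return a callable G with G(x) = (number of sample values <= x) / n."""
    sorted_sample = np.sort(np.asarray(sample, dtype=float))
    n = sorted_sample.size  # assumed to be > 0

    def G(x):
        # side="right" counts the entries that are <= x, ties included
        return np.searchsorted(sorted_sample, x, side="right") / n

    return G

z = np.array([3.0, 1.0, 2.0, 2.0])
G = make_ecdf(z)
values_at_data = G(z)  # the analogue of R's ecdf(z)(z)
print(values_at_data)
```

No bin count appears anywhere: the only inputs are the sorted data and the query points, and tied values receive the same probability.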

Source

2013-06-23 00:36:55 CompEcon

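The link is now broken. The author moved his Python lectures to: http://quant-econ.net/py/index.html – bersanri

The OP's implementation of ecdf is wrong: you are not supposed to cumsum() the values. So not ys = np.cumsum(x)/np.sum(x) but ys = np.cumsum([1 for _ in x])/float(len(x)), or better, ys = np.arange(1, len(x)+1)/float(len(x))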
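A tiny numeric check makes the difference visible without a plot. The three-element array below is invented purely for illustration: accumulating the values weights each point by its magnitude, while the ECDF should depend only on each point's rank.

```python
import numpy as np

x = np.array([7.0, 1.0, 2.0])  # made-up sample
xs = np.sort(x)

# Cumulative sum of the values: each step is proportional to the value itself
weighted = np.cumsum(xs) / np.sum(xs)

# Cumulative count: each observation contributes exactly 1/n
counted = np.arange(1, len(xs) + 1) / len(xs)

print(weighted)  # steps of unequal size
print(counted)   # steps of equal size 1/n
```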

You either go with statsmodels' ECDF, if you are OK with an extra dependency, or provide your own implementation. See below:

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.distributions.empirical_distribution import ECDF
# In a Jupyter/IPython notebook, also run: %matplotlib inline

grades = (93.5,93,60.8,94.5,82,87.5,91.5,99.5,86,93.5,92.5,78,76,69,94.5,
      89.5,92.8,78,65.5,98,98.5,92.3,95.5,76,91,95,61)


def ecdf_wrong(x):
    xs = np.sort(x)  # needs to be sorted
    ys = np.cumsum(xs)/np.sum(xs)  # wrong: accumulates the values, not the counts
    return (xs, ys)

def ecdf(x):
    xs = np.sort(x)
    ys = np.arange(1, len(xs)+1)/float(len(xs))  # rank of each sorted point divided by n
    return xs, ys

xs, ys = ecdf_wrong(grades)
plt.plot(xs, ys, label="wrong cumsum")
xs, ys = ecdf(grades)
plt.plot(xs, ys, label="handwritten", marker=">", markerfacecolor='none')
cdf = ECDF(grades)
plt.plot(cdf.x, cdf.y, label="statsmodels", marker="<", markerfacecolor='none')
plt.legend()
plt.show()
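One detail of the handwritten version above: a value that occurs more than once (93.5 appears twice in grades) gets a separate y for each copy. If a single step per distinct value is preferred, one possible variant, sketched here with a hypothetical helper named ecdf_steps, groups ties with np.unique first.

```python
import numpy as np

def ecdf_steps(sample):
    """Return the distinct sorted values and the ECDF height reached at each."""
    # np.unique sorts the values and reports how often each one occurs
    vals, counts = np.unique(np.asarray(sample, dtype=float), return_counts=True)
    probs = np.cumsum(counts) / counts.sum()
    return vals, probs

vals, probs = ecdf_steps([2, 1, 2, 3])  # made-up sample with one tie
print(vals, probs)
```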

Source

2016-06-06 14:55:19 ecerulm
