2017-04-21 80 views
2

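Finding the slope trend from a line of best fit

I am trying to work out how to determine the slope trend from lines of best fit through a set of points. Basically, once I have the slope trend, I want to draw several other lines in the same plot. For example:

enter image description here

This plot is basically what I want to produce, but I don't know how to do it. As you can see, it has several best-fit lines through the points, and those lines intersect at x = 6. Beyond those lines there are several more lines based on other slope trends. I assume I can do something similar with the code below, but I'm not sure how to adapt it to do what I want.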

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 

# simulate some artificial data 
# ===================================== 
df = pd.DataFrame({ 'Age' : np.random.rand(25) * 160 }) 

df['Length'] = df['Age'] * 0.88 + np.random.rand(25) * 5000 

# plot those data points 
# ============================== 
fig, ax = plt.subplots() 
ax.scatter(df['Length'], df['Age']) 

# Now add on a line with a fixed slope of 0.03 
slope = 0.03 

# A line with a fixed slope can intercept the axis 
# anywhere so we're going to have it go through 0,0 
x_0 = 0 
y_0 = 0 

# And we'll have the line stop at x = 5000 
x_1 = 5000 
y_1 = slope * (x_1 - x_0) + y_0 

# Draw these two points with big triangles to make it clear 
# where they lie 
ax.scatter([x_0, x_1], [y_0, y_1], marker='^', s=150, c='r') 

# And now connect them 
ax.plot([x_0, x_1], [y_0, y_1], c='r')  

plt.show() 

Answers

4

The value of y_1 can be found from the equation of a straight line, using your slope and y_0:

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 

df = pd.DataFrame({'Age': np.random.rand(25) * 160}) 
df['Length'] = df['Age'] * 0.88 + np.random.rand(25) * 5000 

fig, ax = plt.subplots() 
ax.scatter(df['Length'], df['Age']) 

slope = 0.03 
x_0 = 0 
y_0 = 0 
x_1 = 5000 
y_1 = (slope * x_1) + y_0 # equation of a straight line: y = mx + c 

ax.plot([x_0, x_1], [y_0, y_1], marker='^', markersize=10, c='r') 

plt.show() 

This produces the following figure:

enter image description here
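The slope above is chosen by hand. If it should instead come from the data, one common option is a least-squares fit with `np.polyfit`. The following is only a minimal sketch under assumptions: the helper name `best_fit_line` and the sample data are hypothetical and are not part of the original code.

```python
import numpy as np

def best_fit_line(x, y):
    """Return (slope, intercept) of the least-squares straight line through the points."""
    # A degree-1 polynomial fit returns the coefficients [slope, intercept]
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

# Hypothetical sample data: points scattered around y = 0.03 * x + 2
rng = np.random.default_rng(0)
x = np.linspace(0, 5000, 50)
y = 0.03 * x + 2 + rng.normal(scale=5.0, size=x.size)

slope, intercept = best_fit_line(x, y)

# End points of the fitted line over the range of the data
x_0, x_1 = x.min(), x.max()
y_0 = slope * x_0 + intercept
y_1 = slope * x_1 + intercept
```

The resulting end points can be passed to `ax.plot([x_0, x_1], [y_0, y_1])` in the same way as the hand-picked values.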

To plot multiple lines, first create an array/list of the gradients you want to use, then follow the same steps:

df = pd.DataFrame({'Age': np.random.rand(25) * 160}) 
df['Length'] = df['Age'] * 0.88 + np.random.rand(25) * 5000 

fig, ax = plt.subplots() 
ax.scatter(df['Length'], df['Age']) 

slope = 0.03 
x_0 = 0 
y_0 = 0 
x_1 = 5000 

slopes = np.linspace(0.01, 0.05, 5) # create an array containing the gradients 

new_y = (slopes * x_1) + y_0 # find the corresponding y values at x = 5000 

for i in range(len(slopes)): 
    ax.plot([x_0, x_1], [y_0, new_y[i]], marker='^', markersize=10, label=slopes[i]) 

plt.legend(title="Gradients") 
plt.show() 

This produces the following figure:

enter image description here
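The lines above all start at the origin. The question describes lines that cross at x = 6 instead, and that can be handled with the point-slope form y = m (x - x_p) + y_p. This is a rough sketch only: the helper `lines_through_point`, the crossing height `y_p = 0` and the x range are assumptions made for illustration.

```python
import numpy as np

def lines_through_point(slopes, x_p, y_p, x_start, x_end):
    """For each slope, return the y values at x_start and x_end of the
    line that passes through the common point (x_p, y_p)."""
    slopes = np.asarray(slopes, dtype=float)
    # Point-slope form: y = m * (x - x_p) + y_p
    y_start = slopes * (x_start - x_p) + y_p
    y_end = slopes * (x_end - x_p) + y_p
    return y_start, y_end

slopes = np.linspace(0.01, 0.05, 5)

# Assumed values: the lines cross at (6, 0) and are drawn from x = 0 to x = 10
y_start, y_end = lines_through_point(slopes, x_p=6.0, y_p=0.0,
                                     x_start=0.0, x_end=10.0)
```

Each line can then be drawn in a loop with `ax.plot([0.0, 10.0], [y_start[i], y_end[i]])`, as in the loop above.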

+0

Thanks! Can I draw multiple lines like the ones shown in my figure? – Cosmoman

+0

@Cosmoman I have updated my answer – DavidG

1

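I have only modified your code slightly here. Basically, what you need is a piecewise function. Up to a certain x value the lines have different slopes, but they all end at x = 3000, after which the slope is simply 0.

The plot looks like this: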

enter image description here

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 

# simulate some artificial data 
# ===================================== 
df = pd.DataFrame({ 'Age' : np.random.rand(25) * 160 }) 

df['Length'] = df['Age'] * 0.88 + np.random.rand(25) * 5000 

# plot those data points 
# ============================== 
fig, ax = plt.subplots() 
ax.scatter(df['Length'], df['Age']) 

# Slopes of the sloped segments (one line per value);
# the flat segment after them has slope 0
slope1 = np.arange(-0.05, 0, 0.01) 
slope2 = 0 

# Every sloped line starts at x = 0 and ends at the
# common point (x_1, y_1) = (3000, 0)
x_0 = 0 
y_1 = 0 

# Each sloped line stops at x = 3000; y_0 is chosen so that it ends at y_1
for slope in slope1: 
    x_1 = 3000 
    y_0 = y_1 - slope * (x_1 - x_0) 
    ax.plot([x_0, x_1], [y_0, y_1], c='r') 

x_2 = 5000 
y_2 = slope2 * (x_2 - x_1) + y_1 

# Draw the end points of the last sloped line with big triangles
# to make it clear where they lie
ax.scatter([x_0, x_1], [y_0, y_1], marker='^', s=150, c='r') 

# And now draw the flat segment from x_1 to x_2
ax.plot([x_1, x_2], [y_1, y_2], c='r')  

plt.show()
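The same sloped-then-flat shape can also be written as a single function of x. The sketch below uses `np.where`; the function name `sloped_then_flat` and its default arguments are hypothetical, chosen to match the break point at x = 3000 and y = 0 used above.

```python
import numpy as np

def sloped_then_flat(x, slope, x_break=3000.0, y_break=0.0):
    """Piecewise function: a straight line with the given slope that reaches
    (x_break, y_break), followed by a flat segment at y_break."""
    x = np.asarray(x, dtype=float)
    # Left of the break use the sloped line, otherwise the constant value
    return np.where(x < x_break, slope * (x - x_break) + y_break, y_break)

x = np.linspace(0, 5000, 6)            # 0, 1000, ..., 5000
y = sloped_then_flat(x, slope=-0.03)   # 90, 60, 30, 0, 0, 0
```

Calling `ax.plot(x, sloped_then_flat(x, m))` for each `m` in `slope1` would draw every sloped line together with its flat segment in one call per line.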