How to get a string from a DataFrame

I want to define a function that takes two parameters: df (a dataframe) and an integer (employerID). The function should return the employee's full name.

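If the given ID does not belong to any employee, I want to return the string "UNKNOWN". If no middle name is given, return only "LAST, FIRST". If only a middle initial is given, return the full name in the format "LAST, FIRST M.", with the middle initial followed by a '.'.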

def getFullName(df, int1): 
    df = pd.read_excel('/home/data/AdventureWorks/Employees.xls') 
    newdf = df[(df['EmployeeID'] == int1)] 
    print("'" + newdf['LastName'].item() + "," + " " + newdf['FirstName'].item() + " " + newdf['MiddleName'].item() + "." + "'") 

getFullName('df', 110)

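I wrote this code, but two problems came up: 1) If I don't put quotes around df, I get an error message, but I want the function to take a dataframe as its parameter, not a string.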

2) This code can't handle people who have no middle name.

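I'm sorry, but I used pd.read_excel to read an Excel file that you can't access. I know it will be hard to test the code without the Excel file. If someone lets me know how to create a random dataframe with column names, I'll go ahead and change it. Thank you,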

Source: 2017-09-25, Yun Tae Hwang

Some sample data in text form would be helpful. –

What error are you getting? The error message would help too. – TheF1rstPancake

The error message says "name 'df' is not defined" if I don't put df in quotes. Here is the dataframe in text form: EmployeeID (259, 278, 204), FirstName (Be, Garrett, Gabe), MiddleName (T, R, NaN), LastName (Miller, Vargas, Mares). –

I created some fake data:

   EmployeeID FirstName LastName MiddleName
0           0         a        a          a
1           1         b        b          b
2           2         c        c          c
3           3         d        d          d
4           4         e        e          e
5           5         f        f          f
6           6         g        g          g
7           7         h        h          h
8           8         i        i          i
9           9         j        j       None

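EmployeeID 9 has no middle name, but everyone else does. The way I'd do this is to split the logic into two parts. The first handles the case where the EmployeeID can't be found. The second handles printing the employee's name. That second part should itself have two branches: one for when the employee has a middle name and one for when they don't. You could probably collapse a lot of this into one-line statements, but you would likely sacrifice clarity.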

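I also removed the pd.read_excel call from the function. If you want to pass a dataframe into the function, the dataframe should be created outside of it.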
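To illustrate the point about passing the dataframe itself rather than the string 'df', here is a minimal sketch. The tiny frame and the helper name last_name_for are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Build the DataFrame first, so the name df exists before the call.
df = pd.DataFrame({'EmployeeID': [1, 2, 3], 'LastName': ['x', 'y', 'z']})

def last_name_for(frame, employee_id):
    # hypothetical helper: filters whatever object was passed in
    match = frame[frame['EmployeeID'] == employee_id]
    return match['LastName'].iloc[0]

result = last_name_for(df, 2)   # pass the object, not the string 'df'

try:
    last_name_for('df', 2)      # a str cannot be indexed by a column name
    string_worked = True
except TypeError:
    string_worked = False
```

In the question's version the string appeared to work only because the function overwrote its df parameter with the result of pd.read_excel, so the argument was never used.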

import pandas as pd

def getFullName(df, int1):
    newdf = df[df['EmployeeID'] == int1]

    # if the filtered dataframe is empty, the given ID was not found
    if newdf.empty:
        print("UNKNOWN")
        return "UNKNOWN"

    # every string starts with the LastName and FirstName;
    # the MiddleName is appended only if it is present,
    # and the string then ends with the closing '
    s = "'" + newdf['LastName'].item() + ", " + newdf['FirstName'].item()
    middle = newdf['MiddleName'].item()
    # pd.notna is False for both None and NaN (empty Excel cells load as NaN)
    if pd.notna(middle):
        s = s + " " + middle + "."
    s = s + "'"
    print(s)
    return s

I have the function return the value in case you want to manipulate the string further. But that's just me.

If you run getFullName(df, 1) you should get 'b, b b.'. And for getFullName(df, 9) you should get 'j, j'. So, in full, this would be:

# the outputs shown are for the fake data; with the real file they depend on its contents
df = pd.read_excel('/home/data/AdventureWorks/Employees.xls')
getFullName(df, 1)   # outputs 'b, b b.'
getFullName(df, 9)   # outputs 'j, j'
getFullName(df, 10)  # outputs UNKNOWN

Fake data:

d = {'EmployeeID' : [0,1,2,3,4,5,6,7,8,9], 
    'FirstName' : ['a','b','c','d','e','f','g','h','i','j'], 
    'LastName' : ['a','b','c','d','e','f','g','h','i','j'], 
    'MiddleName' : ['a','b','c','d','e','f','g','h','i',None]} 
df = pd.DataFrame(d)
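The fake data uses None for the missing middle name, while the sample rows quoted in the comments show NaN, which is what pd.read_excel produces for an empty cell. NaN is truthy in Python, so a bare `if value:` check would not skip it. Below is a rough sketch of one way to cover both cases. The function name full_name is hypothetical, and it returns the name without the surrounding quote characters.

```python
import numpy as np
import pandas as pd

# Sample rows quoted from the asker's comment; the third has no middle name.
sample = pd.DataFrame({
    'EmployeeID': [259, 278, 204],
    'FirstName': ['Be', 'Garrett', 'Gabe'],
    'MiddleName': ['T', 'R', np.nan],
    'LastName': ['Miller', 'Vargas', 'Mares'],
})

def full_name(frame, employee_id):
    # hypothetical variant of getFullName that returns without printing
    match = frame[frame['EmployeeID'] == employee_id]
    if match.empty:
        return "UNKNOWN"
    row = match.iloc[0]
    name = row['LastName'] + ", " + row['FirstName']
    # pd.notna is False for both None and NaN
    if pd.notna(row['MiddleName']):
        name += " " + row['MiddleName'] + "."
    return name

with_middle = full_name(sample, 259)     # 'Miller, Be T.'
without_middle = full_name(sample, 204)  # 'Mares, Gabe'
missing_id = full_name(sample, 1)        # 'UNKNOWN'
```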

Source: 2017-09-25 02:14:40, TheF1rstPancake

It might be useful for the OP to see how you created the fake data; they asked as much in the question. – wwii

Good call. I think the OP edited the question after I started answering. – TheF1rstPancake

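What does OP stand for? Sorry, guys. I'm kind of new to this. And thank you very much for your help. I hope I can help someone else in the near future. –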
