Python Pandas IndexError: list index out of range

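My code worked on previous datasets and has now stopped working. I looked at other answers for this error message (IndexError: list index out of range), but none of them apply to my case.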

My dataframe df has a column Email_Address, and I want to split the domain out into a new column.

My dataframe is a subset of an earlier df.

#create new df, for only email addresses I need to review 
df = df_raw.loc[df_raw['Review'] == 'Y'].copy() 

#I reset the index to try to fix the problem, but it didn't help 
df = df.reset_index(drop=True) 

#ensure Email Address is a string 
df['Email_Address'] = df.Email_Address.apply(str) 

#make Email Address lower case 
df['email_lowercase'] = df['Email_Address'].str.lower() 

#Split out domain into a new column 
df['domain'] = df['email_lowercase'].apply(lambda x: x.split('@')[1]) 

IndexError: list index out of range

Source

2017-08-29 jeangelj

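This probably means that the '@' symbol is missing from one of your cells, so you cannot access the part of the email after the '@'. Sometimes users type "at" instead of "@" so that bots cannot harvest their address. Have you checked for that? – ysearka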
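A minimal sketch of the failure mode described in the comment above. The sample strings are made up for illustration and are not the asker's data. When a string contains no '@', `str.split('@')` returns a one-element list, so index 1 does not exist:

```python
# Hypothetical sample values; the asker's real data is not shown in the question.
valid = "alice@example.com"
obfuscated = "bob at example.com"  # '@' typed as "at"

print(valid.split("@"))       # two parts: local part and domain
print(obfuscated.split("@"))  # a single-element list, so there is no index 1


def domain_or_error(email):
    """Return the domain, or the name of the exception that was raised."""
    try:
        return email.split("@")[1]
    except IndexError as exc:
        return type(exc).__name__


valid_result = domain_or_error(valid)
obfuscated_result = domain_or_error(obfuscated)
```

Because `apply` runs the lambda on every cell, a single value like the second one is enough to abort the whole column assignment.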

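I'm not sure, but try changing `df['Email_Address'] = df.Email_Address.apply(str)` to `df['Email_Address'] = df.Email_Address.astype(str)`. It may also be that your data is not clean and some rows have nothing after the '@', which would cause the failure. Check that. –

Without a representative `df`, it is impossible to reproduce your error. Please provide an [MCVE](https://stackoverflow.com/help/mcve) – C8H10N4O2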

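Your dataframe most likely contains invalid emails. You can find them with

df[~df.Email_Address.astype(str).str.contains('@')]

Once you have identified them, you can extract the domain with this approach: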

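def extract_domain(email):
    # split into [local part, domain]; only one element if '@' is missing
    email_domain = email.split('@')
    if len(email_domain) > 1:
        return email_domain[1]
    return None  # no '@' found, so there is no domain to extract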

df['domain'] = df['email_lowercase'].apply(extract_domain)

Or even shorter:

df['domain'] = df['email_lowercase'].str.split('@').apply(lambda li: li[1] if len(li) > 1 else None)
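As an alternative to the `apply`-based versions above, the same idea can be sketched with vectorized string methods. This is not part of the original answer. The `sample` frame and its values are hypothetical, and `str.extract` captures everything after the first '@', which differs from `split('@')[1]` when an address contains more than one '@'.

```python
import pandas as pd

# Hypothetical sample data, including one entry without '@'.
sample = pd.DataFrame(
    {"Email_Address": ["Alice@Example.com", "bob at example.com", "carol@test.org"]}
)

emails = sample["Email_Address"].astype(str).str.lower()

# Rows whose address contains no literal '@' at all.
invalid = sample[~emails.str.contains("@", regex=False)]

# Capture everything after the first '@'; rows without a match get NaN
# instead of raising IndexError.
sample["domain"] = emails.str.extract(r"@(.+)$", expand=False)

print(invalid)
print(sample)
```

Rows that fail the pattern end up with a missing value in `domain`, so they can be reviewed or filled in afterwards rather than stopping the assignment.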

Source

2017-08-29 14:31:40

Thanks, I tried this and got AttributeError: 'Series' object has no attribute 'contains' – jeangelj

@jeangelj I fixed that. (I forgot the 'str.' before 'contains'.) –

Thanks, it looks like there were some surprising ones - I turned them into 0 – jeangelj
