Set operations on a Series

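I was wondering if anyone could help me come up with a solution to a problem. I have a pandas Series of lists, which I created from space-separated strings using the pandas Series string operations (str.split(' ')). I need to create another Series of lists, where each list is the intersection of the original list with another list.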

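I believe apply() is the missing piece here, but my usage must be incorrect, because I am getting errors. Using set operations inside apply() isn't really covered by the pandas manual, but I think it should be possible?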

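Basically, I have a set of events (evector), and I want to find which of those events share users with a given event (e2). These have been my approaches so far: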

Original attempt:

evector = attendframe.yes.str.split(' ') #creates the series of lists 

e2 = [attendframe.yes[attendframe.event==686467261]] #just for testing - returns [0 
    # 1975964455 252302513 4226086795 3805886383 142... 
    #Name: yes] 

sharedvector = evector.apply(lambda x: [n for n in [x] if n in e2]) # the important bit 

print sharedvector

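Error: arrays were different lengths: 1 vs 7

I narrowed the problem down to the following line: evector = attendframe.yes.str.split(' ').apply(lambda x: set([x]))

I then made a few more attempts to get it right.

Attempt 1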

evector = attendframe.yes.str.split(' ').apply(lambda x: set([x])) 
#Unhashable type "list"

Attempt 2

evector = attendframe.yes.str.split(' ').apply(lambda x: set(x)) 
#TypeError: 'float' object is not iterable

Attempt 3 (credit to Andy Hayden)

evector = attendframe.yes.str.split(' ').apply(lambda x: x 
               if isinstance(x, float) 
               else set(x)) 

e2 = set([2394228942, 2686116898, 1056558062, 379294223]) 
sharedvector = evector.apply(lambda x: x if isinstance(x, float) else x.intersection(e2)) 
sharedvector.dropna()
#works, but returns empty arrays.

Here is a sample of the data that is causing the problem:

print attendframe.yes.str.split(' ') 

0  [1975964455, 252302513, 4226086795, 3805886383... 
1  [2394228942, 2686116898, 1056558062, 379294223... 
2             NaN 
3             NaN

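In case it is relevant to the final solution: I ultimately want to create a DataFrame whose margins are events and whose cells contain the list of users shared between any two given events. Generating the column vector is the first part; I then hope to run a similar apply() step inside a function to create the full matrix.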
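A rough sketch of what that final matrix could look like, assuming each row of attendframe holds one event and its space-separated user IDs. The toy data and the names users, events and shared are made up for illustration; this is one possible shape for the result, not a tested solution for the real data.

```python
import pandas as pd

# Hypothetical stand-in for attendframe (real column contents are assumed).
attendframe = pd.DataFrame({
    "event": [1, 2, 3],
    "yes": ["10 20 30", "20 30 40", None],
})

# Map each event to its set of user-ID strings; missing rows become empty sets.
users = {
    event: set(cell.split(" ")) if isinstance(cell, str) else set()
    for event, cell in zip(attendframe.event, attendframe.yes)
}

events = list(users)

# Square frame: rows and columns are events, each cell is the sorted list
# of users that the two events have in common.
shared = pd.DataFrame(
    [[sorted(users[a] & users[b]) for b in events] for a in events],
    index=events,
    columns=events,
)
```

Building the sets once and intersecting them pairwise avoids re-splitting the strings for every cell; for many events the nested loop is quadratic, which may or may not matter at the real data size.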

Source: 2013-02-25, analystic

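Hi Andy, I don't think so. I'm not at my computer to check right now, but from memory it gave me either an error saying that a float is not iterable or that the type is unhashable. Will confirm tonight! – analystic 2013-02-26 02:21:08

Hi Andy, as I suspected, I got back "'float' object is not iterable". – analystic 2013-02-26 10:21:05

Hi Andy, I had just added the first few lines before I saw your comment. Does that help? – analystic 2013-02-26 10:41:57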

Since you are asking about set operations, why not use set objects:

evector = attendframe.yes.str.split(' ').apply(set) 
e2 = set(attendframe[attendframe.event==686467261]['yes'])

And apply the intersection:

sharedvector = evector.apply(lambda x: x & e2)
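A minimal self-contained sketch of this approach on made-up data, in case it helps to see it end to end. The small attendframe below is hypothetical, and e2 is built here by splitting the selected row's string into tokens, which is an assumption about what the reference set should contain.

```python
import pandas as pd

# Hypothetical stand-in for attendframe: one row per event, with the
# attending user IDs stored as a single space-separated string.
attendframe = pd.DataFrame({
    "event": [1, 2, 3],
    "yes": ["10 20 30", "20 30 40", "50"],
})

# Each cell becomes a set of ID strings (str.split yields strings, not ints).
evector = attendframe.yes.str.split(" ").apply(set)

# Users of the reference event, also as a set of ID strings (assumed shape).
e2 = set(attendframe.yes[attendframe.event == 2].iloc[0].split(" "))

# Element-wise intersection of every event's users with e2.
sharedvector = evector.apply(lambda users: users & e2)
```

Both sides of the & need to be sets holding the same kind of element, otherwise the intersection comes back empty.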

If your data has NaNs, you can check in each call whether the value is a float before converting it to a set:

evector = attendframe.yes.str.split(' ').apply(lambda x: x 
               if isinstance(x, float) 
               else set(x)) 
e2 = set(attendframe[attendframe.event==686467261]['yes']) 
sharedvector = evector.apply(lambda x: x if isinstance(x, float) else x & e2)
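A small sketch of the NaN-tolerant version on made-up data. It also illustrates one plausible reason the third attempt in the question returns empty results: str.split produces ID strings, so a reference set of integers never matches. The toy frame and the names to_set, e2_ints and e2_strs are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing "yes" cell.
attendframe = pd.DataFrame({
    "event": [1, 2, 3],
    "yes": ["10 20 30", "20 30 40", np.nan],
})

def to_set(cell):
    # Missing cells arrive as float NaN; pass them through unchanged.
    return cell if isinstance(cell, float) else set(cell)

evector = attendframe.yes.str.split(" ").apply(to_set)

e2_ints = {20, 30}        # integers: never equal to the split string tokens
e2_strs = {"20", "30"}    # strings: same type as the split tokens

shared_ints = evector.apply(lambda c: c if isinstance(c, float) else c & e2_ints)
shared_strs = evector.apply(lambda c: c if isinstance(c, float) else c & e2_strs)

# Drop the rows that had no attendee string at all.
shared_ints = shared_ints.dropna()
shared_strs = shared_strs.dropna()
```

If the IDs are needed as integers, converting the tokens inside to_set (for example with int) would be an alternative, as long as e2 uses the same type.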

Source: 2013-02-25 13:49:53

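Unfortunately, this one also gives me a "float is not iterable" error. It doesn't seem to help if I put [] around the operation: evector = [attendframe.yes.str.split(' ')].apply(set), since list objects obviously have no apply() function. – analystic 2013-02-26 10:24:27

@analystic updated to play nicely with NaNs :) – 2013-02-26 10:54:24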

So NaNs count as floats in numpy/pandas parlance? With the isinstance(x, float) ... else set(x) version, and e2 taken from attendframe[attendframe.event == 686467261]['yes'], the x & e2 step now gives me "unsupported operand type(s) for &: 'set' and 'Series'". – analystic 2013-02-26 10:56:56
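On the two points in that last comment, a short sketch under the assumption that e2 was left as a Series rather than converted to a set. NaN is indeed a plain Python float, and a boolean-indexed selection is still a Series even when only one row matches, so the string has to be pulled out and split before intersecting. The toy frame and the names selected and shared are made up.

```python
import numpy as np
import pandas as pd

# NaN is an ordinary Python float, which is why the isinstance check catches it.
nan_is_float = isinstance(np.nan, float)

# Hypothetical stand-in for attendframe.
attendframe = pd.DataFrame({
    "event": [1, 2],
    "yes": ["10 20 30", "20 30 40"],
})

# Boolean indexing returns a Series even when exactly one row matches.
selected = attendframe.yes[attendframe.event == 2]

# Take the single string out of the Series and split it into a set of tokens,
# so that both operands of & are sets.
e2 = set(selected.iloc[0].split(" "))
shared = {"10", "20", "30"} & e2
```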
