Counting the first occurrence of a field in a CSV file

I have a CSV file in the following format:

Pos ID Name 
1 0001L01 50293 
2 0002L01 128864 
3 0003L01 172937 
4 0004L01 12878 
5 0005L01 demo 
6 0004L01 12878 
7 0004L01 12878 
8 0005L01 demo

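I want a dictionary of the form [ID], {Pos, Name, FirstTime}, where FirstTime is the position at which that ID first appears in the CSV file. For example, ID = 0005L01 would have: [0005L01],{5,demo,5},{8,demo,5}

I have managed to store [ID], {Pos,Name}, but I am struggling with FirstTime. So far I have: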

# From the csv reader, save it to a list
dlist = []
for row in reader:
    # store only the non-empty lines
    if any(row):
        dlist.append(row)

# Map each ID to a list of [Pos, Name] pairs
d = {}
for row in dlist:
    d.setdefault(row[1], []).append([row[0], row[2]])

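Source

2015-04-29 Manolete

What is 'FirstTime'? – thefourtheye

@thefourtheye 'FirstTime' is the position at which the ID first appears in the CSV file – Manolete

Do you really want '{Pos, Name, FirstTime}' to be a set, rather than something ordered like a tuple? – abarnert

It is easier if you compute firstTime first and then fill in your dictionary: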

# From the csv reader, save it to a list
dlist = []
for row in reader:
    # store only the non-empty lines
    if any(row):
        dlist.append(row)

# First pass: remember the Pos at which each ID first appears
firstTime = {}
for row in dlist:
    if row[1] not in firstTime:
        firstTime[row[1]] = row[0]

# Second pass: store [Pos, Name, FirstTime] under each ID
d = {}
for row in dlist:
    d.setdefault(row[1], []).append([row[0], row[2], firstTime[row[1]]])
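The snippet above relies on a `reader` that is never defined in the thread. As a minimal, self-contained sketch of the same two-pass idea, the following assumes the file is space-delimited and has a header row; the function name `index_by_id` and the in-memory `SAMPLE` string are hypothetical and only stand in for the real file.

```python
import csv
import io

# Sample rows copied from the question; the space delimiter is an assumption.
SAMPLE = """\
Pos ID Name
1 0001L01 50293
2 0002L01 128864
3 0003L01 172937
4 0004L01 12878
5 0005L01 demo
6 0004L01 12878
7 0004L01 12878
8 0005L01 demo
"""


def index_by_id(lines):
    """Return {ID: [[Pos, Name, FirstTime], ...]} using two passes."""
    reader = csv.reader(lines, delimiter=" ", skipinitialspace=True)
    next(reader)  # skip the header row
    rows = [row for row in reader if any(row)]  # drop empty lines

    # Pass 1: remember the first Pos seen for each ID.
    first_time = {}
    for pos, id_, name in rows:
        first_time.setdefault(id_, pos)

    # Pass 2: attach that first position to every row with the same ID.
    result = {}
    for pos, id_, name in rows:
        result.setdefault(id_, []).append([pos, name, first_time[id_]])
    return result


d = index_by_id(io.StringIO(SAMPLE))
print(d["0005L01"])  # [['5', 'demo', '5'], ['8', 'demo', '5']]
```

All values stay strings because `csv.reader` does no type conversion; wrap `pos` in `int()` if numeric positions are needed.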

Source

2015-04-29 10:35:23 fferri

@mescalinum: almost got it, it just needed 'if row[1] not in firstTime: firstTime[row[1]] = row[0]' – Manolete

Oh, right! Sorry, I thought this solution was so simple that I did not test it :-) [I have just corrected my answer] – fferri

If you are using pandas, try this:

In [269]: temp
Out[269]:
   Pos       ID    Name
0    1  0001L01   50293
1    2  0002L01  128864
2    3  0003L01  172937
3    4  0004L01   12878
4    5  0005L01    demo
5    6  0004L01   12878
6    7  0004L01   12878
7    8  0005L01    demo

Next, group by ID and apply min:

In [271]: temp.groupby('ID').min().rename(columns={'Pos':'Firsttime'})
Out[271]:
         Firsttime    Name
ID
0001L01          1   50293
0002L01          2  128864
0003L01          3  172937
0004L01          4   12878
0005L01          5    demo

In [272]: y = temp.groupby('ID').min().rename(columns={'Pos':'Firsttime'})

Now merge the result with the original DataFrame:

In [276]: temp.merge(y)
Out[276]:
   Pos       ID    Name  Firsttime
0    1  0001L01   50293          1
1    2  0002L01  128864          2
2    3  0003L01  172937          3
3    4  0004L01   12878          4
4    6  0004L01   12878          4
5    7  0004L01   12878          4
6    5  0005L01    demo          5
7    8  0005L01    demo          5

Now iterate over the rows and save them to a dictionary:

In [280]: next(temp.merge(y).iterrows())
Out[280]:
(0, Pos                1
ID           0001L01
Name           50293
Firsttime          1
Name: 0, dtype: object)
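The answer stops before actually building the dictionary. One possible way to finish the job is sketched below; it swaps the `min()` plus `merge` steps for `groupby(...).transform("min")`, which is a substitution and not what the answer itself used. The DataFrame is rebuilt by hand from the question's sample rows, with Name stored as strings by assumption.

```python
import pandas as pd

# Sample rows copied from the question.
temp = pd.DataFrame({
    "Pos": [1, 2, 3, 4, 5, 6, 7, 8],
    "ID": ["0001L01", "0002L01", "0003L01", "0004L01",
           "0005L01", "0004L01", "0004L01", "0005L01"],
    "Name": ["50293", "128864", "172937", "12878",
             "demo", "12878", "12878", "demo"],
})

# Smallest Pos within each ID group, broadcast back onto every row.
temp["Firsttime"] = temp.groupby("ID")["Pos"].transform("min")

# Collect (Pos, Name, Firsttime) tuples under each ID.
d = {}
for row in temp.itertuples(index=False):
    d.setdefault(row.ID, []).append((row.Pos, row.Name, row.Firsttime))

print(d["0005L01"])
```

Using `transform` keeps the original row order and avoids relying on which columns `merge` happens to pick as join keys.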

Source

2015-04-29 10:34:44 fixxxer

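from collections import defaultdict

d = defaultdict(list)  # ID -> list of (Pos, Name, FirstTime)
first = {}             # ID -> Pos of its first occurrence

for row in reader:
    if any(row):  # skip empty lines
        pos, ID, name = row
        if ID not in first:
            first[ID] = pos
        # list.append takes a single argument, so pass one tuple
        d[ID].append((pos, name, first[ID]))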

Source

2015-04-29 10:40:07
