How can I create a list of records per pandas categorical index?
I have a CSV of records:
name,credits,email
bob,,[email protected]
bob,6.0,[email protected]
bill,3.0,[email protected]
bill,4.0,[email protected]
tammy,5.0,[email protected]
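where name is the index. Because there are multiple records with the same name, I'd like to roll each entire row (minus the name) into a list, to create JSON of the form: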
{
    "bob": [
        { "credits": null, "email": "[email protected]" },
        { "credits": 6.0, "email": "[email protected]" }
    ],
    // ...
}
My current solution is a bit kludgey, because it seems to use pandas only as a tool for reading the CSV, but it does produce my expected JSON-ish output:
#!/usr/bin/env python3
import io
import pandas as pd
from pprint import pprint
from collections import defaultdict


def read_data():
    s = """name,credits,email
bob,,[email protected]
bob,6.0,[email protected]
bill,3.0,[email protected]
bill,4.0,[email protected]
tammy,5.0,[email protected]
"""
    data = io.StringIO(s)
    return pd.read_csv(data)


if __name__ == "__main__":
    df = read_data()
    columns = df.columns
    index_name = "name"
    print(df.head())

    records = defaultdict(list)

    # Position of the index column, and the names of all other columns.
    name_index = list(columns.values).index(index_name)
    columns_without_index = [column for i, column in enumerate(columns) if i != name_index]

    # Walk the raw rows, dropping the name field and grouping the rest by name.
    for record in df.values:
        name = record[name_index]
        record_without_index = [field for i, field in enumerate(record) if i != name_index]
        remaining_record = {k: v for k, v in zip(columns_without_index, record_without_index)}
        records[name].append(remaining_record)

    pprint(dict(records))
Is there a way to do the same thing natively in pandas (and numpy)?
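For reference, here is one way the grouping could be done with pandas itself. It is a minimal sketch, not necessarily the approach any answer here takes. The helper name group_records and the example.com addresses are hypothetical, introduced only for illustration. It relies on DataFrame.groupby to split the rows by name and on DataFrame.to_json to turn the missing credits value into null.

```python
import io
import json

import pandas as pd

# Hypothetical sample data: the addresses below are placeholders,
# not the ones from the question.
CSV_TEXT = """name,credits,email
bob,,bob1@example.com
bob,6.0,bob2@example.com
bill,3.0,bill1@example.com
bill,4.0,bill2@example.com
tammy,5.0,tammy@example.com
"""


def group_records(df, key):
    """Map each distinct value of `key` to a list of dicts of the remaining columns."""
    # Every column except the grouping key, so nothing has to be listed by hand.
    value_columns = [column for column in df.columns if column != key]
    return {
        # Round-tripping through to_json turns NaN into null, which loads as None.
        name: json.loads(group[value_columns].to_json(orient="records"))
        for name, group in df.groupby(key)
    }


df = pd.read_csv(io.StringIO(CSV_TEXT))
records = group_records(df, "name")
print(json.dumps(records, indent=2))
```

Note that groupby sorts the keys by default, so the names come out in alphabetical order rather than in file order. Passing sort=False to groupby should keep the order of first appearance if that matters.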
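Almost! It would be nice if I didn't have to explicitly list the columns after the groupby, but I think that's simple enough. – erip
@erip, I've updated my post - please check... – MaxU
Perfect! Thank you very much for your help! – erip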