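Pandas DataFrame replaces strings with NaN when using pd.concat
I have a pandas DataFrame made up of strings, i.e. 'P1', 'P2', 'P3', ..., null.
When I try to concatenate this DataFrame with another one, all of the strings are replaced with NaN.
See my code below: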
import operator
import pandas as pd

# Load the bug descriptions and take the 'what' field of the first entry in each short_desc value.
descriptions = pd.read_json('https://raw.githubusercontent.com/ansymo/msr2013-bug_dataset/master/data/v02/eclipse/short_desc.json')
descriptions = descriptions.reset_index(drop=True)
descriptions['desc'] = descriptions.short_desc.apply(operator.itemgetter(0)).apply(operator.itemgetter('what'))
f1 = pd.DataFrame(descriptions['desc'])

# Load the bug priorities and extract the 'what' field the same way.
bugPrior = pd.read_json('https://raw.githubusercontent.com/ansymo/msr2013-bug_dataset/master/data/v02/eclipse/priority.json')
bugPrior = bugPrior.reset_index(drop=True)
bugPrior['priority'] = bugPrior.priority.apply(operator.itemgetter(0)).apply(operator.itemgetter('what'))
f2 = pd.DataFrame(bugPrior['priority'])

# Concatenate the two frames; the NaN values show up after this step.
df = pd.concat([f1, f2])
print(df.head())
The output is as follows:
desc priority
0 Usability issue with external editors (1GE6IRL) NaN
1 API - VCM event notification (1G8G6RR) NaN
2 Would like a way to take a write lock on a tea... NaN
3 getter/setter code generation drops "F" in "..... NaN
4 Create Help Index Fails with seemingly incorre... NaN
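One possible reading of this output is that pd.concat stacks rows by default (axis=0), so each frame's rows get NaN in the column that only exists in the other frame. A minimal sketch of that behaviour, using hypothetical toy frames in place of the real f1 and f2 (the values are made up):

```python
import pandas as pd

# Hypothetical toy frames standing in for f1 and f2 (not the real dataset).
f1 = pd.DataFrame({"desc": ["first bug", "second bug", "third bug"]})
f2 = pd.DataFrame({"priority": ["P1", "P2", "P3"]})

# Default axis=0 stacks rows; the columns are unioned, so each frame's rows
# get NaN in the column that exists only in the other frame.
stacked = pd.concat([f1, f2])

# axis=1 places the frames side by side, aligning rows on the index.
side_by_side = pd.concat([f1, f2], axis=1)

print(stacked)
print(side_by_side)
```

If that reading is right, the priority strings would still be present further down in the stacked frame, just not in the rows shown by head().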
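Any ideas on how I can stop this from happening?
Ultimately, my goal is to get everything into a single DataFrame so that I can drop all rows that have a "null" value. That will also help with the code that comes later.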
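A minimal sketch of that end goal, assuming the missing entries may arrive either as real missing values or as the literal string "null" (that is an assumption, and the frame below is hypothetical, not the real data):

```python
import numpy as np
import pandas as pd

# Hypothetical combined frame; the values are made up for illustration.
df = pd.DataFrame({
    "desc": ["first bug", "second bug", None, "fourth bug"],
    "priority": ["P1", "null", "P3", "P2"],
})

# Assumption: some missing values are stored as the literal string "null",
# so turn those into real missing values first.
df = df.replace("null", np.nan)

# Drop every row that has a missing value in any column.
cleaned = df.dropna().reset_index(drop=True)
print(cleaned)
```

If the missing values are already real None/NaN entries, the replace step would simply have no effect and dropna alone should be enough.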
Thanks.
Thanks for your help. This dataset has been driving me nuts, and this is just the data import! – JohnWayne360