Pandas drop_duplicates problem
I want to remove duplicate entries based on one specific column. The column is labeled 'sent_name'.
print(new_df)
sent_name \
0 Abbey Road Station, London, UK
1 Abbey Wood Station, London, UK
2 Acton Station, London, UK
3 Acton Central Station, London, UK
Name Lat Lng \
0 Abbey Road, London E15, UK 51.531930 0.003760
1 Abbey Wood, London SE2, UK 51.491060 0.121420
2 Station Parade, West Acton London Underground ... 51.518055 -0.281053
3 Acton Central, London W3, UK 51.508720 -0.262950
type
0 [u'transit_station', u'point_of_interest', u'e...
1 [u'transit_station', u'point_of_interest', u'e...
2 [u'train_station', u'transit_station', u'point...
3 [u'transit_station', u'point_of_interest', u'e...
I tried
new_df.drop_duplicates(["sent_name"])
and
new_df.drop_duplicates(subset="sent_name")
On inspection, neither of these removes all of the duplicates.
For example,
1038 Woodford Station, London, UK
1040 Woodford Station, London, UK
1041 Woodford Station, London, UK
1043 Woodford Station, London, UK
1044 Woodford Station, London, UK
1038 South Woodford London Underground Station, Geo... 51.591789 0.027315
1040 Woodford, Woodford, Woodford Green, Greater Lo... 51.606900 0.034000
1041 South Woodford, London E18, UK 51.591910 0.027360
1043 South Woodford (Stop C), London E18, UK 51.591312 0.029013
1044 South Woodford (Stop D), London E18, UK 51.592010 0.027658
1038 [u'train_station', u'transit_station', u'point...
1040 [u'transit_station', u'point_of_interest', u'e...
1041 [u'transit_station', u'point_of_interest', u'e...
1043 [u'transit_station', u'point_of_interest', u'e...
1044 [u'transit_station', u'point_of_interest', u'e...
Did you assign the result back? 'new_df = new_df.drop_duplicates(["sent_name"])' By default, unless you pass the parameter 'inplace=True', it returns a modified copy of the df; see the [docs](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html#pandas.DataFrame.drop_duplicates) – EdChum
I owe you one! – LearningSlowly
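For anyone landing here later, here is a minimal sketch of the point made in the comment. The toy rows are made up for illustration (only the column name 'sent_name' and the variable name new_df come from the question), and the names rows_before, deduped and in_place are hypothetical. It assumes a reasonably recent pandas where drop_duplicates accepts subset and inplace.

```python
import pandas as pd

# Hypothetical stand-in for new_df; the real frame has more columns and rows.
new_df = pd.DataFrame({
    "sent_name": [
        "Abbey Road Station, London, UK",
        "Woodford Station, London, UK",
        "Woodford Station, London, UK",
        "Woodford Station, London, UK",
    ],
    "Name": [
        "Abbey Road, London E15, UK",
        "South Woodford, London E18, UK",
        "South Woodford (Stop C), London E18, UK",
        "South Woodford (Stop D), London E18, UK",
    ],
})

# Calling drop_duplicates and discarding the return value leaves new_df unchanged.
new_df.drop_duplicates(subset="sent_name")
rows_before = len(new_df)

# Option 1: keep the returned copy (first row per sent_name is retained by default).
deduped = new_df.drop_duplicates(subset="sent_name")

# Option 2: modify a frame in place; here a copy, so new_df stays available for comparison.
in_place = new_df.copy()
in_place.drop_duplicates(subset="sent_name", inplace=True)

print(rows_before, len(deduped), len(in_place))
```

In the question's own code, the equivalent fix would be to write new_df = new_df.drop_duplicates(subset="sent_name") so that the de-duplicated copy replaces the original.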