2016-10-21 179 views
-1

Data: delete all rows with a specific entry

DB <- data.frame(orderID = c(1,2,3,4,4,5,6,6,7,8),
                 orderDate = c("1.1.12","1.1.12","1.1.12","13.1.12","13.1.12","12.1.12","10.1.12","10.1.12","21.1.12","24.1.12"),
                 itemID = c(2,3,2,5,12,4,2,3,1,5),
                 customerID = c(1, 2, 3, 1, 1, 3, 2, 2, 1, 1),
                 itemPrice = c(9.99, 14.99, 9.99, 19.99, 29.99, 4.99, 9.99, 14.99, 49.99, 19.99),
                 orderItemStatus = c("sold", "sold", "sold", "refunded", "sold", "refunded", "sold", "refunded", "sold", "refunded"))

Expected result:

DB <- data.frame(orderID = c(1,2,3,4,6,7),
                 orderDate = c("1.1.12","1.1.12","1.1.12","13.1.12","10.1.12","21.1.12"),
                 itemID = c(2,3,2,12,2,1),
                 customerID = c(1, 2, 3, 1, 2, 1),
                 itemPrice = c(9.99, 14.99, 9.99, 29.99, 9.99, 49.99),
                 orderItemStatus = c("sold", "sold", "sold", "sold", "sold", "sold"))

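For understanding:

The orderID is consecutive. Products ordered by the same customerID on the same day get the same orderID. When the same customer orders products on another day, that is a new orderID.

I want to delete all rows where orderItemStatus = refunded. How can I do that? (I think it is quite simple, and I found "Removing specific rows from a dataframe", but I don't understand how it works, so please help me :( )

-> The original data has about 500k rows, so please suggest solutions that need little computing power...

Thank you very much for your support.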

+0

Try 'DB <- DB[DB$orderItemStatus != "refunded", ]' – rosscova

+0

That worked! Thank you! – AbsoluteBeginner

Answers

0

The following code should do the job:

DB_new <- DB[-which(DB$orderItemStatus == "refunded"), ] 

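which gives you the indices at which the comparison is TRUE. With DB[-c(1,5,10), ], for example, you can remove rows 1, 5 and 10. You can also do this in two steps: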

indices_to_remove <- which(DB$orderItemStatus == "refunded") 
DB_new <- DB[-indices_to_remove, ] 
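
One thing worth knowing about the negative-index form is what happens when no row matches. The sketch below uses a cut-down stand-in for DB (only the first five rows of the question's data, two columns); the status "cancelled" and the name DB_safe are made up for illustration and do not appear in the question.

```r
# Cut-down stand-in for DB: first five rows of the question's data
DB <- data.frame(
  orderID = c(1, 2, 3, 4, 4),
  orderItemStatus = c("sold", "sold", "sold", "refunded", "sold"),
  stringsAsFactors = FALSE
)

# which() returns the positions where the comparison is TRUE
idx <- which(DB$orderItemStatus == "refunded")
DB_new <- DB[-idx, ]            # drop those positions

# Edge case: nothing matches, so which() returns integer(0).
# "cancelled" is a hypothetical status used only to trigger this case.
none <- which(DB$orderItemStatus == "cancelled")
nrow(DB[-none, ])               # 0: negating an empty index selects no rows

# A simple guard leaves the data unchanged when there is nothing to remove
DB_safe <- if (length(none) > 0) DB[-none, ] else DB
```

With data that is known to contain refunded rows, as in the question, the plain DB[-idx, ] form is fine; the guard only matters when the match set can be empty.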

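Another approach, suggested by @rosscova in the comments, is to find the indices of the rows you want to keep and assign those rows back to the result.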
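
As a minimal sketch of that second approach, the example below rebuilds the question's data (with the status values quoted as strings) and keeps rows through a logical vector. The variable name keep is an assumption made for readability. The comparison is vectorised, so it should scale reasonably to the roughly 500k rows mentioned in the question, though that has not been measured here.

```r
# Data from the question; stringsAsFactors = FALSE keeps the status as character
DB <- data.frame(
  orderID = c(1, 2, 3, 4, 4, 5, 6, 6, 7, 8),
  orderDate = c("1.1.12", "1.1.12", "1.1.12", "13.1.12", "13.1.12",
                "12.1.12", "10.1.12", "10.1.12", "21.1.12", "24.1.12"),
  itemID = c(2, 3, 2, 5, 12, 4, 2, 3, 1, 5),
  customerID = c(1, 2, 3, 1, 1, 3, 2, 2, 1, 1),
  itemPrice = c(9.99, 14.99, 9.99, 19.99, 29.99, 4.99, 9.99, 14.99, 49.99, 19.99),
  orderItemStatus = c("sold", "sold", "sold", "refunded", "sold",
                      "refunded", "sold", "refunded", "sold", "refunded"),
  stringsAsFactors = FALSE
)

# TRUE for every row that should stay, FALSE for refunded rows
keep <- DB$orderItemStatus != "refunded"

# Logical indexing keeps exactly the TRUE rows
DB_new <- DB[keep, ]

DB_new$orderID   # 1 2 3 4 6 7, matching the expected result in the question
```

Unlike the negative-index form, this returns the full data frame when no row is refunded. If orderItemStatus can contain NA, the comparison yields NA for those rows and they come back as all-NA rows, so they would need separate handling.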