2013-07-08 133 views
2

How to create an adjacency matrix from raw data that is non-numeric in nature

An example of my input is given below:

User ID 1 --- Artist 5 
User ID 2 --- Artist 1 
User ID 3 --- Artist 7 
User ID 4 --- Artist 2 
User ID 5 --- Artist 3 
User ID 1 --- Artist 2 
User ID 3 --- Artist 1 

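The data above are listening records from users of a music application.

I would like to generate an adjacency matrix corresponding to the example given below: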

          ARTIST 1 ARTIST 2 ARTIST 3 ARTIST 4 ARTIST 5 ARTIST 6 ARTIST 7
USER ID 1        0        1        0        0        1        0        0
USER ID 2        1        0        0        0        0        0        0
USER ID 3        1        0        0        0        0        0        1
USER ID 4        0        1        0        0        0        0        0
USER ID 5        0        0        1        0        0        0        0

How can this be done in R? Any hints or pointers would be most appreciated.

Thank you in advance for your time and help.

+0

I suggest adding the "R" tag, which will reach the R experts – doctorlove

+0

Thanks doctorlove ... will add the tag – Manus

+0

Just add a column of `1`s to your data and use the answer above. – flodel

Answers

3

This works:

# get data in useable form 
ContingencyTable <- read.table(text=gsub(pattern = " --- ", replacement = ",","User ID 1 --- Artist 5 
User ID 2 --- Artist 1 
User ID 3 --- Artist 7 
User ID 4 --- Artist 2 
User ID 5 --- Artist 3 
User ID 1 --- Artist 2 
User ID 3 --- Artist 1"),sep=",", stringsAsFactors = FALSE) 
# add variable for match value 
ContingencyTable$Val <- 1 
# more or less lifted from Arun's answer linked by @Hong Ooi, above 
adjMat <- reshape2::dcast(ContingencyTable, V1 ~ V2, value.var = "Val", fill=0) 
rownames(adjMat) <- adjMat[,1] 
adjMat <- adjMat[,2:ncol(adjMat)] 

adjMat 
          Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
User ID 1        0        1        0        1        0
User ID 2        1        0        0        0        0
User ID 3        1        0        0        0        1
User ID 4        0        1        0        0        0
User ID 5        0        0        1        0        0
+0

Thanks Tim.... your answer was very helpful. – Manus

+1

`table(ContingencyTable)` also seems to work – user20650
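
A minimal base R sketch of the `table()` idea from the comment above, assuming the data has only the two columns `V1` (user) and `V2` (artist), i.e. before the `Val` column is added. The names `dat`, `tab`, and `adj` are illustrative only. Because `table()` counts occurrences, the comparison with zero is there to keep the result 0/1 even if a user-artist pair were repeated.

```r
# Rebuild the two-column data from the question (V1 = user, V2 = artist)
dat <- data.frame(
  V1 = c("User ID 1", "User ID 2", "User ID 3", "User ID 4",
         "User ID 5", "User ID 1", "User ID 3"),
  V2 = c("Artist 5", "Artist 1", "Artist 7", "Artist 2",
         "Artist 3", "Artist 2", "Artist 1"),
  stringsAsFactors = FALSE
)

# Cross-tabulate users against artists; each cell holds a count
tab <- table(dat$V1, dat$V2)

# Turn the counts into a plain 0/1 integer matrix
adj <- unclass(tab) > 0
storage.mode(adj) <- "integer"
adj
```

The result is an ordinary matrix with users as row names and artists as column names, so it can be indexed as `adj["User ID 1", "Artist 5"]`.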

2

The qdap package's adjmat function can do this:

dat <- read.table(text=gsub(pattern = " --- ", replacement = ",", 
"User ID 1 --- Artist 5 
User ID 2 --- Artist 1 
User ID 3 --- Artist 7 
User ID 4 --- Artist 2 
User ID 5 --- Artist 3 
User ID 1 --- Artist 2 
User ID 3 --- Artist 1"),sep=",", stringsAsFactors = FALSE) 


library(qdap) 
x <- with(dat, termco(V1, V2, unique(V1))) 
adjmat(x)$boolean 

## > adjmat(x)$boolean
##           Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
## User ID 1        0        1        0        1        0
## User ID 2        1        0        0        0        0
## User ID 3        1        0        0        0        1
## User ID 4        0        1        0        0        0
## User ID 5        0        0        1        0        0

PS Tim Riffe, nice approach to reading in the data :)

+0

Also, I think this is called a Boolean matrix, not an adjacency matrix, but I could be wrong. –

+0

Thanks Tyler for pointing to the qdap package ... your answer was very useful. – Manus

4

If DF is the two-column data frame corresponding to the data in the question:

xtabs(data = DF) 

This gives:

           V2
V1          Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
  User ID 1        0        1        0        1        0
  User ID 2        1        0        0        0        0
  User ID 3        1        0        0        0        1
  User ID 4        0        1        0        0        0
  User ID 5        0        0        1        0        0

Note: we used this as input:

DF <- structure(list(V1 = structure(c(1L, 2L, 3L, 4L, 5L, 1L, 3L), .Label = c("User ID 1", 
"User ID 2", "User ID 3", "User ID 4", "User ID 5"), class = "factor"), 
    V2 = structure(c(4L, 1L, 5L, 2L, 3L, 2L, 1L), .Label = c("Artist 1", 
    "Artist 2", "Artist 3", "Artist 5", "Artist 7"), class = "factor")), .Names = c("V1", 
"V2"), class = "data.frame", row.names = c(NA, -7L)) 
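
The desired output in the question also lists Artist 4 and Artist 6, which never occur in the data, so they are missing from the table above. One possible way to include them, sketched here under the assumption that the full set of artists is known in advance (the vector `all_artists` is hypothetical and not part of the original answer), is to declare them as factor levels before calling `xtabs`, which keeps unused levels by default:

```r
# Two-column data from the question (V1 = user, V2 = artist)
DF <- data.frame(
  V1 = c("User ID 1", "User ID 2", "User ID 3", "User ID 4",
         "User ID 5", "User ID 1", "User ID 3"),
  V2 = c("Artist 5", "Artist 1", "Artist 7", "Artist 2",
         "Artist 3", "Artist 2", "Artist 1"),
  stringsAsFactors = TRUE
)

# Assumption: the complete list of artists is known up front
all_artists <- paste("Artist", 1:7)

# Re-level V2 so that artists without any listens still get a column
DF$V2 <- factor(DF$V2, levels = all_artists)

# xtabs keeps unused factor levels unless drop.unused.levels = TRUE
full <- xtabs(~ V1 + V2, data = DF)
full
```

The columns for Artist 4 and Artist 6 then appear filled with zeros, matching the layout asked for in the question.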
+0

This is great, thanks! –

+0

Great answer! Thanks Grothendieck! – Manus