R: sparse submatrix without reducing the dimensions of the original matrix

From this data frame df

  group   from     to weight
1     1   Joey   Joey      1
2     1   Joey Deedee      1
3     1 Deedee   Joey      1
4     1 Deedee Deedee      1
5     2 Johnny Johnny      1
6     2 Johnny  Tommy      1
7     2  Tommy Johnny      1
8     2  Tommy  Tommy      1

which can be created like this

df <- structure(list(group = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), from = 
structure(c(2L, 2L, 1L, 1L, 3L, 3L, 4L, 4L), .Label = c("Deedee", 
"Joey", "Johnny", "Tommy"), class = "factor"), to = structure(c(2L, 1L, 
2L, 1L, 3L, 4L, 3L, 4L), .Label = c("Deedee", "Joey", "Johnny", 
"Tommy"), class = "factor"), weight = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L)), .Names = c("group", "from", "to", "weight"), class = "data.frame", 
row.names = c(NA, -8L))

a sparse matrix mat can be created using the Matrix package with

library(Matrix)

mat <- sparseMatrix(i = as.numeric(df$from), j = as.numeric(df$to),
                    x = df$weight,
                    dimnames = list(levels(df$from), levels(df$to)))

which looks like this:

4 x 4 sparse Matrix of class "dgCMatrix"
       Deedee Joey Johnny Tommy
Deedee      1    1      .     .
Joey        1    1      .     .
Johnny      .    .      1     1
Tommy       .    .      1     1

How can I create a sparse submatrix using df$group without reducing the dimensions of the original matrix?

The result should look like this:

4 x 4 sparse Matrix of class "dgCMatrix"
       Deedee Joey Johnny Tommy
Deedee      1    1      .     .
Joey        1    1      .     .
Johnny      .    .      .     .
Tommy       .    .      .     .

First idea

If I subset the data frame and create the submatrix

df1 <- subset(df, group == 1)
mat1 <- sparseMatrix(i = as.numeric(df1$from), j = as.numeric(df1$to),
                     x = df1$weight)

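the result is a 2 x 2 sparse matrix. This is not an option. Besides "losing two nodes", I would also have to filter the factor levels that are used as dimnames.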

The trick could be not to lose the factor levels when creating the matrix.
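To see where the dimensions get lost, here is a minimal sketch. It rebuilds the example data with `data.frame()` and `factor()` (an assumed equivalent of the `structure()` call in the question) and checks what `subset()` does to the factors:

```r
# Assumed equivalent of the question's df (levels sort alphabetically)
df <- data.frame(
  group  = rep(1:2, each = 4),
  from   = factor(c("Joey", "Joey", "Deedee", "Deedee",
                    "Johnny", "Johnny", "Tommy", "Tommy")),
  to     = factor(c("Joey", "Deedee", "Joey", "Deedee",
                    "Johnny", "Tommy", "Johnny", "Tommy")),
  weight = 1L
)

df1 <- subset(df, group == 1)

# subset() keeps unused factor levels, so all four names are still there
n_levels <- nlevels(df1$from)      # 4

# The integer codes only reach 2, which is all that sparseMatrix() sees
# when no dims argument is given
codes <- as.integer(df1$from)      # 2 2 1 1
max_code <- max(codes)             # 2
```

So the factor levels survive the subsetting; the matrix shrinks only because `sparseMatrix()` infers its size from the largest index it is handed.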

Second idea

If I set df$weight to zero for the groups I am not interested in and create the submatrix

df2 <- df
df2[df2$group == 2, 4] <- 0
mat2 <- sparseMatrix(i = as.numeric(df2$from), j = as.numeric(df2$to),
                     x = df2$weight,
                     dimnames = list(levels(df$from), levels(df$to)))

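the matrix has the right dimensions and I can easily carry the factor levels along as dimnames, but the matrix now contains explicit zeros: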

4 x 4 sparse Matrix of class "dgCMatrix"
       Deedee Joey Johnny Tommy
Deedee      1    1      .     .
Joey        1    1      .     .
Johnny      .    .      0     0
Tommy       .    .      0     0

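This is also not an option, because row normalization creates NaNs, and I run into trouble when I convert the matrix into a graph and perform network analysis.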

Here, the trick could be to remove the zeros from the sparse matrix. But how?
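For this second idea, the Matrix package provides `drop0()`, which removes explicitly stored zeros from a sparse matrix. A minimal sketch, assuming the example data is rebuilt with `data.frame()` rather than the `structure()` call above:

```r
library(Matrix)

# Assumed equivalent of the question's df (levels sort alphabetically)
df <- data.frame(
  group  = rep(1:2, each = 4),
  from   = factor(c("Joey", "Joey", "Deedee", "Deedee",
                    "Johnny", "Johnny", "Tommy", "Tommy")),
  to     = factor(c("Joey", "Deedee", "Joey", "Deedee",
                    "Johnny", "Tommy", "Johnny", "Tommy")),
  weight = 1L
)

# Second idea: zero out the weights of the unwanted group
df2 <- df
df2$weight[df2$group == 2] <- 0

mat2 <- sparseMatrix(i = as.integer(df2$from), j = as.integer(df2$to),
                     x = df2$weight,
                     dimnames = list(levels(df2$from), levels(df2$to)))

# drop0() discards stored entries whose value is zero;
# dimensions and dimnames are kept
mat2_clean <- drop0(mat2)
```

This still builds the full set of triplets before discarding half of them, so for very large data it is likely more wasteful than filtering the rows first.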

In any case, the solution must be as efficient as possible, because the matrices become very large.

2015-12-09 hyco

Basically your first idea, with the dimensions passed explicitly via dims:

mat1 <- sparseMatrix(i = as.numeric(df1$from), j = as.numeric(df1$to), 
        x = df1$weight, 
        dims = c(length(levels(df$from)), length(levels(df$to))), 
        dimnames = list(levels(df$from), levels(df$to))) 

#4 x 4 sparse Matrix of class "dgCMatrix"
#       Deedee Joey Johnny Tommy
#Deedee      1    1      .     .
#Joey        1    1      .     .
#Johnny      .    .      .     .
#Tommy       .    .      .     .
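If a submatrix is needed for every group, the same call can be wrapped in a small function. The following is only a sketch and not part of the answer itself: `group_submatrix` is a hypothetical helper name, and the data is rebuilt with `data.frame()` as an assumed equivalent of the question's `df`.

```r
library(Matrix)

# Assumed equivalent of the question's df (levels sort alphabetically)
df <- data.frame(
  group  = rep(1:2, each = 4),
  from   = factor(c("Joey", "Joey", "Deedee", "Deedee",
                    "Johnny", "Johnny", "Tommy", "Tommy")),
  to     = factor(c("Joey", "Deedee", "Joey", "Deedee",
                    "Johnny", "Tommy", "Johnny", "Tommy")),
  weight = 1L
)

# Hypothetical helper: full-size sparse matrix holding only group g
group_submatrix <- function(df, g) {
  keep <- df$group == g
  sparseMatrix(
    i = as.integer(df$from[keep]),
    j = as.integer(df$to[keep]),
    x = df$weight[keep],
    # dims and dimnames come from the full factors, not from the subset
    dims = c(nlevels(df$from), nlevels(df$to)),
    dimnames = list(levels(df$from), levels(df$to))
  )
}

# One matrix per group, named by group id
groups <- sort(unique(df$group))
mats <- setNames(lapply(groups, function(g) group_submatrix(df, g)), groups)
```

Here `mats[["1"]]` holds only the Joey/Deedee block and `mats[["2"]]` only the Johnny/Tommy block, both as 4 x 4 matrices without explicit zeros.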

2015-12-09 11:52:07 Roland

Thanks a lot, that's it. – hyco
