2017-01-19

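I have a medium-sized list of permissions and the users assigned to those permissions. I want to group users together when they share the same permissions, but I've run into some problems. How can I identify the subgroups present in a set of similar groups?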

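When I work with the data in a spreadsheet, I can compute each unique set of permissions from the users' full permission sets and combine the users who hold that set into a role. The result is that every user has exactly one role.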
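As a rough illustration of that one-role-per-user grouping, here is a minimal Python sketch. The user names and permission ids are made-up toy data, and `roles_by_exact_set` is just a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical toy data: user -> set of permission ids.
user_perms = {
    "alice": {1, 2, 3},
    "bob": {1, 2, 3},
    "carol": {1, 4},
    "dave": {1, 2, 3, 4},
}

def roles_by_exact_set(user_perms):
    """Map each distinct permission set to the users holding exactly that set."""
    roles = defaultdict(list)
    for user, perms in user_perms.items():
        # frozenset is hashable, so the whole permission set can be a dict key.
        roles[frozenset(perms)].append(user)
    return dict(roles)

roles = roles_by_exact_set(user_perms)
for perms, users in roles.items():
    print(sorted(perms), "->", users)
```

Here alice and bob land in the same role, while dave gets a role of his own even though his permissions are a superset of theirs. That is the limitation described next.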

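What I'd like to be able to do is identify subgroups in the dataset, so that I can reduce the number of roles while increasing the number of role assignments per user.

Here is an example dataset: [image]

Looking at the data, it's easy to spot potential roles (users 1 and 2 both share the first 6 permissions), but is there a way to tease this out of this kind of data with SQL, spreadsheet functions, or a simple program?

I know this problem has multiple answers, depending on the minimum number of permissions per role, the minimum number of users assigned to a role, and so on.

I'm not expecting a definitive answer; I'm trying to move a step forward toward an algorithm, if that makes sense.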

Answers

1

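Instead of clustering (which works very poorly on binary data), use:

  • Link prediction / recommender systems: if user A has permissions b and c, which other permissions should be suggested?
  • Frequent itemset mining / association rules: if a user has permissions a and b, then he should also have permission c (a, b -> c)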
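
To make the second idea concrete, here is a minimal brute-force sketch in Python, suitable only for small datasets. It is not a real Apriori implementation and does not use any mining library; the toy data and the helper names `frequent_itemsets` and `confidence` are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy data: user -> set of permissions.
user_perms = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b", "c"},
    "u3": {"a", "b"},
    "u4": {"a", "d"},
}

def frequent_itemsets(user_perms, min_support, max_size=3):
    """Count every permission combination of up to max_size items.

    Returns {itemset: number of users holding all of its permissions},
    keeping only itemsets held by at least min_support users.
    """
    counts = {}
    for perms in user_perms.values():
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(perms), size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {k: v for k, v in counts.items() if v >= min_support}

def confidence(itemsets, antecedent, consequent):
    """Share of users holding `antecedent` who also hold `consequent`."""
    both = frozenset(antecedent) | frozenset(consequent)
    return itemsets[both] / itemsets[frozenset(antecedent)]

itemsets = frequent_itemsets(user_perms, min_support=2)
print(itemsets[frozenset("ab")])        # users holding both a and b
print(confidence(itemsets, "ab", "c"))  # strength of the rule a, b -> c
```

Each frequent itemset is a candidate role, and its support is the number of users that role would cover. The enumeration grows combinatorially with the number of permissions per user, which is why dedicated algorithms exist for larger data.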

I did a little digging into these theories and they are incredible; however, I think they are too advanced for my small dataset. – gfroese


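(Missed my edit window.) Thanks for the information. I think I was hoping for a simple algorithm where you supply two groups and their associations and it gives the possible ways to classify those associations most efficiently. I think what I'm going to do is just visualize these relationships in a large grid, manually move similar users next to each other in the grid, and then create the groups by hand. – gfroese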


After doing some more reading, I stumbled upon this amazing [library](http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php#example1), which is exactly what I needed. Thanks! – gfroese

1

OK, let's make some data!

DECLARE @User TABLE 
(
    Perm INT, 
    User1 INT, 
    User2 INT, 
    User3 INT, 
    User4 INT, 
    User5 INT, 
    User6 INT, 
    User7 INT, 
    User8 INT, 
    User9 INT, 
    User10 INT 
) 

INSERT INTO @User 
(Perm, User1, User2, User3, User4, User5, User6, User7, User8, User9, User10) 
VALUES 
(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), 
(2, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1), 
(3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), 
(4, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0), 
(5, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1), 
(6, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1), 
(7, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1), 
(8, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), 
(9, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1); 

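Now that we have the permissions and users in a table, let's do some counting and create a grouping value. Each user column contributes a distinct power of two, so two permissions get the same GroupMe value exactly when they are held by the same set of users.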

SELECT 
    u.Perm, 
    u.User1, 
    u.User2, 
    u.User3, 
    u.User4, 
    u.User5, 
    u.User6, 
    u.User7, 
    u.User8, 
    u.User9, 
    u.User10, 
    CASE WHEN u.User1 = 1 THEN 1 ELSE 0 END + 
    CASE WHEN u.User2 = 1 THEN 2 ELSE 0 END + 
    CASE WHEN u.User3 = 1 THEN 4 ELSE 0 END + 
    CASE WHEN u.User4 = 1 THEN 8 ELSE 0 END + 
    CASE WHEN u.User5 = 1 THEN 16 ELSE 0 END + 
    CASE WHEN u.User6 = 1 THEN 32 ELSE 0 END + 
    CASE WHEN u.User7 = 1 THEN 64 ELSE 0 END + 
    CASE WHEN u.User8 = 1 THEN 128 ELSE 0 END + 
    CASE WHEN u.User9 = 1 THEN 256 ELSE 0 END + 
    CASE WHEN u.User10 = 1 THEN 512 ELSE 0 END AS GroupMe 
FROM @User u 

Here is the output:

Perm  User1  User2  User3  User4  User5  User6  User7  User8  User9  User10  GroupMe
1     1      1      1      1      1      1      1      1      1      1       1023
2     1      1      0      0      0      0      0      1      1      1       899
3     1      0      0      0      0      0      0      0      0      0       1
4     1      1      1      1      0      0      0      0      0      0       15
5     1      1      0      0      0      0      0      1      1      1       899
6     1      1      0      0      0      0      0      0      1      1       771
7     0      0      1      1      1      1      1      0      1      1       892
8     1      0      0      0      0      0      0      0      0      0       1
9     1      0      1      1      0      1      1      0      1      1       877

You'll see that permissions 3 and 8 have the same value. Permissions 2 and 5 also share a value.
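
For anyone who would rather do this step outside SQL, the same bitmask idea can be sketched in Python. The rows are copied from the INSERT statement above; `group_me` is a hypothetical helper name that mirrors the GroupMe column.

```python
from collections import defaultdict

# Rows copied from the INSERT above: permission -> flags for User1..User10.
rows = {
    1: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    2: [1, 1, 0, 0, 0, 0, 0, 1, 1, 1],
    3: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    4: [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    5: [1, 1, 0, 0, 0, 0, 0, 1, 1, 1],
    6: [1, 1, 0, 0, 0, 0, 0, 0, 1, 1],
    7: [0, 0, 1, 1, 1, 1, 1, 0, 1, 1],
    8: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    9: [1, 0, 1, 1, 0, 1, 1, 0, 1, 1],
}

def group_me(flags):
    """User n contributes 2**(n-1), mirroring the CASE expressions."""
    return sum(flag << i for i, flag in enumerate(flags))

# Collect the permissions that share a bitmask, i.e. the same set of users.
by_mask = defaultdict(list)
for perm, flags in rows.items():
    by_mask[group_me(flags)].append(perm)

shared = {mask: perms for mask, perms in by_mask.items() if len(perms) > 1}
print(shared)  # {899: [2, 5], 1: [3, 8]}
```

Permissions that share a mask can always be bundled, since granting one without the other never happens in this data.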

OK, let's add a numbers table and break the permissions out per user:

;WITH 
a AS (SELECT 1 AS i UNION ALL SELECT 1), 
b AS (SELECT 1 AS i FROM a AS x, a AS y), 
c AS (SELECT 1 AS i FROM b AS x, b AS y), 
d AS (SELECT 1 AS i FROM c AS x, c AS y), 
e AS (SELECT 1 AS i FROM d AS x, d AS y), 
f AS (SELECT 1 AS i FROM e AS x, e AS y), 
numbers AS 
(
    SELECT TOP(10) 
     ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS number 
    FROM f 
), PrivBreakout AS 
(
    SELECT 1 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User1 = 1 
    UNION 
    SELECT 2 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User2 = 1 
    UNION 
    SELECT 3 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User3 = 1 
    UNION 
    SELECT 4 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User4 = 1 
    UNION 
    SELECT 5 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User5 = 1 
    UNION 
    SELECT 6 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User6 = 1 
    UNION 
    SELECT 7 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User7 = 1 
    UNION 
    SELECT 8 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User8 = 1 
    UNION 
    SELECT 9 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User9 = 1 
    UNION 
    SELECT 10 AS UserId, u.Perm 
    FROM @User u 
    WHERE u.User10 = 1 
), ThreeLayerCombo AS 
(
    -- every combination of three distinct numbers with priva < privb < privc
    SELECT 
     a.number AS priva, 
     b.number AS privb, 
     c.number AS privc 
    FROM numbers a 
    CROSS JOIN numbers b 
    CROSS JOIN numbers c 
    WHERE b.number > a.number 
     AND c.number > b.number 
) 

In the code above, I decided to look for combinations of at least 3 permissions. The SELECT below completes the same WITH statement and counts the users who hold all three permissions of each combination:

SELECT t.priva, t.privb, t.privc, COUNT(DISTINCT a.UserId) AS Grouper 
FROM ThreeLayerCombo t 
INNER JOIN PrivBreakout a 
    ON t.priva = a.Perm 
INNER JOIN PrivBreakout b 
    ON b.UserId = a.UserId 
    AND t.privb = b.Perm 
INNER JOIN PrivBreakout c 
    ON c.UserId = a.UserId 
    AND t.privc = c.Perm 
GROUP BY t.priva, t.privb, t.privc 
ORDER BY COUNT(DISTINCT a.UserId) DESC 

Let's look for the best combinations. Here is the output (rows with the same Grouper value can come back in any order):

priva  privb  privc  Grouper
1      7      9      6
1      2      5      5
2      5      6      4
1      2      6      4
1      5      6      4
1      2      9      3
2      5      9      3
1      5      9      3
1      6      9      3
2      6      9      3
5      6      9      3
1      4      9      3
5      7      9      2
5      6      7      2
4      5      6      2
2      7      9      2
6      7      9      2
1      6      7      2
2      6      7      2
2      5      7      2
2      4      5      2
2      4      6      2
1      2      7      2
1      5      7      2
1      2      4      2
1      4      5      2
1      4      6      2
1      4      7      2
4      7      9      2
1      4      8      1
1      2      3      1
1      5      8      1
1      2      8      1
1      3      4      1
1      3      5      1
1      3      6      1
1      3      8      1
1      3      9      1
2      4      8      1
2      4      9      1
2      5      8      1
2      6      8      1
1      6      8      1
1      8      9      1
2      3      4      1
2      3      5      1
2      3      6      1
2      3      8      1
2      3      9      1
6      8      9      1
2      8      9      1
3      4      5      1
3      4      6      1
3      4      8      1
3      4      9      1
3      5      6      1
3      5      8      1
3      5      9      1
3      6      8      1
3      6      9      1
3      8      9      1
4      5      8      1
4      5      9      1
4      6      8      1
4      6      9      1
4      8      9      1
5      6      8      1
5      8      9      1

From the output, the best bets for building specific roles are (1,2,5) and (1,7,9).
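
If you want to cross-check the counting outside SQL Server, the same three-permission count can be sketched in a few lines of Python. The `user_perms` mapping is transcribed by hand from the sample table in this answer, and `combo_counts` is a hypothetical helper name.

```python
from collections import Counter
from itertools import combinations

# user -> permissions, transcribed from the sample table in this answer.
user_perms = {
    1: {1, 2, 3, 4, 5, 6, 8, 9},
    2: {1, 2, 4, 5, 6},
    3: {1, 4, 7, 9},
    4: {1, 4, 7, 9},
    5: {1, 7},
    6: {1, 7, 9},
    7: {1, 7, 9},
    8: {1, 2, 5},
    9: {1, 2, 5, 6, 7, 9},
    10: {1, 2, 5, 6, 7, 9},
}

def combo_counts(user_perms, size):
    """Count, for each combination of `size` permissions, the users holding all of them."""
    counts = Counter()
    for perms in user_perms.values():
        # Sorting gives each combination one canonical tuple, e.g. (1, 2, 5).
        counts.update(combinations(sorted(perms), size))
    return counts

counts = combo_counts(user_perms, 3)
top_two = [combo for combo, _ in counts.most_common(2)]
print(top_two)
```

Changing `size` makes it easy to try pairs or four-permission roles without adding another CROSS JOIN.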

Hope this helps!


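Thanks Kevin, this is actually the same approach I used to create my original groups. My goal is to find the common subgroups. In this solution every user is a member of one group/role, when in reality they could be part of several smaller permission/role groups. – gfroese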