Creating a 3-way table of percentages

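I would like a 3-way table that shows column or row percentages using three categorical variables. The command below gives counts, but I cannot find out how to get percentages.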

sysuse nlsw88 

table married race collgrad, col 

-------------------------------------------------------------------
          |                college graduate and race
          |  ---- not college grad ----  ------ college grad ------
  married |  white  black  other  Total  white  black  other  Total
----------+--------------------------------------------------------
   single |    355    256      5    616    132     53      3    188
  married |    862    224     12  1,098    288     50      6    344
-------------------------------------------------------------------

How can I get percentages?

Source

2017-04-06 amo

Percentages of 'married'? 'race'? 'collgrad'? Everyone? Or 2 or 3 or 4 of those? –

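@nick Percentages given 'collgrad'. For example, the percentage of white, single, non-college graduates is '355 * 100 / (355 + 862)'. The percentage of single non-college graduates (irrespective of race) is '616 * 100 / (616 + 1098)'. Similar to the results of 'bys collgrad: table married race, col', but in one table rather than the two tables that 'bys' gives. – amo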
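
The two percentages described in the comment above can be checked with plain arithmetic. This is only a sketch that plugs in the counts from the table in the question; nothing here depends on the dataset being loaded.

```stata
* percent single among white non-college graduates: 355 of 355 + 862
display %4.1f 100 * 355 / (355 + 862)

* percent single among all non-college graduates: 616 of 616 + 1,098
display %4.1f 100 * 616 / (616 + 1098)
```

The two commands display 29.2 and 35.9 respectively.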

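This answer shows some juggling tricks. The downside is that I don't know an easy way to get exactly what you ask for. The upside is that all of these tricks are easy to understand and often useful.

Let's use your example, which is excellent for the purpose.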

. sysuse nlsw88, clear 
(NLSW, 1988 extract)

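Tip #1 You can calculate a percent variable for yourself. I focus on % single. In this dataset married is binary, so I won't show the complementary percent. Once you have calculated it, you can (a) rely on the fact that it is constant within the groups you used to define it and (b) tabulate it directly. I find that tabdisp is underrated by users. It is billed as a programmer's command, but it is not at all difficult to use. tabdisp lets you set a display format on the fly; it does no harm, and may be useful for other commands, to assign one directly using format.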

. egen pcsingle = mean(100 * (1 - married)), by(collgrad race) 

. tabdisp collgrad race, c(pcsingle) format(%2.1f) 

--------------------------------------
                 |        race
college graduate |  white  black  other
-----------------+--------------------
not college grad |   29.2   53.3   29.4
    college grad |   31.4   51.5   33.3
--------------------------------------

. format pcsingle %2.1f
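
As a sanity check on Tip #1, the sketch below recomputes one cell by counting observations directly and compares it with the egen result. It assumes the usual coding in nlsw88 (race == 1 is white, collgrad == 0 is not a college graduate, married == 0 is single) and that none of the three variables has missing values.

```stata
sysuse nlsw88, clear
egen pcsingle = mean(100 * (1 - married)), by(collgrad race)

* by hand: single white non-college graduates as a percent of that group
count if married == 0 & race == 1 & collgrad == 0
local nsingle = r(N)
count if race == 1 & collgrad == 0
display %4.1f 100 * `nsingle' / r(N)

* the egen variable is constant within the group, so its mean is that value
summarize pcsingle if race == 1 & collgrad == 0, meanonly
display %4.1f r(mean)
```

Both display commands should show the same figure as the white, not college grad cell of the tabdisp table.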

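Tip #2 A user-written command, groups, offers different flexibility. groups must be installed from SSC (strictly, it must be installed before you can use it). It is a wrapper for various kinds of tables, but it uses list as its display engine.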

. * do this installation just once 
. ssc inst groups 

. groups collgrad race pcsingle 

    +-------------------------------------------------------+
    |         collgrad    race   pcsingle   Freq.   Percent |
    |-------------------------------------------------------|
    | not college grad   white       29.2    1217     54.19 |
    | not college grad   black       53.3     480     21.37 |
    | not college grad   other       29.4      17      0.76 |
    |     college grad   white       31.4     420     18.70 |
    |     college grad   black       51.5     103      4.59 |
    |-------------------------------------------------------|
    |     college grad   other       33.3       9      0.40 |
    +-------------------------------------------------------+

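We can improve on that. We can set up better header text using characteristics. (In practice, these can be less constrained than variable names, but they often need to be shorter than variable labels.) We can add separators by calling up standard list options.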

. char pcsingle[varname] "% single" 

. char collgrad[varname] "college?" 

. groups collgrad race pcsingle , subvarname sepby(collgrad) 

    +-------------------------------------------------------+
    |         college?    race   % single   Freq.   Percent |
    |-------------------------------------------------------|
    | not college grad   white       29.2    1217     54.19 |
    | not college grad   black       53.3     480     21.37 |
    | not college grad   other       29.4      17      0.76 |
    |-------------------------------------------------------|
    |     college grad   white       31.4     420     18.70 |
    |     college grad   black       51.5     103      4.59 |
    |     college grad   other       33.3       9      0.40 |
    +-------------------------------------------------------+

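Tip #3 Wire display formats into a variable by making a string equivalent. I don't illustrate this fully, but I often use it when I want to combine the display of counts with numerical results with decimal places in tabdisp. format(%2.1f) and format(%3.2f) might work for most variables (and incidentally the important detail is the number of decimal places), but they would lead to a count of 42 being shown as 42.0 or 42.00, which would look pretty silly. The format() option of tabdisp does not reach inside a string and change its contents; it does not even know what the string variable contains or where it came from. So strings just get shown by tabdisp as they come, which is what you want.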

. gen s_pcsingle = string(pcsingle, "%2.1f") 

. char s_pcsingle[varname] "% single"

groups has an option to save what is tabulated as a new dataset.
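
To make Tip #3 concrete, here is a minimal sketch of the combination it describes: a count shown as an integer next to a percent held as a string with one decimal place. The variable n is a name introduced here for illustration; it is not part of the original answer.

```stata
sysuse nlsw88, clear
egen pcsingle = mean(100 * (1 - married)), by(collgrad race)

* n is a hypothetical helper: the number of observations in each cell
egen n = count(married), by(collgrad race)

* the string copy carries its own formatting, so no format() option is needed
gen s_pcsingle = string(pcsingle, "%2.1f")

tabdisp collgrad race, c(n s_pcsingle)
```

Because s_pcsingle is a string, the count keeps its default numeric display while the percent appears exactly as it was written into the string.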

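Tip #4 To have a total category, temporarily double up the data. The clone of the original is relabelled as a Total category. You may need to do some extra calculations, but nothing amounting to rocket science: a smart high school student could figure it out. Here a concrete example for line-by-line study beats lengthy explanations.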

. preserve 

. local Np1 = _N + 1 

. expand 2 
(2,246 observations created) 

. replace race = 4 in `Np1'/L 
(2,246 real changes made) 

. label def racelbl 4 "Total", modify 

. drop pcsingle 

. egen pcsingle = mean(100 * (1 - married)), by(collgrad race) 

. char pcsingle[varname] "% single" 

. format pcsingle %2.1f 

. gen istotal = race == 4 

. bysort collgrad istotal: gen total = _N 

. * for percents of the global total, we need to correct for doubling up  
. scalar alltotal = _N/2 

. * the table shows percents for college & race | collgrad and for collgrad | total 
. bysort collgrad race : gen pc = 100 * cond(istotal, total/alltotal, _N/total) 
. format pc %2.1f 
. char pc[varname] "Percent" 

. groups collgrad race pcsingle pc , show(f) subvarname sepby(collgrad istotal) 

    +-------------------------------------------------------+
    |         college?    race   % single   Percent   Freq. |
    |-------------------------------------------------------|
    | not college grad   white       29.2      71.0    1217 |
    | not college grad   black       53.3      28.0     480 |
    | not college grad   other       29.4       1.0      17 |
    |-------------------------------------------------------|
    | not college grad   Total       35.9      76.3    1714 |
    |-------------------------------------------------------|
    |     college grad   white       31.4      78.9     420 |
    |     college grad   black       51.5      19.4     103 |
    |     college grad   other       33.3       1.7       9 |
    |-------------------------------------------------------|
    |     college grad   Total       35.3      23.7     532 |
    +-------------------------------------------------------+

Notice the extra trick of using a variable that is not shown explicitly to add separator lines.
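
The doubling-up step in Tip #4 may be easier to follow on a toy dataset. The sketch below uses a made-up three-observation variable g and a made-up value label glbl, both purely illustrative. It relies on expand appending the new copies after the original observations, and it ends with restore, which is the natural counterpart of the preserve shown above.

```stata
clear
input byte g
1
1
2
end

preserve

* remember where the copies will start, then duplicate every observation
local Np1 = _N + 1
expand 2

* recode the copies (observations 4 to 6) as a Total category
replace g = 3 in `Np1'/L
label define glbl 1 "a" 2 "b" 3 "Total"
label values g glbl

* frequencies: a 2, b 1, Total 3
tabulate g

* go back to the original three observations
restore
```

Run this as one do-file so that the local macro Np1 is still defined when replace uses it.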

Source

2017-04-07 12:54:18

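Works well, and indeed no rocket science is involved. It would be nice to wrap it in a function call. Adding chi-square statistics would also be a good option, e.g. the association between race and married given collgrad. – amo

When you say function, you mean command. But your comment underlines the key difficulty. Everyone has what they consider a fairly simple, straightforward table to produce, but there are thousands of such table types. A general command to make any kind of table is the asymptote here: the syntax would be pages and pages long and the documentation a whole manual volume. Or you can write a program yourself that produces the table you want with one syntax. That is how every program starts, and this is a forum for professional and enthusiast programmers! –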
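
As a rough illustration of writing a program yourself, the sketch below wraps Tip #1 in a small command. The name pcsingle_table and its binary() option are hypothetical, invented for this example; this is not an existing Stata or SSC command, and it assumes the binary variable is coded 0/1.

```stata
capture program drop pcsingle_table
program define pcsingle_table
    version 13
    * two grouping variables plus a required 0/1 variable
    syntax varlist(min=2 max=2) [if] [in], BINary(varname numeric)

    * use only observations with no missing values on any variable
    marksample touse
    markout `touse' `binary'

    * percent of observations with the binary variable equal to 0
    tempvar pc
    quietly egen `pc' = mean(100 * (1 - `binary')) if `touse', by(`varlist')
    tabdisp `varlist' if `touse', c(`pc') format(%2.1f)
end

sysuse nlsw88, clear
pcsingle_table collgrad race, binary(married)
```

The call at the end should reproduce the tabdisp table from Tip #1. Totals, extra cell variables and tests would each need more syntax, which is the difficulty the comment above describes.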
