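How to loop through all columns, compare them against a specific column, and plot a frequency readout
I have a data frame that looks roughly like this: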
y<-c("A1","B1", "C2", "A1", "B1","C1", "A1","B2", "C3", "A1", "B1", "C4", "A1", "B1","C4", "A1","B2", "C4", "A1","B1", "C4", "A1", "B1", "C4")
test<- data.frame(matrix(y, nrow = 3, ncol = 8))
colnames(test) <- c("Learn_1", "Car_1", "Car_2", "Fan_1", "Fan_2", "Fan_3","Kart_1", "God_1")
test
  Learn_1 Car_1 Car_2 Fan_1 Fan_2 Fan_3 Kart_1 God_1
1      A1    A1    A1    A1    A1    A1     A1    A1
2      B1    B1    B2    B1    B1    B2     B1    B1
3      C2    C1    C3    C4    C4    C4     C4    C4
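My real data has columns of unequal length, with thousands of rows across 13 columns and the values mixed throughout. I want to determine, for each value in God_1, how often it appears across all the other columns. Columns that share the same word come from the same study (i.e., the Car and Fan columns), so a value counts only once per study even if it shows up in more than one of that study's columns. I then want to plot the percentage of God_1 values that show up 5, 4, 3, 2, or 1 times, out of the total (100%) of values available in God_1: one total, with different shading for each frequency value (1, 2, 3, 4, 5). The plot should have a minimum of 1 and a maximum of 5, since there are 5 unique column words.
My problem is that I don't know how to start on this, though I have been thinking about it for the past few days. Any ideas?
These are the frequencies I would expect for the values in y, given what I want: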
A1 = 5
B1 = 5
C4 = 3
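One possible way to start on the counting rule described above is sketched here. It is a minimal sketch rather than a definitive solution, and it rests on two assumptions that are not stated in the question: that the study name is everything before the trailing `_<number>` in a column name, and that missing entries in the real, unequal-length data are stored as `NA`. The names `study`, `values_by_study`, `god_values` and `freq` are made up for illustration.

```r
# Rebuild the example data from the question.
y <- c("A1", "B1", "C2", "A1", "B1", "C1", "A1", "B2", "C3", "A1", "B1", "C4",
       "A1", "B1", "C4", "A1", "B2", "C4", "A1", "B1", "C4", "A1", "B1", "C4")
test <- data.frame(matrix(y, nrow = 3, ncol = 8), stringsAsFactors = FALSE)
colnames(test) <- c("Learn_1", "Car_1", "Car_2", "Fan_1", "Fan_2", "Fan_3",
                    "Kart_1", "God_1")

# Assumption: the study name is the column name minus its "_<number>" suffix.
study <- sub("_[0-9]+$", "", colnames(test))

# Pool every column of a study and keep its unique values, so a value that
# appears in several columns of one study is counted only once.
# as.character() guards against factor columns.
values_by_study <- lapply(split(colnames(test), study), function(cols) {
  unique(unlist(lapply(test[cols], as.character), use.names = FALSE))
})

# Distinct God_1 values; NA is dropped (assumed marker for missing entries).
god_values <- unique(as.character(test$God_1))
god_values <- god_values[!is.na(god_values)]

# For each God_1 value, count the studies that contain it at least once.
# God itself is one of the studies, so the minimum possible count is 1.
freq <- vapply(god_values, function(v) {
  sum(vapply(values_by_study, function(vals) v %in% vals, logical(1)))
}, integer(1))

print(freq)
```

On the example data this gives A1 = 5, B1 = 5 and C4 = 3, which matches the expected counts listed above. Pooling each study's columns before testing membership is what enforces the "count once per study" rule.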
Here is the str() of my example. My real data looks like this, but has 2366 obs. of 13 variables, each a Factor w/ some number of levels (ranging from 200 to 3000).
str(test)
'data.frame': 3 obs. of 8 variables:
$ Learn_1: Factor w/ 3 levels "A1","B1","C2": 1 2 3
$ Car_1 : Factor w/ 3 levels "A1","B1","C1": 1 2 3
$ Car_2 : Factor w/ 3 levels "A1","B2","C3": 1 2 3
$ Fan_1 : Factor w/ 3 levels "A1","B1","C4": 1 2 3
$ Fan_2 : Factor w/ 3 levels "A1","B1","C4": 1 2 3
$ Fan_3 : Factor w/ 3 levels "A1","B2","C4": 1 2 3
$ Kart_1 : Factor w/ 3 levels "A1","B1","C4": 1 2 3
$ God_1 : Factor w/ 3 levels "A1","B1","C4": 1 2 3
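For the plotting step, a sketch under stated assumptions: it starts from the expected per-value counts given in the question (A1 = 5, B1 = 5, C4 = 3) and turns them into the percentage of God_1 values at each frequency from 1 to 5, drawn as one stacked bar with a different shade per frequency. Base graphics and the names `n_studies`, `freq_table` and `pct` are choices made here for illustration, not something the question specifies.

```r
# Per-value study counts, taken from the expected result in the question.
freq <- c(A1 = 5, B1 = 5, C4 = 3)

# Tabulate over the full 1..5 range so unused frequencies show up as 0.
n_studies <- 5
freq_table <- table(factor(freq, levels = seq_len(n_studies)))

# Share of God_1 values at each frequency; the shares sum to 100.
pct <- 100 * as.vector(freq_table) / sum(freq_table)
names(pct) <- names(freq_table)
print(round(pct, 1))

# A one-column matrix makes barplot() draw a single stacked bar,
# with one shaded segment per frequency class.
barplot(matrix(pct, ncol = 1, dimnames = list(names(pct), "God_1")),
        col = gray.colors(n_studies),
        ylab = "% of God_1 values",
        legend.text = paste("frequency", names(pct)))
```

With the three example values, one third of them fall in frequency 3 and two thirds in frequency 5, and the other classes are empty. On the real data the same steps would apply to whatever count vector the counting step produces.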
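Do you know the column names beforehand? Or are they dynamic and unknown, depending on the data you are pulling? –
Hi Road_to_quandom, all column names are known beforehand. – Chad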