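Below is my code for applying PCA country-wise to several data frames stored in myfiles. How do I delete the columns whose sum is zero from each data frame in the list?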
## Get file names for a working directory ###
temp = list.files(pattern="*.csv")
## Read files ###
myfiles = lapply(temp, read.csv)
### Name the files ###
names(myfiles)<-c("mCRC_2015_Q1","mCRC_2015_Q2","mCRC_2015_Q3","mCRC_2015_Q4")
##### to check the names of the columns #######
names(myfiles$mCRC_2015_Q1)
##### to change the names of the columns ######
colnames = c("Insufficient efficacy","Issues around safety/tolerability","Inconvenient dosage regimen/administration","Price issues","Not reimbursed","Not included on hospital/government medicines formulary","Insufficient clinical data available for acceptance","Previously used for this patient","Prescription only possible in selected cases with detailed justification to authorities/payers ","I don’t have enough scientific information about it","Lack of experience in this setting","Involved in clinical trial with other drugs","Patient not appropriate for Targeted therapy","Patient not appropriate for cetuximab (Erbitux)","Others","Country")
for (i in seq_along(myfiles)){
colnames(myfiles[[i]]) <- colnames
}
##### Delete all those columns which have zero sum from each dataframe #####
for(i in 1:length(myfiles)){
myfiles[[i]] <- myfiles[[i]][,which(!lapply(myfiles,FUN = function(x){colSums(x!=0)>0}))]
}
####### Run PCA for each dataframe country wise ####
Myfiles<- split(myfiles, myfiles$Country)
for(i in 1:length(Myfiles)){
assign(paste0("pca", i), prcomp(Myfiles[[i]][which(names(myfiles)!="Country")], center=T, scale.=T))
}
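One likely reason the country-wise step fails is that `myfiles` is a list, so `myfiles$Country` is `NULL` and `split()` has nothing to split on. The split probably needs to happen inside each data frame. Below is a minimal sketch of that idea, not a definitive implementation. The toy data, the column names `x1` to `x3`, and the helper `pca_by_country` are made up for illustration. It assumes every country has at least two rows and at least one non-constant numeric column.

```r
set.seed(1)

# Toy stand-in for myfiles (made-up values, hypothetical column names)
df <- data.frame(
  x1 = rnorm(12), x2 = rnorm(12), x3 = rnorm(12),
  Country = rep(c("DE", "FR", "UK"), each = 4)
)
myfiles <- list(mCRC_2015_Q1 = df, mCRC_2015_Q2 = df[nrow(df):1, ])

# Hypothetical helper: one prcomp fit per country within a single data frame
pca_by_country <- function(df) {
  # Split the rows by Country, leaving the Country column out of the PCA input
  by_country <- split(df[names(df) != "Country"], df$Country)
  lapply(by_country, function(sub) {
    # prcomp(scale. = TRUE) stops on constant columns, so drop them per subset
    varying <- vapply(sub, function(col) isTRUE(var(col) > 0), logical(1))
    prcomp(sub[, varying, drop = FALSE], center = TRUE, scale. = TRUE)
  })
}

# Nested list: pcas[[file]][[country]] is a prcomp object
pcas <- lapply(myfiles, pca_by_country)
summary(pcas$mCRC_2015_Q1$DE)
```

Keeping the results in a nested list avoids `assign()` with generated names such as `pca1`, `pca2`, which become hard to track once there are several files and several countries.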
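These are the problems I am facing:
1) How do I delete all columns that contain only zero values?
2) How can I apply the prcomp command to each data frame country-wise (Country is one of the variables in each data frame)?
3) From the loadings matrix, how can I get the 4 most relevant variables (regardless of sign) for each data frame?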
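For question 3, one possible reading of "most relevant regardless of sign" is the variables with the largest absolute loadings on a given component. The sketch below takes that reading and uses the first component by default; both choices are assumptions, as are the toy data and the helper name `top_loadings`.

```r
set.seed(42)

# Toy data (made-up): 30 rows, 6 numeric variables named v1..v6
toy <- as.data.frame(matrix(rnorm(30 * 6), ncol = 6,
                            dimnames = list(NULL, paste0("v", 1:6))))
pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Hypothetical helper: the n variables with the largest |loading| on component pc
top_loadings <- function(pca, pc = 1, n = 4) {
  load <- pca$rotation[, pc]                  # loadings are stored in $rotation
  ord <- order(abs(load), decreasing = TRUE)  # rank by magnitude, ignoring sign
  load[head(ord, n)]                          # signed values, names attached
}

top4 <- top_loadings(pca)
names(top4)
```

For a list of fitted models, `lapply(list_of_pcas, top_loadings)` would return the same summary for each one.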
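That is too many questions. Please ask one at a time. –
@RichardScriven Please answer the first one! Thanks! – Kavya
@Kavya Can you provide an example? It would make it easier to help you. – Learner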
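A small reproducible example for the first question might look like the following. This is a sketch with made-up toy data standing in for `myfiles`, and `drop_zero_cols` is a hypothetical helper name. It treats "sum is zero" as "every value is zero", which is the same thing when the values are non-negative counts. The original loop probably fails because it calls `lapply` over the whole list inside the loop instead of working on `myfiles[[i]]`, and because `colSums` cannot handle the character `Country` column.

```r
# Toy stand-in for myfiles (made-up values)
myfiles <- list(
  mCRC_2015_Q1 = data.frame(a = c(1, 0, 2), b = c(0, 0, 0), c = c(3, 1, 0),
                            Country = c("DE", "DE", "FR")),
  mCRC_2015_Q2 = data.frame(a = c(0, 0, 0), b = c(1, 2, 0), c = c(0, 0, 0),
                            Country = c("FR", "FR", "DE"))
)

# Hypothetical helper: keep non-numeric columns (such as Country) and
# numeric columns that contain at least one nonzero value
drop_zero_cols <- function(df) {
  keep <- vapply(df,
                 function(col) !is.numeric(col) || any(col != 0, na.rm = TRUE),
                 logical(1))
  df[, keep, drop = FALSE]
}

# Apply the helper to every data frame in the list
cleaned <- lapply(myfiles, drop_zero_cols)
lapply(cleaned, names)
```

Note that each data frame may end up with a different set of columns, since a column that is all zero in one quarter can be nonzero in another.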