2015-10-20

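I have a document-term matrix with frequencies for more than 600 words, and there is a corresponding date (mm/dd/yyyy) for each frequency value. How can I plot word frequency against time in R, with the time variable grouped by month/year and by year?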

 > head(mydata3,3) 
    Claim.Number Note.Date LOSSDATE DATEREPORTED 
1  106810 7/10/1998 12/9/1997 12/29/1997 
2  106810 7/21/1998 12/9/1997 12/29/1997 
3  106810 10/21/1999 12/9/1997 12/29/1997 
    DATEENTERED Row Topic absenc abus academ access 
1 1/5/1998 3  4  0 0  0  0 
2 1/5/1998 4  2  0 0  0  0 
3 1/5/1998 8 11  0 0  0  0 
    accid accommod account accus act action activ add 
1  0  0  0  0 0  0  0 0 
2  0  0  0  0 0  0  0 0 
3  0  0  0  0 0  0  0 0 
    addit addl adequ adjust administr admiss advanc 
1  0 0  0  0   0  0  0 
2  0 0  0  0   0  0  0 
3  0 0  0  0   0  0  0 
    advers advic african age agenc agreement aid ambul 
1  0  0  0 0  0   0 0  0 
2  0  0  0 0  0   0 0  0 
3  0  0  0 0  0   0 0  0 
    amount analysi ankl answer anticip appeal appel 
1  0  0 0  0  0  0  0 
2  0  0 0  0  0  2  0 
3  0  0 0  0  0  1  0 
    appli applic appoint appropri approv approxim arbitr 
1  0  0  0  1  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    argu argument aris arm arrang arriv asap assault 
1 0  0 0 0  0  0 0  0 
2 0  0 0 0  0  0 0  0 
3 0  0 1 0  0  0 0  0 
    assert assess assist athlet attach attent audit auto 
1  0  0  0  0  0  2  0 0 
2  0  0  0  0  0  0  0 0 
3  0  0  0  0  0  0  0 0 
    avoid await award background balanc ball bar basi 
1  0  0  0   0  0 0 0 0 
2  0  0  0   0  0 0 0 0 
3  0  0  0   0  0 0 0 0 
    benefit big bill black board breach break. brief 
1  0 0 0  0  0  0  0  0 
2  0 0 0  0  0  0  0  0 
3  0 0 0  0  0  0  0  0 
    broken broker budget build bus busi call campus cap 
1  0  0  0  0 0 0 0  0 0 
2  0  0  0  0 0 0 2  0 0 
3  0  0  0  0 0 0 0  0 0 
    car care carrier center cgl chair chang charg child 
1 0 0  0  0 0  0  0  0  0 
2 0 0  0  0 0  0  0  0  0 
3 0 0  0  0 0  0  0  0  0 
    children circuit cite citi civil clean client clinic 
1  0  0 0 0  0  0  0  0 
2  0  0 0 0  0  0  0  0 
3  0  0 0 0  0  0  0  0 
    close closur cmc coach code collect commit committe 
1  0  0 0  0 0  0  0  0 
2  0  0 0  0 0  0  0  0 
3  0  0 0  0 0  0  0  0 
    communic compani compar compel compens complain 
1  0  0  0  0  0  0 
2  0  0  0  0  0  0 
3  0  0  0  0  0  0 
    complet conclud condit conduct conf confer confid 
1  0  0  0  0 0  0  0 
2  0  0  0  0 0  0  0 
3  0  0  0  0 0  0  0 
    conflict connect construct consult contact contend 
1  0  0   0  0  0  0 
2  0  0   0  0  0  0 
3  0  0   0  0  0  0 
    contract contractor contribut control convers 
1  0   0   0  0  0 
2  0   0   0  0  0 
3  0   0   0  0  0 
    convinc cooper coordin copi correct cost counter 
1  0  0  0 0  0 0  0 
2  0  0  0 0  0 0  0 
3  0  0  0 1  0 0  0 
    counti cours court cover coverag creat credibl 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    credit crimin cross cut damag danger deadlin deal 
1  0  0  0 0  0  0  0 0 
2  0  0  0 0  0  0  0 0 
3  0  0  0 0  0  0  0 0 
    dean death decis declin deduct defam defect defend 
1 0  0  0  0  0  0  0  0 
2 0  0  0  0  0  0  0  0 
3 0  0  0  0  0  0  0  0 
    degre delay demand deni denial depart depos deposit 
1  0  0  0 0  0  0  0  0 
2  1  0  0 1  0  0  0  0 
3  1  0  0 0  0  0  0  0 
    dept despit develop diari difficult director disabl 
1 0  1  0  1   0  0  0 
2 1  0  0  0   0  0  0 
3 0  0  0  0   0  0  0 
    discharg disciplin disciplinari discoveri discrimin 
1  0   0   0   0   1 
2  0   0   0   0   1 
3  0   0   0   0   0 
    discuss dismiss disput distress district doc docket 
1  0  0  0  0  0 0  0 
2  0  0  0  0  0 0  0 
3  0  0  0  0  0 0  0 
    doctor document done door dorm doubt draft drive 
1  0  0 0 0 0  0  0  0 
2  0  0 0 0 0  0  1  0 
3  0  0 0 0 0  0  0  0 
    driver drop due earlier earn educ eeoc effort ell 
1  0 0 0  0 0 0 0  0 0 
2  0 0 0  0 0 0 0  0 0 
3  0 0 0  0 0 0 0  0 0 
    els email emot employ employe encourag end endors 
1 0  0 0  0  0  0 1  0 
2 0  0 0  0  0  0 0  0 
3 0  0 0  1  2  0 1  0 
    enrol entitl environ estim evalu event evid exam 
1  0  0  0  0  0  0 0 2 
2  0  0  0  0  0  0 0 2 
3  0  0  0  0  0  0 0 2 
    examin exceed excess exchang exclus execut expens 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    experi expert expir exposur extend extens extent 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    extrem eye face facil faculti fail failur fall fals 
1  0 0 0  0  0 0  0 0 0 
2  0 0 0  0  1 2  1 0 0 
3  0 0 0  0  0 3  0 0 0 
    fault favor fax feder fee fell femal field fight 
1  0  0 0  0 0 0  0  0  0 
2  0  0 0  0 0 0  0  0  0 
3  0  0 0  0 0 0  0  0  0 
    final financi finish fire firm floor focus foot forc 
1  0  0  0 0 0  0  0 0 0 
2  0  0  0 0 0  0  0 0 0 
3  0  0  0 0 0  0  0 0 0 
    form formal former forward fractur free fund futur 
1 0  0  0  0  0 0 0  0 
2 0  0  0  0  0 0 0  0 
3 0  0  0  0  0 0 0  0 
    game gender gone grade graduat grant grievanc ground 
1 0  0 0  0  0  0  0  0 
2 0  0 0  0  0  1  0  0 
3 0  0 0  1  1  0  0  0 
    group hand happi harass head health hear held higher 
1  0 0  0  0 0  0 0 0  0 
2  0 0  0  0 0  0 0 0  0 
3  0 0  0  0 0  0 0 0  0 
    hire histori hit hold home hospit hostil hous human 
1 0  0 0 0 0  0  0 0  0 
2 0  0 0 0 0  0  0 0  0 
3 0  0 0 0 0  0  0 0  0 
    ice identifi immedi immun impact import impress 
1 0  0  0  0  0  0  0 
2 0  0  0  0  0  0  0 
3 0  0  0  0  0  0  0 
    improv inappropri inclin incur indemn individu injur 
1  0   0  0  0  0  0  0 
2  0   0  0  0  0  0  0 
3  0   0  0  0  0  0  0 
    injuri inquir inquiri inspect instruct intent 
1  0  0  0  0  0  0 
2  0  0  0  0  0  0 
3  0  0  0  0  0  0 
    interest intern invoic job joint judg judgment juri 
1  0  0  0 0  0 0  0 0 
2  0  0  0 0  0 0  0 0 
3  0  1  0 0  0 0  0 0 
    jurisdict key knee knowledg lacer lack larg latest 
1   0 0 0  0  0 0 0  0 
2   0 0 0  0  0 0 0  0 
3   0 0 0  0  0 0 0  0 
    law lawyer layer learn leav leg legal letter level 
1 0  0  0  0 0 0  0  1  0 
2 0  1  0  0 0 0  0  0  0 
3 0  0  0  0 0 0  0  0  0 
    liabil lien life limit litig live lmtcb local lose 
1  0 0 0  0  0 0  0  0 0 
2  0 0 0  0  0 0  0  0 0 
3  0 0 0  0  0 0  0  0 0 
    loss lost low mail mainten major male manag materi 
1 0 0 0 0  0  0 0  0  0 
2 0 0 0 0  0  0 0  0  0 
3 0 0 0 0  0  0 0  0  0 
    mcad med mediat medic medicar meet memo merit messag 
1 0 0  0  0  0 0 0  0  0 
2 0 0  0  0  0 0 0  0  0 
3 0 0  0  0  0 2 0  0  0 
    million minor mom money monitor motion msj mtd 
1  0  0 0  0  0  0 0 0 
2  0  0 0  0  0  0 0 0 
3  0  0 0  0  0  0 0 0 
    nation near neck neglig negoti news noth notic 
1  1 0 0  0  0 0 0  0 
2  0 0 0  0  0 0 0  0 
3  0 0 0  0  0 0 0  0 
    notifi numer nurs object oblig ocr offer offici ongo 
1  0  0 0  0  0 0  0  0 0 
2  0  0 0  0  0 0  0  0 0 
3  0  0 0  0  0 2  0  0 0 
    open oper opinion opportun oppos opposit oral order 
1 0 0  0  0  0  0 0  0 
2 1 0  0  0  0  0 0  0 
3 0 0  0  0  0  0 0  0 
    origin outlin outstand owe paid pain park parti 
1  0  0  0 0 0 0 0  0 
2  0  0  0 0 0 0 0  0 
3  0  0  0 0 0 0 0  0 
    partner pass pay payment pend perman personnel petit 
1  0 1 0  0 0  0   0  0 
2  0 1 0  0 0  0   0  0 
3  0 2 0  0 1  0   0  0 
    phone photo physic physician pictur plan player 
1  0  0  0   0  0 0  0 
2  0  0  0   0  0 0  0 
3  0  0  0   0  0 0  0 
    plead poa polic polici poor postpon potenti practic 
1  0 0  0  0 0  0  0  0 
2  0 0  0  0 0  0  0  0 
3  0 0  0  0 0  0  0  0 
    preliminari premis prepar pres presid press pressur 
1   0  0  0 0  0  0  0 
2   0  0  0 0  0  0  0 
3   0  0  0 0  0  0  0 
    prevail prevent primari privat proceed product 
1  0  0  0  0  0  0 
2  0  0  0  0  0  0 
3  0  0  0  0  0  0 
    profession professor progress project promis promot 
1   0   0  0  0  0  0 
2   0   1  0  0  0  0 
3   0   2  0  0  0  0 
    proper properti propos protect provis provost pull 
1  0  0  0  0  0  0 0 
2  0  0  0  0  0  1 0 
3  0  0  0  0  0  0 0 
    punit pursu push qualifi quick quiet quit race rais 
1  0  0 0  0  0  0 0 0 0 
2  0  0 0  0  0  0 0 0 0 
3  0  0 0  0  0  0 0 0 0 
    rang rate reach recal receipt recov recoveri rediari 
1 0 0  0  0  0  0  0  0 
2 0 0  0  0  0  0  0  0 
3 0 0  0  0  0  0  0  0 
    reduc reimburs reinsur reject relationship releas 
1  0  0  0  0   0  0 
2  0  0  0  0   0  0 
3  0  0  0  0   0  0 
    relief remain remedi remov renew reopen rep repair 
1  0  0  0  0  0  0 0  0 
2  0  0  0  0  0  0 0  0 
3  0  0  1  0  0  0 0  0 
    repeat. replac repli repres represent research 
1  0  0  0  0   0  0 
2  0  0  0  0   0  0 
3  0  0  0  0   0  0 
    reserv resid resign resolut resolv respect respond 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    rest retain retali retent retir return reveal review 
1 0  0  0  0  0  0  0  2 
2 0  0  0  0  0  0  0  0 
3 0  0  0  0  0  0  0  1 
    revis risk role ror rts rule run safeti salari 
1  0 0 0 0 0 0 0  0  0 
2  0 0 0 0 0 0 0  0  0 
3  0 0 0 0 0 0 0  0  0 
    schedul search section secur select semest separ 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    serious serv servic settl settlement sex sexual 
1  0 0  0  0   0 0  0 
2  0 0  0  0   0 0  0 
3  0 0  0  0   0 0  0 
    shoulder side sidewalk sign signific sir sit site 
1  0 0  0 0  0 0 0 0 
2  0 0  0 0  0 0 0 0 
3  0 0  0 0  0 0 0 0 
    situat slip small snow speak spent split staff stage 
1  0 0  0 0  0  0  0  0  0 
2  0 0  0 0  0  0  0  0  0 
3  0 0  0 0  0  0  0  0  0 
    stair standard statement status statut step stop 
1  0  0   0  0  0 0 0 
2  0  0   0  2  0 0 0 
3  0  0   0  0  0 0 0 
    stori strategi street strike struck studi subject 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    substanti success sue suffer suffici suggest summari 
1   0  0 0  0  0  0  0 
2   0  0 0  0  0  0  0 
3   0  0 0  0  0  0  0 
    supervis supervisor supplement supv surgeri suspect 
1  0   0   0 0  0  0 
2  0   0   0 0  0  0 
3  0   0   0 0  0  0 
    suspend sustain system tabl tcw teach teacher team 
1  0  0  0 0 0  0  0 0 
2  0  0  0 0 0  0  0 0 
3  0  0  0 0 0  0  0 0 
    telephon tender tenur term termin test testifi 
1  0  0  0 0  0 0  0 
2  0  0  0 0  0 1  0 
3  0  0  0 0  0 0  0 
    testimoni theori threaten titl top total tpa track 
1   0  0  0 0 0  0 0  0 
2   0  0  0 0 0  0 0  0 
3   0  0  0 0 0  0 0  0 
    train transcript transfer transport travel treat 
1  0   0  0   0  0  0 
2  0   0  0   0  0  0 
3  0   0  0   0  0  0 
    treatment trial trip troubl tuition unabl unclear 
1   0  0 0  0  0  0  0 
2   0  0 0  0  0  0  0 
3   0  0 0  0  0  0  0 
    unfortun upcom updat vacat valu vehicl verdict video 
1  0  0  1  0 0  0  0  0 
2  0  0  0  0 0  0  0  0 
3  0  0  0  0 0  0  0  0 
    violat visitor voicemail wage wait walk warn watch 
1  0  0   0 0 0 0 0  0 
2  0  0   0 0 0 0 0  0 
3  0  0   0 0 0 0 0  0 
    water weak white win withdraw worker write written 
1  0 0  0 0  0  0  0  0 
2  0 0  0 0  0  0  0  0 
3  0 0  0 0  0  0  1  0 
    wrote xbocx xdolx ximex xmsjx xnpcx xoopx xprosex 
1  0  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0  0 
3  1  0  0  0  0  0  0  0 
    xsolx 
1  0 
2  0 
3  0 

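I am trying to group the frequency values by month/year and by year. For example, for the word "appeal", instead of 2 occurrences on 1/5/1998 and then 1 more occurrence on 1/5/1998 in a separate row, I want 3 occurrences for January 1998, and then 3 occurrences for 1998 (assuming there are no more hits during the rest of the year). I then want to plot frequency against month/year, and frequency against year.

I tried to group by month/year using the code below: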

df %>% 
     mutate(month_year = format(date, "%Y/%m")) %>% 
     group_by(month_year) %>% 
     summarise(total = sum(vocabfreq)) 

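Here the summed values are the word frequencies from the original data set. Another problem is that my data set is very large, and I am having trouble plotting multiple series on a single chart in a way that still shows their features.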

Answers


The xts approach:

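library(xts) 
# Toy data: one date column plus one count column per word 
dat <- data.frame(date=c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'), 
        word1= c(1,2,1, 4, 3), word2=c(3, 10, 1, 2, 4)) 
# Parse the mm/dd/yyyy strings and use them as the time index 
dates <- as.POSIXct(dat$date, format='%m/%d/%Y') 
dat.xts <- xts(subset(dat, select= -date), order.by=dates) 
# Sum every word column within each day, then within each month 
apply.daily(dat.xts, colSums) 
apply.monthly(dat.xts, colSums) 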
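
The question also asks for yearly totals and a plot. A minimal sketch of how that might look with the same toy data follows. It assumes a `Date` index instead of `POSIXct` (a swap made here to avoid time zone effects, not part of the answer above), and the names `monthly`, `yearly` and `monthly.df` are illustrative only.

```r
library(xts)

# Same toy data as in the answer above (not the asker's real data).
dat <- data.frame(date  = c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'),
                  word1 = c(1, 2, 1, 4, 3),
                  word2 = c(3, 10, 1, 2, 4))

# Assumption: a Date index is used here rather than POSIXct.
dates   <- as.Date(dat$date, format = '%m/%d/%Y')
dat.xts <- xts(subset(dat, select = -date), order.by = dates)

monthly <- apply.monthly(dat.xts, colSums)  # one row per month/year
yearly  <- apply.yearly(dat.xts, colSums)   # one row per year

# index() gives the last observed date of each period, so the month label
# can be put back into an ordinary data.frame if one is needed.
monthly.df <- data.frame(month_year = format(index(monthly), '%Y/%m'),
                         coredata(monthly))

# Draw both words as separate lines on a single chart.
plot(as.zoo(monthly), plot.type = 'single', col = 1:2,
     xlab = 'Month', ylab = 'Frequency')
legend('topleft', legend = colnames(dat.xts), col = 1:2, lty = 1)
```

With 600+ word columns, plotting a small subset such as `monthly[, 1:5]` is probably more readable than plotting everything at once.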

When I try this, the date column is not included in the data frame? –


It's not a data.frame, it's an xts object. The dates can be retrieved with 'index(dat.xts)'. This is a more efficient way to handle dated data. – DunderChief


Also, if you provide a reproducible example in your question, we can give an example that uses your data. See '?dput' – DunderChief
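
As a rough illustration of the `dput` suggestion, the sketch below builds a small stand-in for the asker's data frame. The three rows and the `Note.Date`, `appeal` and `pass` columns are copied from the `head(mydata3, 3)` output in the question. Everything else about the real data is unknown, so treat this as an assumption-laden example rather than the actual data set.

```r
# Small stand-in for the asker's data frame (values taken from the question's
# head() output; the real mydata3 has many more columns and rows).
mydata3 <- data.frame(Note.Date = c('7/10/1998', '7/21/1998', '10/21/1999'),
                      appeal    = c(0, 2, 1),
                      pass      = c(1, 1, 2))

# dput() prints R code that rebuilds the object exactly; that printed code is
# what gets pasted into the question. Picking a few rows and columns keeps it short.
dput(head(mydata3[, c('Note.Date', 'appeal', 'pass')], 3))
```

Anyone answering can then assign the pasted `structure(...)` text to a variable and work with the same data.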


You should use summarise_each rather than summarise. By the way, I used @DunderChief's code to generate the data. Thanks.

# Toy data borrowed from the xts answer 
dat <- data.frame(date=c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'), 
       word1= c(1,2,1, 4, 3), word2=c(3, 10, 1, 2, 4)) 
library(dplyr) 

# Convert the date strings, then sum every remaining column within each date 
dat %>% 
    mutate(date = as.Date(date, format='%m/%d/%Y')) %>% 
    group_by(date) %>% 
    summarise_each(funs(sum(.))) 
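
The code above groups by day, while the question asks for month/year and year. In recent dplyr releases `summarise_each()` and `funs()` are deprecated in favour of `across()`, so a sketch of the month/year and year grouping might look like the following. It assumes dplyr 1.0.0 or later, and `by_month` and `by_year` are names made up for this example.

```r
library(dplyr)

# Same toy data as above.
dat <- data.frame(date  = c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'),
                  word1 = c(1, 2, 1, 4, 3),
                  word2 = c(3, 10, 1, 2, 4))

# Month/year totals: derive a "YYYY/MM" label and sum only the word columns.
by_month <- dat %>%
  mutate(date       = as.Date(date, format = '%m/%d/%Y'),
         month_year = format(date, '%Y/%m')) %>%
  group_by(month_year) %>%
  summarise(across(c(word1, word2), sum), .groups = 'drop')

# Yearly totals: same idea with a "YYYY" label.
by_year <- dat %>%
  mutate(year = format(as.Date(date, format = '%m/%d/%Y'), '%Y')) %>%
  group_by(year) %>%
  summarise(across(c(word1, word2), sum), .groups = 'drop')
```

Naming the word columns explicitly inside `across()` means any other column, such as the parsed `date`, is never passed to `sum()`.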

When I try this on my data set, I get an error: sum not defined for "Date" objects –


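@Learning_R Oh, the reason is that you have other date columns in your data, and '.' stands for every column except the grouping columns. You cannot use 'sum' on those Date objects. – Hao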


@Learning_R You should either put all the date columns into group_by, or think of a better way to organize your date columns. – Hao
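
One possible way to act on this advice, and on the plotting part of the question, is sketched below. It drops the extra date columns before summing, reshapes the monthly totals to long form, and plots only the most frequent words. The data frame `notes` is a hypothetical miniature built from three rows of the question's output. Grouping by `Note.Date` rather than another date column is an assumption, and the tidyr and ggplot2 packages are additions not mentioned in the thread.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical miniature of the asker's data: two date columns plus word counts.
notes <- data.frame(Note.Date   = c('7/10/1998', '7/21/1998', '10/21/1999'),
                    DATEENTERED = c('1/5/1998', '1/5/1998', '1/5/1998'),
                    appeal      = c(0, 2, 1),
                    pass        = c(1, 1, 2))

monthly_long <- notes %>%
  # Assumption: Note.Date is the date to group on; use the first day of the month as label.
  mutate(month = as.Date(format(as.Date(Note.Date, format = '%m/%d/%Y'), '%Y-%m-01'))) %>%
  select(-Note.Date, -DATEENTERED) %>%   # drop date columns so sum() never sees them
  group_by(month) %>%
  summarise(across(everything(), sum), .groups = 'drop') %>%
  pivot_longer(-month, names_to = 'word', values_to = 'freq')

# Keep only the most frequent words so a single chart stays readable.
top_words <- monthly_long %>%
  group_by(word) %>%
  summarise(total = sum(freq)) %>%
  slice_max(total, n = 2) %>%
  pull(word)

p <- ggplot(filter(monthly_long, word %in% top_words),
            aes(month, freq, colour = word)) +
  geom_line() +
  geom_point() +
  labs(x = 'Month', y = 'Frequency')
print(p)
```

For the full data set, raising `n` in `slice_max()` or adding `facet_wrap(~ word)` are two ways to trade off how many words appear against how crowded the chart gets.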
