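Calculating n-gram frequencies in a text using R

I am using R to read a text. The passage consists of 100 sentences, which I put into a list that looks like this: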
[[1]]
[1] "WigWagCo: For #TBT here's a video of Travis McCollum (Co-Founder and COO of WigWag) at #SXSW2016
[[2]]
[1] "chrisreedfilm: RT @hammertonail: #SXSW2016 doc THE SEER: A PORTRAIT OF WENDELL BERRY gets reviewed by @chrisreedfilm
[[3]]
[1] "iamscottrandell: RT @therevue: Take a jaunt down #MemoriesofSXSW & read the stories of @JRNelsonMusic @thegillsmusic & @TheBlancosMusic
...
...
[[99]]
[1] "SunPowerTalent: SunPower #Clerical #Job: Supply Chain Intern (#Austin, TX)
[[100]]
[1] "SunPowerTalent: #Finance #Job alert: General Ledger Accountant | SunPower
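Each element of the list is a "sentence" from the same text. How can I compute the frequency of every 3-gram in the text, and also find out which sentence each 3-gram comes from?
Thank you very much.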
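To make the goal concrete, here is a minimal base R sketch of the kind of result I am after. It assumes that a 3-gram means three consecutive whitespace-separated words, lower-cased, with punctuation and hashtags left attached to their words. The `sentences` list is a made-up stand-in for the real 100-element list, and `ngrams_of`, `occurrences`, `freq` and `where` are hypothetical names chosen only for illustration.

```r
# Hypothetical stand-in for the 100-element list shown above.
sentences <- list(
  "the quick brown fox jumps",
  "the quick brown dog sleeps",
  "a quick brown fox"
)

# Split one sentence into lower-case word tokens and return its n-grams.
# Assumption: words are separated by whitespace; punctuation is not stripped.
ngrams_of <- function(sentence, n = 3) {
  words <- strsplit(tolower(sentence), "\\s+")[[1]]
  words <- words[nzchar(words)]            # drop empty tokens from leading spaces
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# One row per 3-gram occurrence, tagged with the index of its sentence.
per_sentence <- lapply(seq_along(sentences), function(i) {
  grams <- ngrams_of(sentences[[i]], n = 3)
  data.frame(sentence = rep(i, length(grams)),
             ngram = grams,
             stringsAsFactors = FALSE)
})
occurrences <- do.call(rbind, per_sentence)

# Frequency of each 3-gram across the whole text, most frequent first.
freq <- sort(table(occurrences$ngram), decreasing = TRUE)

# For each 3-gram, the indices of the sentences it appears in.
where <- split(occurrences$sentence, occurrences$ngram)

print(freq)
print(where[["quick brown fox"]])
```

Keeping one row per occurrence in `occurrences` means the frequency table and the sentence lookup both come from the same data frame, so they cannot disagree. A 3-gram that appears twice in one sentence would show that sentence index twice in `where`; wrapping the result in `unique()` would remove the repeats if only membership matters.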