Generating n-grams with Julia
To generate word bigrams in Julia, I could simply zip the original list with a list that drops the first element, e.g.:
julia> s = split("the lazy fox jumps over the brown dog")
8-element Array{SubString{String},1}:
"the"
"lazy"
"fox"
"jumps"
"over"
"the"
"brown"
"dog"
julia> collect(zip(s, drop(s,1)))
7-element Array{Tuple{SubString{String},SubString{String}},1}:
("the","lazy")
("lazy","fox")
("fox","jumps")
("jumps","over")
("over","the")
("the","brown")
("brown","dog")
To generate trigrams, I could use the same collect(zip(...)) idiom to get:
julia> collect(zip(s, drop(s,1), drop(s,2)))
6-element Array{Tuple{SubString{String},SubString{String},SubString{String}},1}:
("the","lazy","fox")
("lazy","fox","jumps")
("fox","jumps","over")
("jumps","over","the")
("over","the","brown")
("the","brown","dog")
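But I have to manually add a third list to the zip. Is there an idiomatic way to do this for n-grams of any order?
For example, I'd like to avoid doing this to extract 5-grams: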
julia> collect(zip(s, drop(s,1), drop(s,2), drop(s,3), drop(s,4)))
4-element Array{Tuple{SubString{String},SubString{String},SubString{String},SubString{String},SubString{String}},1}:
("the","lazy","fox","jumps","over")
("lazy","fox","jumps","over","the")
("fox","jumps","over","the","brown")
("jumps","over","the","brown","dog")
Cool! Thanks @HarrisonGrodin, I didn't know that 'drop(s,0)' was possible =) – alvas
@alvas No problem! Also, in cases where 'drop(s,0)' isn't feasible, the following will work. :) 'zip(s, (drop(s,k) for k = 1:n-1)...)' –
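The idea from the comments (splatting a generator of drop(s,k) iterators into zip) can be wrapped in a small helper. This is only a minimal sketch: the name ngrams is hypothetical, it assumes n >= 1, and it assumes a Julia 1.x session, where drop lives in Base.Iterators and has to be imported explicitly.

```julia
using Base.Iterators: drop

# Hypothetical helper (not from the original post): collect all n-grams of `s`
# as a vector of n-tuples. drop(s, 0) yields `s` itself, so the generator
# covers offsets 0 through n-1 and zip stops at the shortest iterator.
# Assumes n >= 1.
ngrams(s, n) = collect(zip((drop(s, k) for k in 0:n-1)...))

words = split("the lazy fox jumps over the brown dog")
bigrams = ngrams(words, 2)     # same tuples as collect(zip(s, drop(s,1)))
fivegrams = ngrams(words, 5)   # same tuples as the manual 5-gram call above
```

Because zip ends as soon as its shortest argument is exhausted, a sequence of length m yields m - n + 1 tuples, which matches the 7, 6, and 4 element results shown in the question.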