2017-09-26

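SAX time series representation in R using the jmotif package

I want to represent some time series with SAX so that I can mine them for similarities. I am using the jmotif package in R: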

#Create an example dataframe 
example1 <- data.frame(flow=c(1.1,2.2,3.3,4.4,5.5,6.6), 
        weight1=c(7.1,7.2,7.3,7.4,7.5,7.6), 
        weight2=c(8.1,8.2,8.3,8.4,8.5,8.6)) 
# Create a timeseries object 
examplets1 <- ts(example1, start = 1, end = 6) 

#Analysis 
library(jmotif) 
#Normalise the data using Znorm 
examplezn <- znorm(examplets1, threshold = 0.01) 
#Perform piecewise aggregate approximation 
examplepaa <- paa(examplezn, 3) 
#Represent time series as SAX 
sax_via_window(examplepaa, 3, 3, 10, "mindist", 0.1) 

#This produces the result 
> sax_via_window(examplepaa, 3, 3, 10, "mindist", 0.1) 
$`0` 
[1] "bgh" 

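I can't interpret these results. What I expected was a symbolic representation that I could associate with each column, e.g. flow: acc, weight1: bgh, and so on. The real dataset will have about 100 columns of ts data!

Am I applying the method incorrectly?

Any help is greatly appreciated.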

Answer

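The problem here is that I did not "vectorize" jmotif, so its functions only work on an ordered sequence of numbers representing the input time series, i.e. not on data frame objects or time series objects. That choice is arguable, but I just wanted to keep things simple.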
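
To see where the single word "bgh" in the question may come from, here is a minimal base-R sketch. It does not call jmotif; the helpers `z_norm`, `segment_means` and `to_letters` are hypothetical stand-ins written for illustration, and the idea that the multi-column `ts` object gets read as one flattened vector is an assumption, not something taken from the package source. Under that assumption each letter stands for one whole column, rather than each column getting its own word:

```r
# Base-R sketch only (no jmotif calls). Helper names are hypothetical.
example1 <- data.frame(flow    = c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6),
                       weight1 = c(7.1, 7.2, 7.3, 7.4, 7.5, 7.6),
                       weight2 = c(8.1, 8.2, 8.3, 8.4, 8.5, 8.6))
examplets1 <- ts(example1, start = 1, end = 6)

# A multi-column ts is a matrix; stripped of its attributes it becomes one
# vector of 18 values: all of flow, then weight1, then weight2.
flat <- as.numeric(examplets1)

# z-normalization using the sample standard deviation
z_norm <- function(x) (x - mean(x)) / sd(x)

# mean of each of n equal-length consecutive segments
# (assumes length(x) is divisible by n)
segment_means <- function(x, n) {
  seg <- rep(seq_len(n), each = length(x) / n)
  as.numeric(tapply(x, seg, mean))
}

# map values to letters using equiprobable cut points of the standard normal
to_letters <- function(x, alphabet_size) {
  cuts <- qnorm(seq_len(alphabet_size - 1) / alphabet_size)
  paste(letters[findInterval(x, cuts) + 1], collapse = "")
}

# three segments of six values each, i.e. one segment per original column
three_means <- segment_means(z_norm(flat), 3)

# a single window covering all three values, alphabet of size 10
word <- to_letters(z_norm(three_means), 10)
print(word)
```

With the question's data this sketch gives the same three-letter word as in the question, which is consistent with the assumption that the columns were concatenated into a single series.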

I modified bits of your code to perform the task; I hope it helps:

library(jmotif) 

# create the example data as a list of numeric vectors; a list works best because the library is not "vectorized"
example1 <- list(flow = c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9), 
      weight1 = c(7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 8.8, 9.9), 
      weight2 = c(8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9)) 

# this library makes working with not-vectorized code easier 
library(plyr) 

# z-normalize 
examplezn <- llply(example1, function(x){znorm(x, threshold = 0.01)}) 

# perform piecewise aggregate approximation; probably not needed before the SAX transform, so just for illustration
llply(examplezn, function(x){paa(x, 3)})

# represent each time series as SAX strings using the sliding-window SAX transform
# positional arguments: window size 3, PAA size 2, alphabet size 3, numerosity reduction "none", normalization threshold 0.1
example_sax <- llply(example1, function(x){sax_via_window(x, 3, 2, 3, "none", 0.1)})

# convert the result to a data frame, by rows though 
df_by_row <- ldply(example_sax, unlist) 

# and finally obtain a column-oriented data frame 
df_by_column <- as.data.frame(t(df_by_row))
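
If one word per column is all that is needed (the "flow: ..., weight1: ..." layout the question asks for), a rough base-R sketch of whole-series SAX without a sliding window could look like the following. It does not use jmotif; `sax_word` is a hypothetical helper, and it assumes the series length is divisible by the number of segments. Because a data frame is a list of columns, the same `vapply` call should work unchanged on a data frame with about 100 columns:

```r
# Base-R sketch only (no jmotif calls). sax_word is a hypothetical helper.
example1 <- list(flow    = c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9),
                 weight1 = c(7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 8.8, 9.9),
                 weight2 = c(8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9))

sax_word <- function(x, n_segments, alphabet_size) {
  # assumption: equal-length segments only
  stopifnot(length(x) %% n_segments == 0)
  # z-normalize the whole series; there is no guard here for
  # near-constant series (sd close to 0)
  z <- (x - mean(x)) / sd(x)
  # piecewise aggregate approximation: mean of each consecutive segment
  seg <- rep(seq_len(n_segments), each = length(x) / n_segments)
  seg_means <- as.numeric(tapply(z, seg, mean))
  # equiprobable cut points of the standard normal, one letter per segment
  cuts <- qnorm(seq_len(alphabet_size - 1) / alphabet_size)
  paste(letters[findInterval(seg_means, cuts) + 1], collapse = "")
}

# one SAX word per column, returned as a named character vector
words <- vapply(example1, sax_word, character(1),
                n_segments = 3, alphabet_size = 3)
print(words)
```

The result is a character vector named after the columns, so each word can be looked up by column name, e.g. `words[["flow"]]`.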