2017-04-30 84 views
0

如何在Apache Beam中創建基於元組的滑動窗口?這在Flink中很容易做到:Apache Beam中基於元組的窗口

DataStream.countWindowAll(long size, long slide) 

但是從Beam(或DataFlow)的文檔中不清楚如何做到這一點。它是一些窗口和觸發器的組合嗎?它有效嗎?

回答

1

橫樑本身支持滑動窗口。請參閱programming guideSlidingWindows課程的文檔。

例如爲:

PCollection<Foo> foos = ...; 
PCollection<Integer> counts = foos 
    .apply(Window.into(
     SlidingWindows.of(Duration.standardMinutes(5)) 
         .every(Duration.standardMinutes(1)))) 
    // Below is required instead of Count.globally() when you use 
    // a non-global windowing function. 
    .apply(Combine.globally(Count.<Foo>combineFn()).withoutDefaults()); 
PCollection<String> formattedCounts = counts.apply(
     ParDo.of(new DoFn<Integer, String>() { 
      @ProcessElement 
      public void process(ProcessContext c, BoundedWindow w) { 
      c.output("Window: " + w + ", count: " + c.element()); 
      } 
     })); 

觸發是問題的一個單獨的維度,它控制在一個特定窗口中的數據將被視爲「完全夠」應用聚合。見programming guide