我遇到了一些麻煩,理解事件時間窗口周圍的語義。以下程序會生成一些帶時間戳的元組,用作事件時間並進行簡單的窗口聚合。我希望輸出的順序與輸入相同,但輸出順序不同。爲什麼輸出與事件時間無關?Flink流事件時間窗口排序
import java.util.concurrent.TimeUnit
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.scala._
object WindowExample extends App {
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
env.getConfig.enableTimestamps()
env.setParallelism(1)
val start = 1449597577379L
val tuples = (1 to 10).map(t => (start + t * 1000, t))
env.fromCollection(tuples)
.assignAscendingTimestamps(_._1)
.timeWindowAll(Time.of(1, TimeUnit.SECONDS))
.sum(1)
.print()
env.execute()
}
輸入:
(1449597578379,1)
(1449597579379,2)
(1449597580379,3)
(1449597581379,4)
(1449597582379,5)
(1449597583379,6)
(1449597584379,7)
(1449597585379,8)
(1449597586379,9)
(1449597587379,10)
結果:
[info] (1449597579379,2)
[info] (1449597581379,4)
[info] (1449597583379,6)
[info] (1449597585379,8)
[info] (1449597587379,10)
[info] (1449597578379,1)
[info] (1449597580379,3)
[info] (1449597582379,5)
[info] (1449597584379,7)
[info] (1449597586379,9)
您可以給出還是指向下游操作的示例,它可以在收到水印後根據事件時間對元素重新排序?謝謝! –
@MaximKolchin這樣的重排序發生在例如在CEP庫中。你可以看看這裏:https://github.com/apache/flink/blob/master/flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/operator/AbstractKeyedCEPPatternOperator.java –