Using Broadcast Streams in Flink
Code
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// ...env setting
// Source
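KafkaSource<SourceValue> kafkaSource =
    KafkaSource.<SourceValue>builder()
        .setBootstrapServers(BOOTSTRAP_SERVERS)
        .setTopics(TOPIC)
        // build() requires a deserializer; SourceValueDeserializer is a placeholder name
        .setValueOnlyDeserializer(new SourceValueDeserializer())
        .build();
DataStreamSource<SourceValue> source =
    env.fromSource(kafkaSource, WatermarkStrategy.noWatermarks(), "sourceName");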
// Broadcast
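MapStateDescriptor<Key, Value> state =
    new MapStateDescriptor<>("state", Key.class, Value.class);
// addSource() needs a SourceFunction<Value>; this note leaves the concrete source unspecified
BroadcastStream<Value> broadcastStream =
    env.addSource(/* SourceFunction<Value> that reads the dimension table */).broadcast(state);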
// Process
source.connect(broadcastStream)
    .process(new MyProcessFunction())
    .addSink(new MySinkFunction());
env.execute();
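
The pipeline above passes a MyProcessFunction to process() without showing it. A minimal sketch of what such a function could look like follows. For brevity it assumes the main stream carries String keys and the broadcast stream carries "key=value" Strings, standing in for SourceValue, Key and Value. The enrich helper and the "<unknown>" fallback are hypothetical. With the real types the class would extend BroadcastProcessFunction<SourceValue, Value, OUT> and reuse the same descriptor that is passed to broadcast().

```java
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

/** Sketch only: String types stand in for SourceValue / Key / Value (an assumption). */
class MyProcessFunction extends BroadcastProcessFunction<String, String, String> {

    // Must be the same descriptor (name and types) that is passed to broadcast(...).
    static final MapStateDescriptor<String, String> STATE =
            new MapStateDescriptor<>("state", Types.STRING, Types.STRING);

    // Hypothetical pure helper, kept separate so it can be tested without a cluster.
    static String enrich(String key, String dimension) {
        return key + " -> " + (dimension == null ? "<unknown>" : dimension);
    }

    @Override
    public void processElement(String key, ReadOnlyContext ctx, Collector<String> out)
            throws Exception {
        // Main-stream side: broadcast state is read-only here. The lookup can
        // return null if the matching broadcast record has not arrived yet.
        String dimension = ctx.getBroadcastState(STATE).get(key);
        out.collect(enrich(key, dimension));
    }

    @Override
    public void processBroadcastElement(String record, Context ctx, Collector<String> out)
            throws Exception {
        // Broadcast side: each parallel instance receives every record
        // and updates its own copy of the state.
        String[] kv = record.split("=", 2);
        if (kv.length == 2) {
            ctx.getBroadcastState(STATE).put(kv[0], kv[1]);
        }
    }
}
```

Flink gives no ordering guarantee between the two inputs, which is why the sketch tolerates a missing entry instead of assuming the broadcast state is already populated.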
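Thoughts
Broadcast streams in Flink fit cases where the dimension table rarely changes. When the broadcast side is fed by a bounded read of an upstream table, its source operator switches to the FINISHED state as soon as that read completes, so later changes to the table never reach the broadcast state.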
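
The consequence can be illustrated with a small plain-Java model. It does not use the Flink API, and the class and method names are hypothetical. It only mimics a broadcast side that copies the table once and then stops emitting.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model (not Flink API) of a broadcast side fed by a one-off table read. */
class BoundedBroadcastModel {
    // Stands in for the broadcast state held by each parallel task.
    private final Map<String, String> broadcastState = new HashMap<>();
    private boolean broadcastFinished = false;

    /** The bounded source reads the table once, then finishes. */
    void loadOnce(Map<String, String> table) {
        if (broadcastFinished) {
            return; // a finished source emits nothing more
        }
        broadcastState.putAll(table);
        broadcastFinished = true;
    }

    /** Main-stream lookup against whatever was broadcast before the source finished. */
    String lookup(String key) {
        return broadcastState.getOrDefault(key, "<unknown>");
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("sku-1", "old-name");

        BoundedBroadcastModel job = new BoundedBroadcastModel();
        job.loadOnce(table);

        table.put("sku-1", "new-name"); // the dimension table changes later
        job.loadOnce(table);            // no effect: the source has already finished

        System.out.println(job.lookup("sku-1")); // prints "old-name"
    }
}
```

In this model the main stream keeps seeing the stale value. This is the behavior that makes a one-off broadcast a poor fit for tables that change often.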