Apache Flink from Scratch (13): Flink Counters
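Requirement: when a text file comes in, some of its lines may be malformed or garbled. How do we count those bad lines, and how do we extract them?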
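A first attempt keeps the count in a plain field of the map function:

    import org.apache.flink.api.common.functions.RichMapFunction
    import org.apache.flink.api.scala._

    def main(args: Array[String]): Unit = {
      val env = ExecutionEnvironment.getExecutionEnvironment
      val data = env.fromElements("hadoop", "spark", "pyspark", "storm")

      data.map(new RichMapFunction[String, Long] {
        // Plain field: every parallel instance of this function has its own copy
        var counter = 0L

        override def map(value: String): Long = {
          counter = counter + 1
          println("counter:" + counter)
          counter
        }
      }).setParallelism(2).print()
    }

With this approach, the count is no longer correct once the parallelism is set above 1, because each parallel instance of the function increments its own copy of counter.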
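To see why the plain field miscounts, the following is a minimal plain-Java simulation, not Flink code. It assumes that each parallel subtask works on its own copy of the function and that elements are handed out round-robin; the real distribution inside Flink may differ. The names `LocalCounterSketch`, `CountingMapper`, and `run` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation (NOT the Flink API): shows what happens when every
// parallel subtask owns a separate copy of a function with a field counter.
class LocalCounterSketch {

    // Hypothetical stand-in for a map function that counts in a plain field
    static class CountingMapper {
        long counter = 0L;

        long map(String value) {
            counter = counter + 1;
            return counter;
        }
    }

    // Assumption: elements are distributed round-robin over the subtasks.
    // Returns the value emitted by each map call, in input order.
    static List<Long> run(List<String> data, int parallelism) {
        List<CountingMapper> subtasks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            subtasks.add(new CountingMapper());
        }
        List<Long> emitted = new ArrayList<>();
        for (int i = 0; i < data.size(); i++) {
            emitted.add(subtasks.get(i % parallelism).map(data.get(i)));
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("hadoop", "spark", "pyspark", "storm");
        System.out.println(run(data, 1)); // prints [1, 2, 3, 4]
        System.out.println(run(data, 2)); // prints [1, 1, 2, 2]
    }
}
```

With a parallelism of 2, no single copy ever reaches 4, which mirrors the miscount seen in the Flink job.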
The correct way is to define an Accumulator and count with it. The Scala implementation looks like this:
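    import org.apache.flink.api.common.accumulators.LongCounter
    import org.apache.flink.api.common.functions.RichMapFunction
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.core.fs.FileSystem.WriteMode

    val info = data.map(new RichMapFunction[String, String] {
      // step 1: define the counter
      val counter = new LongCounter()

      override def open(parameters: Configuration): Unit = {
        // step 2: register the counter under a name
        getRuntimeContext.addAccumulator("ele-counts-scala", counter)
      }

      override def map(in: String): String = {
        counter.add(1)
        in
      }
    })

    info.writeAsText("E:/test3", WriteMode.OVERWRITE).setParallelism(4)
    val jobResult = env.execute("CounterApp")

    // step 3: read the counter from the job result
    val num = jobResult.getAccumulatorResult[Long]("ele-counts-scala")
    println("num:" + num)

The Java implementation: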
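    import org.apache.flink.api.common.JobExecutionResult;
    import org.apache.flink.api.common.accumulators.LongCounter;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.DataSource;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.core.fs.FileSystem;

    public class JavaCounterApp {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment executionEnvironment = ExecutionEnvironment.getExecutionEnvironment();
            DataSource<String> data = executionEnvironment.fromElements("hadoop", "spark", "pyspark", "storm");

            DataSet<String> dataSet = data.map(new RichMapFunction<String, String>() {
                // step 1: define the counter
                LongCounter counter = new LongCounter();

                @Override
                public void open(Configuration parameters) throws Exception {
                    // step 2: register the counter under a name
                    getRuntimeContext().addAccumulator("ele-counts-java", counter);
                }

                @Override
                public String map(String value) throws Exception {
                    counter.add(1);
                    return value;
                }
            });

            dataSet.writeAsText("E:/test4", FileSystem.WriteMode.OVERWRITE).setParallelism(3);
            JobExecutionResult javaCounterApp = executionEnvironment.execute("JavaCounterApp");

            // step 3: read the counter from the job result
            long num = javaCounterApp.getAccumulatorResult("ele-counts-java");
            System.out.println("num:" + num);
        }
    }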
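The accumulator gives the right total because each parallel subtask keeps a local partial count and the partial counts are merged once the job finishes. The following is a rough plain-Java sketch of that idea, not Flink's implementation. `SimpleLongCounter` and `countAll` are hypothetical names for this illustration, and the round-robin distribution is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch (NOT Flink's implementation) of the accumulator idea:
// count locally per subtask, then merge the partial counts into one total.
class AccumulatorMergeSketch {

    // Hypothetical stand-in for an accumulator: a local value plus a merge step
    static class SimpleLongCounter {
        private long localValue = 0L;

        void add(long n) {
            localValue += n;
        }

        long getLocalValue() {
            return localValue;
        }

        void merge(SimpleLongCounter other) {
            localValue += other.localValue;
        }
    }

    // Assumption: elements are distributed round-robin over the subtasks.
    static long countAll(List<String> data, int parallelism) {
        List<SimpleLongCounter> perSubtask = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            perSubtask.add(new SimpleLongCounter());
        }
        // Each subtask only increments its own local counter
        for (int i = 0; i < data.size(); i++) {
            perSubtask.get(i % parallelism).add(1);
        }
        // Merge step: combine all partial counts into a single result
        SimpleLongCounter total = new SimpleLongCounter();
        for (SimpleLongCounter partial : perSubtask) {
            total.merge(partial);
        }
        return total.getLocalValue();
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("hadoop", "spark", "pyspark", "storm");
        System.out.println(countAll(data, 1)); // prints 4
        System.out.println(countAll(data, 3)); // prints 4
    }
}
```

The total is the same for any parallelism, which is why the jobs above report 4 even though the map runs on several subtasks.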