How Uber Builds A Cross Data Center Replication Platform on Apache Kafka

Agenda
01 Apache Kafka at Uber
02 Apache Kafka pipeline & replication
03 uReplicator
04 Data loss detection
05 Q&A

Real-time Dynamic Pricing
[Diagram labels: App Views, Vehicle Information, Apache Kafka, Stream Processing, Dynamic pricing]

Uber Eats - Real-Time ETAs

A bunch more.
- Fraud Detection
- Driver & Rider Sign-ups, etc.

Apache Kafka - Use Cases
- General pub-sub, messaging queue
- Stream processing: AthenaX, a self-service streaming analytics platform (Apache Samza & Apache Flink)
- Database changelog transport: Cassandra, MySQL, etc.
- Ingestion: HDFS, S3
- Logging

Data Infrastructure at Uber
[Diagram labels: PRODUCERS, CONSUMERS, Real-time Analytics, Alerts, Dashboards, Samza/Flink, Applications, Data Science, Analytics, Reporting, Kafka, Vertica/Hive, Rider App, Driver App, API/Services, Etc., Ad-hoc Exploration, ELK, Debugging, Hadoop, Surge, Mobile App, Cassandra, MySQL, DATABASES, (Internal) Services, AWS S3, Payment]

Scale (excluding replication)
- Messages/Day: Trillions
- Data: PBs
- Topics: Tens of Thousands

Apache Kafka Pipeline at Uber
[Diagram labels: DC1, DC2, Applications, Proxy Client, Kafka REST Proxy, Regional Kafka, Secondary Apache Kafka, Aggregate Kafka, uReplicator, Offset Sync Service]

Aggregation
[Diagram labels: DC1, DC2, Regional Kafka, Aggregate Kafka, uReplicator, Offset Sync Service]
Global view

Cross-Data Center Failover
[Diagram labels: DC1, DC2, Regional Kafka, Aggregate Kafka, uReplicator, Offset Sync Service]
During runtime
- uReplicator reports the offset mapping to the offset sync service
- The offset sync service is all-active, and the offset info is replicated across data centers
During failover
- Consumers ask the offset sync service for offsets to resume consumption, based on their last committed offsets
- Offse