SlideShare une entreprise Scribd logo
1  sur  21
Apache Kafka
http://incubator.apache.org/kafka


         /*

              Joe Stein, Chief Architect
              http://www.medialets.com
              Twitter: @allthingshadoop

              Committer: Apache Kafka
         */




                     1
Overview

• whoami
• Kafka: What is it?
• Kafka: Why do we need it?
• Kafka: What do we get?
• Kafka: How do we get it?
• Kafka: Lets jump in!!!


                   2
Medialets




    3
Medialets
•   Largest deployment of rich media ads for mobile devices
•   Installed on hundreds of millions of devices
•   3-4 TB of new data every day
•   Thousands of services in production
•   Hundreds of thousands of events received every second
•   Response times are measured in microseconds
•   Languages
     – 55% JVM (70% Scala & 30% Java)
     – 20% C/C++
     – 13% Python
     – 10% Ruby
     – 2% Bash


                                 4
Apache Kafka

  What is it?




       6
A distributed publish-subscribe messaging system

• Originally created by LinkedIn, contributed to Apache in
  July 2011 currently in incubation
• Kafka is written in Scala
• Multi-language support for Publish/Consumer API
  (Scala, Java, Ruby, Python, C++, Go, PHP, etc)




                            7
Apache Kafka

Why do we need it?




        8
Offline log aggregation and real-time messaging
Other “log-aggregation only” systems (e.g. Scribe and Flume) are
architected for “push” to drive the data.
   – high performance and scale however:
      • Expected end points are large (e.g. Hadoop)
      • End points can’t have lots of business logic in real-time
        because they have consume as fast as data is pushed to
        them… excepting the data is their main job
• Messaging Systems (e.g. RabitMQ, ActiveMQ)
   – Does not scale
      • No API for batching, transactional (broker retains consumers
        stream position)
      • No message persistence means multiple consumers over
        time are impossible limiting architecture




                                  9
All-in-one system with one architecture and one API
• Kafka is a specialized system and overlaps uses cases for
  both offline and real-time log processing.




                              10
Apache Kafka

What do we get?




       11
Performance & Scale
  • Producer Test:
    – LinkedIn configured the broker in all systems to asynchronously
      flush messages to its persistence store.
    – For each system, they ran a single producer to publish a total of 10
      million messages, each of 200 bytes.
    – They configured the Kafka producer to send messages in batches of
      size 1 and 50. ActiveMQ and RabbitMQ don’t seem to have an easy
      way to batch messages and they assumed that it used a batch size
      of 1.
    – In the next slide, the x-axis represents the amount of data sent to
      the broker over time in MB, and the y-axis corresponds to the
      producer throughput in messages per second. On average, Kafka
      can publish messages at the rate of 50,000 and 400,000 messages
      per second for batch size of 1 and 50, respectively.




http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf

                                                                        12
Performance & Scale




http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf

                                                                        13
Performance & Scale
  • Consumer Test:
    – In the second experiment, LinkedIn tested the performance
      of the consumer. Again, for all systems, they used a single
      consumer to retrieve a total of 10 millions messages.
    – They configured all systems so that each pull request
      should prefetch approximately the same amount data---up
      to 1000 messages or about 200KB.
    – For both ActiveMQ and RabbitMQ, they set the consumer
      acknowledge mode to be automatic. Since all messages fit
      in memory, all systems were serving data from the page
      cache of the underlying file system or some in-memory
      buffers.




http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf

                                                                        14
Performance & Scale




http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf

                                                                        15
Apache Kafka

How do we get it?




       16
Performance & Scale
• Producing:
   – Kafka producer currently doesn’t wait for acknowledgements from the broker
     and sends messages as faster as the broker can handle.
       • This is a valid optimization for the log aggregation case, as data must be
         sent asynchronously to avoid introducing any latency into the live serving of
         traffic. We note that without acknowledging the producer, there is no
         guarantee that every published message is actually received by the broker.
       • For many types of log data, it is desirable to trade durability for throughput,
         as long as the number of dropped messages is relatively small
       • Durability through replication is being addressed in 0.8
   – Kafka has a very efficient storage format.
     http://incubator.apache.org/kafka/design.html
   – Batching
• Consuming:
   – Kafka does not maintain the delivery state which means 0 writes for each
     consumed message.
   – Kafka uses the sendfile API making the transfer of bytes from socket to disk
     through kernal space saving copies and calls between kernel user back to
     kernel



                                           17
Apache Kafka

  Lets jump in




       18
Producer
core/src/main/scala/kafka/tools/ProducerShell.scala

/**
 * Interactive shell for producing messages from the command line
 */

// config setup

val propsFile = options.valueOf(producerPropsOpt)
  val producerConfig = new ProducerConfig(Utils.loadProps(propsFile))
  val topic = options.valueOf(topicOpt)
  val producer = new Producer[String, String](producerConfig)

  val input = new BufferedReader(new InputStreamReader(System.in))
  var done = false
  while(!done) {
    val line = input.readLine()
    if(line == null) {
      done = true
    } else {
      val message = line.trim
      producer.send(new ProducerData[String, String](topic, message))
      println("Sent: %s (%d bytes)".format(line, message.getBytes.length))
    }
  }
  producer.close()
                                                  19
Consumer
core/src/main/scala/kafka/consumer/ConsoleConsumer.scala
/**
 * Consumer that dumps messages out to standard out.
 */
val connector = Consumer.create(config) //kafka.consumer.ConsumerConnector
val stream = connector.createMessageStreamsByFilter(filterSpec).get(0)
val iter = if(maxMessages >= 0)
      stream.slice(0, maxMessages)
    else
      stream
 val formatter: MessageFormatter = messageFormatterClass.newInstance().asInstanceOf[MessageFormatter]
    formatter.init(formatterArgs)

  try {
    for(messageAndTopic <- iter) {
      try {
        formatter.writeTo(messageAndTopic.message, System.out)
      } catch {
        case e =>
          if (skipMessageOnError)
            error("Error processing message, skipping this message: ", e)
          else
            throw e
      }
      if(System.out.checkError()) {
        // This means no one is listening to our output stream any more, time to shutdown
        System.err.println("Unable to write to standard out, closing consumer.")
        formatter.close()
        connector.shutdown()
        System.exit(1)
      }
    }
  } catch {
                                                                  20
    case e => error("Error processing message, stopping consumer: ", e)
Running everything
Download Kafka Source
• http://incubator.apache.org/kafka/downloads.html

Open a Terminal
• cp ~/Downloads/kafka-0.7.1-incubating-src.tgz .
• tar -xvf kafka-0.7.1-incubating-src.tgz
• cd kafka-0.7.1-incubating
• ./sbt update
• ./sbt package
Open 3 more terminals http://incubator.apache.org/kafka/quickstart.html
• Terminal 1
    – bin/zookeeper-server-start.sh config/zookeeper.properties
• Terminal 2
    – bin/kafka-server-start.sh config/server.properties
• Terminal 3
    – bin/kafka-producer-shell.sh --props config/producer.properties --topic scalathon
    – start typing
• Terminal 4
    – bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic scalathon
• Terminal 3
    – Type some more
• Terminal 4
    – See what you just typed
    – Ctrl+c
    – bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic scalathon --from-beginning
    – See EVERYTHING you have typed
                                                21
We are hiring!
 /*

      Joe Stein, Chief Architect
      http://www.medialets.com
      Twitter: @allthingshadoop

 */


 Medialets
 The rich media ad
 platform for mobile.
                      connect@medialets.com
                      www.medialets.com/showcas
                      e




              22

Contenu connexe

Tendances

Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache KafkaJoe Stein
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka IntroductionAmita Mirajkar
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperRahul Jain
 
Kafka connect 101
Kafka connect 101Kafka connect 101
Kafka connect 101Whiteklay
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Erik Onnen
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafkaemreakis
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin PodvalMartin Podval
 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...Lucas Jellema
 
Apache Kafka: Next Generation Distributed Messaging System
Apache Kafka: Next Generation Distributed Messaging SystemApache Kafka: Next Generation Distributed Messaging System
Apache Kafka: Next Generation Distributed Messaging SystemEdureka!
 
A la rencontre de Kafka, le log distribué par Florian GARCIA
A la rencontre de Kafka, le log distribué par Florian GARCIAA la rencontre de Kafka, le log distribué par Florian GARCIA
A la rencontre de Kafka, le log distribué par Florian GARCIALa Cuisine du Web
 
LINE's messaging service architecture underlying more than 200 million monthl...
LINE's messaging service architecture underlying more than 200 million monthl...LINE's messaging service architecture underlying more than 200 million monthl...
LINE's messaging service architecture underlying more than 200 million monthl...kawamuray
 
Devoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaDevoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaLászló-Róbert Albert
 

Tendances (20)

Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache Kafka
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
kafka for db as postgres
kafka for db as postgreskafka for db as postgres
kafka for db as postgres
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and Zookeeper
 
Kafka connect 101
Kafka connect 101Kafka connect 101
Kafka connect 101
 
Kafka
KafkaKafka
Kafka
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache Kafka: Next Generation Distributed Messaging System
Apache Kafka: Next Generation Distributed Messaging SystemApache Kafka: Next Generation Distributed Messaging System
Apache Kafka: Next Generation Distributed Messaging System
 
A la rencontre de Kafka, le log distribué par Florian GARCIA
A la rencontre de Kafka, le log distribué par Florian GARCIAA la rencontre de Kafka, le log distribué par Florian GARCIA
A la rencontre de Kafka, le log distribué par Florian GARCIA
 
LINE's messaging service architecture underlying more than 200 million monthl...
LINE's messaging service architecture underlying more than 200 million monthl...LINE's messaging service architecture underlying more than 200 million monthl...
LINE's messaging service architecture underlying more than 200 million monthl...
 
Kafka: Internals
Kafka: InternalsKafka: Internals
Kafka: Internals
 
Devoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaDevoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with Kafka
 

En vedette

Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkRahul Jain
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignMichael Noll
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Michael Noll
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 
Apache Kafka - Yüksek Performanslı Dağıtık Mesajlaşma Sistemi - Türkçe
Apache Kafka - Yüksek Performanslı Dağıtık Mesajlaşma Sistemi - TürkçeApache Kafka - Yüksek Performanslı Dağıtık Mesajlaşma Sistemi - Türkçe
Apache Kafka - Yüksek Performanslı Dağıtık Mesajlaşma Sistemi - TürkçeEmre Akış
 
Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1Knoldus Inc.
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaJoe Stein
 
Apache Kafka, Un système distribué de messagerie hautement performant
Apache Kafka, Un système distribué de messagerie hautement performantApache Kafka, Un système distribué de messagerie hautement performant
Apache Kafka, Un système distribué de messagerie hautement performantALTIC Altic
 
BigQuery at AppsFlyer - past, present and future
BigQuery at AppsFlyer - past, present and futureBigQuery at AppsFlyer - past, present and future
BigQuery at AppsFlyer - past, present and futureNir Rubinstein
 
Event Driven Architecture - OPITZ CONSULTING - Schmutz - Winterberg
Event Driven Architecture - OPITZ CONSULTING - Schmutz - WinterbergEvent Driven Architecture - OPITZ CONSULTING - Schmutz - Winterberg
Event Driven Architecture - OPITZ CONSULTING - Schmutz - WinterbergOPITZ CONSULTING Deutschland
 
Devoxx fr 2016 - Apache Kafka - Stream Data Platform
Devoxx fr 2016 - Apache Kafka - Stream Data PlatformDevoxx fr 2016 - Apache Kafka - Stream Data Platform
Devoxx fr 2016 - Apache Kafka - Stream Data PlatformPublicis Sapient Engineering
 
Interoperability and scalability with microservices in science
Interoperability and scalability with microservices in scienceInteroperability and scalability with microservices in science
Interoperability and scalability with microservices in scienceOla Spjuth
 
A Practical Guide for Selecting an Enterprise Messaging Platforms
A Practical Guide for Selecting an Enterprise Messaging PlatformsA Practical Guide for Selecting an Enterprise Messaging Platforms
A Practical Guide for Selecting an Enterprise Messaging PlatformsJesus Rodriguez
 
Kafka website activity architecture
Kafka website activity architectureKafka website activity architecture
Kafka website activity architectureOmid Vahdaty
 
jstein.cassandra.nyc.2011
jstein.cassandra.nyc.2011jstein.cassandra.nyc.2011
jstein.cassandra.nyc.2011Joe Stein
 
Storing Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsStoring Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsJoe Stein
 
Hands on Hadoop and pig
Hands on Hadoop and pigHands on Hadoop and pig
Hands on Hadoop and pigSudar Muthu
 

En vedette (20)

Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache Spark
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - Verisign
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Apache Kafka - Yüksek Performanslı Dağıtık Mesajlaşma Sistemi - Türkçe
Apache Kafka - Yüksek Performanslı Dağıtık Mesajlaşma Sistemi - TürkçeApache Kafka - Yüksek Performanslı Dağıtık Mesajlaşma Sistemi - Türkçe
Apache Kafka - Yüksek Performanslı Dağıtık Mesajlaşma Sistemi - Türkçe
 
Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache Kafka
 
Apache Kafka, Un système distribué de messagerie hautement performant
Apache Kafka, Un système distribué de messagerie hautement performantApache Kafka, Un système distribué de messagerie hautement performant
Apache Kafka, Un système distribué de messagerie hautement performant
 
BigQuery at AppsFlyer - past, present and future
BigQuery at AppsFlyer - past, present and futureBigQuery at AppsFlyer - past, present and future
BigQuery at AppsFlyer - past, present and future
 
Event Driven Architecture - OPITZ CONSULTING - Schmutz - Winterberg
Event Driven Architecture - OPITZ CONSULTING - Schmutz - WinterbergEvent Driven Architecture - OPITZ CONSULTING - Schmutz - Winterberg
Event Driven Architecture - OPITZ CONSULTING - Schmutz - Winterberg
 
Det May Btm
Det May BtmDet May Btm
Det May Btm
 
Devoxx fr 2016 - Apache Kafka - Stream Data Platform
Devoxx fr 2016 - Apache Kafka - Stream Data PlatformDevoxx fr 2016 - Apache Kafka - Stream Data Platform
Devoxx fr 2016 - Apache Kafka - Stream Data Platform
 
Interoperability and scalability with microservices in science
Interoperability and scalability with microservices in scienceInteroperability and scalability with microservices in science
Interoperability and scalability with microservices in science
 
A Practical Guide for Selecting an Enterprise Messaging Platforms
A Practical Guide for Selecting an Enterprise Messaging PlatformsA Practical Guide for Selecting an Enterprise Messaging Platforms
A Practical Guide for Selecting an Enterprise Messaging Platforms
 
Kafka website activity architecture
Kafka website activity architectureKafka website activity architecture
Kafka website activity architecture
 
jstein.cassandra.nyc.2011
jstein.cassandra.nyc.2011jstein.cassandra.nyc.2011
jstein.cassandra.nyc.2011
 
Storing Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsStoring Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite Columns
 
Hands on Hadoop and pig
Hands on Hadoop and pigHands on Hadoop and pig
Hands on Hadoop and pig
 

Similaire à Apache Kafka

Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaJoe Stein
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...Athens Big Data
 
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics RedefinedApache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics RedefinedEdureka!
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaSlim Baltagi
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Data Con LA
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaGuido Schmutz
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraReal-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraJoe Stein
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningGuido Schmutz
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Anton Nazaruk
 
Apache kafka configuration-guide
Apache kafka configuration-guideApache kafka configuration-guide
Apache kafka configuration-guideChetan Khatri
 
Cluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQCluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQShameera Rathnayaka
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Fwdays
 
Multitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINEMultitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINEkawamuray
 

Similaire à Apache Kafka (20)

Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache Kafka
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics RedefinedApache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
 
Kafka Explainaton
Kafka ExplainatonKafka Explainaton
Kafka Explainaton
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraReal-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Apache kafka configuration-guide
Apache kafka configuration-guideApache kafka configuration-guide
Apache kafka configuration-guide
 
Cluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQCluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQ
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...
 
Multitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINEMultitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINE
 

Plus de Joe Stein

Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogJoe Stein
 
SMACK Stack 1.1
SMACK Stack 1.1SMACK Stack 1.1
SMACK Stack 1.1Joe Stein
 
Get started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosGet started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosJoe Stein
 
Introduction To Apache Mesos
Introduction To Apache MesosIntroduction To Apache Mesos
Introduction To Apache MesosJoe Stein
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache MesosJoe Stein
 
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Joe Stein
 
Containerized Data Persistence on Mesos
Containerized Data Persistence on MesosContainerized Data Persistence on Mesos
Containerized Data Persistence on MesosJoe Stein
 
Making Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosMaking Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosJoe Stein
 
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloReal-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloJoe Stein
 
Building and Deploying Application to Apache Mesos
Building and Deploying Application to Apache MesosBuilding and Deploying Application to Apache Mesos
Building and Deploying Application to Apache MesosJoe Stein
 
Apache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on MesosApache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on MesosJoe Stein
 
Developing with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaDeveloping with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaJoe Stein
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache MesosJoe Stein
 
Apache Cassandra 2.0
Apache Cassandra 2.0Apache Cassandra 2.0
Apache Cassandra 2.0Joe Stein
 
Hadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonHadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonJoe Stein
 

Plus de Joe Stein (16)

Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit Log
 
SMACK Stack 1.1
SMACK Stack 1.1SMACK Stack 1.1
SMACK Stack 1.1
 
Get started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosGet started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache Mesos
 
Introduction To Apache Mesos
Introduction To Apache MesosIntroduction To Apache Mesos
Introduction To Apache Mesos
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache Mesos
 
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
 
Containerized Data Persistence on Mesos
Containerized Data Persistence on MesosContainerized Data Persistence on Mesos
Containerized Data Persistence on Mesos
 
Making Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosMaking Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache Mesos
 
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloReal-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
 
Building and Deploying Application to Apache Mesos
Building and Deploying Application to Apache MesosBuilding and Deploying Application to Apache Mesos
Building and Deploying Application to Apache Mesos
 
Apache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on MesosApache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on Mesos
 
Developing with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaDeveloping with the Go client for Apache Kafka
Developing with the Go client for Apache Kafka
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesos
 
Apache Cassandra 2.0
Apache Cassandra 2.0Apache Cassandra 2.0
Apache Cassandra 2.0
 
Hadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonHadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With Python
 

Dernier

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 

Dernier (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 

Apache Kafka

  • 1. Apache Kafka http://incubator.apache.org/kafka /* Joe Stein, Chief Architect http://www.medialets.com Twitter: @allthingshadoop Committer: Apache Kafka */ 1
  • 2. Overview • whoami • Kafka: What is it? • Kafka: Why do we need it? • Kafka: What do we get? • Kafka: How do we get it? • Kafka: Lets jump in!!! 2
  • 4. Medialets • Largest deployment of rich media ads for mobile devices • Installed on hundreds of millions of devices • 3-4 TB of new data every day • Thousands of services in production • Hundreds of thousands of events received every second • Response times are measured in microseconds • Languages – 55% JVM (70% Scala & 30% Java) – 20% C/C++ – 13% Python – 10% Ruby – 2% Bash 4
  • 5. Apache Kafka What is it? 6
  • 6. A distributed publish-subscribe messaging system • Originally created by LinkedIn, contributed to Apache in July 2011 currently in incubation • Kafka is written in Scala • Multi-language support for Publish/Consumer API (Scala, Java, Ruby, Python, C++, Go, PHP, etc) 7
  • 7. Apache Kafka Why do we need it? 8
  • 8. Offline log aggregation and real-time messaging Other “log-aggregation only” systems (e.g. Scribe and Flume) are architected for “push” to drive the data. – high performance and scale however: • Expected end points are large (e.g. Hadoop) • End points can’t have lots of business logic in real-time because they have consume as fast as data is pushed to them… excepting the data is their main job • Messaging Systems (e.g. RabitMQ, ActiveMQ) – Does not scale • No API for batching, transactional (broker retains consumers stream position) • No message persistence means multiple consumers over time are impossible limiting architecture 9
  • 9. All-in-one system with one architecture and one API • Kafka is a specialized system and overlaps uses cases for both offline and real-time log processing. 10
  • 10. Apache Kafka What do we get? 11
  • 11. Performance & Scale • Producer Test: – LinkedIn configured the broker in all systems to asynchronously flush messages to its persistence store. – For each system, they ran a single producer to publish a total of 10 million messages, each of 200 bytes. – They configured the Kafka producer to send messages in batches of size 1 and 50. ActiveMQ and RabbitMQ don’t seem to have an easy way to batch messages and they assumed that it used a batch size of 1. – In the next slide, the x-axis represents the amount of data sent to the broker over time in MB, and the y-axis corresponds to the producer throughput in messages per second. On average, Kafka can publish messages at the rate of 50,000 and 400,000 messages per second for batch size of 1 and 50, respectively. http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf 12
  • 13. Performance & Scale • Consumer Test: – In the second experiment, LinkedIn tested the performance of the consumer. Again, for all systems, they used a single consumer to retrieve a total of 10 millions messages. – They configured all systems so that each pull request should prefetch approximately the same amount data---up to 1000 messages or about 200KB. – For both ActiveMQ and RabbitMQ, they set the consumer acknowledge mode to be automatic. Since all messages fit in memory, all systems were serving data from the page cache of the underlying file system or some in-memory buffers. http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf 14
  • 15. Apache Kafka How do we get it? 16
  • 16. Performance & Scale • Producing: – Kafka producer currently doesn’t wait for acknowledgements from the broker and sends messages as faster as the broker can handle. • This is a valid optimization for the log aggregation case, as data must be sent asynchronously to avoid introducing any latency into the live serving of traffic. We note that without acknowledging the producer, there is no guarantee that every published message is actually received by the broker. • For many types of log data, it is desirable to trade durability for throughput, as long as the number of dropped messages is relatively small • Durability through replication is being addressed in 0.8 – Kafka has a very efficient storage format. http://incubator.apache.org/kafka/design.html – Batching • Consuming: – Kafka does not maintain the delivery state which means 0 writes for each consumed message. – Kafka uses the sendfile API making the transfer of bytes from socket to disk through kernal space saving copies and calls between kernel user back to kernel 17
  • 17. Apache Kafka Lets jump in 18
  • 18. Producer core/src/main/scala/kafka/tools/ProducerShell.scala /** * Interactive shell for producing messages from the command line */ // config setup val propsFile = options.valueOf(producerPropsOpt) val producerConfig = new ProducerConfig(Utils.loadProps(propsFile)) val topic = options.valueOf(topicOpt) val producer = new Producer[String, String](producerConfig) val input = new BufferedReader(new InputStreamReader(System.in)) var done = false while(!done) { val line = input.readLine() if(line == null) { done = true } else { val message = line.trim producer.send(new ProducerData[String, String](topic, message)) println("Sent: %s (%d bytes)".format(line, message.getBytes.length)) } } producer.close() 19
  • 19. Consumer core/src/main/scala/kafka/consumer/ConsoleConsumer.scala /** * Consumer that dumps messages out to standard out. */ val connector = Consumer.create(config) //kafka.consumer.ConsumerConnector val stream = connector.createMessageStreamsByFilter(filterSpec).get(0) val iter = if(maxMessages >= 0) stream.slice(0, maxMessages) else stream val formatter: MessageFormatter = messageFormatterClass.newInstance().asInstanceOf[MessageFormatter] formatter.init(formatterArgs) try { for(messageAndTopic <- iter) { try { formatter.writeTo(messageAndTopic.message, System.out) } catch { case e => if (skipMessageOnError) error("Error processing message, skipping this message: ", e) else throw e } if(System.out.checkError()) { // This means no one is listening to our output stream any more, time to shutdown System.err.println("Unable to write to standard out, closing consumer.") formatter.close() connector.shutdown() System.exit(1) } } } catch { 20 case e => error("Error processing message, stopping consumer: ", e)
  • 20. Running everything Download Kafka Source • http://incubator.apache.org/kafka/downloads.html Open a Terminal • cp ~/Downloads/kafka-0.7.1-incubating-src.tgz . • tar -xvf kafka-0.7.1-incubating-src.tgz • cd kafka-0.7.1-incubating • ./sbt update • ./sbt package Open 3 more terminals http://incubator.apache.org/kafka/quickstart.html • Terminal 1 – bin/zookeeper-server-start.sh config/zookeeper.properties • Terminal 2 – bin/kafka-server-start.sh config/server.properties • Terminal 3 – bin/kafka-producer-shell.sh --props config/producer.properties --topic scalathon – start typing • Terminal 4 – bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic scalathon • Terminal 3 – Type some more • Terminal 4 – See what you just typed – Ctrl+c – bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic scalathon --from-beginning – See EVERYTHING you have typed 21
  • 21. We are hiring! /* Joe Stein, Chief Architect http://www.medialets.com Twitter: @allthingshadoop */ Medialets The rich media ad platform for mobile. connect@medialets.com www.medialets.com/showcas e 22