Kafka Transactionally
Consistent Replication
—
Shawn Robertson, P. Eng
IIDR Kafka Architect
Notices and disclaimers
Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include
unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR
IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF
PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.
IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the
results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.
References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does
business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for
informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.
It is the customer’s responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and
regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its
services or products will ensure that the customer is in compliance with any law.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in
connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be
addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY
DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The provision of
the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. IBM, the IBM logo, ibm.com,
Aspera®, Bluemix, BlueworksLive, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM
ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®,
pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®,
StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many
jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at:
www.ibm.com/legal/copytrade.shtml.
Legal Stuff
Introduction And Agenda
Agenda
Objective & Background
Terminology For the Talk
The Source Transaction Stream
Database Change Capture, Kafka Style
Leveraging The Source Transaction Stream
The IBM IIDR Kafka Replication Objective
Evolution Towards Transactionally Consistent Consumption
Introducing the Transactionally Consistent Consumer (TCC)
TCC Cloud Advantages
Comparison with Kafka Transactional and Idempotence Feature (KIP-98)
Links And Resources
Objective
Describe the challenges and an approach to replicating database change data to Kafka in a transactionally
consistent manner.
Approach:
Outline the challenges of transactional database change replication to Kafka. Describe the evolution,
design motivations, and methodology behind IBM’s development of Transactionally Consistent
Replication to Kafka.
Provide insight into the following areas:
• Serial representation of source database changes (The Source Transaction Stream)
• Consuming applications leveraging the STS – How and Why
• Kafka and the unique Requirements for Database Change Stream replication
• Possible Approaches to ordered/transactional Kafka Replication
• Hows and whys of IBM's Kafka Transactional Replication Methodology
• Comparison with Kafka’s Transactional/Idempotence functionality
Introduction And Agenda
Motivation
Why develop the IBM Transactionally Consistent Consumer?
• The customer’s always right!
• Next generation Kafka Data Flows
• Additional influence from having owned Db2 Backup and Restore
Introduction And Agenda
Some Terminology!
Anatomy of A Transaction
DML Operation
An Insert, Update, or Delete against a specific
database table on the source database.
Transaction
A logical unit of work representing a change to
the database.
Consists of a sequence of one or more
operations which are applied to the source
database and treated as a single atomic change.
Commit Operation
Used to mark the end point or completion of a
transaction.
Rollback
An operation that undoes an uncommitted transaction. A rolled-back
transaction's operations are not replicated to the CDC
target.
Terminology
Replication Components
Source Database
The database whose log changes are
captured and replicated.
IIDR (CDC) Source Engine
The installed IIDR (CDC) agent that captures the
changes to the source database from log files.
IIDR (CDC) Target Engine
The IIDR (CDC) for Kafka agent installed on a
Linux server and responsible for producing data
into Kafka.
Target
The Kafka cluster which ultimately receives the
replicated data.
Subscription
A set of source database tables whose
change data capture information is replicated
consistently with respect to the other tables in the set.
Terminology
The Source Transaction Stream – Where it All Begins
Four Time-Interleaved Source Database Transactions
Transaction 1: Op1(Tab2) Op2(Tab3) Op3(Tab2) Commit
Transaction 2: Op1(Tab2) Op2(Tab2) Op3(Tab3) Op4(Tab2) Commit
Transaction 3: Op1(Tab1) Commit
Transaction 4: Op1(Tab1) Commit
===================== TIME =====================>
Source Transaction Stream
IIDR Kafka Target <========= T3 Op1, T2 Op1, T2 Op2, T2 Op3, T2 Op4, T4 Op1, T1 Op1, T1 Op2, T1 Op3
(send)
As transaction T3 was the first to commit, its operation comes first in the source transaction stream. T3 comprises only one
operation.
T2, with four operations, committed next in time, so its operations, in their order of occurrence, come next in the source transaction
stream.
Source Database
Source Transaction Stream
A serial, logical sequence of operations whose
order is determined first by the commit order of the
transaction to which each operation belongs, and then
by the relative order of the operation
within that transaction.
Source Transaction Stream
Transactions applied to Kafka Cluster with Multi-Partitioned Topics
===================== Offset =====================>
Tab1 Partition1: T3 Op1, T4 Op1
Tab2 Partition1: T2 Op1, T2 Op2, T1 Op3
Tab2 Partition2: T2 Op4, T1 Op1
Tab3 Partition1: T2 Op3
Tab3 Partition2: T1 Op2
Database Change Capture Kafka Flows:
Common Characteristics
Maximizing Value From Transactional Database Records:
• Order Matters (Operation and Transaction)
• Operation Order within a transaction even when operations are applied to different tables
• Transaction order as the source sees it, even when constituent operations apply to different tables
• Atomicity Matters
• The “meta” transactional relationship of operations contains valuable/required insights
• “Eventual” Consistency Is Not Sufficient
• Data processed as if it were originating from the source database must be consistent
• Exactly Once Processing Matters
• Business logic can dictate that each source operation must result in exactly one action in response (e.g., real-time apps)
• Performance Matters
• To The Cluster, From the Cluster, Crash Recovery
Database Change Capture Transaction Kafka Flows: Common Characteristics
Leveraging The Source Transaction Stream
What Value does the Source
Transaction Stream provide a
Consumer application?
Gives us the order of operations as they were “logically”
applied to the source database.
Gives us the units of work with which the changes were
logically applied to the source database.
Applying the changes in source transaction stream
order to another database will result in the target tables
being populated with the same information.
Committing the operations of the source transaction
stream at the transaction boundaries ensures that, from
an external application's perspective, the target database
only exists in states that the source database did.
Utilizing the commit boundaries and ordering of the
source transaction stream ensures that multiple tables
can be kept transactionally consistent with each other.
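For contrast, a plain Kafka consumer subscribed to several table topics (topic names are illustrative) gives none of this: records arrive in whatever interleaving poll() returns, with per-partition order only and no transaction boundaries. A baseline sketch:

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PlainConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");
        props.put("group.id", "naive-reader");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Arrays.asList("tab1", "tab2", "tab3")); // one topic per table
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<byte[], byte[]> r : records) {
                    // Order is only guaranteed within a single partition; records from
                    // different tables/partitions arrive interleaved in arbitrary order,
                    // with no knowledge of transaction boundaries.
                    process(r);
                }
            }
        }
    }
    static void process(ConsumerRecord<byte[], byte[]> r) { /* application logic */ }
}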
Source Transaction Stream
What is a scenario where
knowing the source
transaction stream is useful?
When Operation Order Matters!
Enforced State Validation Example:
The source has two tables with a parent child
relationship. Transactions insert first into the
parent and then the child.
We wish to consume the replicated operations from
Kafka and apply the change records to a target
that also mandates this referential integrity be
maintained.
Knowing the source transaction stream would
allow us to ensure that operations are applied first
to the parent and then to the child which is vital if
the target also strictly enforces its referential
integrity.
Source Transaction Stream
That Last Example Sounded a
Lot Like Your Old DB2 Life
Shawn!
Give me something Kafka and
single Topic.
When Operation Order Matters! (take 2)
Business Logic State Validation Example:
A topic represents an account balance which may
never be less than $0 at any point.
Initially a cheque is deposited for $50.
Subsequently a withdrawal is made for $40.
If the application reading from the Kafka topic and
applying the changes to database balances sees
the operations as they occur in the source
transaction stream, then it will always have a
valid balance.
If the consuming application does not receive the
operations in source transaction stream order, then
an invalid balance exists when consuming the
withdrawal prior to the deposit record.
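A toy sketch of the point, with hard-coded amounts standing in for the replicated deposit and withdrawal records:

// Applying the operations in source transaction stream order keeps every
// intermediate state valid; reversing them produces an invalid balance.
public class BalanceOrder {
    public static void main(String[] args) {
        long balance = 0;
        long[] inStreamOrder = {+50, -40};   // deposit, then withdrawal
        long[] outOfOrder    = {-40, +50};   // withdrawal seen first

        for (long op : inStreamOrder) {
            balance += op;
            assert balance >= 0;             // holds: 50, then 10
        }

        balance = 0;
        for (long op : outOfOrder) {
            balance += op;                   // first step yields -40: an invalid state
        }
    }
}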
Source Transaction Stream
What is an example of when
Exactly Once processing
matters?
Exactly Once Real-Time Event Triggering:
Incoming Credit Card purchase requests are
replicated to Kafka.
The bank requests text authorization, in real time,
for purchases that exceed a $500 threshold.
A Kafka consumer application consumes the user's
purchase topic and triggers an authorization event
for $500+ purchases.
Customers would be unhappy (and concerned!) if
they are requested to do multiple authorizations for
a single purchase!
Source Transaction Stream
How about a real world
example making use of
multiple properties of the
source transaction stream?
Multi-Table Midnight Snapshot Example
When Operation Order and Transactional
Consistency Matter!
Customer replicates CDC data for several key
tables to Kafka.
Each table is replicated to its own Kafka topic.
A consumer application continuously applies the
replicated changes to a separate key/value store
table per topic (table).
Every night the customer makes a copy of the key
value store tables consistent to the last completed
transaction on the source before midnight.
Resulting tables are used for analytics, end-of-day
bookkeeping, and as a potential table data
backup (i.e., the key/value pairs could form the
contents of a load statement).
Source Transaction Stream
Order Matters:
Multi Table Midnight Snapshot
Operation Order Aggregate Net Change Example:
Records representing DML operations are being
actively applied to a target table.
Altering the source transaction stream sequence of
Update, Update, Delete would result in the
key/value store table no longer representing the
source table row for row.
E.g.
Initial key/value pair: (1 / (abc))
Operations in source transaction stream order:
(1 / (def)), (1 / (ghi)), (1 / null)
Result in key/value store: no value for key 1.
Operations if applied out of stream order:
(1 / null), (1 / (def)), (1 / (ghi))
Result in key/value store: value ghi for key 1.
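The same sequence expressed against an in-memory map standing in for the key/value store (a sketch mirroring the keys and values above):

import java.util.HashMap;
import java.util.Map;

public class NetChange {
    public static void main(String[] args) {
        Map<Integer, String> store = new HashMap<>();
        store.put(1, "abc"); // initial key/value pair

        // Operations in source transaction stream order: update, update, delete
        store.put(1, "def");
        store.put(1, "ghi");
        store.remove(1);     // result: no value for key 1, matching the source row for row

        // The same operations applied out of stream order: delete, update, update
        store.put(1, "abc"); // reset to the initial state
        store.remove(1);
        store.put(1, "def");
        store.put(1, "ghi"); // result: key 1 = "ghi", which no longer matches the source
    }
}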
Source Transaction Stream
Transaction Boundaries Matter
(Pt. 1):
Multi Table Midnight Snapshot
Each individual table is essentially corrupt if
operations are not applied to a transaction
boundary.
Midnight table snapshot is likely to contain
operations from part of a transaction if applied
without knowledge of transaction boundaries.
Resulting target table can be in a state that did not
exist on the source.
Resulting target table may be in an “impossible”
state from a business logic perspective.
Resulting target table may violate business rules.
Analytics requiring more than general aggregation
are at risk; e.g., aggregate accounting cannot be performed.
Using the table snapshot as a potential DR load would
corrupt the source database.
Source Transaction Stream
Transaction Boundaries Matter
(Pt.2):
Multi Table Midnight Snapshot
It is essential to bring all tables being replicated to
the same transaction boundary in the source
transaction stream.
Example:
Topic 1 represents orders placed.
Topic 2 represents the quantity of iron available for
new manufacturing of orders.
Failure to honor the source transaction stream's
commit boundaries might mean the table snapshot
for topic 1 includes an additional order whose
debit of available raw materials is not yet reflected
in the key/value table representing topic 2.
Result: Incorrect assumption that more raw
materials are available for future orders than
actually are at end of day.
Source Transaction Stream
The IBM IIDR Kafka Replication Objective
GOAL
Empower Kafka consumer applications to recreate the source commit stream!
• Consume Kafka records from multiple topics and multiple partitions in the order the originating
source operations occurred.
• Consume understanding the original source “transactionality” of the operations:
• Transaction Boundaries: so transactions can be atomically manipulated.
• Order of the Transactions: So the atomic units of work can be sequentially applied if required.
• Consume data without operation duplication.
In Other Words:
Give consuming applications the ability to treat data read from Kafka with the semantics of a
transactional database.
The Ideal Solution Criteria:
• No duplication. To truly recreate the source transaction stream is to do so without duplication of any
operations.
• No performance impact when applying data to Kafka.
• Must support the parallel apply of transactions to Kafka
• Must support the parallel apply of operations with differing topics/partitions
• High performance recreation of the source transaction stream sequence
• Allow for traditional consumption of topic data for legacy applications
• Allow for restarting the sequence of operations from any point in the source transaction stream
• Must be compatible with all versions/distributions of Kafka 0.10.X and higher
• Consumers must not require IIDR (CDC) to be running to generate the sequence of source commit stream
operations.
The IBM IIDR Kafka Replication Objective
The Goal Visualized
===================== Offset =====================>
Tab1 Partition1: T3 Op1, T4 Op1
Tab2 Partition1: T2 Op1, T2 Op2, T1 Op3
Tab2 Partition2: T2 Op4, T1 Op1
Tab3 Partition1: T2 Op3
Tab3 Partition2: T1 Op2
Allow a Kafka Consumer Application To Read This….
And Receive This…..
Kafka Consumer Application <======== T3 Op1 , T2 Op1, T2 Op2 , T2 Op3, T2 Op4 , T4 Op1, T1 Op1, T1 Op2, T1 Op3
Evolution Towards Transactionally Consistent Consumption
First Attempt:
Send All Records to a Single
Topic, Enrich with Tx ID.
<<<<<< Play First Video >>>>>>>
Pros
• Operations are absolutely ordered
• Most transaction boundaries can be determined
• Consumer can checkpoint using offset
Cons
• Poor performance
• One topic with only one partition
• No consumer group parallelism
• Susceptible to Producer Duplication / Batch out of
order considerations
• Consumer must read entries it may be uninterested in
• Most Recent transaction boundary difficult to
determine
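A minimal sketch of what this first approach amounts to; the topic name and record contents are illustrative. All operations for all tables funnel into one single-partition topic, keyed by transaction ID, so offsets alone give total order:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SingleTopicApproach {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Every operation for every table goes to one topic with one partition,
            // enriched with its transaction ID so the consumer can infer boundaries.
            producer.send(new ProducerRecord<>("all-changes", 0, "tx3", "T3 Op1 (Tab1)"));
            producer.send(new ProducerRecord<>("all-changes", 0, "tx2", "T2 Op1 (Tab2)"));
            producer.send(new ProducerRecord<>("all-changes", 0, "tx2", "T2 Op2 (Tab2)"));
            // One partition => absolute order, but no parallelism, and every
            // consumer must read every record, interested or not.
        }
    }
}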
Evolution Towards Transactionally Consistent Consumption
Second Attempt:
One Topic Per Source Table,
Numerically Sequence all
Operations,
Explicitly write a special
commit record.
<<<<<< Play Second Video >>>>>>>
Pros
• Operations can be absolutely ordered
• Transaction boundaries can be determined
• Consumer can checkpoint using offset
Cons
• Consumer can receive messages for transactions
partially written to Kafka. Fails exactly once*
• Consumers will receive transactions/operations out
of order
• Staging area required to resequence, or a serialized application reading every message on each topic/partition
• Complex/Costly Producer crash recovery
• Consumer must read entries it may be uninterested in
Evolution Towards Transactionally Consistent Consumption
INTRODUCING:
The IBM IIDR Transactionally Consistent Consumer (TCC)
Consuming Using The IIDR Kafka Transactionally Consistent Consumer (TCC)
IBM Kafka Transactionally Consistent Consumer (TCC)
[Architecture diagram] The Kafka cluster holds the user data topics (Tab 1 Partition 1 ... Tab N Partition N) plus a commit topic of transaction metadata (Tx3 metadata, Tx2 metadata, ...). The consumer application instantiates the TCC API, which starts user topic consumers 1..N and a commit topic consumer; a deduplication and ordering engine combines their reads and emits the source transaction stream:
T3 Op1, T3 Commit, T2 Op1, T2 Op2, T2 Op3, T2 Op4, T2 Commit
IBM Kafka Transactionally Consistent Consumer (TCC)
<<Play Third Video>>
• IBM IIDR console-consumer utility - Similar to Kafka's kafka-console-consumer, which wraps a Kafka
consumer and prints formatted output to the screen, IBM IIDR provides a utility that wraps the TCC and
prints formatted output to the screen. Code for IBM's console consumer is included in the IIDR
samples.jar so that users have an example application that leverages the TCC API.
Relevant Avro Console Fields For DML Operation Records:
Operation Sequence – Order of the operation for this transaction
Bookmark – Used to restart the TCC producing records at this point in the output stream for the subscription
Topic – The user data topic the operation record being returned was replicated to
Partition/Offset – The user data partition and offset into the partition the operation record was written to
Key – The key of the operation record
Value – The value of the operation record
Audit Format TCC Example:
Db2 database change data replicated to Kafka
with “audit” style records using the Audit
KCOP.
Source Database operations:
$ db2 "insert into tab1 values (8,8,8, 'Tab1 stuff')"
$ db2 "insert into tab3 values (30,30,'Tab3 stuff')"
$ db2 "update tab1 set i1=9, i2=9, i3=9 where i1=8"
$ db2 "update tab3 set i1=31, i2=31 where i1=30"
Formatted Output of the TCC based AvroConsole utility:
$ java -cp "lib/*"
com.datamirror.ts.kafka.txconsistentconsumer.sampleapplications.AvroConsole
-i kafka1 -s KAFKA_MULTI_TAB -c /CDC/conf/exactConsumer.properties -d
Dml record, operationSequence=0,
Bookmark hex value:
[000100236B61666B61312D4B41464B415F4D554C54495F5441422D63
6F6D6D697473747265616D000000000000001B00000000]
topic: kafka1.kafka_multi_tab.sourcedb.shawnrr.tab1
partition:offset (0:5)
Key {"I2": 8, "I3": 8}
Value {"I1": 8, "I2": 8, "I3": 8, "V1": "Tab1 stuff", "A_ENTTYP": "PT",
"A_TIMSTAMP": "2018-10-10 19:34:57.000000000000}
Commit record. operationSequence=0, commitStreamTopicName=kafka1-
KAFKA_MULTI_TAB-commitstream, commitStreamTopicOffset=27
Bookmark hex value:
[000100236B61666B61312D4B41464B415F4D554C54495F5441422D63
6F6D6D697473747265616D000000000000001B00000000]
Dml record, operationSequence=0,
Bookmark hex value:
[000100236B61666B61312D4B41464B415F4D554C54495F5441422D63
6F6D6D697473747265616D000000000000001B00000001]
topic: kafka1.kafka_multi_tab.sourcedb.shawnrr.tab3
partition:offset (0:0)
Key {"V3": "Tab3 stuff"}
Value {"I1": 30, "I2": 30, "V3": "Tab3 stuff", "A_ENTTYP": "PT",
"A_TIMSTAMP": "2018-10-10 19:35:40.000000000000}
IBM Kafka Transactionally Consistent Consumer (TCC)
$ db2 "insert into tab1 values (8,8,8, 'Tab1 stuff')"
$ db2 "insert into tab3 values (30,30,'Tab3 stuff')"
$ db2 "update tab1 set i1=9, i2=9, i3=9 where i1=8"
$ db2 "update tab3 set i1=31, i2=31 where i1=30"
$ java -cp "lib/*"
com.datamirror.ts.kafka.txconsistentconsumer.sampleapplications.AvroConsole -
i kafka1 -s KAFKA_MULTI_TAB -c /CDC/conf/exactConsumer.properties -d
Key {"I2": 8, "I3": 8}
Value {"I1": 8, "I2": 8, "I3": 8, "V1": "Tab1 stuff", "A_ENTTYP":
"PT", "A_TIMSTAMP": "2018-10-10 19:34:57.000000000000}
Commit record.
Key {"V3": "Tab3 stuff"}
Value {"I1": 30, "I2": 30, "V3": "Tab3 stuff", "A_ENTTYP": "PT",
"A_TIMSTAMP": "2018-10-10 19:35:40.000000000000}
Commit record.
Key {"I2": 9, "I3": 9}
Value {"I1": 8, "I2": 8, "I3": 8, "V1": "Tab1 stuff", "A_ENTTYP":
"UB", "A_TIMSTAMP": "2018-10-10 19:36:11.000000000000}
Key {"I2": 9, "I3": 9}
Value {"I1": 9, "I2": 9, "I3": 9, "V1": "Tab1 stuff", "A_ENTTYP":
"UP", "A_TIMSTAMP": "2018-10-10 19:36:11.000000000000”}
Commit record.
Key {"V3": "Tab3 stuff"}
Value {"I1": 30, "I2": 30, "V3": "Tab3 stuff", "A_ENTTYP": "UB", "A_TIMSTAMP":
"2018-10-10 19:39:38.000000000000}
Key {"V3": "Tab3 stuff"}
Value {"I1": 31, "I2": 31, "V3": "Tab3 stuff", "A_ENTTYP": "UP", "A_TIMSTAMP":
"2018-10-10 19:39:38.000000000000}
Commit record
** Note 1: With the Audit KCOP, an update is a single DML operation on the source
represented by two records in Kafka: a before image (the old value) and an after
image (the value the row is updated to).
** Note 2: The commit record therefore occurs after the two records of the
update, which logically represent a single update operation on the source.
IBM Kafka Transactionally Consistent Consumer (TCC):
Abbreviated Entries
Commit record.
The TCC bookmark for the second-to-last update operation:
Dml record, operationSequence=2,
Bookmark hex value:
[000100236B61666B61312D4B41464B415F4D554C5449
5F5441422D636F6D6D697473747265616D00000000000
0001B00000003]
topic: kafka1.kafka_multi_tab.sourcedb.shawnrr.tab1
partition:offset (0:7)
Key {"I2": 9, "I3": 9}
Value {"I1": 9, "I2": 9, "I3": 9, "V1": "Tab1 stuff",
"A_ENTTYP": "UP", "A_TIMSTAMP": "2018-10-10
19:36:11.000000000000”}
$ jre64/jre/bin/java -cp "lib/*"
com.datamirror.ts.kafka.txconsistentconsumer.sampleapplications.AvroConsole -i kafka1 -s
KAFKA_MULTI_TAB -c /home/nz/CDC_KAFKA/CDC/conf/exactConsumer.properties -b
000100236B61666B61312D4B41464B415F4D554C54495F5441422D636F6D6D69747374726561
6D000000000000001B00000003 -d
Key {"V3": "Tab3 stuff"}
Value {"I1": 30, "I2": 30, "V3": "Tab3 stuff", "A_ENTTYP": "UB", "A_TIMSTAMP":
"2018-10-10 19:39:38.000000000000}
Key {"V3": "Tab3 stuff"}
Value {"I1": 31, "I2": 31, "V3": "Tab3 stuff", "A_ENTTYP": "UP", "A_TIMSTAMP":
"2018-10-10 19:39:38.000000000000"}
Commit record.
Note: This time a bookmark argument was added to the AvroConsole command
line. This bookmark corresponds to the second-to-last update operation. As a
result, the output of the TCC is the continuation of the transaction stream
immediately after the provided bookmark, i.e., the final update operation's two
Kafka records.
IBM Kafka Transactionally Consistent Consumer (TCC):
Bookmark Example
IBM Kafka Transactionally Consistent Consumer (TCC)
• Provides Records in the Original Source Transaction Stream Order. (Total Order)
• Provides Correct order for Kafka Records written to different partitions within a topic, and Kafka
Records written to different topics. (Total Order)
• Consumer application is not required to know the topics a particular transaction impacted.
• Only Provides Records for Transactions that Have been written in their entirety (atomically) to Kafka.
(Transactional Atomic)
• Only Returns one copy of a given transaction. (Exactly Once)
• Only Returns records for a given table operation once (Exactly Once)
• Provides a bookmark with each operation so that ordered transactional consumption can resume from
any location as desired
• Users can commit the bookmark together with their transformed output to record progress through the data stream, as sketched below.
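The sketch below shows how such a consumption loop might look. It is hypothetical: the type and method names (TccRecord, nextRecord(), getBookmark(), isCommit()) are placeholders invented for illustration, not the actual IIDR TCC API; the real example code ships in the IIDR samples.jar.

// Hypothetical sketch only: the interfaces below are placeholders standing in
// for the real IIDR TCC API (see the AvroConsole sample in samples.jar).
public class TccConsumptionSketch {
    interface TccRecord {
        boolean isCommit();   // placeholder: marks the end of a transaction
        byte[] getBookmark(); // placeholder: restart position in the STS
        byte[] getValue();    // placeholder: the user data record value
    }
    interface TransactionallyConsistentConsumer {
        TccRecord nextRecord(); // placeholder: next operation in STS order
    }

    static void consume(TransactionallyConsistentConsumer tcc,
                        java.util.function.Consumer<byte[]> apply) {
        byte[] lastBookmark = null;
        while (true) {
            TccRecord r = tcc.nextRecord();
            if (r.isCommit()) {
                // Transaction boundary: a safe point to atomically persist both
                // the applied changes and the bookmark, so a restart resumes the
                // stream exactly here with no duplicates.
                checkpoint(lastBookmark);
            } else {
                apply.accept(r.getValue());
                lastBookmark = r.getBookmark();
            }
        }
    }

    static void checkpoint(byte[] bookmark) {
        // Store the bookmark transactionally alongside the transformed output.
    }
}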
How The TCC Produces Data To Kafka (Producer Parallelism)
Exactly the same as default high performance IBM CDC Kafka replication.
• Producers write a transaction's records in parallel.
• Multiple producers are employed for a single transaction
• For a transaction that spans multiple topics
• For a transaction with topic(s) that have multiple partitions
• Multiple Requests In Flight from each Producer to maximize bandwidth
• A dedicated producer for every topic/partition pairing involved in a transaction
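A sketch of the kind of producer pool this implies; the configuration values are illustrative and the IIDR target engine's actual internals are not shown:

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class ProducerPool {
    private final Map<String, KafkaProducer<byte[], byte[]>> producers = new HashMap<>();

    // One dedicated producer per topic/partition pairing involved in a transaction.
    KafkaProducer<byte[], byte[]> forTopicPartition(String topic, int partition) {
        return producers.computeIfAbsent(topic + "-" + partition, k -> {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            // Multiple requests in flight maximize bandwidth, at the price of
            // possible reordering on retry, a challenge the commit stream resolves.
            props.put("max.in.flight.requests.per.connection", "5");
            return new KafkaProducer<>(props);
        });
    }
}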
IBM Kafka Transactionally Consistent Consumer (TCC)
How The TCC Produces Data To Kafka (Producer Parallelism)
Exactly the same as default high performance IBM CDC Kafka replication.
• Transaction Parallelism Also Employed!
• Transactions affecting different topic/partition pairings also written in Parallel
IBM Kafka Transactionally Consistent Consumer (TCC)
IIDR Kafka Target Engine – Producer Parallelism
[Diagram: IIDR Kafka Target Engine]
1) Incoming source transaction stream:
<======== T3 Op1 (Tab1), T2 Op1 (Tab2), T2 Op2 (Tab2), T2 Op3 (Tab3), T2 Op4 (Tab2), T4 Op1 (Tab1), T1 Op1 (Tab2), T1 Op2 (Tab3), T1 Op3 (Tab2)
2) KCOP parallel transformation stage, three threads:
Thread 1: T3 Op1, T2 Op1, T2 Op2 / Thread 2: T2 Op3, T2 Op4, T4 Op1 / Thread 3: T1 Op1, T1 Op2, T1 Op3
3) Accounting and assignment stage, tracking a topic/partition/offset triplet per operation ("?/?/?" until its callback arrives):
T3 Op1 Tab1/P1/0; T2 Op1 ?/?/?; T2 Op2 ?/?/?; T2 Op3 Tab3/P1/0; T2 Op4 ?/?/?; T4 Op1 Tab1/P1/1; T1 Op1 ?/?/?; T1 Op2 ?/?/?; T1 Op3 ?/?/?
4, 5) Parallel producer stage, one dedicated producer per topic/partition pairing:
Producer 1: T3 Op1, T4 Op1 / Producer 2: T2 Op1, T2 Op2, T1 Op3 / Producer 3: T2 Op4, T1 Op1 / Producer 4: T2 Op3 / Producer 5: T1 Op2
6) Kafka cluster:
Tab1 Partition1: T3 Op1, T4 Op1 / Tab2 Partition1: T2 Op1, T2 Op2, T1 Op3 / Tab2 Partition2: T2 Op4, T1 Op1 / Tab3 Partition1: T2 Op3 / Tab3 Partition2: T1 Op2
7) Un-ordered callbacks return each operation's topic/partition/offset triplet to the accounting stage.
Transaction Parallelism Challenges (The Cost of Performance)
Challenges
• A transaction’s operations potentially arrive at the Kafka cluster out of order
• Can occur when a transaction involves multiple topics or partitions
• However a topic/partition pairing is written by a single producer in order*
• Transactions themselves can be written out of order
• Can occur when composite operations apply to different topic/partition
pairings
• Potential duplicate records in communication/cluster failure scenarios
• Potential for order to be incorrect even on a topic/partition pair if retries occur while
multiple requests are in flight to the same topic
IBM Kafka Transactionally Consistent Consumer (TCC)
Parallelism Challenges Met! Generating the Commit Stream
IBM Kafka Transactionally Consistent Consumer (TCC)
A metadata list of transactions is maintained in
source transaction stream order.
Each transaction metadata block maintains an
entry for each of its constituent operations.
Each operation entry includes the
topic/partition/offset triplet reported by the producer
callback (see the sketch following the accounting list below).
Accounting and Assignment Stage (topic/partition/offset per operation):
T3 Op1 Tab1/P1/0; T2 Op1 Tab2/P1/0; T2 Op2 ?/?/?; T2 Op3 ?/?/?; T2 Op4 Tab2/P2/1; T4 Op1 Tab1/P1/1; T1 Op1 ?/?/?; T1 Op2 Tab3/P2/2; T1 Op3 ?/?/?
Transaction Accounting List (in source transaction stream order):
Transaction 3 (1 operation): Tab1/P1/0
Transaction 2 (4 operations): Tab2/P1/0, ?, ?, Tab2/P2/1
Transaction 4 (1 operation): Tab1/P1/1
Transaction 1 (3 operations): ?, Tab3/P2/2, ?
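With the standard Kafka producer API, that triplet arrives through the send callback's RecordMetadata; a sketch (OperationEntry is a placeholder for the accounting-list entry, not IIDR code):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TripletCapture {
    static void sendOperation(KafkaProducer<byte[], byte[]> producer,
                              ProducerRecord<byte[], byte[]> record,
                              OperationEntry entry) {
        producer.send(record, (metadata, exception) -> {
            if (exception == null) {
                // Callbacks arrive unordered across producers; record the
                // topic/partition/offset triplet against this operation's entry
                // in the transaction accounting list.
                entry.complete(metadata.topic(), metadata.partition(), metadata.offset());
            }
        });
    }

    interface OperationEntry { // placeholder for the accounting-list entry
        void complete(String topic, int partition, long offset);
    }
}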
Parallelism Challenge Met! Generating the Commit Stream
IBM Kafka Transactionally Consistent Consumer (TCC)
Transaction Tx3 is the only candidate for being
officially committed for the TCC.
To be a candidate for TCC committing:
1) All prior transactions in the STS have been, or could be,
committed.
2) All operation triplets have been received by callback for the
current candidate transaction.
Accounting and Assignment Stage (unchanged from the previous slide):
T3 Op1 Tab1/P1/0; T2 Op1 Tab2/P1/0; T2 Op2 ?/?/?; T2 Op3 ?/?/?; T2 Op4 Tab2/P2/1; T4 Op1 Tab1/P1/1; T1 Op1 ?/?/?; T1 Op2 Tab3/P2/2; T1 Op3 ?/?/?
Transaction Accounting List:
Transaction 3 (1 operation): Tab1/P1/0
Transaction 2 (4 operations): Tab2/P1/0, ?, ?, Tab2/P2/1
Transaction 4 (1 operation): Tab1/P1/1
Transaction 1 (3 operations): ?, Tab3/P2/2, ?
Parallelism Challenge Met! Writing the Commit Stream
IBM Kafka Transactionally Consistent Consumer (TCC)
Transactions T3 and T2 are officially committed
for TCC functionality.
Candidate transactions are officially committed
for the TCC when their transaction metadata is
written to the commit topic
Transaction metadata is written to the commit
topic in Source Transaction Stream order.
Transaction Accounting List:
Tx3 (1 operation): Tab1/P1/0
Tx2 (4 operations): Tab2/P1/0, Tab2/P1/1, Tab3/P1/0, Tab2/P2/0
Tx4 (1 operation): Tab1/P1/1
Tx1 (3 operations): ?, Tab3/P2/2, ?
[Diagram: IIDR Kafka Target Engine writing to the Kafka cluster]
User data topics: Tab1 Partition1: T3 Op1, T4 Op1 / Tab2 Partition1: T2 Op1, T2 Op2, T1 Op3 / Tab2 Partition2: T2 Op4, T1 Op1 / Tab3 Partition1: T2 Op3 / Tab3 Partition2: T1 Op2
Commit Topic: T3 (Tab1/P1/0), T2 (Tab2/P1/0, Tab2/P1/1, Tab3/P1/0, Tab2/P2/0)
Utilizing the TCC – Kafka Consumer Applications
IBM Kafka Transactionally Consistent Consumer (TCC)
The TCC API is called by the consumer application.
The TCC starts a configurable number of Kafka consumers.
The TCC returns a desired subscription's operation records.
The TCC returns operations in Source Transaction Stream order.
The TCC enriches each returned operation with a "bookmark".
The TCC enriches the returned record stream with a commit record to identify the end of a transaction.
[Diagram] Kafka cluster:
User data topics: Tab1 Partition1: T3 Op1, T4 Op1 / Tab2 Partition1: T2 Op1, T2 Op2, T1 Op3 / Tab2 Partition2: T2 Op4, T1 Op1 / Tab3 Partition1: T2 Op3 / Tab3 Partition2: T1 Op2
Commit Topic: T3 (Tab1/P1/0), T2 (Tab2/P1/0, Tab2/P1/1, Tab3/P1/0, Tab2/P2/0)
The consumer application instantiates the TCC API, which starts user topic consumers 1-4 and a commit topic consumer; ordering/deduplication logic combines their reads and emits the Source Transaction Stream:
T3 Op1, T3 Commit, T2 Op1, T2 Op2, T2 Op3, T2 Op4, T2 Commit
TCC API Consumption Features!
Consumer Parallelism - A single TCC utilizes multiple consumers
Horizontal Scalability - Multiple TCCs can be used at the same time on the same topics
Exactly Once - Bookmark returned with every operation allows restart at exactly that point in the STS
Exactly Once – An operation returned from the TCC guarantees that its entire transaction, and all previous transactions, have been replicated to Kafka
No Duplicates - Duplicate records in the user data topic are not included in TCC output
Topic Filtering - The TCC can be instructed to return a subset of the subscription topics replicated
User Topic Data Not Altered - No additional fields are added to the user data topics, enrichment occurs in TCC
No impact to Regular consumer applications - Consumer applications can simply read from user data topics as normal
Commit Boundaries Identified - TCC processing enriches output stream with commit boundary indicators
Logical Isolation – The TCC will ignore records written to user data topics by other applications even if they are separate TCC
instances.
Order - Operation and Transaction order is provided in the TCC output stream
IBM Kafka Transactionally Consistent Consumer (TCC)
Integrating the TCC in End To End Application Flow
Performance - Leverages parallelism on many levels for performance (Producing and Consuming)
Performance - Kafka work flows can choose a subset of topics to process (only necessary data)
Performance - User can shred data across multiple partitions for performance and still read in Source Transaction Stream order
Insight - Knowledge available only from real-time consistent (atomic) datasets can be leveraged.
Insight - Knowledge only available from relative transactions and operation ordering can be leveraged.
Insight - Ordered Source Transaction Stream allows for Numeric Sequencing at Partition, Topic, or Subscription level.
Robustness - Applications can use bookmark to ensure real-time exactly once processing of data and ensure no duplicates,
even in crash restart scenarios
Robustness – Data not related to the TCC’s subscription written to user data topics is ignored
Extreme Flexibility - Combined with CDC's KCOP feature, produced records can be in any format. Multiple versions
of a source operation can be written to different topics (different formats/content) while still providing TCC semantics
IBM Kafka Transactionally Consistent Consumer (TCC)
TCC Dependencies
• Commit stream records must exist for TCC to produce data
• User Data records must exist at the topic and offset originally written to
• Consumer Application responsible for using bookmark for checkpointing if desired
• The consuming application must read through the Transactionally Consistent Consumer to obtain these guarantees.
IBM Kafka Transactionally Consistent Consumer (TCC)
Transactionally Consistent Consumer Cloud Advantages
Leveraging the TCC for the Cloud
Kafka In The Cloud + TCC consuming replication
• Generally, advantages outlined for TCC replication to Kafka are applicable
• Utilizes parallel producers to help mitigate longer round-trip times to the cluster
• Utilizes more partitions per topic to horizontally scale (TCC allows original ordering to be known)
• Utilizes deduplication and exactly once delivery to mitigate connectivity / timeout / additional issues
• Utilizes bookmarking and total operation order to enable sequencing, parallel processing, and destination validation
Kafka on Prem + TCC Consuming Application in the cloud
• Generally, advantages outlined for TCC replication to Kafka are applicable
• Utilizes Parallel consumers reading from multiple topics to help mitigate longer round-trip times
• Utilizes TCC bookmark to enable consumers to checkpoint their consumption and processing
• Utilizes total order to enable the generation of sequencing
• Utilizes encoded operation topic/location to avoid unnecessary topic reads
Transactionally Consistent Consumer Cloud Advantages
Comparison with Kafka Transactional and Idempotence
Feature (KIP-98)
How IBM TCC differs from
Kafka transactional / exactly once feature (KIP 98)
Different Approaches
KIP 98 - employs mechanisms to ensure records are written to topics with explicit transactions and exactly once
IBM TCC - employs mechanisms to ensure records are returned to the consuming application ordered and exactly once
Different Motivation
KIP 98 - “the main motivation for transactions is to enable exactly once processing in Kafka Streams.”
https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging#KIP-98-
ExactlyOnceDeliveryandTransactionalMessaging-TheOutOfOrderSequenceException
IBM TCC - Empower Kafka consumer applications to leverage change capture data maximizing: performance, robustness
and flexibility. Allow consumption and understanding of data as if it were read from the original source database.
Different Scope
KIP 98 - The user must ensure data is produced correctly. The producer application is responsible for crash recovery and for
identifying transactions explicitly; transactions and idempotence have single-producer-session limits. Consumers also have
limitations and need specific consideration when utilized.
Eg. “Further, since each new instance of a producer is assigned a new, unique, PID, we can only guarantee idempotent production within a single producer session.”
https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging#KIP-98-ExactlyOnceDeliveryandTransactionalMessaging-
TheOutOfOrderSequenceException
IBM TCC - Because the IBM CDC producer is part of the solution, the TCC's guarantees and semantics span sessions. The scope is end-to-end
delivery from the source database to the consumer application.
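For reference, a minimal sketch of the KIP-98 producer-side API using standard Apache Kafka client calls (topic names and the transactional.id value are illustrative); note the single-producer transactional.id that scopes the guarantees discussed above:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class Kip98Producer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("enable.idempotence", "true");        // exactly-once within a producer session
        props.put("transactional.id", "my-tx-producer"); // identifies this producer across restarts

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("topicA", "k1", "v1"));
            producer.send(new ProducerRecord<>("topicB", "k2", "v2"));
            producer.commitTransaction(); // atomic across topics for read_committed consumers
        }
    }
}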
Comparison with Kafka Transactional and Idempotence Feature (KIP-98)
Contrast of the IBM TCC to Kafka Exactly Once Feature
TCC is focused on replicating database change data transaction streams and providing transactional semantics
TCC inherently orders operations within a transaction
TCC empowers multiple producers to write a single transaction’s operations in parallel
TCC allows transactions to be written in parallel, out of order potentially, but can provide original transaction order
TCC Provides Explicit Transaction Boundaries for processing data in atomic units of work
TCC provides mechanisms for downstream consuming logic to serialize at the STS level, topic level, partition level
TCC focuses on delivering data exactly once to the consuming application, Kafka focuses on writing it once into the
cluster
TCC logically isolates itself from the user data topics. Other applications' uncommitted transactions on a topic do
not block the TCC
TCC does not need to utilize idempotence or Kafka transactional functionality. No overhead associated with those
features.
TCC returns all data for a given transaction sequentially, no interleaved operations from different transactions
Comparison with Kafka Transactional and Idempotence Feature (KIP-98)
Links and Resources
Links And Resources
IBM InfoSphere Data Replication (IIDR)
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.idr.frontend.doc/pv_welcome.html
IIDR Transactionally Consistent Consumer (TCC)
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/concepts/kafkatcc.html
IIDR KCOP (Kafka Custom Operation Formatter) Feature
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/concepts/kafkakcop.html
IBM Event Streams / Message Hub (Kafka Based Solutions)
https://www.ibm.com/cloud/message-hub (Cloud)
https://www.ibm.com/cloud/event-streams (On Premise)
Kafka Exactly Once Delivery and Transaction Messaging KIP
https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging#KIP-98-
ExactlyOnceDeliveryandTransactionalMessaging-TheOutOfOrderSequenceException
Links And Resources
Thank You!
Shawn Robertson, P. Eng
IBM IIDR Kafka Architect
—
shawnrr@ca.ibm.com
https://ca.linkedin.com/in/shawn-robertson-4738937b
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluentconfluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkconfluent
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloudconfluent
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Diveconfluent
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluentconfluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3confluent
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernizationconfluent
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataconfluent
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2confluent
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023confluent
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesisconfluent
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023confluent
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streamsconfluent
 

Plus de confluent (20)

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 

Dernier

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 

Dernier (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 

A Solution for Leveraging Kafka to Provide End-to-End ACID Transactions

  • 3. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Introduction And Agenda
  • 4. Agenda IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Objective & Background (slide 5); Terminology For the Talk (slide 7); The Source Transaction Stream (slide 10); Database Change Capture, Kafka Style (slide 15); Leveraging The Source Transaction Stream (slide 17); The IBM IIDR Kafka Replication Objective (slide 27); Evolution Towards Transactionally Consistent Consumption (slide 31); Introducing the Transactionally Consistent Consumer (TCC) (slide 34); TCC Cloud Advantages (slide 52); Comparison with Kafka Transactional and Idempotence Feature (KIP-98) (slide 54); Links And Resources (slide 57)
  • 5. Objective IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Describe the challenges and an approach to replicating database change data to Kafka in a transactionally consistent manner. Approach: Outline the challenges of transactional database change replication to Kafka. Describe the evolution, design motivations, and methodology behind IBM’s development of Transactionally Consistent Replication to Kafka. Provide insight into the following areas: • Serial representation of source database changes (The Source Transaction Stream) • Consuming applications leveraging the STS – How and Why • Kafka and the unique requirements for database change stream replication • Possible approaches to ordered/transactional Kafka replication • The hows and whys of IBM’s Kafka transactional replication methodology • Comparison with Kafka’s transactional/idempotence functionality Introduction And Agenda
  • 6. Motivation IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Why develop the IBM Transactionally Consistent Consumer? • The customer’s always right! • Next generation Kafka Data Flows • Additional influence from having owned Db2 Backup and Restore Introduction And Agenda
  • 7. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Some Terminology!
  • 8. Anatomy of A Transaction IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation DML Operation An Insert, Update, or Delete against a specific database table on the source database. Transaction A logical unit of work representing a change to the database. Consists of a sequence of one or more operations which are applied to the source database and treated as a single atomic change. Commit Operation Used to mark the end point or completion of a transaction. Rollback A transaction that is not committed and so will not have its operations replicated to the CDC target. Terminology
  • 9. Replication Components IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Source Database The database whose log changes are being captured and replicated. IIDR (CDC) Source Engine The installed IIDR (CDC) agent that captures the changes to the source database from log files. IIDR (CDC) Target Engine The IIDR (CDC) for Kafka agent installed on a Linux server and responsible for producing data into Kafka. Target The Kafka cluster which ultimately receives the replicated data. Subscription A set of source database tables whose change data capture information is to be replicated consistently with respect to each other. Terminology
  • 10. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation The Source Transaction Stream – Where it All Begins
  • 11. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Four Time-Interleaved Source Database Transactions Transaction 1 Op1(Tab2) Op2(Tab3) Op3(Tab2) Commit Transaction 2 Op1(Tab2) Op2(Tab2) Op3(Tab3) Op4(Tab2) Commit Transaction 3 Op1(Tab1) Commit Transaction 4 Op1(Tab1) Commit ===================== TIME ====================>
  • 12. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Source Transaction Stream IIDR Kafka Target <========= T3 Op1, T2 Op1, T2 Op2, T2 Op3, T2 Op4, T4 Op1, T1 Op1, T1 Op2, T1 Op3 (send) As transaction T3 was the first to commit, its operation is first in the source transaction stream. T3 comprises only one operation. T2, which has four operations, committed next in time, so its operations, in order of occurrence, are next in the source transaction stream.
  • 13. Source Database IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Source Transaction Stream A serial, logical sequence of operations whose order is determined first by the commit order of the transaction to which each operation belongs, and then by the operation's relative position within that transaction. Source Transaction Stream
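To make the ordering rule concrete, a minimal Java sketch (the Op record and its fields are illustrative assumptions, not part of the IIDR API): operations sort first by the commit sequence of their transaction, then by their position within it.

import java.util.Comparator;
import java.util.List;

// Minimal illustration of source transaction stream ordering: sort by the
// commit sequence of the owning transaction, then by position within it.
public class StsOrdering {
    // Hypothetical record type; the field names are illustrative only.
    record Op(long commitSeq, int opSeq, String label) {}

    static final Comparator<Op> STS_ORDER =
            Comparator.comparingLong(Op::commitSeq).thenComparingInt(Op::opSeq);

    public static void main(String[] args) {
        List<Op> ops = new java.util.ArrayList<>(List.of(
                new Op(2, 1, "T2 Op1"),
                new Op(1, 1, "T3 Op1"),   // T3 committed first, so it sorts first
                new Op(2, 2, "T2 Op2")));
        ops.sort(STS_ORDER);
        ops.forEach(op -> System.out.println(op.label()));  // T3 Op1, T2 Op1, T2 Op2
    }
}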
  • 14. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Transactions applied to Kafka Cluster with Multi-Partitioned Topics ===================== Offset ====================> Tab 1 Partition1 T3 Op1 T4 Op1 Tab2 Partition1 T2 Op1 T2 Op2 T1 Op3 Tab2 Partition2 T2 Op4 T1 Op1 Tab3 Partition1 T2 Op3 Tab3 Partition2 T1 Op2
  • 15. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Database Change Capture Kafka Flows: Common Characteristics
  • 16. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Maximizing Value From Transactional Database Records: • Order Matters (Operation and Transaction) • Operation order within a transaction, even when operations are applied to different tables • Transaction order as the source sees it, even when constituent operations apply to different tables • Atomicity Matters • The “meta” transactional relationship of operations contains valuable/required insights • “Eventual” Consistency Is Not Sufficient • Data processed as if it were originating from the source database must be consistent • Exactly Once Processing Matters • Business logic can dictate that each source operation must result in exactly one action in response (*real-time apps) • Performance Matters • To the cluster, from the cluster, crash recovery Database Change Capture Transaction Kafka Flows: Common Characteristics
  • 17. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Leveraging The Source Transaction Stream
  • 18. What value does the Source Transaction Stream provide a consumer application? IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Gives us the order of operations as they were “logically” applied to the source database. Gives us the units of work with which the changes were logically applied to the source database. Applying the changes in source transaction stream order to another database will result in the target tables being populated with the same information. Committing the operations of the source transaction stream at the transaction boundaries ensures that the target database only exists in states that the source database did, from an external application's perspective. Utilizing the commit boundaries and ordering of the source transaction stream ensures that multiple tables can be kept transactionally consistent with each other. Source Transaction Stream
  • 19. What is a scenario where knowing the source transaction stream is useful? IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation When Operation Order Matters! Enforced State Validation Example: The source has two tables with a parent/child relationship. Transactions insert first into the parent and then the child. We wish to consume the replicated operations from Kafka and apply the change records to a target that also mandates this referential integrity be maintained. Knowing the source transaction stream would allow us to ensure that operations are applied first to the parent and then to the child, which is vital if the target also strictly enforces its referential integrity. Source Transaction Stream
  • 20. That Last Example Sounded a Lot Like Your Old Db2 Life, Shawn! Give Me Something Kafka and Single-Topic. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation When Operation Order Matters! (take 2) Business Logic State Validation Example: A topic represents an account balance which may never be less than $0 at any point. Initially a cheque is deposited for $50. Subsequently a withdrawal is made for $40. If the application reading from the Kafka topic and applying the changes to database balances sees the operations as they occur in the source transaction stream, then it will always have a valid balance. If the consuming application does not receive the operations in source transaction stream order, then an invalid balance exists when consuming the withdrawal prior to the deposit record. Source Transaction Stream
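As a toy Java sketch of this balance example (illustrative only; the integer deltas stand in for the deposit and withdrawal records):

import java.util.List;

// Applying the deposit before the withdrawal (source transaction stream
// order) keeps the balance valid; the reverse order transiently violates
// the balance >= 0 business rule.
public class BalanceOrder {
    public static void main(String[] args) {
        apply(List.of(+50, -40));  // STS order: 50, then 10 (always valid)
        apply(List.of(-40, +50));  // out of order: -40 (invalid!), then 10
    }

    static void apply(List<Integer> deltas) {
        int balance = 0;
        for (int d : deltas) {
            balance += d;
            System.out.println("balance=" + balance
                    + (balance < 0 ? "  <-- violates balance >= 0" : ""));
        }
    }
}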
  • 21. What is an example of when Exactly Once processing matters? IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Exactly Once Real-Time Event Triggering: Incoming credit card purchase requests are replicated to Kafka. The bank requests text authorization, in real time, for purchases that exceed a $500 threshold. A Kafka consumer application consumes the user's purchase topic and triggers an authorization event for $500+ purchases. Customers would be unhappy (and concerned!) if they were asked to perform multiple authorizations for a single purchase! Source Transaction Stream
  • 22. How about a real-world example making use of multiple properties of the source transaction stream? Multi-Table Midnight Snapshot Example IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation When Operation Order and Transactional Consistency Matter! Customer replicates CDC data for several key tables to Kafka. Each table is replicated to its own Kafka topic. A consumer application continuously applies the replicated changes to a separate key/value store table per topic (table). Every night the customer makes a copy of the key/value store tables, consistent with the last transaction completed on the source before midnight. The resulting tables are used for analytics, end-of-day bookkeeping, and potentially as a table data backup (i.e., the key/value pairs could form the contents of a load statement). Source Transaction Stream
  • 23. Order Matters: Multi-Table Midnight Snapshot IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Operation Order Aggregate Net Change Example: Records representing DML operations are being actively applied to a target table. Altering the source transaction stream sequence of Update, Update, Delete would result in the key/value store table no longer representing the source table row for row. E.g., initial key/value pair: (1 / (abc)) Operations in source transaction stream order: (1 / (def)), (1 / (ghi)), (1 / null) Result in key/value store: no value for key 1. Operations if applied out of stream order: (1 / null), (1 / (def)), (1 / (ghi)) Result in key/value store: value ghi for key 1. Source Transaction Stream
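The same net-change example as a small, self-contained Java sketch (illustrative only; a null value stands in for a delete record):

import java.util.HashMap;
import java.util.Map;

// Applying the same three operations to a key/value store in different orders
// yields different final states. A null value stands in for a delete.
public class NetChange {
    public static void main(String[] args) {
        System.out.println(apply(new String[][] {
                {"1", "def"}, {"1", "ghi"}, {"1", null}}));  // STS order -> {}
        System.out.println(apply(new String[][] {
                {"1", null}, {"1", "def"}, {"1", "ghi"}}));  // out of order -> {1=ghi}
    }

    static Map<String, String> apply(String[][] ops) {
        Map<String, String> store = new HashMap<>();
        store.put("1", "abc");  // initial key/value pair (1 / (abc))
        for (String[] op : ops) {
            if (op[1] == null) store.remove(op[0]);  // delete
            else store.put(op[0], op[1]);            // insert/update
        }
        return store;
    }
}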
  • 24. Transaction Boundaries Matter (Pt. 1): Multi-Table Midnight Snapshot IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Each individual table is essentially corrupt if operations are not applied to a transaction boundary. The midnight table snapshot is likely to contain operations from part of a transaction if applied without knowledge of transaction boundaries. The resulting target table can be in a state that did not exist on the source. The resulting target table may be in an “impossible” state from a business logic perspective. The resulting target table may violate business rules. Analytics requiring more than general aggregation are at risk; e.g., aggregate accounting cannot be performed. Using the table snapshot as a potential DR load would corrupt the source database. Source Transaction Stream
  • 25. Transaction Boundaries Matter (Pt. 2): Multi-Table Midnight Snapshot IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation It is essential to bring all tables being replicated to the same transaction boundary in the source transaction stream. Example: Topic 1 represents orders placed. Topic 2 represents the quantity of iron available for new manufacturing of orders. Failing to align both tables to the same commit boundary in the source transaction stream might mean the table snapshot for topic 1 includes an additional order whose corresponding debit of available raw materials is not reflected in the key/value table representing topic 2. Result: Incorrect assumption that more raw materials are available for future orders than actually are at end of day. Source Transaction Stream
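One way to guarantee that any snapshot lands on a transaction boundary is to buffer each transaction's operations and apply them only at commit, as in this minimal Java sketch (Op, isCommit, and apply are illustrative stand-ins, not the TCC API):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Buffer each transaction's operations and merge them into the store only at
// a commit boundary, so any snapshot of the store reflects whole transactions.
public class BoundaryApply {
    interface Op {
        boolean isCommit();
        void apply(Map<String, String> store);
    }

    static void run(Iterable<Op> stsOrderedOps) {
        Map<String, String> store = new HashMap<>();  // target key/value table(s)
        List<Op> pending = new ArrayList<>();         // the currently open transaction
        for (Op op : stsOrderedOps) {
            if (op.isCommit()) {
                pending.forEach(p -> p.apply(store)); // apply the unit of work atomically
                pending.clear();
                // 'store' is now safe to snapshot: it contains only complete transactions.
            } else {
                pending.add(op);
            }
        }
    }
}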
  • 26. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation The IBM IIDR Kafka Replication Objective
  • 27. GOAL IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Empower Kafka consumer applications to recreate the source commit stream! • Consume Kafka records from multiple topics and multiple partitions in the order the originating source operations occurred. • Consume with an understanding of the original source “transactionality” of the operations: • Transaction Boundaries: so transactions can be atomically manipulated. • Order of the Transactions: so the atomic units of work can be sequentially applied if required. • Consume data without operation duplication. In Other Words: Give consuming applications the ability to treat data read from Kafka with the semantics of a transactional database.
  • 28. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation The Ideal Solution Criteria: • No duplication. To truly recreate the source transaction stream is to do so without duplication of any operations. • No performance impact when applying data to Kafka. • Must support the parallel apply of transactions to Kafka • Must support the parallel apply of operations with differing topics/partitions • High-performance recreation of the source transaction stream sequence • Allow for traditional consumption of topic data for legacy applications • Allow for restarting the sequence of operations from any point in the source transaction stream • Must be compatible with all versions/distributions of Kafka 0.10.X and higher • Consumers must not require IIDR (CDC) to be running to generate the sequence of source commit stream operations. The IBM IIDR Kafka Replication Objective
  • 29. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation The Goal Visualized ===================== Offset ====================> Tab 1 Partition1 T3 Op1 T4 Op1 Tab2 Partition1 T2 Op1 T2 Op2 T1 Op3 Tab2 Partition2 T2 Op4 T1 Op1 Tab3 Partition1 T2 Op3 Tab3 Partition2 T1 Op2 Allow a Kafka Consumer Application To Read This… And Receive This… Kafka Consumer Application <======== T3 Op1, T2 Op1, T2 Op2, T2 Op3, T2 Op4, T4 Op1, T1 Op1, T1 Op2, T1 Op3
  • 30. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Evolution Towards Transactionally Consistent Consumption
  • 31. First Attempt: Send All Records to a Single Topic, Enrich with Tx ID. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation <<<<<< Play First Video >>>>>>> Pros • Operations are absolutely ordered • Most transaction boundaries can be determined • Consumer can checkpoint using offset Cons • Poor performance • One topic with only one partition • No consumer group parallelism • Susceptible to Producer Duplication / Batch out of order considerations • Consumer must read entries it may be uninterested in • Most Recent transaction boundary difficult to determine Evolution Towards Transactionally Consistent Consumption
  • 32. Second Attempt: One Topic Per Source Table, Numerically Sequence All Operations, Explicitly Write a Special Commit Record. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation <<<<<< Play Second Video >>>>>>> Pros • Operations can be absolutely ordered • Transaction boundaries can be determined • Consumer can checkpoint using offset Cons • Consumer can receive messages for transactions partially written to Kafka. Fails exactly once* • Consumers will receive transactions/operations out of order • Staging area required to resequence, or a serialized application must read every message on each topic/partition • Complex/costly producer crash recovery • Consumer must read entries it may be uninterested in Evolution Towards Transactionally Consistent Consumption
  • 33. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation INTRODUCING: The IBM IIDR Transactionally Consistent Consumer (TCC)
  • 34. Consuming Using The IIDR Kafka Transactionally Consistent Consumer (TCC) IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation [Diagram: 1) the consumer application instantiates the TCC API; 2-3) the TCC starts User Topic Consumers 1..N for the user data topics (Tab 1 Partition 1, Tab 2 Partition 1, ... Tab N Partition N) and a Commit Topic Consumer for the commit topic (Metadata Tx3, Metadata Tx2, ...); 4-5) records flow through the Deduplication and Order Engine; 6) the application receives the Source Transaction Stream: T3 Op1, T3 Commit, T2 Op1, T2 Op2, T2 Op3, T2 Op4, T2 Commit]
  • 35. IBM Kafka Transactionally Consistent Consumer (TCC) IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation <<Play Third Video>> • IBM IIDR console-consumer utility - Just as Kafka’s kafka-console-consumer wraps a Kafka consumer and prints formatted output to the screen, IBM IIDR provides a utility that wraps the TCC and prints formatted output to the screen. Code for IBM’s console consumer is included in the IIDR samples.jar so that users have an example application that leverages the TCC API. Relevant Avro Console Fields For DML Operation Records: Operation Sequence – Order of the operation within its transaction Bookmark – Used to restart the TCC producing records at this point in the output stream for the subscription Topic – The user data topic the operation record being returned was replicated to Partition/Offset – The user data partition, and offset into the partition, the operation record was written to Key – The key of the operation record Value – The value of the operation record
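As a rough illustration of the consumption pattern only (the real reference is the AvroConsole source shipped in the IIDR samples.jar; every type and method name below is a hypothetical stand-in, not the actual TCC API):

// Hypothetical sketch of a TCC-driven consumer loop; see the AvroConsole
// source in the IIDR samples.jar for the real TCC API.
public class TccLoopSketch {
    interface TccRecord {
        boolean isCommit();   // commit-boundary marker in the output stream
        byte[] bookmark();    // restart point in the source transaction stream
        byte[] key();
        byte[] value();
    }
    interface Tcc extends AutoCloseable {
        TccRecord next();     // records arrive in source transaction stream order
    }
    interface Checkpoint { void save(byte[] bookmark); }

    static void run(Tcc tcc, Checkpoint checkpoint) throws Exception {
        try (tcc) {
            TccRecord rec;
            while ((rec = tcc.next()) != null) {
                if (rec.isCommit()) {
                    // The whole transaction has now been processed: persist the
                    // bookmark so a restart resumes exactly here, exactly once.
                    checkpoint.save(rec.bookmark());
                } else {
                    process(rec);  // apply the operation to the target system
                }
            }
        }
    }

    static void process(TccRecord rec) { /* application logic */ }
}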
  • 36. Audit Format TCC Example: Db2 database change data replicated to Kafka with “audit” style records using the Audit KCOP. Source Database operations: $ db2 "insert into tab1 values (8,8,8, 'Tab1 stuff')" $ db2 "insert into tab3 values (30,30,'Tab3 stuff')" $ db2 "update tab1 set i1=9, i2=9, i3=9 where i1=8" $ db2 "update tab3 set i1=31, i2=31 where i1=30" IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Formatted Output of the TCC-based AvroConsole utility: $ java -cp "lib/*" com.datamirror.ts.kafka.txconsistentconsumer.sampleapplications.AvroConsole -i kafka1 -s KAFKA_MULTI_TAB -c /CDC/conf/exactConsumer.properties -d Dml record, operationSequence=0, Bookmark hex value: [000100236B61666B61312D4B41464B415F4D554C54495F5441422D636F6D6D697473747265616D000000000000001B00000000] topic: kafka1.kafka_multi_tab.sourcedb.shawnrr.tab1 partition:offset (0:5) Key {"I2": 8, "I3": 8} Value {"I1": 8, "I2": 8, "I3": 8, "V1": "Tab1 stuff", "A_ENTTYP": "PT", "A_TIMSTAMP": "2018-10-10 19:34:57.000000000000} Commit record. operationSequence=0, commitStreamTopicName=kafka1-KAFKA_MULTI_TAB-commitstream, commitStreamTopicOffset=27 Bookmark hex value: [000100236B61666B61312D4B41464B415F4D554C54495F5441422D636F6D6D697473747265616D000000000000001B00000000] Dml record, operationSequence=0, Bookmark hex value: [000100236B61666B61312D4B41464B415F4D554C54495F5441422D636F6D6D697473747265616D000000000000001B00000001] topic: kafka1.kafka_multi_tab.sourcedb.shawnrr.tab3 partition:offset (0:0) Key {"V3": "Tab3 stuff"} Value {"I1": 30, "I2": 30, "V3": "Tab3 stuff", "A_ENTTYP": "PT", "A_TIMSTAMP": "2018-10-10 19:35:40.000000000000} IBM Kafka Transactionally Consistent Consumer (TCC)
  • 37. $ db2 "insert into tab1 values (8,8,8, 'Tab1 stuff')" $ db2 "insert into tab3 values (30,30,'Tab3 stuff')" $ db2 "update tab1 set i1=9, i2=9, i3=9 where i1=8" $ db2 "update tab3 set i1=31, i2=31 where i1=30" $ java -cp "lib/*" com.datamirror.ts.kafka.txconsistentconsumer.sampleapplications.AvroConsole -i kafka1 -s KAFKA_MULTI_TAB -c /CDC/conf/exactConsumer.properties -d Key {"I2": 8, "I3": 8} Value {"I1": 8, "I2": 8, "I3": 8, "V1": "Tab1 stuff", "A_ENTTYP": "PT", "A_TIMSTAMP": "2018-10-10 19:34:57.000000000000} Commit record. Key {"V3": "Tab3 stuff"} Value {"I1": 30, "I2": 30, "V3": "Tab3 stuff", "A_ENTTYP": "PT", "A_TIMSTAMP": "2018-10-10 19:35:40.000000000000} Commit record. Key {"I2": 9, "I3": 9} Value {"I1": 8, "I2": 8, "I3": 8, "V1": "Tab1 stuff", "A_ENTTYP": "UB", "A_TIMSTAMP": "2018-10-10 19:36:11.000000000000} Key {"I2": 9, "I3": 9} Value {"I1": 9, "I2": 9, "I3": 9, "V1": "Tab1 stuff", "A_ENTTYP": "UP", "A_TIMSTAMP": "2018-10-10 19:36:11.000000000000"} IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Commit record. Key {"V3": "Tab3 stuff"} Value {"I1": 30, "I2": 30, "V3": "Tab3 stuff", "A_ENTTYP": "UB", "A_TIMSTAMP": "2018-10-10 19:39:38.000000000000} Key {"V3": "Tab3 stuff"} Value {"I1": 31, "I2": 31, "V3": "Tab3 stuff", "A_ENTTYP": "UP", "A_TIMSTAMP": "2018-10-10 19:39:38.000000000000} Commit record ** Note 1: In the Audit KCOP, updates are a single DML operation on the source represented by two records in Kafka: a before image (the old value) and an after image (the value the row is updated to). ** Note 2: The commit record therefore occurs after the two records of the update, which logically represent a single update operation on the source. IBM Kafka Transactionally Consistent Consumer (TCC): Abbreviated Entries
  • 38. Commit record. The TCC bookmark for the second-last update operation: Dml record, operationSequence=2, Bookmark hex value: [000100236B61666B61312D4B41464B415F4D554C54495F5441422D636F6D6D697473747265616D000000000000001B00000003] topic: kafka1.kafka_multi_tab.sourcedb.shawnrr.tab1 partition:offset (0:7) Key {"I2": 9, "I3": 9} Value {"I1": 9, "I2": 9, "I3": 9, "V1": "Tab1 stuff", "A_ENTTYP": "UP", "A_TIMSTAMP": "2018-10-10 19:36:11.000000000000"} IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation $ jre64/jre/bin/java -cp "lib/*" com.datamirror.ts.kafka.txconsistentconsumer.sampleapplications.AvroConsole -i kafka1 -s KAFKA_MULTI_TAB -c /home/nz/CDC_KAFKA/CDC/conf/exactConsumer.properties -b 000100236B61666B61312D4B41464B415F4D554C54495F5441422D636F6D6D697473747265616D000000000000001B00000003 -d Key {"V3": "Tab3 stuff"} Value {"I1": 30, "I2": 30, "V3": "Tab3 stuff", "A_ENTTYP": "UB", "A_TIMSTAMP": "2018-10-10 19:39:38.000000000000} Key {"V3": "Tab3 stuff"} Value {"I1": 31, "I2": 31, "V3": "Tab3 stuff", "A_ENTTYP": "UP", "A_TIMSTAMP": "2018-10-10 19:39:38.000000000000"} Commit record. Note: This time a bookmark argument was added to the AvroConsole command line. This bookmark corresponded to the second-last update operation. As a result, the output of the TCC is the continuation of the transaction stream immediately after the provided bookmark, i.e., the final update operation's two Kafka records. IBM Kafka Transactionally Consistent Consumer (TCC): Bookmark Example
  • 39. IBM Kafka Transactionally Consistent Consumer (TCC) IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation • Provides records in the original Source Transaction Stream order. (Total Order) • Provides correct order for Kafka records written to different partitions within a topic, and Kafka records written to different topics. (Total Order) • Consumer application is not required to know the topics a particular transaction impacted. • Only provides records for transactions that have been written in their entirety (atomically) to Kafka. (Transactional Atomic) • Only returns one copy of a given transaction. (Exactly Once) • Only returns records for a given table operation once. (Exactly Once) • Provides a bookmark with each operation so that ordered transactional consumption can resume from any location as desired • User can commit the bookmark together with the transformed data to record progress through the data stream, as sketched below.
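A minimal sketch of that last point, assuming a JDBC target and a hypothetical tcc_checkpoint table: persisting the bookmark in the same target-database transaction as the applied changes makes restart exactly-once even across crashes (error handling and rollback omitted for brevity):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.Statement;

// Sketch: commit the TCC bookmark in the same target-database transaction
// as the applied changes, so the restart point and the data move together.
// The tcc_checkpoint table and its columns are hypothetical.
public class BookmarkCheckpoint {
    static void applyTransaction(Connection target, Iterable<String> dmls,
                                 String subscription, byte[] bookmark) throws Exception {
        target.setAutoCommit(false);
        try (Statement st = target.createStatement()) {
            for (String dml : dmls) st.executeUpdate(dml);  // the source transaction's operations
        }
        try (PreparedStatement ps = target.prepareStatement(
                "UPDATE tcc_checkpoint SET bookmark = ? WHERE subscription = ?")) {
            ps.setBytes(1, bookmark);
            ps.setString(2, subscription);
            ps.executeUpdate();
        }
        target.commit();  // changes and restart point become durable together
    }
}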
  • 40. How The TCC Produces Data To Kafka (Producer Parallelism) IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Exactly the same as default high-performance IBM CDC Kafka replication. • Producers write a transaction's records in parallel. • Multiple producers are employed for a single transaction • For a transaction that spans multiple topics • For a transaction with topic(s) that have multiple partitions • Multiple requests in flight from each producer to maximize bandwidth • A dedicated producer for every topic/partition pairing involved in a transaction IBM Kafka Transactionally Consistent Consumer (TCC)
  • 41. How The TCC Produces Data To Kafka (Producer Parallelism) IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Exactly the same as default high-performance IBM CDC Kafka replication. • Transaction parallelism is also employed! • Transactions affecting different topic/partition pairings are also written in parallel IBM Kafka Transactionally Consistent Consumer (TCC)
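The produce-side shape might look roughly like the following Java sketch: one dedicated Kafka producer per topic/partition pairing, each allowed multiple in-flight requests (an illustration of the idea, not IIDR source code):

import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

// Sketch of the idea: one dedicated producer per topic/partition pairing,
// each with multiple requests in flight to maximize bandwidth.
public class PerPartitionProducers {
    private final Map<String, Producer<byte[], byte[]>> producers = new ConcurrentHashMap<>();
    private final String bootstrapServers;

    PerPartitionProducers(String bootstrapServers) { this.bootstrapServers = bootstrapServers; }

    void send(String topic, int partition, byte[] key, byte[] value) {
        producers.computeIfAbsent(topic + "-" + partition, k -> newProducer())
                 .send(new ProducerRecord<>(topic, partition, key, value),
                       (metadata, e) -> {
                           // The callback reports the topic/partition/offset triplet,
                           // which feeds the accounting stage described on slide 42.
                       });
    }

    private Producer<byte[], byte[]> newProducer() {
        Properties p = new Properties();
        p.put("bootstrap.servers", bootstrapServers);
        p.put("key.serializer", ByteArraySerializer.class.getName());
        p.put("value.serializer", ByteArraySerializer.class.getName());
        p.put("max.in.flight.requests.per.connection", "5");  // multiple requests in flight
        return new KafkaProducer<>(p);
    }
}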
  • 42. IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation IIDR Kafka Target Engine – Producer Parallelism [Diagram] 1) Source transaction stream in: T3 Op1 (Tab1), T2 Op1 (Tab2), T2 Op2 (Tab2), T2 Op3 (Tab3), T2 Op4 (Tab2), T4 Op1 (Tab1), T1 Op1 (Tab2), T1 Op2 (Tab3), T1 Op3 (Tab2). 2-3) KCOP parallel transformation stage: three KCOP threads transform operations concurrently (Thread 1: T3 Op1, T2 Op1, T2 Op2; Thread 2: T2 Op3, T2 Op4, T4 Op1; Thread 3: T1 Op1, T1 Op2, T1 Op3). 4) Accounting and assignment stage records each operation's topic/partition/offset triplet as it becomes known: T3 Op1 -> Tab1/P1/0; T2 Op1 -> ?/?/?; T2 Op2 -> ?/?/?; T2 Op3 -> Tab3/P1/0; T2 Op4 -> ?/?/?; T4 Op1 -> Tab1/P1/1; T1 Op1 -> ?/?/?; T1 Op2 -> ?/?/?; T1 Op3 -> ?/?/?. 5) Parallel producer stage, one dedicated producer per topic/partition pairing: Producer 1 (T3 Op1, T4 Op1), Producer 2 (T2 Op1, T2 Op2, T1 Op3), Producer 3 (T2 Op4, T1 Op1), Producer 4 (T2 Op3), Producer 5 (T1 Op2). 6) Kafka cluster: Tab 1 Partition 1 (T3 Op1, T4 Op1), Tab 2 Partition 1 (T2 Op1, T2 Op2, T1 Op3), Tab 2 Partition 2 (T2 Op4, T1 Op1), Tab 3 Partition 1 (T2 Op3), Tab 3 Partition 2 (T1 Op2). 7) Un-ordered callbacks return each record's triplet to the accounting stage.
  • 43. Transaction Parallelism Challenges (The Cost of Performance) IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Challenges • A transaction's operations potentially arrive at the Kafka cluster out of order • Can occur when a transaction involves multiple topics or partitions • However, a topic/partition pairing is written by a single producer in order* • Transactions themselves can be written out of order • Can occur when constituent operations apply to different topic/partition pairings • Potential duplicate records in communication/cluster failure scenarios • Potential for order to be incorrect even on a topic/partition pair if retries occur with multiple requests in flight to the same topic IBM Kafka Transactionally Consistent Consumer (TCC)
  • 44. Parallelism Challenges Met! Generating the Commit Stream IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation IBM Kafka Transactionally Consistent Consumer (TCC) A metadata list of transactions is maintained in Source Transaction Stream order. Each transaction metadata block maintains an entry for each of its constituent operations. Each operation entry includes the topic/partition/offset triplet reported by the callback. Accounting and assignment stage (operation -> triplet): T3 Op1 -> Tab1/P1/0; T2 Op1 -> Tab2/P1/0; T2 Op2 -> ?/?/?; T2 Op3 -> ?/?/?; T2 Op4 -> Tab2/P2/1; T4 Op1 -> Tab1/P1/1; T1 Op1 -> ?/?/?; T1 Op2 -> Tab3/P2/2; T1 Op3 -> ?/?/?. Transaction Accounting List: Transaction 3 (1 operation): Tab1/P1/0. Transaction 2 (4 operations): Tab2/P1/0, ?, ?, Tab2/P2/1. Transaction 4 (1 operation): Tab1/P1/1. Transaction 1 (3 operations): ?, Tab3/P2/2, ?.
  • 45. Parallelism Challenge Met! Generating the Commit Stream IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation IBM Kafka Transactionally Consistent Consumer (TCC) Transaction Tx3 is the only candidate for being officially committed for the TCC. To be a candidate for TCC committing: 1) All prior transactions in the STS have been, or could be, committed. 2) All operation triplets have been received by callback for the current candidate transaction. (Accounting state as on the previous slide: Transaction 3 (1 operation): Tab1/P1/0. Transaction 2 (4 operations): Tab2/P1/0, ?, ?, Tab2/P2/1. Transaction 4 (1 operation): Tab1/P1/1. Transaction 1 (3 operations): ?, Tab3/P2/2, ?.)
  • 46. Parallelism Challenge Met! Writing the Commit Stream IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation IBM Kafka Transactionally Consistent Consumer (TCC) Transactions T3 and T2 are officially committed for TCC functionality. Candidate transactions are officially committed for the TCC when their transaction metadata is written to the commit topic. Transaction metadata is written to the commit topic in Source Transaction Stream order. Transaction Accounting List: Tx3 (1 operation): Tab1/P1/0. Tx2 (4 operations): Tab2/P1/0, Tab2/P1/1, Tab3/P1/0, Tab2/P2/0. Tx4 (1 operation): Tab1/P1/1. Tx1 (3 operations): ?, Tab3/P2/2, ?. Commit Topic contents: T3 (Tab1/P1/0), T2 (Tab2/P1/0, Tab2/P1/1, Tab3/P1/0, Tab2/P2/0). User data topics as before: Tab 1 Partition 1 (T3 Op1, T4 Op1), Tab 2 Partition 1 (T2 Op1, T2 Op2, T1 Op3), Tab 2 Partition 2 (T2 Op4, T1 Op1), Tab 3 Partition 1 (T2 Op3), Tab 3 Partition 2 (T1 Op2).
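Putting slides 44 to 46 together, a schematic Java sketch of the accounting logic (illustrative only; synchronization and error handling omitted):

import java.util.ArrayDeque;
import java.util.Deque;

// Schematic sketch of the commit-stream accounting: transactions are held in
// source-commit order; a transaction is written to the commit topic only when
// every operation's topic/partition/offset triplet has been reported by a
// producer callback and all earlier transactions have already been committed.
public class CommitStreamAccounting {
    static class Txn {
        final String[] triplets;  // one "topic/partition/offset" slot per operation
        int resolved;             // how many callbacks have arrived so far
        Txn(int opCount) { triplets = new String[opCount]; }
        boolean complete() { return resolved == triplets.length; }
    }

    private final Deque<Txn> stsOrder = new ArrayDeque<>();

    void beginTransaction(Txn txn) { stsOrder.addLast(txn); }  // called in commit order

    void onCallback(Txn txn, int opIndex, String topic, int partition, long offset) {
        txn.triplets[opIndex] = topic + "/" + partition + "/" + offset;
        txn.resolved++;
        // Commit candidates strictly in STS order; stop at the first incomplete txn.
        while (!stsOrder.isEmpty() && stsOrder.peekFirst().complete()) {
            writeToCommitTopic(stsOrder.pollFirst());
        }
    }

    void writeToCommitTopic(Txn txn) {
        // Produce the transaction's metadata (its ordered triplets) to the commit
        // topic; this is what officially commits the transaction for the TCC.
    }
}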
  • 47. Utilizing the TCC – Kafka Consumer Applications IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation IBM Kafka Transactionally Consistent Consumer (TCC) The TCC API is called by the consumer application. The TCC starts a configurable number of Kafka consumers. The TCC returns a desired subscription's operation records, in Source Transaction Stream order. The TCC enriches each returned operation with a “bookmark”, and enriches the returned record stream with a commit record to identify the end of a transaction. [Diagram: 1) the consumer application instantiates the TCC API; 2-4) User Topic Consumers 1-4 read the user data topics (Tab 1 Partition 1: T3 Op1, T4 Op1; Tab 2 Partition 1: T2 Op1, T2 Op2, T1 Op3; Tab 2 Partition 2: T2 Op4, T1 Op1; Tab 3 Partition 1: T2 Op3; Tab 3 Partition 2: T1 Op2) while the Commit Topic Consumer reads the commit topic (T3 (Tab1/P1/0), T2 (Tab2/P1/0, Tab2/P1/1, Tab3/P1/0, Tab2/P2/0)); 5-6) the ordering/deduplication logic emits the Source Transaction Stream: T3 Op1, T3 Commit, T2 Op1, T2 Op2, T2 Op3, T2 Op4, T2 Commit]
  • 48. TCC API Consumption Features! IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Consumer Parallelism - A single TCC utilizes multiple consumers Horizontal Scalability - Multiple TCCs can be used at the same time on the same topics Exactly Once - The bookmark returned with every operation allows restart at exactly that point in the STS Exactly Once - An operation returned from the TCC guarantees the entire transaction, and all previous transactions, have been replicated to Kafka No Duplicates - Duplicate records in the user data topic are not included in TCC output Topic Filtering - The TCC can be instructed to return a subset of the subscription topics replicated User Topic Data Not Altered - No additional fields are added to the user data topics; enrichment occurs in the TCC No Impact to Regular Consumer Applications - Consumer applications can simply read from user data topics as normal Commit Boundaries Identified - TCC processing enriches the output stream with commit boundary indicators Logical Isolation - The TCC will ignore records written to user data topics by other applications, even if they are separate TCC instances Order - Operation and transaction order is provided in the TCC output stream IBM Kafka Transactionally Consistent Consumer (TCC)
  • 49. Integrating the TCC in End-To-End Application Flow IBM IIDR Kafka Replication / October 15, 2018 / © 2018 IBM Corporation Performance - Leverages parallelism on many levels (producing and consuming) Performance - Kafka work flows can choose a subset of topics to process (only necessary data) Performance - User can shred data across multiple partitions for performance and still read in Source Transaction Stream order Insight - Knowledge available only from real-time consistent (atomic) datasets can be leveraged Insight - Knowledge available only from relative transaction and operation ordering can be leveraged Insight - The ordered Source Transaction Stream allows for numeric sequencing at the partition, topic, or subscription level Robustness - Applications can use the bookmark to ensure real-time exactly-once processing of data with no duplicates, even in crash restart scenarios Robustness - Data not related to the TCC's subscription written to user data topics is ignored Extreme Flexibility - Combined with CDC's KCOP feature, produced records can be in any format. Can write multiple versions of a source operation to different topics (different formats/content) and still provide TCC semantics IBM Kafka Transactionally Consistent Consumer (TCC)
• 50. TCC Dependencies
IBM Kafka Transactionally Consistent Consumer (TCC)
- Commit stream records must exist for the TCC to produce data.
- User data records must exist at the topic and offset to which they were originally written.
- The consumer application is responsible for using the bookmark for checkpointing, if desired (one possible pattern is sketched below).
- The consuming application itself drives the Transactionally Consistent Consumer.
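One common way to meet the checkpointing responsibility above is to persist the bookmark in the same database transaction as the processed results, so a crash can never separate the two. A minimal JDBC sketch, assuming hypothetical RESULTS and single-row CHECKPOINT tables:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class AtomicCheckpointSketch {
        // Persist one transaction's results and its end-of-transaction bookmark
        // atomically: on restart, the bookmark read back is always consistent
        // with what was actually applied.
        static void commitUnit(Connection db, Iterable<byte[]> rows, byte[] bookmark)
                throws SQLException {
            db.setAutoCommit(false);
            try (PreparedStatement ins =
                     db.prepareStatement("INSERT INTO RESULTS(payload) VALUES (?)");
                 PreparedStatement ckpt =
                     db.prepareStatement("UPDATE CHECKPOINT SET bookmark = ?")) {
                for (byte[] row : rows) {
                    ins.setBytes(1, row);
                    ins.executeUpdate();
                }
                ckpt.setBytes(1, bookmark);
                ckpt.executeUpdate();
                db.commit();         // results and bookmark land together
            } catch (SQLException e) {
                db.rollback();       // neither results nor bookmark land
                throw e;
            }
        }
    }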
• 51. Transactionally Consistent Consumer Cloud Advantages
• 52. Leveraging the TCC for the Cloud
Transactionally Consistent Consumer Cloud Advantages
Kafka in the Cloud + TCC-consuming replication:
- Generally, the advantages outlined for TCC replication to Kafka apply.
- Utilizes parallel producers to help mitigate longer round-trip times to the cluster.
- Utilizes more partitions per topic to scale horizontally (the TCC allows the original ordering to be recovered).
- Utilizes deduplication and exactly-once delivery to mitigate connectivity, timeout, and related issues.
- Utilizes bookmarking and total operation order to enable sequencing, parallel processing, and destination validation.
Kafka on Prem + TCC consuming application in the cloud:
- Generally, the advantages outlined for TCC replication to Kafka apply.
- Utilizes parallel consumers reading from multiple topics to help mitigate longer round-trip times.
- Utilizes the TCC bookmark to enable consumers to checkpoint their consumption and processing.
- Utilizes total order to enable the generation of sequencing.
- Utilizes the encoded operation topic/location to avoid unnecessary topic reads.
• 53. Comparison with the Kafka Transactional and Idempotence Feature (KIP-98)
• 54. How the IBM TCC Differs from the Kafka Transactional / Exactly-Once Feature (KIP-98)
Comparison with Kafka Transactional and Idempotence Feature (KIP-98)
Different Approaches
- KIP-98 - Employs mechanisms to ensure records are written to topics exactly once, within explicit transactions (see the producer sketch below).
- IBM TCC - Employs mechanisms to ensure records are returned to the consuming application ordered and exactly once.
Different Motivation
- KIP-98 - "the main motivation for transactions is to enable exactly once processing in Kafka Streams." https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging#KIP-98-ExactlyOnceDeliveryandTransactionalMessaging-TheOutOfOrderSequenceException
- IBM TCC - Empowers Kafka consumer applications to leverage change capture data, maximizing performance, robustness, and flexibility; allows consumption and understanding of data as if it were read from the original source database.
Different Scope
- KIP-98 - The user must ensure data is produced correctly. The producer application is responsible for crash recovery and for identifying transactions explicitly, and transactional and idempotence guarantees are limited to a single producer session. Consumers also have limitations and need specific consideration when used. E.g. "Further, since each new instance of a producer is assigned a new, unique, PID, we can only guarantee idempotent production within a single producer session." https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging#KIP-98-ExactlyOnceDeliveryandTransactionalMessaging-TheOutOfOrderSequenceException
- IBM TCC - Since the IBM CDC producer is part of the solution, the TCC's guarantees and semantics span sessions; the scope is end-to-end delivery from the source database to the consumer application.
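For contrast, this is what the KIP-98 producer-side mechanics look like in the standard Apache Kafka Java client: the application itself marks transaction boundaries, and the transactional.id scopes the guarantees to a producer session. Broker address and topic names are illustrative:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.KafkaException;
    import org.apache.kafka.common.errors.ProducerFencedException;

    public class Kip98ProducerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("transactional.id", "my-app-producer-1"); // also enables idempotence
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.initTransactions();       // fences older sessions with this transactional.id
                try {
                    producer.beginTransaction();   // application marks the boundary explicitly
                    producer.send(new ProducerRecord<>("tab1", "key", "op1"));
                    producer.send(new ProducerRecord<>("tab2", "key", "op2"));
                    producer.commitTransaction();
                } catch (ProducerFencedException fenced) {
                    throw fenced;                  // a newer session took over; cannot continue
                } catch (KafkaException e) {
                    producer.abortTransaction();   // retriable: abort and redo the unit of work
                }
            }
        }
    }

Note how crash recovery across sessions is the application's problem here: a new producer instance gets a new session, which is the single-session limit the slide quotes.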
• 55. Contrast of the IBM TCC with the Kafka Exactly-Once Feature
Comparison with Kafka Transactional and Idempotence Feature (KIP-98)
- The TCC is focused on replicating database change-data transaction streams and providing transactional semantics.
- The TCC inherently orders operations within a transaction.
- The TCC empowers multiple producers to write a single transaction's operations in parallel.
- The TCC allows transactions to be written in parallel (potentially out of order) yet can still provide the original transaction order.
- The TCC provides explicit transaction boundaries for processing data in atomic units of work.
- The TCC provides mechanisms for downstream consuming logic to serialize at the STS, topic, or partition level.
- The TCC focuses on delivering data exactly once to the consuming application; Kafka focuses on writing it once into the cluster.
- The TCC logically isolates itself from the user data topics: other applications' uncommitted transactions on a topic do not block it (contrast the read_committed consumer sketched below).
- The TCC does not need to use Kafka's idempotence or transactional functionality, so it carries none of the overhead associated with those features.
- The TCC returns all data for a given transaction sequentially, with no interleaved operations from different transactions.
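The consumer-side counterpart in plain Kafka illustrates the isolation point above: a KIP-98-aware consumer must opt into read_committed, and its reads then stall behind any producer's open transaction on a partition (the last stable offset), which is exactly the blocking the TCC's logical isolation avoids. Standard client configuration; broker address and topics are illustrative:

    import java.time.Duration;
    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class ReadCommittedConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "plain-read-committed");
            props.put("isolation.level", "read_committed"); // hide aborted and open txn data
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Arrays.asList("tab1", "tab2"));
                while (true) {
                    // poll() will not return records beyond the last stable offset,
                    // so an open transaction from any producer stalls this reader.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> r : records) {
                        System.out.printf("%s-%d@%d%n", r.topic(), r.partition(), r.offset());
                    }
                }
            }
        }
    }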
• 56. Links and Resources
• 57. Links and Resources
IBM InfoSphere Data Replication (IIDR)
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.idr.frontend.doc/pv_welcome.html
IIDR Transactionally Consistent Consumer (TCC)
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/concepts/kafkatcc.html
IIDR KCOP (Kafka Custom Operation Processor) Feature
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/concepts/kafkakcop.html
IBM Event Streams / Message Hub (Kafka-based solutions)
https://www.ibm.com/cloud/message-hub (Cloud)
https://www.ibm.com/cloud/event-streams (On Premise)
Kafka Exactly Once Delivery and Transactional Messaging KIP (KIP-98)
https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging#KIP-98-ExactlyOnceDeliveryandTransactionalMessaging-TheOutOfOrderSequenceException
• 58. Thank You!
Shawn Robertson, P. Eng
IBM IIDR Kafka Architect
—
shawnrr@ca.ibm.com
https://ca.linkedin.com/in/shawn-robertson-4738937b