Comparing AWS MSK Serverless with Confluent Kafka Standard: Selecting the Right Platform to Stream Your Data in Real Time

Amazon MSK or Confluent Kafka

In the modern data platform world, companies rely heavily on data-driven decisions. However, the question remains: are we using the data at the right time? The answer may not always be "yes." Companies in every industry are quickly shifting from batch processing to real-time data streams to keep up with modern business requirements. Use cases such as fraud detection, contextual marketing triggers, and dynamic pricing depend on a real-time data feed. With current and comprehensive data, organizations gain the visibility to make informed operational decisions faster.

When deciding on real-time data ingestion solutions, customers look for the following factors:

  • Scalability
  • Ordering
  • Consistency and durability
  • Fault tolerance and data guarantees
  • Cost effectiveness

AWS MSK Serverless and Confluent Kafka Standard are two of the leading managed services for real-time data ingestion. This blog shares insights that will help you make an informed decision on the right fit for your requirements.

About AWS MSK Serverless

Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless is a fully managed service that enables you to build and run applications that use Apache Kafka to process streaming data. It automatically provisions and scales capacity while managing the partitions in your topic, so you can stream data without thinking about right-sizing or scaling clusters. It offers a throughput-based pricing model, so you pay only for what you use. 

About Confluent Kafka Standard

Confluent Kafka Standard provides a cloud-native experience, completing Kafka with a holistic set of enterprise-grade features to enhance developer productivity, operate efficiently at scale, and meet your architectural requirements before moving to production. Underpinning the platform are its 99.99% uptime SLA and committer-driven expertise, with support and services from a team that has over one million hours of technical experience with Kafka.

Let us now compare AWS MSK Serverless and Confluent Kafka Standard across different focus areas:

1. Architecture

| Criteria Group | AWS MSK Serverless | Confluent Kafka Standard |
|---|---|---|
| Ease of Setup | As a serverless offering, runs Apache Kafka without requiring you to manage and scale cluster capacity | No setup is involved, as Confluent Kafka is serverless |
| Access Controls | Amazon MSK uses IAM to check whether the client is an authenticated identity and is authorized to interact with your cluster | Confluent Cloud role-based access control (RBAC) lets you control access to an organization, environment, cluster, or granular Kafka resources based on predefined roles and access permissions |
| Ecosystem Integration | Integrates with AWS services that process streaming data, such as VPC, Lambda, Glue Schema Registry, and Kinesis Data Analytics | Offers pre-built Kafka connectors, Confluent Hub, Schema Registry, and MQTT Proxy |
| Scalability | In serverless clusters, Amazon MSK automatically balances partitions | Self-balancing clusters automatically allocate resources to manage consumer lag as throughput scales up or down |
| Availability | Amazon MSK uses multi-AZ replication for high availability. Data replication is included at no additional cost | Standard clusters are designed for production use, with an uptime SLA of 99.95% for Single-Zone and 99.99% for Multi-Zone |
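
To put the two uptime figures in perspective, the sketch below converts an SLA percentage into the downtime it would allow over a 31-day month. It is plain arithmetic under the assumption that the SLA is measured over the whole month; the provider's SLA terms define the actual measurement window and exclusions, and the function name is hypothetical.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 31) -> float:
    """Maximum downtime, in minutes, that still meets the SLA over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# SLA percentages taken from the Availability row for Confluent Kafka Standard.
for label, sla in [("Single-Zone", 99.95), ("Multi-Zone", 99.99)]:
    minutes = allowed_downtime_minutes(sla)
    print(f"{label} ({sla}%): {minutes:.2f} minutes per 31-day month")
```

Under this assumption, the Single-Zone figure allows roughly 22 minutes of downtime in a 31-day month and the Multi-Zone figure roughly 4.5 minutes.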

2. Platform Support

| Criteria Group | AWS MSK Serverless | Confluent Kafka Standard |
|---|---|---|
| Offerings | AWS provides two managed Kafka offerings: Amazon MSK and MSK Serverless | Confluent provides two offerings: self-managed and fully managed |
| Connectors | Create custom plugins using MSK Connect to move data between source and destination systems | 120+ pre-built connectors for real-time integration between source and destination systems |
| Kafka Upgrades | MSK Serverless automatically upgrades clusters without requiring any input from customers | Confluent's upgrade policy distinguishes three types of upgrades: minor, major, and deprecation |
| Quota | Max ingress throughput: 200 MBps; max egress throughput: 400 MBps; ingress per partition: 5 MBps; egress per partition: 10 MBps; max partition size: 250 GB | Max ingress throughput: 250 MBps; max egress throughput: 750 MBps; ingress per partition: 5 MBps; egress per partition: 15 MBps; max partition size: 250 GB |
| Monitoring | Amazon MSK integrates with Amazon CloudWatch so that you can collect, view, and analyze metrics for your Amazon MSK cluster | Confluent Control Center is a web-based tool for managing and monitoring Apache Kafka®. Its user interface gives you a quick overview of cluster health; lets you observe and control messages, topics, and Schema Registry; and lets you develop and run ksqlDB queries |
| Workloads | AWS-native workloads that need deeper integration with other AWS services are a good fit | Hybrid workloads, i.e., on-premises combined with cloud or multi-cloud, are well suited to the Confluent Kafka platform |
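
The quota figures above translate directly into a sizing check. The following is a minimal sketch, assuming the limits listed in the Quota row and an evenly distributed load across partitions; the function name and the sample throughput numbers are hypothetical.

```python
import math

# Throughput quotas in MBps, copied from the Quota row of the table above.
QUOTAS = {
    "AWS MSK Serverless": {
        "ingress": 200, "egress": 400,
        "ingress_per_partition": 5, "egress_per_partition": 10,
    },
    "Confluent Kafka Standard": {
        "ingress": 250, "egress": 750,
        "ingress_per_partition": 5, "egress_per_partition": 15,
    },
}


def min_partitions(platform: str, ingress_mbps: float, egress_mbps: float):
    """Return the fewest partitions that can carry the load, or None if the
    load exceeds the cluster-level quota. Assumes evenly spread traffic."""
    q = QUOTAS[platform]
    if ingress_mbps > q["ingress"] or egress_mbps > q["egress"]:
        return None
    by_ingress = math.ceil(ingress_mbps / q["ingress_per_partition"])
    by_egress = math.ceil(egress_mbps / q["egress_per_partition"])
    return max(1, by_ingress, by_egress)


# Hypothetical peak load: 120 MBps written, 300 MBps read.
for name in QUOTAS:
    print(name, "->", min_partitions(name, 120, 300))
```

For this sample load, egress is the binding limit on MSK Serverless, while ingress is the binding limit on Confluent Kafka Standard. Real workloads are rarely perfectly balanced across partitions, so treat the result as a lower bound.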

3. Data Governance and Security

| Criteria Group | AWS MSK Serverless | Confluent Kafka Standard |
|---|---|---|
| Compliance Certificates | HIPAA eligible, PCI, ISO, SOC 1/2/3, FedRAMP | SOC 1/2/3, ISO 27001, PCI, GDPR |
| Network Security | MSK provides strong network security using AWS VPC Peering, AWS Transit Gateway, and AWS PrivateLink to secure traffic across accounts and Regions | All Confluent Standard clusters are accessible through secure internet endpoints. All connections to Confluent Cloud are encrypted with TLS and require authentication using API keys, regardless of network configuration |
| Data Encryption | AWS KMS for encryption at rest; TLS for encryption in transit | Bring your own key (BYOK) for at-rest encryption. Encryption of data in motion is available |
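
To make the authentication difference concrete, here is a minimal sketch that renders Java-client style `client.properties` for each platform. The property values follow the commonly documented settings for MSK IAM authentication (which assumes the `aws-msk-iam-auth` library is on the client classpath) and for Confluent Cloud API keys. The bootstrap hosts, API key, and secret are placeholders, and the helper function names are hypothetical.

```python
def msk_iam_properties(bootstrap: str) -> dict:
    """Client settings for IAM authentication against Amazon MSK."""
    return {
        "bootstrap.servers": bootstrap,
        "security.protocol": "SASL_SSL",  # TLS-encrypted connection
        "sasl.mechanism": "AWS_MSK_IAM",
        "sasl.jaas.config":
            "software.amazon.msk.auth.iam.IAMLoginModule required;",
        "sasl.client.callback.handler.class":
            "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
    }


def confluent_api_key_properties(bootstrap: str, api_key: str, api_secret: str) -> dict:
    """Client settings for API-key authentication against Confluent Cloud."""
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f"username='{api_key}' password='{api_secret}';"
    )
    return {
        "bootstrap.servers": bootstrap,
        "security.protocol": "SASL_SSL",  # TLS-encrypted connection
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": jaas,
    }


def render_properties(props: dict) -> str:
    """Serialize a settings dict into key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Placeholder endpoints and credentials; substitute your own values.
print(render_properties(msk_iam_properties("<msk-bootstrap-host>:9098")))
print()
print(render_properties(confluent_api_key_properties(
    "<confluent-bootstrap-host>:9092", "<API_KEY>", "<API_SECRET>")))
```

With MSK, the identity comes from the IAM credentials available to the client, so no secret appears in the file. With Confluent Cloud, the API key and secret are embedded in the JAAS line and should be kept out of source control.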

4. Pricing

| Criteria Group | AWS MSK Serverless | Confluent Kafka Standard |
|---|---|---|
| Connector pricing | MSK Connect pricing breakdown | Confluent Cloud connector pricing breakdown, per source |


Let’s assume the cluster has 5 topics with 20 partitions each. Your producers write 100 GB of data per day on average, and your consumers read 200 GB per day. You also retain that data for 24 hours so that it is available for replay. In this scenario, you would pay the following for a 31-day month:

Here’s the pricing breakdown for MSK Serverless (Ohio Region):

| Item | Usage | Rate (USD) | Cost (USD) |
|---|---|---|---|
| Cluster-hours | 31 days * 24 hrs/day = 744 cluster-hours | 0.75/cluster-hr | 744 * 0.75 = 558.00 |
| Partition-hours | 31 days * 24 hrs/day * 5 * 20 = 74,400 partition-hours | 0.0015/partition-hr | 74,400 * 0.0015 = 111.60 |
| Data-in | 100 GB * 31 days = 3,100 GB | 0.10/GB-in | 3,100 * 0.10 = 310.00 |
| Data-out | 200 GB * 31 days = 6,200 GB | 0.05/GB-out | 6,200 * 0.05 = 310.00 |
| Storage | Average storage used = 100 GB-months | 0.10/GB-month | 100 * 0.10 = 10.00 |
| Total | | | 1,299.60 |
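
The breakdown above can be reproduced with a few lines of arithmetic. This is a minimal sketch using the rates quoted in this post for the Ohio Region; the function and dictionary names are hypothetical, and current rates should be checked against the AWS pricing page before relying on the result.

```python
# Rates in USD, as quoted in the breakdown above.
MSK_SERVERLESS_RATES = {
    "cluster_hour": 0.75,
    "partition_hour": 0.0015,
    "gb_in": 0.10,
    "gb_out": 0.05,
    "gb_month": 0.10,
}


def msk_serverless_monthly_cost(topics=5, partitions_per_topic=20,
                                gb_in_per_day=100, gb_out_per_day=200,
                                avg_storage_gb=100, days=31):
    """Return the cost of each line item for one month of MSK Serverless."""
    r = MSK_SERVERLESS_RATES
    hours = days * 24
    partitions = topics * partitions_per_topic
    return {
        "cluster-hours": hours * r["cluster_hour"],
        "partition-hours": hours * partitions * r["partition_hour"],
        "data-in": gb_in_per_day * days * r["gb_in"],
        "data-out": gb_out_per_day * days * r["gb_out"],
        "storage": avg_storage_gb * r["gb_month"],
    }


items = msk_serverless_monthly_cost()
for name, cost in items.items():
    print(f"{name:<16}{cost:>10.2f}")
print(f"{'total':<16}{sum(items.values()):>10.2f}")
```

Summing the five line items for the scenario gives 1,299.60 USD. Changing the keyword arguments shows how the bill moves with partition count or daily volume.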

Here’s the pricing breakdown for Confluent Kafka Standard:

| Item | Usage | Rate (USD) | Cost (USD) |
|---|---|---|---|
| Cluster-hours | 31 days * 24 hrs/day = 744 cluster-hours | 1.50/cluster-hr | 744 * 1.50 = 1,116.00 |
| Partition-hours | First 500 partitions at no additional cost | 0.0015/partition-hr | 0.00 |
| Data-in | 100 GB * 31 days = 3,100 GB | 0.13/GB-in | 3,100 * 0.13 = 403.00 |
| Data-out | 200 GB * 31 days = 6,200 GB | 0.06/GB-out | 6,200 * 0.06 = 372.00 |
| Storage | Average storage used = 100 GB-months | 0.10/GB-month | 100 * 0.10 = 10.00 |
| Total | | | 1,901.00 |
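
The main structural difference in the Confluent bill is the free partition allowance. The sketch below models it under the rates quoted above, assuming that only partitions beyond the first 500 are billed per hour; the function name and the 600-partition variation are hypothetical, and current rates should be confirmed with Confluent.

```python
def confluent_standard_monthly_cost(partitions=100, gb_in_per_day=100,
                                    gb_out_per_day=200, avg_storage_gb=100,
                                    days=31, free_partitions=500):
    """Return the total monthly cost in USD for Confluent Kafka Standard."""
    hours = days * 24
    # Only partitions above the free allowance are billed.
    billable_partitions = max(0, partitions - free_partitions)
    cost = (
        hours * 1.50                              # cluster-hours
        + hours * billable_partitions * 0.0015    # partition-hours
        + gb_in_per_day * days * 0.13             # data-in
        + gb_out_per_day * days * 0.06            # data-out
        + avg_storage_gb * 0.10                   # storage
    )
    return round(cost, 2)


# The scenario from this post: 5 topics * 20 partitions = 100 partitions.
print(confluent_standard_monthly_cost(partitions=100))
# A hypothetical larger cluster that exceeds the free allowance.
print(confluent_standard_monthly_cost(partitions=600))
```

For the 100-partition scenario, the partition term is zero and the total is 1,901.00 USD. Partition charges only start to matter once the cluster grows past the free allowance.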

Which one should you select?

You should choose AWS MSK when:

  • You need seamless, deep integration with other native AWS services, such as AWS Lambda for MSK event sourcing or AWS Secrets Manager for the client credentials used in SASL/SCRAM authentication.
  • You want to lower costs, since the cluster-hour rate is lower.
  • You need to provision a Kafka cluster quickly.
  • You need secure connectivity between your MSK cluster and the clients accessing it, using AWS PrivateLink, VPC Peering, or Transit Gateway.

You should choose Confluent Kafka when:

  • You have a hybrid or multi-cloud strategy. Confluent is available on AWS, Azure, and GCP, making it native to the public cloud providers.
  • You need to centralize Kafka management operations and get a quick overview using the Confluent Control Center.
  • You have an on-premises deployment alongside cloud providers (e.g., AWS Outposts and Wavelength, Google’s Anthos), since it is primarily built on top of Kubernetes.
  • You need a rich Kafka ecosystem, including pre-built connectors, governance using stream lineage, and the ability to connect non-Java Kafka clients.

AWS MSK Serverless and Confluent Kafka Standard are both efficient, reliable, and among the market’s best real-time data ingestion solutions. However, they cater to different uses and processes, so it is important to evaluate your use case and select the most suitable tool.

Quantiphi as an Amazon MSK Service Delivery Partner

Quantiphi is a designated service delivery partner that makes it easy for customers to migrate to and build data streaming solutions on Amazon MSK. Customers can take advantage of the rich Amazon MSK integrations with other AWS services, address real-time analytics use cases, and realize the cost benefits sooner. To get started or learn more, get in touch with our experts.

Written by Karthik Shetty & Sanchit Jain
