Questions for the CCAAK were updated on: Nov 21, 2025
A developer is working for a company whose internal best practices dictate that there must be no
single point of failure for any stored data.
What is the best approach to make sure the developer is complying with this best practice when
creating Kafka topics?
D
Explanation:
Replication factor determines how many copies of each partition exist across different brokers. A
replication factor of 3 ensures that even if one or two brokers fail, the data is still available, thus
eliminating a single point of failure.
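As an illustration only (not part of the original question), here is a minimal Java sketch that creates a topic with replication factor 3 using the Kafka AdminClient; the broker address, topic name, and partition count are placeholder assumptions.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        try (AdminClient admin = AdminClient.create(props)) {
            // Replication factor 3: each of the 10 partitions gets three copies on different brokers.
            NewTopic topic = new NewTopic("orders", 10, (short) 3);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}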
Where are Apache Kafka Access Control Lists stored?
B
Explanation:
In Apache Kafka (open-source), Access Control Lists (ACLs) are stored in ZooKeeper. Kafka brokers
retrieve and enforce ACLs from ZooKeeper at runtime.
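For illustration, a minimal sketch of adding an ACL through the AdminClient, assuming the brokers use the ZooKeeper-backed authorizer (so the resulting ACL ends up persisted in ZooKeeper); the broker address, principal, and topic name are placeholders.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

public class AddReadAcl {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        try (AdminClient admin = AdminClient.create(props)) {
            // Allow the principal User:alice to read from topic "payments".
            AclBinding binding = new AclBinding(
                new ResourcePattern(ResourceType.TOPIC, "payments", PatternType.LITERAL),
                new AccessControlEntry("User:alice", "*", AclOperation.READ, AclPermissionType.ALLOW));
            admin.createAcls(Collections.singletonList(binding)).all().get();
        }
    }
}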
What are important factors in sizing a ksqlDB cluster? (Choose three.)
A, B, C
Explanation:
The complexity of the schema (number of fields, data types, etc.) affects processing and memory
usage.
Each ksqlDB persistent query consumes resources (CPU, memory), so more queries require more
capacity.
More partitions increase parallelism, but also resource usage, especially in scaling and state
management.
Why does Kafka use ZooKeeper? (Choose two.)
A, D
Explanation:
ZooKeeper stores metadata such as partition leadership and ISR (in-sync replicas), which brokers use
to coordinate.
Kafka uses ZooKeeper to perform leader election for the Controller broker, which manages cluster
metadata and leadership changes.
Kafka broker supports which Simple Authentication and Security Layer (SASL) mechanisms for
authentication? (Choose three.)
A, C, D
Explanation:
SASL/PLAIN – A simple username/password mechanism supported by Kafka.
SASL/GSSAPI (Kerberos) – Kafka supports Kerberos authentication through the GSSAPI mechanism.
SASL/OAUTHBEARER – Kafka supports OAUTHBEARER for token-based authentication.
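As a hedged illustration, the following sketch configures a Java client for one of these mechanisms (SASL/PLAIN over TLS); the listener port, username, and password are placeholder assumptions.

import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;

public class SaslPlainClientConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "broker1:9093"); // placeholder SASL_SSL listener
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN"); // could also be GSSAPI or OAUTHBEARER
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
            + "username=\"alice\" password=\"alice-secret\";");
        return props;
    }
}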
A Kafka cluster with three brokers has a topic with 10 partitions and a replication factor set to three.
Each partition stores 25 GB data per day and data retention is set to 24 hours.
How much storage will be consumed by the topic on each broker?
C
Explanation:
10 partitions × 25 GB/day = 250 GB of primary (leader) data per day for the topic.
With a replication factor of 3, the cluster stores three full copies: 250 GB × 3 = 750 GB across the
entire cluster.
The cluster has 3 brokers, and Kafka distributes the replicas evenly among them, so each broker
stores 750 GB ÷ 3 = 250 GB.
Because retention is set to 24 hours, only one day's worth of data is kept at any time, so each broker
holds approximately 250 GB for this topic.
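The same arithmetic as a short, self-contained sketch (values taken from the question):

public class TopicStorageEstimate {
    public static void main(String[] args) {
        int partitions = 10;
        double gbPerPartitionPerDay = 25.0; // retention is 24 hours, so one day of data is kept
        int replicationFactor = 3;
        int brokers = 3;

        double primaryGb = partitions * gbPerPartitionPerDay;   // 250 GB of leader data
        double clusterGb = primaryGb * replicationFactor;       // 750 GB across the cluster
        double perBrokerGb = clusterGb / brokers;                // 250 GB per broker

        System.out.printf("primary=%.0f GB, cluster=%.0f GB, per-broker=%.0f GB%n",
                primaryGb, clusterGb, perBrokerGb);
    }
}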
Which connector type takes data from a topic and sends it to an external data system?
A
Explanation:
A Sink Connector reads data from a Kafka topic and writes it to an external data system, such as a
database, file system, or cloud service.
You have a cluster with a topic t1 that already has uncompressed messages. A new Producer starts
sending messages to t1 with compression enabled.
Which condition would allow this?
A
Explanation:
Kafka allows mixed compression formats within the same topic and even the same partition. Each
message batch includes metadata indicating whether and how it is compressed. Therefore, a new
producer can send compressed messages to a topic that already contains uncompressed messages,
as long as it is configured with a compression codec (e.g., compression.type=gzip, snappy, etc.).
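For illustration, a minimal producer sketch with compression enabled; the broker address and record contents are placeholders, and the existing uncompressed records in t1 remain untouched.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CompressedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // New batches are compressed with gzip; earlier uncompressed batches in t1 are still readable.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("t1", "key", "value"));
        }
    }
}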
If the Controller detects the failure of a broker that was the leader for some partitions, which actions
will be taken? (Choose two.)
B, C
Explanation:
The Controller elects a new leader for each affected partition from the remaining in-sync replicas,
then updates ZooKeeper with the new leader and in-sync replica (ISR) information to keep cluster
metadata consistent. Brokers need this updated information to route client requests correctly and to
continue replication.
Which out-of-the-box Kafka Authorizer implementation uses ZooKeeper?
B
Explanation:
Kafka's built-in ACL (Access Control List) authorizer stores and manages permissions using ZooKeeper
by default. This implementation controls access at the resource level (topics, consumer groups, etc.).
What does Kafka replication factor provide? (Choose two.)
B, C
Explanation:
Replication ensures that multiple copies of data exist across different brokers, so data is not lost if
one broker fails.
With multiple replicas, Kafka can continue to serve data even if the leader or one replica fails,
maintaining service availability.
What is the primary purpose of Kafka quotas?
A
Explanation:
Kafka quotas are used to limit the throughput (bytes/sec) of producers and consumers to ensure fair
resource usage and prevent any single client from overwhelming the brokers.
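As an illustrative sketch (not part of the question), client quotas can also be set programmatically through the AdminClient; the client-id and byte-rate values here are placeholder assumptions.

import java.util.Arrays;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.quota.ClientQuotaAlteration;
import org.apache.kafka.common.quota.ClientQuotaEntity;

public class SetClientQuota {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        try (AdminClient admin = AdminClient.create(props)) {
            ClientQuotaEntity entity = new ClientQuotaEntity(
                Collections.singletonMap(ClientQuotaEntity.CLIENT_ID, "reporting-app"));
            ClientQuotaAlteration alteration = new ClientQuotaAlteration(entity, Arrays.asList(
                new ClientQuotaAlteration.Op("producer_byte_rate", 1_048_576.0),   // 1 MB/s produce limit
                new ClientQuotaAlteration.Op("consumer_byte_rate", 2_097_152.0))); // 2 MB/s fetch limit
            admin.alterClientQuotas(Collections.singletonList(alteration)).all().get();
        }
    }
}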
Which tool is used for scalably and reliably streaming data between Kafka and other data systems?
A
Explanation:
Kafka Connect is the tool designed for scalable and reliable integration between Kafka and external
data systems (e.g., databases, cloud storage, key-value stores). It supports source connectors (to pull
data into Kafka) and sink connectors (to push data from Kafka).
Which property in broker configuration specifies that a broker belongs to a particular rack?
B
Explanation:
The broker.rack property is used in a Kafka broker’s configuration to specify the rack or availability
zone the broker belongs to. This is important for rack-aware replica placement, allowing Kafka to
distribute replicas across different racks for fault tolerance.
Kafka Connect is running on a two-node cluster in distributed mode. The connector is a source
connector that pulls data from Postgres tables (users/payment/orders) and writes to topics with two
partitions and a replication factor of two. The development team notices that the data is lagging
behind.
What should be done to reduce the data lag?
The Connector definition is listed below:
{
  "name": "confluent-postgresql-source",
  "connector.class": "PostgresSource",
  "topic.prefix": "postgresql_",
  …
  "db.name": "postgres",
  "table.whitelist": "users,payment,orders",
  "timestamp.column.name": "created_at",
  "output.data.format": "JSON",
  "db.timezone": "UTC",
  "tasks.max": "1"
}
B
Explanation:
The connector is currently configured with "tasks.max": "1", which means a single task is handling
all tables (users, payment, orders). This creates a bottleneck and leads to lag. Increasing tasks.max
(for example, "tasks.max": "3", one task per table) allows Kafka Connect to parallelize work across
multiple tasks, which can pull data from different tables concurrently and reduce the lag.