Course Curriculum

Kafka Mastery: Building StreamSocial


Module 1: Foundation & Core Concepts (Days 1-10)

Day 1: Event-Driven Architecture Fundamentals

Concept: Transition from request-response to event-driven systems
StreamSocial: Design event taxonomy (user_action, content_interaction, system_event)
Challenge: Define 10 core event types for social media platform

Day 2: Kafka Cluster Setup

Concept: ZooKeeper vs. KRaft, broker architecture, distributed consensus
StreamSocial: Multi-broker cluster for high availability
Challenge: Deploy 3-broker cluster with Docker Compose

Day 3: Topics & Partitions Strategy

Concept: Partitioning for parallelism and ordering guarantees
StreamSocial: Create user-actions (1000 partitions), content-interactions (500 partitions)
Challenge: Calculate optimal partition count for 50M req/s
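
A common rule of thumb for this challenge sizes the topic so that neither producers nor consumers become the bottleneck: partitions >= max(target / producer rate per partition, target / consumer rate per partition). The sketch below applies that rule. The class name and the per-partition rates are illustrative assumptions; real rates should come from benchmarking your own cluster.

```java
// Rough partition-count estimate. All throughput figures are illustrative assumptions.
final class PartitionSizing {

    /**
     * Smallest partition count that satisfies both sides:
     * max(ceil(target / producerRate), ceil(target / consumerRate)), at least 1.
     */
    static long requiredPartitions(long targetPerSec,
                                   long producerPerPartitionPerSec,
                                   long consumerPerPartitionPerSec) {
        if (targetPerSec < 0 || producerPerPartitionPerSec <= 0 || consumerPerPartitionPerSec <= 0) {
            throw new IllegalArgumentException("target must be >= 0 and rates must be > 0");
        }
        long forProducers = ceilDiv(targetPerSec, producerPerPartitionPerSec);
        long forConsumers = ceilDiv(targetPerSec, consumerPerPartitionPerSec);
        return Math.max(1, Math.max(forProducers, forConsumers));
    }

    // Integer division rounded up, valid for non-negative a and positive b.
    private static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        // Hypothetical measurements: 50,000 msg/s per partition when producing,
        // 20,000 msg/s per partition when consuming.
        long partitions = requiredPartitions(50_000_000L, 50_000L, 20_000L);
        System.out.println("Estimated partitions: " + partitions);
    }
}
```

The slower side (here the assumed consumer rate) determines the result, which is why consumer processing cost matters as much as broker throughput when choosing a partition count.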

Day 4: High-Volume Producer Implementation

Concept: Producer API, threading, connection pooling
StreamSocial: User action producer handling 5M posts/second
Challenge: Implement producer with connection pooling

Day 5: Engagement Consumer Development

Concept: Consumer API, polling strategies, deserialization
StreamSocial: Like/share/comment processing consumer
Challenge: Build consumer with proper error handling

Day 6: Consumer Groups & Scalability

Concept: Partition assignment, horizontal scaling patterns
StreamSocial: Parallel feed generation workers
Challenge: Scale consumer group from 1 to 100 instances
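
When scaling a consumer group, each partition is owned by at most one group member, so instances beyond the partition count sit idle. The sketch below simulates a simple round-robin assignment to make that visible. It is a simplified stand-in for Kafka's built-in assignors, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified round-robin assignment of partitions to consumer instances.
final class RoundRobinAssignment {

    /** Maps consumer index (0..numConsumers-1) to the partitions it owns. */
    static Map<Integer, List<Integer>> assign(int numPartitions, int numConsumers) {
        if (numPartitions < 0 || numConsumers <= 0) {
            throw new IllegalArgumentException("need numPartitions >= 0 and numConsumers > 0");
        }
        Map<Integer, List<Integer>> assignment = new TreeMap<>();
        for (int c = 0; c < numConsumers; c++) {
            assignment.put(c, new ArrayList<>());
        }
        // Deal partitions out one at a time, like cards.
        for (int p = 0; p < numPartitions; p++) {
            assignment.get(p % numConsumers).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // With 6 partitions and 8 consumers, two consumers receive nothing.
        assign(6, 8).forEach((consumer, partitions) ->
                System.out.println("consumer-" + consumer + " -> " + partitions));
    }
}
```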

Day 7: Offset Management Strategies

Concept: Offset semantics, persistence, recovery
StreamSocial: User engagement metrics tracking
Challenge: Implement custom offset management for analytics

Day 8: Commit Strategies & Reliability

Concept: Auto vs manual commits, at-least-once processing
StreamSocial: Reliable engagement processing pipeline
Challenge: Handle consumer failure scenarios gracefully

Day 9: Dynamic Consumer Rebalancing

Concept: Rebalance protocols, partition reassignment
StreamSocial: Auto-scaling feed generators during traffic spikes
Challenge: Implement custom rebalance listener

Day 10: Delivery Guarantees

Concept: At-least-once, at-most-once, exactly-once semantics
StreamSocial: Critical billing/analytics event processing
Challenge: Design exactly-once processing for user payments


Module 2: Producer Reliability & Performance (Days 11-20)

Day 11: Producer Acknowledgment Strategies

Concept: acks=0,1,all trade-offs for durability vs latency
StreamSocial: 99.9% durability for critical user data
Challenge: Benchmark different acks settings under load

Day 12: Retry Logic & Failure Handling

Concept: Transient vs permanent failures, exponential backoff
StreamSocial: Mobile app connectivity resilience
Challenge: Implement intelligent retry with circuit breaker
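
The backoff half of this challenge can be sketched as a capped exponential delay with optional jitter. The class name, method names, and the millisecond values are assumptions for illustration; the circuit breaker is left out of this sketch.

```java
import java.util.concurrent.ThreadLocalRandom;

// Capped exponential backoff: base * 2^attempt, never exceeding maxMillis.
final class RetryBackoff {

    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 0 || baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Jump straight to the cap when doubling would exceed it (also avoids overflow).
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return Math.min(delay, maxMillis);
    }

    /** Jittered delay: a random value in [0, backoffMillis) to spread out retry storms. */
    static long withJitter(int attempt, long baseMillis, long maxMillis) {
        long cap = backoffMillis(attempt, baseMillis, maxMillis);
        return (long) (ThreadLocalRandom.current().nextDouble() * cap);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt
                    + ": cap=" + backoffMillis(attempt, 100, 10_000)
                    + " ms, jittered=" + withJitter(attempt, 100, 10_000) + " ms");
        }
    }
}
```

Only transient failures should be retried this way; permanent failures (for example, a record that is too large) will fail identically on every attempt.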

Day 13: Idempotent Producers

Concept: Producer ID, sequence numbers, duplicate prevention
StreamSocial: Prevent duplicate posts during network issues
Challenge: Verify idempotence under simulated network failures
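
To build intuition for how sequence numbers prevent duplicates, the sketch below simulates the broker-side check: a retried record carrying an already-seen sequence number is acknowledged but not appended again. This is a deliberately simplified model (the real broker tracks sequences per producer ID and partition, along with a producer epoch), and all names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of broker-side duplicate detection for one partition.
final class SequenceDeduplicator {

    private final Map<Long, Integer> lastSequenceByProducer = new HashMap<>();
    private int appended = 0;

    /** Returns true if the record was appended, false if it was a duplicate retry. */
    boolean tryAppend(long producerId, int sequence) {
        int last = lastSequenceByProducer.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return false; // already written; the retry is safely ignored
        }
        if (sequence != last + 1) {
            throw new IllegalStateException(
                    "out-of-order sequence " + sequence + ", expected " + (last + 1));
        }
        lastSequenceByProducer.put(producerId, sequence);
        appended++;
        return true;
    }

    int appendedCount() {
        return appended;
    }

    public static void main(String[] args) {
        SequenceDeduplicator log = new SequenceDeduplicator();
        log.tryAppend(42L, 0);
        log.tryAppend(42L, 0); // simulated retry after a lost acknowledgment
        log.tryAppend(42L, 1);
        System.out.println("records appended: " + log.appendedCount());
    }
}
```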

Day 14: Message Ordering & Keys

Concept: Partition key hashing, ordering within partitions
StreamSocial: Chronological user timeline ordering
Challenge: Implement user-specific message ordering
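
Per-user ordering follows from a simple property: the same key always hashes to the same partition, and Kafka preserves order within a partition. The sketch below illustrates the mapping. It uses a standard-library hash as a stand-in (Kafka's default partitioner applies murmur2 to the serialized key bytes), so the partition numbers will not match a real cluster; the class name and user IDs are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Deterministic key-to-partition mapping (illustrative hash, not Kafka's murmur2).
final class KeyPartitioning {

    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be > 0");
        }
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 1000; // matches the user-actions topic from Day 3
        for (String userId : new String[] {"user-1", "user-2", "user-1"}) {
            System.out.println(userId + " -> partition " + partitionFor(userId, partitions));
        }
    }
}
```

Note that changing the partition count changes the mapping, so events for one user written before and after a resize may land in different partitions.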

Day 15: Custom Partitioning Logic

Concept: Custom partitioner implementation, business logic partitioning
StreamSocial: Geographic content distribution partitioning
Challenge: Build geo-aware partitioner for global users

Day 16: Batching & Throughput Optimization

Concept: batch.size, linger.ms, buffer.memory tuning
StreamSocial: Peak traffic optimization (50M req/s)
Challenge: Achieve maximum throughput with minimal latency
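
As a starting point for the tuning exercise, the sketch below collects the three settings named above (plus compression) in a Properties object. The property keys are real producer configuration names; the values, the broker address, and the class name are assumptions to be adjusted through benchmarking. Serializers and other required settings must be added before passing this to a producer.

```java
import java.util.Properties;

// Throughput-oriented producer settings. Values are starting points, not recommendations.
final class ThroughputProducerConfig {

    static Properties throughputTuned() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("batch.size", "65536");       // bytes per partition batch
        props.setProperty("linger.ms", "10");           // wait up to 10 ms to fill a batch
        props.setProperty("buffer.memory", "67108864"); // 64 MiB of send buffer
        props.setProperty("compression.type", "lz4");   // compress whole batches
        return props;
    }

    public static void main(String[] args) {
        throughputTuned().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Larger batches and a non-zero linger.ms trade a few milliseconds of latency for fewer, fuller requests, which is the central tension this challenge asks you to measure.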

Day 17: Compression Strategies

Concept: GZIP, Snappy, LZ4, ZSTD comparison
StreamSocial: Bandwidth optimization for media metadata
Challenge: Compare compression ratios and CPU impact

Day 18: Transactional Producers

Concept: Atomic writes across partitions, transaction coordinators
StreamSocial: Atomic posting to timeline and global feed
Challenge: Implement multi-topic atomic operations

Day 19: Asynchronous Operations & Callbacks

Concept: Non-blocking sends, Future handling, callback patterns
StreamSocial: Content moderation pipeline error handling
Challenge: Build asynchronous producer with custom callbacks

Day 20: Replication & ISR Management

Concept: Leader election, follower synchronization, min.insync.replicas
StreamSocial: Multi-region disaster recovery design
Challenge: Simulate leader failure and recovery scenarios


Module 3: Advanced Consumer Patterns (Days 21-30)

Day 21: Manual Partition Assignment

Concept: Direct partition control, bypassing consumer groups
StreamSocial: Dedicated trend analysis workers
Challenge: Implement stateful partition assignment strategy

Day 22: Low-Latency Consumer Optimization

Concept: Poll timeouts, fetch sizes, processing loops
StreamSocial: Sub-100ms notification delivery
Challenge: Optimize consumer for <50ms processing latency

Day 23: Graceful Shutdown Patterns

Concept: Shutdown hooks, final offset commits, resource cleanup
StreamSocial: Recommendation engine worker lifecycle
Challenge: Zero-downtime consumer deployment strategy

Day 24: Message Headers & Metadata

Concept: Header usage patterns, tracing, feature flags
StreamSocial: A/B testing and feature flag propagation
Challenge: Implement distributed tracing with headers

Day 25: Error Handling & Poison Pills

Concept: Deserialization errors, skip strategies, dead letter queues
StreamSocial: Malformed social media post recovery
Challenge: Build robust error handling with DLQ

Day 26: Schema Definition with JSON Schema

Concept: Schema validation, evolution strategies
StreamSocial: User profile and interaction schemas
Challenge: Define comprehensive event schema catalog

Day 27: Avro Serialization & Schema Registry

Concept: Binary serialization, schema evolution, compatibility
StreamSocial: Efficient high-volume event serialization
Challenge: Implement Avro with backward compatibility

Day 28: Protocol Buffers Integration

Concept: Protobuf vs Avro comparison, code generation
StreamSocial: Mobile client efficiency optimization
Challenge: Compare Protobuf and Avro performance

Day 29: Schema Evolution Strategies

Concept: Forward, backward, full compatibility modes
StreamSocial: Mobile app update compatibility
Challenge: Evolve schema without breaking existing consumers

Day 30: Log Compaction

Concept: Key-based retention, latest value semantics
StreamSocial: User preference state management
Challenge: Implement user settings with compacted topics
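
The "latest value per key" semantics can be illustrated with a small in-memory simulation: replaying a log and keeping only the most recent value for each key, with a null value acting as a tombstone that deletes the key. This is a conceptual sketch only (a real broker compacts closed segments in the background and keeps tombstones for a retention period), and the names and sample settings are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model of what a fully compacted topic retains.
final class CompactionSketch {

    /** One log record; a null value is a tombstone for its key. */
    record Entry(String key, String value) {}

    static Map<String, String> compact(List<Entry> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Entry entry : log) {
            if (entry.value() == null) {
                latest.remove(entry.key());   // tombstone: drop the key entirely
            } else {
                latest.put(entry.key(), entry.value()); // newer value replaces older
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Entry> log = List.of(
                new Entry("user-1:theme", "dark"),
                new Entry("user-2:language", "en"),
                new Entry("user-1:theme", "light"),   // overrides the first record
                new Entry("user-2:language", null));  // user-2 reset the setting
        System.out.println(compact(log));
    }
}
```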


Module 4: Data Integration & Connect (Days 31-40)

Day 31: Kafka Connect Architecture

Concept: Workers, connectors, tasks, distributed processing
StreamSocial: External social signal ingestion setup
Challenge: Deploy Connect cluster for high availability

Day 32: Source Connector Implementation

Concept: Data ingestion patterns, offset management
StreamSocial: User-generated content from file uploads
Challenge: Build custom file-based source connector

Day 33: Sink Connector Development

Concept: Data export patterns, exactly-once delivery
StreamSocial: Trending hashtags to analytics systems
Challenge: Implement database sink with upsert capability

Day 34: Single Message Transformations

Concept: Field transformations, routing, filtering
StreamSocial: Raw user actions to standardized events
Challenge: Chain multiple SMTs for data normalization

Day 35: Distributed Connect Deployment

Concept: Worker coordination, task distribution, fault tolerance
StreamSocial: Fault-tolerant content ingestion pipeline
Challenge: Handle worker failures without data loss

Day 36: Connect Monitoring & Observability

Concept: Metrics collection, alerting, performance monitoring
StreamSocial: Content pipeline health monitoring
Challenge: Build Connect monitoring dashboard

Day 37: Custom Connector Development

Concept: Kafka Connect API (connector and task classes), configuration, lifecycle management
StreamSocial: Third-party social platform integration
Challenge: Design custom Twitter/LinkedIn connector

Day 38: Change Data Capture with Debezium

Concept: Database change streaming, CDC patterns
StreamSocial: Real-time user profile updates
Challenge: Stream database changes to Kafka topics

Day 39: Database Integration Patterns

Concept: JDBC connectors, bulk operations, conflict resolution
StreamSocial: Recommendation model training data sync
Challenge: Sync user data with ML training pipeline

Day 40: Error Handling in Connect

Concept: Dead letter queues, retry policies, error tolerance
StreamSocial: Content moderation failure handling
Challenge: Implement comprehensive error handling strategy


Module 5: Stream Processing with Kafka Streams (Days 41-50)

Day 41: Kafka Streams Fundamentals

Concept: Stream processing topology, local state
StreamSocial: Real-time engagement scoring engine
Challenge: Build basic stream processing application

Day 42: KStream Processing

Concept: Record-by-record processing, immutable streams
StreamSocial: User interaction stream for feed ranking
Challenge: Implement stateless transformations pipeline

Day 43: Stateless Transformations

Concept: filter, map, flatMap, peek operations
StreamSocial: Spam filtering and content policy enforcement
Challenge: Build content moderation pipeline

Day 44: Stateful Aggregations

Concept: Windowing, grouping, aggregation functions
StreamSocial: Trending score calculation with sliding windows
Challenge: Implement real-time trending hashtag detection
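
Windowed aggregation boils down to assigning each event to a time bucket and counting per bucket. The sketch below shows this with tumbling (fixed, non-overlapping) windows in plain Java, which is a simpler stand-in for the sliding windows mentioned above and does not use the Kafka Streams API. Class name, window size, and hashtags are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Counts hashtag occurrences per fixed-size time window.
final class TumblingWindowCounter {

    private final long windowSizeMs;
    // window start timestamp -> (hashtag -> count)
    private final Map<Long, Map<String, Long>> countsByWindow = new TreeMap<>();

    TumblingWindowCounter(long windowSizeMs) {
        if (windowSizeMs <= 0) {
            throw new IllegalArgumentException("windowSizeMs must be > 0");
        }
        this.windowSizeMs = windowSizeMs;
    }

    /** Start of the window containing the timestamp, aligned to multiples of the size. */
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs, windowSizeMs);
    }

    void record(String hashtag, long timestampMs) {
        countsByWindow
                .computeIfAbsent(windowStart(timestampMs, windowSizeMs), k -> new HashMap<>())
                .merge(hashtag, 1L, Long::sum);
    }

    long count(String hashtag, long windowStartMs) {
        return countsByWindow
                .getOrDefault(windowStartMs, Map.of())
                .getOrDefault(hashtag, 0L);
    }

    public static void main(String[] args) {
        TumblingWindowCounter counter = new TumblingWindowCounter(60_000); // 1-minute windows
        counter.record("#kafka", 1_000);
        counter.record("#kafka", 59_999);
        counter.record("#kafka", 60_000); // falls into the next window
        System.out.println("first window: " + counter.count("#kafka", 0));
        System.out.println("second window: " + counter.count("#kafka", 60_000));
    }
}
```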

Day 45: KTable Operations

Concept: Changelog streams, materialized views
StreamSocial: Real-time user reputation scores
Challenge: Maintain user engagement statistics

Day 46: Stream-Table Joins

Concept: Stream enrichment, lookup patterns
StreamSocial: User action enrichment with profile data
Challenge: Enrich events with user demographic data

Day 47: Table-Table Joins

Concept: Referential integrity, co-partitioning
StreamSocial: User preferences with content metadata
Challenge: Join user settings with content recommendations

Day 48: Interactive Queries

Concept: State store queries, REST endpoints
StreamSocial: Live trending topics API
Challenge: Build real-time analytics REST API

Day 49: Fault Tolerance & Recovery

Concept: Changelog topics, state restoration
StreamSocial: Engagement score recovery after failures
Challenge: Test application recovery scenarios

Day 50: Processor API

Concept: Low-level processing, custom state stores
StreamSocial: Custom content recommendation algorithms
Challenge: Implement advanced ML-based content scoring


Module 6: Production Operations & Security (Days 51-60)

Day 51: Broker Monitoring & Metrics

Concept: JMX metrics, key performance indicators
StreamSocial: 50M req/s performance monitoring
Challenge: Set up comprehensive broker monitoring

Day 52: Client Metrics & Observability

Concept: Consumer lag, producer and consumer throughput metrics
StreamSocial: SLA compliance monitoring
Challenge: Build client performance dashboard

Day 53: Centralized Logging Strategy

Concept: Log aggregation, structured logging, debugging
StreamSocial: Viral content issue debugging
Challenge: Implement ELK stack for Kafka logs

Day 54: Authentication with SASL

Concept: SASL/SCRAM and SASL/PLAIN authentication mechanisms
StreamSocial: User data security implementation
Challenge: Configure SASL authentication cluster-wide

Day 55: Authorization with ACLs

Concept: Access control lists, role-based permissions
StreamSocial: Service-level topic access control
Challenge: Implement least-privilege access model

Day 56: Encryption & TLS

Concept: Data in transit encryption, certificate management
StreamSocial: Sensitive user data protection
Challenge: Enable TLS encryption for client-to-broker and inter-broker traffic

Day 57: Microservices Event Architecture

Concept: Domain events, eventual consistency, saga patterns
StreamSocial: User, content, and notification service design
Challenge: Implement event-driven microservices communication

Day 58: Schema Governance

Concept: Schema registry, versioning, compatibility policies
StreamSocial: Social feature evolution management
Challenge: Build schema governance framework

Day 59: Advanced Ecosystem Tools

Concept: ksqlDB (formerly KSQL), Confluent Platform, administrative tools
StreamSocial: Ad-hoc trending analysis queries
Challenge: Implement ksqlDB-based analytics

Day 60: System Integration & Production Readiness

Concept: Deployment strategies, capacity planning, disaster recovery
StreamSocial: Complete production-ready social platform
Challenge: Final system review and optimization


Assessment & Certification

Daily Assessments

  • Coding Challenges (60 total)

  • Concept Quizzes (12 module quizzes)

  • System Design Reviews (weekly)

Final Project

Complete StreamSocial system capable of:

  • Processing 50M requests/second

  • <100ms latency for real-time features

  • 99.99% uptime with fault tolerance

  • Global multi-region deployment

  • Production monitoring and alerting

Certification Requirements

  • Complete all 60 daily challenges

  • Pass module assessments (>80%)

  • Deploy working StreamSocial system

  • Present system architecture and design decisions


Resources & Tools

Required Software

  • Java 17+ with Maven/Gradle

  • Docker Desktop with 8GB+ RAM allocation

  • IntelliJ IDEA Community/Ultimate

  • Apache Kafka 3.5+

  • Confluent Platform (optional)

Development Environment

  • Minimum: 16GB RAM, 4-core CPU, 100GB storage

  • Recommended: 32GB RAM, 8-core CPU, 500GB SSD

Supporting Materials

  • Course GitHub repository with starter code

  • Docker Compose templates

  • Monitoring dashboard templates

  • Schema registry configurations

  • Production deployment guides


Expected Outcomes

Upon completion, you will:

  • Build production-ready event-driven systems at scale

  • Design fault-tolerant distributed architectures

  • Implement real-time stream processing applications

  • Deploy secure, monitored Kafka clusters

  • Architect microservices with event-driven patterns

  • Handle 50M+ requests/second with confidence

Career Impact: Qualify for Senior Software Engineer, Solutions Architect, or Platform Engineer roles focusing on distributed systems and real-time data processing.