Setup Guide

Prerequisites

  • Node.js 20+ and pnpm package manager
  • Docker 20+ and Docker Compose 2+
  • 4GB+ available RAM
  • Git

Technology Stack Explained

Backend Technologies

Apache Kafka 4.0.2 (KRaft Mode)

What it is:
  • Distributed event streaming platform
  • Runs in KRaft mode (no Zookeeper dependency)
Why we use it:
  • Durability: Messages persist even if services crash
  • Scalability: Can handle high throughput with partitions
  • Decoupling: Services don't need to know about each other
  • Replay capability: Can reprocess historical data
Configuration:
  • 3 partitions per topic for parallel processing
  • Topics: raw-comments, processed-comments, retry-queue, dead-letter-queue
  • Port: 9092 (broker/PLAINTEXT), 9093 (controller/KRaft)
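Partition counts matter because Kafka routes keyed messages to a partition by hashing the key. A simplified sketch of that mapping (Kafka's default partitioner uses murmur2; the hash below is illustrative only):

```typescript
// Illustrative only: same key always maps to the same partition,
// which preserves per-key ordering while allowing parallelism.
function partitionFor(key: string, numPartitions: number): number {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) | 0; // simple 32-bit rolling hash
  }
  return Math.abs(hash) % numPartitions;
}

// With 3 partitions per topic, up to 3 consumers in a group can
// process a topic in parallel, one partition each.
const p = partitionFor("comment-42", 3);
console.log(p); // always 0, 1, or 2 -- and stable for this key
```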

NestJS 10 with TypeScript

What it is:
  • Progressive Node.js framework
  • Built with TypeScript, supports decorators
Why we use it:
  • Module structure for microservices
  • Built-in Kafka integration (@nestjs/microservices)
  • Dependency injection
  • gRPC support
Used in:
  • Producer service
  • Consumer service
  • Sentiment service
  • API service

PostgreSQL 17

What it is:
  • Relational database
Why we use it:
  • ACID compliance for reliable storage
  • Rich querying with SQL
  • TypeORM integration
Schema:
processed_comments (
  id: serial PRIMARY KEY,
  commentId: varchar (UUID, UNIQUE, INDEXED),
  text: text,
  textHash: varchar (INDEXED),  -- SHA256 hash for sentiment caching
  tag: varchar (positive/negative/neutral/unrelated, INDEXED),
  source: varchar (twitter/instagram/facebook/tiktok),
  processedAt: timestamp (INDEXED),
  consumerId: varchar,  -- Which consumer instance processed it
  retryCount: integer (default 0)
)

Indexed columns: commentId (unique), textHash, tag, processedAt
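The textHash column is what makes sentiment caching work: identical comment text hashes to the same value, so a cached result can be reused instead of re-analyzing. A sketch of the hashing step using Node's built-in crypto (the trim/lowercase normalization is an assumption for illustration, not necessarily what the consumer does):

```typescript
import { createHash } from "node:crypto";

// SHA-256 the comment text so identical comments share a cache entry.
// Normalization (trim + lowercase) is an illustrative assumption.
function textHash(text: string): string {
  return createHash("sha256").update(text.trim().toLowerCase()).digest("hex");
}

console.log(textHash("Great food!") === textHash("  great food!  ")); // true
console.log(textHash("Great food!").length); // 64 hex characters
```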

Redis 7.2

What it is:
  • In-memory key-value store
Why we use it:
  • Fast deduplication checks
  • 3-hour TTL for comment hashes
  • Persistence across service restarts
Data stored:
  • processed:{commentId} = "1" (presence flag, 3-hour TTL)
  • Used by consumer to detect duplicate comment submissions
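The dedup check maps naturally onto Redis SET with NX (set-if-absent) and EX (TTL); a real client such as ioredis would call redis.set(key, "1", "EX", ttl, "NX"). A minimal in-memory sketch of the same logic, assuming the key format shown above:

```typescript
// In-memory stand-in for: SET processed:{commentId} "1" EX 10800 NX
// Returns true if the comment is new, false if it is a duplicate.
const seen = new Map<string, number>(); // key -> expiry timestamp (ms)
const TTL_SECONDS = 10_800; // 3 hours, matching REDIS_TTL

function markIfNew(commentId: string, now = Date.now()): boolean {
  const key = `processed:${commentId}`;
  const expiry = seen.get(key);
  if (expiry !== undefined && expiry > now) return false; // duplicate within TTL
  seen.set(key, now + TTL_SECONDS * 1000);
  return true;
}

console.log(markIfNew("abc-123")); // true  -- first time seen
console.log(markIfNew("abc-123")); // false -- duplicate
```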

gRPC with @grpc/grpc-js

What it is:
  • High-performance RPC framework
  • Uses Protocol Buffers
Why we use it:
  • Binary protocol (faster than JSON/REST)
  • Type safety with proto files
  • Streaming support
Used for:
  • Consumer ↔ Sentiment communication
  • Methods: RegisterConsumer, AnalyzeSentiment
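The two methods translate to a Protocol Buffers service definition along these lines (message and field names here are assumptions, not the project's actual proto file):

```proto
syntax = "proto3";

package sentiment;

service SentimentService {
  rpc RegisterConsumer (RegisterRequest) returns (RegisterResponse);
  rpc AnalyzeSentiment (AnalyzeRequest) returns (AnalyzeResponse);
}

message RegisterRequest  { string consumer_id = 1; }
message RegisterResponse { string token = 1; }

message AnalyzeRequest  { string text = 1; string text_hash = 2; }
message AnalyzeResponse { string tag = 1; } // positive/negative/neutral/unrelated
```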

Frontend Technologies

React 19

What it is:
  • UI library with component model
Why we use it:
  • Component reusability
  • Virtual DOM for performance
  • Large ecosystem

TanStack Ecosystem

TanStack Router:
  • Type-safe routing
  • File-based route definitions
  • Used for: / and /comments pages
TanStack Query:
  • Set up as a provider (part of TanStack Start framework template)
  • Not actively used for data fetching — all data comes from TanStack DB collection
TanStack Table:
  • Headless table library
  • Features used:
    • Column definitions with custom cells
    • Sorting state management
    • Filtering by tag and search query
    • Pagination with page state
  • No batteries-included UI (we provide our own with shadcn/ui)
Implementation:
const table = useReactTable({
  data: filteredComments, // pre-filtered via useLiveQuery + useMemo
  columns,
  getCoreRowModel: getCoreRowModel(),
  getPaginationRowModel: getPaginationRowModel(),
  getSortedRowModel: getSortedRowModel(),
})
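The filteredComments input in the snippet above is derived before it ever reaches the table. An illustrative (assumed) version of that filtering step as a pure function:

```typescript
interface Comment {
  commentId: string;
  text: string;
  tag: "positive" | "negative" | "neutral" | "unrelated";
}

// Filter by selected tag ("all" disables it) and a case-insensitive
// search query, mirroring the table's tag + search filters.
function filterComments(comments: Comment[], tag: string, query: string): Comment[] {
  const q = query.trim().toLowerCase();
  return comments.filter(
    (c) => (tag === "all" || c.tag === tag) && (q === "" || c.text.toLowerCase().includes(q))
  );
}

const sample: Comment[] = [
  { commentId: "1", text: "Loved the pasta", tag: "positive" },
  { commentId: "2", text: "Cold soup", tag: "negative" },
];
console.log(filterComments(sample, "all", "pasta").length); // 1
```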

TanStack DB

What it is:
  • Client-side reactive database with localStorage backend
Why we use it:
  • Automatic persistence to localStorage
  • Reactive live queries (components auto-update)
  • Schema validation with Zod
  • Cross-tab synchronization
  • No manual state management needed
Key Features:
  • SQL-like query syntax
  • Type-safe operations
  • Built-in caching and persistence
  • Offline support

Recharts

What it is:
  • React charting library
Why we use it:
  • Built for React
  • Composable components
  • Responsive by default
Used for:
  • Pie chart: Sentiment distribution
  • Bar chart: Counts by tag

shadcn/ui

What it is:
  • Copy-paste component library (not npm package)
  • Built on Radix UI primitives
Components used:
  • Table, Card, Badge, Button, Input
  • Custom variants for sentiment tags (positive/negative/neutral/unrelated)

Tailwind CSS 4

What it is:
  • Utility-first CSS framework
Why we use it:
  • Rapid styling
  • Responsive design
  • Theme configuration

Infrastructure

Docker & Docker Compose

What it is:
  • Containerization platform
Why we use it:
  • Consistent environments
  • Easy local development
  • Multi-service orchestration
Services:
  • kafka (apache/kafka:4.0.2)
  • postgres (postgres:17)
  • redis (redis:7.2)
  • kafka-ui (provectuslabs/kafka-ui)
  • producer, consumer, sentiment, api, dashboard
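The kafka service in particular needs KRaft-specific settings so the container can act as both broker and controller. An illustrative single-node fragment for the apache/kafka image (the project's actual compose file may set these differently):

```yaml
services:
  kafka:
    image: apache/kafka:4.0.2
    ports:
      - "9092:9092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```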

pnpm Workspaces

What it is:
  • Monorepo package manager
Why we use it:
  • Shared dependencies
  • Faster installs than npm
  • Workspace protocol for local packages
Structure:
workspace (pnpm-workspace.yaml)
├── producer/
├── consumer/
├── sentiment/
├── api/
├── dashboard/
└── shared/ (types and constants)
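The layout above maps to a pnpm-workspace.yaml along these lines (the actual file may use globs instead of literal names):

```yaml
packages:
  - producer
  - consumer
  - sentiment
  - api
  - dashboard
  - shared
```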

Installation Steps

1. Clone Repository

git clone <repository-url>
cd kafka-producer-consumer

2. Install Dependencies

pnpm install

This automatically:

  • Installs all workspace dependencies
  • Builds @repo/shared package via postinstall hook
  • Creates TypeScript declarations

3. Environment Configuration (Optional)

Create .env in root:

# Kafka
KAFKA_BROKER=localhost:9092

# PostgreSQL
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=restaurant_comments

# Redis
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_TTL=10800

# Producer
PRODUCER_MIN_DELAY=100
PRODUCER_MAX_DELAY=10000
PRODUCER_DUPLICATE_RATE=0.05

# Consumer
CONSUMER_MAX_RETRIES=5
CONSUMER_RETRY_DELAY=1000
CONSUMER_CACHE_SIZE=100

# Sentiment
SENTIMENT_AUTH_RATE_LIMIT=100
SENTIMENT_UNAUTH_RATE_LIMIT=10
SENTIMENT_CACHE_SIZE=500
SENTIMENT_FAILURE_RATE=0.03125

# API
API_PORT=3001
API_CORS_ORIGIN=http://localhost:3000
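The consumer settings above (CONSUMER_MAX_RETRIES=5, CONSUMER_RETRY_DELAY=1000) imply a retry schedule before a message lands on the dead-letter-queue. A sketch assuming exponential backoff with the delay as the base; whether the consumer uses fixed or exponential delays is an assumption here:

```typescript
// Delays (ms) for each retry attempt before giving up and routing
// the message to the dead-letter-queue. Exponential doubling from
// the base delay is an illustrative assumption.
function retryDelays(maxRetries: number, baseDelayMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => baseDelayMs * 2 ** attempt);
}

console.log(retryDelays(5, 1000)); // [ 1000, 2000, 4000, 8000, 16000 ]
```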

4. Start Services

With Docker (Recommended):
pnpm docker:up
Development Mode:
# Start infrastructure only
docker compose up -d postgres redis kafka
 
# Start all services in watch mode
# TypeORM synchronize:true auto-creates tables on startup
pnpm dev:all

Accessing Services

Service           URL                                     Purpose
Dashboard         http://localhost:3000                   Main UI
API               http://localhost:3001                   REST endpoints
API Health        http://localhost:3001/health            Health check
Statistics        http://localhost:3001/api/statistics    Aggregate data
SSE Stream        http://localhost:3001/api/sse/comments  Real-time events
Sentiment Health  http://localhost:3005/health            gRPC service status
Kafka UI          http://localhost:8080                   Kafka topic browser
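The SSE endpoint uses standard text/event-stream framing: "data:" lines separated by blank lines, which a browser consumes with new EventSource("http://localhost:3001/api/sse/comments"). A minimal parser sketch for one frame; the payload shape shown is an assumption:

```typescript
// Parse a single text/event-stream frame into its JSON data payload.
// Returns null for frames without data (e.g. ": keep-alive" comments).
function parseSseFrame(frame: string): unknown {
  const dataLines = frame
    .split("\n")
    .filter((line) => line.startsWith("data:"))
    .map((line) => line.slice(5).trim());
  if (dataLines.length === 0) return null;
  return JSON.parse(dataLines.join("\n"));
}

const frame = 'data: {"commentId":"abc","tag":"positive"}\n\n';
console.log(parseSseFrame(frame)); // { commentId: 'abc', tag: 'positive' }
```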

Verification

1. Check Kafka

docker ps | grep kafka
# Should show kafka container running

2. Check Database

docker exec kafka-postgres psql -U postgres -d restaurant_comments -c "\dt"
# Should show "processed_comments" table

3. Check Redis

docker exec kafka-redis redis-cli ping
# Should return PONG

4. Check Services

curl http://localhost:3001/health
# Should return service stats
 
curl http://localhost:3005/health
# Should return sentiment service stats

5. Check Real-time Updates

Open http://localhost:3000 and watch for:

  • Connection status indicator
  • Live comment count updates
  • Charts updating

Troubleshooting

Kafka not ready:
  • Wait 60 seconds after docker compose up
  • Kafka needs time to initialize in KRaft mode
Database connection failed:
  • Ensure PostgreSQL container is running
  • TypeORM synchronize: true creates tables automatically on startup
@repo/shared not found:
  • Run pnpm --filter shared build
  • Check shared/dist/ folder exists
Port conflicts:
  • Ensure ports 3000, 3001, 3004, 3005, 5432, 6379, 8080, 9092 are free
  • Check with: lsof -i :PORT_NUMBER

Next Steps