Sentiment Service

Overview

The Sentiment service is a hybrid HTTP + gRPC microservice that analyzes comment text and classifies it into one of four categories: positive, negative, neutral, or unrelated. It includes consumer registration, rate limiting, caching, and intentional failure simulation.

Dual Protocol Support:

gRPC (Port 3004): Primary interface for sentiment analysis and consumer registration
HTTP (Port 3005): Health checks, monitoring, and statistics endpoints

Purpose

Classify comment sentiment using keyword matching
Provide authenticated access via consumer registration
Enforce rate limits (100 requests/sec for registered consumers, 10 requests/sec for unregistered ones)
Cache results to reduce redundant analysis
Simulate real-world service failures
Expose health and monitoring endpoints via HTTP

Architecture

┌───────────────────────────────────────────────────┐
│         Sentiment Service                         │
│                                                   │
│  ┌──────────────┐      ┌──────────────┐           │
│  │ HTTP Server  │      │ gRPC Server  │           │
│  │ (Port 3005)  │      │ (Port 3004)  │           │
│  └──────┬───────┘      └──────┬───────┘           │
│         │                     │                   │
│         ▼                     │                   │
│  ┌─────────────┐      ┌───────┴────────┐          │
│  │   Health    │      │    Register    │          │
│  │  Endpoint   │      │    Consumer    │          │
│  │   /health   │      └───────┬────────┘          │
│  └─────────────┘              │                   │
│                       ┌───────┴────────┐          │
│                       │    Analyze     │          │
│                       │   Sentiment    │          │
│                       └───────┬────────┘          │
│                               │                   │
│                  ┌────────────┴────────┐          │
│                  │                     │          │
│                  ▼                     ▼          │
│            ┌───────────┐        ┌──────────────┐  │
│            │   Rate    │        │   LRU Cache  │  │
│            │  Limiter  │        │  (500 items) │  │
│            └─────┬─────┘        └──────┬───────┘  │
│                  └──────────┬──────────┘          │
│                             ▼                     │
│                   ┌──────────────────┐            │
│                   │  Keyword Matcher │            │
│                   │ (classification) │            │
│                   └──────────────────┘            │
└───────────────────────────────────────────────────┘

Port Assignment:

3004: gRPC service (sentiment analysis, registration)
3005: HTTP service (health checks, monitoring)

gRPC Service Definition

Proto file: sentiment/proto/sentiment.proto

service SentimentService {
  rpc RegisterConsumer(RegisterRequest) returns (RegisterResponse);
  rpc AnalyzeSentiment(SentimentRequest) returns (SentimentResponse);
}
 
message RegisterRequest {
  string consumerName = 1;
}
 
message RegisterResponse {
  string consumerId = 1;
  int32 rateLimit = 2;
  string message = 3;
}
 
message SentimentRequest {
  string commentId = 1;
  string text = 2;
  string textHash = 3;
  string consumerId = 4;
}
 
message SentimentResponse {
  string commentId = 1;
  string tag = 2;  // positive | negative | neutral | unrelated
  int32 processingTime = 3;
  bool cached = 4;
}
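
For TypeScript callers, the request and response messages above could be mirrored as plain interfaces. This is a hand-written sketch rather than generated output; `SENTIMENT_TAGS`, `isSentimentTag`, and the example objects are hypothetical names introduced here for illustration.

```typescript
// Hand-written mirror of the proto messages (illustrative; projects often generate these instead).
const SENTIMENT_TAGS = ["positive", "negative", "neutral", "unrelated"] as const;
type SentimentTag = (typeof SENTIMENT_TAGS)[number];

interface SentimentRequest {
  commentId: string;
  text: string;
  textHash: string;
  consumerId: string;
}

interface SentimentResponse {
  commentId: string;
  tag: SentimentTag;
  processingTime: number; // milliseconds
  cached: boolean;
}

// Hypothetical guard: narrows an arbitrary string (for example, one read off the wire) to a known tag.
function isSentimentTag(value: string): value is SentimentTag {
  return SENTIMENT_TAGS.some((tag) => tag === value);
}

const exampleRequest: SentimentRequest = {
  commentId: "abc-123",
  text: "The food was amazing!",
  textHash: "3a2f1b...",
  consumerId: "a1b2c3d4-...",
};

const exampleResponse: SentimentResponse = {
  commentId: exampleRequest.commentId,
  tag: "positive",
  processingTime: 42,
  cached: false,
};
```

The guard is useful because the proto declares `tag` as a plain string, so the four allowed values are only a convention documented in a comment.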

Key Components

1. Registration Service

Location: sentiment/src/registration.service.ts

Purpose: Issues unique consumer IDs for authenticated access

Implementation:

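import { Injectable } from '@nestjs/common'
import { v4 as uuidv4 } from 'uuid'

// Shape of a stored registration (fields taken from registerConsumer below)
interface RegisteredConsumer {
  id: string
  name: string
  registeredAt: Date
}

@Injectable()
export class RegistrationService {
  private consumers = new Map<string, RegisteredConsumer>()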
 
  registerConsumer(consumerName: string): string {
    const consumerId = uuidv4()
    
    this.consumers.set(consumerId, {
      id: consumerId,
      name: consumerName,
      registeredAt: new Date()
    })
 
    return consumerId
  }
 
  isRegistered(consumerId: string): boolean {
    return this.consumers.has(consumerId)
  }
}

Why registration:

Distinguishes authenticated from unauthenticated requests
Enables per-consumer rate limiting
Could track usage statistics (request counts, etc.)

2. Rate Limiter Service

Location: sentiment/src/rate-limiter.service.ts

Two-tier rate limiting:

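import { Injectable } from '@nestjs/common'
import { RegistrationService } from './registration.service'

interface RateLimitEntry {
  count: number
  resetTime: number  // epoch ms at which the current window ends
}

@Injectable()
export class RateLimiterService {
  private limits = new Map<string, RateLimitEntry>()
  private authenticatedRateLimit = 100  // per second
  private unauthenticatedRateLimit = 10  // per second

  // Injected so checkRateLimit can look up registration status
  constructor(private readonly registrationService: RegistrationService) {}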
  
  async checkRateLimit(consumerId: string): Promise<boolean> {
    const now = Date.now()
    const entry = this.limits.get(consumerId)
    
    // Determine rate limit based on registration status
    const isRegistered = this.registrationService.isRegistered(consumerId)
    const rateLimit = isRegistered ? this.authenticatedRateLimit : this.unauthenticatedRateLimit
    
    if (!entry || now > entry.resetTime) {
      // New window
      this.limits.set(consumerId, {
        count: 1,
        resetTime: now + 1000, // 1 second window
      })
      return true
    }
    
    if (entry.count >= rateLimit) {
      return false  // Rate limit exceeded
    }
    
    entry.count++
    return true
  }
}

Configuration:

SENTIMENT_AUTH_RATE_LIMIT=100
SENTIMENT_UNAUTH_RATE_LIMIT=10

Why two tiers:

Incentivizes registration
Prevents abuse from unregistered clients
Simulates API quota systems

Enforcement:

const allowed = await this.rateLimiter.checkRateLimit(consumerId)
 
if (!allowed) {
  throw new RpcException({
    code: 8, // RESOURCE_EXHAUSTED
    message: 'Rate limit exceeded'
  })
}

3. LRU Cache

Configuration:

const sentimentCache = new LRUCache({
  max: 500,  // SENTIMENT_CACHE_SIZE
  ttl: 1000 * 60 * 60 * 24  // 24 hours
})

Usage:

const cacheKey = createHash('sha256').update(text).digest('hex')
const cached = sentimentCache.get(cacheKey)
 
if (cached) {
  return { tag: cached, processingTime: 0, cached: true }
}
 
const tag = this.classifySentiment(text)
sentimentCache.set(cacheKey, tag)

Why caching:

The same comment text can be analyzed repeatedly (duplicates can slip past deduplication because of timing)
Reduces CPU load on keyword matching
Faster response times
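
To make the eviction behavior concrete, here is a minimal sketch of least-recently-used eviction built on `Map` insertion order. It is an illustrative stand-in, not the cache implementation the service uses: `TinyLru` is a hypothetical class, it has no TTL, and the capacity of 2 is chosen only to keep the example short.

```typescript
// Minimal LRU cache relying on Map's insertion order (illustrative stand-in, no TTL).
class TinyLru<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.max) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }
}

const cache = new TinyLru<string, string>(2);
cache.set("hash-a", "positive");
cache.set("hash-b", "negative");
cache.get("hash-a");            // touch "hash-a" so "hash-b" becomes the oldest entry
cache.set("hash-c", "neutral"); // cache is full, so "hash-b" is evicted
```

With a capacity of 500, the same rule means the 501st distinct text pushes out whichever cached text was read or written longest ago.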

4. Sentiment Classification

Keyword-based approach:

private classifySentiment(text: string): SentimentTag {
  const lowerText = text.toLowerCase()
  
  // Positive keywords
  const positiveWords = [
    'amazing', 'best', 'love', 'perfect', 'excellent', 'great',
    'outstanding', 'fantastic', 'wonderful', 'delicious', '😍', '❤️',
    'recommend', 'incredible', 'awesome'
  ]
  
  // Negative keywords
  const negativeWords = [
    'terrible', 'worst', 'bad', 'horrible', 'awful', 'disappointing',
    'cold', 'tasteless', 'overpriced', 'rude', 'dirty', 'unacceptable',
    'complaint', 'poisoning', 'waste'
  ]
  
  // Unrelated keywords
  const unrelatedWords = [
    'time', 'close', 'parking', 'blog', 'meet', 'wifi', 'password',
    'reservation', 'online', 'gluten-free', 'follow', 'spam', 'crypto'
  ]
  
  // Count matches
  let positiveCount = 0
  let negativeCount = 0
  let unrelatedCount = 0
  
  positiveWords.forEach(word => {
    if (lowerText.includes(word)) positiveCount++
  })
  
  negativeWords.forEach(word => {
    if (lowerText.includes(word)) negativeCount++
  })
  
  unrelatedWords.forEach(word => {
    if (lowerText.includes(word)) unrelatedCount++
  })
  
  // Determine sentiment
  if (unrelatedCount > 0) {
    return 'unrelated'
  }
  
  if (positiveCount > negativeCount) {
    return 'positive'
  } else if (negativeCount > positiveCount) {
    return 'negative'
  } else {
    return 'neutral'
  }
}

Why keyword matching:

Simple and deterministic
Fast processing
Sufficient for PoC
Easily understandable

Limitations:

No context understanding
Sarcasm not detected
Simple word counting
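
One consequence of these limitations is that substring matching also fires inside longer words. The sketch below contrasts the `includes` check used above with a whole-word alternative. `matchesBySubstring` and `matchesByWholeWord` are hypothetical helpers, and the keyword list is a shortened copy of the one above.

```typescript
// Substring matching (as in classifySentiment) versus whole-word matching (one possible alternative).
const unrelatedWords = ["time", "close", "parking"];

function matchesBySubstring(text: string, words: string[]): string[] {
  const lower = text.toLowerCase();
  return words.filter((word) => lower.includes(word));
}

function matchesByWholeWord(text: string, words: string[]): string[] {
  // Split on anything that is not a letter, digit, or hyphen.
  // Emoji keywords would need separate handling under this scheme.
  const tokens = new Set(text.toLowerCase().split(/[^a-z0-9-]+/).filter(Boolean));
  return words.filter((word) => tokens.has(word));
}

const comment = "Sometimes the soup is great";
const substringHits = matchesBySubstring(comment, unrelatedWords); // ["time"], so the comment would be tagged unrelated
const wholeWordHits = matchesByWholeWord(comment, unrelatedWords); // [], so positive/negative counting would decide
```

Because any unrelated match short-circuits the classifier, a single accidental hit such as "time" inside "sometimes" is enough to hide an otherwise positive comment.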

5. Failure Simulation

Configuration:

SENTIMENT_FAILURE_RATE=0.03125  # 1 in 32 requests

Implementation:

private shouldSimulateFailure(): boolean {
  return Math.random() < this.failureRate
}
 
async analyzeSentiment(request: SentimentRequest): Promise<SentimentResponse> {
  if (this.shouldSimulateFailure()) {
    throw new RpcException({
      code: 13, // INTERNAL
      message: 'Random service failure'
    })
  }
  
  // Normal processing...
}

Why simulate failures:

Tests consumer retry mechanism
Simulates real-world service instability
Validates error handling paths
Populates dead-letter queue
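
Random failures are awkward to test directly. One option, sketched here as a hypothetical refactor, is to pass the random source in as a parameter so tests can supply a fixed or seeded generator. `RandomSource` and `seededRandom` are names introduced for illustration and are not part of the service.

```typescript
// Failure simulation with an injectable random source (hypothetical refactor for testability).
type RandomSource = () => number; // expected to return a value in [0, 1)

function shouldSimulateFailure(failureRate: number, random: RandomSource = Math.random): boolean {
  return random() < failureRate;
}

// Small deterministic generator (a linear congruential generator) so runs are repeatable.
function seededRandom(seed: number): RandomSource {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

const failureRate = 1 / 32; // 0.03125, the SENTIMENT_FAILURE_RATE value
const random = seededRandom(42);
const totalRequests = 3200;
let failures = 0;
for (let i = 0; i < totalRequests; i++) {
  if (shouldSimulateFailure(failureRate, random)) failures++;
}
// With a uniform source, failures should land near totalRequests / 32 = 100.
```

A test for the error path can then call `shouldSimulateFailure(failureRate, () => 0)` to force a failure instead of waiting for one to occur by chance.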

6. Processing Time Simulation

Simulates variable processing time:

const processingTime = text.length * 2  // 2ms per character
 
await sleep(processingTime)
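
The snippet above calls `sleep`, which is not a JavaScript built-in. A minimal sketch, assuming a Promise-based helper around `setTimeout`; `MS_PER_CHARACTER`, `simulatedDelayMs`, and `simulateProcessing` are hypothetical names used only here.

```typescript
// `sleep` is not built in; a common Promise-based definition is assumed here.
function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

const MS_PER_CHARACTER = 2;

// Note: `length` counts UTF-16 code units, so most emoji count as two characters.
function simulatedDelayMs(text: string): number {
  return text.length * MS_PER_CHARACTER;
}

async function simulateProcessing(text: string): Promise<number> {
  const delay = simulatedDelayMs(text);
  await sleep(delay);
  return delay;
}

const shortDelay = simulatedDelayMs("Great!");                // 12
const longerDelay = simulatedDelayMs("The food was amazing!"); // 42
```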

Why simulate delay:

More realistic than instant responses
Tests timeout handling
Creates observable latency variations

Example:

"Great!" (6 chars) → 12ms
"Amazing food! Best restaurant..." (30 chars) → 60ms

Behavior Details

Registration Flow

Request:

{
  "consumerName": "consumer-service"
}

Response:

{
  "consumerId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "rateLimit": 100,
  "message": "Successfully registered consumer: consumer-service"
}

Consumer stores consumerId and includes it in all sentiment requests.

Analysis Flow

Request:

{
  "commentId": "abc-123",
  "text": "The food was amazing!",
  "textHash": "3a2f1b...",
  "consumerId": "a1b2c3d4-..."
}

Processing steps:

Check if consumerId is registered
Check rate limit (100/sec if registered, 10/sec if not)
Fail with a simulated INTERNAL error with probability 1/32 (failure simulation)
Check LRU cache for textHash
If not cached, classify sentiment using keyword matching
Simulate processing time (text.length * 2ms)
Cache result and return
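
The ordering of these steps can be sketched as a single function with each dependency reduced to a plain callback. Everything named here (`Deps`, `Outcome`, `handle`) is hypothetical, the cache is keyed by `textHash` as listed in the steps, and the delay is computed rather than slept so the sketch stays synchronous.

```typescript
// Order of checks for one request; each dependency is a plain function (all names hypothetical).
type Tag = "positive" | "negative" | "neutral" | "unrelated";

interface Deps {
  allowRequest: (consumerId: string) => boolean; // registration lookup plus rate limit
  shouldFail: () => boolean;                     // failure simulation
  cache: Map<string, Tag>;                       // stand-in for the LRU cache
  classify: (text: string) => Tag;               // keyword matcher
}

type Outcome =
  | { ok: true; tag: Tag; processingTime: number; cached: boolean }
  | { ok: false; code: "RESOURCE_EXHAUSTED" | "INTERNAL" };

function handle(request: { text: string; textHash: string; consumerId: string }, deps: Deps): Outcome {
  if (!deps.allowRequest(request.consumerId)) return { ok: false, code: "RESOURCE_EXHAUSTED" };
  if (deps.shouldFail()) return { ok: false, code: "INTERNAL" };

  const hit = deps.cache.get(request.textHash);
  if (hit !== undefined) return { ok: true, tag: hit, processingTime: 0, cached: true };

  const tag = deps.classify(request.text);
  const processingTime = request.text.length * 2; // the real service would sleep for this long
  deps.cache.set(request.textHash, tag);
  return { ok: true, tag, processingTime, cached: false };
}

const deps: Deps = {
  allowRequest: () => true,
  shouldFail: () => false,
  cache: new Map<string, Tag>(),
  classify: () => "positive",
};
const request = { text: "The food was amazing!", textHash: "3a2f1b...", consumerId: "a1b2c3d4-..." };
const first = handle(request, deps);  // classified, then cached
const second = handle(request, deps); // served from the cache
```

Placing the rate-limit and failure checks before the cache lookup means that even cached texts count against the quota and can fail, which matches the step order listed above.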

Response:

{
  "commentId": "abc-123",
  "tag": "positive",
  "processingTime": 42,
  "cached": false
}

Rate Limiting Behavior

Scenario: Registered consumer

Request 1-100: Accepted
Request 101: Rejected (RESOURCE_EXHAUSTED)
[After 1 second]
Request 102-201: Accepted

Scenario: Unregistered consumer

Request 1-10: Accepted
Request 11: Rejected (RESOURCE_EXHAUSTED)
[After 1 second]
Request 12-21: Accepted

Error response:

gRPC error: code=RESOURCE_EXHAUSTED, message="Rate limit exceeded"
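
The unregistered scenario can be reproduced deterministically with a fixed-window counter whose clock is passed in. This is a hypothetical variant of the limiter logic shown earlier, written for illustration; `FixedWindowLimiter` and `fakeTime` are not part of the service.

```typescript
// Fixed-window counter with an injectable clock (hypothetical variant for deterministic testing).
interface WindowEntry {
  count: number;
  resetTime: number; // epoch ms at which the current window ends
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowEntry>();
  private readonly limit: number;
  private readonly now: () => number;

  constructor(limit: number, now: () => number = Date.now) {
    this.limit = limit;
    this.now = now;
  }

  allow(key: string): boolean {
    const current = this.now();
    const entry = this.windows.get(key);
    if (!entry || current > entry.resetTime) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { count: 1, resetTime: current + 1000 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count++;
    return true;
  }
}

let fakeTime = 0;
const limiter = new FixedWindowLimiter(10, () => fakeTime); // unregistered tier
const results: boolean[] = [];
for (let i = 0; i < 11; i++) results.push(limiter.allow("unregistered"));
// results holds ten `true` values followed by one `false`

fakeTime += 1001; // move past the window's resetTime
const afterReset = limiter.allow("unregistered"); // true: a new window starts
```

A fixed window resets all at once, so a client can send a full quota at the end of one window and another full quota at the start of the next.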

HTTP Endpoints

The sentiment service runs both HTTP and gRPC servers simultaneously on different ports.

Health Endpoint

URL: GET http://localhost:3005/health

Purpose:

Service health monitoring
Aggregated statistics from all internal services
Container readiness/liveness probes
Debugging and observability

Response:

{
  "status": "healthy",
  "service": "sentiment-grpc",
  "cache": {
    "size": 245,
    "maxSize": 500
  },
  "rateLimiter": {
    "activeConsumers": 3,
    "authenticatedRateLimit": 100,
    "unauthenticatedRateLimit": 10
  },
  "registration": {
    "totalRegistered": 2,
    "consumers": [
      {
        "id": "a1b2c3d4...",
        "name": "consumer-service",
        "registeredAt": "2026-03-26T10:00:00Z"
      }
    ]
  },
  "timestamp": "2026-03-26T10:30:00Z"
}

Aggregated Metrics:

Cache stats: Current size and capacity
Rate limiter stats: Active consumer count and limits
Registration stats: All registered consumers with timestamps
Service status: Overall health indicator
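
The aggregation described above can be sketched as a pure function that takes each service's statistics and produces the response body. `HealthInputs` and `buildHealthResponse` are hypothetical names, and the way the real service collects these numbers is not shown in this document.

```typescript
// Assembling the health payload from per-service statistics (hypothetical shapes and names).
interface ConsumerSummary {
  id: string;
  name: string;
  registeredAt: string; // ISO 8601 timestamp
}

interface HealthInputs {
  cacheSize: number;
  cacheMaxSize: number;
  activeConsumers: number;
  authenticatedRateLimit: number;
  unauthenticatedRateLimit: number;
  consumers: ConsumerSummary[];
}

function buildHealthResponse(inputs: HealthInputs, now: Date = new Date()) {
  return {
    status: "healthy",
    service: "sentiment-grpc",
    cache: { size: inputs.cacheSize, maxSize: inputs.cacheMaxSize },
    rateLimiter: {
      activeConsumers: inputs.activeConsumers,
      authenticatedRateLimit: inputs.authenticatedRateLimit,
      unauthenticatedRateLimit: inputs.unauthenticatedRateLimit,
    },
    registration: { totalRegistered: inputs.consumers.length, consumers: inputs.consumers },
    timestamp: now.toISOString(), // includes milliseconds, e.g. 2026-03-26T10:30:00.000Z
  };
}

const health = buildHealthResponse(
  {
    cacheSize: 245,
    cacheMaxSize: 500,
    activeConsumers: 3,
    authenticatedRateLimit: 100,
    unauthenticatedRateLimit: 10,
    consumers: [{ id: "a1b2c3d4...", name: "consumer-service", registeredAt: "2026-03-26T10:00:00Z" }],
  },
  new Date("2026-03-26T10:30:00Z"),
);
```

Keeping the builder free of framework code makes it easy to unit test without starting either server.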

Why use HTTP instead of gRPC:

Easy to test with curl/Postman
Standard for Docker/Kubernetes health checks
Human-readable JSON output
No proto compilation needed for monitoring

Docker Healthcheck:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3005/health"]
  interval: 10s
  timeout: 5s
  retries: 3

Hybrid Architecture Benefits

gRPC for Core Business Logic:

High performance binary protocol
Type-safe service contracts
Streaming support (if needed later)
Efficient for service-to-service calls

HTTP for Operations:

Easy monitoring and debugging
Standard healthcheck protocols
Browser-accessible endpoints
No special client tooling needed

Single Process, Dual Protocols:

Both servers run in same NestJS application
Shared service layer (cache, rate limiter, registration)
Separate ports keep monitoring traffic apart from analysis traffic
Minimal overhead

Performance Characteristics

Throughput:

With cache hits: 1000+ requests/second
Without cache: 200-500 requests/second

Latency:

Cached: < 5ms
Uncached: 10-200ms (depends on text length)
Failed (simulated): 0ms (immediate exception)

Cache effectiveness:

Hit rate: ~30-40% in typical scenarios
Varies based on duplicate rate from producer

Monitoring

Logs:

[Sentiment] Registered consumer: consumer-service (ID: a1b2c3d4-...)
[Sentiment] Analyzing sentiment for: "Great food!"
[Sentiment] Cache hit for text hash: 3a2f1b...
[Sentiment] Rate limit exceeded for consumer: unregistered
[Sentiment] Simulated failure (1 in 32)

Metrics to track:

Total requests
Registered vs unregistered requests
Cache hit rate
Rate limit violations
Simulated failures
Average processing time

Development

Run locally:

cd sentiment
pnpm dev

Test gRPC calls:

Using grpcurl:

# Register (if the server does not expose gRPC reflection, also pass -proto with the path to sentiment.proto)
grpcurl -plaintext -d '{"consumerName":"test"}' \
  localhost:3004 SentimentService/RegisterConsumer
 
# Analyze
grpcurl -plaintext -d '{"commentId":"abc","text":"Great!","textHash":"def","consumerId":"xyz"}' \
  localhost:3004 SentimentService/AnalyzeSentiment

Environment variables:

SENTIMENT_AUTH_RATE_LIMIT=100
SENTIMENT_UNAUTH_RATE_LIMIT=10
SENTIMENT_CACHE_SIZE=500
SENTIMENT_FAILURE_RATE=0.03125

Testing

Test Registration

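# Register a consumer; the response should include a consumerId
grpcurl -plaintext -d '{"consumerName":"test"}' \
  localhost:3004 SentimentService/RegisterConsumer
# Then check that totalRegistered has increased
curl http://localhost:3005/health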

Test Rate Limiting

Send 150 requests rapidly:

for i in {1..150}; do
  grpcurl -plaintext -d '{"consumerId":"test","text":"Test"}' \
    localhost:3004 SentimentService/AnalyzeSentiment
done
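# "test" is not a registered consumerId, so the unauthenticated limit (10/sec) applies:
# expect RESOURCE_EXHAUSTED once more than 10 requests land in the same 1-second window.
# To exercise the 100/sec tier, use a consumerId returned by RegisterConsumer.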

Test Cache

# Send same text twice
grpcurl -plaintext -d '{"commentId":"test","text":"Same text","textHash":"abc","consumerId":"xyz"}' ...
# Second call should be faster (processingTime: 0, cached: true)

Test Failure Simulation

# Send 100 requests
# ~3 should fail with INTERNAL (gRPC code 13)

Next Steps

API & SSE - How processed data is served
Consumer Service - How sentiment is requested
Frontend - How tags are displayed