EventCatalog | Inventory Service

This runbook provides operational procedures for the InventoryService, which manages product inventory and stock levels across the FlowMart e-commerce platform.

Architecture

The InventoryService is responsible for:

Managing product inventory and stock levels
Reserving inventory for pending orders
Tracking inventory across warehouses and locations
Providing real-time availability information
Triggering restock notifications

Service Dependencies


Monitoring and Alerting

Key Metrics

| Metric | Description | Warning Threshold | Critical Threshold |
|---|---|---|---|
| `inventory_check_rate` | Inventory availability checks per minute | > 1000/min | > 5000/min |
| `inventory_check_latency` | Time to check inventory availability | > 100ms | > 500ms |
| `inventory_update_latency` | Time to update inventory levels | > 200ms | > 1s |
| `low_stock_items` | Number of items with low stock | > 50 | > 100 |
| `connection_pool_usage` | Database connection pool utilization | > 70% | > 90% |
| `redis_hit_rate` | Cache hit rate | < 80% | < 60% |
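
The thresholds above can be applied mechanically when spot-checking a value by hand. The sketch below is illustrative only: `classify_metric` is a hypothetical helper, not part of the InventoryService tooling, and it assumes plain numeric input with units already stripped (for example `150` for 150ms).

```shell
# classify_metric VALUE WARN CRIT [above|below]
# Prints ok, warning, or critical.
# "above" (the default) means larger values are worse, as for latency.
# "below" means smaller values are worse, as for redis_hit_rate.
# Hypothetical helper for illustration; not part of the service tooling.
classify_metric() {
  awk -v v="$1" -v w="$2" -v c="$3" -v d="${4:-above}" 'BEGIN {
    v = v + 0; w = w + 0; c = c + 0                 # force numeric comparison
    if (d == "below") { v = -v; w = -w; c = -c }    # flip so "larger is worse" applies
    if (v > c)      print "critical"
    else if (v > w) print "warning"
    else            print "ok"
  }'
}
```

For example, `classify_metric 150 100 500` prints `warning` for an `inventory_check_latency` of 150ms, and `classify_metric 75 80 60 below` prints `warning` for a `redis_hit_rate` of 75%. A value exactly on a threshold is treated as not exceeding it, matching the strict `>` and `<` in the table.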

Dashboards

Common Alerts

| Alert | Description | Troubleshooting Steps |
|---|---|---|
| `InventoryServiceHighLatency` | API latency exceeds thresholds | See High Latency |
| `InventoryServiceDatabaseIssues` | Database connection or performance issues | See Database Issues |
| `InventoryServiceCacheFailure` | Redis cache unavailable or performance degraded | See Cache Issues |
| `InventoryServiceOutOfStock` | Critical products out of stock | See Stock Management |

Troubleshooting Guides

High Latency

If the service is experiencing high latency:

Check system resource usage:
```
kubectl top pods -n inventory
```

Check database connection pool:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl localhost:8080/actuator/metrics/hikaricp.connections.usage
```

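Check cache hit rate. In Spring Boot's Micrometer cache metrics, `cache.gets` reports hit and miss counts under the `result` tag rather than a ready-made ratio, so query each count and compare them:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl -s 'localhost:8080/actuator/metrics/cache.gets?tag=result:hit'
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl -s 'localhost:8080/actuator/metrics/cache.gets?tag=result:miss'
```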

Check for slow queries in the database:
```
kubectl exec -it $(kubectl get pods -l app=postgresql -n data -o jsonpath='{.items[0].metadata.name}') -n data -- psql -U postgres -c "SELECT query, calls, mean_exec_time FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;"
```

Scale the service if needed:
```
kubectl scale deployment inventory-service -n inventory --replicas=5
```
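
Most commands in this runbook repeat the same pod lookup before calling `curl`. A small wrapper can shorten them. The following is a sketch, assuming the Deployment is named `inventory-service` (as in the scale command above) and that `curl` is available in the container. The function name `inventory_curl` and the `KUBECTL` override variable are hypothetical, introduced here only for illustration.

```shell
# inventory_curl ARGS...
# Runs curl inside one pod of the inventory-service Deployment.
# "kubectl exec deploy/NAME" lets kubectl pick a pod, replacing the
# explicit "kubectl get pods ... jsonpath" lookup.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command
# without touching the cluster. Hypothetical helper for illustration.
inventory_curl() {
  "${KUBECTL:-kubectl}" exec deploy/inventory-service -n inventory -- curl -sS "$@"
}
```

With this in place, the connection pool check becomes `inventory_curl localhost:8080/actuator/metrics/hikaricp.connections.usage`.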

Database Issues

If there are database connection or performance issues:

Check PostgreSQL status:
```
kubectl exec -it $(kubectl get pods -l app=postgresql -n data -o jsonpath='{.items[0].metadata.name}') -n data -- pg_isready -U postgres
```

Check for long-running transactions:
```
kubectl exec -it $(kubectl get pods -l app=postgresql -n data -o jsonpath='{.items[0].metadata.name}') -n data -- psql -U postgres -c "SELECT pid, now() - xact_start AS duration, state, query FROM pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC;"
```

Check for table bloat:
```
kubectl exec -it $(kubectl get pods -l app=postgresql -n data -o jsonpath='{.items[0].metadata.name}') -n data -- psql -U postgres -c "SELECT schemaname, relname, n_live_tup, n_dead_tup, (n_dead_tup::float / n_live_tup::float) AS dead_ratio FROM pg_stat_user_tables WHERE n_live_tup > 1000 ORDER BY dead_ratio DESC;"
```

Restart database connections in the application if needed:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl -X POST localhost:8080/actuator/restart-db-connections
```

Cache Issues

If there are Redis cache issues:

Check Redis status:
```
kubectl exec -it $(kubectl get pods -l app=redis -n data -o jsonpath='{.items[0].metadata.name}') -n data -- redis-cli ping
```

Check Redis memory usage:
```
kubectl exec -it $(kubectl get pods -l app=redis -n data -o jsonpath='{.items[0].metadata.name}') -n data -- redis-cli info memory
```

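Check cache hit rate. Redis does not report a hit rate directly; compare the `keyspace_hits` and `keyspace_misses` counters instead:
```
kubectl exec -it $(kubectl get pods -l app=redis -n data -o jsonpath='{.items[0].metadata.name}') -n data -- redis-cli info stats | grep -E 'keyspace_(hits|misses)'
```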

Clear cache if necessary:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl -X POST localhost:8080/actuator/caches/clearAll
```
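
To compare a live value against the `redis_hit_rate` thresholds, the hit rate can be derived from the `keyspace_hits` and `keyspace_misses` counters in `redis-cli info stats`. The sketch below is one way to do that; `redis_hit_rate` is a hypothetical function name used for illustration, and it expects the raw `INFO` output on stdin.

```shell
# redis_hit_rate
# Reads `redis-cli info stats` output on stdin and prints the keyspace
# hit rate as a percentage with one decimal, or "n/a" when there have
# been no lookups yet. Hypothetical helper for illustration.
redis_hit_rate() {
  awk -F: '
    { sub(/\r$/, "") }                        # INFO lines end in CRLF
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total == 0) print "n/a"
      else printf "%.1f\n", 100 * hits / total
    }'
}
```

Pipe the output of the `redis-cli info stats` command into `redis_hit_rate` to get a single number. Note that these counters accumulate from the last Redis restart (or `CONFIG RESETSTAT`), so the result is a lifetime average rather than a recent rate.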

Stock Management

For critical stock issues:

Identify products with low or no stock:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl localhost:8080/internal/api/inventory/low-stock
```

Check for stuck inventory reservations:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl localhost:8080/internal/api/inventory/stuck-reservations
```

Release expired reservations if necessary:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl -X POST localhost:8080/internal/api/inventory/release-expired-reservations
```

Manually update inventory levels for emergency corrections, substituting real values for `{productId}`, `WAREHOUSE_ID`, and the quantity:
```
curl -X PUT https://api.internal.flowmart.com/inventory/products/{productId}/stock \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"warehouseId": "WAREHOUSE_ID", "quantity": 100, "reason": "Manual correction"}'
```
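
Because a manual correction writes directly to stock levels, it can help to validate the inputs before sending the request. The following is a minimal sketch, assuming the request body has exactly the three fields shown above. `stock_update_payload` is a hypothetical helper, and the warehouse ID in the usage note is a made-up placeholder.

```shell
# stock_update_payload WAREHOUSE_ID QUANTITY REASON
# Prints the JSON body for the manual stock update, or fails if QUANTITY
# is not a non-negative integer. It performs no JSON escaping, so values
# containing double quotes or backslashes are rejected outright.
# Hypothetical helper for illustration.
stock_update_payload() {
  case "$2" in
    ''|*[!0-9]*|0?*)
      echo "quantity must be a non-negative integer without leading zeros" >&2
      return 1 ;;
  esac
  case "$1$3" in
    *'"'*|*'\'*)
      echo "double quotes and backslashes are not supported" >&2
      return 1 ;;
  esac
  printf '{"warehouseId": "%s", "quantity": %s, "reason": "%s"}\n' "$1" "$2" "$3"
}
```

The result can be passed to the request as `-d "$(stock_update_payload WH-EXAMPLE 100 'Manual correction')"`, where `WH-EXAMPLE` stands in for a real warehouse ID.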

Common Operational Tasks

Scaling the Service

To scale the service horizontally:
```
kubectl scale deployment inventory-service -n inventory --replicas=<number>
```
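
`kubectl scale` returns as soon as the new replica count is recorded, before the pods are ready. The sketch below pairs it with `kubectl rollout status` to wait for the rollout to settle. The function name `scale_and_wait`, the `KUBECTL` override variable, and the 120-second timeout are assumptions made for illustration.

```shell
# scale_and_wait REPLICAS
# Scales inventory-service and blocks until the rollout settles or the
# timeout expires. KUBECTL can be overridden (e.g. KUBECTL=echo) to
# preview the commands. Hypothetical helper; the 120s timeout is an
# arbitrary choice.
scale_and_wait() {
  case "$1" in
    ''|*[!0-9]*)
      echo "usage: scale_and_wait REPLICAS" >&2
      return 1 ;;
  esac
  "${KUBECTL:-kubectl}" scale deployment inventory-service -n inventory --replicas="$1" &&
    "${KUBECTL:-kubectl}" rollout status deployment inventory-service -n inventory --timeout=120s
}
```

For example, `scale_and_wait 5` scales to five replicas and returns non-zero if they are not ready within the timeout.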

Restarting the Service

To restart all pods:
```
kubectl rollout restart deployment inventory-service -n inventory
```

Database Maintenance

For routine database maintenance:

Run VACUUM ANALYZE to optimize tables:
```
kubectl exec -it $(kubectl get pods -l app=postgresql -n data -o jsonpath='{.items[0].metadata.name}') -n data -- psql -U postgres -c "VACUUM ANALYZE inventory_items;"
```

Update database statistics:
```
kubectl exec -it $(kubectl get pods -l app=postgresql -n data -o jsonpath='{.items[0].metadata.name}') -n data -- psql -U postgres -c "ANALYZE;"
```

Reconcile Inventory

To reconcile inventory with the warehouse management system:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl -X POST localhost:8080/internal/api/inventory/reconcile
```

Manually Trigger Restock Notifications

To trigger restock notifications for low stock items:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl -X POST localhost:8080/internal/api/inventory/trigger-restock-notifications
```

Recovery Procedures

Database Failure Recovery

If the PostgreSQL database becomes unavailable:

Verify the status of the PostgreSQL cluster:
```
kubectl get pods -l app=postgresql -n data
```

If the primary instance is down, check if automatic failover has occurred:
```
kubectl exec -it $(kubectl get pods -l app=postgresql-patroni -n data -o jsonpath='{.items[0].metadata.name}') -n data -- patronictl list
```

If automatic failover has not occurred, initiate manual failover:
```
kubectl exec -it $(kubectl get pods -l app=postgresql-patroni -n data -o jsonpath='{.items[0].metadata.name}') -n data -- patronictl failover
```

Once database availability is restored, validate the InventoryService functionality:
```
curl -X GET https://api.internal.flowmart.com/inventory/health
```
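
After a failover the service may take a short while to reconnect, so a single health check can fail even though recovery is under way. The sketch below retries a command a fixed number of times. `retry` is a hypothetical helper written for this runbook, not an existing tool, and the attempt count and delay in the usage note are arbitrary.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between tries. Returns non-zero if every attempt fails.
# Uses global variables prefixed with retry_ to stay POSIX-compatible.
# Hypothetical helper for illustration.
retry() {
  retry_max="$1"; retry_delay="$2"; shift 2
  retry_n=1
  until "$@"; do
    if [ "$retry_n" -ge "$retry_max" ]; then
      return 1
    fi
    retry_n=$((retry_n + 1))
    sleep "$retry_delay"
  done
  return 0
}
```

For example, `retry 10 5 curl -fsS https://api.internal.flowmart.com/inventory/health` polls the health endpoint for up to about 50 seconds. The `-f` flag makes `curl` exit non-zero on HTTP error responses, which is what lets `retry` detect an unhealthy service.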

Cache Failure Recovery

If the Redis cache becomes unavailable:

Verify Redis cluster status:
```
kubectl get pods -l app=redis -n data
```

If needed, restart the Redis cluster:
```
kubectl rollout restart statefulset redis -n data
```

The InventoryService will fall back to database queries when the cache is unavailable.

When the cache is restored, you can warm it up:
```
kubectl exec -it $(kubectl get pods -l app=inventory-service -n inventory -o jsonpath='{.items[0].metadata.name}') -n inventory -- curl -X POST localhost:8080/internal/api/inventory/warm-cache
```

Disaster Recovery

Complete Service Failure

In case of a complete service failure:

Initiate incident response by notifying the on-call team through PagerDuty.

Verify the deployment status:
```
kubectl describe deployment inventory-service -n inventory
```

If necessary, restore from a previous version:
```
kubectl rollout undo deployment inventory-service -n inventory
```

If the primary region is experiencing issues, fail over to the secondary region:
```
./scripts/dr-failover.sh inventory-service
```

Verify the service is functioning in the secondary region:
```
curl -X GET https://api-dr.internal.flowmart.com/inventory/health
```

Maintenance Tasks

Deploying New Versions

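To deploy a new version, set `$VERSION` to the image tag and update the deployment's image:
```
kubectl set image deployment/inventory-service -n inventory inventory-service=ecr.aws/flowmart/inventory-service:$VERSION
```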
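
If `$VERSION` is unset, the command above submits an image reference with an empty tag. A guarded wrapper avoids that and waits for the rollout to finish. This is a sketch only: `deploy_version`, the `KUBECTL` override variable, and the 300-second timeout are assumptions made for illustration.

```shell
# deploy_version VERSION
# Updates the inventory-service image tag and waits for the rollout.
# Refuses to run without a version so an empty tag is never applied.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the commands.
# Hypothetical helper; the 300s timeout is an arbitrary choice.
deploy_version() {
  if [ -z "$1" ]; then
    echo "usage: deploy_version VERSION" >&2
    return 1
  fi
  "${KUBECTL:-kubectl}" set image deployment/inventory-service -n inventory \
    "inventory-service=ecr.aws/flowmart/inventory-service:$1" &&
    "${KUBECTL:-kubectl}" rollout status deployment/inventory-service -n inventory --timeout=300s
}
```

If the rollout does not complete within the timeout, the function returns non-zero, at which point `kubectl rollout undo deployment inventory-service -n inventory` restores the previous version.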

Database Schema Updates

For database schema updates:

Notify stakeholders through the #maintenance Slack channel.

Set InventoryService to maintenance mode:
```
curl -X POST https://api.internal.flowmart.com/inventory/admin/maintenance \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"maintenanceMode": true, "message": "Database schema update"}'
```

Apply the database migrations:
```
kubectl apply -f inventory-flyway-job.yaml
```

Verify migration completion:
```
kubectl logs -l job-name=inventory-flyway-migration -n inventory
```

Turn off maintenance mode:
```
curl -X POST https://api.internal.flowmart.com/inventory/admin/maintenance \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"maintenanceMode": false}'
```
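
Reading the logs shows progress but does not by itself confirm that the migration finished. Before turning off maintenance mode, it may be safer to block until the Job reports completion. The sketch below assumes the Job is named `inventory-flyway-migration` (inferred from the `job-name` label used above) and runs in the `inventory` namespace. `wait_for_migration`, the `KUBECTL` override variable, and the default timeout are hypothetical.

```shell
# wait_for_migration [TIMEOUT]
# Blocks until the Flyway migration Job reports the Complete condition,
# or until TIMEOUT (default 600s) expires. If the Job fails, kubectl
# keeps waiting until the timeout rather than returning early.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
# Hypothetical helper for illustration.
wait_for_migration() {
  "${KUBECTL:-kubectl}" wait --for=condition=complete job/inventory-flyway-migration \
    -n inventory --timeout="${1:-600s}"
}
```

A non-zero exit status means the migration did not complete in time, in which case maintenance mode should stay on while the Job logs are inspected.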

Contact Information

Primary On-Call: Inventory Team (rotating schedule)
Secondary On-Call: Platform Team
Escalation Path: Inventory Team Lead > Engineering Manager > CTO

Slack Channels:

#inventory-support (primary support channel)
#inventory-alerts (automated alerts)
#incident-response (for major incidents)
