To operate Prisme.ai efficiently in production, it’s essential to monitor service health, resource usage, and error rates. This guide explains how to install and configure Prometheus and Grafana using Operators in a Kubernetes environment.


Why Use Operators?

Using Kubernetes Operators simplifies lifecycle management of complex systems like Prometheus and Grafana:

  • Automated installation and upgrades
  • Simplified configuration
  • Native CRDs for monitoring targets, dashboards, and alerts

Step-by-Step Installation

1. Install Prometheus Operator

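You can install the Prometheus Operator via Helm. The kube-prometheus-stack chart bundles the Operator together with Prometheus, Alertmanager, and Grafana: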

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install kube-prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace

2. Expose Grafana Dashboard

Expose Grafana using an Ingress or port-forward:

kubectl port-forward svc/kube-prometheus-grafana 3000:80 -n monitoring

Then access it at http://localhost:3000
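
For access beyond a local session, the Ingress option can be configured through the chart's Grafana values. The following is a minimal sketch, assuming an NGINX ingress controller is already installed; the file name and the hostname grafana.example.com are placeholders, not values from this guide.

```yaml
# values-grafana-ingress.yaml (hypothetical file name)
grafana:
  ingress:
    enabled: true
    ingressClassName: nginx      # assumption: an NGINX ingress controller is installed
    hosts:
      - grafana.example.com      # placeholder hostname, replace with your own
    path: /
```

The values can be applied to the existing release with helm upgrade kube-prometheus prometheus-community/kube-prometheus-stack -n monitoring -f values-grafana-ingress.yaml.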

Default credentials:

  • Username: admin
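  • Password: prom-operator unless overridden (see grafana.adminPassword in the values file); the effective value is stored in the kube-prometheus-grafana Secret under the admin-password key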

3. Configure Prometheus Scrape Targets

Prisme.ai services expose Prometheus-compatible metrics endpoints (e.g. /metrics). To scrape them, define a ServiceMonitor:

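apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prisme-api
  labels:
    # With default chart values, Prometheus only selects monitors labeled with the Helm release name
    release: kube-prometheus
spec:
  # Selects Services (not Pods) that carry this label
  selector:
    matchLabels:
      app: api-gateway
  # Namespace(s) in which the target Services live
  namespaceSelector:
    matchNames:
    - prisme-ai
  endpoints:
  # Name of the Service port that serves the metrics endpoint
  - port: http
    path: /metrics
    interval: 30s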
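
A ServiceMonitor matches Services by label and refers to the scrape port by name, so the target Service has to provide both. The sketch below shows a Service that the ServiceMonitor above would pick up; the Service name and the port numbers are illustrative assumptions, not values taken from the Prisme.ai charts.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-gateway              # hypothetical Service name
  namespace: prisme-ai           # listed in namespaceSelector.matchNames
  labels:
    app: api-gateway             # matched by spec.selector.matchLabels
spec:
  selector:
    app: api-gateway             # assumption: the Pods carry the same label
  ports:
  - name: http                   # matched by endpoints[].port (a port name, not a number)
    port: 80                     # assumption: illustrative port numbers
    targetPort: 3000
```

If the target does not appear in Prometheus, a mismatch in the port name or in the Service labels is a likely cause.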

4. Import Dashboards

Grafana supports importing dashboards via the UI or ConfigMaps.
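
For the ConfigMap route, the Grafana deployment in kube-prometheus-stack runs a sidecar that loads dashboards from labeled ConfigMaps. The sketch below assumes the chart's default sidecar label (grafana_dashboard); the ConfigMap name and the dashboard content are placeholders.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prisme-api-dashboard       # hypothetical name
  namespace: monitoring
  labels:
    grafana_dashboard: "1"         # assumption: default label watched by the dashboard sidecar
data:
  # Each key becomes one dashboard file; paste the exported dashboard JSON here
  prisme-api.json: |
    {
      "title": "Prisme.ai API (example)",
      "uid": "prisme-api-example",
      "panels": []
    }
```

In practice, the JSON body would be a dashboard exported from the Grafana UI or downloaded from the community catalog.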

Use community dashboards for:

  • Kubernetes cluster monitoring
  • Pod resource usage
  • API Gateway latency & error rates
  • Redis, MongoDB, and Elasticsearch health

Alerts and Notifications

Define alert rules as PrometheusRule resources and route firing alerts through Alertmanager to notification channels such as email, Slack, or webhooks.
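
With the Prometheus Operator, alert rules are declared as PrometheusRule resources. The following is a minimal sketch, assuming the release name kube-prometheus and the prisme-ai namespace used earlier in this guide; the rule name, alert name, threshold, and severity are illustrative.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: prisme-availability        # hypothetical name
  namespace: monitoring
  labels:
    release: kube-prometheus       # with default chart values, lets Prometheus select this rule
spec:
  groups:
  - name: prisme.availability
    rules:
    - alert: PrismeTargetDown      # hypothetical alert name
      # "up" is 0 when Prometheus fails to scrape a target
      expr: up{namespace="prisme-ai"} == 0
      for: 5m                      # assumption: tolerate short restarts before firing
      labels:
        severity: critical
      annotations:
        summary: "Target {{ $labels.job }} in prisme-ai is down"
        description: "{{ $labels.instance }} has failed scrapes for 5 minutes."
```

Alerts that fire are sent to the Alertmanager deployed by the chart, where receivers for the chosen notification channels are configured.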


Best Practices

Namespace Separation

  • Run the monitoring stack in a dedicated namespace (monitoring)
  • Use RBAC to isolate metrics access

Retention & Storage

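  • Configure Prometheus retention, e.g. 15 days (with the Operator, set the retention field of the Prometheus resource, which is passed on as --storage.tsdb.retention.time=15d)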
  • Mount persistent volumes for metric storage
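
With kube-prometheus-stack, both settings are usually expressed as Helm values rather than raw Prometheus flags. The sketch below assumes a 50Gi volume, which is an illustrative size and should be adjusted to the actual metric volume.

```yaml
prometheus:
  prometheusSpec:
    retention: 15d                 # rendered as --storage.tsdb.retention.time=15d
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi        # assumption: illustrative size
          # storageClassName: standard   # assumption: uncomment and set to a class available in your cluster
```

Without a storageSpec, Prometheus writes to an emptyDir volume and loses its data whenever the Pod is rescheduled.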

Service Discovery

  • Use ServiceMonitor and PodMonitor for automatic discovery
  • Label all Prisme.ai services consistently (e.g., app: api-gateway)
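
For workloads that expose metrics but have no Service in front of them, a PodMonitor selects Pods directly. The following is a minimal sketch; the app: prisme-worker label and the metrics port name are hypothetical and need to match the actual Pod spec.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: prisme-workers             # hypothetical name
  namespace: monitoring
  labels:
    release: kube-prometheus       # with default chart values, lets Prometheus select this monitor
spec:
  selector:
    matchLabels:
      app: prisme-worker           # hypothetical Pod label
  namespaceSelector:
    matchNames:
    - prisme-ai
  podMetricsEndpoints:
  - port: metrics                  # hypothetical container port name
    path: /metrics
    interval: 30s
```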

Grafana Security

  • Change the default admin password
  • Enable SSO integration (e.g., OAuth, LDAP) if required
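
Both points can be handled through the chart's Grafana values. The sketch below is one possible setup under several assumptions: the Secrets grafana-admin and grafana-oauth are created beforehand, the identity provider URLs are placeholders, and the provider speaks generic OAuth2/OIDC.

```yaml
grafana:
  admin:
    existingSecret: grafana-admin          # hypothetical Secret holding the admin credentials
    userKey: admin-user
    passwordKey: admin-password
  envFromSecret: grafana-oauth             # hypothetical Secret providing OAUTH_CLIENT_SECRET
  grafana.ini:
    server:
      root_url: https://grafana.example.com          # placeholder, must match the public URL
    auth.generic_oauth:
      enabled: true
      name: SSO
      client_id: grafana                              # assumption: client registered at the provider
      client_secret: $__env{OAUTH_CLIENT_SECRET}      # read from the environment, not stored in values
      scopes: openid profile email
      auth_url: https://idp.example.com/authorize     # placeholder endpoints
      token_url: https://idp.example.com/token
      api_url: https://idp.example.com/userinfo
```

Keeping credentials in Secrets instead of the values file avoids committing them to version control.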

Next Steps