AI Monitoring & Observability

Tools for monitoring AI model performance and costs

18 tools

Arize Phoenix

Open Source

Open-source LLM observability and evaluation. Trace visualization, embedding analysis, and evals.

Braintrust

Freemium

Enterprise AI evaluation and observability platform. Prompt playground, scoring, and dataset management.

Datadog LLM Observability

Paid

LLM monitoring within the Datadog ecosystem. Trace prompts, tokens, and latency alongside infrastructure metrics.

DeepEval

Open Source

Open-source LLM evaluation framework. 14+ metrics including hallucination, relevancy, and bias detection.

Galileo

Freemium

LLM evaluation and hallucination detection platform. Automated metrics for RAG quality and safety.

Helicone

Freemium

Open-source LLM observability proxy. One-line integration, request logging, caching, and rate limiting.

Humanloop

Freemium

Prompt management and evaluation platform. Version prompts, run evals, and optimize LLM performance.

Langfuse

Freemium

Open-source LLM observability. Traces, metrics, prompt management, and evaluation. Self-hostable.

LangSmith

Freemium

LangChain's observability platform. Trace, debug, and evaluate LLM applications with detailed run analytics.

OpenLLMetry

Open Source

Open-source observability for LLMs based on OpenTelemetry. Works with Datadog, Grafana, and Honeycomb.

Opik (Comet)

Open Source

Open-source LLM evaluation and tracing platform. Track experiments, evaluate outputs, and debug prompts.

Portkey

Freemium

AI gateway with observability. Load balancing, fallbacks, caching, and guardrails for LLM APIs.

PostHog

Freemium

Open-source product analytics with AI feature tracking. Session replay, feature flags, and A/B testing.

PromptLayer

Freemium

Prompt engineering platform. Version control, A/B testing, and analytics for prompts across providers.

RAGAS

Open Source

Evaluation framework for RAG pipelines. Measures faithfulness, relevancy, and context precision.

Sentry

Freemium

Error tracking with AI/LLM monitoring support. Track exceptions, performance, and LLM-specific errors.

Traceloop

Freemium

LLM monitoring built on OpenTelemetry. Auto-instrumentation for LangChain, LlamaIndex, and the OpenAI SDK.

Weights & Biases

Freemium

ML experiment tracking and model monitoring. LLM-specific features for prompt tracking and evaluation.
