diff --git a/docs/docs/monitoring.md b/docs/docs/monitoring.md
index eaedb631d6..abce9f53c1 100644
--- a/docs/docs/monitoring.md
+++ b/docs/docs/monitoring.md
@@ -20,10 +20,17 @@ for all OpenTelemetry-related configurables.
 
 ## Prometheus
 
-OPA exposes an HTTP endpoint that can be used to collect performance metrics
+OPA exposes an HTTP endpoint at `/metrics` that can be used to collect performance metrics
 for all API calls. The Prometheus endpoint is enabled by default when you run
 OPA as a server.
 
+OPA provides two ways to access performance metrics:
+
+1. **System-wide metrics** via the `/metrics` Prometheus endpoint - Instance-level metrics across all OPA operations
+2. **Per-query metrics** via API responses with `?metrics=true` - Metrics for individual query executions
+
+These serve different purposes: system-wide metrics support instance monitoring and alerting, while per-query metrics support debugging and optimization of individual queries.
+
 You can enable metric collection from OPA with the following `prometheus.yml` config:
 
 ```yaml
@@ -86,6 +93,24 @@ When Prometheus is enabled in the status plugin (see [Configuration](./configura
 | last_success_bundle_request    | gauge       | Last successful bundle request in UNIX nanoseconds.    | STABLE |
 | bundle_loading_duration_ns     | histogram   | A histogram of duration for bundle loading.            | STABLE |
 
+## Available Metrics
+
+The Prometheus `/metrics` endpoint serves instance-level metrics with the following characteristics:
+
+- **URL**: `http://localhost:8181/metrics` (default configuration)
+- **Method**: HTTP GET
+- **Format**: Prometheus text format
+- **Contents**: Instance-level counters, timers, histograms, and Go runtime metrics
+- **Use case**: Monitoring dashboards, alerting, performance trends
+
+### Additional Resources
+
+- **Per-query metrics**: See [REST API Performance Metrics](./rest-api#performance-metrics) for debugging individual queries
+- **Policy performance**: See [Policy Performance](./policy-performance#performance-metrics) for optimization guidance
+- **Status API**: See [Status API](./management-status) for metrics reporting via status updates
+- **Decision logs**: See [Decision Logs](./management-decision-logs) for including metrics in decision logs
+- **CLI tools**: See [opa eval](./cli#eval) and [opa bench](./cli#bench) for command-line metric collection
+
 ## Health Checks
 
 OPA exposes a `/health` API endpoint that can be used to perform health checks.
diff --git a/docs/docs/policy-performance.md b/docs/docs/policy-performance.md
index 5875dd0c77..8b822fbe55 100644
--- a/docs/docs/policy-performance.md
+++ b/docs/docs/policy-performance.md
@@ -977,6 +977,66 @@ This feature can be enabled for `opa run`, `opa eval`, and `opa bench` by settin
 
 Users are recommended to do performance testing to determine the optimal configuration for their use case.
 
+## Performance Metrics
+
+OPA exposes metrics for policy evaluation performance. These are available through:
+
+- **System-wide metrics** at the `/metrics` Prometheus endpoint
+- **Per-query metrics** in individual API responses when `?metrics=true` is specified
+
+See [Monitoring](./monitoring#prometheus) for more details.
+
+### Common Built-in Function Metrics
+
+#### HTTP Built-ins
+
+`http.send` metrics help identify I/O bottlenecks:
+
+- `timer_rego_builtin_http_send_ns` - Total time spent in http.send calls
+- `counter_rego_builtin_http_send_interquery_cache_hits` - Inter-query cache hits
+- `counter_rego_builtin_http_send_network_requests` - Actual network requests made
+
+High cache hit ratios indicate effective caching and reduced network overhead.
+
+#### Regex Built-ins
+
+Regex operation metrics help optimize pattern matching:
+
+- `timer_rego_builtin_regex_interquery_ns` - Time spent in regex operations
+- `counter_rego_builtin_regex_interquery_cache_hits` - Regex pattern cache hits
+- `counter_rego_builtin_regex_interquery_value_cache_hits` - Regex value cache hits
+
+Effective regex caching improves performance when the same patterns are used repeatedly.
+
+### Core Query Metrics
+
+Basic query evaluation phases:
+
+- `timer_rego_query_parse_ns` - Time parsing the query string
+- `timer_rego_query_compile_ns` - Time compiling the query
+- `timer_rego_query_eval_ns` - Time executing the compiled query
+
+For complex policies, compilation time can dominate, so compare it against evaluation time before optimizing.
+
+### High-Level Metrics
+
+Server-level metrics for overall performance:
+
+- `timer_server_handler_ns` - Total request handler execution time
+- `counter_server_query_cache_hit` - Server-level query cache hits
+
+### Using Metrics for Optimization
+
+1. **Query phases**: Compare parse, compile, and eval times to identify bottlenecks
+2. **Cache effectiveness**: Low cache hit rates suggest tuning opportunities
+3. **I/O bottlenecks**: A high `http.send` network request count relative to cache hits may indicate ineffective caching
+4. **Pattern matching**: Monitor regex cache hits for frequently used patterns
+
+Access metrics via:
+- REST API: Add `?metrics=true` to policy evaluation requests
+- CLI: Use the `--metrics` flag with `opa eval` or `opa bench`
+- Prometheus: See [Monitoring](./monitoring#prometheus) for system-wide metrics
+
 ## Key Takeaways
 
 For high-performance use cases:
@@ -987,3 +1047,4 @@ For high-performance use cases:
 - Write your policies with indexed statements so that [rule-indexing](https://blog.openpolicyagent.org/optimizing-opa-rule-indexing-59f03f17caf3) is effective.
 - Use the profiler to help identify portions of the policy that would benefit the most from improved performance.
 - Use the benchmark tools to help get real world timing data and detect policy performance changes.
+- Monitor performance metrics to track optimization impact and identify bottlenecks.
diff --git a/docs/docs/policy-reference/builtins/glob.mdx b/docs/docs/policy-reference/builtins/glob.mdx
index 799665e91c..3d17b8f496 100644
--- a/docs/docs/policy-reference/builtins/glob.mdx
+++ b/docs/docs/policy-reference/builtins/glob.mdx
@@ -27,3 +27,13 @@ The following table shows examples of how `glob.match` works:
 | `output := glob.match("{cat,bat,[fr]at}", [], "bat")`            | `true`   | A glob with pattern-alternatives matchers.    |
 | `output := glob.match("{cat,bat,[fr]at}", [], "rat")`            | `true`   | A glob with pattern-alternatives matchers.    |
 | `output := glob.match("{cat,bat,[fr]at}", [], "at")`             | `false`  | A glob with pattern-alternatives matchers.    |
+
+## Performance Metrics
+
+`glob.match` operations report the following per-query metrics, returned when `?metrics=true` is specified in API requests:
+
+| Metric | Description |
+| ------ | ----------- |
+| `counter_rego_builtin_glob_interquery_value_cache_hits` | Number of inter-query cache hits for compiled glob patterns |
+
+Effective glob pattern caching improves performance when the same patterns are used repeatedly across queries. High cache hit ratios indicate that glob compilation overhead is being minimized through caching.
diff --git a/docs/docs/policy-reference/builtins/http.mdx b/docs/docs/policy-reference/builtins/http.mdx
index b09cf3f2b9..aacb475b88 100644
--- a/docs/docs/policy-reference/builtins/http.mdx
+++ b/docs/docs/policy-reference/builtins/http.mdx
@@ -113,3 +113,15 @@ The table below shows examples of calling `http.send`:
 | Files containing TLS material                 | `http.send({"method": "get", "url": "https://127.0.0.1:65331", "tls_ca_cert_file": "testdata/ca.pem", "tls_client_cert_file": "testdata/client-cert.pem", "tls_client_key_file": "testdata/client-key.pem"})`     |
 | Environment variables containing TLS material | `http.send({"method": "get", "url": "https://127.0.0.1:65360", "tls_ca_cert_env_variable": "CLIENT_CA_ENV", "tls_client_cert_env_variable": "CLIENT_CERT_ENV", "tls_client_key_env_variable": "CLIENT_KEY_ENV"})` |
 | Unix Socket URL Format                        | `http.send({"method": "get", "url": "unix://localhost/?socket=%F2path%F2file.socket"})`                                                                                                                           |
+
+## Performance Metrics
+
+`http.send` operations report the following per-query metrics, returned when `?metrics=true` is specified in API requests:
+
+| Metric | Description |
+| ------ | ----------- |
+| `timer_rego_builtin_http_send_ns` | Total time spent in `http.send` calls during query evaluation |
+| `counter_rego_builtin_http_send_interquery_cache_hits` | Number of inter-query cache hits for `http.send` requests |
+| `counter_rego_builtin_http_send_network_requests` | Number of actual network requests made by `http.send` |
+
+High cache hit ratios indicate effective caching and reduced network overhead. These metrics help identify I/O bottlenecks in policies that make external HTTP requests.
diff --git a/docs/docs/policy-reference/builtins/regex.mdx b/docs/docs/policy-reference/builtins/regex.mdx
index eb3f99e2d2..35a1400324 100644
--- a/docs/docs/policy-reference/builtins/regex.mdx
+++ b/docs/docs/policy-reference/builtins/regex.mdx
@@ -110,3 +110,13 @@ overlap. This can be useful when using patterns to define permissions or access
 rules. The function returns `true` if the two patterns overlap and `false` otherwise.
 
 <PlaygroundExample dir={require.context('../_examples/regex/globs_match/role_patterns')} />
+
+## Performance Metrics
+
+Regex operations report the following per-query metrics, returned when `?metrics=true` is specified in API requests:
+
+| Metric | Description |
+| ------ | ----------- |
+| `counter_rego_builtin_regex_interquery_value_cache_hits` | Number of inter-query cache hits for compiled regex patterns |
+
+Effective regex caching improves performance when the same patterns are used repeatedly. High cache hit ratios indicate that regex compilation overhead is being minimized through caching.
diff --git a/docs/docs/rest-api.md b/docs/docs/rest-api.md
index d9eace22d1..793372a7b9 100644
--- a/docs/docs/rest-api.md
+++ b/docs/docs/rest-api.md
@@ -2333,9 +2333,12 @@ Query instrumentation can help diagnose performance problems, however, it can
 add significant overhead to query evaluation. We recommend leaving query
 instrumentation off unless you are debugging a performance problem.
 
-When instrumentation is enabled there are several additional performance metrics
-for the compilation stages. They follow the format of `timer_compile_stage_*_ns`
-and `timer_query_compile_stage_*_ns` for the query and module compilation stages.
+When query instrumentation is enabled (`instrument=true`), the following additional evaluation metrics are included:
+- `timer_eval_op_*`: Evaluation operation timers (e.g., `timer_eval_op_plug_ns`, `timer_eval_op_resolve_ns`)
+- `histogram_eval_op_*`: Histograms of evaluation operation time distributions
+- `timer_rego_builtin_*`: Built-in function execution times
+- `counter_rego_builtin_*`: Built-in function call counts and cache hits
+- `timer_compile_stage_*_ns` and `timer_query_compile_stage_*_ns`: Timers for the module and query compilation stages
 
 ## Provenance