A Grafana panel plugin for visualizing distributed traces with incremental loading capabilities. This plugin provides an enhanced trace viewing experience with interactive help and user-friendly error handling.
- Incremental Trace Loading: Loads traces progressively for better performance
- Interactive Help System: Contextual help modals for common issues
- User-Friendly Error Handling: Clear guidance for panel size and data issues
- Advanced TraceQL Support: Complex query filtering and search capabilities
- Real-time Updates: Live trace data updates
- Responsive Design: Adapts to different panel sizes
For detailed usage instructions and troubleshooting, see HELP.md.
App plugins let you create a custom out-of-the-box monitoring experience with custom pages, nested data sources and panel plugins.
The plugin supports build-time configuration through environment variables. Create a .env file in the root directory and set the following variables:
- SUPPORTS_CHILD_COUNT: Set to '1' to enable child count support, '0' to disable it. This affects whether the plugin will use childCount attributes from the backend. Default is 'false'.
Example .env file:
SUPPORTS_CHILD_COUNT=0
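As a rough illustration of how such a build-time flag could be interpreted, the sketch below treats only the literal value '1' as enabled. The helper name `supportsChildCount` is hypothetical and is not taken from the plugin's source; the real build may read the variable differently.

```typescript
// Hypothetical helper (not from the plugin source): interprets the raw value
// of the SUPPORTS_CHILD_COUNT variable as a boolean feature flag.
function supportsChildCount(raw: string | undefined): boolean {
  // Only the literal string "1" enables the feature. "0", "false", an empty
  // string, or an unset variable all leave it disabled.
  return raw?.trim() === "1";
}

// In a bundler config the argument would typically be
// process.env.SUPPORTS_CHILD_COUNT; literals are used here to keep the
// example self-contained.
for (const value of ["1", "0", undefined]) {
  console.log(`${String(value)} -> ${supportsChildCount(value) ? "enabled" : "disabled"}`);
}
```

Treating every value other than '1' as disabled keeps the default safe when the .env file is missing.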
- Install dependencies
  bun install
- Build plugin in development mode and run in watch mode
  bun run dev
- Build plugin in production mode
  bun run build
  To build with child count support enabled:
  bun run build:with-child-count
- Run the tests (using Jest)
  # Runs the tests and watches for changes, requires git init first
  bun run test
  # Exits after running all the tests
  bun run test:ci
- Spin up a Grafana instance and run the plugin inside it (using Docker)
  bun run server
- Run the E2E tests (using Playwright)
  # Spins up a Grafana instance first that we test against
  bun run server
  # If you wish to start a certain Grafana version. If not specified, latest is used by default
  GRAFANA_VERSION=11.3.0 bun run server
  # Starts the tests
  bun run e2e
- Run the linter
  bun run lint
  # or
  bun run lint:fix
The CHANGELOG.md is considered the source of truth.
New changes to this project should be reflected in this file, preferably when creating PRs.
Use the Keep a Changelog format.
If there is no pending release, add a new ## [Unreleased] section.
When you are ready to release, change the ## [Unreleased] header into the next version.
Make sure to also bump the version in package.json, and commit these changes in a PR to the main branch.
On the main branch, scripts/check-version.js will run and create a new tag if a release is found in the changelog that does not yet exist as a GitHub release.
If scripts/check-version.js creates a new tag, it will trigger the release.yml job.
This workflow builds two versions of the plugin: a signed and an unsigned one.
It then creates a new GitHub release with both archives attached as artifacts.
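The release detection described above can be pictured roughly as follows. This is a sketch under stated assumptions, not the contents of scripts/check-version.js: the function names, the `v` tag prefix, and the sample changelog are all made up for illustration.

```typescript
// Matches Keep a Changelog release headers such as "## [1.2.3] - 2024-01-31".
// "## [Unreleased]" does not match because it contains no version number.
const RELEASE_HEADER = /^## \[(\d+\.\d+\.\d+)\]/;

// Returns the first (newest) released version found in the changelog text.
function latestChangelogVersion(changelog: string): string | undefined {
  for (const line of changelog.split(/\r?\n/)) {
    const match = RELEASE_HEADER.exec(line);
    if (match) {
      return match[1];
    }
  }
  return undefined;
}

// Returns the tag that should be created, or undefined when the newest
// changelog version already exists as a release.
// Assumption: tags are named "v<version>"; the real script may differ.
function tagToCreate(changelog: string, existingReleases: string[]): string | undefined {
  const version = latestChangelogVersion(changelog);
  if (version === undefined) {
    return undefined;
  }
  const tag = `v${version}`;
  return existingReleases.includes(tag) ? undefined : tag;
}

// Made-up sample data for illustration only.
const sampleChangelog = [
  "# Changelog",
  "",
  "## [0.2.0] - 2025-01-15",
  "- Example entry",
  "",
  "## [0.1.0] - 2024-12-01",
  "- Example entry",
].join("\n");

console.log(tagToCreate(sampleChangelog, ["v0.1.0"]));
console.log(tagToCreate(sampleChangelog, ["v0.1.0", "v0.2.0"]));
```

With the sample data, the first call yields a tag for 0.2.0 and the second yields nothing, because that release already exists.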
When distributing a Grafana plugin, either within the community or privately, the plugin must be signed so the Grafana application can verify its authenticity. This can be done with the @grafana/sign-plugin package.
Note: It's not necessary to sign a plugin during development. The Docker development environment that is scaffolded with @grafana/create-plugin caters for running the plugin without a signature.
Before signing a plugin please read the Grafana plugin publishing and signing criteria documentation carefully.
@grafana/create-plugin has added the necessary commands and workflows to make signing and distributing a plugin via the Grafana plugins catalog as straightforward as possible.
Before signing a plugin for the first time please consult the Grafana plugin signature levels documentation to understand the differences between the types of signature level.
- Create a Grafana Cloud account.
- Make sure that the first part of the plugin ID matches the slug of your Grafana Cloud account.
  - You can find the plugin ID in the plugin.json file inside your plugin directory. For example, if your account slug is acmecorp, you need to prefix the plugin ID with acmecorp-.
- Create a Grafana Cloud API key with the PluginPublisher role.
- Keep a record of this API key as it will be required for signing a plugin.
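The plugin ID rule from the steps above can be checked mechanically before attempting to sign. The sketch below is illustrative only: `pluginIdMatchesSlug` is a hypothetical helper, and the plugin ID `acmecorp-example-app` is invented to match the acmecorp example.

```typescript
// Hypothetical pre-signing check: the plugin ID from plugin.json must start
// with "<account slug>-" and have something after the prefix.
function pluginIdMatchesSlug(pluginId: string, accountSlug: string): boolean {
  const prefix = `${accountSlug}-`;
  return pluginId.startsWith(prefix) && pluginId.length > prefix.length;
}

// Invented IDs for illustration.
console.log(pluginIdMatchesSlug("acmecorp-example-app", "acmecorp"));
console.log(pluginIdMatchesSlug("othercorp-example-app", "acmecorp"));
```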
If the plugin is using the GitHub Actions supplied with @grafana/create-plugin, signing a plugin is included out of the box. The release workflow can prepare everything to make submitting your plugin to Grafana as easy as possible. Before the plugin can be signed, however, a secret needs to be added to the GitHub repository.
- Please navigate to "settings > secrets > actions" within your repo to create secrets.
- Click "New repository secret"
- Name the secret "GRAFANA_API_KEY"
- Paste your Grafana Cloud API key in the Secret field
- Click "Add secret"
To trigger the workflow we need to push a version tag to GitHub. This can be achieved with the following steps:
- Run npm version <major|minor|patch>
- Run git push origin main --follow-tags
When running bun run server, Docker compose will also spin up an OpenSearch instance.
First, run the index setup script:
bun run scripts/setup-opensearch.js
Then, to create some sample trace data, run:
bun run scripts/create-depth-trace.js
bun run scripts/create-large-trace.js
Feel free to tweak these scripts to your local needs.
This should already have been done when launching the grafana Docker service.
If not:
- Log in as admin, password admin
- Install the OpenSearch plugin at http://localhost:3000/plugins
- Add a new data source:
  - URL: http://opensearch:9200
  - Index name: view the available indices at http://localhost:9200/_cat/indices?v ; it is most likely going to be ss4o_traces-default-namespace
We are using a contract first approach for the Go resource endpoints. api.yml is the source of truth, and you can generate client & server code via:
bun run generate-api
This project uses Tailwind CSS for styling. Note that Tailwind cannot detect dynamically constructed class names, so we use inline style tags for truly dynamic values. For more details, see the Tailwind CSS documentation on dynamic class names.
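To make the Tailwind constraint concrete, here is a minimal sketch of computing a dynamic value as an inline style instead of a class name. The `indentStyle` helper, the 16px step, and the idea of indenting rows by tree depth are assumptions for illustration, not code from this plugin.

```typescript
// Hypothetical helper: left padding for a row at a given tree depth.
// A class such as `pl-[${depth * 16}px]` would not be generated, because
// Tailwind scans source files for complete class names at build time.
function indentStyle(depth: number, stepPx = 16): { paddingLeft: string } {
  return { paddingLeft: `${depth * stepPx}px` };
}

// Static classes stay in className; only the computed value goes into style:
//   <div className="flex items-center" style={indentStyle(depth)} />
const style = indentStyle(3);
console.log(style.paddingLeft); // "48px"
```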
Run bun run package to create a plugin zip archive.
# Install via the grafana cli
./bin/grafana cli -pluginUrl ../gresearch-grafanaincrementaltraceviewer-app-0.1.0.zip plugins install "gresearch-grafanaincrementaltraceviewer-app"
Afterward you need to start Grafana. If you are using Docker, just restart the container. Once that is done, enable the plugin via the UI.
- If the search endpoint returns duplicate spanIDs, we do not handle them.
- If a single root node has a very large number of children, the plugin will attempt to load all children when it mounts.
In production at G-Research, we target a Tempo-compatible API endpoint. The API is Tempo-compatible but uses a custom implementation and a different datastore.
Minor differences:
- In production, the search endpoint returns all span attributes, even when they were not requested in TraceQL. Against Grafana Tempo, opening span details requires the client to perform two requests to obtain all span attributes:
  - Retrieve all tags via /search/tags.
  - Query the span and use the tags in a select(...).
  The Grafana Tempo API only returns attributes which are part of the | select(...) query. In production, the server already has these attributes and we do not fetch them again.
- childCount is part of the TraceQL spec but is not implemented by Grafana Tempo. (comment) This field is supported by the production API.
- In TraceQL, nestedSetParent = -1 is an undocumented feature (but part of the spec) used to find root nodes. Ideally, each trace has a single root node; however, when a trace is still in progress, that root node might not yet exist (see partial application spans). Our production server detects parentIds that do not yet exist in a trace and treats nodes referencing these ghost parents as root nodes. This impacts performance, but the processing occurs server-side.
- Resource attributes that come from the server are prefixed with "resource.". Since we have all attributes available, we need a way to tell which ones belong to the resource and which belong to the span. Tempo is a bit odd: if you request select(resource.serviceAttributeName), it will appear on the span as serviceAttributeName, but because you requested it in the select you can trace it back to a resource attribute. In production, however, we receive all values without requesting them, so the "resource." prefix is necessary to identify which attributes are resource attributes.
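The prefix convention in the last item can be sketched as a small partitioning step. The type and function names below are hypothetical, and the attribute keys are invented sample data; the plugin's actual handling may differ.

```typescript
type Attributes = Record<string, string | number | boolean>;

const RESOURCE_PREFIX = "resource.";

// Hypothetical helper: splits a flat attribute map into resource-level and
// span-level attributes based on the "resource." key prefix. The prefix is
// stripped from the resource keys.
function splitAttributes(all: Attributes): { resource: Attributes; span: Attributes } {
  const resource: Attributes = {};
  const span: Attributes = {};
  for (const [key, value] of Object.entries(all)) {
    if (key.startsWith(RESOURCE_PREFIX)) {
      resource[key.slice(RESOURCE_PREFIX.length)] = value;
    } else {
      span[key] = value;
    }
  }
  return { resource, span };
}

// Invented sample attributes for illustration.
const parts = splitAttributes({
  "resource.service.name": "checkout",
  "http.method": "GET",
  "http.status_code": 200,
});
console.log(parts.resource); // { "service.name": "checkout" }
console.log(parts.span); // { "http.method": "GET", "http.status_code": 200 }
```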
To differentiate between the Grafana Tempo API and the G-Research-flavoured Tempo API, the plugin checks the SUPPORTS_CHILD_COUNT environment variable.
Building with SUPPORTS_CHILD_COUNT=1 results in the runtime behavior described above.
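The ghost-parent rule mentioned in the list of differences (spans whose parent does not yet exist in the trace are treated as roots) can be illustrated as follows. In production this logic runs server-side; the sketch below only shows the idea on an in-memory array, with hypothetical names and invented sample spans.

```typescript
interface SpanNode {
  spanId: string;
  parentId?: string;
}

// Hypothetical helper: a span counts as a root when it has no parent, or when
// its parentId refers to a span that is not (yet) part of the trace.
function findRootSpans<T extends SpanNode>(spans: T[]): T[] {
  const knownIds = new Set(spans.map((s) => s.spanId));
  return spans.filter((s) => !s.parentId || !knownIds.has(s.parentId));
}

// Invented sample: "c" references a parent that has not arrived yet.
const roots = findRootSpans([
  { spanId: "a" },
  { spanId: "b", parentId: "a" },
  { spanId: "c", parentId: "not-yet-received" },
]);
console.log(roots.map((s) => s.spanId)); // ["a", "c"]
```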
We run end-to-end tests using Playwright and @grafana/plugin-e2e.
There are two ways to run the tests: one setup for SUPPORTS_CHILD_COUNT=0 and one for SUPPORTS_CHILD_COUNT=1.
In both cases, we rely on a provisioned Docker compose setup.
Playwright requires Chromium as a dependency. Install it with:
bunx playwright install
Why is this not part of our package.json?
Chromium cannot be installed in our production environment, so we do not include it as a required dependency in package.json.
Run bun run server to start the regular developer setup.
Here we target the Grafana Tempo API as configured in ./docker-compose.yaml.
Run bun run build to build a bundle without SUPPORTS_CHILD_COUNT.
Next, we need to provision sample data to our Tempo store. Run
bun run scripts/e2e-tempo-trace.js
The sample data is based on the moon landing and should not be altered unless you are working on e2e tests.
Afterwards all pieces are in place to run the e2e tests:
bun run e2e
To simulate the production API, we have constructed a different Docker setup in local-tempo-docker-compose.yml.
There we also have a Tempo service, so from Grafana's point of view nothing will have changed.
Run bun run server:local to start our alternative compose.
The Tempo service in Docker is a proxy script that will forward requests from port 3200 to a localhost server.
In production, this would be the .NET side of things. For our local setup, we can run:
bun run tests/test-api.ts
Next, build our plugin using SUPPORTS_CHILD_COUNT=1 via:
bun run build:with-child-count
Afterwards, you should be able to run the tests using:
bun run e2e
scripts/e2e-tempo-trace.js contains a test scenario.
To ensure we use the same data when running with SUPPORTS_CHILD_COUNT=1, you can extract the last trace to tests/test-trace.json via:
bun run scripts/extract-trace.ts
This of course assumes you are running against the regular Docker compose setup and a real Tempo instance.