By Chia Li Yun．Jan 1, 2023
A script to pinpoint where your metrics are used before you make changes to them.
Datadog is a monitoring and analytics tool that enterprises use to keep track of their software performance. It provides an extensive list of integrated features (e.g. logs, metrics, dashboards and monitors) to support observability.
In this article, we will focus on Datadog metrics. There are 2 ways one could generate a metric:
We often create new metrics to keep track of certain behaviours as we roll out new changes to production. While they are very convenient and helpful in providing insights into the application state or customer behaviour, it’s important to stay mindful of the cost they incur.
Occasionally, engineers may want to do some housekeeping, including removing metrics that are no longer in use. However, even if the team has documented its metrics usage, one cannot be 100% sure that a metric is not used by other teams. Given heavy usage of Datadog (most likely the case, since the organization has already paid for the service 😅), it’s almost impossible to check all the dashboards and monitors manually.
I have created a Python script that helps you identify whether a metric is being used anywhere in dashboards or monitors. It uses the Datadog HTTP API.
Pre-requisite
- install python3
- install the requests library (pip install requests)
- create an API Key and an Application Key, which are required to make API requests (you may find them by clicking on your profile -> organization settings)
How to use the script?
After you have copy-pasted the script, you will need to assign the actual values to the following variables.
baseUrl: the application host URL, which depends on the region you are using (eu = https://app.datadoghq.eu/, us1 = app.datadoghq.com/, us3 = us3.datadoghq.com/, us5 = us5.datadoghq.com/, us1-fed = app.ddog-gov.com)
ddApplicationKey: from step 3 of the pre-requisite section
logGeneratedMetrics: replace with the list of metrics that you want to detect
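As a rough illustration of how these variables might be wired up, the sketch below normalises the host and builds the authentication headers. The variable name ddApiKey, the second metric name and both helper functions are assumptions made for this example and are not taken from the script. DD-API-KEY and DD-APPLICATION-KEY are the header names the Datadog HTTP API expects.

```python
# Placeholder values: replace them with your own before running anything.
baseUrl = "app.datadoghq.com/"  # host for your region (us1 here)
ddApiKey = "<your-api-key>"  # assumed variable name, not from the article
ddApplicationKey = "<your-application-key>"
logGeneratedMetrics = ["metric.name.1", "metric.name.2"]


def normalize_base_url(base_url):
    """Add https:// when the scheme is missing and drop any trailing slash."""
    url = base_url.strip()
    if not url.startswith(("http://", "https://")):
        url = "https://" + url
    return url.rstrip("/")


def build_headers(api_key, application_key):
    """Return the authentication headers for Datadog API requests."""
    return {
        "DD-API-KEY": api_key,
        "DD-APPLICATION-KEY": application_key,
        "Accept": "application/json",
    }
```

Normalising the host once means every URL built later can simply append a path such as /monitors/1234.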
You may run the script directly (e.g. python3 checkUsage.py) or simply redirect the output into a file to read later (e.g. python3 checkUsage.py > checkUsageOutput).
Do note that the runtime depends on how many Datadog resources your organisation has.
What is in the script?
The script has 2 main functions:
This is a straightforward method that loops through all the monitors and, for each monitor, checks whether any of the metrics appears in it. If one is found, it prints a log (e.g. found! metrics (metric.name.1) is used in baseUrl/monitors/1234).
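The monitor check could look roughly like the sketch below. It is not the article's script: the function names are hypothetical, and it assumes the monitor list comes from GET /api/v1/monitor on the same host as baseUrl, with each monitor carrying an id field.

```python
import json


def fetch_monitors(base_url, headers):
    """Fetch all monitors; assumes the endpoint returns a JSON list."""
    import requests  # imported lazily so the matching logic works without it

    response = requests.get(f"{base_url}/api/v1/monitor", headers=headers, timeout=30)
    response.raise_for_status()
    return response.json()


def find_metrics_in_monitors(monitors, metrics, base_url):
    """Return (metric, monitor URL) pairs for every metric found in a monitor."""
    hits = []
    for monitor in monitors:
        # Search the serialised monitor so the query and message are both covered.
        text = json.dumps(monitor)
        for metric in metrics:
            if metric in text:
                url = f"{base_url}/monitors/{monitor['id']}"
                print(f"found! metrics ({metric}) is used in {url}")
                hits.append((metric, url))
    return hits
```

This is a plain substring match, so metric.name.1 would also match metric.name.10. Treat a hit as a prompt to look at the monitor rather than as proof.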
Similarly, it runs a for loop to check through all the dashboards. However, this takes significantly longer because the /dashboard endpoint only returns the metadata of all the dashboards, so the script has to make another request for each dashboard. Likewise, a log is printed if a metric is found (e.g. found! metrics (metric.name.1) is used in baseUrl/dashboard/abc-xyz).
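A minimal sketch of the dashboard check, under the same caveats: the function name and the fetch_dashboard parameter are hypothetical and not taken from the script. It assumes each entry in the dashboard list has an id, and that the full definition can be fetched per dashboard.

```python
import json


def find_metrics_in_dashboards(summaries, metrics, base_url, fetch_dashboard):
    """Return (metric, dashboard URL) pairs.

    summaries: entries from the dashboard list, each with an "id".
    fetch_dashboard: callable taking a dashboard id and returning its full definition.
    """
    hits = []
    for summary in summaries:
        # One extra request per dashboard: this is what makes the check slow.
        dashboard = fetch_dashboard(summary["id"])
        text = json.dumps(dashboard)
        for metric in metrics:
            if metric in text:
                url = f"{base_url}/dashboard/{summary['id']}"
                print(f"found! metrics ({metric}) is used in {url}")
                hits.append((metric, url))
    return hits
```

With requests, fetch_dashboard could be a small function that calls GET /api/v1/dashboard/{id} with the authentication headers and returns response.json(). Passing it in as a parameter keeps the matching logic easy to try out on sample data without touching the API.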
I have also added another for loop (lines 73–77) to find out the exact widget that contains the metric (e.g. inside widget: checkout). This adds some runtime. You may remove that code block if you just want to know whether the metric is being used anywhere.
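The widget lookup might be sketched as follows. This is an illustration, not the code at the lines mentioned above: the function name is hypothetical, and it assumes each widget has a definition with an optional title, and that group widgets nest their children under definition.widgets.

```python
import json


def widgets_using_metric(widgets, metric):
    """Return the titles of the widgets whose definition mentions the metric."""
    titles = []
    for widget in widgets:
        definition = widget.get("definition", {})
        nested = definition.get("widgets")
        if nested:
            # Group widget: report the matching children, not the group itself.
            titles.extend(widgets_using_metric(nested, metric))
        elif metric in json.dumps(definition):
            title = definition.get("title", "(untitled)")
            print(f"inside widget: {title}")
            titles.append(title)
    return titles
```

Calling it with the widgets list of a fetched dashboard and one metric name returns the matching widget titles in the order they appear.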
There you go! I hope this script gives you the confidence to remove those non-essential metrics and save costs! 🤑
The original article was published on Medium.