Excluding failing dependencies from Application Insights logging
Implement an ITelemetryProcessor to ignore specific dependency failures in your Application Insights service. This helps keep the data clean(er), reduces cost, and ultimately reduces false positives.
Azure Application Insights is a core feature of Azure Monitor that I use regularly. I keep track of exceptions, custom logs, ingested logs from native integration points, performance, overall health, uptime, dependency failures, and more.
When operating production workloads, you can quickly see patterns emerge from the data. A common scenario that I've seen is that dependency failures are reported even though they are entirely expected.
As an example, when working with Azure Table Storage, you can get a 404 or a 409 when you try certain operations.
Example of how a 404 dependency failure can happen:
if (!tableClient.Exists())
    tableClient.CreateIfNotExists();
The code just returns true or false for your boolean check; however, the dependency (the table client) also records a 404 if the table does not exist. You may not see that immediately when you're working on the code, depending on how deep you look.
Similarly, if you remove the conditional and use the built-in CreateIfNotExists() method, you'll get a 409 instead whenever the table already exists:
tableClient.CreateIfNotExists();
The problem here is that the ingested logs in my Log Analytics Workspace quickly grow, as I'm operating a lot over Tables, Queues and Blobs, and for the majority of operations we first check whether the objects exist before we start using them.
These libraries are built in a way that they report the 404, 409 or other error codes, not because anything crashes, but because that's the built-in behavior. I can't do much about that from here.
Imagine scaling this up to hundreds of millions of requests. This will result in a couple of things for your Log Analytics Workspace and App Insights experience:
- The incurred cost will grow. Quickly!
- The data in your charts and queries might be hard to interpret.
- Setting alerts will now be more difficult, because I don't want to ignore dependency failures - however, I also don't want 200 000 000 triggers for expected dependency failures that I can't do anything about from this side.
- For trivia: My cost was about to increase by 1300% because of this - so a solution was needed. Be mindful of what data you ingest, and at what scale.
Let us take a quick look at growing dependencies and ingested data. I have collected these from various production systems at various scales, all having the same challenge: dependencies are reported as failures even though the code is working as expected.
In this image, we have ~109 000 000 dependency failures in the last 7 days.
Reviewing the data from another angle, we can see that in the last ~30 days there have been ~412 000 000 dependency failures. In the right pane we can see specifically what the failed dependency response codes are; and true to our expectations, it's the 409 and 404.
Just to put the data into comparison, we can see everything else almost flat-lining at the bottom compared to our failed dependencies. Exceptions, traces, metrics, events, and more: nothing compares to the insane growth of the dependency failures, which in our code aren't actually failures.
Finally, querying the data from yet another angle just to put it into perspective. In the last 24 hours there have been ~49 000 000 dependency failures, but "only" 1 500 000 application traces.
There are many ways we can handle this, but from where I'm standing, we need to clean up the data in our Application Insights service, and ultimately the Log Analytics Workspace.
Enter the ITelemetryProcessor, where we can filter the ingestions in our Application Insights service in code.
Ignore failing dependencies with the ITelemetryProcessor
Filtering out requests with an ITelemetryProcessor isn't particularly hard. However, after some research and discussions, it seems many of the people I've talked with still don't know about them, so I wanted to share how they can help in a scenario where we operate at scale, with hundreds of millions of items being processed.
First, let's define our ITelemetryProcessor, and tell it to ignore something. In this case, I'm just looking for dependencies of type "Azure table" where the success is false. In your scenario, just as the code comments say, you should align this with your workloads and logic for a better fit.
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

public class AzureStorageDependencyProcessor : ITelemetryProcessor
{
    private ITelemetryProcessor Next { get; set; }

    public AzureStorageDependencyProcessor(ITelemetryProcessor next)
    {
        Next = next;
    }

    public void Process(ITelemetry item)
    {
        // Items that should be ignored are dropped here and never reach the next processor.
        if (!VerifyDependencyIsValid(item))
            return;

        Next.Process(item);
    }

    private bool VerifyDependencyIsValid(ITelemetry item)
    {
        DependencyTelemetry dependency = item as DependencyTelemetry;

        // Anything that is not a dependency passes through untouched.
        if (dependency == null)
            return true;

        // Here you can filter on your dependency any way you want.
        // This is an example of how to not report failed "Azure table" dependencies.
        // NOTE: Depending on your production scenarios, you should modify the logic here, and test it.
        // This is meant to illustrate how to accomplish the task, not solve production issues.
        if (dependency.Type == "Azure table" && dependency.Success == false)
        {
            return false;
        }

        // Successful dependencies, and failures of any other type, are still reported.
        return true;
    }
}
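As the note in the code comments suggests, matching on the dependency type alone may be too broad for some workloads. Below is a minimal sketch of a narrower variant, assuming the expected failures show up with the result codes 404 and 409 in DependencyTelemetry.ResultCode. The class name ExpectedTableFailureProcessor and the set of codes are my own illustration, not part of the processor above, so check the Type and ResultCode values in your own data before relying on it.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical variant: drops only failed "Azure table" dependencies
// whose result code is one we expect; everything else is passed on.
public class ExpectedTableFailureProcessor : ITelemetryProcessor
{
    // Assumption: these are the result codes recorded for the expected failures.
    private static readonly HashSet<string> ExpectedResultCodes =
        new HashSet<string>(StringComparer.Ordinal) { "404", "409" };

    private readonly ITelemetryProcessor _next;

    public ExpectedTableFailureProcessor(ITelemetryProcessor next)
    {
        _next = next;
    }

    public void Process(ITelemetry item)
    {
        if (IsExpectedFailure(item))
            return; // Dropped: not handed to the next processor.

        _next.Process(item);
    }

    private static bool IsExpectedFailure(ITelemetry item)
    {
        return item is DependencyTelemetry dependency
            && dependency.Type == "Azure table"
            && dependency.Success == false
            && dependency.ResultCode != null
            && ExpectedResultCodes.Contains(dependency.ResultCode);
    }
}
```

With this variant, a failed table call with, say, a 500 result code would still be reported, so alerts on genuine dependency failures keep working.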
My normal Application Insights service code can then inject this ITelemetryProcessor when I'm creating my client:
using System;
using Microsoft.ApplicationInsights;
using Microsoft.Extensions.DependencyInjection;

public class ApplicationInsightsService
{
    private readonly TelemetryClient _telemetryClient;

    public ApplicationInsightsService(string instrumentationKey)
    {
        IServiceCollection services = new ServiceCollection();
        services.AddApplicationInsightsTelemetryWorkerService(instrumentationKey);

        // NOTE: We inject our custom ITelemetryProcessor here.
        services.AddApplicationInsightsTelemetryProcessor<AzureStorageDependencyProcessor>();

        IServiceProvider serviceProvider = services.BuildServiceProvider();
        _telemetryClient = serviceProvider.GetRequiredService<TelemetryClient>();
    }
}
Make note of the line where we call AddApplicationInsightsTelemetryProcessor: it injects my custom dependency filter and tells my service not to ship anything related to dependency failures of "Azure table".
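If you build your TelemetryConfiguration by hand instead of going through a service collection, a processor can also be added through the processor chain builder. This is only a sketch, assuming the 2.x Application Insights SDK and a connection string rather than an instrumentation key; the TelemetryClientFactory name and the processorFactory parameter are hypothetical.

```csharp
using System;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical helper: builds a TelemetryClient with one custom processor in the chain.
public static class TelemetryClientFactory
{
    public static TelemetryClient Create(
        string connectionString,
        Func<ITelemetryProcessor, ITelemetryProcessor> processorFactory)
    {
        TelemetryConfiguration configuration = TelemetryConfiguration.CreateDefault();
        configuration.ConnectionString = connectionString;

        // Register the processor, then rebuild the chain so it takes effect.
        var builder = configuration.DefaultTelemetryProcessorChainBuilder;
        builder.Use(processorFactory);
        builder.Build();

        return new TelemetryClient(configuration);
    }
}
```

Usage would look something like TelemetryClientFactory.Create(connectionString, next => new AzureStorageDependencyProcessor(next)).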
I hope this helps. Enjoy!