Logging full requests to Application Insights while hiding sensitive data
Hi,
In our application we use Application Insights every day to report all requests to and from our services, so that we can monitor the actual state of the system. Thanks to that, we can be confident that our clients will not suffer from an application that is down or slow. To inspect the body of each request and response, we store them in a separate Log database. As the application grows and we have more and more users and regions to handle, we thought it would be easier to have everything in one place. So we planned to move everything into Application Insights.
All of this sounds pretty easy, but we encountered a couple of problems.
The first was that listening in middleware for a request or response, so that we could send its body to Application Insights, is not that easy: when we tried to read the data stream we got an exception. This is caused by built-in middleware that has already read the body but did not leave the stream in a state ready for another read (link to SO). As a workaround, we decided to cache the request/response body in an in-memory cache for a very short period, so that we can read it when we want to send it to Azure. The important thing here is that we cache this data per service instance, so every instance has its own cache for request/response bodies and the data only needs to stay available for a very short time. Because of that, we also used a very short lifetime and a bounded cache size, so the cache cannot grow to an enormous size. We implemented the caching approach like this:
Here we have a .NET Core filter that is responsible for catching requests and responses to and from the server and saving them in the cache.
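A minimal sketch of such a filter (illustrative only, not the original listing). It assumes an MVC action filter, Newtonsoft.Json for serialization, and the request's TraceIdentifier as the cache key; IBodyCache and BodyCachingFilter are hypothetical names. Because the filter works on the bound action arguments and the action result, it never has to re-read the request stream.

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.Filters;
using Newtonsoft.Json;

// Hypothetical abstraction over the per-instance body cache.
public interface IBodyCache
{
    void SetRequest(string requestId, string body);
    void SetResponse(string requestId, string body);
}

// Hypothetical filter: stores serialized request/response bodies in the cache.
public sealed class BodyCachingFilter : IActionFilter
{
    private readonly IBodyCache _cache;

    public BodyCachingFilter(IBodyCache cache) => _cache = cache;

    // Runs after model binding, so the bound arguments stand in for the
    // request body and the request stream is not read a second time.
    public void OnActionExecuting(ActionExecutingContext context)
    {
        var json = JsonConvert.SerializeObject(context.ActionArguments);
        _cache.SetRequest(context.HttpContext.TraceIdentifier, json);
    }

    // Runs after the action; only object results carry a body worth logging.
    public void OnActionExecuted(ActionExecutedContext context)
    {
        if (context.Result is ObjectResult result)
        {
            var json = JsonConvert.SerializeObject(result.Value);
            _cache.SetResponse(context.HttpContext.TraceIdentifier, json);
        }
    }
}
```

A filter like this could be registered globally, for example with services.AddControllers(o => o.Filters.Add<BodyCachingFilter>()). The sketch assumes the action arguments are plain DTOs that serialize cleanly.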
The TelemetryEnricher implementation looks like this:
The cache implementation looks like this; in fact, it encapsulates an IMemoryCache.
The encapsulation is deliberate: we want full control over how a cache instance is created in a concrete service, so the constructor takes lifetime and size parameters. We can also see how the keys are built; this is because we want to distinguish request bodies from response bodies.
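A minimal sketch of a cache wrapper along these lines, assuming the Microsoft.Extensions.Caching.Memory package. The class name BodyCache, the key prefixes, and the choice of character count as the size unit are assumptions made for illustration.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical wrapper around IMemoryCache with a fixed lifetime and size bound.
public sealed class BodyCache : IDisposable
{
    private readonly IMemoryCache _cache;
    private readonly TimeSpan _lifetime;

    public BodyCache(TimeSpan lifetime, long sizeLimit)
    {
        _lifetime = lifetime;
        // With SizeLimit set, every entry must declare its own Size.
        _cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = sizeLimit });
    }

    public void SetRequest(string requestId, string body) => Set(RequestKey(requestId), body);
    public void SetResponse(string requestId, string body) => Set(ResponseKey(requestId), body);
    public string GetRequest(string requestId) => Get(RequestKey(requestId));
    public string GetResponse(string requestId) => Get(ResponseKey(requestId));

    // Different prefixes keep the request and response of one call apart.
    private static string RequestKey(string id) => "request:" + id;
    private static string ResponseKey(string id) => "response:" + id;

    private void Set(string key, string body)
    {
        var value = body ?? string.Empty;
        var options = new MemoryCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = _lifetime,
            Size = value.Length // size unit here is "characters" (an assumption)
        };
        _cache.Set(key, value, options);
    }

    // Returns null when the entry is missing or has already expired.
    private string Get(string key) =>
        _cache.TryGetValue(key, out string body) ? body : null;

    public void Dispose() => _cache.Dispose();
}
```

For example, new BodyCache(TimeSpan.FromSeconds(30), 5_000_000) would keep bodies for at most 30 seconds and roughly five million characters in total; both numbers are placeholders, not values from our services.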
Okay, we listen to requests and responses, but what does the call to Application Insights look like?
As we can see, we have to implement the ITelemetryInitializer interface from the Application Insights NuGet package, which is executed when data is sent to Application Insights. Here we get the data from our cache. We don't delete it, since we know the lifetime is set to be very short, so an explicit delete operation would make no sense here.
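A minimal sketch of such an initializer, assuming the Microsoft.ApplicationInsights package and IHttpContextAccessor to find the current request. IBodyReader, BodyTelemetryInitializer, and the property names RequestBody and ResponseBody are hypothetical.

```csharp
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.AspNetCore.Http;

// Hypothetical read side of the body cache.
public interface IBodyReader
{
    string GetRequest(string requestId);
    string GetResponse(string requestId);
}

// Hypothetical initializer: copies cached bodies onto request telemetry.
public sealed class BodyTelemetryInitializer : ITelemetryInitializer
{
    private readonly IHttpContextAccessor _httpContextAccessor;
    private readonly IBodyReader _bodies;

    public BodyTelemetryInitializer(IHttpContextAccessor httpContextAccessor, IBodyReader bodies)
    {
        _httpContextAccessor = httpContextAccessor;
        _bodies = bodies;
    }

    public void Initialize(ITelemetry telemetry)
    {
        // Only request telemetry gets the bodies attached.
        if (!(telemetry is RequestTelemetry request)) return;

        var requestId = _httpContextAccessor.HttpContext?.TraceIdentifier;
        if (requestId == null) return;

        AddIfPresent(request, "RequestBody", _bodies.GetRequest(requestId));
        AddIfPresent(request, "ResponseBody", _bodies.GetResponse(requestId));
    }

    // Entries are left in the cache; they expire on their own.
    private static void AddIfPresent(RequestTelemetry request, string name, string value)
    {
        if (!string.IsNullOrEmpty(value)) request.Properties[name] = value;
    }
}
```

Registration could look like services.AddHttpContextAccessor() followed by services.AddSingleton<ITelemetryInitializer, BodyTelemetryInitializer>().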
We also have to implement our own ITelemetryProcessor to decide when we want to send data to Azure:
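A minimal sketch of a telemetry processor. The filtering rule shown here (dropping requests under a /health path) is purely an assumed example, since the actual criterion depends on the service; SkipHealthChecksProcessor is a hypothetical name.

```csharp
using System;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical processor: forwards everything except health-check requests.
public sealed class SkipHealthChecksProcessor : ITelemetryProcessor
{
    private readonly ITelemetryProcessor _next;

    // The SDK supplies the next processor in the chain.
    public SkipHealthChecksProcessor(ITelemetryProcessor next) => _next = next;

    public void Process(ITelemetry item)
    {
        if (item is RequestTelemetry request
            && request.Url != null
            && request.Url.IsAbsoluteUri
            && request.Url.AbsolutePath.StartsWith("/health", StringComparison.OrdinalIgnoreCase))
        {
            return; // not passed on, so it is never sent to Azure
        }

        _next.Process(item);
    }
}
```

In ASP.NET Core such a processor can be added with services.AddApplicationInsightsTelemetryProcessor<SkipHealthChecksProcessor>().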
We run the application and can find records such as this in Azure:
Everything looks good. But are we sure? Looking at the data, we don't hide anything, so we could end up sending sensitive data to Azure, which is not a good idea under RODO/GDPR. Because of that, we introduced an attribute named Sensitive, which lets us mark a whole class, field, or property that we want to hide from logging to Application Insights. Its content is replaced with the string PII Data, similar to how Azure hides sensitive data in connection strings.
The attribute code looks like this:
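A minimal sketch of such a marker attribute, restricted to classes, fields, and properties as described above. CustomerDto is a hypothetical DTO that only shows how the attribute is applied.

```csharp
using System;

// Marker attribute: carries no data, it only flags members to be masked.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Property | AttributeTargets.Field)]
public sealed class SensitiveAttribute : Attribute
{
}

// Hypothetical DTO used only to illustrate usage.
public class CustomerDto
{
    public string Name { get; set; }

    [Sensitive]
    public string Email { get; set; }
}
```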
And how it is consumed can be seen here:
We take the object we want to log to Application Insights and check all of its fields and properties to see whether they should be hidden (based on the attribute); if so, we change their value to PII Data. The above code also works for nested objects.
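One way to get this behaviour with Newtonsoft.Json is a custom contract resolver. The sketch below is an assumption about how such a resolver could be written, not the original listing; MaskingContractResolver and the nested value provider are hypothetical names, and SensitiveAttribute is repeated so that the block stands on its own.

```csharp
using System;
using System.Reflection;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Property | AttributeTargets.Field)]
public sealed class SensitiveAttribute : Attribute { }

// Hypothetical resolver: replaces the value of marked members with "PII Data".
public sealed class MaskingContractResolver : DefaultContractResolver
{
    private const string Mask = "PII Data";

    protected override JsonProperty CreateProperty(MemberInfo member, MemberSerialization memberSerialization)
    {
        var property = base.CreateProperty(member, memberSerialization);

        // Masked when the member itself, or its declared type, is marked.
        var sensitive = member.IsDefined(typeof(SensitiveAttribute), true)
            || (property.PropertyType != null
                && property.PropertyType.IsDefined(typeof(SensitiveAttribute), true));

        if (sensitive)
        {
            // Serialize the member as a string regardless of its real type.
            property.PropertyType = typeof(string);
            property.Converter = null;
            property.ValueProvider = new MaskValueProvider();
        }

        return property;
    }

    private sealed class MaskValueProvider : IValueProvider
    {
        public object GetValue(object target) => Mask;

        // Logging only serializes, so writes are ignored.
        public void SetValue(object target, object value) { }
    }
}
```

Because the resolver is consulted for every type the serializer visits, members of nested objects are masked as well. In this sketch a class-level [Sensitive] takes effect only where that class is the declared type of a member, so the root object and collection elements are not covered.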
We also need to plug NoPIILogContractResolver into IRequestDataAccessor, where we work with JsonConvert. We did it like this:
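A minimal sketch of passing a contract resolver to JsonConvert through JsonSerializerSettings. LogSerializer is a hypothetical helper; the resolver is taken as an IContractResolver so that any implementation, such as the one described above, can be supplied.

```csharp
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Hypothetical helper that serializes bodies for logging.
public sealed class LogSerializer
{
    private readonly JsonSerializerSettings _settings;

    // Settings are built once and reused, so the same resolver instance
    // serves every call.
    public LogSerializer(IContractResolver resolver)
    {
        _settings = new JsonSerializerSettings
        {
            ContractResolver = resolver,
            ReferenceLoopHandling = ReferenceLoopHandling.Ignore
        };
    }

    public string ToLogJson(object body) => JsonConvert.SerializeObject(body, _settings);
}
```

Keeping a single resolver instance for the lifetime of the service, rather than creating one per request, avoids resolving the same contracts over and over.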
We run the code once again and can find records like this in Application Insights:
To sum up: in three fairly easy steps, we were able to add logging to Application Insights that captures almost the full request and response without any sensitive data (as long as developers mark all sensitive parts with the attribute, of course). Thanks to this, in the future we will be able to find problems in our application more easily and quickly, which will save us time for other things. One thing to consider is the storage that Application Insights consumes under the hood; it may be a good idea to limit the amount of data logged per request/response, so that very large bodies are logged only in part rather than in full.
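As a rough illustration of that last idea, a body could be cut to a maximum length before it is cached or attached to telemetry. BodyTruncator, its marker text, and the chosen limit are assumptions; Application Insights also documents its own cap on property value length (8,192 characters at the time of writing), so it is worth checking the current limits.

```csharp
using System;

// Hypothetical helper that caps the logged body length.
public static class BodyTruncator
{
    public const string Marker = "...[truncated]";

    // Returns the body unchanged when it fits; otherwise keeps the beginning
    // and appends a marker so that the result is exactly maxChars long.
    public static string Truncate(string body, int maxChars)
    {
        if (maxChars < Marker.Length)
            throw new ArgumentOutOfRangeException(nameof(maxChars));

        if (body == null || body.Length <= maxChars) return body;

        return body.Substring(0, maxChars - Marker.Length) + Marker;
    }
}
```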