Performance testing using FsCheck

While adding new functionality to our application, we want to ensure that response times will not grow and will remain at a level acceptable to users. We came up with the idea of writing FsCheck tests that check the execution time of functions rather than their full correctness. Thanks to this approach, we can generate series of data similar to those produced by users during day-to-day work with the application. Let's see how we achieve that.

August 17, 2017 - 5 minute read
F# Microservices Patterns Tests Automated Tests

Hi,

in order to improve the reliability of a commercial application that generates data visualizations for users, I wanted to guarantee that, while new functionality is added, the application's response time will not grow and will remain at a level acceptable to users. I started searching the internet for ready-made libraries/solutions. However, I mainly found solutions for Scala/Java, so I came up with the idea of writing FsCheck tests that check the execution time of functions rather than their full correctness. Thanks to this approach, I can generate series of data similar to those produced by users during day-to-day work with the application. With such data, I can easily spot and eliminate the weak points of the application.

I did not find any information on the internet about a crazy approach like the one that came into my head. I assumed that the test should perform some action (in my case, an API request with some parameters) and that the received response should be checked via its status, in my case the HTTP response status (whether it equals 200). This ensures that the data in the response is correct (or at least looks correct, on the basis of the unit tests), and that the operation itself ended with a valid HTTP status.

Creating tests that check the execution time of data operations for given parameters is no different from writing normal unit tests, except that at the end of a spec there is also an assertion on the execution time. Therefore, the structure of such a test looks as follows:
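A minimal sketch of that structure, assuming a simple `callApi` helper and a `maxMilliseconds` budget (both names are placeholders, not the production code):

```fsharp
open System.Diagnostics
open System.Net
open System.Net.Http

let private client = new HttpClient ()

// Illustrative helper: perform the GET request under test.
let callApi (url : string) =
    client.GetAsync url |> Async.AwaitTask |> Async.RunSynchronously

// Act -> assert the HTTP status -> assert the execution time.
let responseIsOkAndFastEnough (url : string) (maxMilliseconds : int64) =
    let stopwatch = Stopwatch.StartNew ()
    let response = callApi url                           // act
    stopwatch.Stop ()
    response.StatusCode = HttpStatusCode.OK              // status assertion
    && stopwatch.ElapsedMilliseconds <= maxMilliseconds  // time assertion
```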

Having sketched the test structure, let's look at what the tests themselves look like and how the data will be generated. As I said before, the tests are intended to check how long it takes to generate data for some selection, where this selection depends on the filters on the page. The filters are populated from the database, so the generator for the tests builds test objects based on data retrieved from the database. This generator is shown below:
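A hedged sketch of such a generator, assuming SQLProvider for database access and FSharp.Configuration for settings; the `ChartRequest` type, the `Filters` table, and the `TestDb` connection string name are all illustrative:

```fsharp
open FSharp.Configuration
open FSharp.Data.Sql
open FsCheck

// Configuration read from app.config via the FSharp.Configuration type provider.
type Settings = AppSettings<"app.config">

// The database the filter values are read from (connection string name is assumed).
type DbSchema =
    SqlDataProvider<DatabaseVendor = Common.DatabaseProviderTypes.MSSQLSERVER,
                    ConnectionStringName = "TestDb">

// The type used in the tests: a selection of filters, as made on the page.
type ChartRequest = { FilterIds : int list }

// Builds test objects from the real filter values stored in the database.
type ChartRequestGenerator =
    static member ChartRequest () : Arbitrary<ChartRequest> =
        let ctx = DbSchema.GetDataContext ()
        let filterIds = [ for f in ctx.Dbo.Filters -> f.Id ]  // hypothetical table/column
        Gen.elements filterIds
        |> Gen.nonEmptyListOf
        |> Gen.map (fun ids -> { FilterIds = List.distinct ids })
        |> Arb.fromGen
```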

As you can see, at the beginning I set up the database schema type pointing at the database from which we will extract the data. Then I define the type that will be used in the test and declare how the generator should produce the data; of course, the generation could be a lot more complex, but the main point of this post is not “how to create very complex data in a generator”. In addition, you can see the Settings declaration, which retrieves configuration data from the app.config file thanks to FSharp.Configuration.

With the data generation in place, we can move on to the test class, which looks like this:
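My reconstruction of such a test class, reusing the `ChartRequest` generator sketched above; NUnit is assumed as the runner, and the action URL, the settings keys, and the time budget are illustrative:

```fsharp
open System.Diagnostics
open System.Net
open System.Net.Http
open FsCheck
open NUnit.Framework

[<TestFixture>]
type ChartPerformanceTests () =

    let client = new HttpClient ()
    // Works whether the provider exposes ApiUrl as a string or a Uri.
    let baseUrl = Settings.ApiUrl.ToString().TrimEnd('/')

    // Extracted helpers, reusable in other tests:
    // run performs a GET and returns the status plus the measured time.
    let run (url : string) =
        let stopwatch = Stopwatch.StartNew ()
        let response = client.GetAsync url |> Async.AwaitTask |> Async.RunSynchronously
        stopwatch.Stop ()
        response.StatusCode, stopwatch.ElapsedMilliseconds

    // runFor translates a generated ChartRequest into a request URL.
    let runFor (request : ChartRequest) =
        let filters = request.FilterIds |> List.map string |> String.concat ","
        run (sprintf "%s/api/chart?filters=%s" baseUrl filters)

    [<Test>]
    member this.``Chart data is generated within the expected time`` () =
        let property request =
            let status, elapsedMs = runFor request
            status = HttpStatusCode.OK                        // only the status is checked...
            && elapsedMs <= int64 Settings.MaxResponseTimeMs  // ...plus the time budget
        // Each test runs 10 times, with inputs from the database-backed generator.
        Check.One ({ Config.QuickThrowOnFailure with
                        MaxTest = 10
                        Arbitrary = [ typeof<ChartRequestGenerator> ] },
                   property)
```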

You will notice that each test runs 10 times. The test itself calls the given action with a GET request and measures the response time; the only thing we check on the response is the HttpStatusCode because, as I mentioned earlier, I only want to verify the execution time, not the full correctness of the returned data. You can also see the use of a couple of values from the app.config file, available through the Settings class. This file looks like this:
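Assuming the keys used in the sketches above, the relevant part of app.config could look roughly like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <appSettings>
    <!-- illustrative keys matching the Settings usage above -->
    <add key="ApiUrl" value="http://localhost:8080" />
    <add key="MaxResponseTimeMs" value="5000" />
  </appSettings>
  <connectionStrings>
    <!-- illustrative connection string for the SQLProvider schema -->
    <add name="TestDb"
         connectionString="Data Source=.;Initial Catalog=Charts;Integrated Security=True" />
  </connectionStrings>
</configuration>
```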

In order to avoid code duplication, the run and runFor methods are extracted so they can be reused in other tests. Thanks to such tests, we can check the application in terms of both its correctness and its speed of execution, which gives us practically instant feedback about the code we have created, since these tests run throughout the entire Continuous Integration process in the project.

Everything looks cool and simple, but are there any problems with these tests?

Yes. Because we are testing a web application written in C#, it turns out that when the tests run right after a new deployment to a given environment, there are problems with “warming up” the environment: the tests themselves act as a kind of warmup for the whole application. Because of this, the first request is often noticeably slower than the rest, which may cause a test to fail purely because of this cold start. How can we fix that?

Of course, this is an infrastructure problem and should be fixed in the deployment process, so that the application is already warmed up, for example by triggering all diagnostic actions. You can also look at how applications that let us run such queries resolve this. One such application is RESTful Stress, which lets us set how many runs should be treated as a “WarmUp”, so that they are not taken into account in the timing analysis and are appropriately marked on the charts.

*(Chart: RESTful Stress marking warmup runs on its timing chart.)*

Unfortunately, I did not find such a possibility in FsCheck or NUnit (or maybe I just could not find it?). I did notice, though, that you can implement a warmup of the application, or of a part of it, by using the SetUp attribute in NUnit, the constructor in xUnit, or Establish in Machine.Specifications. Thanks to this, I can make one or n such queries before the test, which guarantees some warming up of the environment. The code responsible for such a “WarmUp” is shown below:
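A sketch of such a warmup added to the NUnit fixture from earlier; the WarmUpRequests setting is an assumed new key:

```fsharp
    // Added inside ChartPerformanceTests: fire a configurable number of requests
    // before the measured ones, so the first test does not pay the cold-start cost.
    [<SetUp>]
    member this.WarmUp () =
        for _ in 1 .. Settings.WarmUpRequests do  // assumed config key
            run baseUrl |> ignore
```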

A new entry is added to the configuration file (the key name matches the warmup sketch above):
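```xml
<add key="WarmUpRequests" value="5" />
```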

As you can see, despite the small number of libraries/solutions that allow us to test application execution times, we are able to do it with a “home-grown” solution using the FsCheck library.

I know there are ready-made solutions such as App Insights on the Azure platform, the performance module in the Expecto library, or the great BenchmarkDotNet library. In my case, however, the application is neither hosted on Azure nor exposed “to the outside” in the developer environment, and Expecto's performance module compares the execution times of two implementations, which does not match my use case. BenchmarkDotNet measures execution times that can be used later in some analysis, but I wanted a solution that plugs into the whole Continuous Integration process and gives me recurring feedback about violations of pre-established limits on the execution time of specific actions. (Although, after creating my own solution, I found a library/package based on BenchmarkDotNet that theoretically does what I would be interested in.)

Of course, there is also the NBench library, but in my case I was interested in the large variety of queries produced by the FsCheck generator.

As I have already mentioned, in my project this approach has resulted in recurring monitoring of the execution time of certain actions, which allows me/us to ensure a certain quality of the produced application. I encourage everyone to try out this solution in their own projects, or to blame me for not being able to search Google properly.

Thanks for reading!