How to Write Unit Tests for Logging

(principal-it.eu)

56 points · JanVanRyswyck · 5y ago · 37 comments

30 comments · 8 top-level

valenterry · 5y ago · 8 in thread

There are two types of logging:

a) Logs which break the program semantics when missing. That means the program no longer works "as expected" for some directly or indirectly impacted end user.

b) Logs that have no semantic impact when missing. No one will notice that they are missing, unless there is any other problem. These logs are solely meant to better understand the program operations.

Logs of type A should be tested like everything else. But logs of type B - I don't think I would test them.

While it is not nice to have a failure in production and only then find out that the logging doesn't work, I think it is even worse to break tests because of logging changes. Logging is usually so easy that it is an all-or-nothing thing. I'm not saying that there are no advantages to testing these kinds of logs, but in my experience it is not worth the cost in 99% of cases.

rixrax · 5y ago

Back in the day when everything was written in C/C++, it wasn’t unusual for seemingly well-tested software to have a careless memcpy or sprintf left in the error/log execution path, which then exposed the system to remote code execution.

Just wanted to throw this in to discourage completely ignoring the testing of logging. Of course, good static analysis, for example, should somewhat reduce the chance of surprises here.

sorokod · 5y ago

I would suggest a third type:

c) Logs that, when present, "break" (using the term loosely) the application. The example here is an application that logs sensitive data, e.g. PII.
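
A check for this third type can be as simple as capturing the log output and asserting that the sensitive value never shows up in it. A minimal sketch in Python, assuming the standard `logging` module; `register_user`, the logger name, and the email pattern are hypothetical and only there for illustration:

```python
import io
import logging
import re

# Hypothetical pattern for data that must never reach the logs.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def register_user(user_id: int, email: str, logger: logging.Logger) -> None:
    """Hypothetical function under test: logs the user id, not the address."""
    logger.info("registered user id=%d", user_id)


def captured_log_output(user_id: int, email: str) -> str:
    """Run register_user with the logger writing to an in-memory buffer."""
    buffer = io.StringIO()
    logger = logging.getLogger("pii_check_example")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example output out of the root logger
    handler = logging.StreamHandler(buffer)
    logger.addHandler(handler)
    try:
        register_user(user_id, email, logger)
    finally:
        logger.removeHandler(handler)
    return buffer.getvalue()


output = captured_log_output(42, "alice@example.com")
assert "registered user id=42" in output
assert EMAIL_RE.search(output) is None  # no email address leaked
```

A pattern check like this only catches the formats it knows about, so it complements review rather than replacing it.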

geewee · 5y ago

Completely agree with this. I've wanted to test type A logs, and often heard people saying that "you should never test logs". No - you should never test most types of logs - sometimes logs, particularly in error situations, are part of the output of a function.

parhamn · 5y ago

This sounds like a nomenclature thing. I’m sure any reasonable developer would agree it makes sense to test the expected output of a program that isn’t strictly for debugging purposes (type A mentioned above); it’s just that this is usually called output, not “logs”.

zoomablemind · 5y ago

> ...and often heard people saying that "you should never test logs"

It's surprising that it gets labeled as "logs". Don't we test the expected behavior of the _code_ under test?

Be it business code or auxiliary code like logging, it's still code, some function/class/facility... well, any code added for a purpose.

Thus, I would test such logging facility for correctness; otherwise, the next time its output is needed, I want to be sure that it can be trusted.

Of course, using ready-made logging frameworks may help shift such burden of testing onto the framework's developer.

marcosdumay · 5y ago

I agree, but...

> Logging usually is so easy that it is an all- or nothing thing.

By all means, have the integration test that verifies that logs are produced at all (on each mechanism, if you support more than one). And it's a bonus if that test makes the deployment (or installation if that's your thing) fail.

ivanbakel · 5y ago

> Logs which break the program semantics when missing.

What's an example of this kind of logging? I can imagine perhaps legally-mandated audit logs for financial software, but that would be stretching the definition of "semantics" in my mind.

xboxnolifes · 5y ago

Maybe I'm misunderstanding the parent, but I interpreted it as the same kind of "logging" as the standard output of a CLI application. You expect a grep tool to write its results to standard output for the user to use. Or you would expect user-facing errors to occur if an SQL client fails to connect to a server.

sgt101 · 5y ago · 7 in thread

Let's write unit tests for unit tests!

ThePadawan · 5y ago

To be fair, I worked at a company that had CI for running unit tests so nothing could be merged into remote master until the build was green.

The build asserted that the number of failing unit tests was 0.

It turns out that "breaking the unit test library completely such that 0 tests run instead of all thousands of them" did not break the build.

When someone noticed a week later, that was not fun to clean up.

the_af · 5y ago

Tests that test nothing useful are surprisingly easy to write, and lead to a similar result -- even in cases where you do run them! That's why it makes some sense to test the tests (or at least, check their bug-detecting quality).

rwbhn · 5y ago

Recently added an assertion that the # of tests succeeding was > 0 to cover a similar issue.
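
Such a guard might look roughly like the following. This is a sketch using Python's `unittest`, not the setup from either comment above; `ExampleTests` and `run_and_check` are made-up names:

```python
import unittest


class ExampleTests(unittest.TestCase):
    """Stand-in for a real test suite."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


def run_and_check(suite: unittest.TestSuite) -> bool:
    """Return True only if at least one test ran and none failed.

    An empty run counts as "successful" for unittest, so the explicit
    testsRun > 0 check is what catches a broken or empty suite.
    """
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun > 0 and result.wasSuccessful()


real_ok = run_and_check(
    unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
)
empty_ok = run_and_check(unittest.TestSuite())  # simulates "0 tests ran"

assert real_ok is True
assert empty_ok is False
```

A stricter variant would compare `testsRun` against a known minimum instead of zero, at the price of updating that number when tests are removed.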

the_af · 5y ago

> Let's write unit tests for unit tests!

Tests for tests exist. One such technique is called "mutation testing" [1].

I don't know of any job that actually uses them, but they make sense to me: the tests need checks too, otherwise how do you know they are adding value? It's too easy to write worthless tests, see them pass, and get a false sense of security. Techniques such as mutation testing check how sensitive your tests are to actual bugs (or program changes): do they detect them? If not, they aren't useful.

[1] https://en.wikipedia.org/wiki/Mutation_testing
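
To make the idea concrete, here is a hand-rolled sketch in Python. Real mutation testing tools generate the mutants automatically; here the single mutant is written by hand, and every name is made up for illustration:

```python
from typing import Callable

Predicate = Callable[[int], bool]


def is_adult(age: int) -> bool:
    """Original code under test."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Hand-written mutant: '>=' replaced by '>'."""
    return age > 18


def weak_test(fn: Predicate) -> bool:
    """Never checks the boundary, so it passes for both versions."""
    return fn(30) is True and fn(5) is False


def strong_test(fn: Predicate) -> bool:
    """Also checks the boundary value 18, so it fails on the mutant."""
    return fn(30) is True and fn(5) is False and fn(18) is True


def kills_mutant(test: Callable[[Predicate], bool],
                 original: Predicate, mutant: Predicate) -> bool:
    """A test kills a mutant if it passes on the original and fails on the mutant."""
    return test(original) and not test(mutant)


assert kills_mutant(weak_test, is_adult, is_adult_mutant) is False
assert kills_mutant(strong_test, is_adult, is_adult_mutant) is True
```

A surviving mutant, as with `weak_test` here, points at behavior that no test actually pins down.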

aurelianito · 5y ago

I view the tested code as the test of the tests.

solarengineer · 5y ago

When that starts to happen, it’s time to review what exactly one is Test Driving.

je42 · 5y ago

so, on that topic:

- most test frameworks use unit tests to test themselves.

- tests need to have value. the dev/team/organization needs to come up with a sensible measure of value that guides people on what to test and how to test it.

u801e · 5y ago · 6 in thread

It seems that the boundaries of what a unit test should cover have never been clearly defined. In my opinion, unit testing should only cover logic that does not have side effects (like writing to disk or communicating over a network). If you want to test external side effects, you should write functional or integration tests for them.

pydry · 5y ago

I wish more people believed this. The number of lines of pointless unit tests I've seen written is staggering.

If your code is mostly passing data around and managing side effects, a unit test is more like a parasite. It will couple to your code and fail noisily when refactoring, but not provide any actually useful feedback.

closeparen · 5y ago

My armchair psychoanalysis is that it’s precisely the poor effort:reward ratio that makes people feel superior for writing and demanding them. "Look at me, I am a responsible engineer who tests thoroughly; you childish cowboys are trying to wriggle out of doing your chores."

No, I’m still going to sweep the floor, just not with a toothpick.

gnusty_gnurc · 5y ago

Unit tests are fine for testing side effects if you've written the code properly with proper interfaces, etc.

You need to test that you're using the contracts you've made, and integration tests verify that everything works as expected when both sides are complying with the contract.

I'm saying that if I've abstracted out the filesystem into an interface the distinction of asserting against whether a method was called on that interface and asserting a side-effect is somewhat trivial. I've made it so that side-effects must be handled in integration tests.

u801e · 5y ago

> I'm saying that if I've abstracted out the filesystem into an interface the distinction of asserting against whether a method was called on that interface and asserting a side-effect is somewhat trivial.

But now you're enforcing a degree of coupling between your filesystem interface and your code. If you refactor the filesystem interface such that the overall logic hasn't changed, but the parameter order for one of the methods is updated, then you essentially double the amount of work you have to do (update the code and update the tests). If you just test the logic and not the interface, then you only have to update the tests that pertain to the logic itself.

You should read up on the concept of functional core/imperative shell.
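
As a rough illustration of that split (a sketch in Python; `summarize` and `write_summary` are made-up names): the core is a pure function that unit tests can cover without mocks, while the thin shell does the I/O and is left to integration tests.

```python
from pathlib import Path


def summarize(lines: list[str]) -> str:
    """Functional core: pure logic, no I/O, trivially unit-testable."""
    errors = sum(1 for line in lines if "ERROR" in line)
    return f"lines={len(lines)} errors={errors}"


def write_summary(source: Path, target: Path) -> None:
    """Imperative shell: does the file I/O and delegates every decision to the core."""
    lines = source.read_text().splitlines()
    target.write_text(summarize(lines))


# The unit test touches only the core; no filesystem interface to mock.
assert summarize(["ok", "ERROR disk full"]) == "lines=2 errors=1"
```

Changing how `write_summary` reads or writes files leaves the unit test for `summarize` untouched, which is the decoupling argued for above.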

brlewis · 5y ago

I don't think clearly defined boundaries are what's needed. People just need to keep in mind that the purpose of tests is to catch bugs. When I modify this code, what tests will tell me that it still does what it's supposed to? Keeping that question at the fore will work better than any cookbook.

u801e · 5y ago

> When I modify this code, what tests will tell me that it still does what it's supposed to?

While that's true, you don't want to end up in a situation where the test is enforcing coupling which makes it harder to refactor your code. So, for instance, if you have tests that assert whether the method tested calls another method with certain parameters in a certain order, then you're enforcing coupling between the two methods. If you refactor the code to have the first method return a result and then use that result to call the second method, then you also have to update the tests. But they shouldn't have to be updated since the overall logic hasn't changed. If, instead, you test the logic of each method individually, then you can refactor and not have to update the tests.

sethammons · 5y ago · 1 in thread

In Go, I test my log output easily: my logger instance’s output is set to a buffer, call the function under test, assert log contents. If another system depends on it, it gets tested. Our acceptance level tests operate largely on logs.
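
The same three steps (point the logger at a buffer, call the function, assert on the contents) carry over to other languages. A rough Python analogue, not the Go code described above, using the standard `logging` module; `charge` and `make_buffer_logger` are hypothetical names:

```python
import io
import logging


def make_buffer_logger(name: str) -> tuple[logging.Logger, io.StringIO]:
    """Create a logger whose only output is an in-memory buffer."""
    buffer = io.StringIO()
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, buffer


def charge(amount_cents: int, logger: logging.Logger) -> bool:
    """Hypothetical function under test."""
    if amount_cents <= 0:
        logger.error("rejected charge amount_cents=%d", amount_cents)
        return False
    logger.info("charged amount_cents=%d", amount_cents)
    return True


logger, buffer = make_buffer_logger("charge_example")
assert charge(-5, logger) is False
assert buffer.getvalue() == "ERROR rejected charge amount_cents=-5\n"
```

Asserting on the whole line is strict; if the wording changes often, checking only the level and the key fields keeps the test less brittle.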

luxurytent · 5y ago

I've mostly spent my career in Python/JavaScript and recently jumped back into Go. How a Go programmer can use an interface to effectively mock another object feels very natural and core to the language. I like it.

(I'm probably using incorrect terminology here, so happy if someone corrects me)

henrik_w · 5y ago

I usually don't test the logging. However, sometimes I get tests for the logging messages "for free". That happens when I break out some logic into its own function, and I want logging on what happened in that function (definitely not all the time, but it is sometimes useful). Then the function can return its result and a list of logging messages to output. When asserting the result, I can also assert on the logging messages.

It's what I describe in "Returning a message list" in this post: https://henrikwarne.com/2020/07/23/good-logging/
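
A small sketch of that shape in Python (not taken from the linked post; `apply_discount` and `checkout` are invented for illustration): the logic returns its messages, and only the caller talks to the logger.

```python
import logging


def apply_discount(price_cents: int, code: str) -> tuple[int, list[str]]:
    """Hypothetical pure function: returns the new price plus messages to log."""
    messages: list[str] = []
    if code == "HALF":
        new_price = price_cents // 2
        messages.append(f"applied code HALF: {price_cents} -> {new_price}")
    else:
        new_price = price_cents
        messages.append(f"unknown code {code!r}, price unchanged")
    return new_price, messages


def checkout(price_cents: int, code: str, logger: logging.Logger) -> int:
    """The caller does the actual logging, so the logic stays side-effect free."""
    new_price, messages = apply_discount(price_cents, code)
    for message in messages:
        logger.info(message)
    return new_price


# Result and log messages are asserted together, with no logger involved.
assert apply_discount(1000, "HALF") == (500, ["applied code HALF: 1000 -> 500"])
```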

tlarkworthy · 5y ago

You can use log and monitoring signals to white-box test certain implementation details behind an otherwise stateless interface.

I would say it's not great practice, but it can remove a lot of the contortions otherwise needed to make assertions about the execution path taken in a unit test.

If I have a tricky thing to test, I ask: if I added the required operational monitoring (e.g. queue sizes), would I suddenly be able to test this more easily? Would it make the test more self-explanatory? If the answer is yes, I might add the monitoring and then do the unit testing. If the answer is no, I have to refactor the implementation for testability.

Conclusion: exposing logs, and especially monitoring counters, to unit tests can actually supercharge your testing and leave you net positive.
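
For instance, a cache looks the same from the outside whether a call hit or missed, but a counter makes the path taken observable. A minimal sketch in Python; `SquareCache` and the metric names are made up:

```python
from collections import Counter


class SquareCache:
    """Hypothetical component whose public result hides which path was taken."""

    def __init__(self) -> None:
        self.metrics: Counter = Counter()  # operational counters, also readable by tests
        self._cache: dict[int, int] = {}

    def get(self, n: int) -> int:
        if n in self._cache:
            self.metrics["cache_hit"] += 1
            return self._cache[n]
        self.metrics["cache_miss"] += 1
        self._cache[n] = n * n
        return self._cache[n]


cache = SquareCache()
assert cache.get(4) == 16
assert cache.get(4) == 16
# The return values are identical; only the counters reveal the execution path.
assert cache.metrics["cache_miss"] == 1
assert cache.metrics["cache_hit"] == 1
```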

sandermvanvliet · 5y ago

I’ve written a sink for the Serilog structured logging library specifically with this in mind: testing logging from code.

When doing TDD there is no reason to skip over logging (although that does seem to happen, everyone likes print debugging ;-) ). It’s just that sometimes it’s a bit clunky to do, which is why I wrote this tool. It supplies a secondary package that has FluentAssertions extensions to make testing even easier.

Code/packages can be found here: https://github.com/sandermvanvliet/SerilogSinksInMemory

Edit: formatting

woldrich · 5y ago

Log4J2 has a testing library that is pretty dope: you can use it to assert on log events and the parameters passed, and tweak your appenders so that you're not spamming the console while the test runs.

Here are the coordinates in Gradle:

compile group: 'org.apache.logging.log4j', name: 'log4j-core', classifier: 'tests'
