For those who are not sure what is meant by "limited non-determinism," I recommend Mark Seemann's post on the subject.
The idea is a test that uses deterministic values only for the data that affects the behavior of the SUT; irrelevant data may be more or less random.
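To make that concrete, here is a minimal sketch of what I mean. DiscountCalculator, ApplyDiscount and the 10% rule are hypothetical names I made up purely for illustration; they are not from the example further down:

    [TestMethod]
    public void ApplyDiscount_ReducesLargeOrdersByTenPercent()
    {
        // Arrange
        var fixture = new Fixture();
        var customerName = fixture.Create<string>(); // irrelevant to the rule -> random
        var orderTotal = 200m;                       // drives the behavior -> deterministic
        var sut = new DiscountCalculator();

        // Act
        var discounted = sut.ApplyDiscount(customerName, orderTotal);

        // Assert
        Assert.AreEqual(180m, discounted, "Orders above the threshold should be discounted by 10%.");
    }

The customer's name cannot change the outcome, so hard-coding it would only suggest a significance it does not have.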
I like this approach. The more abstract the data is, the clearer and more expressive the expectations become, and it gets much harder to subconsciously pick convenient data for the test.
I am trying to "sell" this approach (along with AutoFixture) to my colleagues, and yesterday we discussed it at length.
They raised an interesting objection: random data makes tests unstable and hard to debug.
At first this seemed a little strange to me: since we all agreed that data affecting the flow should not be random, such behavior should be impossible. Nevertheless, I took a break to think the problem through properly, and I finally arrived at the problem described below.
But first, some of my assumptions:
- Test code MUST be treated as production code.
- Test code MUST express correct expectations and describe the intended behavior of the system.
- Nothing warns you about inconsistencies better than a broken build (either a compilation failure or failing tests, e.g. in a gated check-in).
Consider these two variants of the same test:

    [TestMethod]
    public void DoSomething_ReturnsValueIncreasedByTen()
    {
        // Arrange
        var input = 1;
        var expectedOutput = input + 10;
        var sut = new MyClass();

        // Act
        var actualOutput = sut.DoSomething(input);

        // Assert
        Assert.AreEqual(expectedOutput, actualOutput, "Unexpected return value.");
    }

    // Here nothing is changed except that the input is now random.
    [TestMethod]
    public void DoSomething_ReturnsValueIncreasedByTen()
    {
        // Arrange
        var fixture = new Fixture();
        var input = fixture.Create<int>();
        var expectedOutput = input + 10;
        var sut = new MyClass();

        // Act
        var actualOutput = sut.DoSomething(input);

        // Assert
        Assert.AreEqual(expectedOutput, actualOutput, "Unexpected return value.");
    }
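Both variants were written against an implementation that, judging by the test, looks roughly like this (my reconstruction; it is not part of the code above):

    public class MyClass
    {
        public int DoSomething(int input)
        {
            return input + 10;
        }
    }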
So far, so good: everything works and life is beautiful. But then the requirements change and DoSomething changes its behavior: now it adds 10 to the input only if the input is less than 10, and multiplies it by 10 otherwise. What happens now? The test with hard-coded data passes (almost by accident), while the second test sometimes fails. And both of them are lying: they verify behavior that no longer exists.
It does not seem to matter whether the data is hard-coded or random: either way it is simply no longer appropriate. And yet we have no reliable way to detect such "dead" tests.
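In code, the new requirement would look something like this (again a sketch; the exact shape does not matter):

    public class MyClass
    {
        public int DoSomething(int input)
        {
            // Changed requirement: add 10 only for small inputs, multiply otherwise.
            return input < 10 ? input + 10 : input * 10;
        }
    }

With input = 1 the hard-coded test still gets 11 and keeps passing, because 1 happens to fall below the new threshold; the randomized test fails only on the runs where AutoFixture happens to produce a value of 10 or more.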
So the question is:
Does anyone have good advice on how to write tests so that such situations do not arise?