Unit tests: what is the benefit of unit testing when contracts change?

I recently had an interesting discussion with a colleague about unit tests. We discussed whether unit testing becomes less productive when your contracts change.

Perhaps someone can tell me how to approach this problem. Let me clarify:

So let's say there is a class that does some great computation. The contract says that it must calculate a number, or return -1 when for some reason it cannot.

I have contract tests that verify this. And in all my other tests, I mock out this excellent calculator.

So now I change the contract: whenever it cannot calculate, it will throw a CannotCalculateException.

My contract tests will fail, and I will fix them. But all my mocked objects will still follow the old contract rules. These tests will pass while they should not!
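A minimal Java sketch of the situation described above (all class and method names here are hypothetical, chosen only to illustrate the stale-mock problem):

```java
// Old contract: calculate() returns -1 when the calculation fails.
interface Calculator {
    int calculate(int input);
}

// New-contract failure signal (the stub below was never updated for it).
class CannotCalculateException extends RuntimeException {}

// A hand-rolled stub still written against the OLD contract.
class StubCalculator implements Calculator {
    @Override
    public int calculate(int input) {
        return -1; // still signals failure the old way, never throws
    }
}

// A consumer already updated for the NEW (exception-based) contract.
class Consumer {
    private final Calculator calculator;
    Consumer(Calculator calculator) { this.calculator = calculator; }

    String describe(int input) {
        try {
            return "result: " + calculator.calculate(input);
        } catch (CannotCalculateException e) {
            return "failed";
        }
    }
}

public class StaleMockDemo {
    public static void main(String[] args) {
        // The stub never throws, so the catch branch is never exercised:
        // the unit test "passes" while the integrated system behaves differently.
        Consumer c = new Consumer(new StubCalculator());
        System.out.println(c.describe(42)); // prints "result: -1"
    }
}
```

The test against the stub keeps passing because nothing forces the stub to honor the new contract; only an integrated test (or a shared, type-checked fake) would expose the drift.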

The question that arises: given this reliance on unit testing, how much faith can be placed in such changes? The unit tests pass, but errors will surface when the application itself is tested. Every test that uses this calculator must be fixed, which takes time, and the calculator may be stubbed/mocked in many places...

What do you think of this case? I had never thought about it before. In my opinion, these changes to the unit tests are acceptable: if I did not use unit tests, I would also only notice such errors at the testing stage (by testers). However, I'm not sure whether that would cost more (or less).

Any thoughts?

+49
unit-testing design-by-contract
Jun 03 '10 at 11:32
9 answers

The first problem you raise is the so-called "fragile test" problem: you make a change to your application, and hundreds of tests break because of it. When this happens, you have a design problem. Your tests were fragile; they were not sufficiently decoupled from the production code. The solution (as with all such design problems) is to find an abstraction that separates the tests from the production code, so that the volatility of the production code is hidden from the tests.

Some simple things that cause such fragility:

  • Testing displayed strings. Such strings are volatile because their grammar or spelling may change at the whim of an analyst.
  • Testing discrete values (e.g. 3) that should be hidden behind an abstraction (e.g. FULL_TIME).
  • Calling one API from many tests. You should wrap the API call in a test function, so that when the API changes you can make the change in one place.
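The last point can be sketched in Java as follows (the `ReportService` API and helper names are hypothetical, invented for the example):

```java
// A production API whose signature may change over time.
class ReportService {
    String render(String title, int width) {
        return "[" + title + ":" + width + "]";
    }
}

// Every test calls the API through this one helper. If render() gains a
// parameter or changes shape, only this method changes, not every test.
class ReportServiceTestHelper {
    static String renderDefault(ReportService s, String title) {
        return s.render(title, 80); // the one place encoding the call details
    }
}

public class ApiWrapperDemo {
    public static void main(String[] args) {
        ReportService s = new ReportService();
        System.out.println(ReportServiceTestHelper.renderDefault(s, "sales"));
        // prints "[sales:80]"
    }
}
```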

Test design is an important issue that is often ignored by TDD beginners. It often leads to fragile tests, which then cause newcomers to dismiss TDD as "unproductive."

The second problem you raised is false positives. You used so many mocks that none of your tests exercised an integrated system. While testing independent units is good, it is also important to test partial and complete system integrations. TDD is not only about unit tests.

Tests should be organized as follows:

  • Unit tests provide close to 100% code coverage. They test individual units. They are written by programmers in the system's programming language.
  • Component tests cover ~50% of the system. They are written by business analysts and QA, in a tool such as FitNesse, Selenium, or Cucumber. They test whole components rather than individual units, checking primarily happy-path scenarios plus some highly visible failure cases.
  • Integration tests cover ~20% of the system. They test small assemblies of components, as opposed to the whole system. Also written in FitNesse/Selenium/Cucumber, typically by architects.
  • System tests cover ~10% of the system. They test the entire system wired together. Again written in FitNesse/Selenium/Cucumber, typically by architects.
  • Exploratory manual tests (see James Bach). These tests are manual but not scripted; they rely on human ingenuity and creativity.
+90
Jun 03 '10 at 20:08

It's better to fix unit tests that fail because of deliberate code changes than to have no tests to catch the bugs those changes eventually introduce.

When your code base has good unit test coverage, you will encounter numerous unit test failures that stem not from errors in the code but from intentional contract changes or refactoring.

However, that same coverage also gives you the confidence to refactor your code and implement contract changes. Some tests will fail and simply need to be fixed, but other tests will fail because of genuine bugs you introduced with those changes.

+12
Jun 03

Unit tests probably cannot catch all bugs, even in the ideal case of 100% code/functionality coverage. I don't think that should be expected of them.

If the tested contract changes, I (the developer) should use my brain and update all the code (including test code!) accordingly. If I fail to update some of the mocks, which therefore still exhibit the old behavior, that is my fault, not the unit tests'.

It is similar to the case where I fix a bug and write a unit test for it, but cannot think of (and test) all the related cases, some of which later also turn out to be buggy.

So yes, unit tests need maintenance just like the production code itself. Without maintenance, they decay and rot.

+5
Jun 03 '10 at 11:51

I have had similar experiences with unit tests: when you change the contract of one class, you often need to change a lot of other tests as well (and in many cases they keep passing when they shouldn't, which makes it even harder). This is why I always use higher-level tests too:

  • Acceptance tests: exercise a couple of classes or more. These tests are usually built around the user stories to be implemented, so you check that a user story "works". They do not need to connect to a database or other external systems, but they can.
  • Integration tests: mainly for checking connections to external systems, etc.
  • Full end-to-end tests: validate the entire system.
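An acceptance-style test of the first kind might look like this sketch: two real classes wired together with no mocks, so a contract change in one is immediately caught by the other (the `Tokenizer`/`WordCounter` classes are hypothetical examples):

```java
// A small collaboration tested as a pair, story-style:
// "as a user I can count the words in a line".

class Tokenizer {
    String[] split(String line) {
        return line.trim().split("\\s+"); // whitespace-separated tokens
    }
}

class WordCounter {
    private final Tokenizer tokenizer;
    WordCounter(Tokenizer tokenizer) { this.tokenizer = tokenizer; }
    int count(String line) { return tokenizer.split(line).length; }
}

public class AcceptanceDemo {
    public static void main(String[] args) {
        // Uses the real Tokenizer, not a stub; if Tokenizer's contract
        // changes, this test fails instead of silently passing.
        WordCounter counter = new WordCounter(new Tokenizer());
        System.out.println(counter.count("  unit tests need maintenance "));
        // prints 4
    }
}
```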

Please note that even 100% unit test coverage does not guarantee that your application will even start! That is why you need the higher-level tests. There are so many different levels of tests because the lower the level at which you test something, the cheaper it usually is (in terms of development, maintenance of the test infrastructure, and runtime).

As a side note: because of the issue you describe, when using unit tests you should keep your components as loosely coupled as possible and their contracts as small as possible, which is certainly good practice anyway!

+4
Jun 03 '10

Someone asked the same question on the Google Group for the book Growing Object-Oriented Software, Guided by Tests, in the thread "Fixed unit test / stub errors".

Here is J.B. Rainsberger's answer (he is the author of Manning's JUnit Recipes).

+3
Apr 11

One of the rules for unit test code (and all other code used for testing) is to treat it exactly the same way as production code: no more, no less, the same.

My understanding is that (besides keeping it relevant, refactored, and working, like production code) it should be treated the same way with respect to investment/cost.

Your testing strategy should probably include something to address the problem you described in the original post: a guideline determining which test code (including stubs/mocks) must be reviewed (executed, checked, changed, fixed, etc.) when a developer changes a function/method in the production code. The cost of any production code change should therefore include that cost; if it does not, the test code becomes a "third-class citizen," and the developers' confidence in the unit test suite, and with it its value, will decrease. Obviously, the ROI comes from earlier error detection and correction.

+2
Jun 03 '10 at 18:21

One principle I rely on is eliminating duplication. I usually don't have many different fakes or mocks implementing a given contract (for this reason I use more fakes than mocks). When I change a contract, it is natural to examine every implementation of that contract, whether production code or test code. It bothers me when I find myself making many such changes; perhaps my abstractions need more thought. But if the test code is too burdensome to change along with the contract, I have to ask myself whether that, too, calls for refactoring.
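A sketch of the "one shared fake per contract" idea in Java (names hypothetical). Because the fake implements the contract interface directly, a contract change, such as `calculate()` starting to declare `CannotCalculateException`, makes this single class stop compiling, which forces every test that uses it to be revisited instead of silently passing:

```java
// The NEW contract: failure is signaled with a checked exception.
class CannotCalculateException extends Exception {}

interface Calculator {
    int calculate(int input) throws CannotCalculateException;
}

// The single fake reused by all tests, instead of many ad-hoc mocks.
// If the interface changes again, the compiler flags this one place.
class FakeCalculator implements Calculator {
    @Override
    public int calculate(int input) throws CannotCalculateException {
        if (input == 0) throw new CannotCalculateException(); // new contract
        return input * 2; // simple deterministic behavior for tests
    }
}

public class SharedFakeDemo {
    public static void main(String[] args) throws Exception {
        Calculator fake = new FakeCalculator();
        System.out.println(fake.calculate(21)); // prints 42
    }
}
```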

+1
Jun 03 '10 at 23:54

I look at it this way: when your contract changes, you should treat it as a new contract. Therefore, for this "new" contract, you must create a completely new set of unit tests. The fact that you already have an existing set of test cases is beside the point.

0
Jun 03 '10 at 11:52

I second Uncle Bob's view that the problem is in the design. But I would go back one step further and check the design of your contracts.

In short

Instead of saying "return -1 for x == 0" or "throw CannotCalculateException for x == y", underspecify niftyCalcuatorThingy(x,y) with the precondition x != y && x != 0 in appropriate situations (see below). Your stubs can then behave arbitrarily in those cases, your unit tests must reflect that, and you gain maximum modularity: the freedom to change the behavior of your system under test arbitrarily for all unspecified cases, without having to change contracts or tests.
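In Java, such an underspecified contract might be sketched like this (interface and stub names are hypothetical, loosely modeled on the niftyCalcuatorThingy example):

```java
// Behavior is defined ONLY when the precondition holds; callers violating
// it get no guarantees, so stubs and tests never need to encode a failure
// convention (-1 vs. exception) that might later change.
interface NiftyCalculator {
    /**
     * Precondition: x != 0 && x != y  (the caller's responsibility).
     * Postcondition: returns the computed value for valid inputs.
     * Behavior for precondition violations is deliberately unspecified.
     */
    int compute(int x, int y);
}

class StubNiftyCalculator implements NiftyCalculator {
    @Override
    public int compute(int x, int y) {
        // Free to assume the precondition; no -1 and no exception to mirror.
        return x + y;
    }
}

public class UnderspecDemo {
    public static void main(String[] args) {
        NiftyCalculator stub = new StubNiftyCalculator();
        System.out.println(stub.compute(2, 3)); // prints 5
    }
}
```

Because the failure cases are outside the contract, changing how the production implementation reacts to them later requires no changes to this stub or to the tests built on it.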

Specify only where necessary

You can differentiate your statement "-1 when it for some reason doesn't work" by the following criteria. Is the scenario:

  • exceptional behavior that an implementation can test?
  • within the method's domain of responsibility?
  • an exception that the caller (or someone previously in the call stack) can recover from / handle in some other way?

If and only if 1) through 3) hold, specify the scenario in the contract (for example, that an EmptyStackException is thrown when pop() is called on an empty stack).

Without 1), an implementation cannot guarantee specific behavior in the exceptional case. For example, Object.equals() does not specify any behavior when reflexivity, symmetry, transitivity, and consistency are not satisfied.

Without 2), the Single Responsibility Principle is violated, modularity is broken, and callers/readers of the code are confused. For example, Graph transform(Graph original) should not specify that a MissingResourceException can be thrown just because deep copying happens to be implemented via serialization under the hood.

Without 3), the caller cannot make use of the specified behavior (a specific return value/exception). For example, when the JVM throws an UnknownError.

Advantages and disadvantages

If you specify cases where 1), 2), or 3) does not hold, you run into difficulties:

  • The main purpose of (design by) contract is modularity. This is best achieved if you really separate responsibilities: when a precondition (the caller's responsibility) is not met, leaving the implementation's behavior unspecified yields maximum modularity, as your example shows.
  • You lose the freedom to make future changes, even toward more general functionality, e.g. a method that throws the exception in fewer cases.
  • Exceptional behavior can become quite complex, so the contracts covering it become complex, error-prone, and hard to understand. For example: is every situation covered? What behavior is correct when several exceptional preconditions are met at once?

The disadvantage of underspecification is that robustness, i.e. an implementation's ability to respond appropriately to abnormal conditions, is harder to test.

As a compromise, I like to use the following contract template where possible:

<(Semi-)formal PRE- and POST-conditions, including exceptional behavior where 1) through 3) hold.>

If PRE is violated, the current implementation throws RTE A, B, or C.
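In Javadoc form, the template might look like this sketch (the `SafeDivider` interface is hypothetical; the "current implementation" note documents behavior without promising it):

```java
interface SafeDivider {
    /**
     * PRE:  divisor != 0  (caller's responsibility)
     * POST: returns dividend / divisor (integer division)
     *
     * If PRE is violated, the CURRENT implementation throws
     * ArithmeticException; callers must not rely on this.
     */
    int divide(int dividend, int divisor);
}

class SimpleDivider implements SafeDivider {
    @Override
    public int divide(int dividend, int divisor) {
        // Throws ArithmeticException when divisor == 0, but that behavior
        // is outside the contract and free to change.
        return dividend / divisor;
    }
}

public class ContractTemplateDemo {
    public static void main(String[] args) {
        System.out.println(new SimpleDivider().divide(10, 2)); // prints 5
    }
}
```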

0
Oct 07