Designing your test methods using a simple structure such as Given-When-Then can help you communicate the purpose of your tests more clearly, focus your thinking while you're writing your tests, make test writing faster, make it easier to reuse parts of your test, highlight the assumptions you're making about test preconditions, and highlight what outcomes you expect, and what you're testing against. In this episode I'll be talking about designing your test cases, and test methods, using Given-When-Then. It doesn't matter if you're using pytest, unittest, nose, or something completely different, this episode will help you write better tests. [music] I'm Brian Okken, and this is the Python Test Podcast, a podcast about software development and software testing in Python. I announce new shows on Twitter at @testpodcast, and follow me at @brianokken. [music] [1:00] Structuring your tests not only makes it easier to read, it makes it easier to write and reuse. I'm really excited to get into this topic, but first I'd like to take a moment to thank our show sponsors. I don't talk about sponsors much on this show, but I do have them. This episode is brought to you by listeners like you who have pledged support for the show through my Patreon campaign at pythontesting.net/support. You can go directly to patreon.com/okken, but let's be honest, hitting my support page at pythontesting.net is easier to remember. But if you want, Patreon is spelled P-A-T-R-E-O-N, dot com, and Okken is spelled O-K-K-E-N. Right now, I'm trying to reach my first goal of sixty dollars per episode. When I get there, I'm going to outsource transcript writing, and Patreon supporters will be the first to access those transcripts. I'm also thinking of some other goodies for supporters. Hmm, maybe some stickers. At this moment, there are twenty individuals that are supporting the show. I'm going to thank them all right now, what the heck. 
Here goes: Thank you Hamza, Nils, Dan, Matthew, Bartelby, Andrew, Mahmood, Michael, Joe, Allan, Anthony, Richard, Jason, Dave, Javier, Aldo, and another Nils. You all rock, thank you tremendously. [2:22] Well let's get into the topic for today: designing your test cases using Given-When-Then. What I'm talking about here is the test functions and methods. Not the structure of your entire suite, but the individual tests. This applies to any framework, but I'm going to assume pytest for now, so I don't have to keep saying "pytest or unittest or nose or something else". Pytest doesn't care what you put in your test functions and methods. And it doesn't really care what goes into the setup and teardown functions and methods, or test fixtures. The syntax and mechanics of all that are pretty straightforward. If the syntax or mechanics trip you up, don't feel bad, just bookmark a good reference somewhere. Of course, I suggest that you bookmark pythontesting.net/starthere, and the fixture reference page for the appropriate framework, found on the start here page. I'll put links to those in the show notes. But once you've got the mechanics down, you can put whatever you want in there. If a fixture hits an assert or an exception, then your test will end in error, and if a test function or method, the main part of your test, hits an assert, your test will fail. That's how it works. Great, there's the mechanics. But it still seems like a blank canvas, an empty page. What should you put in there? Well, just like artists and writers are frequently aided in their creativity by following a familiar structure, we too as test writers can use structure to not only get past the blank page problem, but also to achieve quite a few benefits. [3:58] Let's talk about structure first, and then we'll cover some of the benefits. There have been many different structures or outlines proposed as good models for writing tests. 
The models I'm familiar with really seem like the same thing with different names. Let's start with Given-When-Then. The model I use now, and love most, is called Given-When-Then. It's just so darn easy to remember, and it puts me right in the right mindset for thinking about my tests, expanding the tests and reusing parts. It's pretty basic. Given some context for your test to run in, When some action happens, Then some consequences are expected, either output from the action or side effects, that can be tested. A simplistic way to start is to just separate the code in the body of your test functions into three visually separate chunks. I usually separate the sections with a blank line or two. You can also put a comment at the top of the section with those very keywords. If you're writing whole sentences in the comment, maybe put Given-When-Then in all caps. Like, for instance, comments might be: # GIVEN a mobile is registered, WHEN a test-mode data connection is initiated, THEN the call should connect. [5:16] This Given-When-Then structure is borrowed from BDD, Behaviour Driven Development. I think it's the only thing I've taken from BDD actually. BDD has a lot of baggage that I'm not quite ready to deal with yet, but I love the Given-When-Then. This especially becomes super powerful if you don't even put the Given in the test function proper. Put it in a fixture. Put it in setup for a class or a module, or better yet, put it in a named pytest fixture. The power of putting the Given in a fixture is that if you can't get through the Given portion, say you hit an assert, then the test doesn't end in failure, it ends in error, and, you know, that makes sense. The thing that you are testing is the When part, and if you can't get through the Given, it's not a test failure, you just didn't even get to the point where you could make the test. 
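As a rough sketch of what that looks like in pytest, here is the mobile example with the Given moved into a named fixture. The Mobile class and its methods are made up purely for illustration; they are not a real API, and the real setup would be whatever your system needs.

```python
import pytest


class Mobile:
    """Hypothetical stand-in for the device under test."""

    def __init__(self):
        self.registered = False
        self.connected = False

    def register(self):
        self.registered = True

    def start_data_connection(self, test_mode=False):
        # In this toy model only a registered mobile can connect in test mode.
        self.connected = self.registered and test_mode
        return self.connected


@pytest.fixture
def registered_mobile():
    # GIVEN a mobile is registered
    mobile = Mobile()
    mobile.register()
    # If this assert trips, pytest reports the test as an ERROR, not a FAILURE.
    assert mobile.registered
    return mobile


def test_data_connection_connects(registered_mobile):
    # WHEN a test-mode data connection is initiated
    connected = registered_mobile.start_data_connection(test_mode=True)

    # THEN the call should connect
    assert connected
    assert registered_mobile.connected
```

With the Given living in the fixture, the test body is just the When and the Then, and any other test that needs a registered mobile can ask for the same fixture by name.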
And also, if you put everything, the Given part, in the fixture, then the test body only has two parts, the When and the Then. For some tests, the Given will be setting up test data, but it could also be getting the system into a proper state. For me, when testing embedded electronic instrument code, the Given or the setup is doing things like configuring RF ports, setting cable losses, loading arbitrary waveform generators or many fun things like that. But for you it could be quite a lot simpler, or it even could be empty. If the action you are taking in the When section should have the same side effect regardless of the state the system is in before, then there's nothing to put there. I suggest being explicit though and putting a comment like "# Given any state", before moving on to the next sections. This ensures that the future test maintainers know what you did, and that you did think of what the preconditions were, and there weren't any. [7:18] The When section should be very readable and very obvious what's going on. The When section should be doing one thing. Even if that thing is complex, it should be something that a user would think of as doing one thing. The Then section is where you check the postconditions, and look for observable side effects. That's where all the asserts are. Some people will tell you that you should only have one assert per test, but that's rubbish. What they're talking about is a very narrow definition of Test Driven Development, a definition that doesn't include all the levels of testing that I concern myself with on a daily basis. If your action from the When section has like fifteen observable side effects and a function output, then by all means, go ahead and put sixteen assert statements in there. I usually only have a few really, but this totally depends on the test, what you are testing, your domain, and many other factors. 
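Here is a small sketch of a test with an explicit "Given any state" comment, a single When action, and several asserts in the Then. The Instrument class, its attributes, and its default values are all assumptions invented for this example.

```python
class Instrument:
    """Hypothetical instrument with a few settings that reset() restores."""

    def __init__(self):
        self.power_dbm = -10.0
        self.cable_loss_db = 1.5
        self.rf_port = "A"

    def reset(self):
        # Put every setting back to its (made-up) safe default.
        self.power_dbm = -100.0
        self.cable_loss_db = 0.0
        self.rf_port = None
        return True


def test_reset_restores_defaults():
    # GIVEN any state
    instrument = Instrument()

    # WHEN the instrument is reset
    ok = instrument.reset()

    # THEN the call reports success and every setting is back at its default
    assert ok
    assert instrument.power_dbm == -100.0
    assert instrument.cable_loss_db == 0.0
    assert instrument.rf_port is None
```

One action, one function output, three side effects, four asserts. The blank lines between the sections are what make the When stand out.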
There's more that I have to say about Given-When-Then, but I'd like to take a break from it and discuss the other names for this test design structure. One that you may be familiar with is Setup-Test-Teardown, it's also sometimes expanded to Setup-Exercise-Verify-Teardown. This is for the most part just like Given-When-Then, with an additional teardown step. The Setup is equivalent to Given, Exercise is equivalent to When, and Verify is equivalent to the Then part. When written as Setup-Test-Teardown, the Test portion is both the When and the Then. So what's Teardown then? Well for a lot of you it's nothing, empty, nothing to do. It will be something important when you really need to undo whatever you did in the Setup, or in the Exercise portion. Let's say you're testing a transactional system. You can use the Teardown to roll back the transactions to the state before the test started. In my case, I might break a data connection with a mobile device, or make sure the power levels in the system are at safe levels, or reset a switch matrix to safe paths through the system. The teardown step is present in Given-When-Then as I use it, it's just not the hard part, so I don't mind not having it explicitly part of the name. When using pytest named fixtures, you will write the teardown as part of the fixture itself. Well, right with it anyway, in the form of a finaliser function, so the test proper doesn't have to think about teardown. Of course, if, in the test proper, the When section, say, you need something undone in the teardown, we need to make sure that happens even if an exception or an assert causes the test function to not complete. Okay, that might have been confusing. 
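A sketch may help here. This one uses pytest's request.addfinalizer to register the teardown inside the fixture, so the rollback runs even when the test body hits an assert or an exception. The Ledger class and the shared LEDGER object are hypothetical, standing in for whatever transactional system you are testing.

```python
import pytest


class Ledger:
    """Hypothetical transactional system: entries are pending until committed."""

    def __init__(self):
        self.committed = []
        self.pending = []

    def add(self, entry):
        self.pending.append(entry)

    def rollback(self):
        # Throw away anything that has not been committed.
        self.pending.clear()


# Shared between tests, so leftovers from one test would leak into the next.
LEDGER = Ledger()


@pytest.fixture
def ledger(request):
    # GIVEN a ledger with no pending entries
    LEDGER.rollback()
    # Teardown: registered up front, so it runs even if the test body fails.
    request.addfinalizer(LEDGER.rollback)
    return LEDGER


def test_add_creates_pending_entry(ledger):
    # WHEN an entry is added
    ledger.add("coffee")

    # THEN it is pending and nothing has been committed
    assert ledger.pending == ["coffee"]
    assert ledger.committed == []
```

The test function never mentions teardown at all. The fixture owns both the Given and the cleanup, which is what undoes the entry that the When section added.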
If I don't really have something in the Given state that I need to undo, but I do something in the When action that needs to get undone, a great way to do that is to actually include a test fixture setup that doesn't actually do anything, except for register a finaliser, and then the finaliser can clean things up. [10:22] Another name for this structure is Arrange-Act-Assert. Now, this should be really obvious how that maps to Given-When-Then. Arrange is Given, Act is When, and Assert is Then. Come to think of it, I kind of like Act better than When. Maybe Given-Act-Then? No, maybe we could name it Given-Act-Assert? No, we lose the alliteration, and I won't remember it. I seem to have no trouble remembering Given-When-Then, so I guess I'll stick with that. Are there any other names? Well, an older one is Preconditions-Trigger-Postconditions. That's not bad. Again, Given is the Preconditions section, When is the Trigger, and Then is the Postconditions. I don't really like that name for the name of the thing, but thinking about the Given as preconditions and the Then as postconditions, that's kind of cool and it helps me understand what they're for. Yeah, it's actually pretty good, but I'm not writing those in my comments, Preconditions and Postconditions, those are too long words, too much typing. And I can't make a sentence out of it. Let me know if you're running across any other variations, I can't think of any right now. If you do, a great place to do that would be in the comments section of the show notes at pythontesting.net/10, the number one zero. [11:45] Okay, I promised I'd talk about the benefits of using a pattern like Given-When-Then, or whatever variation that we've discussed that makes the most sense to you. So, the benefits. Splitting up your test function like this --Okay let's just assume Given-When-Then so I don't have to cover them all, and of course the teardowns and optional finalisers are implied in there --has many benefits. 
Hopefully these will make sense to you now. I listed them at the beginning, but let's cover them again. Communicate the purpose of your test more clearly. Well, having the When section simple and separated by whitespace will highlight for you, and for others reading the code, what it is that you're testing, especially if the test method name directly relates to the action in the When section. It really helps clarify what you're trying to test. If the name seems too long, or there's too much code in your When section, just review it. Should this really be one test, or should you split it up into more than one? I'm not telling you what the right answer is, I've got plenty of big-ish tests that make sense the way they are, just make sure it's clear what's going on. [12:53] Next benefit: focus your thinking while writing the test. Only thinking about one section at a time really helps to clarify thinking and coding. Kind of hard to put in words, but it really does help make it easier to know what to write. Another benefit is making test writing faster. Working within the constraint of Given-When-Then, and the focus you gain, really does make it faster. Also, you can look at the set of tests with the same Given section and using the same setup and decide if you've tested all the actions available to the user with that Given state. If not, write more tests with the same Given, but with different actions. And of course, the Then postconditions will need to be reexamined. This is also related to the next benefit: make it easier to reuse parts of your code. You can look at your tests now, and think about if the Given really represents the only states of your system where the When action can occur. If not, then you can add more tests with the same action but with different Given states. Of course, you've got to reexamine your postconditions then. 
I really just described two ways to reuse your test parts to make new tests: reusing the Given and adding a different action in the When section, and reusing the When section by adding different Given sections. These two kinds of reuse to create new tests are part of what's called behaviour coverage. Specifically I'm talking about state coverage and transition coverage. I'm going to talk about behaviour coverage, state coverage and transition coverage in a future episode. For now, just realise that separating the Given and the When section helps highlight the states being tested, the Given, and the changes to the state, the When, and this separation allows you to review your tests and see if you've missed some obvious actions or starting states. [14:52] Another benefit is to highlight the assumptions you are making about the test preconditions. Think I've kind of covered this already. Do you have all the reasonable Givens for the functionality you are testing in the When section? A few more benefits are to highlight what outcomes you are expecting and testing against, to highlight missing tests and highlight missing functionality. This is a smidge harder to get your head around. Looking at the Then sections associated with related tests, ones with either common Given or common When sections, or the starting states and transition out of those states, umm this is getting confusing. The Then section checks for postcondition states, represents the final state of the system after the actions. If you're pretty darned sure there are final states that aren't represented in a test, then you might have missing tests, or you might have missing functionality in the system. This is still kind of confusing. Let me give a simple example. Here's a couple tests that I can kind of describe reasonably well in audio form. [15:54] Two tests. One: test subscribe when not registered, and another one, test subscribe when already registered. So the first, when not registered. 
Given a user is not subscribed to a newsletter, when a user subscribes, then the user's email is now part of the newsletter email list, and the user is told that they are now subscribed. Next test: test subscribe when already registered. Given a user is already subscribed, when the user subscribes to the newsletter, then no changes are made to the email list and the user is told that they are already subscribed. OK, I've got a set of two tests. Let's look at the set. I have two tests with the same action, subscribing. They come from two different starting states, either subscribed or not subscribed. But I don't have any postconditions where the user ends up in the not subscribed state. If there's no unsubscribe functionality, then I've just noticed a missing functionality of my system, and we should add it. If that functionality already exists, to unsubscribe, then I've just forgotten to write those tests for it, and I can go write those tests now. [17:06] Another thing that I'd like to talk about now is, doing all this reuse, and reuse can be aided by fixture parameterization or test parameterization. That's a big word. It's such a common case to have these reusable parts that you want to build lots and lots of tests with, with shared parts, but these cases can be handled by pytest almost automatically. So you can write fewer tests but still cover all the starting states and functionality you need to. This is called fixture parameterization, or test parameterization. I'll talk about that in a future episode as well. I do touch on it in the params section of an article I wrote called pytest Fixtures: Nuts and Bolts, and I'll have a link to that in the show notes. I think that's enough from the benefits list right now, I think I've shown that structuring your tests not only makes it easier to read, it makes it easier to write and reuse components. But wow, I've highlighted a lot of areas, I think, that I need to cover in more detail in future episodes. 
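For anyone reading rather than listening, the two subscribe tests described above might look roughly like this. The Newsletter class, its subscribe method, and the returned message strings are assumptions made for the sketch; the episode only describes the behaviour, not an implementation.

```python
class Newsletter:
    """Hypothetical newsletter holding a list of subscriber emails."""

    def __init__(self, emails=()):
        self.emails = list(emails)

    def subscribe(self, email):
        # Return the message the user would be shown.
        if email in self.emails:
            return "already subscribed"
        self.emails.append(email)
        return "subscribed"


def test_subscribe_when_not_registered():
    # GIVEN a user is not subscribed to the newsletter
    newsletter = Newsletter()

    # WHEN the user subscribes
    message = newsletter.subscribe("user@example.com")

    # THEN the email is on the list and the user is told they are subscribed
    assert newsletter.emails == ["user@example.com"]
    assert message == "subscribed"


def test_subscribe_when_already_registered():
    # GIVEN a user is already subscribed
    newsletter = Newsletter(["user@example.com"])

    # WHEN the user subscribes again
    message = newsletter.subscribe("user@example.com")

    # THEN the list is unchanged and the user is told they are already subscribed
    assert newsletter.emails == ["user@example.com"]
    assert message == "already subscribed"
```

Reading only the THEN comments side by side, neither test ends with the user off the list, which is the gap the example points at: either unsubscribe tests are missing, or the unsubscribe feature is.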
Also this would be a great episode to have a transcript for, don't you think? I'd like to get those transcripts written, so please, if you haven't already, sign up for the Patreon campaign and help me get to the sixty dollar goal so I can get transcripts written, and so you can get early access to them. [18:28] This has been another episode of the Python Test Podcast. Today's show was sponsored by listeners. Thanks again to everyone for ponying up a buck or two per episode. Become a supporter yourself, visit pythontesting.net/support. This show was episode ten, and the show notes are at pythontesting.net/10. Be sure to subscribe to the show, search for Python in any podcast client. My logo's the black and white one. You can also find the iTunes and direct RSS feeds at pythontesting.net/podcast. This is Brian Okken, thanks for listening. [music]