Grey is made up of Black and White

There's an idea that has been on the back of my mind for a while now.

In certain testing (and development) circles, one hears about "good enough" software. I get it. Nothing's perfect and no software can ever be completely bug free, so the best you can hope for is to have caught and fixed the most important bugs (to your stakeholders) before you ship. Or something like that. I'm paraphrasing.

The thing that irks some people is that "good enough" is not carved in stone. It has a human element to the decision-making process that some bean counters might prefer to do without.

I think I've come to appreciate that definition over time because I've had the benefit of seeing many projects of different types for different companies go out over the last decade. With experience, you develop a sense of when things are in your comfort zone (i.e. acceptable risk) compared with when they are not (i.e. this risk gives me the willies - ship it if you dare but I'm going to duck and cover).

That part I'm okay with. That's not what's troubling me. The thing I'm wondering about is where do you start to teach someone this?

So, when I first started to Test, I started by working on some automation tests, and then moved to developing some written test cases and test plans. Over time, I came to appreciate that the brain tends to move quicker than paper and that there are some testing practices out there that can help you test more and provide you with more good information and feedback in the time given on a project. You can choose to spend that precious quality time with paperwork or do more testing. The choice is yours.

Fast forward a few years. I've chosen to teach Exploratory Testing to people who are new to the Testing field. It's an interesting and challenging approach and I love how it gets you to think bigger than you've thought before.

So, you decide to test something. Did the test "Pass"? Did it "Fail"? How do you know? Are you sure? Can you check? ... and so on...

This leads to discussion of Oracles - i.e. a principle or mechanism by which you can tell something is a bug. Funny thing about Oracles - they're like siblings. Each one is unique, they're similar/related in some ways but different in others, and they all think they're the most important and want to be in front.

As a tester, you need to decide *which* is the most relevant Oracle to help you decide if something is a bug or not.

For example, I often explain to new testers that the app might match the specification when you compare them, so is that a Pass? The simple answer is "yes"... assuming the only thing you care about is comparing the application to a particular claim. So what's the problem?

Well, the problem I have is the assumption that the spec, the claim, is correct in the first place. What if they both match each other but I don't care because it goes against what I *expect* as a user? That is, when you use computers for a while you develop a general understanding of how many apps work in similar ways for similar functions. And then along comes some hot-shot spec-writer who thinks they're going to be awesome and develop something completely new and different for ... oh, I dunno, the 'Print' function.

"Okay," you say to the spec-writer, "you might do that.. but then again, reinventing something that everyone has a certain idea of how it should work might lead to mistakes, frustration and user/operator errors that are not so good."

So, back to the tester, the app matches the spec, but they're both wrong I say because neither matches my expectations. Where's the Pass? Where's the Fail? It's not so easy anymore. In fact, you might say there are shades of grey when it comes to determining Pass/Fail - especially if you find yourself in the situation where there are 3 or more applicable Oracles. (eek!)

That's where I think my line of thinking went wrong a while back. It's not about shades of grey, it's clearly about determining which sibling, which Oracle, is the most applicable here. You *have* to pick one. There has to be a binary, logical Pass/Fail answer to the question "Is there a problem here?" One Oracle needs to edge out more than the others. You may not be able to make that decision on your own, and that's okay. But someone, or some team of people, needs to reach a consensus about whether or not something is a bug according to some oracle that they care about most in the given situation.

At the end of the day, it doesn't matter if your brain is firing on all cylinders or none, a "black and white" decision needs to be made about whether or not something is a bug. Computers are logical things. I think it makes sense that we need to apply logic in this situation too.

But hold on here, where does this fit in with the whole "good enough" software thing I started talking about a few minutes ago? That doesn't sound very logical. In fact, it sounds kind of like the opposite, doesn't it?

Why, yes, it does sound illogical, and yet there is logic in it too. The idea came to me when standing in a Wendy's looking at a poster while waiting to order lunch. The poster had a picture made up of smaller photos.. a photo mosaic it's called. (Aside: Google search turns up lots of cool info on these.)

And then it hit me. Aha!

Determining "good enough" is like finally recognising the picture of the quality of the release when you spend every day looking at all of the smaller 'quality' photos. These dozens, hundreds, or thousands of 'black and white' decisions that are made over the course of a project (i.e. the bugs you find) paint a picture. Some people pay attention to these, while others do not (to their detriment).

Good development teams come up with Release criteria that make sense and often contain human elements in the final go/no-go release decision. Different project team members may have different pieces of the picture. As a Test Lead/Manager, you bring information about the tests that have been run, the risks that were covered, and the kinds of bugs that were found, reported and fixed (and the ones left outstanding). Sometimes you might even have other metrics that you feel are important in helping you recognise "good enough" when you see it.

I like the idea of a photo mosaic as an analogy for good enough software. You may never get to see the complete picture, but you might not need to either. When the picture is too fuzzy, then you should recognise that you might not be collecting all the information you need to help make a good timely decision.

As a tester, you need to be concerned with all the logical decisions in the Pass/Fail criteria world that helps you identify bugs on a daily basis. But don't get attached to any one particular bug, because in the grand scheme it may not be as important as you think it is. (It *might* be, but be aware that it might *not* be too.)

As a Test Lead or Manager, you need to step back from all the separate little details and ask yourself what patterns you recognise in the complete body of [testing] work that has been performed? One might even use the cliché "can you recognise the forest for the trees?" but I like my photo mosaic analogy better. ;)

I like how the colour grey is made up of mixing the colours black and white together. And similarly, good test management decisions, which may seem 'fuzzy' to some, are also based on the understanding of all the little logical decisions made by the testers on the team.

These are different skills (between tester and test lead). I haven't been specifically teaching testers to see the big picture and patterns in work beyond their own. Where does that fit in? When? I don't know that I'm comfortable doing that just yet.

I think I got the teaching testing part down, so I guess the next logical step is looking into the skills required to manage good test teams.

Any thoughts or recommendations? Is it just something that you learn by trial and error, by osmosis, working on different projects for different companies? (Nahh, that's how I learned.. it takes too long.)

This, I think, is going to be the next big need for training in the testing field. Let's say for a moment that you've trained testers in good/different testing practices. So how do you train the managers or leads to manage in new, intelligent ways to complement the productive, agile testing effort? Part of it will be in developing new test management tools, but part of it, clearly, will be in skills development and education too. I just don't see that available right now.

I'll have to keep my eyes open the next time I'm out for lunch. ;)

Think *inside* the Box

Last weekend I presented at the Toronto Workshop on Software Testing (TWST). The topic for the workshop this year was "Coaching, Mentoring and Training Software Testers" and I decided to present a process that I had developed with my team to help manage a peer review process. For us, that's a coaching opportunity, so I thought it was relevant to the theme. There was some excitement and hoopla over the presentation that had me baffled for a bit though.

My presentation had some specific content and charts and things which I know were "new" because, well, we developed them. That was the point of sharing it with colleagues. The thing that had me stumped/confused was at a higher level than that though.

You see, I didn't think there was anything particularly new about the whole idea. Our team follows a process that, as someone reminded me, was initially developed 10 years ago for some specific company's needs. That initial presentation is on the web for all to see and I came across it a long time ago.

The high-level process description fits in with our current company's needs and has been working well for us for several years now. In all the articles and descriptions that I've found online, there was just one piece that wasn't described too well.. okay, at all. To clarify the specific example here, I'm talking about Session-Based Test Management (SBTM) - a test management framework to help manage and measure your Exploratory Testing (ET) effort.

Overall, the process has worked for us in our present company/context, but the "Debrief" step/phase is a bit 'vague' or 'light' in description compared to the other important elements. As it happens, I have a teaching (B.Ed) degree, coupled with my 10+ years of experience, so I feel I am able to improvise this step fairly well.

But one thing always bugged me about that step - it didn't fit in with the overall flow of the rest of the framework. That is, according the the framework 'activity hierarchy' you're either testing or your not. Well... I don't think that it's that clear cut. What about the Debrief step? Is that Testing? Or is it clearly "not"? I don't think it's either, but closer to the 'testing' side if I had to pick one.

Okay, so if I look at it as closer to the testing side, then does it fit in with the overall framework? Actually, no. It's an oddball. The catch here is that the basic SBTM framework helps you manage and measure the ET effort, but nothing about the Debrief step. Wait a minute! Hold the phone. Why not?!

So we, as a team, came up with a process for managing and measuring the Debrief step that follows the same basic format as the testing part. When I look at it, it makes sense. It looks more like a complete picture to me now.

For a number of people at the workshop last week, our process seemed like a novel/fresh way of looking at it. I can't see that actually. As far as I can tell, our team is still following the same basic outline and process described over a decade ago. We just filled in some of the blanks in a way that we thought was consistent with the rest of the framework.

When people ask you to "think outside the box" they mean to think in an unconventional way to solve some problem. To me, I looked inside the box and noticed something was incomplete/missing. My solution, as far as I'm concerned, was "thinking inside the box."

Having said that, when I step back from the process to see what the big picture looks like, I feel a bit like Alfred Nobel after inventing dynamite. The reaction from my colleagues at the workshop was kind of like that too - explosive! (It was quite cool actually. I don't recall the last time I've seen such a flurry of excitement and questions in such a short time!)

I don't know what to do with this process right now. I'm changing it to put some "safeties" in place because I recognise the dangers of allowing people to easily generate "metrics" that represent the quality of a tester's work and learning.

Sigh. There is one thing I've gotten from all of this. I have more learning to do. This time, I know it is clearly in the field of Psychology, although I'm not yet sure where to start. I need to understand this Debrief dynamite that we've invented - what it means and how it can be used for good and not for evil in the hands of fools.

I think the new process is working (for me) though, because it allows me to ask new questions that I haven't thought to ask before. For instance, if ET is the simultaneous learning, design and test execution (by one definition), then managing the debrief step helps me to track a tester's learning. Am I really tracking learning though? Am I observing someone's level of interest and attention to detail? How much they care? What are the implications of poor quality work that doesn't improve over time? Should that tester move onto something else? Should I leave them alone if training and reinforcement doesn't help and they do generally "okay" work? ... ?

Aside from the details and all of these interesting new questions, it was just surprising to me that no one has come up with something similar already. I looked inside the box. I filled in the missing piece using a similar-looking piece. I don't see that as being unconventional.

Sometimes you don't have to go out of your way to come up with something fresh. Maybe no one has gotten around to looking at all the corners of the box yet. Maybe someone meant to but got distracted and didn't return. Maybe there's an opportunity waiting. Maybe you are the one to see it.

Have you looked in your box yet? Sometimes a good tester, like an inventor or explorer, is just thorough.