Trust – HCSE Blog

21/09/2019

Restoring Pride-in-Work

In 1986, Dr Don Berwick from Boston attended a 4-day seminar run by Dr W. Edwards Deming in Washington. Dr Berwick was a 40-year-old paediatrician who was also interested in health care management and improving quality and productivity. Dr Deming was an 86-year-old engineer and statistician who, when he was in his 40s, helped the US to improve the quality and productivity of the industrial processes supporting the US and Allies in WWII.

Don Berwick describes attending the seminar as an emotionally challenging, life-changing experience: he realised that his well-intended attempts to improve quality by inspection-and-correction were a counterproductive, abusive approach that led to fear, demotivation and erosion of pride-in-work. His blinding new clarity of insight led directly to the founding of the Institute for Healthcare Improvement in the USA in the early 1990s.

One of the tenets of Dr Deming’s theories is that the ingrained beliefs and behaviours that erode pride-in-work also lead to the very outcomes that management do not want – namely conflict between managers and workers and economic failure.

So, an explicit focus on improving pride-in-work as an early objective in any improvement exercise makes very good economic sense, and is a sign of wise leadership and competent management.

Last week a case study was published that illustrates exactly that principle in action. The important message in the title is “restore the calm”.

One of the most demotivating aspects of health care that many complain about is the stress caused by a chaotic environment, chronic crisis and perpetual firefighting. So, anything that can restore calm will, in principle, improve motivation – and that is good for staff, patients and organisations.

The case study describes, in detail, how calm was restored in a chronically chaotic chemotherapy day unit … on Weds, June 19th 2019 … in one day and at no cost!

To say that the chemotherapy nurses were surprised and delighted is an understatement. They were amazed to see that they could treat the same number of patients, with the same number of staff, in the same space and without the stress and chaos. And they had time to keep up with the paperwork; and they had time for lunch; and they finished work 2 hours earlier than previously!

Such a thing was not possible surely? But here they were experiencing it. And their patients noticed the flip from chaos-to-strangely-calm too.

The impact of the one-day-test was so profound that the nurses voted to adopt the design change the following week. And they did. And the restored calm has been sustained.

What happened next?

The chemotherapy nurses were able to catch up with their time-owing that had accumulated from the historical late finishes. And the problem of high staff turnover and difficulty in recruitment evaporated. Highly-trained chemotherapy nurses who had left because of the stressful chaos now want to come back. Pride-in-work has been re-established. There are no losers. It is a win-win-win result for staff, patients and organisations.

So, how was this “miracle” achieved?

Well, first of all it was not a miracle. The flip from chaos-to-calm was predicted to happen. In fact, that was the primary objective of the design change.

So, how was this design change achieved?

By establishing the diagnosis first – the primary cause of the chaos – and it was not what the team believed it was. And that is the reason they did not believe the design change would work; and that is the reason they were so surprised when it did.

So, how was the diagnosis achieved?

By using an advanced systems engineering technique called Complex Physical System (CPS) modelling. That was the game changer! All the basic quality improvement techniques had been tried and had not worked – process mapping, direct observation, control charts, respectful conversations, brainstorming, and so on. The system structure was too complicated. The system behaviour was too complex (i.e. chaotic).

What CPS revealed was that the primary cause of the chaotic behaviour was the work scheduling policy. And with that clarity of focus, the team were able to re-design the policy themselves using a simple paper-and-pen technique. That is why it cost nothing to change.

So, why hadn’t they been able to do this before?

Because systems engineering is not a taught component of the traditional quality improvement offerings. Healthcare is rather different to manufacturing! As the complexity of the health care system increases we need to learn the more advanced tools that are designed for this purpose.

What is the same is the principle of restoring pride-in-work and that is what Dr Berwick learned from Dr Deming in 1986, and what we saw happen on June 19th, 2019.

To read the story of how it was done click here.

08/06/2019

Commissioned Improvement

This recent tweet represents a significant milestone. It formally recognises and celebrates in public the impact that developing health care systems engineering (HCSE) capability has had on the culture of the organisation.

What is also important is that the HCSE training was not sought and funded by the Trust; it was discovered by chance and funded by their commissioners, the local clinical commissioning group (CCG).

The story starts back in the autumn of 2017 and, by chance, I was chatting with Rob, a friend-of-a-friend, about work. As you do. It turned out that Rob was the CCG Lead for Unscheduled Care and I was describing how HCSE can be applied in any part of any health care system; primary care, secondary care, scheduled, unscheduled, clinical, operational or whatever. They are all parts of the same system and the techniques and tools of improvement-by-design are generic. And I described lots of real examples of doing just that and the sustained improvements that had followed.

So he asked “If you were to apply this approach to unscheduled care in a large acute trust how would you do it?”. My immediate reply was “I would start by training the front line teams in the HCSE Level 1 stuff, and the first step is to raise awareness of what is possible. We do that by demonstrating it in practice because you have to see it and experience it to believe it.”

And so that is what we did.

The CCG commissioned a one-year HCSE Level 1 programme for four teams at University Hospitals of North Midlands (UHNM) and we started in January 2018 with some One Day Flow Workshops.

The intended emotional effect of a Flow Workshop is surprise and delight. The challenge for the day is to start with a simulated, but very realistic, one-stop outpatient clinic which is chaotic and stressful for everyone. And with no prior training the delegates transform it into a calm and enjoyable experience using the HCSE approach. It is called emergent learning. We have run dozens of these workshops and it has never failed.

After directly experiencing HCSE working in practice the teams that stepped up to the challenge were from ED, Transformation, Ambulatory Emergency Care and Outpatients.

The key to growing HCSE capability is to assemble small teams, called micro-system design teams (MSDTs) and to focus on causes that fall inside their circle of control.

The MSDT sessions need to be regular, short, and facilitated by an experienced HCSE who has seen it, done it and can teach it.

In UHNM, the Transformation team divided themselves between the front-line teams and they learned HCSE together. Here’s a picture of the ED team … left to right we have Alex, Mark and Julie (ED consultants) then Steve and Janina (Transformation). The essential tools are a big table, paper, pens, notebooks, coffee and a laptop/projector.

The purpose of each session is empirical learning-by-doing i.e. using a real improvement challenge to learn and practice the method so that before the end of the programme the team can confidently “fly” solo.

That is the key to continued growth and sustained improvement. The HCSE capability needs to become embedded.

It is good fun and immensely rewarding to see the “ah ha” moments and improvements happen as the needle on the emotometer moves from “Can’t Do” to “Can Do”.

Metamorphosis is re-arranging what you already have in a way that works better.

The tweet is objective evidence that demonstrates the HCSE programme delivers as designed. It is fit-for-purpose. It is called validation.

The other objective evidence of effectiveness comes from the learning-by-doing projects themselves. And for an individual to gain a coveted HCSE Level 1 Certificate of Competency requires writing up to a publishable quality and sharing the story. Warts-and-all.

To read the full story just click here.

And what started this was the CCG who had the strategic vision, looked outside themselves for innovative approaches, and demonstrated the courage to take a risk.

Commissioned Improvement.

02/12/2017

The Strangeness of LoS

It had been some time since Bob and Leslie had chatted so an email from the blue was a welcome distraction from a complex data analysis task.

<Bob> Hi Leslie, great to hear from you. I was beginning to think you had lost interest in health care improvement-by-design.

<Leslie> Hi Bob, not at all. Rather the opposite. I’ve been very busy using everything that I’ve learned so far. Its applications are endless, but I have hit a problem that I have been unable to solve, and it is driving me nuts!

<Bob> OK. That sounds encouraging and interesting. Would you be able to outline this thorny problem and I will help if I can.

<Leslie> Thanks Bob. It relates to a big issue that my organisation is stuck with – managing urgent admissions. The problem is that very often there is no bed available, but there is no predictability to that. It feels like a lottery; a quality and safety lottery. The clinicians are clamouring for “more beds” but the commissioners are saying “there is no more money”. So the focus has turned to reducing length of stay.

<Bob> OK. A focus on length of stay sounds reasonable. Reducing that can free up enough beds to provide the necessary space-capacity resilience to dramatically improve the service quality. So long as you don’t then close all the “empty” beds to save money, or fall into the trap of believing that 85% average bed occupancy is the “optimum”.

<Leslie> Yes, I know. We have explored all of these topics before. That is not the problem.

<Bob> OK. What is the problem?

<Leslie> The problem is demonstrating objectively that the length-of-stay reduction experiments are having a beneficial impact. The data seems to say they are, and the senior managers are trumpeting the success, but the people on the ground say they are not. We have hit a stalemate.

<Bob> Ah ha! That old chestnut. So, can I first ask what happens to the patients who cannot get a bed urgently?

<Leslie> Good question. We have mapped and measured that. What happens is the most urgent admission failures spill over to commercial service providers, who charge a fee-per-case and we have no choice but to pay it. The Director of Finance is going mental! The less urgent admission failures just wait on queue-in-the-community until a bed becomes available. They are the ones who are complaining the most, so the Director of Governance is also going mental. The Director of Operations is caught in the cross-fire and the Chief Executive and Chair are doing their best to calm frayed tempers and to referee the increasingly toxic arguments.

<Bob> OK. I can see why a “Reduce Length of Stay Initiative” would tick everyone’s Nice If box. So, the data analysts are saying “the length of stay has come down since the Initiative was launched” but the teams on the ground are saying “it feels the same to us … the beds are still full and we still cannot admit patients”.

<Leslie> Yes, that is exactly it. And everyone has come to the conclusion that demand must have increased so it is pointless to attempt to reduce length of stay because when we do that it just sucks in more work. They are feeling increasingly helpless and hopeless.

<Bob> OK. Well, the “chronic backlog of unmet need” issue is certainly possible, but your data will show if admissions have gone up.

<Leslie> I know, and as far as I can see they have not.

<Bob> OK. So I’m guessing that the next explanation is that “the data is wonky”.

<Leslie> Yup. Spot on. So, to counter that the Information Department has embarked on a massive push on data collection and quality control and they are adamant that the data is complete and clean.

<Bob> OK. So what is your diagnosis?

<Leslie> I don’t have one, that’s why I emailed you. I’m stuck.

<Bob> OK. We need a diagnosis, and that means we need to take a “history” and “examine” the process. Can you tell me the outline of the RLoS Initiative.

<Leslie> We knew that we would need a baseline to measure from so we got the historical admission and discharge data and plotted a Diagnostic Vitals Chart®. I have learned something from my HCSE training! Then we planned the implementation of a visual feedback tool that would show ward staff which patients were delayed so that they could focus on “unblocking” the bottlenecks. We then planned to measure the impact of the intervention for three months, and then we planned to compare the average length of stay before and after the RLoS Intervention with a big enough data set to give us an accurate estimate of the averages. The data showed a very obvious improvement, a highly statistically significant one.

<Bob> OK. It sounds like you have avoided the usual trap of just relying on subjective feedback, and now have a different problem because your objective and subjective feedback are in disagreement.

<Leslie> Yes. And I have to say, getting stuck like this has rather dented my confidence.

<Bob> Fear not Leslie. I said this is an “old chestnut” and I can say with 100% confidence that you already have what you need in your T4 kit bag.

<Leslie> Tee-Four?

<Bob> Sorry, a new abbreviation. It stands for “theory, techniques, tools and training”.

<Leslie> Phew! That is very reassuring to hear, but it does not tell me what to do next.

<Bob> You are an engineer now Leslie, so you need to don the hard-hat of Improvement-by-Design. Start with your Needs Analysis.

<Leslie> OK. I need a trustworthy tool that will tell me if the planned intervention has had a significant impact on length of stay, for better or worse or not at all. And I need it to tell me that quickly so I can decide what to do next.

<Bob> Good. Now list all the things that you currently have that you feel you can trust.

<Leslie> I do actually trust that the Information team collect, store, verify and clean the raw data – they are really passionate about it. And I do trust that the front line teams are giving accurate subjective feedback – I work with them and they are just as passionate. And I do trust the systems engineering “T4” kit bag – it has proven itself again-and-again.

<Bob> Good, and I say that because you have everything you need to solve this, and it sounds like the data analysis part of the process is a good place to focus.

<Leslie> That was my conclusion too. And I have looked at the process, and I can’t see a flaw. It is driving me nuts!

<Bob> OK. Let us take a different tack. Have you thought about designing the tool you need from scratch?

<Leslie> No. I’ve been using the ones I already have, and assume that I must be using them incorrectly, but I can’t see where I’m going wrong.

<Bob> Ah! Then, I think it would be a good idea to run each of your tools through a verification test and check that they are fit-4-purpose in this specific context.

<Leslie> OK. That sounds like something I haven’t covered before.

<Bob> I know. Designing verification test-rigs is part of the Level 2 training. I think you have demonstrated that you are ready to take the next step up the HCSE learning curve.

<Leslie> Do you mean I can learn how to design and build my own tools? Special tools for specific tasks?

<Bob> Yup. All the techniques and tools that you are using now had to be specified, designed, built, verified, and validated. That is why you can trust them to be fit-4-purpose.

<Leslie> Wooohooo! I knew it was a good idea to give you a call. Let’s get started.

[Postscript] And Leslie, together with the other stakeholders, went on to design the tool that they needed and to use the available data to dissolve the stalemate. And once everyone was on the same page again they were able to work collaboratively to resolve the flow problems, and to improve the safety, flow, quality and affordability of their service. Oh, and to know for sure that they had improved it.

25/11/2017

The Turkeys Voting For Xmas Trap

One of the quickest and easiest ways to kill an improvement initiative stone dead is to label it as a “cost improvement program” or C.I.P.

Everyone knows that the biggest single contributor to cost is salaries.

So cost reduction means head count reduction which means people lose their jobs and their livelihood.

Who is going to sign up to that?

It would be like turkeys voting for Xmas.

There must be a better approach?

Yes. There is.

Over the last few weeks, groups of curious skeptics have experienced the immediate impact of systems engineering theory, techniques and tools in a health care context.

They experienced queues, delays and chaos evaporate in front of their eyes … and it cost nothing to achieve. No extra resources. No extra capacity. No extra cash.

Their reaction was “surprise and delight”.

But … it also exposed a problem. An undiscussable problem.

Queues and chaos require expensive resources to manage.

We call them triagers, progress-chasers, and fire-fighters. And when the queues and chaos evaporate then their jobs do too.

The problem is that the very people who are needed to make the change happen are the ones who become surplus-to-requirement as a result of the change.

So change does not happen.

It would be like turkeys voting for Xmas.

The way around this impasse is to anticipate the effect and to proactively plan to re-invest the resource that is released. And to re-invest it doing more interesting and more worthwhile jobs than queue-and-chaos management.

One opportunity for re-investment is called time-buffering which is an effective way to improve resilience to variation, especially in an unscheduled care context.

Another opportunity for re-investment is tail-gunning the chronic backlogs until they are down to a safe and sensible size.

And many complain that they do not have time to learn about improvement because they are too busy managing the current chaos.

So, another opportunity for re-investment is training – oneself first and then others.

R.I.P. C.I.P.

25/02/2017

Levels of Resistance

Improvement implies change, but change does not imply improvement.

We have all experienced the pain of disappointment when a change that promised much delivered no improvement, or even worse, a negative impact.

We have learned to become wary and skeptical about change.

We have learned a whole raft of tactics for deflection and diffusion of the enthusiasm of others. And by doing so we don the black hat of the healthy skeptic and the telltale mantra of “Yes, but …”.

So here is an onion diagram to use as a reference. It comes from a recently published essay that compares and contrasts two schools of flow improvement: Eli Goldratt’s “Theory of Constraints” and a translation of Systems Engineering called 6M Design®.

The first five layers can be described as “denial”, the second four as “grudging acceptance” … and the last one is the sound of the final barrier coming down and revealing the raw emotion underpinning our reluctance to change. Fear.

The good news is that this diagram helps us to shape and steer change in a way that improves its chances of success, because if we can learn to peel back these layers by sharing information that soothes the fear of the unknown, then we can align and engage. And that is essential for emotional momentum to build.

So when we meet resistance do we push or not?

Ask yourself. How would you prefer to be engaged? Pushed or not?

08/01/2017

The Power of Pictures

I am a big fan of pictures that tell a story … and this week I discovered someone who is creating great pictures … Hayley Lewis.

This is one of Hayley’s excellent sketch notes … the one that captures the essence of the Bruce Tuckman model of team development.

The reason that I share this particular sketch-note is because my experience of developing improvement-by-design teams is that it works just like this!

The tricky phase is the STORMING one because not all teams survive it!

About half sink in the storm – and that seems like an awful waste – and I believe it is avoidable.

This means that before starting the team development cycle, the leader needs to be aware of how to navigate themselves and the team through the storm phase … and that requires training, support and practice.

Which is the reason why coaching from an independent, experienced, capable practitioner is a critical element of the improvement process.

31/12/2016

How Do We Know We Have Improved?

Phil and Pete are having a coffee and a chat. They both work in the NHS and have been friends for years.

They have different jobs. Phil is a commissioner and an accountant by training, Pete is a consultant and a doctor by training.

They are discussing a challenge that affects them both on a daily basis: unscheduled care.

Both Phil and Pete want to see significant and sustained improvements and how to achieve them is often the focus of their coffee chats.

<Phil> We are agreed that we both want improvement, both from my perspective as a commissioner and from your perspective as a clinician. And we agree that what we want to see is improvements in patient safety, waiting, outcomes, experience for both patients and staff, and use of our limited NHS resources.

<Pete> Yes. Our common purpose, the “what” and “why”, has never been an issue. Where we seem to get stuck is the “how”. We have both tried many things but, despite our good intentions, it feels like things are getting worse!

<Phil> I agree. It may be that what we have implemented has had a positive impact and we would have been even worse off if we had done nothing. But I do not know. We clearly have much to learn and, while I believe we are making progress, we do not appear to be learning fast enough. And I think this knowledge gap exposes another “how” issue: After we have intervened, how do we know that we have (a) improved, (b) not changed or (c) worsened?

<Pete> That is a very good question. And all that I have to offer as an answer is to share what we do in medicine when we ask a similar question: “How do I know that treatment A is better than treatment B?” It is the essence of medical research; the quest to find better treatments that deliver better outcomes and at lower cost. The similarities are strong.

<Phil> OK. How do you do that? How do you know that “Treatment A is better than Treatment B” in a way that anyone will trust the answer?

<Pete> We use a science that is actually very recent on the scientific timeline; it was only firmly established in the first half of the 20th century. One reason for that is that it is rather a counter-intuitive science, and so it requires using tools that have been designed and demonstrated to work but whose inner workings most of us do not really understand. They are a bit like magic black boxes.

<Phil> H’mm. Please forgive me for sounding skeptical but that sounds like a big opportunity for making mistakes! If there are lots of these “magic black box” tools then how do you decide which one to use and how do you know you have used it correctly?

<Pete> Those are good questions! Very often we don’t know and in our collective confusion we generate a lot of unproductive discussion. This is why we are often forced to accept the advice of experts but, I confess, very often we don’t understand what they are saying either! They seem like the medieval Magi.

<Phil> H’mm. So these experts are like ‘magicians’ – they claim to understand the inner workings of the black magic boxes but are unable, or unwilling, to explain in a language that a ‘muggle’ would understand?

<Pete> Very well put. That is just how it feels.

<Phil> So can you explain what you do understand about this magical process? That would be a start.

<Pete> OK, I will do my best. The first thing we learn in medical research is that we need to be clear about what it is we are looking to improve, and we need to be able to measure it objectively and accurately.

<Phil> That makes sense. Let us say we want to improve the patient’s subjective quality of the A&E experience and objectively we want to reduce the time they spend in A&E. We measure how long they wait.

<Pete> The next thing is that we need to decide how much improvement we need. What would be worthwhile? So in the example you have offered we know that reducing the average time patients spend in A&E by just 30 minutes would have a significant effect on the quality of the patient and staff experience, and as a by-product it would also dramatically improve the 4-hour target performance.

<Phil> OK. From the commissioning perspective there are lots of things we can do, such as commissioning alternative paths for specific groups of patients; in effect diverting some of the unscheduled demand away from A&E to a more appropriate service provider. But these are the sorts of thing we have been experimenting with for years, and it brings us back to the question: How do we know that any change we implement has had the impact we intended? The system seems, well, complicated.

<Pete> In medical research we are very aware that the system we are changing is very complicated and that we do not have the power of omniscience. We cannot know everything. Realistically, all we can do is to focus on objective outcomes and collect small samples of the data ocean and use those in an attempt to draw conclusions we can trust. We have to design our experiment with care!

<Phil> That makes sense. Surely we just need to measure the stuff that will tell us if our impact matches our intent. That sounds easy enough. What’s the problem?

<Pete> The problem we encounter is that when we measure “stuff” we observe patient-to-patient variation, and that is before we have made any changes. Any impact that we may have is obscured by this “noise”.

<Phil> Ah, I see. So if our intervention generates a small impact then it will be more difficult to see amidst this background noise. Like trying to see fine detail in a fuzzy picture.

<Pete> Yes, exactly like that. And it raises the issue of “errors”. In medical research we talk about two different types of error; we make the first type of error when our actual impact is zero but we conclude from our data that we have made a difference; and we make the second type of error when we have made an impact but we conclude from our data that we have not.

<Phil> OK. So does that imply that the more “noise” we observe in our measure-for-improvement before we make the change, the more likely we are to make one or other error?

<Pete> Precisely! So before we do the experiment we need to design it so that we reduce the probability of making both of these errors to an acceptably low level. So that we can be assured that any conclusion we draw can be trusted.

<Phil> OK. So how exactly do you do that?

<Pete> We know that whenever there is “noise” and whenever we use samples then there will always be some risk of making one or other of the two types of error. So we need to set a threshold for both. We have to state clearly how much confidence we need in our conclusion. For example, we often use the convention that we are willing to accept a 1 in 20 chance of making the Type I error.

<Phil> Let me check if I have heard you correctly. Suppose that, in reality, our change has no impact and we have set the risk threshold for a Type I error at 1 in 20, and suppose we repeat the same experiment 100 times – are you saying that we should expect about five of our experiments to show data that says our change has had the intended impact when in reality it has not?

<Pete> Yes. That is exactly it.
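Phil's thought experiment can be tried out directly. The Python sketch below is only an illustration under stated assumptions: the A&E times are drawn from a made-up normal distribution (mean 180 minutes, spread 60 minutes), the change has no real effect, and a simple large-sample z-test stands in for whatever test a statistician would actually choose. The function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def two_sample_p_value(before, after):
    """Two-sided p-value for a difference in means (large-sample z-test)."""
    se = (stdev(before) ** 2 / len(before) + stdev(after) ** 2 / len(after)) ** 0.5
    z = (mean(after) - mean(before)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_false_positives(n_experiments=100, n_patients=100, alpha=0.05, seed=1):
    """Repeat a 'no real change' experiment and count the false alarms."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_experiments):
        # Assumption: before and after come from the SAME distribution,
        # so any "significant" difference is a Type I error.
        before = [rng.gauss(180, 60) for _ in range(n_patients)]
        after = [rng.gauss(180, 60) for _ in range(n_patients)]
        if two_sample_p_value(before, after) < alpha:
            false_positives += 1
    return false_positives


if __name__ == "__main__":
    hits = count_false_positives()
    print(f"{hits} of 100 experiments wrongly signalled an improvement")
```

Changing the seed changes the count from run to run, but over many repeats it should settle close to the 1 in 20 that the threshold allows.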

<Phil> OK. But in practice we cannot repeat the experiment 100 times, so we just have to accept the 1 in 20 chance that we will make a Type I error, and we won’t know we have made it if we do. That feels a bit chancy. So why don’t we just set the threshold to 1 in 100 or 1 in 1000?

<Pete> We could, but doing that has a consequence. If we reduce the risk of making a Type I error by setting our threshold lower, then we will increase the risk of making a Type II error.

<Phil> Ah! I see. The old swings-and-roundabouts problem. By the way, do these two errors have different names that would make it easier to remember and to explain?

<Pete> Yes. The Type I error is called a False Positive. It is like concluding that a patient has a specific diagnosis when in reality they do not.

<Phil> And the Type II error is called a False Negative?

<Pete> Yes. And we want to avoid both of them, and to do that we have to specify a separate risk threshold for each error. The convention is to call the threshold for the false positive the alpha level, and the threshold for the false negative the beta level.
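The swings-and-roundabouts between the alpha and beta levels can be made concrete with a little arithmetic. This is a sketch only, assuming a two-group comparison of averages with a known spread and the usual normal approximation; the function name and the figures in the usage example (a 30-minute improvement, a 90-minute spread, 100 patients per group) are hypothetical.

```python
from statistics import NormalDist


def false_negative_risk(alpha, n_per_group, delta, sigma):
    """Approximate beta for a two-sided, two-group comparison of means.

    alpha        risk of a false positive we are willing to accept
    n_per_group  number of measurements before and after
    delta        size of the real improvement we want to detect
    sigma        spread (standard deviation) of the measurements
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)         # how extreme the data must look
    se = sigma * (2 / n_per_group) ** 0.5     # noise in the difference of averages
    power = z.cdf(delta / se - z_crit)        # chance of detecting the real change
    return 1 - power


if __name__ == "__main__":
    for alpha in (0.05, 0.01, 0.001):
        beta = false_negative_risk(alpha, n_per_group=100, delta=30, sigma=90)
        print(f"alpha = {alpha}: beta is roughly {beta:.2f}")
```

With the amount of data held fixed, tightening alpha pushes beta up; under these assumptions the only way to bring both down together is to collect more data.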

<Phil> OK. So now we have three things we need to be clear on before we can do our experiment: the size of the change that we need, the risk of the false positive that we are willing to accept, and the risk of a false negative that we are willing to accept. Is that all we need?

<Pete> In medical research we learn that we need six pieces of the experimental design jigsaw before we can proceed. We only have three pieces so far.

<Phil> What are the other three pieces then?

<Pete> We need to know the average value of the metric we are intending to improve, because that is our baseline from which improvement is measured. Improvements are often framed as a percentage improvement over the baseline. And we need to know the spread of the data around that average, the “noise” that we referred to earlier.

<Phil> Ah, yes! I forgot about the noise. But that is only five pieces of the jigsaw. What is the last piece?

<Pete> The size of the sample.

<Phil> Eh? Can’t we just go with whatever data we can realistically get?

<Pete> Sadly, no. The size of the sample is how we control the risk of a false negative error. The more data we have the lower the risk. The chance of avoiding a false negative is referred to as the power of the experimental design.

<Phil> OK. That feels familiar. I know that the more experience I have of something the better my judgement gets. Is this the same thing?

<Pete> Yes. Exactly the same thing.

<Phil> OK. So let me see if I have got this. To know if the impact of the intervention matches our intention we need to design our experiment carefully. We need all six pieces of the experimental design jigsaw and they must all fall inside our circle of control. We can measure the baseline average and spread; we can specify the impact we will accept as useful; we can specify the risks we are prepared to accept of making the false positive and false negative errors; and we can collect the required amount of data after we have made the intervention so that we can trust our conclusion.

<Pete> Perfect! That is how we are taught to design research studies so that we can trust our results, and so that others can trust them too.

<Phil> So how do we decide how big the post-implementation data sample needs to be? I can see we need to collect enough data to avoid a false negative but we have to be pragmatic too. There would appear to be little value in collecting more data than we need. It would cost more and could delay knowing the answer to our question.

<Pete> That is precisely the trap that many inexperienced medical researchers fall into. They set their sample size according to what is achievable and affordable, and then they hope for the best!

<Phil> Well, we do the same. We analyse the data we have and we hope for the best. In the magical metaphor we are asking our data analysts to pull a white rabbit out of the hat. It sounds rather irrational and unpredictable when described like that! Have medical researchers learned a way to avoid this trap?

<Pete> Yes, it is a tool called a power calculator.

<Phil> Ooooo … a power tool … I like the sound of that … that would be a cool tool to have in our commissioning bag of tricks. It would be like a magic wand. Do you have such a thing?

<Pete> Yes.

<Phil> And do you understand how the power tool magic works well enough to explain to a “muggle”?

<Pete> Not really. To do that means learning some rather unfamiliar language and some rather counter-intuitive concepts.

<Phil> Is that the magical stuff I hear lurks between the covers of a medical statistics textbook?

<Pete> Yes. Scary looking mathematical symbols and unfathomable spells!

<Phil> Oh dear! Is there another way to gain a working understanding of this magic? Something a bit more pragmatic? A path that a ‘statistical muggle’ might be able to follow?

<Pete> Yes. It is called a simulator.

<Phil> You mean like a flight simulator that pilots use to learn how to control a jumbo jet before ever taking a real one out for a trip?

<Pete> Exactly like that.

<Phil> Do you have one?

<Pete> Yes. It was how I learned about this “stuff” … pragmatically.

<Phil> Can you show me?

<Pete> Of course. But to do that we will need a bit more time, another coffee, and maybe a couple of those tasty looking Danish pastries.

<Phil> A wise investment I’d say. I’ll get the coffee and pastries, if you fire up the engines of the simulator.
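
As a taster of what such a simulator might do, here is a minimal sketch in Python. It is not the simulator referred to in the conversation. It assumes normally distributed data, a known baseline average and spread, and a simple one-sided z-test; the function name estimate_power and all the numbers are invented for illustration.

```python
import random
from statistics import NormalDist, mean


def estimate_power(baseline_mean, spread, useful_shift, sample_size,
                   false_positive_risk=0.05, runs=2000, seed=1):
    """Estimate, by simulation, the chance of detecting a real shift.

    Illustrative assumptions: normally distributed data, known baseline
    mean and spread, and a one-sided z-test on the sample mean.
    """
    rng = random.Random(seed)
    # Threshold set by the false positive risk we are prepared to accept.
    z_crit = NormalDist().inv_cdf(1 - false_positive_risk)
    std_error = spread / sample_size ** 0.5
    detected = 0
    for _ in range(runs):
        # Simulate post-intervention data in which the shift really happened.
        sample = [rng.gauss(baseline_mean + useful_shift, spread)
                  for _ in range(sample_size)]
        z = (mean(sample) - baseline_mean) / std_error
        if z > z_crit:
            detected += 1
    # Power is one minus the false negative risk.
    return detected / runs


if __name__ == "__main__":
    # Made-up example: baseline 100, spread 20, useful improvement 10.
    for n in (5, 10, 20, 40, 80):
        print(n, estimate_power(100, 20, 10, n))
```

Running it for a range of sample sizes shows the estimated power climbing as the sample grows, which is the sixth piece of the jigsaw in action: we pick the smallest sample that gives a false negative risk we can live with.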

26/11/2016

Pride and Joy

Have you heard the phrase “Pride comes before a fall”?

What does this mean? That the feeling of pride is the reason for the subsequent fall?

So by following that causal logic, if we do not allow ourselves to feel proud then we can avoid the fall?

And none of us like the feeling of falling and failing. We are fearful of that negative feeling, so with this simple trick we can avoid feeling bad. Yes?

But we all know the positive feeling of achievement – we feel pride when we have done good work, when our impact matches our intent. Pride in our work.

Is that bad too?

Should we accept under-achievement and unexceptional mediocrity as the inevitable cost of avoiding the pain of possible failure? Is that what we are being told to do here?

The phrase comes from the Bible, from the Book of Proverbs 16:18 to be precise.

And the problem here is that the phrase “pride comes before a fall” is not the whole proverb.

It has been simplified. Some bits have been omitted. And those omissions lead to ambiguity and the opportunity for obfuscation and re-interpretation.

In the fuller New International Version we see a missing bit … the “haughty spirit” bit. That is another way of saying “over-confident” or “arrogant”.

But even this “authorised” version is still ambiguous and more questions spring to mind:

Q1. What sort of pride are we referring to? Just the confidence version? What about the pride that follows achievement?

Q2. How would we know if our feeling of confidence is actually justified?

Q3. Does a feeling of confidence always precede a fall? Is that how we diagnose over-confidence? Retrospectively? Are there instances when we feel confident but we do not fail? Are there instances when we do not feel confident and then fail?

Q4. Does confidence cause the fall or is it just a temporal association? Is there something more fundamental that causes both high-confidence and low-competence?

There is a well known model called the Conscious-Competence model of learning which describes a sequence of four stages in acquiring a new skill, such as one we need to achieve our intended outcomes.

We all start in the “blissful ignorance” zone of unconscious incompetence. Our unknowns are unknown to us. They are blind spots. So we feel unjustifiably confident.

In this model the first barrier to progress is “wrong intuition” which means that we actually have unconscious assumptions that are distorting our perception of reality.

What we perceive makes sense to us. It is clear and obvious. We feel confident. We believe our own rhetoric.

But our unconscious assumptions can trick us into interpreting information incorrectly. And if we derive decisions from unverified assumptions and invalid analysis then we may do the wrong thing and not achieve our intended outcome. We may unintentionally cause ourselves to fail and not be aware of it. But we are proud and confident.

Then the gap between our intent and our impact becomes visible to all and painful to us. So we are tempted to avoid the social pain of public failure by retreating behind the “Yes, But” smokescreen of defensive reasoning. The “doom loop” as it is sometimes called. The Victim Vortex. “Don’t name, shame and blame me, I was doing my best. I did not intend that to happen. To err is human”.

The good news is that this learning model also signposts a possible way out; a door in the black curtain of ignorance. It suggests that we can learn how to correct our analysis by using feedback from reality to verify our rhetorical assumptions. Those assumptions which pass the “reality check” we keep, those which fail the “reality check” we redesign and retest until they pass. Bit by bit our inner rhetoric comes to more closely match reality and the wisdom of our decisions will improve.

And what we then see is improvement. Our impact moves closer towards our intent. And we can justifiably feel proud of that achievement. We do not need to be best-compared-with-the-rest; just being better-than-we-were-before is OK. That is learning.

And this is how it feels … this is the Learning Curve … or the Nerve Curve as we call it.

What it says is that to be able to assess confidence we must also measure competence. Outcomes. Impact.

And to achieve excellence we have to be prepared to actively look for any gap between intent and impact. And we have to be prepared to see it as an opportunity rather than as a threat. And we will need to be able to seek feedback and other people’s perspectives. And we need to be open to asking for examples and explanations from those who have demonstrated competence.

It says that confidence is not a trustworthy surrogate for competence.

It says that we want the confidence that flows from competence because that is the foundation of trust.

Improvement flows at the speed of trust and seeing competence, confidence and trust growing is a joyous thing.

Pride and Joy are OK.

Arrogance and incompetence comes before a fall would be a better proverb.

12/11/2016

Defensive Reasoning

About 25 years ago a paper was published in the Harvard Business Review with the interesting title of “Teaching Smart People How To Learn”.

The uncomfortable message was that many people who are top of the intellectual rankings are actually very poor learners.

This sounds like a paradox. How can people be high-achievers and yet be unable to learn?

Health care systems are stuffed full of super-smart, high-achieving professionals. The cream of the educational crop. The top 2%. They are called “doctors”.

And we have a problem with improvement in health care … a big problem … the safety, delivery, quality and affordability of the NHS is getting worse. Not better.

Improvement implies change and change implies learning, so if smart people struggle to learn then could that explain why health care systems find self-improvement so difficult?

This paragraph from the 1991 HBR paper feels uncomfortably familiar:

The author, Chris Argyris, refers to something called “single-loop learning” and if we translate this management-speak into the language of medicine it would come out as “treating the symptom and ignoring the disease”. That is poor medicine.

Chris also suggests an antidote to this problem and gave it the label “double-loop learning” which if translated into medical-speak becomes “diagnosis”. And that is something that doctors can relate to because without a diagnosis, a justifiable treatment is difficult to formulate.

We need to diagnose the root cause(s) of the NHS disease.

The 1991 HBR paper refers back to an earlier 1977 HBR paper called Double Loop Learning in Organisations where we find the theory that underpins it.

The proposed hypothesis is that we all have cognitive models that we use to decide our actions (and in-actions), what I have referred to before as ChimpWare. In it is a reference to a table published in a 1974 book and the message is that Single-Loop learning is a manifestation of a Model 1 theory-in-action.

And if we consider the task that doctors are expected to do then we can empathize with their dominant Model 1 approach. Health care is a dangerous business. Doctors can cause a lot of unintentional harm – both physical and psychological. Doctors are dealing with a very, very complex system – a human body – that they only partially understand. No two patients are exactly the same and illness is a dynamic process. Everyone’s expectations are high. We have come a long way since the days of blood-letting and leeches! Failure is not tolerated.

Doctors are intelligent and competitive … they had to be to win the education race.

Doctors must make tough decisions and have to have tough conversations … many, many times … and yet not be consumed in the process. They often have to suppress emotions to be effective.

Doctors feel the need to protect patients from harm – both physical and emotional.

And collectively they do a very good job. Doctors are respected and trusted professionals.

But … to quote Chris Argyris …

“Model I blinds people to their weaknesses. For instance, the six corporate presidents were unable to realize how incapable they were of questioning their assumptions and breaking through to fresh understanding. They were under the illusion that they could learn, when in reality they just kept running around the same track.”

This blindness is self-reinforcing because …

“All parties withheld information that was potentially threatening to themselves or to others, and the act of cover-up itself was closed to discussion.”

How many times have we seen this in the NHS?

The Mid-Staffordshire Hospital debacle that led to the Francis Report is all the evidence we need.

So what is the way out of this double-bind?

Chris gives us some hints with his Model II theory-in-use.

Valid information – Study.
Free and informed choice – Plan.
Constant monitoring of the implementation – Do.

The skill required is to question assumptions and break through to fresh understanding and we can do that with a design-led approach because that is what designers do.

They bring their unconscious assumptions up to awareness and ask “Is that valid?” and “What if” questions.

It is called Improvement-by-Design.

And the good news is that this Model II approach works in health care, and we know that because the evidence is accumulating.

05/11/2016

Value, Verify and Validate

Many of the challenges that we face in delivering effective and affordable health care do not have well understood and generally accepted solutions.

If they did there would be no discussion or debate about what to do and the results would speak for themselves.

This lack of understanding is leading us to try to solve a complicated system design challenge in our heads. Intuitively.

And trying to do it this way is fraught with frustration and risk because our intuition tricks us. It was this sort of challenge that led Professor Rubik to invent his famous 3D Magic Cube puzzle.

It is difficult enough to learn how to solve the Magic Cube puzzle by trial and error; it is even more difficult to attempt to do it inside our heads! Intuitively.

And we know the Rubik Cube puzzle is solvable, so all we need are some techniques, tools and training to improve our Rubik Cube solving capability. We can all learn how to do it.

Returning to the challenge of safe and affordable health care, and to the specific problem of unscheduled care, A&E targets, delayed transfers of care (DTOC), finance, fragmentation and chronic frustration.

This is a systems engineering challenge so we need some systems engineering techniques, tools and training before attempting it. Not after failing repeatedly.

One technique that a systems engineer will use is called a Vee Diagram such as the one shown above. It shows the sequence of steps in the generic problem solving process and it has the same sequence that we use in medicine for solving problems that patients present to us …

Diagnose, Design and Deliver

which is also known as …

Study, Plan, Do.

Notice that there are three words in the diagram that start with the letter V … value, verify and validate. These are probably the three most important words in the vocabulary of a systems engineer.

One tool that a systems engineer always uses is a model of the system under consideration.

Models come in many forms from conceptual to physical and are used in two main ways:

To assist the understanding of the past (diagnosis)
To predict the behaviour in the future (prognosis)

And the process of creating a system model, the sequence of steps, is shown in the Vee Diagram. The systems engineer’s objective is a validated model that can be trusted to make good-enough predictions; ones that support making wiser decisions of which design options to implement, and which not to.

So if a systems engineer presented us with a conceptual model that is intended to assist our understanding, then we will require some evidence that all stages of the Vee Diagram process have been completed. Evidence that provides assurance that the model predictions can be trusted. And the scope over which they can be trusted.

Last month a report was published by the Nuffield Trust that is entitled “Understanding patient flow in hospitals” and it asserts that traffic flow on a motorway is a valid conceptual model of patient flow through a hospital. Here is a direct quote from the second paragraph in the Executive Summary:

Unfortunately, no evidence is provided in the report to support the validity of the statement and that omission should ring an alarm bell.

The observation that “the hospitals with the least free space struggle the most” is not a validation of the conceptual model. Validation requires a concrete experiment.

To illustrate why observation is not validation let us consider a scenario where I have a headache and I take a paracetamol and my headache goes away. I now have some evidence that shows a temporal association between what I did (take paracetamol) and what I got (a reduction in head pain).

But this is not a valid experiment because I have not considered the other seven possible combinations of headache before (Y/N), paracetamol (Y/N) and headache after (Y/N).
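
To make the eight combinations concrete, here is a small Python sketch that lists them all and marks the single one I actually observed. The labels are my own shorthand and the snippet is only an illustration of the counting.

```python
from itertools import product

# Three yes/no factors give 2 x 2 x 2 = 8 possible combinations.
factors = ("headache before", "paracetamol taken", "headache after")
combinations = list(product(("Y", "N"), repeat=len(factors)))

# The single anecdote: headache, took paracetamol, headache went away.
observed = ("Y", "Y", "N")
others = [combo for combo in combinations if combo != observed]

for combo in combinations:
    marker = "<- observed" if combo == observed else ""
    print(dict(zip(factors, combo)), marker)

print(len(others), "combinations not yet considered")
```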

An association cannot be used to prove causation; not even a temporal association.

When I do not understand the cause, and I am without evidence from a well-designed experiment, then I might be tempted to intuitively jump to the (invalid) conclusion that “headaches are caused by lack of paracetamol!” and if untested this invalid judgement may persist and even become a belief.

Understanding causality requires an approach called counterfactual analysis; otherwise known as “What if?” And we can start that process with a thought experiment using our rhetorical model. But we must remember that we must always validate the outcome with a real experiment. That is how good science works.

A famous thought experiment was conducted by Albert Einstein when he asked the question “If I were sitting on a light beam and moving at the speed of light what would I see?” This question led him to the Theory of Relativity which completely changed the way we now think about space and time. Einstein’s model has been repeatedly validated by careful experiment, and has allowed engineers to design and deliver valuable tools such as the Global Positioning System which uses relativity theory to achieve high positional precision and accuracy.

So let us conduct a thought experiment to explore the ‘faster movement requires more space‘ statement in the case of patient flow in a hospital.

First, we need to define what we mean by the words we are using.

The phrase ‘faster movement’ is ambiguous. Does it mean higher flow (more patients per day being admitted and discharged) or does it mean shorter length of stay (the interval between the admission and discharge events for individual patients)?

The phrase ‘more space’ is also ambiguous. In a hospital that implies physical space i.e. floor-space that may be occupied by corridors, chairs, cubicles, trolleys, and beds. So are we actually referring to flow-space or storage-space?

What we have in this over-simplified statement is the conflation of two concepts: flow-capacity and space-capacity. They are different things. They have different units. And the result of conflating them is meaningless and confusing.

However, our stated goal is to improve understanding so let us consider one combination, and let us be careful to be more precise with our terminology: “higher flow always requires more beds”. Does it? Can we disprove this assertion with an example where higher flow required fewer beds (i.e. space-capacity)?

The relationship between flow and space-capacity is well understood.

The starting point is Little’s Law which was proven mathematically in 1961 by J.D.C. Little and it states:

Average work in progress = Average lead time X Average flow.

In the hospital context, work in progress is the number of occupied beds, lead time is the length of stay and flow is admissions or discharges per time interval (which must be the same on average over a long period of time).

(NB. Engineers are rather pedantic about units so let us check that this makes sense: the unit of WIP is ‘patients’, the unit of lead time is ‘days’, and the unit of flow is ‘patients per day’ so ‘patients’ = ‘days’ * ‘patients / day’. Correct. Verified. Tick.)

So, is there a situation where flow can increase and WIP can decrease? Yes. When lead time decreases. Little’s Law says that is possible. We have disproved the assertion.

Let us take the other interpretation of higher flow as shorter length of stay: i.e. shorter length of stay always requires more beds. Is this correct? No. If flow remains the same then Little’s Law states that we will require fewer beds. This assertion is disproved as well.
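
The two counter-examples can be checked with a few lines of arithmetic. This is a sketch using made-up numbers (20 patients per day, 5 days average length of stay), and the function name occupied_beds is just an illustrative label.

```python
def occupied_beds(length_of_stay_days, flow_patients_per_day):
    """Little's Law: average WIP = average lead time x average flow."""
    return length_of_stay_days * flow_patients_per_day


# Made-up baseline: 20 patients per day staying 5 days on average.
before = occupied_beds(5.0, 20.0)

# Higher flow (22 per day) with a shorter stay (4 days) needs fewer beds.
higher_flow_shorter_stay = occupied_beds(4.0, 22.0)

# Same flow (20 per day) with a shorter stay (4 days) also needs fewer beds.
same_flow_shorter_stay = occupied_beds(4.0, 20.0)

print(before, higher_flow_shorter_stay, same_flow_shorter_stay)
```

With these numbers the average occupied beds fall from 100 to 88 in the first case and to 80 in the second, so one example is enough to disprove each “always requires more beds” assertion.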

And we need to remember that Little’s Law is proven to be valid for averages, does that shed any light on the source of our confusion? Could the assertion about flow and beds actually be about the variation in flow over time and not about the average flow?

And this is also well understood. The original work on it was done almost exactly 100 years ago by Agner Krarup Erlang and the problem he looked at was the quality of customer service of the early telephone exchanges. Specifically, how likely was the caller to get the “all lines are busy, please try later” response.

What Erlang showed was that there is a mathematical relationship between the number of calls being made (the demand), the probability of a call being connected first time (the service quality) and the number of telephone circuits and switchboard operators available (the service cost).

So it appears that we already have a validated mathematical model that links flow, quality and cost that we might use if we substitute ‘patients’ for ‘calls’, ‘beds’ for ‘telephone circuits’, and ‘being admitted’ for ‘being connected’.
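
One of Erlang’s results, usually called the Erlang B formula, gives the probability that an arriving call finds every circuit busy. Here is a minimal Python sketch of it applied to beds. It assumes random (Poisson) arrivals and that a patient who finds no free bed is turned away rather than queued, which is a big simplification of a real hospital, and the bed numbers are invented for illustration.

```python
def erlang_b(offered_load, servers):
    """Probability that an arrival finds all servers busy (Erlang B).

    offered_load is the arrival rate multiplied by the average holding
    time, e.g. admissions per day x average length of stay in days.
    """
    blocking = 1.0  # with zero servers every arrival is blocked
    for n in range(1, servers + 1):
        # Standard recurrence: add one server at a time.
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


if __name__ == "__main__":
    # Made-up example: 20 admissions per day x 5 days = 100 beds on average.
    load = 20 * 5
    for beds in (100, 110, 120, 130):
        print(beds, round(erlang_b(load, beds), 4))
```

The point to notice is that a bed pool equal to the average demand still turns a sizeable fraction of arrivals away, and that the risk falls quickly as a modest buffer is added. It is the variation, not the average, that sets the buffer.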

And this topic of patient flow, A&E performance and Erlang queues has been explored already … here.

So a telephone exchange is a more valid model of a hospital than a motorway.

We are now making progress in deepening our understanding.

The use of an invalid, untested, conceptual model is sloppy systems engineering.

So if the engineering is sloppy we would be unwise to fully trust the conclusions.

And I share this feedback in the spirit of black box thinking because I believe that there are some valuable lessons to be learned here – by us all.

08/10/2016

Outliers

An effective way to improve is to learn from others who have demonstrated the capability to achieve what we seek. To learn from success.

Another effective way to improve is to learn from those who are not succeeding … to learn from failures … and that means … to learn from our own failings.

But from an early age we are socially programmed with a fear of failure.

The training starts at school where failure is not tolerated, nor is challenging the given dogma. Paradoxically, the effect of our fear of failure is that our ability to inquire, experiment, learn, adapt, and to be resilient to change is severely impaired!

So further failure in the future becomes more likely, not less likely. Oops!

Fortunately, we can develop a healthier attitude to failure and we can learn how to harness the gap between intent and impact as a source of energy, creativity, innovation, experimentation, learning, improvement and growing success.

And health care provides us with ample opportunities to explore this unfamiliar terrain. The creative domain of the designer and engineer.

The scatter plot below is a snapshot of the A&E 4 hr target yield for all NHS Trusts in England for the month of July 2016. The required “constitutional” performance requirement is better than 95%. The delivered whole system average is 85%. The majority of Trusts are failing, and the Trust-to-Trust variation is rather wide. Oops!

This stark picture of the gap between intent (95%) and impact (85%) prompts some uncomfortable questions:

Q1: How can one Trust achieve 98% and yet another can do no better than 64%?

Q2: What can all Trusts learn from these high and low flying outliers?

[NB. I have not asked the question “Who should we blame for the failures?” because the name-shame-blame-game is also a predictable consequence of our fear-of-failure mindset.]

Let us dig a bit deeper into the information mine, and as we do that we need to be aware of a trap:

A snapshot-in-time tells us very little about how the system and the set of interconnected parts is behaving-over-time.

We need to examine the time-series charts of the outliers, just as we would ask for the temperature, blood pressure and heart rate charts of our patients.

Here are the last six years by month A&E 4 hr charts for a sample of the high-fliers. They are all slightly different and we get the impression that the lower two are struggling more to stay aloft than the upper two … especially in winter.

And here are the last six years by month A&E 4 hr charts for a sample of the low-fliers. The Mark I Eyeball Test results are clear … these swans are falling out of the sky!

So we need to generate some testable hypotheses to explain these visible differences, and then we need to examine the available evidence to test them.

One hypothesis is “rising demand”. It says that “the reason our A&E is failing is because demand on A&E is rising”.

Another hypothesis is “slow flow”. It says that “the reason our A&E is failing is because of the slow flow through the hospital because of delayed transfers of care (DTOCs)”.

So, if these hypotheses account for the behaviour we are observing then we would predict that the “high fliers” are (a) diverting A&E arrivals elsewhere, and (b) reducing admissions to free up beds to hold the DTOCs.

Let us look at the freely available data for the highest flyer … the green dot on the scatter gram … code-named “RC9”.

The top chart is the A&E arrivals per month.

The middle chart is the A&E 4 hr target yield per month.

The bottom chart is the emergency admissions per month.

Both arrivals and admissions are increasing, while the A&E 4 hr target yield is rock steady!

And arranging the charts this way allows us to see the temporal patterns more easily (and the images are deliberately arranged to show the overall pattern-over-time).

Patterns like the change-for-the-better that appears in the middle of the winter of 2013 (i.e. when many other trusts were complaining that their sagging A&E performance was caused by “winter pressures”).

The objective evidence seems to disprove the “rising demand”, “slow flow” and “winter pressure” hypotheses!

So what can we learn from our failure to adequately explain the reality we are seeing?

The trust code-named “RC9” is Luton and Dunstable, and it is an average district general hospital, on the surface. So to reveal some clues about what actually happened there, we need to read their Annual Report for 2013-14. It is a public document and it can be downloaded here.

This is just a snippet …

… and there are lots more knowledge nuggets like this in there …

… it is a treasure trove of well-known examples of good system flow design.

The results speak for themselves!

Q: How many black swans does it take to disprove the hypothesis that “all swans are white”?

A: Just one.

“RC9” is a black swan. An outlier. A positive deviant. “RC9” has disproved the “impossibility” hypothesis.

And there is another flock of black swans living in the North East … in the Newcastle area … so the “Big cities are different” hypothesis does not hold water either.

The challenge here is a human one. A human factor. Our learned fear of failure.

Learning-how-to-fail is the way to avoid failing-how-to-learn.

And to read more about that radical idea I strongly recommend reading the recently published book called Black Box Thinking by Matthew Syed.

It starts with a powerful story about the impact of human factors in health care … and here is a short video of Martin Bromiley describing what happened.

Download File: http://www.improvementscience.co.uk/blog/wp-content/uploads/2016/10/Martin_Bromiley_Human_Factors.mp4?_=1

The “black box” that both Martin and Matthew refer to is the one that is used in air accident investigations to learn from what happened, and to use that learning to design safer aviation systems.

Martin Bromiley has founded a charity to support the promotion of human factors in clinical training, the Clinical Human Factors Group.

So if we can muster the courage and humility to learn how to do this in health care for patient safety, then we can also learn how to do it for flow, quality and productivity.

Our black swan called “RC9” has demonstrated that this goal is attainable.

And the body of knowledge needed to do this already exists … it is called Health and Social Care Systems Engineering (HSCSE).

Postscript: And I am pleased to share that Luton & Dunstable features in the House of Commons Health Committee report entitled Winter Pressures in A&E Departments that was published on 3rd Nov 2016.

Here is part of what L&D shared to explain their deviant performance:

These points describe rather well the essential elements of a pull design, which is the antidote to the rather more prevalent pressure cooker design.

01/10/2016

The Cream of the Crap Trap

It has been a busy week.

And a common theme has cropped up which I have attempted to capture in the diagram below.

It relates to how the NHS measures itself and how it “drives” improvement.

The measures are called “failure metrics” – mortality, infections, pressure sores, waiting time breaches, falls, complaints, budget overspends. The list is long.

The data for a specific trust are compared with an arbitrary minimum acceptable standard to decide where the organisation is on the Red-Amber-Green scale.

If we are in the red zone on the RAG chart … we get a kick. If not we don’t.

The fear of being bullied and beaten raises the emotional temperature and the internal pressure … which drives movement to get away from the pain. A nematode worm will behave this way. They are not stupid either.

As we approach the target line our RAG indicator turns “amber” … this is the “not statistically significant zone” … and now the stick is being waggled, ready in case the light goes red again.

So we muster our reserves of emotional energy and we PUSH until our RAG chart light goes green … but then we have to hold it there … which is exhausting. One pain is replaced by another.

The next step is for the population of NHS nematodes to be compared with each other … they must be “bench-marked”, and some are doing better than others … as we might expect. We have done our “sadistics” training courses.

The bottom 5% or 10% line is used to set the “arbitrary minimum standard target” … and the top 10% are feted at national award ceremonies … and feast on the envy of the other 90 or 95% of “losers”.

The Cream of the Crop now have a big tick in their mission statement objectives box “To be in the Top 10% of Trusts in the UK”. Hip hip huzzah.

And what has this system design actually achieved? The Cream of the Crap.

Oops!

It is said that every system is perfectly designed to deliver what it delivers.

And a system that has been designed to only use failure and fear to push improvement can only ever achieve chronic mediocrity – either chaotic mediocrity or complacent mediocrity.

So, if we want to tap into the vast zone of unfulfilled potential, and if we want to escape the perpetual pain of the Cream of the Crap Trap … we need a better system design.

And maybe we might need a splash of humility and some system engineers to help us do that.

This week I met some at the Royal Academy of Engineering in London, and it felt like finding a candle of hope amidst the darkness of despair.

I said it had been a busy week!

03/07/2016

Early Warning System

The most useful tool that a busy operational manager can have is a reliable and responsive early warning system (EWS).

One that alerts when something is changing and that, if missed or ignored, will cause a big headache in the future.

Rather like the radar system on an aircraft that beeps if something else is approaching … like another aircraft or the ground!

Operational managers are responsible for delivering stuff on time. So they need a radar that tells them if they are going to deliver-on-time … or not.

And their on-time-delivery EWS needs to alert them soon enough that they have time to diagnose the ‘threat’, design effective plans to avoid it, decide which plan to use, and deliver it.

So what might an effective EWS for a busy operational manager look like?

It needs to be reliable. No missed threats or false alarms.
It needs to be visible. No tomes of text and tables of numbers.
It needs to be simple. Easy to learn and quick to use.

And what is on offer at the moment?

The RAG Chart
This is a table that is coloured red, amber and green. Red means ‘failing’, green means ‘not failing’ and amber means ‘not sure’. So this meets the specification of visible and simple, but is it reliable?

It appears not. RAG charts do not appear to have helped to solve the problem.

A RAG chart is generated using historic data … so it tells us where we are now, not how we got here, where we are going or what else is heading our way. It is a snapshot. One frame from the movie. Better than complete blindness perhaps, but not much.

The SPC Chart
This is a statistical process control chart and is a more complicated beast. It is a chart of how some measure of performance has changed over time in the past. So like the RAG chart it is generated using historic data. The advantage is that it is not just a snapshot of where we are now; it is a picture of the story of how we got to where we are, so it offers the promise of pointing to where we may be heading. It meets the specification of visible, and while more complicated than a RAG chart, it is relatively easy to learn and quick to use.

Here is an example. It is the SPC chart of the monthly A&E 4-hour target yield performance of an acute NHS Trust. The blue lines are the ‘required’ range (95% to 100%), the green line is the average and the red lines are a measure of variation over time. What this chart says is: “This hospital’s A&E 4-hour target yield performance is currently acceptable, has been so since April 2012, and is improving over time.”

So that is much more helpful than a RAG chart (which in this case would have been green every month because the average was above the minimum acceptable level).

So why haven’t SPC charts replaced RAG charts in every NHS Trust Board Report?

Could there be a fly-in-the-ointment?

The answer is “Yes” … there is.

SPC charts are a quality audit tool. They were designed nearly 100 years ago for monitoring the output quality of a process that is already delivering to specification (like the one above). They are designed to alert the operator to early signals of deterioration, called ‘assignable cause signals’, and they prompt the operator to pay closer attention and to investigate plausible causes.

SPC charts are not designed for predicting if there is a flow problem looming over the horizon. They are not designed for flow metrics that exhibit expected cyclical patterns. They are not designed for monitoring metrics that have very skewed distributions (such as length of stay). They are not designed for metrics where small shifts generate big cumulative effects. They are not designed for metrics that change more slowly than the frequency of measurement.

And these are exactly the sorts of metrics that a busy operational manager needs to monitor, in reality, and in real-time.

Demand and activity both show strong cyclical patterns.

Lead-times (e.g. length of stay) are often very skewed by variation in case-mix and task-priority.

Waiting lists are like bank accounts … they show the cumulative sum of the difference between inflow and outflow. That simple fact invalidates the use of the SPC chart.
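The bank account analogy can be sketched in a few lines of Python. This is only an illustration: the function name and the weekly figures are hypothetical and are not taken from any real waiting list. Each point on the resulting chart carries forward everything that happened before it, so successive points are not independent of each other, and independence is what a conventional SPC chart relies on.

```python
from itertools import accumulate


def waiting_list_sizes(start_size, inflow, outflow):
    """Return the list size at the end of each period.

    Each period the list changes by (inflow - outflow), so the size is the
    running (cumulative) sum of those differences added to the start size.
    """
    if len(inflow) != len(outflow):
        raise ValueError("inflow and outflow must cover the same periods")
    net_change = [i - o for i, o in zip(inflow, outflow)]
    return [start_size + total for total in accumulate(net_change)]


# Hypothetical weekly figures: demand is only 2 per week above activity.
weekly_referrals = [102] * 52
weekly_treatments = [100] * 52
sizes = waiting_list_sizes(500, weekly_referrals, weekly_treatments)
print(sizes[0], sizes[-1])  # first and last week of the year
```

A mismatch of two patients a week looks trivial on a chart of weekly demand or activity, but it adds up to more than a hundred extra patients on the list by the end of the year.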

Small shifts in demand, activity, income and expenditure can lead to big cumulative effects.

So if we abandon our RAG charts and we replace them with SPC charts … then we climb out of the RAG frying pan and fall into the SPC fire.

Oops! No wonder the operational managers and financial controllers have not embraced SPC.

So is there an alternative that works better? A more reliable EWS that busy operational managers and financial controllers can use?

Yes, there is, and here is a clue …

… but tread carefully …

… building one of these Flow-Productivity Early Warning Systems is not as obvious as it might first appear. There are counter-intuitive traps for the unwary and the untrained.

You may need the assistance of a health care systems engineer (HCSE).

11/06/2016

Resuscitate-Review-Repair

We form emotional attachments to places where we have lived and worked. And it catches our attention when we see them in the news.

So this headline caught my eye, because I was a surgical SHO in Portsmouth in the closing years of the Second Millennium. The good old days when we still did 1:2 on call rotas (i.e. up to 104 hours per week) and we were paid 70% LESS for the on call hours than the Mon-Fri 9-5 work. We also had stable ‘firms’, superhuman senior registrars, a canteen that served hot food and strong coffee around the clock, and doctors’ mess parties that were … well … messy! A lot has changed. And not all for the better.

Here is the link to the fuller story about the emergency failures.

And from it we get the impression that this is a recent problem. And with a bit of a smack and some name-shame-blame-game feedback from the CQC, then all will be restored to robust health. H’mm. I am not so sure that is the full story.

Here is the monthly aggregate A&E 4-hour target performance chart for Portsmouth from 2010 to date.

It says “this is not a new problem“.

It also says that the ‘patient’ has been deteriorating spasmodically over six years and is now critically-ill.

And giving a critically-ill hospital a “good telling off” is about as effective as telling a critically-ill patient to “pull themselves together“. Inept management.

In A&E a critically-ill patient requires competent resuscitation using a tried-and-tested process of ABC. Airway, Breathing, Circulation.

Also, the A&E 4-hour performance is only a symptom of the sickness in the whole urgent care system. It is the reading on an emotometer inserted into the A&E orifice of the acute hospital! Just one piece in a much bigger flow jigsaw.

It only tells us the degree of distress … not the diagnosis … nor the required treatment.

So what level of A&E health can we realistically expect to be able to achieve? What is possible in the current climate of austerity? Just how chilled-out can the A&E cucumber run?

This is the corresponding A&E emotometer chart for a different district general hospital somewhere else in NHS England.

Luton & Dunstable Hospital to be specific.

This A&E happiness chart looks a lot healthier and it seems to be getting even healthier over time too. So this is possible.

Yes, but … if our hospital deteriorates enough to be put on the ‘critical list’ then we need to call in an Emergency Care Intensive Support Team (ECIST) to resuscitate us.

A very good idea.

And how do their critically-ill patients fare?

Here is the chart of one of them. The significant improvement following the ‘resuscitation’ is impressive to be sure!

But, disappointingly, it was not sustained and the patient ‘crashed’ again. Perhaps they were just too poorly? Perhaps the first resuscitation call was sent out too late? But at least they tried their best.

An experienced clinician might comment: Those are indeed plausible explanations, but before we conclude that is the actual cause, can I check that we did not just treat the symptoms and miss the disease?

Q: So is it actually possible to resuscitate and repair a sick hospital? Is it possible to restore it to sustained health, by diagnosing and treating the cause, and not just the symptoms?

Here is the corresponding A&E emotometer chart of yet another hospital.

It shows the same pattern of deteriorating health. And it shows a dramatic improvement. It appears to have responded to some form of intervention.

And this time the significant improvement has been sustained. The patient did not crash-and-burn again.

So what has happened here that explains this different picture?

This hospital had enough insight and humility to seek the assistance of someone who knew what to do and who had a proven track record of doing it. Dr Kate Silvester to be specific. A dual-trained doctor and manufacturing systems engineer.

Dr Kate is now a health care systems engineer (HCSE), and an experienced ‘hospital doctor’.

Dr Kate helped them to learn how to diagnose the root causes of their A&E 4-hr fever, and then she showed them how to design an effective treatment plan.

They did the re-design; they tested it; and they delivered their new design. Because they owned it, they understood it, and they trusted their own diagnosis-and-design competence.

And the evidence of their impact matching their intent speaks for itself.

07/05/2016

The NHS Cockpit Dashboard

A few weeks ago I raised the undiscussable issue that the NHS feels like it is on a downward trajectory … and that what might be needed are some better engines … and to design, test, build and install them we will need some health care system engineers (HCSEs) … and that we do not appear to have enough of those. None in fact.

The feedback shows that many people resonated with this sentiment.

This week I had the opportunity to peek inside the NHS Cockpit and look at the Dashboard … and this is what I saw on the A&E Performance panel.

This is the monthly aggregate A&E 4-hour performance for England (red), Scotland (purple), Wales (brown) and Northern Ireland (grey) for the last six years.

The trajectory looked alarmingly obvious to me – the NHS is on a predictable path to destruction – a controlled flight into terrain (CFIT).

The repeating up-and-down pattern is the annual cycle of seasons; better in the summer and worse in the winter. This signal is driven by the celestial clock … the movement of the planets … which is beyond our power to influence.

The downward trajectory is the cumulative effect of our current design … which is the emergent effect of our collective beliefs, behaviours, policies and politics … which are completely within our gift to change.

If we chose to and if we knew how to – which we do not appear to.

Our collective ineptitude is not a topic for discussion. It is a taboo subject.

And I know that because if it were for discussion then this dashboard would be on public view on a website hosted by the NHS.

It isn’t.

It was created by George Donald, a member of the public, a disappointed patient, and a retired IT consultant. And it was shared, free for all to see and use via Twitter (@GMDonald).

The information source is open, public, shared NHS data, but it takes a lot of work to winkle it out and present it like this. So well done George … keep up the great work!

Now have a closer look at the Dashboard Display … look at the most recent data for England and Scotland. What do you see?

Does it look like Scotland is pulling out of the dive and England is heading down even faster?

Hard to say for sure; there are lots of signals and noise all mixed up.

So we need to use some Systems Engineering tools to help us separate the signals from the noise; and for this a statistical process control (SPC) chart is useless. We need a system behaviour chart (SBC) and its handy helper the deviation from aim (DFA) chart.

I will not bore you with the technical details but, suffice it to say, it is a tried-and-tested technique called the Method of Residuals.

Exhibit #1 is the DFA chart for Scotland. The middle 4 years (2011-2014) are used to create a ‘predictive model’; the model projection is then compared with measured performance; and the difference is plotted as the DFA chart.

What this “says” is that the 2015/16 performance in Scotland is significantly better than projected, and the change of direction seemed to start in the first half of 2015.

This evidence seems to support the results of our Mark I Eyeball test.
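For readers who would like to see the shape of the calculation, here is a minimal sketch of a deviation-from-aim calculation in Python. It is not the model behind the exhibits: it assumes the simplest plausible ‘predictive model’, a straight-line drift plus a repeating 12-month seasonal pattern fitted to a baseline window, and the function names are hypothetical.

```python
from statistics import mean


def fit_baseline(values, period=12):
    """Fit a 'straight-line drift plus repeating seasonal cycle' model.

    values: baseline measurements spanning whole cycles (at least two).
    Returns (slope, seasonal_levels), where seasonal_levels[k] is the level
    at position k in the cycle once the drift has been removed.
    """
    if len(values) < 2 * period or len(values) % period:
        raise ValueError("baseline must span whole cycles, at least two")
    # Differences between points one full cycle apart cancel the seasonal
    # pattern, leaving only the drift accumulated over that cycle.
    slope = mean(values[t + period] - values[t]
                 for t in range(len(values) - period)) / period
    detrended = [v - slope * t for t, v in enumerate(values)]
    seasonal_levels = [mean(detrended[k::period]) for k in range(period)]
    return slope, seasonal_levels


def deviation_from_aim(values, baseline_start, baseline_end, period=12):
    """Return measured-minus-projected for every point in `values`."""
    slope, levels = fit_baseline(values[baseline_start:baseline_end], period)
    deviations = []
    for t, measured in enumerate(values):
        offset = t - baseline_start  # time relative to the baseline start
        projected = slope * offset + levels[offset % period]
        deviations.append(measured - projected)
    return deviations
```

With six years of monthly data and the middle four years as the baseline, the call would be `deviation_from_aim(series, 12, 60)`. Deviations that hover around zero say "behaving as projected"; a sustained run above or below zero is the signal to investigate.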

Exhibit #2 – the DFA for England suggests the 2015/16 performance is significantly worse than projected, and this deterioration appears to have started later in 2015.

Oh dear! I do not believe that was the intention, but it appears to be the impact.

So what are England and Scotland doing differently?
What can we all learn from this?
What can we all do differently in the future?

Isn’t that a question that more people like you, me and George could reasonably ask of those whom we entrust to design, build and fly our NHS?

Isn’t that a reasonable question that could be asked by the 65 million people in the UK who might, at any time, be unlucky enough to require a trip to their local A&E department?

So, let us all grasp the nettle and get the Elephant in the Room into plain view and say in unison “The Emperor Has No Clothes!”

We are suffering from mass ineptitude and hubris, to use Dr Atul Gawande’s language, and we need a better collective strategy.

And there is hope.

Some innovative hospitals have had the courage to grasp the nettle. They have seen what is coming; they have fully accepted the responsibility for their own fate; they have stepped up to the challenge; they have looked-listened-and-learned from others, and they are proving what is possible.

They have a name. They are called positive deviants.

Have a look at this short video … it is jaw-dropping … it is humbling … it is inspiring … and it is challenging … because it shows what has been achieved already.

It shows what is possible. Now, and here in the UK.

Luton and Dunstable

30/04/2016

What is Transformation?

It has been another interesting week. A bitter-sweet mixture of disappointment and delight. And the central theme has been ‘transformation’.

The source of disappointment was the newsreel images of picket lines of banner-waving junior doctors standing in the cold watching ambulances deliver emergencies to hospitals now run by consultants.

So what about the thousands of elective appointments and operations that were cancelled to release the consultants? If the NHS was failing elective delivery time targets before, it is going to be failing them even more now. And who will pay for the “waiting list initiatives” needed to just catch up? Depressing to watch.

The mercurial Roy Lilley summed up the general mood very well in his newsletter on Thursday, the day after the strike.

What he is saying is we do not have a health care system, we have a sick care system. Which is the term coined by the acclaimed systems thinker, the late Russell Ackoff (see the video about half way down).

We aspire to a transformation-to-better but we only appear to be able to achieve a transformation-to-worse. That is depressing.

My source of delight was sharing the stories of those who are stepping up and are transforming themselves and their bits of the world; and how they are doing that by helping each other to learn “how to do it” – a small bite at a time.

Here is one excellent example: a diagnostic study looking at the root cause of the waiting time for school-age pupils to receive a health-protecting immunisation.

So what sort of transformation does the NHS need?

A transformation in the way it delivers care by elimination of the fragmentation that is the primary cause of the distrust, queues, waits, frustration, chaos and ever-increasing costs?

A transformation from purposeless and reactive; to purposeful and proactive?

A transformation from the disappointment that flows from the mismatch between intent and impact; to the delight that flows from discovering that there is a way forward; that there is a well understood science that underpins it; and a growing body of evidence that proves its effectiveness. The Science of Improvement.

In a recent blog I shared the story of how it is possible to ‘melt queues‘ or more specifically how it is possible to teach anyone, who wants to learn, how to melt queues.

It is possible to do this for an outpatient clinic in one day.

So imagine what could happen if just 1% of consultants decided to improve their outpatient clinics using this quick-and-easy-to-learn-and-apply method? Those courageous and innovative consultants who are not prepared to drown in the Victim Vortex of despair and cynicism. And what could happen if they shared their improvement stories with their less optimistic colleagues? And what could happen if just a few of them followed the lead of the innovators?

Would that be a small transformation? Or the start of a much bigger one? Or both?

23/04/2016

Undiscussables

Last week I shared a link to Dr Don Berwick’s thought provoking presentation at the Healthcare Safety Congress in Sweden.

Near the end of the talk Don recommended six books, and I was reassured that I already had read three of them. Naturally, I was curious to read the other three.

One of the unfamiliar books was “Overcoming Organizational Defenses” by the late Chris Argyris, a professor at Harvard. I confess that I have tried to read some of his books before, but found them rather difficult to understand. So I was intrigued that Don was recommending it as an ‘easy read’. Maybe I am more of a dimwit than I previously believed! So fear of failure took over my inner-chimp and I prevaricated. I flipped into denial. Who would willingly want to discover the true depth of their dimwittedness!

Later in the week, I was forwarded a copy of a recently published paper that was on a topic closely related to a key thread in Dr Don’s presentation:

understanding variation.

The paper was by researchers who had looked at the Board reports of 30 randomly selected NHS Trusts to examine how information on safety and quality was being shared and used. They were looking for evidence that the Trust Boards understood the importance of variation and the need to separate ‘signal’ from ‘noise’ before making decisions on actions to improve safety and quality performance. This was a point Don had stressed too, so there was a link.

The randomly selected Trust Board reports contained 1488 charts, of which only 88 demonstrated the contribution of chance effects (i.e. noise). Of these, 72 showed the Shewhart-style control charts that Don demonstrated. And of these, only 8 stated how the control limits were constructed (which is an essential requirement for the chart to be meaningful and useful).

That is a validity yield of 8 out of 1488, or 0.54%, which is for all practical purposes zero. Oh dear!
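The arithmetic is easy to check. The short Python snippet below simply re-computes the percentages from the counts quoted above; the labels and the helper name are mine.

```python
# Counts quoted above from the review of 30 Trust Board reports.
total_charts = 1488
funnel = [
    ("show the contribution of chance", 88),
    ("are Shewhart-style control charts", 72),
    ("state how the control limits were constructed", 8),
]


def yield_percent(count, total):
    """Percentage of `total` represented by `count`."""
    return 100.0 * count / total


for label, count in funnel:
    share = yield_percent(count, total_charts)
    print(f"{count:>4} of {total_charts} {label}: {share:.2f}%")
```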

This chance combination of apparently independent events got me thinking.

Q1: What is the reason that NHS Trust Boards do not use these signal-and-noise separation techniques when it has been demonstrated, for at least 12 years to my knowledge, that they are very effective for facilitating improvement in healthcare? (e.g. Improving Healthcare with Control Charts by Raymond G. Carey was published in 2003).

Q2: Is there some form of “organizational defense” system in place that prevents NHS Trust Boards from learning useful ‘new’ knowledge?

So I surfed the Web to learn more about Chris Argyris and to explore in greater depth his concept of Single Loop and Double Loop learning. I was feeling like a dimwit again because to me it is not a very descriptive title! I suspect it is not to many others either.

I sensed that I needed to translate the concept into the language of healthcare and this is what emerged.

Single Loop learning is like treating the symptoms and ignoring the disease.

Double Loop learning is diagnosing the underlying disease and treating that.

So what are the symptoms?
The pain of NHS Trust failure on all dimensions – safety, delivery, quality and productivity (i.e. affordability for a not-for-profit enterprise).

And what are the signs?
The tell-tale sign is more subtle. It’s what is not present that is important. A serious omission. The missing bits are valid time-series charts in the Trust Board reports that show clearly what is signal and what is noise. This diagnosis is critical because the strategies for addressing them are quite different – as Julian Simcox eloquently describes in his latest essay. If we get this wrong and we act on our unwise decision, then we stand a very high chance of making the problem worse, and demoralizing ourselves and our whole workforce in the process! Does that sound familiar?

And what is the disease?
Undiscussables. Emotive subjects that are too taboo to table in the Board Room. And the issue of what is discussable is one of the undiscussables so we have a self-sustaining system. Anyone who attempts to discuss an undiscussable is breaking an unspoken social code. Another undiscussable is behaviour, and our social code is that we must not upset anyone so we cannot discuss ‘difficult’ issues. But by avoiding the issue (the undiscussable disease) we fail to address the root cause and end up upsetting everyone. We achieve exactly what we are striving to avoid, which is the technical definition of incompetence. And Chris Argyris labelled this as ‘skilled incompetence’.

Does an apparent lack of awareness of what is already possible fully explain why NHS Trust Boards do not use the tried-and-tested tool called a system behaviour chart to help them diagnose, design and deliver effective improvements in safety, flow, quality and productivity?

Or are there other forces at play as well?

Some deeper undiscussables perhaps?

02/04/2016

Culture – cause or effect?

The Harvard Business Review is worth reading because many of its articles challenge deeply held assumptions, and then back up the challenge with the pragmatic experience of those who have succeeded in overcoming the limiting beliefs.

So the heading on the April 2016 copy that awaited me on my return from an Easter break caught my eye: YOU CAN’T FIX CULTURE.

The successful leaders of major corporate transformations are agreed … the cultural change follows the technical change … and then the emergent culture sustains the improvement.

The examples presented include the Ford Motor Company, Delta Airlines, Novartis – so these are not corporate small fry!

The evidence suggests that the belief of “we cannot improve until the culture changes” is the mantra of failure of both leadership and management.

A health care system is characterised by a culture of risk avoidance. And for good reason. It is all too easy to harm while trying to heal! Primum non nocere is a core tenet – first do no harm.

But change and improvement imply taking risks – and those leaders of successful transformation know that the bigger risk by far is to become paralysed by fear and to do nothing. Continual learning from many small successes and many small failures is preferable to crisis learning after a catastrophic failure!

The UK healthcare system is in a state of chronic chaos. The evidence is there for anyone willing to look. And waiting for the NHS culture to change, or pushing for culture change first, appears to be a guaranteed recipe for further failure.

The HBR article suggests that it is better to stay focussed; to work within our circles of control and influence; to learn from others where knowledge is known, and where it is not – to use small, controlled experiments to explore new ground.

And I know this works because I have done it and I have seen it work. Just by focussing on what is important to every member on the team; focussing on fixing what we could fix; not expecting or waiting for outside help; gathering and sharing the feedback from patients on a continuous basis; and maintaining patient and team safety while learning and experimenting … we have created a micro-culture of high safety, high efficiency, high trust and high productivity. And we have shared the evidence via JOIS.

The micro-culture required to maintain the safety, flow, quality and productivity improvements emerged and evolved along with the improvements.

It was part of the effect, not the cause.

So the concept of ‘fix the system design flaws and the continual improvement culture will emerge’ seems to work at macro-system and at micro-system levels.

We just need to learn how to diagnose and treat healthcare system design flaws. And that is known knowledge.

So what is the next excuse? Too busy?

12/03/2016

Grit in the Oyster

The word pearl is a metaphor for something rare, beautiful, and valuable.

Pearls are formed inside the shell of certain mollusks as a defense mechanism against a potentially threatening irritant.

The mollusk creates a pearl sac to seal off the irritation.

And so it is with change and improvement. The growth of precious pearls of improvement wisdom – the ones that develop slowly over time – is triggered by an irritant.

Someone asking an uncomfortable question perhaps, or presenting some information that implies that an uncomfortable question needs to be asked.

About seven years ago a question was asked “Would improving healthcare flow and quality result in lower costs?”

It is a good question because some believe that it would and some believe that it would not. So an experiment to test the hypothesis was needed.

The Health Foundation stepped up to the challenge and funded a three year project to find the answer. The design of the experiment was simple. Take two oysters and introduce an irritant into them and see if pearls of wisdom appeared.

The two ‘oysters’ were Sheffield Hospital and Warwick Hospital and the irritant was Dr Kate Silvester who is a doctor and manufacturing system engineer and who has a bit-of-a-reputation for asking uncomfortable questions and backing them up with irrefutable information.

Two rare and precious pearls did indeed grow.

In Sheffield, it was proved that by improving the design of their elderly care process they improved the outcome for their frail, elderly patients. More went back to their own homes and fewer left via the mortuary. That was the quality and safety improvement. They also showed a shorter length of stay and a reduction in the number of beds needed to store the work in progress. That was the flow and productivity improvement.

What was interesting to observe was how difficult it was to get these profoundly important findings published. It appeared that a further irritant had been created for the academic peer review oyster!

The case study was eventually published in Age and Ageing 2014; 43: 472-77.

The pearl that grew around this seed is the Sheffield Microsystems Academy.

In Warwick, it was proved that the A&E 4 hour performance could be improved by focussing on improving the design of the processes within the hospital, downstream of A&E. For example, a redesign of the phlebotomy and laboratory process to ensure that clinical decisions on a ward round are based on today’s blood results.

This specific case study was eventually published as well, but by a different path – one specifically designed for sharing improvement case studies – JOIS 2015; 22:1-30

And the pearls of wisdom that developed as a result of irritating many oysters in the Warwick bed are clearly described by Glen Burley, CEO of Warwick Hospital NHS Trust in this recent video.

Getting the results of all these oyster bed experiments published required irritating the Health Foundation oyster … but a pearl grew there too and emerged as the full Health Foundation report which can be downloaded here.

So if you want to grow a fistful of improvement and a bagful of pearls of wisdom … then you will need to introduce a bit of irritation … and Dr Kate Silvester is a proven source of grit for your oyster!

09/01/2016

The Two-Points-In-Time Comparison Trap

[Bzzzzzz] Bob’s phone vibrated to remind him it was time for the regular ISP remote coaching session with Leslie. He flipped the lid of his laptop just as Leslie joined the virtual meeting.

<Leslie> Hi Bob, and Happy New Year!

<Bob> Hello Leslie and I wish you well in 2016 too. So, what shall we talk about today?

<Leslie> Well, given the time of year I suppose it should be the Winter Crisis. The regularly repeating annual winter crisis. The one that feels more like the perpetual winter crisis.

<Bob> OK. What specifically would you like to explore?

<Leslie> Specifically? The habit of comparing this year with last year to answer the burning question “Are we doing better, the same or worse?” Especially given the enormous effort and political attention that has been focused on the hot potato of A&E 4-hour performance.

<Bob> Aaaaah! That old chestnut! Two-Points-In-Time comparison.

<Leslie> Yes. I seem to recall you usually add the word ‘meaningless’ to that phrase.

<Bob> H’mm. Yes. It can certainly become that, but there is a perfectly good reason why we do this.

<Leslie> Indeed, it is because we see seasonal cycles in the data so we only want to compare the same parts of the seasonal cycle with each other. The apples and oranges thing.

<Bob> Yes, that is part of it. So what do you feel is the problem?

<Leslie> It feels like a lottery! It feels like whether we appear to be better or worse is just the outcome of a random toss.

<Bob> Ah! So we are back to the question “Is the variation I am looking at signal or noise?”

<Leslie> Yes, exactly.

<Bob> And we need a scientifically robust way to answer it. One that we can all trust.

<Leslie> Yes.

<Bob> So how do you decide that now in your improvement work? How do you do it when you have data that does not show a seasonal cycle?

<Leslie> I plot-the-dots and use an XmR chart to alert me to the presence of the signals I am interested in – especially a change of the mean.
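For readers who have not met one, the arithmetic behind an XmR chart is short. The sketch below uses the standard scaling constant for individual values (2.66 times the average moving range); the function names are hypothetical, and the length of run that counts as a shift in the mean is a convention that varies between texts (eight or nine consecutive points are both commonly quoted).

```python
from statistics import mean


def xmr_limits(values):
    """Return (centre, lower, upper) natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two points")
    centre = mean(values)
    # Moving range = absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)  # 2.66 = 3 / 1.128 (d2 for n = 2)
    return centre, centre - spread, centre + spread


def points_outside(values, lower, upper):
    """Indices of points that fall outside the limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def longest_run_one_side(values, centre):
    """Longest run of consecutive points strictly on one side of the centre."""
    longest = current = 0
    previous_side = 0
    for v in values:
        side = (v > centre) - (v < centre)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            current += 1
        else:
            current = 1 if side != 0 else 0
        previous_side = side
        longest = max(longest, current)
    return longest
```

In use, the limits would normally be calculated from a baseline period and later points judged against them: a point outside the limits, or a long run on one side of the centre line, is the prompt to go and look for a cause.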

<Bob> Good. So why can we not use that approach here?

<Leslie> Because the seasonal cycle is usually a big signal and it can swamp the smaller change I am looking for.

<Bob> Exactly so. Which is why we have to abandon the XmR chart and fall back on the two-points-in-time comparison?

<Leslie> That is what I see. That is the argument I am presented with and I have no answer.

<Bob> OK. It is important to appreciate that the XmR chart was not designed for doing this. It was designed for monitoring the output quality of a stable and capable process. It was designed to look for early warning signs; small but significant signals that suggest future problems. The purpose is to alert us so that we can identify the root causes, correct them and then avoid a future problem.

<Leslie> So we are using the wrong tool for the job. I sort of knew that. But surely there must be a better way than a two-points-in-time comparison!

<Bob> There is, but first we need to understand why a TPIT is a poor design.

<Leslie> Excellent. I’m all ears.

<Bob> A two point comparison is looking at the difference between two values, and that difference can be positive, zero or negative. In fact, it is very unlikely to be zero because noise is always present.

<Leslie> OK.

<Bob> Now, both of the values we are comparing are single samples from two bigger pools of data. It is the difference between the pools that we are interested in but we only have single samples of each one … so they are not measurements … they are estimates.

<Leslie> So, when we do a TPIT comparison we are looking at the difference between two samples that come from two pools that have inherent variation and may or may not actually be different.

<Bob> Well put. We give that inherent variation a name … we call it variance … and we can quantify it.

<Leslie> So if we do many TPIT comparisons then they will show variation as well … for two reasons; first because the pools we are sampling have inherent variation; and second just from the process of sampling itself. It was the first lesson in the ISP-1 course.

<Bob> Well done! So the question is: “How does the variance of the TPIT sample compare with the variance of the pools that the samples are taken from?”

<Leslie> My intuition tells me that it will be less because we are subtracting.

<Bob> Your intuition is half-right. The effect of the variation caused by the signal will be less … that is the rationale for the TPIT after all … but the same does not hold for the noise.

<Leslie> So the noise variation in the TPIT is the same?

<Bob> No. It is increased.

<Leslie> What! But that would imply that when we do this we are less likely to be able to detect a change because a small shift in signal will be swamped by the increase in the noise!

<Bob> Precisely. And the degree that the variance increases by is mathematically predictable … it is increased by a factor of two.

<Leslie> So as we usually present variation as the square root of the variance, to get it into the same units as the metric, then that will be increased by the square root of two … 1.414

<Bob> Yes.
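The factor of two that Bob quotes follows from the standard rule for the variance of a difference. The symbols below are not from the conversation: X_1 and X_2 stand for the two sampled values, each with noise variance sigma squared, and the noise in the two samples is assumed to be independent, so the covariance term is zero.

```latex
\begin{aligned}
D &= X_2 - X_1 \\
\operatorname{Var}(D) &= \operatorname{Var}(X_1) + \operatorname{Var}(X_2) - 2\operatorname{Cov}(X_1, X_2) \\
                      &= \sigma^2 + \sigma^2 - 0 = 2\sigma^2 \\
\operatorname{SD}(D)  &= \sqrt{2}\,\sigma \approx 1.414\,\sigma
\end{aligned}
```

Subtracting removes whatever the two values have in common, such as the seasonal level, but the independent noise in each value adds rather than cancels.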

<Leslie> I need to put this counter-intuitive theory to the test!

<Bob> Excellent. Accept nothing on faith. Always test assumptions. And how will you do that?

<Leslie> I will use Excel to generate a big series of normally distributed random numbers; then I will calculate a series of TPIT differences using a fixed time interval; then I will calculate the means and variations of the two sets of data; and then I will compare them.

<Bob> Excellent. Let us reconvene in ten minutes when you have done that.

10 minutes later …

<Leslie> Hi Bob, OK I am ready and I would like to present the results as charts. Is that OK?

<Bob> Perfect!

<Leslie> Here is the first one. I used our A&E performance data to give me some context. We know that on Mondays we have an average of 210 arrivals with an approximately normal distribution and a standard deviation of 44; so I used these values to generate the random numbers. Here is the simulated Monday Arrivals chart for two years.

<Bob> OK. It looks stable as we would expect and I see that you have plotted the sigma levels which look to be just under 50 wide.

<Leslie> Yes, it shows that my simulation is working. So next is the chart of the comparison of arrivals for each Monday in Year 2 compared with the corresponding week in Year 1.

<Bob> Oooookaaaaay. What have we here? Another stable chart with a mean of about zero. That is what we would expect given that there has not been a change in the average from Year 1 to Year 2. And the variation has increased … sigma looks to be just over 60.

<Leslie> Yes! Just as the theory predicted. And this is not a spurious answer. I ran the simulation dozens of times and the effect is consistent! So, I am forced by reality to accept the conclusion that when we do two-point-in-time comparisons to eliminate a cyclical signal we will reduce the sensitivity of our test and make it harder to detect other signals.
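
Leslie used Excel, but the same experiment can be sketched in a few lines of Python. This is a minimal sketch and not Leslie's actual spreadsheet: the function name, the seed and the choice of the standard library random generator are our assumptions. The mean of 210 and standard deviation of 44 are the values quoted above.

```python
import random
import statistics


def simulate_tpit(weeks_per_year=52, mean=210.0, sd=44.0, seed=1):
    """Simulate two years of Monday arrivals and the year-on-year differences."""
    rng = random.Random(seed)
    # Independent, normally distributed arrivals for each Monday in each year.
    year1 = [rng.gauss(mean, sd) for _ in range(weeks_per_year)]
    year2 = [rng.gauss(mean, sd) for _ in range(weeks_per_year)]
    # Two-point-in-time comparison: Year 2 minus the corresponding week in Year 1.
    diffs = [b - a for a, b in zip(year1, year2)]
    return {
        "sd_arrivals": statistics.stdev(year1 + year2),
        "mean_diff": statistics.mean(diffs),
        "sd_diff": statistics.stdev(diffs),
    }


result = simulate_tpit()
print("SD of arrivals:    ", round(result["sd_arrivals"], 1))
print("Mean of difference:", round(result["mean_diff"], 1))
print("SD of difference:  ", round(result["sd_diff"], 1))
```

With only 52 differences the estimate wobbles from run to run, which is why repeating the simulation with different seeds (or a larger `weeks_per_year`) is worthwhile. The theory predicts a standard deviation for the differences of about 44 × 1.414 ≈ 62, which is consistent with the "just over 60" on the second chart.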

<Bob> Good work Leslie! Now that you have demonstrated this to yourself using a carefully designed and conducted simulation experiment, you will be better able to explain it to others.

<Leslie> So how do we avoid this problem?

<Bob> An excellent question and one that I will ask you to ponder on until our next chat. You know the answer to this … you just need to bring it to conscious awareness.

21/11/2015

Whip or WIP?

The NHS appears to be suffering from some form of obsessive-compulsive disorder.

OCD sufferers feel extreme anxiety in certain situations. Their feelings drive their behaviour which is to reduce the perceived cause of their feelings. It is a self-sustaining system because their perception is distorted and their actions are largely ineffective. So their anxiety is chronic.

Perfectionists demonstrate a degree of obsessive-compulsive behaviour too.

In the NHS the triggers are called ‘targets’ and usually take the form of failure metrics linked to arbitrary performance specifications.

The anxiety is the fear of failure and its unpleasant consequences: the name-shame-blame-game.

So a veritable industry has grown around ways to mitigate the fear. A very expensive and only partially effective industry.

Data is collected, cleaned, manipulated and uploaded to the Mothership (aka NHS England). There it is further manipulated, massaged and aggregated. Then the accumulated numbers are posted on-line, every month for anyone with a web-browser to scrutinise and anyone with an Excel spreadsheet to analyse.

An ocean of measurements is boiled and distilled into a few drops of highly concentrated and sanitized data and, in the process, most of the useful information is filtered out, deleted or distorted.

For example …

One of the failure metrics that sends a shiver of angst through a Chief Operating Officer (COO) is the failure to deliver the first definitive treatment for any patient within 18 weeks of referral from a generalist to a specialist.

The infamous and feared 18-week target.

Service providers, such as hospitals, are actually fined by their Clinical Commissioning Groups (CCGs) for failing to deliver-on-time. Yes, you heard that right … one NHS organisation financially penalises another NHS organisation for failing to deliver a result over which they have only partial control.

Service providers do not control how many patients are referred, or a myriad of other reasons that delay referred patients from attending appointments, tests and treatments. But the service providers are still accountable for the outcome of the whole process.

This ‘Perform-or-Pay-The-Price Policy’ creates the perfect recipe for a lot of unhappiness for everyone … which is exactly what we hear and what we see.

So what distilled wisdom does the Mothership share? Here is a snapshot …

Q1: How useful is this table of numbers in helping us to diagnose the root causes of long waits, and how does it help us to decide what to change in our design to deliver a shorter waiting time and more productive system?

A1: It is almost completely useless (in this format).

So what actually happens is that the focus of management attention is drawn to the part just before the speed camera takes the snapshot … the bit between 14 and 18 weeks.

Inside that narrow time-window we see a veritable frenzy of target-failure-avoiding behaviour.

Clinical priority is side-lined and management priority takes over. This is a management emergency! After all, fines-for-failure are only going to make the already bad financial situation even worse!

The outcome of this fire-fighting is that the bigger picture is ignored. The focus is on the ‘whip’ … and avoiding it … because it hurts!

Message from the Mothership: “Until morale improves the beatings will continue”.

The good news is that the undigestible data liquor does harbour some very useful insights. All we need to do is to present it in a more palatable format … as pictures of system behaviour over time.

We need to use the data to calculate the work-in-progress (=WIP).

And then we need to plot the WIP in time-order so we can see how the whole system is behaving over time … how it is changing and evolving. It is a dynamic living thing, it has vitality.
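
One way to do that calculation is sketched below. It assumes (our assumption, the post does not describe the layout of the published table) that each monthly snapshot is a list of counts of patients still waiting, one count per weeks-waited band, so the WIP for the month is simply the total. The function name and the numbers are made up for illustration.

```python
def wip_series(snapshots):
    """Return (month, total patients still waiting) pairs in time order."""
    return [(month, sum(bands)) for month, bands in sorted(snapshots.items())]


# Made-up monthly snapshots: patients still waiting, by weeks-waited band.
example = {
    "2015-01": [120, 95, 60, 20],
    "2015-02": [130, 100, 70, 25],
    "2015-03": [125, 110, 80, 30],
}

for month, wip in wip_series(example):
    print(month, wip)
```

Plotting the second column against the first, in time order, gives the WIP chart.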

So here is the WIP chart using the distilled wisdom from the Mothership.

And this picture does not require a highly trained data analyst or statistician to interpret it for us … a Mark I eyeball linked to 1.3 kg of wetware running ChimpOS 1.0 is enough … and if you are reading this then you must already have that hardware and software.

Two patterns are obvious:

1) A cyclical pattern that appears to have an annual frequency, a seasonal pattern. The WIP is higher in the summer than in the winter. Eh? What is causing that?

2) After an initial rapid fall in 2008 the average level was steady for 4 years … and then after March 2012 it started to rise. Eh? What is causing that?

The purpose of a WIP chart is to stimulate questions such as:

Q1: What happened in March 2012 that might have triggered this change in system behaviour?

Q2: What other effects could this trigger have caused and is there evidence for them?

A1: In March 2012 the Health and Social Care Act 2012 became Law. In the summer of 2012 the shiny new and untested Clinical Commissioning Groups (CCGs) were authorised to take over the reins from the exiting Primary Care Trusts (PCTs) and Strategic Health Authorities (SHAs). The vast £80bn annual pot of tax-payer cash was now in the hands of well-intended GPs who believed that they could do a better commissioning job than non-clinicians. The accountability for outcomes had been deftly delegated to the doctors. And many of the new CCG managers were the same ones who had collected their redundancy cheques when the old system was shut down. Now that sounds like a plausible system-wide change! A massive political experiment was underway and the NHS was the guinea-pig.

A2: Another NHS failure metric is the A&E 4-hour wait target which, worryingly, also shows a deterioration that appears to have started just after July 2010, i.e. just after the new Government was elected into power. Maybe that had something to do with it? Maybe it would have happened whichever party won at the polls.

A plausible temporal association does not constitute proof – and we cannot conclude that a political move to a CCG-led NHS has caused the observed behaviour. Retrospective analysis alone is not able to establish the cause.

It could just as easily be that something else caused these behaviours. And it is important to remember that there are usually many causal factors combining together to create the observed effect.

And unraveling that Gordian Knot is the work of analysts, statisticians, economists, historians, academics, politicians and anyone else with an opinion.

We have a more pressing problem. We have a deteriorating NHS that needs urgent resuscitation!

So what can we do?

One thing we can do immediately is to make better use of our data by presenting it in ways that are easier to interpret … such as a work in progress chart.

Doing that will trigger different conversations; ones spiced with more curiosity and laced with less cynicism.

We can add more context to our data to give it life and meaning. We can season it with patient and staff stories to give it emotional impact.

And we can deepen our understanding of what causes lead to what effects.

And with that deeper understanding we can begin to make wiser decisions that will lead to more effective actions and better outcomes.

This is all possible. It is called Improvement Science.

And as we speak there is an experiment running … a free offer to doctors-in-training to learn the foundations of improvement science in healthcare (FISH).

In just two weeks 186 have taken up that offer and 13 have completed the course!

And this vanguard of curious and courageous innovators have discovered a whole new world of opportunity that they were completely unaware of before. But not anymore!

So let us ease off applying the whip and ease in the application of WIP.

PostScript

Here is a short video describing how to create, animate and interpret a form of diagnostic Vitals Chart® using the raw data published by NHS England. This is a training exercise from the Improvement Science Practitioner (level 2) course.

How to create an 18 weeks animated Bucket Brigade Chart (BBC)

12/09/2015

A Case of Chronic A&E Pain: Part 1

The blog last week seems to have caused a bit of a stir … so this week we will continue on the same theme.

I’m Dr Bob and I am a hospital doctor: I help to improve the health of poorly hospitals.

And I do that using the Science of Improvement – which is the same as all sciences, there is a method to it.

Over the next few weeks I will outline, in broad terms, how this is done in practice.

And I will use the example of a hospital presenting with pain in their A&E department. We will call it St.Elsewhere’s ® Hospital … a fictional name for a real patient.

It is a while since I learned science at school … so I thought a bit of a self-refresher would be in order … just to check that nothing fundamental has changed.

This is what I found on page 2 of a current GCSE chemistry textbook.

Note carefully that the process starts with observations; hypotheses come after that; then predictions and finally designing experiments to test them.

The scientific process starts with study.

Which is reassuring because when helping a poorly patient or a poorly hospital that is exactly where we start.

So, first we need to know the symptoms; only then can we start to suggest some hypotheses for what might be causing those symptoms – a differential diagnosis; and then we look for more specific and objective symptoms and signs of those hypothetical causes.

<Dr Bob> What is the presenting symptom?

<StE> “Pain in the A&E Department … or more specifically the pain is being felt by the Executive Department who attribute the source to the A&E Department. Their pain is that of 4-hour target failure.”

<Dr Bob> Are there any other associated symptoms?

<StE> “Yes, a whole constellation. Complaints from patients and relatives; low staff morale, high staff turnover, high staff sickness, difficulty recruiting new staff, and escalating locum and agency costs. The list is endless.”

<Dr Bob> How long have these symptoms been present?

<StE> “As long as we can remember.”

<Dr Bob> Are the symptoms staying the same, getting worse or getting better?

<StE> “Getting worse. It is worse in the winter and each winter is worse than the last.”

<Dr Bob> And what have you tried to relieve the pain?

<StE> “We have tried everything and anything – business process re-engineering, balanced scorecards, Lean, Six Sigma, True North, Blue Oceans, Golden Hours, Perfect Weeks, Quality Champions, performance management, pleading, podcasts, huddles, cuddles, sticks, carrots, blogs and even begging. You name it we’ve tried it! The current recommended treatment is to create a swarm of specialist short-stay assessment units – medical, surgical, trauma, elderly, frail elderly just to name a few.”

<Dr Bob> And how effective have these been?

<StE> “Well some seemed to have limited and temporary success but nothing very spectacular or sustained … and the complexity and cost of our processes just seem to go up and up with each new initiative. It is no surprise that everyone is change weary and cynical.”

The pattern of symptoms is that of a chronic (longstanding) illness that has seasonal variation, which is getting worse over time and the usual remedies are not working.

And it is obvious that we do not have a clear diagnosis; or know if our unclear diagnosis is incorrect; or know if we are actually dealing with an incurable disease.

So first we need to focus on establishing the diagnosis.

And Dr Bob is already drawing up a list of likely candidates … with carveoutosis at the top.

<Dr Bob> Do you have any data on the 4-hour target pain? Do you measure it?

<StE> “We are awash with data! I can send the quarterly breach performance data for the last ten years!”

<Dr Bob> Excellent, that will be useful as it should confirm that this is a chronic and worsening problem but it does not help establish a diagnosis. What we need is more recent, daily data. Just the last six months should be enough. Do you have that?

<StE> “Yes, that is how we calculate the quarterly average that we are performance managed on. Here is the spreadsheet. We are ‘required’ to have fewer than 5% 4-hour breaches on average. Or else.”

This is where Dr Bob needs some diagnostic tools. He needs to see the pain scores presented as a picture … so he can see the pattern over time … because it is a very effective way to generate plausible causal hypotheses.

Dr Bob can do this on paper, or with an Excel spreadsheet, or use a tool specifically designed for the job. He selects his trusted visualisation tool: BaseLine©.

<Dr Bob> This is your A&E pain data plotted as a time-series chart. At first glance it looks very chaotic … that is shown by the wide and flat histogram. Is that how it feels?

<StE> “That is exactly how it feels … earlier in the year it was unremitting pain and now we have a constant background ache with sharp, severe, unpredictable stabbing pains on top. I’m not sure what is worse!”

<Dr Bob> We will need to dig a bit deeper to find the root cause of this chronic pain … we need to identify the diagnosis or diagnoses … and your daily pain data should offer us some clues.

So I have plotted your data in a different way … grouping by day of the week … and this shows there is a weekly pattern to your pain. It looks worse on Mondays and least bad on Fridays. Is that your experience?

<StE> “Yes, the beginning of the week is definitely worse … because it is like a perfect storm … more people referred by their GPs on Mondays and the hospital is already full with the weekend backlog of delayed discharges so there are rarely beds to admit new patients into until late in the day. So they wait in A&E.”
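
The day-of-the-week grouping that Dr Bob describes can be sketched as below. This is a minimal illustration and not how BaseLine© does it: the function name is ours and the daily breach percentages are made up.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_by_weekday(daily):
    """Group (date, value) pairs by day of the week and return the mean for each day."""
    groups = defaultdict(list)
    for day, value in daily:
        groups[day.weekday()].append(value)  # weekday() returns 0 for Monday
    return {WEEKDAYS[i]: mean(groups[i]) for i in sorted(groups)}


# Made-up daily 4-hour breach percentages, for illustration only.
sample = [
    (date(2015, 9, 7), 10.0),   # a Monday
    (date(2015, 9, 11), 4.0),   # a Friday
    (date(2015, 9, 14), 14.0),  # the following Monday
]
print(mean_by_weekday(sample))
```

With six months of real daily data the same grouping would show whether Mondays really are worse than Fridays, and by how much.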

Dr Bob’s differential diagnosis is firming up … he still suspects acute-on-chronic carveoutosis as the primary cause but he now has identified an additional complication … Forrester’s Syndrome.

And Dr Bob suspects an unmentioned problem … that the patient has been traumatised by a blunt datamower!

So that is the evidence we will look for next …

05/09/2015

The Catastrophe is Coming

This week an interesting report was published by Monitor – about some possible reasons for the A&E debacle that England experienced in the winter of 2014.

Summary At A Glance

“91% of trusts did not meet the A&E 4-hour maximum waiting time standard last winter – this was the worst performance in 10 years”.

So it seems a bit odd that the very detailed econometric analysis and the testing of “Ten Hypotheses” did not look at the pattern of change over the previous 10 years … it just compared Oct-Dec 2014 with the same period for 2013! And the conclusion: “Hospitals were fuller in 2014”. H’mm.

The data needed to look back 10 years is readily available on the various NHS England websites … so here it is plotted as simple time-series charts. These are called system behaviour charts or SBCs. Our trusted analysis tools will be a Mark I Eyeball connected to the 1.3 kg of wetware between our ears that runs ChimpOS 1.0 … and we will look back 11 years to 2004.

First we have the A&E Arrivals chart … about 3.4 million arrivals per quarter. The annual cycle is obvious … higher in the summer and falling in the winter. And when we compare the first five years with the last six years there has been a small increase of about 5% and that seems to associate with a change of political direction in 2010.

So over 11 years the average A&E demand has gone up … a bit … but only by about 5%.

In stark contrast the number of A&E arrivals that are admitted to hospital has risen relentlessly over the same 11 year period by about 50% … that is about 5% per annum … ten times the increase in arrivals … and with no obvious step in 2010. We can see the annual cycle too. It is like a ratchet. Click click click.
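
As a rough check on the arithmetic, taking the rises of "about 50%" and "about 5%" over 11 years at face value (the rounding is ours, and "about 5% per annum" is itself a rounded figure):

```latex
\begin{align*}
\text{simple average rate} &= \frac{50\%}{11} \approx 4.5\% \text{ per annum}, \\
\text{compound rate} &= 1.5^{1/11} - 1 \approx 3.8\% \text{ per annum}, \\
\frac{\text{rise in admissions}}{\text{rise in arrivals}} &= \frac{50\%}{5\%} = 10 .
\end{align*}
```

Either way the admissions are growing at several percent a year, while the arrivals grew by about that much over the whole 11 years.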

But that does not make sense. Where are these extra admissions going to? We can only conclude that over 11 years we have progressively added more places to admit A&E patients into. More space-capacity to store admitted patients … so we can stop the 4-hour clock perhaps? More emergency assessment units perhaps? Places to wait with the clock turned off perhaps? The charts imply that our threshold for emergency admission has been falling: Admission has become increasingly the ‘easier option’ for whatever reason. So why is this happening? Do more patients need to be admitted?

In a recent empirical study we asked elderly patients about their experience of the emergency process … and we asked them just after they had been discharged … when it was still fresh in their memories. A worrying pattern emerged. Many said that they had been admitted despite them saying they did not want to be. In other words they did not willingly consent to admission … they were coerced.

This is anecdotal data so, by implication, it is wholly worthless … yes? Perhaps from a statistical perspective but not from an emotional one. It is a red petticoat being waved that should not be ignored. Blissful ignorance comes from ignoring anecdotal stuff like this. Emotionally uncomfortable anecdotal stories. Ignore the early warning signs and suffer the potentially catastrophic consequences.

And here is the corresponding A&E 4-hour Target Failure chart. Up to 2010 the imposed target was 98% success (i.e. 2% acceptable failure) and, after a bit of “encouragement” in 2004-5, this was actually achieved in some of the summer months (when the A&E demand was highest remember).

But with a change of political direction in 2010 the “hated” 4-hour target was diluted down to 95% … so a 5% failure rate was now ‘acceptable’ politically, operationally … and clinically.

So it is no huge surprise that this is what was achieved … for a while at least.

In the period 2010-13 the primary care trusts (PCTs) were dissolved and replaced by clinical commissioning groups (CCGs) … the doctors were handed the ignition keys to the juggernaut that was already heading towards the cliff.

The charts suggest that the seeds were already well sown by 2010 for an evolving catastrophe that peaked last year; and the changes in 2010 and 2013 may have just pressed the accelerator pedal a bit harder. And if the trend continues it will be even worse this coming winter. Worse for patients and worse for staff and worse for commissioners and worse for politicians. Lose lose lose lose.

So to summarise the data from NHS England’s own website:

1. A&E arrivals have gone up 5% over 11 years.
2. Admissions from A&E have gone up 50% over 11 years.
3. Since lowering the threshold for acceptable A&E performance from 98% to 95% the system has become unstable and “fallen off the cliff” … but remember, a temporal association does not prove causation.

So what has triggered the developing catastrophe?

Well, it is important to appreciate that when a patient is admitted to hospital it represents an increase in workload for every part of the system that supports the flow through the hospital … not just the beds. Beds represent space-capacity. They are just where patients are stored. We are talking about flow-capacity; and that means people, consumables, equipment, data and cash.

So if we increase emergency admissions by 50% then, if nothing else changes, we will need to increase the flow-capacity by 50% and the space-capacity to store the work-in-progress by 50% too. This is called Little’s Law. It is a mathematically proven Law of Flow Physics. It is not negotiable.
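
In symbols, Little's Law for a stable system can be written as below. The letters are the conventional ones and are our labels, not the post's: L is the average work-in-progress (patients in the system), lambda is the average flow rate (admissions per unit time) and W is the average time each patient spends in the system.

```latex
\begin{align*}
L &= \lambda \, W, \\
\lambda' = 1.5\,\lambda \ \text{ and } \ W' = W \quad &\Longrightarrow \quad L' = \lambda' W' = 1.5\,\lambda W = 1.5\,L .
\end{align*}
```

So "if nothing else changes" means that the time in the system stays the same, and then the space needed to hold the work-in-progress has to grow in step with the admissions.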

So have we increased our flow-capacity and our space-capacity (and our costs) by 50%? I don’t know. That data is not so easy to trawl from the websites. It will be there though … somewhere.

What we have seen is an increase in bed occupancy (the red box on Monitor’s graphic above) … but not a 50% increase … that is impossible if the occupancy is already over 85%. A hospital is like a rigid metal box … it cannot easily expand to accommodate a growing queue … so the inevitable result is an increase in the ‘pressure’ inside. We have created an emergency care pressure cooker. Well lots of them actually.
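
To see why a 50% rise cannot simply be absorbed as higher occupancy, here is a minimal sketch using Little's Law. The hospital size, admission rate and length of stay are made-up illustrative numbers chosen to give 85% occupancy, and the function names are ours.

```python
def beds_in_use(admissions_per_day, mean_stay_days):
    """Little's Law: average occupied beds = admission rate x average length of stay."""
    return admissions_per_day * mean_stay_days


def occupancy(admissions_per_day, mean_stay_days, beds):
    """Average fraction of beds occupied (a value above 1.0 cannot physically happen)."""
    return beds_in_use(admissions_per_day, mean_stay_days) / beds


# Illustrative numbers only (not taken from the NHS England data).
BEDS = 100
STAY_DAYS = 5.0
before = occupancy(17.0, STAY_DAYS, BEDS)        # 85 beds in use on average
after = occupancy(17.0 * 1.5, STAY_DAYS, BEDS)   # 50% more admissions, same stay

print(f"before: {before:.1%}  after: {after:.1%}")
```

The "after" figure comes out above 100%, which a rigid box cannot deliver. Something else has to give: more beds, a shorter stay, or a queue building up outside the box.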

And that is exactly what the staff who work inside hospitals say it feels like.

And eventually the relentless pressure and daily hammering causes the system to start to weaken and fail, gradually at first then catastrophically … which is exactly what the NHS England data charts are showing.

So what is the solution? More beds?

Nope. More beds will create more space and that will relieve the pressure … for a while … but it will not address the root cause of why we are admitting 50% more patients than we used to; and why we seem to need to increase the pressure inside our hospitals to squeeze the patients through the process and extrude them out of the various exit nozzles.

Those are the questions we need to have understandable and actionable answers to.

Q1: Why are we admitting 5% more of the same A&E arrivals each year rather than delivering what they need in 4 hours or less and returning them home? That is what the patients are asking for.

Q2: Why do we have to push patients through the in-hospital process rather than pulling them through? The staff are willing to work but not inside a pressure cooker.

A more sensible improvement strategy is to look at the flow processes within the hospital and ensure that all the steps and stages are pulling together to the agreed goals and plan for each patient. The clinical management plan that was decided when the patient was first seen in A&E. The intended outcome for each patient and the shortest and quickest path to achieving it.

Our target is not just a departure within 4 hours of arriving in A&E … it is a competent diagnosis (study) and an actionable clinical management plan (plan) within 4 hours of arriving; and then a process that is designed to deliver (do) it … for every patient. Right, first time, on time, in full and at a cost we can afford.

Q: Do we have that?
A: Nope.

Q: Is that within our gift to deliver?
A: Yup.

Q: So what is the reason we are not already doing it?
A: Good question. Who in the NHS is trained how to do system-wide flow design like this?

13/07/2015

Good Science, an antidote to Ben Goldacre’s “Bad Science”

by Julian Simcox & Terry Weight

Ben Goldacre has spent several years popularizing the idea that we all ought to be more interested in science.

Every day he writes and tweets examples of “bad science”, and about getting politicians and civil servants to be more evidence-based; about how governmental interventions should be more thoroughly tested before being rolled-out to the hapless citizen; about how the development and testing of new drugs should be more transparent to ensure the public get drugs that actually make a difference rather than risk harm; and about bad statistics – the kind that “make clever people do stupid things”(8).

Like Ben we would like to point the public sector, in particular the healthcare sector and its professionals, toward practical ways of doing more of the good kind of science, but just what is GOOD science?

In collaboration with the Cabinet Office’s behaviour insights team, Ben has recently published a polemic (9) advocating evidence-based government policy. For us this too is commendable, yet there is a potentially grave error of omission in their paper which seems to fixate upon just a single method of research, and risks setting-up the unsuspecting healthcare professional for failure and disappointment – as Abraham Maslow once famously said

“.. it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail”(17)

We question the need for the new Test, Learn and Adapt (TLA) model he offers because the NHS already possesses such a model – one which in our experience is more complete and often simpler to follow – it is called the “Improvement Model”(15) – and via its P-D-S-A mnemonic (Plan-Do-Study-Act) embodies the scientific method.

Moreover there is a preexisting wealth of experience on how best to embed this thinking within organisations – from top-to-bottom and importantly from bottom-to-top; experience that has been accumulating for fully nine decades – and though originally established in industrial settings has long since spread to services.

We are this week publishing two papers, one longer and one shorter, in which we start by defining science, ruing the dismal way in which it is perennially conveyed to children and students, the majority of whom leave formal education without understanding the power of discovery or gaining any first hand experience of the scientific method.

View Shorter Version Abstract

We argue that if science were to be defined around discovery, and learning cycles, and built upon observation, measurement and the accumulation of evidence – then good science could vitally be viewed as a process rather than merely as an externalized entity. These things comprise the very essence of what Don Berwick refers to as Improvement Science (2) as embodied by the Institute for Healthcare Improvement (IHI) and in the NHS’s Model for Improvement.

We also aim to bring an evolutionary perspective to the whole idea of science, arguing that its time has been coming for five centuries, yet is only now more fully arriving. We suggest that in a world where many at school have been turned-off science, the propensity to be scientific in our daily lives – and at work – makes a vast difference to the way people think about outcomes and their achievement. This is especially so if those who take a perverse pride in saying they avoided science at school, or who freely admit they do not do numbers, can get switched on to it.

The NHS Model for Improvement has a pedigree originating with Walter Shewhart in the 1920’s, then being famously applied by Deming and Juran after WWII. Deming in particular encapsulates the scientific method in his P-D-C-A model (three decades later he revised it to P-D-S-A in order to emphasize that the Check stage must not be short-changed) – his pragmatic way of enabling learning and improvement to evolve bottom-up in organisations.

After the 1980’s Dr Don Berwick, standing on these shoulders, then applied the same thinking to the world of healthcare – initially in his native America. Berwick’s approach is to encourage people to ask questions such as: What works? .. and How would we know? His method is founded upon a culture of evidence-based learning, providing a local context for systemic improvement efforts. A new organisational culture, one rooted in the science of improvement, if properly nurtured, may then emerge.

Yet, such a culture may initially jar with the everyday life of a conventional organisation, and the individuals within it. One of several reasons, according to Yuval Harari (21), is that for hundreds of generations our species has evolved such that imagined reality has been lorded over objective reality. Only relatively recently in our evolution has the advance of science been leveling up this imbalance, and in our papers we argue that a method is now needed that enables these two realities to more easily coexist.

We suggest that a method that enables data-rich evidence-based storytelling – by those who most know about the context and intend growing their collective knowledge – will provide the basis for an approach whereby the two realities may do just that.

In people’s working lives, a vital enabler is the 3-paradigm “Accountability/Improvement/Research” measurement model (AIRmm), reflecting the three archetypal ways in which people observe and measure things. It was created by healthcare professionals (23) to help their colleagues and policy-makers to unravel a commonly prevailing confusion, and to help people make better sense of the different approaches they may adopt when needing to evidence what they’re doing – depending on the specific purpose. An amended version of this model is already widely quoted inside the NHS, though this is not to imply that it is yet as widely understood or applied as it needs to be.

This 3-paradigm A-I-R measurement model underpins the way that science can be applied by, and has practical appeal for, the stretched healthcare professional, managerial leader, civil servant.

Indeed for anyone who intuitively suspects there has to be a better way to combine goals that currently feel disconnected or even in conflict: empowerment and accountability; safety and productivity; assurance and improvement; compliance and change; extrinsic and intrinsic motivation; evidence and action; facts and ideas; logic and values; etc.

Indeed for anyone who is searching for ways to unify their actions with the system-based implementation of those actions as systemic interventions. Though widely quoted in other guises, we are returning to the original model (23) because we feel it better connects to the primary aim of supporting healthcare professionals make best sense of their measurement options.

In particular the model makes it immediately plain that a way out of the apparent Research/Accountability dichotomy is readily available to anyone willing to “Learn, master and apply the modern methods of quality control, quality improvement and quality planning” – the recommendation made for all staff in the Berwick Report (3).

In many organisations, and not just in healthcare, the column 1 paradigm is the only game in town. Column 3 may feel attractive as a way-out, but it also feels inaccessible unless there is a graduate statistician on hand. Moreover, the mainstay of the Column 3 worldview: the Randomized Controlled Trial (RCT) can feel altogether overblown and lacking in immediacy. It can feel like reaching for a spanner and finding a lump hammer in your hand – as Berwick says “Fans of traditional research methods view RCTs as the gold standard, but RCTs do not work well in many healthcare contexts” (2).

Like us, Ben is frustrated by the ways that healthcare organisations conduct themselves – not just the drug companies that commercialize science and publish only the studies likely to enhance sales, but governments too who commonly implement politically expedient policies only to then have to subsequently invent evidence to support them.

Policy-based evidence rather than evidence-based policy.

Ben’s recommended Column 3-style T-L-A approach is often more likely to make day-to-day sense to people and teams on the ground if complemented by Column 2-style improvement science.

One reason why Improvement Science can sometimes fail to dent established cultures is that it gets corralled by organisational “experts” – some of whom then use what little knowledge they have gathered merely to make themselves indispensable, not realising the extent to which everyone else as a consequence gets dis-empowered.

In our papers we take the opportunity to outline the philosophical underpinnings, and to do this we have borrowed the 7-point framework from a recent paper by Perla et al (35) who suggest that Improvement Science:

1. Is grounded in testing and learning cycles – the aim is collective knowledge and understanding about cause & effect over time. Some scientific method is needed, together with a way to make the necessary inquiry a collaborative one. Shewhart realised this and so invented the concept “continual improvement”.

2. Embraces a combination of psychology and logic – systemic learning requires that we balance myth and received wisdom with logic and the conclusions we derive from rational inquiry. This balance is approximated by the Sensing-Intuiting continuum in the Jungian-based MBTI model (12) reminding us that constructing a valid story requires bandwidth.

3. Has a philosophical foundation of conceptualistic pragmatism (16) – it cannot be expected that two scientists when observing, experiencing, or experimenting will make the same theory-neutral observations about the same event – even if there is prior agreement about methods of inference and interpretation. The normative nature of reality therefore has to be accommodated. Whereas positivism ultimately reduces the relation between meaning and experience to a matter of logical form, pragmatism allows us to ground meaning in conceived experience.

4. Employs Shewhart’s “theory of cause systems” – Walter Shewhart created the Control Chart for tuning-in to systemic behaviour that would otherwise remain unnoticed. It is a diagnostic tool, but by flagging potential trouble also aids real time prognosis. It might have been called a “self-control chart” for he was especially interested in supporting people working in and on their system being more considered (less reactive) when taking action to enhance it without over-reacting – avoiding what Deming later referred to as “Tampering” (4).

5. Requires the use of Operational Definitions – Deming warned that some of the most important aspects of a system cannot be expressed numerically, and those that can require care because “there is no true value of anything measured or observed” (5). When it comes to metric selection therefore it is essential to understand the measurement process itself, as well as the “operational definition” that each metric depends upon – the aim being to reduce ambiguity to zero.

6. Considers the contexts of both justification and discovery – Science can be defined as a process of discovery – testing and learning cycles built upon observation, measurement and accumulating evidence or experience – shared for example via a Flow Chart or a Gantt chart in order to justify a belief in the truth of an assertion. To be worthy of the term “science” therefore, a method or procedure is needed that is characterised by collaborative inquiry.

7. Is informed by Systems Theory – Systems Theory is the study of systems, any system: as small as a quark or as large as the universe. It aims to uncover archetypal behaviours and the principles by which systems hang together – behaviours that can be applied across all disciplines and all fields of research. There are several types of systems thinking, but Jay Forrester’s “System Dynamics” has most pertinence to Improvement Science because of its focus on flows and relationships – recognising that the behaviour of the whole may not be explained by the behaviour of the parts.

In the papers, we say more about this philosophical framing, and we also refer to the four elements in Deming’s “System of Profound Knowledge”(5). We especially want to underscore that the overall aim of any scientific method we employ is contextualised knowledge – which is all the more powerful if continually generated in context-specific experimental cycles. Deming showed that good science requires a theory of knowledge based upon ever-better questions and hypotheses. We two aim now to develop methods for building knowledge-full narratives that can work well in healthcare settings.

We wholeheartedly agree with Ben that for the public sector – not just in healthcare – policy-making needs to become more evidence-based.

In a poignant blog (24), the Health Foundation’s (HF) Richard Taunt describes attending two conferences on the same day. At the first one, policymakers from 25 countries had assembled to discuss how national policy can best enhance the quality of health care. When collectively asked which policies they would retain and repeat, their list included: use of data, building quality improvement capability, ensuring senior management are aware of improvement approaches, and supporting and spreading innovations.

In a different part of London, UK health politicians happened also to be debating Health and Care in order to establish the policy areas they would focus on if forming the next government. This second discussion brought out a completely different set of areas: the role of competition, workforce numbers, funding, and devolution of commissioning. These two discussions were supposedly about the same topic, but a Venn diagram would have contained next to no overlap.

Clare Allcock, also from the HF, then blogged to comment that “in England, we may think we are fairly advanced in terms of policy levers, but (unlike, for example in Scotland or the USA) we don’t even have a strategy for implementing health system quality.” She points in particular to Denmark, which has recently announced that it is phasing out its hospital accreditation scheme in favour of an approach strongly focused on quality improvement methodology and person-centred care. The Danes are in effect taking the 3-paradigm model and creating space for Column 2: improvement thinking.

The UK needs to take a leaf out of their book, for without fundamentally changing the way the NHS (and the public sector as a whole) thinks about accountability, any attempt to make Column 2 the dominant paradigm is destined to be stillborn.

It is worth noting that in large part the AIRmm Column 2 paradigm was actually central to the 2012 White Paper’s values, and with it the subsequent Outcomes Framework consultation – both of which repeatedly used the phrase “bottom-up” to refer to how the new system of accountability would need to work, but somehow this seems to have become lost in legislative procedures that history will come to regard as having been overly ambitious. The need for a new paradigm of accountability however remains – and without it health workers and clinicians – and the managers who support them – will continue to view metrics more as something intrusive than as something that can support them in delivering enhancements in sustained outcomes. In our view the Stevens’ Five Year Forward View makes this new kind of accountability an imperative.

“Society, in general, and leaders and opinion formers, in particular, (including national and local media, national and local politicians of all parties, and commentators) have a crucial role to play in shaping a positive culture that, building on these strengths, can realise the full potential of the NHS.
When people find themselves working in a culture that avoids a predisposition to blame, eschews naïve or mechanistic targets, and appreciates the pressures that can accumulate under resource constraints, they can avoid the fear, opacity, and denial that will almost inevitably lead to harm.”
Berwick Report (3)

Changing cultures means changing our habits – it starts with us. It won’t be easy because people default to the familiar, to more of the same. Hospitals are easier to build than relationships; operations are easier to measure than knowledge, skills and confidence; and prescribing is easier than enabling. The two of us do not of course possess a monopoly on all possible solutions, but our experience tells us that now is the time for: evidence-rich storytelling by front line teams; by pharmaceutical development teams; by patients and carers conversing jointly with their physicians.

We know that measurement is not a magic bullet, but what frightens us is that the majority of people seem content to avoid it altogether. As Oliver Moody recently noted in The Times:

Call it innumeracy, magical thinking or intrinsic mental laziness, but even intelligent members of the public struggle, through no fault of their own, to deal with statistics and probability. This is a problem. People put inordinate amounts of trust in politicians, chief executives, football managers and pundits whose judgment is often little better than that of a psychic octopus. Short of making all schoolchildren study applied mathematics to A level, the only thing scientists can do about this is stick to their results and tell more persuasive stories about them.

Too often, Disraeli’s infamous words: “Lies, damned lies, and statistics” are used as the refuge of busy professionals looking for an excuse to avoid numbers.

If Improvement Science is to become a shared language, Berwick’s recommendation that all NHS staff “Learn, master and apply the modern methods of quality control, quality improvement and quality planning” has to be taken seriously.

As a first step we recommend enabling teams to access good data in as near to real time as possible, data that indicates the impact that their intervention is having – this alone can prompt a dramatic shift in the type of conversation that people working in and on their system may have. Often this can be initiated simply by converting existing KPI data into System Behaviour Chart form which, using a tool like BaseLine®, takes only a few mouse clicks.
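To make the idea of converting KPI data into chart form more concrete, here is a minimal sketch in Python. It assumes that an XmR (individuals and moving range) chart is an acceptable stand-in for a System Behaviour Chart; it does not reproduce how BaseLine® or any other tool works internally. The function names and the weekly KPI values are hypothetical and made up for illustration.

```python
from statistics import mean


def xmr_limits(values):
    """Return (centre, lower, upper) natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two data points")
    centre = mean(values)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is the conventional XmR scaling constant (3 / 1.128).
    spread = 2.66 * mean(moving_ranges)
    return centre, centre - spread, centre + spread


def signals(values):
    """Return (index, value) pairs that fall outside the process limits."""
    _, lower, upper = xmr_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical weekly KPI values, invented for this sketch.
weekly_kpi = [52, 48, 50, 47, 53, 49, 51, 50, 48, 52, 75, 50]

centre, lower, upper = xmr_limits(weekly_kpi)
print(f"centre={centre:.1f} lower={lower:.1f} upper={upper:.1f}")
print("points worth investigating:", signals(weekly_kpi))
```

With this made-up series only the value 75 falls outside the limits. The other week-to-week wobbles are treated as noise, which is the kind of conversation shift described above: investigate the signal and resist reacting to every fluctuation.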

In our longer paper we offer three examples of Improvement Science in action – combining to illustrate how data may be used both to evidence sustained systemic enhancement and to generate engagement by the people most directly connected to what is systemically occurring in real time.

1. A surgical team using existing knowledge established by column 3-type research as a platform for column 2-type analytic study – to radically reduce post-operative surgical site infection (SSI).

2. 25 GP practices are required to collect data via the Friends & Family Test (FFT) and decide to experiment with being more than merely compliant. In two practices they collectively pilot a system run by their PPG (patient participation group) to study the FFT score – patient by patient – as they arrive each day. They use IS principles to separate signal from noise in a way that prompts the most useful response to the feedback in near to real time. Separately they summarise all the comments as a whole and feed their analysis into the bi-monthly PPG meeting. The aim is to address both “special cause” feedback and “common cause” feedback in a way that, in what most feel is an over-loaded system, can prompt sensibly prioritised improvement activity.

3. A patient is diagnosed with NAFLD and receives advice from their doctor to get more exercise e.g. by walking more. The patient uses the principles of IS to monitor what happens – using the data not just to show how they are complying with their doctor’s advice, but to understand what drives their personal mind/body system. The patient hopes that this knowledge can lead them to better decision-making and sustained motivation.

The landscape of NHS improvement and innovation support is fragmented, cluttered, and currently pretty confusing. Since May 2013 Academic Health Science Networks (AHSNs) funded by NHS England (NHSE) have been created with the aim of bringing together health services, and academic and industry members. Their stated purpose is to improve patient outcomes and generate economic benefits for the UK by promoting and encouraging the adoption of innovation in healthcare. They have a 5 year remit and have spent the first 2 years establishing their structures and recruiting; it is not yet clear if they will be able to deliver what’s really needed.

Patient Safety Collaboratives linked with AHSN areas have also been established to improve the safety of patients and ensure continual patient safety learning. The programme, coordinated by NHSE and NHSIQ will provide safety improvements across a range of healthcare settings by tackling the leading causes of avoidable harm to patients. The intention is to empower local patients and healthcare staff to work together to identify safety priorities and develop solutions – implemented and tested within local healthcare organisations, then later shared nationally.

We hope our papers will significantly influence the discussions about how improvement and innovation can assist with these initiatives. In the shorter paper, to echo Deming, we even include our own 14 points for how healthcare organisations need to evolve. We will know that we have succeeded if the papers are widely read; if we enlist activists like Ben to the definition of science embodied by Improvement Science; and if we see a tidal wave of improvement science methods being applied across the NHS.

As patient volunteers, we each intend to find ways of contributing in any way that appears genuinely helpful. It is our hope that Improvement Science enables the cultural transformation we have envisioned in our papers and with our case studies. This is what we feel most equipped to help with. When in your sixties it is easy to feel that time is short, but maybe people of every age should feel this way? In the words of Francis Bacon, the father of the scientific method.


13/06/2015 | 07/09/2024

Excellent or Mediocre?

Many organisations proclaim that their vision, goal, objective and mission is to achieve excellence but then proceed to deliver mediocre performance.

Why is this?

It is certainly not from lack of purpose, passion or people.

So, the flaw must lie somewhere in the process.

The clue lies in how we measure performance … and to see the collective mindset behind the design of the performance measurement system we just need to examine the key performance indicators or KPIs.

Do they measure failure or success?

Let us look at some from the NHS …. hospital mortality, hospital acquired infections, never events, 4-hour A&E breaches, cancer wait breaches, 18 week breaches, and so on.

In every case the metric reported is a failure metric. Not a success metric.

And the focus of action is all about getting away from failure.

Damage mitigation, damage limitation and damage compensation.

So, we have the answer to our question: our performance metrics show that we believe we are doing a good job when we are not failing.

But are we?

When we are not failing we are not doing a bad job … is that the same as doing a good job?

Q: Does excellent = not excrement?

A: No. There is something between the extremes of excrement and excellent.

And the succeed-or-fail dichotomy is a distorting simplification created by applying an arbitrary threshold to a continuous measure of performance.
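A small sketch may help to show how a threshold can hide what a continuous measure reveals. It borrows the 4-hour A&E target mentioned above; the two departments and their waiting times are hypothetical, invented purely for illustration.

```python
from statistics import median


def breach_rate(waits_hours, target_hours=4.0):
    """Fraction of waits that exceed the target (the failure metric)."""
    breaches = sum(1 for w in waits_hours if w > target_hours)
    return breaches / len(waits_hours)


# Hypothetical waits in hours for two departments (made-up data).
dept_a = [0.5, 1.0, 1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 4.5, 5.0]
dept_b = [3.5, 3.6, 3.7, 3.8, 3.9, 3.9, 3.9, 3.9, 9.0, 12.0]

for name, waits in (("A", dept_a), ("B", dept_b)):
    print(
        f"Dept {name}: breach rate={breach_rate(waits):.0%}, "
        f"median wait={median(waits):.1f}h, longest={max(waits):.1f}h"
    )
```

Both departments report the same 20% breach rate, yet the typical wait in the first is about half that in the second, and the longest waits differ greatly. The pass-or-fail figure alone cannot distinguish them.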

So, how, specifically, have we designed our current system to avoid failure?

Usually by imposing an arbitrary target connected to a punitive reaction to failure. Performance management by fear.

This tactic generates an expected and predictable punishment-avoidance and back-covering behaviour which is manifest as a lot of repeated checking and correcting of the inevitable errors that we find. A lot of extra work that requires extra time and that requires extra money.

So, while an arbitrary-target-driven-check-and-correct design may avoid failing on safety, the additional cost may cause us to then fail on financial viability.

Out of the frying pan and into the fire.

No wonder Governance and Finance come into conflict!

And if we do manage to pull off an uneasy compromise … then what level of quality are we achieving?

Studies show that if we take a random sample of 100 people from the pool of ‘disappointed by their experience’ and we ask if they are prepared to complain then only 5% will do so.

So, if we use complaints as our improvement feedback loop and we react to that and make changes that eliminate these complaints then what do we get? Excellence?

Nope.

We get what we designed … just good enough to avoid the 5% of complaints but not the 95% of disappointment.

We get mediocrity.
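The arithmetic behind the 5% figure can be sketched as follows. The complaint count is hypothetical, and the function name is ours; the only number taken from the text is the 5% complaint rate.

```python
def estimate_disappointed(complaints, complaint_rate=0.05):
    """Estimate how many people were disappointed, given how many complained.

    Assumes, as in the text, that only a fixed fraction of disappointed
    people go on to complain.
    """
    if not 0 < complaint_rate <= 1:
        raise ValueError("complaint_rate must be in (0, 1]")
    return round(complaints / complaint_rate)


complaints = 20  # hypothetical number of complaints received
disappointed = estimate_disappointed(complaints)
silent = disappointed - complaints

print(f"{complaints} complaints imply about {disappointed} disappointed people,")
print(f"of whom about {silent} said nothing.")
```

Under these assumptions, fixing the causes of the 20 complaints leaves about 380 disappointed people unheard, which is why a complaints-only feedback loop settles at mediocrity.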

And what do we do then?

We start measuring ‘customer satisfaction’ … which is actually asking the question ‘did your experience meet your expectation?’

And if we find that satisfaction scores are disappointingly low then how do we improve them?

We have two choices: improve the experience or reduce the expectation.

But as we are very busy doing the necessary checking-and-correcting then our path of least resistance to greater satisfaction is … to lower expectations.

And we do that by donning the black hat of the pessimist and laying out the risks and dangers.

And by doing that we generate anxiety and fear. Which was not the intended outcome.

Our mission statement proclaims ‘trusted to achieve excellence’ rather than ‘designed to deliver mediocrity’.

But mediocrity is what the evidence says we are delivering. Just good enough to avoid a smack.

And if we are honest with ourselves then we are forced to conclude that:

A design that uses failure metrics as the primary feedback loop can achieve no better than mediocrity.

So, if we choose to achieve excellence then we need a better feedback design.

We need a design that uses success metrics as the primary feedback loop and we use failure metrics only in safety critical contexts.

And the ideal people to specify the success metrics are those who feel the benefit directly and immediately … the patients who receive care and the staff who give it.

Ask a patient what they want and they do not say “To be treated in less than 18 weeks”. In fact I have yet to meet a patient who has even heard of the 18-week target!

A patient will say ‘I want to know what is wrong, what can be done, when it can be done, who will do it, what do I need to do, and what can I expect to be the outcome’.

Do we measure any of that?

Do we measure accuracy of diagnosis? Do we measure use of best evidenced practice? Do we know the possible delivery time (not the actual)? Do we inform patients of what they can expect to happen? Do we know what they can expect to happen? Do we measure the experience and outcome for every patient? Good and not so good? Do we feed that information back continuously and learn from it?

Nope.

So …. if we choose and commit to delivering excellence then we will need to start measuring-4-success and feeding what we see back to those who deliver the care.

Warts and all.

We want to know when we are doing a good job, and we need to know where to focus further improvement effort.

And if we abdicate that commitment and choose to deliver mediocrity-by-default then we are the engineers of our own chaos and despair.

We have the choice.

We are the agents of our own destiny.

06/06/2015

Bitten by the ISP bug

There is a condition called SFQPosis which is an infection that is transmitted by a vector called an ISP.

The primary symptom of SFQPosis is sudden clarity of vision and a new understanding of how safety, flow, quality and productivity improvements can happen at the same time …

… when they are seen as partners on the same journey.

There are two sorts of ISP … Solitary and Social.

Solitary ISPs infect one person at a time … often without them knowing. And there is often a long lag time between the infection and the appearance of symptoms. Sometimes years – and often triggered by an apparently unconnected event.

In contrast the Social ISPs will tend to congregate together and spend their time foraging for improvement pollen and nectar and bringing it back to their ‘hive’ to convert into delicious ‘improvement honey’ which once tasted is never forgotten.

It appears that Jeremy Hunt, the Secretary of State for Health, has recently been bitten by an ISP and is now exhibiting the classic symptoms of SFQPosis.

Here is the video of Jeremy describing his symptoms at the recent NHS Confederation Conference. The talk starts at about 4 minutes.

His account suggests that he was bitten while visiting the Virginia Mason Hospital in the USA and on return home then discovered some Improvement hives in the UK … and some of the Solitary ISPs that live in England.

Warwick and Sheffield NHS Trusts are buzzing with ISPs … and the original ISP that infected them was one Kate Silvester.

The repeated message in Jeremy’s speech is that improved safety, quality and productivity can happen at the same time and are within our gift to change – and the essence of achieving that is to focus on flow.

The sequence is safety first (eliminate the causes of avoidable harm), then flow second (eliminate the causes of avoidable chaos), then quality (measure both expectation and experience) and then productivity will soar.

And everyone will benefit.

This is not a zero-sum win-lose game.

So listen for the buzz of the ISPs …. follow it and ask them to show you how … ask them to inoculate you with SFQPosis.

And here is a recent video of Dr Steve Allder, a consultant neurologist and another ISP that Kate infected with SFQPosis a few years ago. Steve is describing his own experience of learning how to do Improvement-by-Design.

29/03/2015

Over-Egged Expectation

Resistance-to-change is an oft quoted excuse for improvement torpor. The implied sub-message is more like “We would love to change but They are resisting“.

Notice the Us-and-Them language. This is the observable evidence of a “We’re OK and They’re Not OK” belief. And in reality it is this unstated belief and the resulting self-justifying behaviour that is an effective barrier to systemic improvement.

This Us-and-Them language generates cultural friction, erodes trust and erects silos that are effective barriers to the flow of information, of innovation and of learning. And the inevitable reactive solutions to this Us-versus-Them friction create self-amplifying positive feedback loops that ensure the counter-productive behaviour is sustained.

One tangible manifestation is DRATs: Delusional Ratios and Arbitrary Targets.

So when a plausible, rational and well-evidenced candidate for an alternative approach is discovered then it is a reasonable reaction to grab it and to desperately spray the ‘magic pixie dust’ at everything.

This is a recipe for disappointment: because there is no such thing as ‘improvement magic pixie dust’.

The more uncomfortable reality is that the ‘magic’ is the result of a long period of investment in learning and the associated hard work in practising and polishing the techniques and tools.

It may look like magic but it isn’t. That is an illusion.

And some self-styled ‘magicians’ choose to keep their hard-won skills secret … because by sharing them they know that they will lose their ‘magic powers’ in a flash of ‘blindingly obvious in hindsight’.

And so the chronic cycle of despair-hope-anger-and-disappointment continues.

System-wide improvement in safety, flow, quality and productivity requires that the benefits of synergism overcome the benefits of antagonism. This requires two changes to the current hope-and-despair paradigm. Both are necessary and neither is sufficient alone.

1) The ‘wizards’ (i.e. magic folk) share their secrets.
2) The ‘muggles’ (i.e. non-magic folk) invest the time and effort in learning ‘how-to-do-it’.

The transition to this awareness is uncomfortable so it needs to be managed pro-actively … by being open about the risk … and how to mitigate it.

That is what an experienced Practitioner of Improvement Science (an ISP) will do. Be open about the challenges ahead.

And those who desperately want the significant and sustained SFQP improvements; and an end to the chronic chaos; and an end to the gaming; and an end to the hope-and-despair cycle …. just need to choose. Choose to invest and learn the ‘how to’ and be part of the future … or choose to be part of the past.

Improvement science is simple … but it is not intuitively obvious … and so it is not easy to learn.

If it were, we would all be doing it.

And it is the behaviour of a wise leader of change to set realistic and mature expectations of the challenges that come with a transition to system-wide improvement.

That is demonstrating the OK-OK behaviour needed for synergy to grow.

07/03/2015

The Improvement Gearbox

One of the most rewarding experiences for an improvement science coach is to sense when an individual or team shifts up a gear and starts to accelerate up their learning curve.

It is like there is a mental gearbox hidden inside them somewhere. Before they were thrashing themselves by trying to go too fast in a low gear. Noisy, ineffective, inefficient and at high risk of blowing a gasket!

Then, they discover that there is a higher gear … and that to get to it they have to take a risk … depress the emotional clutch, ease back on the gas, slip into neutral, and trust themselves to find the new groove and … click … into the higher gear, and then ease up the power while letting out the clutch. And then accelerate up the learning curve. More effective, more efficient. More productive. More fun.

Organisations appear to behave in much the same way.

Some scream along in the slow-lane … thrashing their employee engine. The majority chug complacently in the middle-lane of mediocrity. A few accelerate past in the fast-lane to excellence.

And they are all driving exactly the same model of car.

So it is not the car that is making the difference … it is the driving.

Those who have studied organisations have observed five cultural “gears”; and which gear an organisation is in most of the time can be diagnosed by listening to the sound of the engine – the conversations of the employees.

If they are muttering “work sucks” then they are in first gear. The sense of hopelessness, futility, despair and anger consumes all their emotional fuel. Fortunately this is uncommon.

If we mainly hear “my work sucks” then they are in second gear. The feeling is of helplessness and apathy and the behaviour is Victim-like. They believe that they cannot solve their own problems … someone else must do it for them or tell them what to do. They grumble a lot.

If the dominant voice is “I’m great but you lot suck” then we are hearing third gear attitudes. The selfishly competitive behaviour of the individualist achiever. The “keep your cards close to your chest” style of dyadic leadership. The advocate of “it is OK to screw others to get ahead”. They grumble a lot too – about the apathetic bunch.

And those who have studied organisations suggest that about 80% of healthcare organisations are stuck in first, second or third cultural gear. And we can tell who they are … the lower 80% of the league tables. The ones clamouring for more … of everything.

So how come so many organisations are so stuck? Unable to find fourth gear?

One cause is the design of their feedback loops. Their learning loops.

If an organisation only uses failure as a feedback loop then it is destined to get no more than mediocrity. Third gear at best, and usually only second.

Example.
We all feel disappointment when our experience does not live up to our expectation. But only the most angry of us will actually do something and complain. Especially when we have no other choice of provider!

Suppose we are commissioners of healthcare services and we are seeing a rising tide of patient and staff complaints. We want to improve the safety and quality of the services that we are paying for; so we draw up a league table using complaints as feedback fodder and we focus on the worst performing providers … threatening them with dire consequences for being in the bottom 20%. What happens? Fear of failure motivates them to ‘pull up their socks’ and the number of complaints falls.

Job done?

Unfortunately not.

All we have done is to bully those stuck in first or second gear into thrashing their over-burdened employee engine even harder. We have not helped anyone find their higher gear. We have hit the target, missed the point, and increased the risk of system failure!

So what about those organisations stuck in third gear?

Well they are ticking their performance boxes, meeting our targets, keeping their noses clean. Some are just below, and some just above the collective mean of barely acceptable mediocrity.

But expectation is changing.

The 20% who have discovered fourth gear are accelerating ahead and are demonstrating what is possible. And they are raising expectation, increasing the variation of service quality … for the better.

And the other 80% are falling further and further behind; thrashing their tired and demoralised staff harder and harder to keep up. Complaining increasingly that life is unfair and that they need more time, money and staff engagement. Eventually their executive head gaskets go “pop” and they fall by the wayside.

Finding cultural fourth gear is possible but it is not easy. There are no short cuts. We have to work our way up the gears and we have to learn when and how to make smooth transitions from first to second, second to third and then third to fourth.

And when we do that the loudest voice we hear is “We are OK“.

We need to learn how to do a smooth cultural hill start on the steep slope from apathy to excellence.

And we need to constantly listen to the sound of our improvement engine; to learn to understand what it is saying; and learn how and when to change to the next cultural gear.

28/02/2015

Circles

For a system to be both effective and efficient the parts need to work in synergy. This requires both alignment and collaboration.

Systems that involve people and processes can exhibit complex behaviour. The rules of engagement also change as individuals learn and evolve their beliefs and their behaviours.

The values and the vision should be more fixed. If the goalposts are obscure or oscillate then confusion and chaos are inevitable.

So why is collaborative alignment so difficult to achieve?

One factor has been mentioned. Lack of a common vision and a constant purpose.

Another factor is distrust of others. Our fear of exploitation, bullying, blame, and ridicule.

Distrust is a learned behaviour. Our natural inclination is trust. We have to learn distrust. We do this by copying trust-eroding behaviours that are displayed by our role models. So when leaders display these behaviours then we assume it is OK to behave that way too. And we dutifully emulate.

The most common trust-eroding behaviour is called discounting. It is a passive-aggressive habit characterised by repeated acts of omission, such as not replying to emails, not sharing information, not offering constructive feedback, not asking for other perspectives, and not challenging disrespectful behaviour.

There are many causal factors that lead to distrust … so there is no one-size-fits-all solution to dissolving it.

One factor is ineptitude.

This is the unwillingness to learn and to use available knowledge for improvement.

It is one of the many manifestations of incompetence. And it is an error of omission.

Whenever we are unable to solve a problem then we must always consider the possibility that we are inept. We do not tend to do that. Instead we prefer to jump to the conclusion that there is no solution or that the solution requires someone else doing something different. Not us.

The impossibility hypothesis is easy to disprove. If anyone has solved the problem, or a very similar one, and if they can provide evidence of what and how then the problem cannot be impossible to solve.

The someone-else’s-fault hypothesis is trickier because proving it requires us to influence others effectively. And that is not easy. So we tend to resort to easier but less effective methods … manipulation, blame, bullying and so on.

A useful way to view this dynamic is as a set of four concentric circles – with us at the centre.

The outermost circle is called the ‘Circle of Ignorance’. The collection of all the things that we do not know we do not know.

Just inside that is the ‘Circle of Concern’. These are things we know about but feel completely powerless to change. Such as the fact that the world turns and the sun rises and sets with predictable regularity.

Inside that is the ‘Circle of Influence’ and it is a broad and continuous band – the further away the less influence we have; the nearer in the more we can do. This is the zone where most of the conflict and chaos arises.

The innermost is the ‘Circle of Control’. This is where we can make changes if we so choose to. And this is where change starts and from where it spreads.

So if we want system-level improvements in safety, flow, quality and productivity (or cost) then we need to align these four circles. Or rather the gaps in them.

We start with the gaps in our circle of control. The things that we believe we cannot do … but when we try … we discover that we can (and always could).

With this new foundation of conscious competence we can start to build new relationships, develop trust and to better influence others in a win-win-win conversation.

And then we can collaborate to address our common concerns – the ones that require coherent effort. We can agree and achieve our common purpose, vision and goals.

And from there we will be able to explore the unknown opportunities that lie beyond. The ones we cannot see yet.

31/01/2015 | 31/03/2023

The Nanny McPhee Coaching Contract

There comes a point in every improvement-by-design journey when it is time for the improvement guide to leave.

An experienced improvement coach knows when that time has arrived and the expected departure is in the contract.

The Nanny McPhee Coaching Contract:

“When you need me but do not want me then I have to stay. And when you want me but do not need me then I have to leave.”

The science of improvement can appear like ‘magic’ at first because seemingly impossible simultaneous win-win-win benefits are seen to happen with minimal effort.

It is not magic. It requires years of training and practice to become a ‘magician’. So those who have invested in learning the know-how are just catalysts. When their catalysts-of-change work is done then they must leave to do it elsewhere.

The key to managing this transition is to set this expectation clearly and right at the start; so it does not come as a surprise. And to offer reminders along the way.

And it is important to follow through … when the time is right.

It is not always easy though.

There are three commonly encountered situations that will tempt the guide.

1) When things are going very badly because the coaching contract is being breached; usually by old, habitual, trust-eroding, error-of-omission behaviours such as: not communicating, not sharing learning, and not delivering on commitments. The coach, fearing loss of reputation and face, is tempted to stay longer and to try harder. Often getting angry and frustrated in the process. This is an error of judgement. If the coaching contract is being persistently breached then the Exit Clause should be activated clearly and cleanly.

2) When things are going OK, it is easy to become complacent and the temptation then is to depart too soon, only to hear later that the solo-flyers “crashed and burned”, because they were not quite ready and could not (or would not) see it. This is the “need but do not want” part of the Nanny McPhee Coaching Contract. One role of the coach is to respectfully challenge the assertion that ‘We can do it ourselves’ … by saying ‘OK, please demonstrate’.

3) When things are going very well it is tempting to blow the Trumpet of Success too early, attracting the attention of others who will want to take short cuts, to bypass the effort of learning for themselves, and to jump onto someone else’s improvement bus. The danger here is that they bring their counter-productive, behavioural baggage with them. This can cause the improvement bus to veer off course on the twists and turns of the Nerve Curve; or grind to a halt on the steeper parts of the learning curve.

An experienced improvement coach will respectfully challenge the individuals and the teams to help them develop their experience, competence and confidence. And just as they start to become too comfortable with having someone to defer to for all decisions, the coach will announce their departure and depart as announced.

This is the “want but do not need” part of the Nanny McPhee Coaching Contract.

And experience teaches us that this mutually respectful behaviour works better.

28/12/2014

Guess-work or Grunt-work?

Improvement flows from change. Change flows from action. Action flows from decision.

And we can make a decision in one of two ways – we can use guess-work or we can use grunt-work.

Of course it does not feel as black and white as that so let us put those two options at the opposite ends of a spectrum. Pure guess-work at one end and pure grunt-work at the other.

Guess-work is the easier end. To guess we just need a random number generator of some sort – like a dice. Grunt-work is the harder end. And what exactly is “grunt-work”?

Using available knowledge to work out a decision that will get us to our intended outcome is grunt-work. It does not require creativity, imagination, assumptions, beliefs, judgements and all the usual machinery that we humans employ to make decisions. It just requires following the tried-and-tested recipe and doing the grunt-work. A computer does grunt-work. It just follows the recipe we give it.

But experience shows that even with hard work we do not always get the outcome we intend. So what is going wrong?

When the required knowledge is available and we do not use it we are exhibiting ineptitude. So in that context then we have a clear path of improvement: We invest first in dissolving our own ineptitude. We invest in learning what is already known. And that is grunt-work. Hard work.

When the required knowledge is not available then we are exhibiting ignorance. And our ignorance is exposed in two ways: firstly when we cannot make a decision of what to do because we have no option other than to guess. And secondly when what we predicted would happen as a result of our action did not actually happen. Reality disproved our rhetoric.

When we are ignorant we have a different path of improvement – first we need to do research to improve our knowledge and understanding, and only then are we able to apply the new knowledge to make reliable predictions. We need tested and trusted knowledge to design a path to our intended outcome.

And as Richard Feynman perceptively observed … research starts with an educated guess. We might call it an hypothesis but it is a guess nevertheless. From that we make predictions and then we do experiments using reality to test our rhetoric. All guesses that fail the reality-check are rejected. So our vast body of scientific knowledge is the accumulation of guesses that did not fail the reality-check.

The critical word in the paragraph above is “educated”. How do researchers make educated guesses?

What does the word “educated” imply?

School is all about learning what is already “known”. There is no debate. The teachers are always right, only the students can be wrong. It is assumed.

But most of our learning comes from what we experience before and after school. We are all enrolled in the University of Life – and the teacher there is reality, not rhetoric.

And when we are tested by reality we are very often found to be lacking something. Well actually we are always found to be lacking. Sometimes we flunk the test outright and have to go back to the bottom of the learning ladder. Sometimes we scrape a bare pass … we survive … but we know we came close to failing. Sometimes we secure a safe pass … and still we know we could have done better. We can always do better.

But how? Is it because we were ignorant? Or was it because we were inept?

Examinations at The University of Rhetoric are designed to measure our ineptitude.

The University of Life is not so didactic or autocratic. The challenges it presents come from anywhere in the Ignorance-Ineptitude Zone. We need educated guesswork to survive there.

So one problem we face is how do we differentiate ignorance from ineptitude?

At this point it is important to separate individual ignorance from collective ignorance; and individual ineptitude from collective ineptitude. There are two dimensions at play.

The history of science is characterised by individuals who first resolved their individual ignorance when they discovered something new. Only later was it appreciated that they were the first. So long as that discovery is shared then collective ignorance is reduced too. There is no need for everyone to rediscover everything when we share our learning.

Newton’s “discovery” of the Laws of Motion is a good example of an individual discovery quickly becoming collective knowledge. And with that collective knowledge we have proved we are able to land a spaceship on a far distant comet! That is grunt-work.

Einstein’s “discovery” of Relativity did not disprove Newton’s Laws of Motion, it re-framed and re-fined them so that even more profound predictions could be made. Some of the predictions are only now being tested as our technology has evolved to be able to perform the measurements with sufficient precision and accuracy. That is grunt-work. And it is increasingly collective grunt-work.

We are all born individually ignorant and individually inept.

Through experience and education we become aware of collective knowledge and with that we develop our individual capabilities. We do not re-invent every wheel.

And with that individual capability we are able to survive. We can secure a “pass” in the University of Life Survival Challenge.

But it leaves a lot of room for improvement.

Continuing to build collective knowledge through scientific research into more and more complicated and complex challenges, such as climate change, is necessary. But it is not sufficient. We need more.

Developing our collective capability to put that knowledge to the service of every living thing on the Earth is our challenge. And that is not grunt-work because we do not have a recipe to follow. We have to discover how to do that.

And that journey of discovery is called Improvement Science.

20/12/2014

People first or Process first?

A recurring theme this week has been the interplay between the cultural and the technical dimensions of system improvement.

The hearts and the minds. The people and the process. The psychology and the physics.

Reflecting on the many conversations what became clear was that both are required but not always in the same amount and in the same sequence.

The context is critical.

In some cases we can start with some technical stuff. Some flow physics and a Gantt and Run chart or two.

In other cases we have to start with some cultural stuff. Some conversations about values, beliefs and behaviours.

And they are both tricky but in different ways.

The technical stuff is counter-intuitive. We have to engage our logical, rational thinking brains and work it through step-by-step, making every assumption explicit and every definition clear.

If we go with our gut we get it wrong (although we feel it is right) and then we fail, and then we blame others or ourselves. Either way we lose confidence. The logical thinking is hard work. It makes our heads ache. So we cut corners.

But once we have understood then it gets much easier because we can then translate our hard won understanding into a trusted heuristic. We do not need to work it out every time. We can just look up the correct recipe.

And there lurks a trap … the problem that was at first unrecognised, then impossible, then difficult, and then doable … becomes easy and even obvious … but only after we have worked out a solution. And that obvious-in-hindsight effect is a source of many dangers …

… we can become complacent, over-confident, and even dismissive of others who have not been through the ‘pain’ of learning. We may be tempted to elevate our status and to inflate our importance by hoarding our hard-won understanding. We risk losing our humility … and when we do that we stop being curious and we stop learning. And then we are part of the problem again.

So to avoid those traps we need to hold ourselves in the role of the teacher and coach. We need to actively share what we have learned and explain how we came to know it. One step at a time … the blood, the sweat and the tears … the confusion and eureka moments. Not one giant leap from where we started to where we got to. And when we have the generosity to share our knowledge … it is surprising how much we learn! We learn more from teaching than by being taught.

The cultural stuff is counter-intuitive too. We have to engage our emotional, irrational, feeling brains and step back from the objective fine-print to look at the subjective full-picture. We have to become curious. We have to look at the problem from as many perspectives as we can. We have to practice humble inquiry by asking others what they see.

If we go with our gut and rely only on our learned and habitual beliefs, our untested assumptions and our prejudices … we get it wrong. When we filter reality to match our rhetoric, we leap to invalid conclusions, and we make unwise decisions, and they lead to counter-productive actions.

Our language and behaviour gives the game away … we cannot help it … because all this is happening unconsciously and out of our awareness.

So we need to solicit unfiltered feedback from trusted others who will describe what they see. And that is tough to do.

So how do we know where to act first? Cultural or technical?

The conclusion I have come to is to use a check-list … the Safe System Improvement check-list so to speak.

Check cultural first – Is there a need to do some people stuff? If so then do it.

Check technical second – Is there a need to do some process stuff? If so then do it.

If neither is needed then we need to get out of the way and let the people redesign the processes. Only they can.

29/11/2014

Counter-Productivity

The Webex icon bounced up and down on Bob’s task bar signalling that Leslie had just joined the weekly ISP coaching session.

<Leslie> Hi Bob. I have been so busy this week that I have not had time to consider a topic to explore.

<Bob> No problem Leslie, I have a shelf full of topics we have not touched yet. So shall we talk about counter-productivity?

<Leslie> Don’t you mean productivity … the fourth dimension of system improvement?

<Bob> They are related of course but we will approach the issue of productivity from a different angle. Rather like we did with safety. To improve safety we looked at the causes of un-safety and focussed our efforts there.

<Leslie> Ah yes, I see. So to improve productivity we look at the causes of un-productivity … in other words counter-productive beliefs and behaviours that are manifest as system design flaws.

<Bob> Exactly. So remind me what the definition of a productivity metric is from your FISH course.

<Leslie> Productivity is the ratio of a stream metric and a stage metric. Value-for-Money for example.

<Bob> Good. So counter-productivity is also a ratio of a stream and a stage metric.

<Leslie> Um, I’m not sure I quite get that. Can you explain a bit more?

<Bob> OK. To explore deeper we need to be clear about how each metric relates to our intended outcome. Remember in safety-by-design we count the number and severity of risks and harm because as harm is going up then safety is going down. So harm is an un-safety stream metric.

<Leslie> Ah! Yes I see. So if we look at cycle-time, which is a stage metric; as cycle-time increases, the activity falls and productivity falls. So cycle-time is actually a counter-productivity metric.

<Bob> Excellent. You are getting the hang of the concept of counter-productivity.

<Leslie> And we need to be careful because productivity is a ratio so the numerator and denominator metrics work in opposite ways: increasing the magnitude of the numerator is equivalent to decreasing the magnitude of the denominator – the ratio increases.

<Bob> Indeed, there are many hazards with ratios as we have explored before. So let us consider a real and rather useful example. Let us look at Little’s Law from the perspective of counter-productivity. Remind me of the definition of Little’s Law for a single step system.

<Leslie> Little’s Law is a mathematically proven law of flow physics which states that the average lead-time is the product of the average work-in-progress and the average cycle-time.

LT = WIP * CT

<Bob> Good and I am pleased to see that you have used cycle-time. We are considering a single stream, single stage, single step system.

<Leslie> Yes, I avoided using the unqualified term ‘activity’. I have learned that lesson the hard way too!

<Bob> So how do the terms in Little’s Law relate to streams, stages and systems?

<Leslie> Lead-time is a stream metric, cycle-time is a stage metric and work-in-progress is a …. h’mm. What is it? A stream metric or a stage metric?

<Bob> Or?

<Leslie> A system metric? WIP is a system metric!

<Bob> Good. So now re-arrange Little’s Law as a productivity formula.

<Leslie> Work-in-Progress equals lead-time divided by cycle-time

WIP = LT / CT
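The two forms of Little’s Law that Leslie has just stated can be checked with a few lines of arithmetic. This is only a minimal sketch: the function names and the numbers (12 jobs in progress, one job completed every 5 minutes) are made up for illustration and do not come from the coaching session.

```python
def lead_time(wip: float, cycle_time: float) -> float:
    """Little's Law for a single-step system: LT = WIP * CT."""
    return wip * cycle_time


def work_in_progress(lead_time_value: float, cycle_time: float) -> float:
    """The re-arranged form: WIP = LT / CT."""
    if cycle_time <= 0:
        raise ValueError("cycle_time must be positive")
    return lead_time_value / cycle_time


# Hypothetical numbers: 12 jobs in progress, one job completed every 5 minutes.
lt = lead_time(wip=12, cycle_time=5)       # average lead-time in minutes
wip = work_in_progress(lt, cycle_time=5)   # recovers the WIP we started with
print(lt, wip)
```

Both functions describe averages for a single stream, single stage, single step system, which is the case Bob and Leslie are discussing.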

<Bob> So is WIP a productivity or a counter-productivity metric?

<Leslie> H’mmm …. I will need to work this through logically and step-by-step. I do not trust my intuition on this flow stuff.

Increasing cycle-time is counter-productive because it implies activity is falling while costs are not.

But cycle-time is on the bottom of the ratio so its effect reverses.

So if lead-time stays the same and cycle-time increases then because it is on the bottom of the ratio that implies a more productive design. And at the same time work in progress must be falling. Urrgh! This is hurting my head.

<Bob> Good, keep going … you are nearly there.

<Leslie> So a falling WIP is a sign of increasing productivity.

<Bob> Good … and that implies?

<Leslie> WIP is a counter-productivity system metric!

<Bob> Well done. Your logic is flawless.

<Leslie> So that is why we focus on WIP so much! Whatever causes WIP to increase is counter-productive!

Ahhhh …. that makes complete sense.

Lo-WIP designs are more productive than Hi-WIP designs.

<Bob> Bravo! And translating this into financial metrics … it is because a big queue of waiting work incurs costs. Storage cost, maintenance cost, processing cost and so on. So WIP is a liability. It is not an asset!

<Leslie> But doesn’t that imply treating work-in-progress as an asset on the financial balance sheet is counter-productive?

<Bob> It does indeed.

<Leslie> Oh dear! That revelation is going to upset a lot of people in the accounting department!

<Bob> The painful reality is that the Laws of Flow Physics are completely indifferent to what any of us believe or do not believe.

<Leslie> Wow! I like this concept of counter-productivity … it really helps to expose some of our invalid assumptions that invisibly block improvement!

<Bob> So here is a question to ponder. Is zero WIP desirable or even possible?

<Leslie> H’mmm. I will have to think about that. I know you would not have asked the question for no reason.
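Leslie’s conclusion that Lo-WIP designs deliver shorter lead-times than Hi-WIP designs can be explored with a toy simulation. What follows is a sketch under stated assumptions, not a model of any real service: a single-step, first-in-first-out system in which a new job arrives the moment one leaves, so the work-in-progress is held constant. The function name and the numbers are hypothetical.

```python
from collections import deque


def steady_state_lead_time(wip: int, cycle_time: float, n_jobs: int = 50) -> float:
    """Simulate a single-step FIFO system held at constant WIP.

    One job finishes every `cycle_time`; a replacement joins the back of
    the queue the moment one leaves, so the work-in-progress never changes.
    Returns the lead-time of the last job completed.
    """
    if wip < 1 or n_jobs <= wip:
        raise ValueError("need wip >= 1 and n_jobs > wip to get past the start-up transient")
    queue = deque([0.0] * wip)  # arrival time of each job currently in the system
    clock = 0.0
    lead = 0.0
    for _ in range(n_jobs):
        clock += cycle_time        # the step completes one job
        arrived = queue.popleft()  # the oldest job leaves (first in, first out)
        lead = clock - arrived     # lead-time = time out minus time in
        queue.append(clock)        # a replacement arrives, so WIP stays constant
    return lead


cycle_time = 2.0  # hypothetical: one job completed every 2 time units
for wip in (3, 30):  # a Lo-WIP and a Hi-WIP design with the same cycle-time
    simulated = steady_state_lead_time(wip, cycle_time)
    predicted = wip * cycle_time  # Little's Law: LT = WIP * CT
    print(wip, simulated, predicted)
```

With the cycle-time held the same, the only difference between the two designs is the size of the queue, and the simulated lead-time should match the WIP * CT prediction in both cases.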

25/10/2014

Fit-4-Purpose

We all want a healthcare system that is fit for purpose.

One which can deliver diagnosis, treatment and prognosis where it is needed, when it is needed, with empathy and at an affordable cost.

One that achieves intended outcomes without unintended harm – either physical or psychological.

We want safety, delivery, quality and affordability … all at the same time.

And we know that there are always constraints we need to work within.

There are constraints set by the Laws of the Universe – physical constraints.

These are absolute, eternal and are not negotiable.

Dr Who’s fantastical TARDIS is fictional. We cannot distort space, or travel in time, or go faster than light – well not with our current knowledge.

There are also constraints set by the Laws of the Land – legal constraints.

Legal constraints are rigid but they are also adjustable. Laws evolve over time, and they are arbitrary. We design them. We choose them. And we change them when they are no longer fit for purpose.

The third limit is often seen as the financial constraint. We are required to live within our means. There is no eternal font of limitless funds to draw from. We all share a planet that has finite natural resources – and ‘grow’ in one part implies ‘shrink’ in another. The Laws of the Universe are not negotiable. Mass, momentum and energy are conserved.

The fourth constraint is perceived to be the most difficult yet, paradoxically, is the one that we have most influence over.

It is the cultural constraint.

The collective, continuously evolving, unwritten rules of socially acceptable behaviour.

Improvement requires challenging our unconscious assumptions, our beliefs and our habits – and selectively updating those that are no longer fit-4-purpose.

To learn we first need to expose the gaps in our knowledge and then to fill them.

We need to test our hot rhetoric against cold reality – and when the fog of disillusionment forms we must rip up and rewrite what we have exposed to be old rubbish.

We need to examine our habits with forensic detachment and we need to ‘unlearn’ the ones that are limiting our effectiveness, and replace them with new habits that better leverage our capabilities.

And all of that is tough to do. Life is tough. Living is tough. Learning is tough. Leading is tough. But it is energising too.

Having a model-of-effective-leadership to aspire to and a peer-group for mutual respect and support is a critical piece of the jigsaw.

It is not possible to improve a system alone. No matter how smart we are, how committed we are, or how hard we work. A system can only be improved by the system itself. It is a collective and a collaborative challenge.

So with all that in mind let us sketch a blueprint for a leader of systemic cultural improvement.

What values, beliefs, attitudes, knowledge, skills and behaviours would be on our ‘must have’ list?

What hard evidence of effectiveness would we ask for? What facts, figures and feedback?

And with our check-list in hand would we feel confident to spot an ‘effective leader of systemic cultural improvement’ if we came across one?

This is a tough design assignment because it requires the benefit of hindsight to identify the critical-to-success factors: our ‘must have and must do’ and ‘must not have and must not do’ lists.

H’mmmm ….

So let us take a more pragmatic and empirical approach. Let us ask …

“Are there any real examples of significant and sustained healthcare system improvement that are relevant to our specific context?”

And if we can find even just one Black Swan then we can ask …

Q1. What specifically was the significant and sustained improvement?
Q2. How specifically was the improvement achieved?
Q3. When exactly did the process start?
Q4. Who specifically led the system improvement?

And if we do this exercise for the NHS we discover some interesting things.

First let us look for exemplars … and let us start using some official material – the Monitor website (http://www.monitor.gov.uk) for example … and let us pick out ‘Foundation Trusts’ because they are the ones who are entrusted to run their systems with a greater degree of capability and autonomy.

And what we discover is a league table where those FTs that are OK are called ‘green’ and those that are Not OK are coloured ‘red’. And there are some that are ‘under review’ so we will call them ‘amber’.

The criteria for deciding this RAG rating are embedded in a large balanced scorecard of objective performance metrics linked to a robust legal contract that provides the framework for enforcement. Safety metrics like standardised mortality ratios, flow metrics like 18-week and 4-hour target yields, quality metrics like the friends-and-family test, and productivity metrics like financial viability.

A quick tally revealed 106 FTs in the green, 10 in the amber and 27 in the red.

But this is not much help with our quest for exemplars because it is not designed to point us to who has improved the most, it only points to who is failing the most! The league table is a name-and-shame motivation-destroying cultural-missile fuelled by DRATs (delusional ratios and arbitrary targets) and armed with legal teeth. A projection of the current top-down, Theory-X, burn-the-toast-then-scrape-it management-of-mediocrity paradigm. Oh dear!

However, despite these drawbacks we could make better use of this data. We could look at the ‘reds’ and specifically at their styles of cultural leadership and compare with a random sample of all the ‘greens’ and their models for success. We could draw out the differences and correlate with outcomes: red, amber or green.

That could offer us some insight and could give us a head start with our blueprint and check-list.

It would be a time-consuming and expensive piece of work and we do not want to wait that long. So what other avenues are there we can explore now and at no cost?

Well there are unofficial sources of information … the ‘grapevine’ … the stuff that people actually talk about.

What examples of effective improvement leadership in the NHS are people talking about?

Well a little blue bird tweeted one in my ear this week …

And specifically they are talking about a leader who has learned to walk-the-improvement-walk and is now talking-the-improvement-walk: and that is Sir David Dalton, the CEO of Salford Royal.

Here is a copy of the slides from Sir David’s recent lecture at the Kings Fund … and it is interesting to compare and contrast it with the style of NHS Leadership that led up to the Mid Staffordshire Failure, and to the Francis Report, and to the Keogh Report and to the Berwick Report.

Chalk and cheese!

So if you are an NHS employee would you rather work as part of an NHS Trust where the leaders walk-DD’s-walk and talk-DD’s-talk?

And if you are an NHS customer would you prefer that the leaders of your local NHS Trust walked Sir David’s walk too?

We are the system … we get the leaders that we deserve … we make the choice … so we need to choose wisely … and we need to make our collective voice heard.

Actions speak louder than words. Walk works better than talk. We must be the change we want to see.

11/10/2014

Feel the Fear

We spend a lot of time in a state of anxiety and fear. It is part and parcel of life because there are many real threats that we need to detect and avoid.

For our own safety and survival.

Unfortunately there are also many imagined threats that feel just as real and just as terrifying.

In these cases it is our fear that does the damage because it paralyses our decision making and triggers our ‘fright’ then ‘fight’ or ‘flight’ reaction.

Fear is not bad … the emotional energy it releases can be channelled into change and improvement. Just as anger can.

So we need to be able to distinguish the real fears from the imaginary ones. And we need effective strategies to defuse the imaginary ones. Because until we do that we will find it very difficult to listen, learn, experiment, change and improve.

So let us grasp the nettle and talk about a dozen universal fears …

Fear of dying before one’s time.
Fear of having one’s basic identity questioned.
Fear of poverty or loss of one’s livelihood.
Fear of being denied one’s fundamental rights and liberties.

Fear of being unjustly accused of wrongdoing.
Fear of public humiliation.
Fear of being unjustly seen as lacking character.
Fear of being discovered as inauthentic – a fraud.

Fear of radical change.
Fear of feedback.
Fear of failure.
Fear of the unknown.

Notice that some of these fears are much ‘deeper’ than others … this list is approximately in depth order. Some relate to ‘self’; some relate to ‘others’ and all are inter-related to some degree. Fear of failure links to fear of humiliation and to fear of loss-of-livelihood.

Of these the four that are closest to the surface are the easiest to tackle … fear of radical change, fear of feedback, fear of failure, and fear of the unknown. These are the Four Fears that block personal improvement.

Fear of the unknown is the easiest to defuse. We just open the door and look … from an emotionally safe distance so that we can run away if our worst fears are realised … which does not happen when the fear is imagined.

This is an effective strategy for defusing the emotionally and socially damaging effects of self-generated phobias.

And we find overcoming fear-of-the-unknown exhilarating … that is how theme parks and roller-coaster rides work.

First we open our eyes, we look, we see, we observe, we reflect, we learn and we convert the unknown to the unfamiliar and then to the familiar. We may not conquer our fear completely … there may be some reasonable residual anxiety … but we have learned to contain it and to control it. We have made friends with our inner Chimp. We climb aboard the roller coaster that is called ‘life’.

Fear of failure is next. We defuse this by learning how to fail safely so that we can learn-by-doing and by that means we reduce the risk of future failures. We make frequent small safe failures in order to learn how to avoid the rare big unsafe ones!

Many people approach improvement from an academic angle. They sit on the fence. They are the reflector-theorists. And this may be because they are too fearful-of-failing to learn the how-by-doing. So they are unable to demonstrate the how and their fear becomes the fear-of-fraud and the fear-of-humiliation. They are blocked from developing their pragmatist/activist capability by their self-generated fear-of-failure.

So we start small, we stay focussed, we stay inside our circle of control, and we create a safe zone where we can learn how to fail safely – first in private and later in public.

One of the most inspiring behaviours of an effective leader is the courage to learn in public and to make small failures that demonstrate their humility and humanity.

Those who insist on ‘perfect’ leaders are guaranteed to be disappointed.

And one thing that we all fail repeatedly is to ask for, to give and to receive effective feedback. This links to the deeper fear-of-humiliation.

And it is relatively easy to defuse this fear-of-feedback too … we just need a framework to support us until we find our feet and our confidence.

The key to effective feedback is to make it non-judgemental.

And that can only be done by developing our ability to step back and out of the Drama Triangle and to cultivate an I’m OK- You’re OK mindset.

The mindset of mutual respect. Self-respect and Other-respect.

And remember that Other-respect does not imply trust, alignment, agreement, or even liking.

Sworn enemies can respect each other while at the same time not trusting, liking or agreeing with each other.

Judgement-free feedback (JFF) is a very effective technique … both for defusing fear and for developing mutual respect.

And from that foundation radical change becomes possible, even inevitable.

16/08/2014

The 85% Optimum Occupancy Myth

There seems to be a belief among some people that the “optimum” average bed occupancy for a hospital is around 85%.

More than that risks running out of beds and admissions being blocked, 4 hour breaches appearing and patients being put at risk. Less than that is inefficient use of expensive resources. They claim there is a ‘magic sweet spot’ that we should aim for.

Unfortunately, this 85% optimum occupancy belief is a myth.

So, first we need to dispel it, then we need to understand where it came from, and then we are ready to learn how to actually prevent queues, delays, disappointment, avoidable harm and financial non-viability.

Disproving this myth is surprisingly easy. A simple thought experiment is enough.

Suppose we have a policy where we keep patients in hospital until someone needs their bed, then we discharge the patient with the longest length of stay and admit the new one into the still warm bed – like a baton pass. There would be no patients turned away – 0% breaches. And all our beds would always be full – 100% occupancy. Perfection!

And it does not matter if the number of admissions arriving per day is varying – as it will.

And it does not matter if the length of stay is varying from patient to patient – as it will.

We have disproved the hypothesis that a maximum 85% average occupancy is required to achieve 0% breaches.
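The thought experiment can also be run as a few lines of code. What follows is only a minimal sketch under stated assumptions: the function name, the 20 beds, the one-year horizon and the uniformly random daily arrivals are all invented for illustration and are not taken from any real ward.

```python
import random
from collections import deque


def simulate_baton_pass(beds=20, days=365, max_daily_arrivals=8, seed=1):
    """Discharge the longest-staying patient only when a new arrival needs the bed."""
    rng = random.Random(seed)
    # Every bed starts occupied; each entry is the day that patient was admitted.
    # The longest-staying patient is always at the left of the deque.
    ward = deque([0] * beds)
    breaches = 0            # arrivals we could not admit
    occupied_bed_days = 0
    stays = []              # length of stay of each discharged patient, in days

    for day in range(1, days + 1):
        arrivals = rng.randint(0, max_daily_arrivals)   # demand varies day to day
        for _ in range(arrivals):
            if ward[0] == day:
                # Every bed already holds a patient admitted today: nobody to discharge.
                breaches += 1
                continue
            stays.append(day - ward.popleft())   # baton pass: oldest patient leaves
            ward.append(day)                     # new patient takes the warm bed
        occupied_bed_days += len(ward)

    occupancy = occupied_bed_days / (beds * days)
    return occupancy, breaches, stays


occupancy, breaches, stays = simulate_baton_pass()
print(f"occupancy={occupancy:.0%}, breaches={breaches}, "
      f"shortest stay={min(stays)} days, longest stay={max(stays)} days")
```

Notice that length of stay is not set anywhere in the sketch. It falls out of the policy and the demand, and it varies from patient to patient while the beds stay full and nobody is turned away.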

The source of this specific myth appears to be a paper published in the British Medical Journal in 1999 called “Dynamics of bed use in accommodating emergency admissions: stochastic simulation model”.

So it appears that this myth was cooked up by academic health economists using a computer model.

And then amateur queue theory zealots jump on the band-wagon to defend this meaningless mantra and create a smoke-screen by bamboozling the mathematical muggles with tales of Poisson processes and Erlang equations.

And they are sort-of correct … the theoretical behaviour of the “ideal” stochastic demand process was described by Poisson and the equations that describe the theoretical behaviour were described by Agner Krarup Erlang. Over 100 years ago before we had computers.

BUT …

The academics and amateurs conveniently omit one minor, but annoying, fact … that real world systems have people in them … and people are irrational … and people cook up policies that ride roughshod over the mathematics, the statistics and the simplistic, stochastic mathematical and computer models.

And when creative people start meddling then just about anything can happen!

So what went wrong here?

One problem is that the academic heffalumps unwittingly stumbled into a whole minefield of pragmatic process design traps.

Here are just some of them …

1. Occupancy is a ratio – it is a meaningless number without its context – the flow parameters.

2. Using linear, stochastic models is dangerous – they ignore the non-linear complex system behaviours (chaos to you and me).

3. Occupancy relates to space-capacity and says nothing about the flow-capacity or the space-capacity and flow-capacity scheduling.

4. Space-capacity utilisation (i.e. occupancy) and systemic operational efficiency are not equivalent.

5. Queue theory is a simplification of reality that is needed to make the mathematics manageable.

6. Ignoring the fact that our real systems are both complex and adaptive implies that blind application of basic queue theory rhetoric is dangerous.

And if we recognise and avoid these traps and we re-examine the problem a little more pragmatically then we discover something very useful:

That the maximum space capacity requirement (the number of beds needed to avoid breaches) is actually easily predictable.

It does not need a black-magic-box full of scary queue theory equations or rather complicated stochastic simulation models to do this … all we need is our tried-and-trusted tool … a spreadsheet.

And we need something else … some flow science training and some simulation model design discipline.

When we do that we discover something else …. that the expected average occupancy is not 85% … or 65%, or 99%, or 95%.

There is no one-size-fits-all optimum occupancy number.

And as we explore further we discover that:

The expected average occupancy is context dependent.
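One way to see this for ourselves is a toy model. The sketch below is an illustration only, and every number in it is an assumption: Poisson daily admissions, a length of stay drawn evenly from 1 to 9 days, and "beds needed" taken as the highest census seen in the simulated period. It is exactly the kind of simple stochastic model that should not be trusted to design a real ward, but it is enough to show that the ratio moves when the flow parameters move.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson-distributed count (Knuth's method, fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def expected_occupancy(mean_arrivals, days=2000, warmup=50, seed=7):
    """Return (beds needed for zero breaches, average occupancy of those beds)."""
    rng = random.Random(seed)
    discharges = [0] * (days + 20)   # planned discharges per future day
    census = 0                       # patients currently in a bed
    history = []
    for day in range(days):
        census -= discharges[day]
        for _ in range(poisson(rng, mean_arrivals)):
            length_of_stay = rng.randint(1, 9)      # assumed, in days
            discharges[day + length_of_stay] += 1
            census += 1
        if day >= warmup:                            # skip the empty-ward start-up
            history.append(census)
    beds_needed = max(history)                       # enough beds to never turn anyone away
    average_occupancy = sum(history) / len(history) / beds_needed
    return beds_needed, average_occupancy


for rate in (1, 5, 20):
    beds, occ = expected_occupancy(rate)
    print(f"{rate:>2} admissions/day: {beds} beds needed, average occupancy {occ:.0%}")
```

With the same length-of-stay pattern and the same zero-breach requirement, the small unit ends up with a much lower average occupancy than the large one. The only thing that changed was the context.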

And when we remember that our real system is adaptive, and it is staffed with well-intended, well-educated, creative people (who may have become rather addicted to reactive fire-fighting), then we begin to see why the behaviour of real systems seems to defy the predictions of the 85% optimum occupancy myth:

Our hospitals seem to work better-than-predicted at much higher occupancy rates.

And then we realise that we might actually be able to design proactive policies that are better able to manage unpredictable variation; better than the simplistic maximum 85% average occupancy mantra.

And finally another penny drops … average occupancy is an output of the system …. not an input. It is an effect.

And so is average length of stay.

Which implies that setting these output effects as causal inputs to our bed model creates a meaningless, self-fulfilling, self-justifying delusion.

Ooops!

Now our challenge is clear … we need to learn proactive and adaptive flow policy design … and using that understanding we have the potential to deliver zero delays and high productivity at the same time.

And doing that requires a bit more than a spreadsheet … but it is possible.

26/07/2014

The Productive Meeting

The engine of improvement is a productive meeting.

Complex adaptive systems (CAS) are those that learn and change themselves.

The books of ‘rules’ are constantly revised and refreshed as the CAS co-evolves with its environment.

System improvement is the outcome of effective actions.

Effective actions are the outcomes of wise decisions.

Wise decisions are the output of productive meetings.

So the meeting process must be designed to be productive: which means both effective and efficient.

One of the commonest niggles that individuals report is ‘Death by Meeting’.

That alone is enough evidence that our current design for meetings is flawed.

One common error of omission is lack of clarity about the purpose of the meeting.

This cause has two effects:

1. The wrong sort of meeting design is used for the problem(s) under consideration.

A meeting designed for tactical (how to) planning will not work well for strategic (why to) problems.

2. A mixed bag of problems is dumped into the all-purpose-less meeting.

Mixing up short term tactical and long term strategic problems on a single overburdened agenda is doomed to fail.

Even when the purpose of a meeting is clear and agreed it is common to observe an unproductive meeting process.

The process may be unproductive because it is ineffective … there are no wise decisions made and so no effective actions implemented.

Worse even than that … decisions are made that are unwise and the actions that follow lead to unintended negative consequences.

The process may also be unproductive because it is inefficient … it requires too much input to get any output.

Of course we want both an effective and an efficient meeting process … and we need to be aware that effectiveness comes first. Designing the meeting process to be a more efficient generator of unwise decisions is not a good idea! The result is an even bigger problem!

So our meeting design focus is ‘How could we make wise decisions as a group?’

But if we knew the answer to that we would probably already be doing it!

So we can ask the same question another way: ‘How do we make unwise decisions as a group?’

The second question is easier to answer. We just reflect on our current experience.

Some ways we appear to unintentionally generate unwise decisions are:

a) Ensure we have no clarity of purpose – confusion is a good way to defuse effective feedback.
b) Be selective in who we invite to the meeting – group-think facilitates consensus.
c) Ignore the pragmatic, actual, reality and only use academic, theoretical, rhetoric.
d) Encourage the noisy – quiet people are non-contributors.
e) Engage in manipulative styles of behaviour – people cannot be trusted.
f) Encourage the sceptics and cynics to critique and cull innovative suggestions.
g) Have a trump card – keep the critical ‘any other business’ to the end – just in case.

If we adopt all these tactics we can create meetings that are ‘lively’, frustrating, inefficient and completely unproductive. That of course protects us from making unwise decisions.

So one approach to designing meetings to be more productive is simply to recognise and challenge the unproductive behaviours – first as individuals and then as groups.

The place to start is within our own circle of influence – with those we trust – and to pledge to each other to consciously monitor for unproductive behaviours and to respectfully challenge them.

These behaviours are so habitual that we are often unaware that we are doing them.

And it feels strange at first but it gets easier with practice and when you see the benefits.

26/04/2014

Synchronicity

[Beep, Beep, Beep, Beep, Beeeeep] The reminder roused Bob from deep reflection and he clicked the Webex link on his desktop to start the meeting. Leslie was already online.

<Bob> Hi Leslie. How are you? And what would you like to share and explore today?

<Leslie> Hi Bob, I am well thank you and I would like to talk about chaos again.

<Bob> OK. That is always a rich mine of new insights! Is there a specific reason?

<Leslie> Yes. The story I want to share is of the chaos that I have been experiencing just trying to get a new piece of software available for my team to use. You would not believe the amount of time, emails, frustration and angst it has taken to negotiate this through the ‘proper channels’.

<Bob> Let me guess … about six months?

<Leslie> Spot on! How did you know?

<Bob> Just prior experience of similar stories. So what is your diagnosis of the cause of the chaos?

<Leslie> My intuition shouts at me that people are just being deliberately difficult and that makes me feel angry and want to shout at them … but I have learned that behaviour is counter-productive.

<Bob> So what did you do?

<Leslie> I escalated the ‘problem’ to my line manager.

<Bob> And what did they do?

<Leslie> I am not sure, I was not copied in, but it seemed to clear the ‘obstruction’.

<Bob> And were the ‘people’ you mentioned suddenly happy and willing to help?

<Leslie> Not really … they did what we needed but they did not seem very happy about it.

<Bob> OK. You are describing a Drama Triangle, a game, and your behaviour was from the Persecutor role.

<Leslie> What! But I deliberately did not send any ANGRY emails or get into a childish argument. I escalated the issue I could not solve because that is what we are expected to do.

<Bob> Yes I know. If you had engaged in a direct angry conversation, by whatever means, that would have been an actively aggressive act. By escalating the issue and someone Bigger having the angry conversation you have engaged in a passive aggressive act. It is still playing the game from the Persecutor role and in fact is the more common mode of Persecution.

<Leslie> But it got the barrier cleared and the problem sorted?

<Bob> And did it leave everyone feeling happier than before?

<Leslie> I guess not. I certainly felt like a bit of a ‘tale teller’ and the IT technician probably hates me and fears for his job, and the departmental heads probably distrust each other even more than before.

<Bob> So this approach may appear to work in the short term but it creates a much bigger long term problem – and it is that long term problem of ‘distrust’ that creates the chaos. So it is a self-sustaining design.

<Leslie> Oh dear! Is there a way to avoid this and to defuse the chronic distrust?

<Bob> Yes. You have demonstrated a process that you would like to improve – you want the same short term outcome, your software installed and working, and you want it quicker and with less angst and leaving everyone feeling good about how they have played a part in achieving that objective.

<Leslie> Yes. That would be my ideal.

<Bob> So what is different between what you did and your ‘ideal’ scenario? What did you do that you should not have and what did you not do that you could have?

<Leslie> Well I triggered off a drama triangle which I should not have. I also assumed that the IT people would know what to do because I do not understand the technical nuances of getting new software procured and installed. What I could have done is make it much clearer for them what I needed, why I needed it and how and when I needed it. I could have done a lot more homework before asking them for assistance. I could also have given my inner Chimp a banana and gone to talk to them face-to-face and ask their opinion early on so I could see the problem from their perspective as well as mine.

<Bob> Yes – that all sounds reasonable and respectful. What you are doing is ‘synchronising’. You are engaging in understanding the process well enough so that you can align all the actions that need to be done, in the correct order and then sharing that. It is rather like being the composer of a piece of music – you share the score so that the individual players know what to do and when. There is one other task you need to do.

<Leslie> I need to be the conductor!

<Bob> Yes. You are the metronome. You set the pace and guide the orchestra. They are the specialists with their instruments – that is not your role.

<Leslie> And when I do that then the music is harmonious and pleasing-to-the-ear; not a chaotic cacophony!

<Bob> Indeed … and the music is the voice of the system – and is the feedback that everyone hears – and not only do the musicians derive pleasure from contributing, but the wider audience will also hear what can be achieved and see how it is achieved.

<Leslie> Wow! That musical metaphor works really well for me. Thanks Bob, I need to go and work on my communicating, composing and conducting capabilities.

29/03/2014

The Improvement Pyramid

The image of a tornado is what many associate with improvement. An unpredictable, powerful, force that sweeps away the wood in its path. It certainly transforms – but it leaves a trail of destruction and disappointment in its wake. It does not discriminate between the green wood and the dead wood.

A whirlwind is created by a combination of powerful forces – but the trigger that unleashes the beast is innocuous. The classic ‘butterfly wing effect’. A spark that creates an inferno.

This is not the safest way to achieve significant and sustained improvement. A transformation tornado is a blunt and destructive tool. All it can hope to achieve is to clear the way for something more elegant. Improvement Science.

We need to build the capability for improvement progressively and to build it to be effective, efficient, strong, reliable, and resilient. In a word – trustworthy. We need a durable structure.

But what sort of structure? A tower from whose lofty penthouse we can peer far into the distance? A bridge between the past and the future? A house with foundations, walls and a roof? Do these man-made edifices meet our criteria? Well partly.

Let us see what nature suggests. What are the naturally durable designs?

Suppose we have a bag of dry sand – an unstructured mix of individual grains – and that each grain represents an improvement idea.

Suppose we have a specific issue that we would like to improve – a Niggle.

Let us try dropping the Improvement Sand on the Niggle – not in a great big reactive dollop – but in a proactive, exploratory bit-at-a-time way. What shape emerges?

What we see is illustrated by the hourglass. We get a pyramid.

The shape of the pyramid is determined by two factors: how sticky the sand is and how fast we pour it.

What we want is a tall pyramid – one whose sturdy pinnacle gives us the capability to see far and to do much.

The stickier the sand the steeper the sides of our pyramid. The faster we pour the quicker we get the height we need. But there is a limit. If we pour too quickly we create instability – we create avalanches.

So we need to give the sand time to settle into its stable configuration; time for it to trickle to where it feels most comfortable.

And, in translating this metaphor to building improvement capability in a system we could suggest that the ‘stickiness’ factor is how well ideas hang together and how well individuals get on with each other and how well they share ideas and learning. How cohesive our people are. Distrust and conflict represent repulsive forces. Repulsion creates a large, wide, flat structure – stable maybe but incapable of vision and improvement. That is not what we need.

So when developing a strategy for building improvement capability we build small pyramids where the niggles point to. Over time they will merge and bigger pyramids will appear and merge – until we achieve the height. Then we have a stable and capable improvement structure. One that we can use and we can trust.

Just from sprinkling Improvement Science Sand on our Niggles.

22/02/2014

Rocket Science

This is a picture of Chris Hadfield. He is an astronaut and to prove it here he is in the ‘cupola’ of the International Space Station (ISS). Through the windows is a spectacular view of the Earth from space.

Our home seen from space.

What is remarkable about this image is that it even exists.

This image is tangible evidence of a successful outcome of a very long path of collaborative effort by 100’s of 1000’s of people who share a common dream.

That if we can learn to overcome the challenge of establishing a permanent manned presence in space then just imagine what else we might achieve?

Chris is unusual for many reasons. One is that he is Canadian and there are not many Canadian astronauts. He is also the first Canadian astronaut to command the ISS. Another claim to fame is that when he recently lived in space for 5 months on the ISS, he recorded a version of David Bowie’s classic song – for real – in space. To date this has clocked up 21 million YouTube hits and has helped to bring the inspiring story of space exploration back to the public consciousness.

Especially the next generation of explorers – our children.

Chris has also written a book ‘An Astronaut’s Guide to Life on Earth’ that tells his story. It describes how he was inspired at a young age by seeing the first man to step onto the Moon in 1969. He overcame seemingly impossible obstacles to become an astronaut, to go into space, and to command the ISS. The image is tangible evidence.

We all know that space is a VERY dangerous place. I clearly remember the two space shuttle disasters. There have been many other much less public accidents. Those tragic events have shocked us all out of complacency and have created a deep sense of humility in those who face up to the task of learning to overcome the enormous technical and cultural barriers.

Getting six people into space safely, staying there long enough to conduct experiments on the long-term effects of weightlessness, and getting them back again safely is a VERY difficult challenge. And it has been overcome. We have the proof.

Many of the seemingly impossible day-to-day problems that we face seem puny in comparison.

For example: getting every patient into hospital, staying there just long enough to benefit from cutting edge high-technology healthcare, and getting them back home again safely.

And doing it repeatedly and consistently so that the system can be trusted and we are not greeted with tragic stories every time we open a newspaper. Stories that erode our trust in the ability of groups of well-intended people to do anything more constructive than bully, bicker and complain.

So when the exasperated healthcare executive exclaims ‘Getting 95% of emergency admissions into hospital in less than 4 hours is not rocket science!’ – then perhaps a bit more humility is in order. It is rocket science.

Rocket science is Improvement science.

And reading the story of a real-life rocket-scientist might be just the medicine our exasperated executives need.

Because Chris explains exactly how it is done.

And he is credible because he has walked-the-talk so he has earned the right to talk-the-walk.

The least we can do is listen and learn.

Here is Chris answering the question ‘How to achieve an impossible dream?’

08/02/2014

Jiggling

[Dring] Bob’s laptop signaled the arrival of Leslie for their regular ISP remote coaching session.

<Bob> Hi Leslie. Thanks for emailing me with a long list of things to choose from. It looks like you have been having some challenging conversations.

<Leslie> Hi Bob. Yes indeed! The deepening gloom and the last few blog topics seem to be polarising opinion. Some are claiming it is all hopeless and others, perhaps out of desperation, are trying the FISH stuff for themselves and discovering that it works. The ‘What Ifs’ are engaged in a war of words with the ‘Yes Buts’.

<Bob> I like your metaphor! Where would you like to start on the long list of topics?

<Leslie> That is my problem. I do not know where to start. They all look equally important.

<Bob> So, first we need a way to prioritise the topics to get the horse-before-the-cart.

<Leslie> Sounds like a good plan to me!

<Bob> One of the problems with the traditional improvement approaches is that they seem to start at the most difficult point. They focus on ‘quality’ first – and to be fair that has been the mantra from the gurus like W.E.Deming. ‘Quality Improvement’ is the Holy Grail.

<Leslie> But quality IS important … are you saying they are wrong?

<Bob> Not at all. I am saying that it is not the place to start … it is actually the third step.

<Leslie> So what is the first step?

<Bob> Safety. Eliminating avoidable harm. Primum Non Nocere. The NoNos. The Never Events. The stuff that generates the most fear for everyone. The fear of failure.

<Leslie> You mean having a service that we can trust not to harm us unnecessarily?

<Bob> Yes. It is not a good idea to make an unsafe design more efficient – it will deliver even more cumulative harm!

<Leslie> OK. That makes perfect sense to me. So how do we do that?

<Bob> It does not actually matter. Well-designed and thoroughly field-tested checklists have been proven to be very effective in the ‘ultra-safe’ industries like aerospace and nuclear.

<Leslie> OK. Something like the WHO Safe Surgery Checklist?

<Bob> Yes, that is a good example – and it is well worth reading Atul Gawande’s book about how that happened – “The Checklist Manifesto”. Gawande is a surgeon who had published a lot on improvement and even so was quite skeptical that something as simple as a checklist could possibly work in the complex world of surgery. In his book he describes a number of personal ‘Ah Ha!’ moments that illustrate a phenomenon that I call Jiggling.

<Leslie> OK. I have made a note to read Checklist Manifesto and I am curious to learn more about Jiggling – but can we stick to the point? Does quality come after safety?

<Bob> Yes, but not immediately after. As I said, Quality is the third step.

<Leslie> So what is the second one?

<Bob> Flow.

There was a long pause – and just as Bob was about to check that the connection had not been lost – Leslie spoke.

<Leslie> But none of the Improvement Schools teach basic flow science. They all focus on quality, waste and variation!

<Bob> I know. And attempting to improve quality before improving flow is like papering the walls before doing the plastering. Quality cannot grow in a chaotic context. The flow must be smooth before that. And the fear of harm must be removed first.

<Leslie> So the ‘Improving Quality through Leadership’ bandwagon that everyone is jumping on will not work?

<Bob> Well that depends on what the ‘Leaders’ are doing. If they are leading the way to learning how to design-for-safety and then design-for-flow then the bandwagon might be a wise choice. If they are only facilitating collaborative agreement and group-think then they may be making an unsafe and ineffective system more efficient which will steer it over the edge into faster decline.

<Leslie> So, if we can stabilize safety using checklists do we focus on flow next?

<Bob> Yup.

<Leslie> OK. That makes a lot of sense to me. So what is Jiggling?

<Bob> This is Jiggling. This conversation.

<Leslie> Ah, I see. I am jiggling my understanding through a series of ‘nudges’ from you.

<Bob> Yes. And when the learning cogs are a bit rusty, some Improvement Science Oil and a bit of Jiggling is more effective and much safer than whacking the caveman wetware with a big emotional hammer.

<Leslie> Well the conversation has certainly jiggled Safety-Flow-Quality-and-Productivity into a sensible order for me. That has helped a lot. I will sort my to-do list into that order and start at the beginning. Let me see. I have a plan for safety, now I can focus on flow. Here is my top flow niggle. How do I design the resource capacity I need to ensure the flow is smooth and the waiting times are short enough to avoid ‘persecution’ by the Target Time Police?

<Bob> An excellent question! I will send you the first ISP Brainteaser that will nudge us towards an answer to that question.

<Leslie> I am ready and waiting to have my brain-teased and my niggles-nudged!

01/02/2014

The Speed of Trust

Systems are built from intersecting streams of work called processes.

This iconic image of the London Underground shows a system map – a set of intersecting transport streams.

Each stream links a sequence of independent steps – in this case the individual stations. Each step is a system in itself – it has a set of inner streams.

For a system to exhibit stable and acceptable behaviour the steps must be in synergy – literally ‘together work’. The steps also need to be in synchrony – literally ‘same time’. And to do that they need to be aligned to a common purpose. In the case of a transport system the design purpose is to get from A to B safely, quickly, in comfort and at an affordable cost.

In large socioeconomic systems called ‘organisations’ the steps represent groups of people with special knowledge and skills that collectively create the desired product or service. This creates an inevitable need for ‘handoffs’ as partially completed work flows through the system along streams from one step to another. Each step contributes to the output. It is like a series of baton passes in a relay race.

This creates the requirement for a critical design ingredient: trust.

Each step needs to be able to trust the others to do their part: right-first-time and on-time. All the steps are directly or indirectly interdependent. If any one of them is ‘untrustworthy’ then the whole system will suffer to some degree. If too many generate dis-trust then the system may fail and can literally fall apart. Trust is like social glue.
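A little arithmetic shows why one untrustworthy step matters so much. This is only a sketch, and it rests on a simplifying assumption that is not part of the argument above: that each step gets its work right-first-time independently of the others, so the chance of a job passing cleanly through the whole stream is the product of the step probabilities. The function name and the percentages are made up for illustration.

```python
import math


def stream_right_first_time(step_reliabilities):
    """Chance that work passes every step right-first-time (assumes independent steps)."""
    return math.prod(step_reliabilities)


# Ten steps that each get it right 95% of the time.
ten_good_steps = stream_right_first_time([0.95] * 10)

# Nine very reliable steps and one that only gets it right half the time.
one_weak_link = stream_right_first_time([0.99] * 9 + [0.50])

print(f"ten steps at 95%:          {ten_good_steps:.0%} right-first-time")
print(f"nine at 99% and one at 50%: {one_weak_link:.0%} right-first-time")
```

Ten steps that each look respectable on their own deliver only about 60% right-first-time as a stream, and a single weak step caps the whole stream below its own level however good the other nine are.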

So a critical part of people-system design is the development and the maintenance of trust-bonds.

And it does not happen by accident. It takes active effort. It requires design.

We are social animals. Our default behaviour is to trust. We learn distrust by experiencing repeated disappointments. We are not born cynical – we learn that behaviour.

The default behaviour for inanimate systems is disorder – and it has a fancy name – it is called ‘entropy’. There is a Law of Physics that says that ‘the average entropy of a system will increase over time’. The critical word is ‘average’.

So, if we are not aware of this and we omit to pay attention to the hand-offs between the steps we will observe increasing disorder which leads to repeated disappointments and erosion of trust. Our natural reaction then is to ‘self-protect’ which implies ‘check-and-reject’ and ‘check-and-correct’. This adds complexity and bureaucracy and may prevent further decline – which is good – but it comes at a cost – quite literally.

Eventually an equilibrium will be achieved where our system performance is limited by the amount of check-and-correct bureaucracy we can afford. This is called a ‘mediocrity trap’ and it is very resilient – which means resistant to change in any direction.

To escape from the mediocrity trap we need to break into the self-reinforcing check-and-reject loop and we do that by developing a design that challenges ‘trust eroding behaviour’. The strategy is to develop a skill called ‘smart trust’.

To appreciate what smart trust is we need to view trust as a spectrum: not as a yes/no option.

At one end is ‘nonspecific distrust’ – otherwise known as ‘cynical behaviour’. At the other end is ‘blind trust’ – otherwise known as ‘gullible behaviour’. Neither of these is what we need.

In the middle is the zone of smart trust that spans healthy scepticism through to healthy optimism. What we need is to maintain a balance between the two – not to eliminate them. This is because some people are ‘glass-half-empty’ types and some are ‘glass-half-full’. And both views have a value.

The action required to develop smart trust is to respectfully challenge every part of the organisation to demonstrate ‘trustworthiness’ using evidence. Rhetoric is not enough. Politicians always score very low on ‘most trusted people’ surveys.

The first phase of this smart trust development is for steps to demonstrate trustworthiness to themselves using their own evidence, and then to share this with the steps immediately upstream and downstream of them.

So what evidence is needed?

Safety comes first. If a step cannot be trusted to be safe then that is the first priority. Safe systems need to be designed to be safe.

Flow comes second. If the streams do not flow smoothly then we experience turbulence and chaos which increases stress, the risk of harm and creates disappointment for everyone. Smooth flow is the result of careful flow design.

Third is Quality which means ‘setting and meeting realistic expectations’. This cannot happen in an unsafe, chaotic system. Quality builds on Flow which builds on Safety. Quality is a design goal – an output – a purpose.

Fourth is Productivity (or profitability) and that does not automatically follow from the other three as some QI Zealots might have us believe. It is possible to have a safe, smooth, high quality design that is unaffordable. Productivity needs to be designed too. An unsafe, chaotic, low quality design is always more expensive. Always. Safe, smooth and reliable can be highly productive and profitable – if designed to be.

So whatever the driver for improvement the sequence of questions is the same for every step in the system: “How can I demonstrate evidence of trustworthiness for Safety, then Flow, then Quality and then Productivity?”

And when that happens improvement will take off like a rocket. That is the Speed of Trust. That is Improvement Science in Action.

28/12/2013 13/08/2024

The Time Trap

[Hmmmmmm]

The desk amplified the vibration of Bob’s smartphone as it signaled the time for his planned e-mentoring session with Leslie.

<Bob> Hi Leslie, right-on-time, how are you today?

<Leslie> Good thanks Bob. I have a specific topic to explore if that is OK. Can we talk about time traps?

<Bob> OK – do you have a specific reason for choosing that topic?

<Leslie> Yes. The blog last week about ‘Recipe for Chaos’ set me thinking and I remembered that time-traps were mentioned in the FISH course but I confess, at the time, I did not understand them. I still do not.

<Bob> Can you describe how the ‘Recipe for Chaos’ blog triggered this renewed interest in time-traps?

<Leslie> Yes – the question that occurred to me was: ‘Is a time-trap a recipe for chaos?’

<Bob> A very good question! What do you feel the answer is?

<Leslie> I feel that time-traps can and do trigger chaos but I cannot explain how. I feel confused.

<Bob> Your intuition is spot on – so can you localize the source of your confusion?

<Leslie> OK. I will try. I confess I got the answer to the MCQ correct by guessing – and I wrote down the answer when I eventually guessed correctly – but I did not understand it.

<Bob> What did you write down?

<Leslie> “The lead time is independent of the flow”.

<Bob> OK. That is accurate – though I agree it is perhaps a bit abstract. One source of confusion may be that there are different causes of time-traps and there is a lot of overlap with other chaos-creating policies. Do you have a specific example we can use to connect theory with reality?

<Leslie> OK – that might explain my confusion. The example that jumped to mind is the RTT target.

<Bob> RTT?

<Leslie> Oops – sorry – I know I should not use undefined abbreviations. Referral to Treatment Time.

<Bob> OK – can you describe what you have mapped and measured already?

<Leslie> Yes. When I plot the lead-time for patients in date-of-treatment order the process looks stable but the histogram is multi-modal with a big spike just underneath the RTT target of 18 weeks. What you describe as the ‘Horned Gaussian’ – the sign that the performance target is distorting the behaviour of the system and the design of the system is not capable on its own.

<Bob> OK, and have you investigated why there is not just one spike?

<Leslie> Yes – the factor that best explains that is the ‘priority’ of the referral. The ‘urgents’ jump in front of the ‘soons’ and both jump in front of the ‘routines’. The chart has three overlapping spikes.

<Bob> That sounds like a reasonable policy for mixed-priority demand. So what is the problem?

<Leslie> The ‘Routine’ group is the one that clusters just underneath the target. The lead time for routines is almost constant but most of the time those patients sit in one queue or another being leap-frogged by other higher-priority patients. Until they become high-priority – then they do the leap frogging.

<Bob> OK – and what is the condition for a time trap again?

<Leslie> That the lead time is independent of flow.

<Bob> Which implies?

<Leslie> Um. Let me think. That the flow can be varying but the lead time stays the same?

<Bob> Yup. So is the flow of routine referrals varying?

<Leslie> Not over the long term. The chart is stable.

<Bob> What about over the short term? Is demand constant?

<Leslie> No of course not – it varies – but that is expected for all systems. Constant means ‘over-smoothed data’ – the Flaw of Averages trap!

<Bob> OK. And how close is the average lead time for routines to the RTT maximum allowable target?

<Leslie> Ah! I see what you mean. The average is about 17 weeks and the target is 18 weeks.

<Bob> So, what is the flow variation on a week-to-week time scale?

<Leslie> Demand or Activity?

<Bob> Both.

<Leslie> H’mm – give me a minute to re-plot flow as a weekly-aggregated chart. Oh! I see what you mean – the weekly activity and demand are both varying widely and they are not in sync with each other. Work in progress must be wobbling up and down a lot! So how can the lead time variation be so low?

<Bob> What do the flow histograms look like?

<Leslie> Um. Just a second. That is weird! They are both bi-modal with peaks at the extremes and not much in the middle – the exact opposite of what I expected to see! I expected a centered peak.

<Bob> What you are looking at is the characteristic flow fingerprint of a chaotic system – it is called ‘thrashing’.

<Leslie> So, I was right!

<Bob> Yes. And now you know the characteristic pattern to look for. So, what is the policy design flaw here?

<Leslie> The DRAT – the delusional ratio and arbitrary target?

<Bob> That is part of it – that is the external driver policy. The one you cannot change easily. What is the internally driven policy? The reaction to the DRAT?

<Leslie> The policy of leaving routine patients until they are about to breach then re-classifying them as ‘urgent’.

<Bob> Yes! It is called a ‘Prevarication Policy’ and it is surprisingly and uncomfortably common. Ask yourself – do you ever prevaricate? Do you ever put off ‘lower priority’ tasks until later and then not fill the time freed up with ‘higher priority tasks’?

<Leslie> OMG! I do that all the time! I put low priority and unexciting jobs on a ‘to do later’ heap but I do not sit idle – I do then focus on the high priority ones.

<Bob> High priority for whom?

<Leslie> Ah! I see what you mean. High priority for me. The ones that give me the biggest reward! The fun stuff or the stuff that I get a pat on the back for doing or that I feel good about.

<Bob> And what happens?

<Leslie> The heap of ‘no-fun-for-me-to-do’ jobs gets bigger and I await the ‘reminders’ and then have to rush round in a mad panic to avoid disappointment, criticism and blame. It feels chaotic. I get grumpy. I make more mistakes and I deliver lower-quality work. If I do not get a reminder I assume that the job was not that urgent after all and if I am challenged I claim I am too busy doing the other stuff.

<Bob> And have you avoided disappointment?

<Leslie> Ah! No – that I needed to be reminded meant that I had already disappointed. And not getting a reminder does not prove I have not disappointed either. Most people blame rather than complain. I have just managed to erode other people’s trust in my reliability. I have disappointed myself. I have achieved exactly the opposite of what I intended. Drat!

<Bob> So, what is the reason that you work this way? There will be a reason. A good reason.

<Leslie> That is a very good question! I will reflect on that because I believe it will help me understand why others behave this way too.

<Bob> OK – I will be interested to hear your conclusion. Let us return to the question. What is the downside of a ‘Prevarication Policy’?

<Leslie> It creates stress, chaos, fire-fighting, last minute changes, increased risk of errors, more work and it erodes quality, confidence and trust.

<Bob> Indeed so – and the impact on productivity?

<Leslie> The activity falls, the system productivity falls, revenue falls, queues increase, waiting times increase and the chaos increases!

<Bob> And?

<Leslie> We treat the symptoms by throwing resources at the problem – waiting list initiatives – and that pushes our costs up. Either way we are heading into a spiral of decline and disappointment. We do not address the root cause.

<Bob> So what is the way out of chaos?

<Leslie> Reduce the volume on the destabilizing feedback loop? Stop the managers meddling!

<Bob> Or?

<Leslie> Eh? I do not understand what you mean. The blog last week said management meddling was the problem.

<Bob> It is a problem. How many feedback loops are there?

<Leslie> Two – that need to be balanced.

<Bob> So, what is another option?

<Leslie> OMG! I see. Turn UP the volume of the stabilizing feedback loop!

<Bob> Yup. And that is a lot easier to do in reality. So, that is your other challenge to reflect on this week. And I am delighted to hear you using the terms ‘stabilizing feedback loop’ and ‘destabilizing feedback loop’.

<Leslie> Thank you. That was a lesson for me after last week – when I used the terms ‘positive and negative feedback’ it was interpreted in the emotional context – positive feedback as encouragement and negative feedback as criticism. So ‘reducing positive feedback’ in that sense is the exact opposite of what I was intending. So I switched my language to using ‘stabilizing and destabilizing’ feedback loops that are much less ambiguous and the confusion and conflict disappeared.

<Bob> That is very useful learning Leslie … I think I need to emphasize that distinction more in the blog. That is one advantage of online media – it can be updated!

<Leslie> Thanks again Bob! And I have the perfect opportunity to test a new no-prevarication-policy design – in part of the system that I have complete control over – me!
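
The time-trap condition that Leslie wrote down, “the lead time is independent of the flow”, can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the weekly demand figures are invented random numbers, the 17-week hold and the capacity of 10 per week are illustrative values, and the function name `simulate` is hypothetical. It compares a policy that treats every patient at a fixed time after referral with a simple first-in-first-out queue.

```python
import random
from statistics import pstdev

def simulate(weeks=200, hold_weeks=17, capacity=10, seed=1):
    """Return (demand, time-trap lead times, FIFO lead times), all in weeks."""
    rng = random.Random(seed)
    # Invented weekly referral counts that vary around the weekly capacity.
    demand = [rng.randint(4, 16) for _ in range(weeks)]

    # Policy A (time trap): each patient is treated exactly hold_weeks
    # after referral, however many patients happen to be waiting.
    trap = [hold_weeks for d in demand for _ in range(d)]

    # Policy B (first-in-first-out with a fixed weekly capacity):
    # the wait depends on how much work is already in the queue.
    queue = []   # referral week of each waiting patient, oldest first
    fifo = []
    for week, arrivals in enumerate(demand):
        queue.extend([week] * arrivals)
        treated, queue = queue[:capacity], queue[capacity:]
        fifo.extend(week - referred for referred in treated)
    return demand, trap, fifo

if __name__ == "__main__":
    demand, trap, fifo = simulate()
    print("weekly demand spread (SD):", round(pstdev(demand), 2))
    print("time-trap lead time spread (SD):", pstdev(trap))
    print("FIFO lead time spread (SD):", round(pstdev(fifo), 2))
```

Under Policy A the flow varies from week to week but the lead time does not vary at all, which is the signature of a time-trap. Under Policy B the lead time moves with the size of the queue. Real systems are messier than either policy, so treat this as a thinking aid rather than a model of any actual service.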

14/12/2013 12/03/2023

Unknown-Knowns

If we were exploring the corridors in an unfamiliar building and our way forward was blocked by a door that looked like this … we would suspect that something of value lay beyond.

We know there is an unknown.

The puzzle we have to solve to release the chain tells us this. This is called an “affordance” – the design of the lock tells us what we need to do.

More often what we need to know to move forward is unknown to us, and the problems we face afford us no clues as to how to solve them. Worse than that – the clues they do offer are misleading. Our intuition is tricked. We do the ‘intuitively obvious’ thing and the problem gets worse.

It is easy to lose confidence, become despondent, and even to start to believe there is no solution. We begin to believe that the problem is impossible for us to solve.

Then one day someone shows us how to solve an “impossible” problem. And with the benefit of our new perspective the solution looks simple, and how it works is now obvious. But only in retrospect.

Our unknown was known all along. But not by us. We were ignorant. We were agnostic.

And our intuitions are sometimes flaky, forgetful and fickle. They are not to be blindly trusted. And our egos are fragile too – we do not like to feel flaky, forgetful and fickle. So, we lie to ourselves and we confuse obvious-in-hindsight with obvious-in-foresight.

They are not the same.

Suppose we now want to demonstrate our new understanding to someone else – to help them solve their “impossible” problem. How do we do that?

Do we say “But it is obvious – if you cannot see it you must be blind or stupid!”

How can we say that when it was not obvious to us only a short time ago? Is our ego getting in the way again? Can our intuition or ego be trusted at all?

To help others gain insight and to help them deepen their understanding we must put ourselves back into the shoes we used to be in: and we need to look at the problem again from their perspective. With the benefit of the three views of the problem: our old one, their current one and our new one we may be able to then see where the Unknown-Known is for them – because it might be different.

Only then can we help them discover it for themselves; and then they can help others discover their Unknown-Knowns. That is how knowledge and understanding spread.

Understanding is the bridge between Knowledge and Wisdom.

And it is a wonderful thing to see someone move from conflict, through confusion to clarity by asking them just the right question, at just the right time, in just the right way. For them.

Socrates, the Greek philosopher and teacher, knew how to do this a long time ago – which is why it is called the Socratic Method.

30/11/2013

Seeing Inside the Black Box

Improvement Science requires the effective, efficient and coordinated use of diagnosis, design and delivery tools.

Experience has also taught us that it is not just about the tools – each must be used as it was designed.

The craftsman knows his tools and knows what instrument to use, where and when the context dictates; and how to use it with skill.

Some tools are simple and effective – easy to understand and to use. The kitchen knife is a good example. It does not require an instruction manual to use it.

Other tools are more complex. Very often because they have a specific purpose. They are not generic. And it may not be intuitively obvious how to use them. Many labour-saving household appliances have specific purposes: the microwave oven, the dish-washer and so on – but they have complex controls and settings that we need to manipulate to direct the “domestic robot” to deliver what we actually want. Very often these controls are not intuitively obvious – we are dealing with a black box – and our understanding of what is happening inside is vague.

Very often we do not understand how the buttons and dials that we can see and touch – the inputs – actually influence the innards of the box to determine the outputs. We do not have a mental model of what is inside the Black Box. We do not know – we are ignorant.

In this situation we may resort to just blindly following the instructions; or blindly copying what someone else does; or blindly trying random combinations of inputs until we get close enough to what we want. No wiser at the end than we were at the start. The common thread here is “blind”. The box is black. We cannot see inside.

And the complex black box is deliberately made so – because the supplier of the super-tool does not want their “secret recipe” to be known to all – least of all their competitors.

This is a perfect recipe for confusion and for conflict. Lose-Lose-Lose.

Improvement Science is dedicated to eliminating confusion and conflict – so Black Box Tools are NOT on the menu.

Improvement Scientists need to understand how their tools work – and the best way to achieve that level of understanding is to design and build their own.

This may sound like re-inventing the wheel but it is not about building novel tools – it is about re-creating the tried and tested tools – for the purpose of understanding how they work. And understanding their strengths, their weaknesses, their opportunities and their risks or threats.

And doing that requires guidance from a mentor who has been through this same learning journey. Starting with simple, intuitive tools, and working step-by-step to design, build and understand the more complex ones.

So where do we start?

In the FISH course the first tool we learn to use is a Gantt Chart.

It was invented by Henry Laurence Gantt about 100 years ago and requires nothing more than pencil and paper. Coloured pencils and squared paper are even better.

This is an example of a Gantt Chart for a Day Surgery Unit.

At the top are the “tasks” – patients 1 and 2; and at the bottom are the “resources”.

Time runs left to right.

Each coloured bar appears twice: once on each chart.

The power of a Gantt Chart is that it presents a lot of information in a very compact and easy-to-interpret format. That is what Henry Gantt intended.

A Gantt Chart is like the surgeon’s scalpel. It is a simple, generic easy-to-create tool that has a wide range of uses. The skill is knowing where, when and how to use it: and just as importantly where-not, when-not and how-not.

The second tool that an Improvement Scientist learns to use is the Shewhart or time-series chart.

It was invented about 90 years ago.

This is a more complex tool and as such there is a BIG danger that it is used as a Black Box with no understanding of the innards. The SPC and Six-Sigma Zealots sell it as a Magic Box. It is not.

We could paste any old time-series data into a bit of SPC software; twiddle with the controls until we get the output we want; and copy the chart into our report. We could do that and hope that no-one will ask us to explain what we have done and how we have done it. Most do not because they do not want to appear ‘ignorant’. The elephant is in the room though. There is a conspiracy of silence.

The elephant-in-the-room is the risk we take when we use Black Box tools – the risk of GIGO. Garbage In Garbage Out.

And unfortunately we have a tendency to blindly trust what comes out of the Black Box that a plausible Zealot tells us is “magic”. This is the Emperor’s New Clothes problem. Another conspiracy of silence follows.

The problem here is not the tool – it is the desperate person blindly wielding it. The Zealots know this and they warn the Desperados of the risk and offer their expensive Magician services. They are not interested in showing how the magic trick is done though! They prefer the Box to stay Black.

So to avoid this cat-and-mouse scenario and to understand both the simpler and the more complex tools, and to be able to use them effectively and safely, we need to be able to build one for ourselves.

And the know-how to do that is not obvious – if it were we would have already done it – so we need guidance.

And once we have built our first one – a rough-and-ready working prototype – then we can use the existing ones that have been polished with long use. And we can appreciate the wisdom that has gone into their design. The Black Box becomes Transparent.
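
A rough-and-ready prototype of a time-series chart might look something like the sketch below. It is an illustration only, and it rests on assumptions that are not in the text above: it uses the individuals-and-moving-range (XmR) form of the Shewhart chart, the conventional scaling constant of 2.66 applied to the average moving range, made-up data, and the hypothetical function names `xmr_limits` and `signals`.

```python
from statistics import mean

def xmr_limits(values):
    """Return (lower limit, centre line, upper limit) for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to compute a moving range")
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    spread = 2.66 * mean(moving_ranges)  # conventional XmR scaling constant
    return centre - spread, centre, centre + spread

def signals(values):
    """Return the positions of points that fall outside the limits."""
    lower, _, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]

if __name__ == "__main__":
    data = [10, 12, 11, 13, 12, 11, 10, 12, 30, 11]  # invented example data
    print("limits:", xmr_limits(data))
    print("points outside the limits:", signals(data))
```

Building even this much by hand shows where the limits come from: they are derived from the point-to-point variation in the data itself, not chosen by the person drawing the chart. Polished tools add further rules and refinements that this prototype leaves out.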

So learning how to build the essential tools is the first part of the Improvement Science Practitioner (ISP) training – because without that knowledge it is difficult to progress very far. And without that understanding it is impossible to teach anyone anything other than to blindly follow a Black Box recipe.

Of course Magic Black Box Solutions Inc will not warm to this idea – they may not want to reveal what is inside their magic product. They are fearful that their customers may discover that it is much simpler than they are being told. And we can test that hypothesis by asking them to explain how it works in language that we can understand. If they cannot (or will not) then we may want to keep looking for someone who can and will.

16/11/2013

Temperament Treacle

If the headlines in the newspapers are a measure of social anxiety then healthcare in the UK is in a state of panic: “Hospitals Fear The Winter Crisis Is Here Early”.

The Panic Button is being pressed and the Patient Safety Alarms are sounding.

Closer examination of the statement suggests that the winter crisis is not unexpected – it is just here early. So we are assuming it will be worse than last year – which was bad enough.

The evidence shows this fear is well founded. Last year was the worst of the last 5 years and this year is shaping up to be worse still.

So if it is a predictable annual crisis and we have a lot of very intelligent, very committed, very passionate people working on the problem – then why is it getting worse rather than better?

One possible factor is Temperament Treacle.

This is the glacially slow pace of effective change in healthcare – often labelled as “resistance to change” and implying deliberate scuppering of the change boat by powerful forces within the healthcare system.

Resistance to the flow of change is probably a better term. We could call that cultural viscosity. Treacle has a very high viscosity – it resists flow. Wading through treacle is very hard work. So pushing change through cultural treacle is hard work. Many give up in exhaustion after a while.

So why the term “Temperament Treacle”?

Improvement Science has three parts – Processes, Politics and Systems.

Process Science is applied physics. It is an objective, logical, rational science. The Laws of Physics are not negotiable. They are absolute.

Political Science is applied psychology. It is a subjective, illogical, irrational science. The Laws of People are totally negotiable. They are arbitrary.

Systems Science is a combination of Physics and Psychology. A synthesis. A synergy. A greater-than-the-sum-of-the-parts combination.

The Swiss physician Carl Gustav Jung studied psychology – and in 1921 published “Psychological Types”. When this ground-breaking work was translated into English in 1923 it was picked up by Katherine Cook Briggs and made popular by her daughter Isabel. Isabel Briggs married Clarence Myers and in 1942 Isabel Myers learned about the Humm-Wadsworth Scale, a tool for matching people with jobs. So using her knowledge of psychological type differences she set out to develop her own “personality sorting tool”. The first prototype appeared in 1943; in the 1950’s she tested the third iteration and measured the personality types of 5,355 medical students and over 10,000 nurses. The Myers-Briggs Type Indicator was published in 1962 and since then the MBTI® has been widely tested and validated and is the most extensively used personality type instrument. In 1980 Isabel Myers finished writing Gifts Differing just before she died at the age of 82 after a twenty-year-long battle with cancer.

The essence of Jung’s model is that an individual’s temperament is largely innate and the result of a combination of three dimensions:

1. The input or perceiving process (P). The poles are Intuitor (N) or Sensor (S).
2. The decision or judging process (J). The poles are Thinker (T) or Feeler (F).
3. The output or doing process. The poles are Extraversion (E) or Introversion (I).

Each of Jung’s dimensions had two “opposite” poles so when combined they gave eight types. Isabel Myers, as a result of her extensive empirical testing, added a fourth dimension – which gives the four we see in the modern MBTI®. The fourth dimension linked the other three together – it describes if the J or the P process is the one shown to the outside world. So the MBTI® has sixteen broad personality types. In his 1998 book “Please Understand Me II”, David Keirsey put the MBTI® into a historical context and concluded that there are four broad Temperaments – and these have been described since Ancient times.

When Isabel Myers measured different populations using her new tool she discovered a consistent pattern: that the proportions of the sixteen MBTI® types were consistent across a wide range of societies. Personality type is, as Jung had suggested, an innate part of the “human condition”. She also saw that different types clustered in different occupations. Finding the “right job” appeared to be a process of natural selection: certain types fitted certain roles better than others and people self-selected at an early age. If their choice was poor then the person would be unhappy and would not achieve their potential.

Isabel’s work also showed that each type had both strengths and weaknesses – and that people performed better and felt happier when their role played to their temperament strengths. It also revealed that considerable conflict could be attributed to type-mismatch. Polar opposite types have the least psychological “common ground” – so when they attempt to solve a common problem they do so by different routes and using different methods and language. This generates confusion and conflict. This is why Isabel Myers gave her book the title of “Gifts Differing” and her message was that just having awareness of and respect for the innate type differences was a big step towards reducing the confusion and conflict.

So what relevance does this have to change and improvement?

Well it turns out that certain types are much more open to change than others and certain types are much more resistant. If an organisation, by the very nature of its work, attracts the more change resistant types then that organisation will be culturally more viscous to the flow of change. It will exhibit the cultural characteristics of temperament treacle.

The key to understanding Temperament and the MBTI® is to ask a series of questions:

Q1. Does the person have the N or S preference on their perceiving function?

A1=N then Q2: Does the person have a T or F preference on their judging function?
A2=T gives the xNTx combination which is called the Rational or phlegmatic temperament.
A2=F gives the xNFx combination which is called the Idealist or choleric temperament.

A1=S then Q3: Does the person show a J or P preference to the outside world?
A3=J gives the xSxJ combination which is called the Guardian or melancholic temperament.
A3=P gives the xSxP combination which is called the Artisan or sanguine temperament.
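
The sequence of questions above is a small decision tree, and it can be written down as a few lines of code. This is a minimal sketch: the function name `temperament` is hypothetical, and the input is assumed to be a standard four-letter type code such as ISTJ.

```python
def temperament(mbti_type: str) -> str:
    """Map a four-letter MBTI type code to one of the four temperaments."""
    code = mbti_type.strip().upper()
    valid = (
        len(code) == 4
        and code[0] in "EI"
        and code[1] in "NS"
        and code[2] in "TF"
        and code[3] in "JP"
    )
    if not valid:
        raise ValueError(f"not a four-letter MBTI type code: {mbti_type!r}")

    # Q1: N or S preference on the perceiving function?
    if code[1] == "N":
        # Q2: T or F preference on the judging function?
        return "Rational" if code[2] == "T" else "Idealist"
    # Q3: J or P preference shown to the outside world?
    return "Guardian" if code[3] == "J" else "Artisan"

if __name__ == "__main__":
    for example in ("ISTJ", "ENFP", "INTJ", "ESTP"):
        print(example, "->", temperament(example))
```

Note that the first letter (E or I) plays no part in the sorting, and that the third letter only matters for the N types while the fourth only matters for the S types.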

So which is the most change resistant temperament? The answer may not be a big surprise. It is the Guardians. The melancholics. The SJ’s.

Bureaucracies characteristically attract SJ types. The upside is that they ensure stability – the downside is that they prevent agility. Bureaucracies block change.

The NF Idealists are the advocates and the mentors: they love initiating and facilitating transformations with the dream of making the world a better place for everyone. They light the emotional bonfire and upset the apple cart. The NT Rationals are the engineers and the architects. They love designing and building new concepts and things – so once the Idealists have cracked the bureaucratic carapace they can swing into action. The SP Sanguines are the improvisors and expeditors – they love getting the new “concept” designs to actually work in the messy real world.

Unfortunately the grand designs dreamed up by the ‘N’s often do not work in practice – and the scene is set for the we-told-you-so game, and the name-shame-blame game.

So if initiating and facilitating change is the Achilles Heel of the SJ’s then what is their strength?

Let us approach this from a different perspective:

Let us put ourselves in the shoes of patients and ask ourselves: “What do we want from a System of Healthcare and from those who deliver that care – the doctors?”

1. Safe?
2. Reliable?
3. Predictable?
4. Decisive?
5. Dependable?
6. All the above?

These are the strengths of the SJ temperament. So how do doctors measure up?

In a recent observational study, 168 doctors who attended a leadership training course completed their MBTI® self-assessments as part of developing insight into temperament from the perspective of a clinical leader. From the collective data we can answer our question: “Are there more SJ types in the medical profession than we would expect from the general population?”

The table shows the results – 60% of doctors were SJ compared with 35% expected for the general population.

Statistically this is a highly significant difference (p<0.0001). Doctors are different.
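
One way to check a claim like this is a one-sample test of a proportion. The sketch below is illustrative only: the study's actual method is not stated here, the count of 101 SJ doctors is an assumption (60% of 168, rounded), the normal approximation is a simplification, and the function name `one_sample_proportion_test` is hypothetical.

```python
import math

def one_sample_proportion_test(successes, n, expected):
    """Return (z, two-sided p) for an observed proportion against an expected one."""
    observed = successes / n
    # Standard error of the proportion if the expected value were true.
    standard_error = math.sqrt(expected * (1 - expected) / n)
    z = (observed - expected) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided

if __name__ == "__main__":
    # Assumed count: about 60% of the 168 doctors, against 35% expected.
    z, p = one_sample_proportion_test(successes=101, n=168, expected=0.35)
    print("z =", round(z, 2), " p =", p)
```

With these assumed figures the p-value comes out far below 0.0001, which is consistent with the result quoted above.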

It is of enormous practical importance as well.

We are reassured that the majority of doctors have a preference for the very traits that patients want from them. That may explain why the Medical Profession always ranks highest in the league table of “trusted professionals”. We need to be able to trust them – it could literally be a matter of life or death.

The table also shows where the doctors were thin on the ground: in the mediating, improvising, developing, constructing temperaments. The very set of skills needed to initiate and facilitate effective and sustained change.

So when the healthcare system is lurching from one predictable crisis to another – the very people we trust to deliver our health care are, by innate temperament, the least comfortable with changing the system of care itself.

That is a problem. A big problem.

Studies have shown that when we get over-stressed, fearful and start to panic then in a desperate act of survival we tend to resort to the aspects of our temperament that are least well developed. An SJ who is in panic-mode may resort to NP tactics: opinion-led purposeless conceptual discussion and collective decision paralysis. This is called the “headless chicken and rabbit in the headlights” mode. We have all experienced it.

A system that is no longer delivering fit-for-purpose performance because its purpose has shifted requires redesign. The temperament treacle inhibits the flow of change so the crisis is not averted. The crisis happens, invokes panic and triggers ineffective and counter-productive behaviour. The crisis deepens and performance can drop catastrophically when the red tape is cut. It was the only thing holding the system together!

But while the bureaucracy is in disarray then innovation can start to flourish. And the next cycle starts.

It is a painful, slow, wasteful process called “reactionary evolution by natural selection”.

Improvement Science is different. It operates from a “proactive revolution through collective design” that is enjoyable, quick and efficient but it requires mastery of synergistic political science and process science. We do not have that capability – yet.

The table offers some hope. It shows the majority of doctors are xSTJ. They are Logical Guardians. That means that they solve problems using tried-tested-and-trustworthy logic. So they have no problem with the physics. Show them how to diagnose and design processes and they are inside their comfort zone.

Their collective weak spot is managing the politics – the critical cultural dimension of change. Often the result is manipulation rather than motivation. It does not work. The improvement stalls. Cynicism increases. The treacle gets thicker.

System-redesign requires synergistic support, development, improvisation and mediation. These strengths do exist in the medical profession – but they appear to be in short supply – so they need to be identified, and nurtured. And change teams need to assemble and respect the different gifts.

One further point about temperament. It is not immutable. We can all develop a broader set of MBTI® capabilities with guidance and practice – especially the ones that fill the gaps between xSTJ and xNFP. Those whose comfort zone naturally falls nearer the middle of the four dimensions find this easier. And that is one of the goals of Improvement Science training.

And if you are in a hurry then you might start today by identifying the xSFJ “supporters” and the xNFJ “mentors” in your organisation and linking them together to build a temporary bridge over the change culture chasm.

So to find your Temperament just click here to download the Temperament Sorter.

09/11/2013

The Mirror

[Dring Dring]

The phone announced the arrival of Leslie for the weekly ISP mentoring conversation with Bob.

<Leslie> Hi Bob.

<Bob> Hi Leslie. What would you like to talk about today?

<Leslie> A new challenge – one that I have not encountered before.

<Bob> Excellent. As ever you have piqued my curiosity. Tell me more.

<Leslie> OK. Up until very recently whenever I have demonstrated the results of our improvement work to individuals or groups the usual response has been “Yes, but”. The habitual discount as you call it. “Yes, but your service is simpler; Yes, but your budget is bigger; Yes, but your staff are less militant.” I have learned to expect it so I do not get angry any more.

<Bob> OK. The mantra of the skeptics is to be expected and you have learned to stay calm and maintain respect. So what is the new challenge?

<Leslie> There are two parts to it. Firstly, because the habitual discounting is such an effective barrier to diffusion of learning, our system has not changed; the performance is steadily deteriorating; the chaos is worsening and everything that is ‘obvious’ has been tried and has not worked. More red lights are flashing on the patient-harm dashboard and the Inspectors are on their way. There is an increasing turnover of staff at all levels – including Executive. There is an anguished call for “A return to compassion first” and “A search for new leaders” and “A cultural transformation”.

<Bob> OK. It sounds like the tipping point of awareness has been reached: enough people now appreciate that their platform is burning and that a radical change of strategy is required to avoid the ship sinking and them all drowning. What is the second part?

<Leslie> I am getting more emails along the line of “What would you do?”

<Bob> And your reply?

<Leslie> I say that I do not know because I do not have a diagnosis of the cause of the problem. I do know a lot of possible causes but I do not know which plausible ones are the actual ones.

<Bob> That is a good answer. What was the response?

<Leslie> The commonest one is “Yes, but you have shown us that Plan-Do-Study-Act is the way to improve – and we have tried that and it does not work for us. So we think that improvement science is just more snake oil!”

<Bob> Ah ha. And how do you feel about that?

<Leslie> I have learned the hard way to respect the opinion of skeptics. PDSA does work for me but not for them. And I do not understand why that is. I would like to conclude that they are not doing it right but that is just discounting them and I am wary of doing that.

<Bob> OK. You are wise to be wary. We have reached what I call the Mirror-on-the-Wall moment. Let me ask what your understanding of the history of PDSA is?

<Leslie> It was called Plan-Do-Check-Act by Walter Shewhart in the 1930’s and was presented as a form of the scientific method that could be applied on the factory floor to improving the quality of manufactured products. W Edwards Deming modified it to PDSA where the “Check” was changed to “Study”. Since then it has been the key tool in the improvement toolbox.

<Bob> Good. That is an excellent summary. What the Zealots do not talk about are the limitations of their wonder-tool. Perhaps that is because they believe it has no limitations. Your experience would seem to suggest otherwise though.

<Leslie> Spot on Bob. I have a nagging doubt that I am missing something here. And not just me.

<Bob> The reason PDSA works for you is because you are using it for the purpose it was designed for: incremental improvement of small bits of the big system; the steps; the points where the streams cross the stages. You are using your FISH training to come up with change plans that will work because you understand the Physics of Flow better. You make wise improvement decisions. In fact you are using PDSA in two separate modes: discovery mode and delivery mode. In discovery mode we use the Study phase to build our competence – and we learn most when what happens is not what we expected. In delivery mode we use the Study phase to build our confidence – and that grows most when what happens is what we predicted.

<Leslie>Yes, that makes sense. I see the two modes clearly now you have framed it that way – and I see that I am doing both at the same time, almost by second nature.

<Bob>Yes – so when you demonstrate it you describe PDSA generically – not as two complementary but contrasting modes. And by demonstrating success you omit to show that there are some design challenges that cannot be solved with either mode. That hidden gap attracts some of the “Yes, but” reactions.

<Leslie>Do you mean the challenges that others are trying to solve and failing?

<Bob>Yes. The commonest error is to discount the value of improvement science in general; so nothing is done and the inevitable crisis happens because the system design is increasingly unfit for the evolving needs. The toast is not just burned, it is on fire, and it is now too late to use the discovery mode of PDSA because prompt and effective action is needed. So the delivery mode of PDSA is applied to an emergent, ill-understood crisis. The Plan is created using invalid assumptions and guesswork so it is fundamentally flawed and the Do then just makes the chaos worse. In the ensuing panic the Study and Act steps are skipped so all hope of learning is lost and a vicious and damaging spiral of knee-jerk Plan-Do-Plan-Do follows. The chaos worsens, quality falls, safety falls, confidence falls, trust falls, expectation falls and depression and despair increase.

<Leslie>That is exactly what is happening and why I feel powerless to help. What do I do?

<Bob>The toughest bit is past. You have looked squarely in the mirror and can now see harsh reality rather than hasty rhetoric. Now you can look out of the window with different eyes. And you are now looking for a real-world example of where complex problems are solved effectively and efficiently. Can you think of one?

<Leslie>Well medicine is one that jumps to mind. Solving a complex, emergent clinical problem requires a clear diagnosis and prompt and effective action to stabilise the patient and then to cure the underlying cause: the disease.

<Bob>An excellent example. Can you describe what happens as a PDSA sequence?

<Leslie>That is a really interesting question. I can say for starters that it does not start with P – we have learned not to have a preconceived idea of what to do at the start because it badly distorts our clinical judgement. The first thing we do is assess the patient to see how sick and unstable they are – we use the Vital Signs. So that means that we decide to Act first and our first action is to Study the patient.

<Bob>OK – what happens next?

<Leslie>Then we will do whatever is needed to stabilise the patient based on what we have observed – it is called resuscitation – and only then we can plan how we will establish the diagnosis; the root cause of the crisis.

<Bob> So what does that spell?

<Leslie> A-S-D-P. It is the exact opposite of P-D-S-A … the mirror image!

<Bob>Yes. Now consider the treatment that addresses the root cause and that cures the patient. What happens then?

<Leslie>We use the diagnosis to create a treatment Plan for the specific patient; we then Do that, and we Study the effect of the treatment in that specific patient, using our various charts to compare what actually happens with what we predicted would happen. Then we decide what to do next: the final action. We may stop because we have achieved our goal, or repeat the whole cycle to achieve further improvement. So that is our old friend P-D-S-A.

<Bob>Yes. And what links the two bits together … what is the bit in the middle?

<Leslie>Once we have a diagnosis we look up the appropriate treatment options that have been proven to work through research trials and experience; and we tailor the treatment to the specific patient. Oh I see! The missing link is design. We design a specific treatment plan using generic principles.

<Bob>Yup. The design step is the jam in the improvement sandwich and it acts like a mirror: A-S-D-P is reflected back as P-D-S-A

<Leslie>So I need to teach this backwards: P-D-S-A and then Design and then A-S-D-P!

<Bob>Yup – and you know that by another name.

<Leslie> 6M Design®! That is what my Improvement Science Practitioner course is all about.

<Bob> Yup.

<Leslie> If you had told me that at the start it would not have made much sense – it would just have confused me.

<Bob>I know. That is the reason I did not. The Mirror needs to be discovered in order for the true value to be appreciated. At the start we look in the mirror and perceive what we want to see. We have to learn to see what is actually there. Us. Now you can see clearly where P-D-S-A and Design fit together and the missing A-S-D-P component that is needed to assemble a 6M Design® engine. That is Improvement-by-Design in a nine-letter nutshell.

<Leslie> Wow! I can’t wait to share this.

<Bob> And what do you expect the response to be?

<Leslie>“Yes, but”?

<Bob> From the die hard skeptics – yes. It is the ones who do not say “Yes, but” that you want to engage with. The ones who are quiet. It is always the quiet ones that hold the key.

25/10/2013

The Black Curtain

A couple of weeks ago an important event happened. A Masterclass in Demand and Capacity for NHS service managers was run by an internationally renowned and very experienced practitioner of Improvement Science.

The purpose was to assist the service managers to develop their capability for designing quality, flow and cost improvement using tried and tested operations management (OM) theory, techniques and tools.

It was assumed that, as experienced NHS service managers, they already knew the basic principles of OM and the foundation concepts, terminology, techniques and tools.

It was advertised as a Masterclass and designed accordingly.

On the day it was discovered that none of the twenty delegates had heard of two fundamental OM concepts: Little’s Law and Takt Time.

These relate to how processes are designed-to-flow. It was a Demand and Capacity Master Class; not a safety, quality or cost one. The focus was flow.

And it became clear that none of the twenty delegates were aware before the day that there is a well-known and robust science to designing systems to flow.

So learning this fact came as a bit of a shock.

The implications of this observation are profound and worrying:

“if a significant % of senior NHS operational managers are unaware of the foundations of operations management then the NHS may have a problem it was not aware of …”

because …

“if transformational change of the NHS into a stable system that is fit-for-purpose (now and into the future) requires the ability to design processes and systems that deliver both high effectiveness and high efficiency ...”

then …

“it raises the question of whether the current generation of NHS managers are fit-for-this-future-purpose“.

No wonder that discovering a Science of Improvement actually exists came as a bit of a shock!

And saying “Yes, but clinicians do not know this science either!” is a defensive reaction and not a constructive response. They may not but they do not call themselves “operational managers”.

[PS. If you are reading this and are employed by the NHS and do not know what Little’s Law and Takt Time are then it would be worth finding out first. Wikipedia is a good place to start].

And now we have another question:

“Given there are thousands of operational managers in the NHS; what does one sample of 20 managers tell us about the whole population?”

Now that is a good question.

It is also a question of statistics. More specifically quite advanced statistics.

And most people who work in the NHS have not studied statistics to that level. So now we have another do-not-know-how problem.

But it is still an important question that we need to understand the answer to – so we need to learn how and that means taking this learning path one step at a time using what we do know, rather than what we do not.

Step 1:

What do we know? We have one sample of 20 NHS service managers. We know something about our sample because our unintended experiment has measured it: that none of them had heard of Little’s Law or Takt Time. That is 0/20 or 0%.

This is called a “sample statistic“.

What we want to know is “What does this information tell us about the proportion of the whole population of all NHS managers who do have this foundation OM knowledge?”

This proportion of interest is called the unknown “population parameter“.

And we need to estimate this population parameter from our sample statistic because it is impractical to measure a population parameter directly: That would require every NHS manager completing an independent and accurate assessment of their basic OM knowledge. Which seems unlikely to happen.

The good news is that we can get an estimate of a population parameter from measurements made from small samples of that population. That is one purpose of statistics.

Step 2:

But we need to check some assumptions before we attempt this statistical estimation trick.

Q1: How representative is our small sample of the whole population?

If we chose the delegates for the masterclass by putting the names of all NHS managers in a hat and drawing twenty names out at random, as in a tombola or lottery, then we have what is called a “random sample” and we can trust our estimate of the wanted population parameter. This is called “random sampling”.

That was not the case here. Our sample was self-selecting. We were not conducting a research study. This was the real world … so there is a chance of “bias”. Our sample may not be representative and we cannot say what the most likely bias is.

It is possible that the managers who selected themselves were the ones struggling most and therefore more likely than average to have a gap in their foundation OM knowledge. It is also possible that the managers who selected themselves are the most capable in their generation and are very well aware that there is something else that they need to know.

We may have a biased sample and we need to proceed with some caution.

Step 3:

So given the fact that none of our possibly biased sample of managers were aware of the Foundation OM Knowledge then it is possible that no NHS service managers know this core knowledge. In other words the actual population parameter is 0%. It is also possible that the managers in our sample were the only ones in the NHS who do not know this. So, in theory, the sought-for population parameter could be anywhere between 0% and very nearly 100%. Does that mean it is impossible to estimate the true value?

It is not impossible. In fact we can get an estimate that we can be very confident is accurate. Here is how it is done.

Statistical estimates of population parameters are always presented as ranges with a lower and an upper limit called a “confidence interval” because the sample is not the population. And even if we have an unbiased random sample we can never be 100% confident of our estimate. The only way to be 100% confident is to measure the whole population. And that is not practical.

So, we know the theoretical limits from consideration of the extreme cases … but what happens when we are more real-world-reasonable and say – “let us assume our sample is actually a representative sample, albeit not a randomly selected one”. How does that affect the range of our estimate of the elusive number – the proportion of NHS service managers who know basic operations management theory?

Step 4:

To answer that we need to consider two further questions:

Q2. What is the effect of the size of the sample? What if only 5 managers had come and none of them knew; what if it had been 50 or 500 and none of them knew?

Q3. What if we repeated the experiment more times? With the same or different sample sizes? What could we learn from that?

Our intuition tells us that the larger the sample size and the more often we do the experiment then the more confident we will be of the result. In other words, the narrower the range of the confidence interval around our sample statistic.

Our intuition is correct because if our sample was 100% of the population we could be 100% confident.

So given we have not yet found an NHS service manager who has the OM Knowledge then we cannot exclude 0%. Our challenge narrows to finding a reasonable estimate of the upper limit of our confidence interval.

Step 5

Before we move on let us review where we have got to already and our purpose for starting this conversation: We want enough NHS service managers who are knowledgeable enough of design-for-flow methods to catalyse a transition to a fit-for-purpose and self-sustaining NHS.

One path to this purpose is to have a large enough pool of service managers who do understand this Science well enough to act as advocates and to spread both the know-of and the know-how. This is called the “tipping point“.

There is strong evidence that when about 20% of a population knows about something that is useful for the whole population – then that knowledge will start to spread through the grapevine. Deeper understanding will follow. Wiser decisions will emerge. More effective actions will be taken. The system will start to self-transform.

And in the Brave New World of social media this message may spread further and faster than in the past. This is good.

So if the NHS needs 20% of its operational managers aware of the Foundations of Operations Management then what value is our morsel of data from one sample of 20 managers who, by chance, were all unaware of the Knowledge? How can we use that data to say how close to the magic 20% tipping point we are?

Step 6:

To do that we need to ask the question in a slightly different way.

Q4. What is the chance of an NHS manager NOT knowing?

We assume that they either know or do not know; so if 20% know then 80% do not.

This is just like saying: if the chance of rolling a “six” is 1-in-6 then the chance of rolling a “not-a-six” is 5-in-6.

Next we ask:

Q5. What is the likelihood that we, just by chance, selected a group of managers where none of them know – and there are 20 in the group?

This is rather like asking: what is the likelihood of rolling twenty “not-a-sixes” in a row?

Our intuition says “an unlikely thing to happen!”

And again our intuition is sort of correct. How unlikely though? Our intuition is a bit vague on that.

If the actual proportion of NHS managers who have the OM Knowledge is about the same as the chance of rolling a six (about 17%) then we sense that the likelihood of getting a random sample of 20 where not one knows is small. But how small? Exactly?

We sense that 20% is too high an estimate of a reasonable upper limit. But how much too high?

The answer to these questions is not intuitively obvious.

We need to work it out logically and rationally. And to work this out we need to ask:

Q6. As the % of Managers-who-Know is reduced from 20% towards 0% – what is the effect on the chance of randomly selecting 20 all of whom are not in the Know? We need to be able to see a picture of that relationship in our minds.

The good news is that we can work that out with a bit of O-level maths. And all NHS service managers, nurses and doctors have done O-level maths. It is a mandatory requirement.

The chance of rolling a “not-a-six” is 5/6 on one throw – about 83%;
and the chance of rolling only “not-a-sixes” in two throws is 5/6 x 5/6 = 25/36 – about 69%
and the chance of rolling only “not-a-sixes” in three throws is 5/6 x 5/6 x 5/6 – about 58%… and so on.

[This is the multiplication rule for independent events, sometimes loosely called the “chain rule”, and it requires that the throws are independent of each other – i.e. a random, unbiased sample]

If we do this 20 times we find that the chance of rolling no sixes at all in 20 throws is about 2.6% – unlikely but far from impossible.
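The dice arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `chance_no_six` is ours, chosen for illustration, and it assumes a fair die and independent throws.

```python
from fractions import Fraction


def chance_no_six(throws: int) -> Fraction:
    """Chance of rolling no sixes in `throws` independent throws of a fair die."""
    # Each throw misses the six with chance 5/6; independent throws multiply.
    return Fraction(5, 6) ** throws


for n in (1, 2, 3, 20):
    print(f"{n:>2} throws: {float(chance_no_six(n)):.1%}")
```

Using exact fractions keeps the 25/36 step visible before it is rounded to a percentage.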

We need to introduce a bit of O-level algebra now.

Let us call the proportion of NHS service managers who understand basic OM, our unknown population parameter something like “p”.

So if p is the chance of a “six” then (1-p) is the chance of a “not-a-six”.

Then the chance of no sixes in one throw is (1-p)

and no sixes after 2 throws is (1-p)(1-p) = (1-p)^2 (where ^ means raise to the power)

and no sixes after three throws is (1-p)(1-p)(1-p) = (1-p)^3 and so on.

So the likelihood of “no sixes in n throws” is (1-p)^n

Let us call this “t”

So the equation we need to solve to estimate the upper limit of our estimate of “p” is

t=(1-p)^20

Where “t” is a measure of how likely we are to choose 20 managers all of whom do not know – just by chance. And we want that to be a small number. We want to feel confident that our estimate is reasonable and not just a quirk of chance.

So what threshold do we set for “t” that we feel is “reasonable”? 1 in a million? 1 in 1000? 1 in 100? 1 in 10?

By convention we use 1 in 20 (t=0.05) – but that is arbitrary. If we are more risk-averse we might choose 1:100 or 1:1000. It depends on the context.

Let us be reasonable – let us say we want to be 95% confident of our estimated upper limit for “p” – which means we are calculating the 95% confidence interval. This means that we will accept a 1:20 risk of our calculated confidence interval for “p” being wrong: odds of 19:1 that the true value of “p” falls inside our calculated range. Pretty good odds! We will be reasonable and we will set the likelihood threshold for being “wrong” at 5%.

So now we need to solve:

0.05= (1-p)^20

And we want a picture of this relationship in our minds so let us draw a graph of t for a range of values of p.

We know the value of p must be between 0 and 1.0 so we have all we need and we can generate this graph easily using Excel. And every senior NHS operational manager knows how to use Excel. It is a requirement. Isn’t it?

The Excel-generated chart shows the relationship between p (horizontal axis) and t (vertical axis) using our equation:

t=(1-p)^20.
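For readers without Excel to hand, the same curve can be tabulated in a few lines of Python. This is a minimal sketch, not the original chart: the function name `chance_none_know` and the crude text bars are our own illustration, and it assumes a random sample of 20.

```python
def chance_none_know(p: float, n: int = 20) -> float:
    """t = (1-p)^n: chance that nobody in a random sample of n knows,
    when a proportion p of the whole population knows."""
    return (1.0 - p) ** n


# Tabulate t for p from 0% to 30% in steps of 2%, with a crude text bar
# (one '#' per 2 percentage points of t) standing in for the curve.
for percent in range(0, 32, 2):
    t = chance_none_know(percent / 100)
    bar = "#" * round(t * 50)
    print(f"p = {percent:>2}%   t = {t:6.1%}   {bar}")
```

The bars shrink quickly at first and then flatten out, which is the curved shape the chart shows.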

Step 7

Let us first do a “sanity check” on what we have drawn. Let us “check the extreme values”.

If 0% of managers know then a sample of 20 will always reveal none – i.e. the leftmost point of the chart. Check!

If 100% of managers know then a sample of 20 will never reveal none – i.e. way off to the right. Check!

What is clear from the chart is that the relationship between p and t is not a straight line; it is non-linear. That explains why we find it difficult to estimate intuitively. Our brains are not very good at doing non-linear analysis. Not very good at all.

So we need a tool to help us. Our Excel graph. We read down the vertical “t” axis from 100% to the 5% point, then trace across to the right until we hit the line we have drawn, then read down to the corresponding value for “p”. It says about 14%.

So that is the upper limit of our 95% confidence interval of the estimate of the true proportion of NHS service managers who know the Foundations of Operations Management. The lower limit is 0%.

And we cannot say better than somewhere between 0%-14% with the data we have and the assumptions we have made.
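The value read from the chart can be cross-checked with a little algebra, by rearranging the same equation to make p the subject. This is our own working under the same assumptions (n = 20 in the sample, threshold t = 0.05):

```latex
\begin{align*}
t &= (1-p)^{n} \\
t^{1/n} &= 1-p \\
p &= 1 - t^{1/n} \\
p &= 1 - 0.05^{1/20} \approx 0.139
\end{align*}
```

That is just under 14%, which agrees with the chart.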

To get a more precise estimate, a narrower 95% confidence interval, we need to gather some more data.
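How much would more data help? The sketch below picks up the sample sizes raised in Q2 and asks what the upper limit would be if nobody in the sample knew. It is illustrative only: the function name `upper_limit` is ours, and it assumes a representative sample and the same 5% threshold.

```python
def upper_limit(n: int, t: float = 0.05) -> float:
    """Upper limit for p when none of a sample of n knows.

    Rearranges t = (1-p)^n to give p = 1 - t^(1/n).
    """
    return 1.0 - t ** (1.0 / n)


for n in (5, 20, 50, 500):
    print(f"sample of {n:>3}, none know: p is between 0% and {upper_limit(n):.1%}")
```

The larger the sample in which nobody knows, the lower the upper limit and the narrower the confidence interval.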

[Another way we can use our chart is to ask “If the actual % of Managers who know is x% then what is the chance that no one in our sample of 20 will know?” Solving this manually means marking the x% point on the horizontal axis then tracing a line vertically up until it crosses the drawn line then tracing a horizontal line to the left until it crosses the vertical axis and reading off the likelihood.]

So if in reality 5% of all managers do Know then the chance of no one knowing in an unbiased sample of 20 is about 36% – really quite likely.

Now we are getting a feel for the likely reality. Much more useful than just dry numbers!

But we are 95% sure that at least 86% of NHS managers do NOT know the basic language of flow-improvement-science.

And what this chart also tells us is that we can be VERY confident that the true value of p is less than 20% – the proportion we believe we need to get to transformation tipping point.
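The readings taken from the chart can also be checked directly from the equation. This is a minimal sketch with an illustrative function name of our own, assuming an unbiased random sample of 20:

```python
def chance_none_know(p: float, n: int = 20) -> float:
    """Chance that nobody in a random sample of n knows, if a proportion p know."""
    return (1.0 - p) ** n


# The three proportions discussed in the text: 5%, the 14% upper limit,
# and the 20% tipping point.
for percent in (5, 14, 20):
    t = chance_none_know(percent / 100)
    print(f"If {percent}% know: chance that none of 20 know = {t:.1%}")
```

If 20% of managers really did know, a sample like ours would turn up only a little more than 1 time in 100, which is why we can be very confident that the true value is below the tipping point.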

Now we need to repeat the experiment and draw a new graph to get a more accurate estimate of just how much less – but stepping back from the statistical nuances – the message is already clear that we do have a Black Curtain problem.

A Black Curtain of Ignorance problem.

Many will now proclaim angrily “This cannot be true! It is just statistical smoke and mirrors. Surely our managers do know this by a different name – how could they not! It is unthinkable to suggest the majority of NHS managers are ignorant of the basic science of what they are employed to do!”

If that were the case though then we would already have an NHS that is fit-for-purpose. That is not what reality is telling us.

And it quickly became apparent at the master class that our sample of 20 did not know-this-by-a-different-name.

The good news is that this knowledge gap could be hiding the opportunity we are all looking for – a door to a path that leads to a radical yet achievable transformation of the NHS into a system that is fit-for-purpose. Now and into the future.

A system that delivers safe, high quality care for those who need it, in full, when they need it and at a cost the country can afford. Now and for the foreseeable future.

And the really good news is that this IS knowledge gap may be extensive but it is not deep … the Foundations are easy to learn, and to start applying immediately. The basics can be learned in less than a week – the more advanced skills take a bit longer. And this is not untested academic theory – it is proven pragmatic real-world problem solving know-how. It has been known for over 50 years outside healthcare.

Our goal is not acquisition of theoretical knowledge – it is a deep enough understanding to make wise enough decisions to achieve good enough outcomes. For everyone. Starting tomorrow.

And that is the design purpose of FISH. To provide those who want to learn a quick and easy way to do so.

Stop Press: Further feedback from the masterclass is that some of the managers are grasping the nettle, drawing back their own black curtains, opening the door that was always there behind it, and taking a peek through into a magical garden of opportunity. One that was always there but was hidden from view.

25/08/2013

Find and Fill

Many barriers to improvement are invisible.

This is because they are caused by what is not present rather than what is. They are gaps or omissions.

Some gaps are blindingly obvious. This is because we expect to see something there so we notice when it is missing. We would notice the gap if a rope bridge across a chasm were missing because only the end posts would be visible.

Many gaps are not obvious. This is because we have no experience or expectation. The gap is invisible. We are blind to the omission.

These are the gaps that we accidentally stumble into. Such as a gap in our knowledge and understanding that we cannot see. These are the gaps that create the fear of failure. And the fear is especially real because the gap is invisible and we only know when it is too late.

It is like walking across an emotional minefield. At any moment we could step on an ignorance mine and our confidence would be blasted into fragments.

So our natural and reasonable reaction is to stay outside the emotional minefield and inside our comfort zones – where we feel safe. We give up trying to learn and trying to improve. Every-one hopes that Some-one or Any-one will do it for us. No-one does.

The path to Improvement is always across an emotional minefield because improvement implies unlearning. So we need a better design than blundering about hoping not to fall into an invisible gap. We need a safer design.

There are a number of options:

Option 1. Ask someone who knows the way across the minefield and can demonstrate it. Someone who knows where the mines are and knows how to avoid them. Someone to tell us where to step and where not to.

Option 2. Clear a new path and mark it clearly so others can trust that it is safe. Remove the ignorance mines. Find and Fill the knowledge map.

Option 1 is quicker but it leaves the ignorance mines in place. So sooner or later someone will step on one. Boom!

We need to be able to do Option 2.

The obvious strategy for Option 2 is to clear the ignorance mines. We could do this by deliberately blundering about setting off the mines. We could adopt the burn-and-scrape or learn-from-mistakes approach.

Or we could detect, defuse and remove them.

The former requires people willing to take emotional risks; the latter does not require such a sacrifice.

And “learn-by-mistakes” only works if people are able to make mistakes visibly so everyone can learn. In an adversarial, competitive, distrustful context this cannot happen: and the result is usually for the unwilling troops to be forced into the minefield with the threat of a firing-squad if they do not!

And where a mistake implies irreversible harm it is not acceptable to learn that way. Mistakes are covered up. The ignorance mines are re-set for the next hapless victim to step on. The emotional carnage continues. Any chance of sustained, system-wide improvement is blocked.

So in a low-trust cultural context the detect-defuse-and-remove strategy is the safer option.

And this requires a proactive approach to finding the gaps in understanding; a proactive approach to filling the knowledge holes; and a proactive approach to sharing what was learned.

Or we could ask someone who knows where the ignorance mines are and work our way through finding and filling our knowledge gaps. By that means any of us can build a safe, effective and efficient path to sustainable improvement.

And the person to ask is someone who can demonstrate a portfolio of improvement in practice – an experienced Improvement Science Practitioner.

And we can all learn to become an ISP and then guide others across their own emotional minefields.

All we need to do is take the first step on a well-trodden path to sustained improvement.

10/08/2013

Taming the Wicked Bull and the OH Effect

“Take the bull by the horns” is a phrase that is often heard in Improvement circles.

The metaphor implies that the system – the bull – is an unpredictable, aggressive, wicked, wild animal with dangerous sharp horns.

“Unpredictable” and “Dangerous” is certainly what the newspapers tell us the NHS system is – and this generates fear. Fear-for-our-safety and fear drives us to avoid the bad tempered beast.

It creates fear in the hearts of the very people the NHS is there to serve – the public. It is not the intended outcome.

“Bullish” is a phrase we use for “aggressive behaviour” and it is disappointing to see those accountable behave in a bullish manner – aggressive, unpredictable and dangerous.

We are taught that bulls are to be avoided and we are told not to wave red flags at them! For our own safety.

But that is exactly what must happen for Improvement to flourish. We all need regular glimpses of the Red Flag of Reality. It is called constructive feedback – but it still feels uncomfortable. Our natural tendency to being shocked out of our complacency is to get angry and to swat the red flag waver. And the more powerful we are, the sharper our horns are, the more swatting we can do and the more fear we can generate. Often intentionally.

So inexperienced improvement zealots are prodded into “taking the executive bull by the horns” – but it is poor advice.

Improvement Scientists are not bull-fighters. They are not fearless champions who put themselves at personal risk for personal glory and the entertainment of others. That is what Rescuers do. The fire-fighters; the quick-fixers; the burned-toast-scrapers; the progress-chasers; and the self-appointed-experts. And they all get gored by an angry bull sooner or later. Which is what the crowd came to see – Bull Fighter Blood and Guts!

So attempting to slay the wicked bullish system is not a realistic option.

What about taming it?

This is the game of Bucking Bronco. You attach yourself to the bronco like glue and wear it down as it tries to throw you off and trample you under hoof. You need strength, agility, resilience and persistence. All admirable qualities. Eventually the exhausted beast gives in and does what it is told. It is now tamed. You have broken its spirit. The stallion is no longer a passionate leader; it is just a passive follower. It has become a Victim.

Improvement requires spirit – lots of it.

Improvement requires the spirit-of-courage to challenge dogma and complacency.
Improvement requires the spirit-of-curiosity to seek out the unknown unknowns.
Improvement requires the spirit-of-bravery to take calculated risks.
Improvement requires the spirit-of-action to make the changes needed to deliver the improvements.
Improvement requires the spirit-of-generosity to share new knowledge, understanding and wisdom.

So taming the wicked bull is not going to deliver sustained improvement. It will only achieve stable mediocrity.

So what next?

What about asking someone who has actually done it – actually improved something?

Good idea! Who?

What about someone like Don Berwick – founder of the Institute for Healthcare Improvement in the USA?

Excellent idea! We will ask him to come and diagnose the disease in our system – the one that led to the Mid-Staffordshire septic safety carbuncle, and the nasty quality rash in 14 Trusts that Professor Sir Bruce Keogh KBE uncovered when he lifted the bed sheet.

[Click HERE to see Dr Bruce’s investigation].

We need a second opinion because the disease goes much deeper – and we need it from a credible, affable, independent, experienced expert. Like Dr Don B.

So Dr Don has popped over the pond, examined the patient, formulated his diagnosis and delivered his prescription.

[Click HERE to read Dr Don’s prescription].

Of course if you ask two experts the same question you get two slightly different answers. If you ask ten you get ten. This is because if there was only one answer that everyone agreed on then there would be no problem, no confusion, and no need for experts. The experts know this of course. It is not in their interest to agree completely.

One bit of good news is that the reports are getting shorter. Mr Robert’s report on the failing of one hospital is huge and has 290 recommendations. A bit of a bucketful. Dr Bruce’s report is specific to the Naughty Fourteen who have strayed outside the statistical white lines of acceptable mediocrity.

Dr Don’s is even shorter and it has just 10 recommendations. One for each finger – so easy to remember.

1. The NHS should continually and forever reduce patient harm by embracing wholeheartedly an ethic of learning.

2. All leaders concerned with NHS healthcare – political, regulatory, governance, executive, clinical and advocacy – should place quality of care in general, and patient safety in particular, at the top of their priorities for investment, inquiry, improvement, regular reporting, encouragement and support.

3. Patients and their carers should be present, powerful and involved at all levels of healthcare organisations from wards to the boards of Trusts.

4. Government, Health Education England and NHS England should assure that sufficient staff are available to meet the NHS’s needs now and in the future. Healthcare organisations should ensure that staff are present in appropriate numbers to provide safe care at all times and are well-supported.

5. Mastery of quality and patient safety sciences and practices should be part of initial preparation and lifelong education of all health care professionals, including managers and executives.

6. The NHS should become a learning organisation. Its leaders should create and support the capability for learning, and therefore change, at scale, within the NHS.

7. Transparency should be complete, timely and unequivocal. All data on quality and safety, whether assembled by government, organisations, or professional societies, should be shared in a timely fashion with all parties who want it, including, in accessible form, with the public.

8. All organisations should seek out the patient and carer voice as an essential asset in monitoring the safety and quality of care.

9. Supervisory and regulatory systems should be simple and clear. They should avoid diffusion of responsibility. They should be respectful of the goodwill and sound intention of the vast majority of staff. All incentives should point in the same direction.

10. We support responsive regulation of organisations, with a hierarchy of responses. Recourse to criminal sanctions should be extremely rare, and should function primarily as a deterrent to wilful or reckless neglect or mistreatment.

The meat in the sandwich is recommendations 5 and 6, which together say “Learn Improvement Science”.

And what happens when we commit and engage in that learning journey?

Steve Peak has described what happens in this very blog. It is called the OH effect.

OH stands for “Obvious-in-Hindsight”.

Obvious means “understandable” which implies visible, sensible, rational, doable and teachable.

Hindsight means “reflection” which implies having done something and learning from reality.

So if you would like to have a sip of Dr Don’s medicine and want to get started on the path to helping to create a healthier healthcare system you can do so right now by learning how to FISH – the first step to becoming an Improvement Science Practitioner.

The good news is that this medicine is neither dangerous nor nasty tasting – it is actually fun!

And that means it is OK for everyone – clinicians, managers, patients, carers and politicians. All of us.