Second Wave

The summer holidays are over and schools are open again – sort of.

Restaurants, pubs and nightclubs are open again – sort of.

Gyms and leisure facilities are open again – sort of.

And after two months of gradual easing of social restrictions and massive expansion of test-and-trace we now have the spectre of a Second Wave looming.  It has happened in Australia, Italy, Spain and France so it can happen here.

As usual, the UK media are hyping up the general hysteria and we now also have rioting disbelievers claiming it is all a conspiracy and that re-applying local restrictions is an infringement of their liberty.

So, what is all the fuss about?

We need to side-step the gossip and get some hard data from a reliable source (i.e. not a newspaper). Here is what worldometer is sharing …

OMG!  It looks like The Second Wave is here already!  There are already as many cases now as in March and we still have the mantra “Stay At Home – Protect the NHS – Save Lives” ringing in our ears.  But something is not quite right.  No one is shouting that hospitals are bursting at the seams.  No one is reporting that the mortuaries are filling up.  Something is different.  We need more data.

That is odd!  We can clearly see that cases and deaths went hand-in-hand in the First Wave with about 1:5 cases not making it.  But this time the deaths are not rising with the cases.

Ah ha!  Maybe that is because the virus has mutated into something much more benign and because we have got much better at diagnosing and treating this illness – the ventilators and steroids saved the day.  Hurrah!  It’s all a big fuss about nothing … we should still be able to have friends round for a party and go on pub crawls!

But … what if there was a different explanation for the patterns on the charts above?

It is said that “data without context is meaningless” … and I’d go further than that … data without context is dangerous because if it leads to invalid conclusions and inappropriate decisions we can get well-intended actions that cause unintended harm.  People might die.

So, we need to check the context of the data.

In the First Wave the availability of the antigen (swab) test was limited so it was only available to hospitals and the “daily new cases” were in patients admitted to hospital – the ones with severe enough symptoms to get through the NHS 111 telephone triage.  Most people with symptoms, even really bad ones, stayed at home to protect the NHS.  They didn’t appear in the statistics.

But did the collective sacrifice of our social lives actually save lives?

The original estimates of the plausible death toll in the UK ranged up to 500,000 from coronavirus alone (and no one knows how many more from the collateral effects of an overwhelmed NHS).  The COVID-19 body count to date is just under 50,000 so, putting a positive spin on that tragic statistic, 90% of the potential deaths were prevented.  The lock down worked.  The NHS did not collapse.  The Nightingale Hospitals stood ready and idle – an expensive insurance policy.  Lives were saved.

Why isn’t that being talked about?

And the context changed in another important way.  The antigen testing capacity was scaled up despite being mired in confusing jargon.  Who thought up the idea of calling them “pillars”?

Anyway, if we dig about on the GOV.UK website enough there is a definition:

So, Pillar 1 = NHS testing capacity and Pillar 2 = commercial testing capacity, and we don’t actually know how much was in-hospital testing and how much was in-community testing because the definitions seem to reflect budgets rather than patients.  Ever has it been thus in the NHS!

However, we can see from the chart below that testing activity (blue bars) has increased many-fold but the two testing streams (in hospital and outside hospital) are combined in one chart.  Well, it is one big pot of tax-payers’ cash after all and it is the same test.

To unravel this a bit we have to dig about on the website, download the raw data, and plot it ourselves.  Looking at Pillar 2 (commercial) we can see they had a late start, caught the tail of the First Wave, and then ramped up activity as the population testing caught up with the available capacity (because hospital activity has been falling since late April).

Now we can see that the increased number of positive tests could be explained by the fact that we are now testing anyone with possible COVID-19 symptoms who steps up – mainly in the community.  And we were unable to do this before because the testing capacity did not exist.

The important message is that in the First Wave we were not measuring what was happening in the community – it was happening though – it must have been.  We measured the effects: hospital admissions with positive tests and deaths after positive tests.

So, to present the daily positive tests as one time-series chart that conflates both ‘pillars’ is both meaningless and dangerous and it is no surprise that the public are confused.


This raises a question: “Can we estimate how many people there would have been in the community in the First Wave so that we can get a sense of what the rising positive test rate means now?”

The way that epidemiologists do this is to build a generic simulation model of the system dynamics of an epidemic (a SEIR compartment model) and then use the measured data to calibrate this model so that it can then be used for specific prediction and planning.

Here is an example of the output of a calibrated multi-compartment system dynamics model of the UK COVID-19 epidemic for a nominal 1.3 million population.  The compartments that are included are Susceptible, Exposed, Infectious, and Recovered (i.e. not infectious) and this model also simulates the severity of the illness i.e. Severe (in hospital), Critical (in ITU) and Died.
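To make the structure of such a model concrete, here is a minimal sketch of the four core compartments in Python.  It is not the calibrated model described in this post: the function name simulate_seir and the parameter values (beta, sigma, gamma and ten initial infections) are illustrative assumptions, and the Severe, Critical and Died compartments are left out for brevity.

```python
def simulate_seir(population, beta, days, sigma=1 / 5, gamma=1 / 7,
                  initial_infectious=10.0, steps_per_day=10):
    """Return a list of daily (S, E, I, R) tuples using simple Euler steps.

    beta  = infectious contacts per person per day (assumed value)
    sigma = 1 / incubation period in days (assumed value)
    gamma = 1 / infectious period in days (assumed value)
    """
    dt = 1.0 / steps_per_day
    s, e, i, r = population - initial_infectious, 0.0, initial_infectious, 0.0
    history = []
    for _ in range(days):
        history.append((s, e, i, r))
        for _ in range(steps_per_day):
            newly_exposed = beta * s * i / population * dt
            newly_infectious = sigma * e * dt
            newly_recovered = gamma * i * dt
            s -= newly_exposed
            e += newly_exposed - newly_infectious
            i += newly_infectious - newly_recovered
            r += newly_recovered
    return history


# Nominal 1.3 million population, as in the text; beta = 0.5 is an assumption.
history = simulate_seir(population=1_300_000, beta=0.5, days=300)
peak_day, peak = max(enumerate(history), key=lambda item: item[1][2])
print(f"Peak infectious: {peak[2]:,.0f} on day {peak_day}")
```

Calibration then means adjusting parameters like these until the simulated curves match the reported admissions and deaths.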

The difference in size of the various compartments is so great that the graph below requires two scales – the solid line (Infectious) is plotted on the left hand scale and the others are plotted on the right hand scale which is 10 times smaller.  The green line is today and the reported data up to that point can be used to calibrate the model and to estimate the historical metrics that we did not measure – such as how many people in the community were infectious (and would have tested positive).

At the peak of the First Wave, for this population of 1.3 million, the model estimates there were about 800 patients in hospital (which there were) and 24,000 patients in the community who would have tested positive if we had been able to test them.  24,000/800 = 30 which means the peak of the grey line is 30 x higher than the peak of the orange line – hence the need for the two Y-axes with a 10-fold difference in scale.

Note the very rapid rise in the number of infectious people from the beginning of March when the first UK death was announced, before the global pandemic was declared and before the UK lock down was enacted in law and implemented.  Coronavirus was already spreading very rapidly.

Note how this rapid rise in the number of infectious people came to an abrupt halt when the UK lock down was put into place in the third week of March.  Social distancing breaks the chain of transmission from one infectious person to many other susceptible ones.

Note how the peaks of hospital admissions, critical care admissions and deaths lag after the rise in infectious people (because it takes time for the coronavirus to do its damage) and how each peak is smaller (because only about 1:30 get sick enough to need admission, and about 1:5 of hospital admissions do not survive).

Note how the fall in the infectious group was more gradual than the rise (because the lock down was partial and because not everyone could stay at home – essential services like the NHS had to continue).


So, by early July it was possible to start a gradual relaxation of the lock down and from then we can see a gradual rise in infectious people again.  But now we were measuring them because of the growing capacity to perform antigen tests in the community.  The relatively low level and the relatively slow rise are much less dramatic than what was happening in March (because of the higher awareness and the continued social distancing and use of face coverings).  It is easy to become impatient and complacent.

But by early September it was clear that the number of infectious people was growing faster in the community – and then we saw the fall in hospital admissions reach a minimum and start to rise again.  And the number of deaths reached a minimum and started to rise again.  And this evidence proves that the current level of social distancing is not enough to keep a lid on this disease.  We are in the foothills of a Second Wave.


So what do we do next?

First, we must estimate the effect that the current social distancing policies are having and one way to do that would be to stop doing them and see what happens.  Clearly that is not an ethical experiment to perform given what we already know.  But, we can simulate that experiment using our calibrated model.  Here is what is predicted to happen if we went back to the pre-lockdown behaviours: There would be a very rapid spread of the virus followed by a Second Wave that would be many times bigger than the first!!  Then it would burn itself out and those who had survived could go back to some semblance of normality.  The human sacrifice would be considerable.
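The kind of what-if experiment described above can be sketched with a toy SEIR model in which the contact rate changes over time.  This is an illustration under stated assumptions, not the calibrated model: the contact rates, the lock down day and the release day are all hypothetical values chosen only to show the shape of the comparison.

```python
def infectious_curve(beta_on_day, days, population=1_300_000,
                     sigma=1 / 5, gamma=1 / 7, steps_per_day=10):
    """Daily number of infectious people for a time-varying contact rate."""
    dt = 1.0 / steps_per_day
    s, e, i = population - 10.0, 0.0, 10.0
    curve = []
    for day in range(days):
        curve.append(i)
        beta = beta_on_day(day)
        for _ in range(steps_per_day):
            newly_exposed = beta * s * i / population * dt
            newly_infectious = sigma * e * dt
            s -= newly_exposed
            e += newly_exposed - newly_infectious
            i += newly_infectious - gamma * i * dt
    return curve


LOCKDOWN_DAY, RELEASE_DAY = 20, 120   # hypothetical dates
NORMAL, DISTANCED = 0.5, 0.1          # hypothetical contact rates


def keep_distancing(day):
    """Normal behaviour until the lock down, distanced ever after."""
    return NORMAL if day < LOCKDOWN_DAY else DISTANCED


def drop_distancing(day):
    """Same lock down, but pre-lockdown behaviour returns on RELEASE_DAY."""
    return DISTANCED if LOCKDOWN_DAY <= day < RELEASE_DAY else NORMAL


kept = infectious_curve(keep_distancing, days=400)
dropped = infectious_curve(drop_distancing, days=400)
first_wave_peak = max(dropped[:RELEASE_DAY])
second_wave_peak = max(dropped[RELEASE_DAY:])
print(f"First wave peak:  {first_wave_peak:,.0f}")
print(f"Second wave peak: {second_wave_peak:,.0f}")
print(f"Peak after release day if distancing is kept: {max(kept[RELEASE_DAY:]):,.0f}")
```

In this sketch the first wave is cut short while almost everyone is still susceptible, which is why removing the brake later allows a far larger wave.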

So, despite the problems that the current social distancing is causing, they pale into insignificance compared to what could happen if they were dropped.

The previous model shows what is predicted would happen if we continue as we are with no further easing of restrictions and assuming people stick to them.  In short, we will have COVID-for-Christmas and it could be a very nasty business indeed as it would come at the same time as other winter-associated infectious diseases such as influenza and norovirus.

The next chart shows what could happen if we squeeze the social distancing brake a bit by focusing only on the behaviours that the track-and-trace-and-test system is highlighting as the key drivers of the growth in infections, admissions and deaths.

What we see is an arrest of the rise of the number of infectious people (as we saw before), a small and not sustained increase in hospital admissions, then a slow decline back to the levels that were achieved in early July – and at which point it would be reasonable to have a more normal Christmas.

And another potential benefit of a bit more social distancing might be a much less problematic annual flu epidemic because that virus would also find it harder to spread – plus we have a flu vaccination which we can use to reduce that risk further.


It is not going to be easy.  We will have to sacrifice a bit of face-to-face social life for a bit longer.  We will have to measure, monitor, model and tweak the plan as we go.

And one thing we can do immediately is to share the available information in a more informative and less histrionic way than we are seeing at the moment.


Update: Sunday 1st November 2020

Yesterday the Government had to concede that the policy of regional restrictions had failed and that bluffing it out and ignoring the scientific advice was, with the clarity of hindsight, an unwise decision.

In the face of the hard evidence of rapidly rising Covid+ve hospital admissions and deaths, the decision to re-impose a national 4-week lock down was announced.  This is the only realistic option to prevent overwhelming the NHS at a time of year when it struggles with seasonal influenza causing a peak of admissions and deaths.

Paradoxically, this year the effect of influenza may be less because social distancing will reduce the spread of that as well and also because there is a vaccination for influenza.  Many will have had their flu jab early … I certainly did.

So, what is the predicted effect of a 4 week lock down?  Well, the calibrated model (also used to generate the charts above) estimates that it could indeed suppress the second wave and mitigate a nasty COVID-4-Christmas scenario – but even with it the hospital admissions and associated mortality will continue to increase until the effect kicks in.  Brace yourselves.

Coronavirus


The start of a new year, decade, century or millennium is always associated with a sense of renewal and hope.  Little did we know that in January 2020 a global threat had hatched and was growing in the city of Wuhan, Hubei Province, China.  A virus of the family coronaviridae had mutated and jumped from animal to man where it found a new host and a vehicle to spread itself.   Several weeks later the World became aware of the new threat and in the West … we ignored it.  Maybe we still remember the SARS epidemic which was heralded as a potential global catastrophe but was contained in the Far East and fizzled out.  So, maybe we assumed this SARS-like virus would do the same.

It didn’t.  This mutant was different.  It caused a milder illness and unwitting victims were infectious before they were symptomatic; and most got better on their own, so they spread the mutant to many other people.  Combine that mutant behaviour with the winter (when infectious diseases spread more easily because we spend more time together indoors), Chinese New Year and global air travel … and we have the perfect recipe for cooking up a global pandemic of a new infectious disease.  But we didn’t know that at the time and we carried on as normal, blissfully unaware of the catastrophe that was unfolding.

By February it became apparent that the mutant had escaped containment in China and was wreaking havoc in other countries – with Italy high on the casualty list.  We watched in horror at the scenes on television of Italian hospitals overwhelmed with severely ill people fighting for breath as the virus attacked their lungs.  The death toll rose sharply but we still went on our ski holidays and assumed that the English Channel and our quarantine policy would protect us.

It didn’t.  This mutant was different.  We now know that it had already silently gained access into the UK and was growing and spreading.  The first COVID-19 death reported in the UK was in early March and only then did we sit up and start to take notice.  This was too close to home.

But it was too late.  The mathematics of how epidemics spread was worked out 100 years ago, not long after the 1918 pandemic of Spanish Flu that killed tens of millions of people before it burned itself out.  An epidemic is like cancer.  By the time it is obvious it is already far advanced because the growth is not linear – it is exponential.
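The difference between linear and exponential growth is easy to underestimate, so here is a small worked comparison.  The 3-day doubling time and the one-new-case-per-day linear rate are assumptions picked for illustration, not measured values.

```python
def exponential_cases(day, initial_cases=1.0, doubling_time_days=3.0):
    """Cases when the count doubles every doubling_time_days (assumed value)."""
    return initial_cases * 2 ** (day / doubling_time_days)


def linear_cases(day, initial_cases=1.0, new_cases_per_day=1.0):
    """Cases when a fixed number is added each day, for comparison."""
    return initial_cases + new_cases_per_day * day


for day in (0, 15, 30, 45, 60):
    print(f"day {day:2d}: linear {linear_cases(day):5.0f}   "
          f"exponential {exponential_cases(day):12,.0f}")
```

With these assumed numbers the exponential count passes a thousand after 30 days and a million after 60, while the linear count has only reached 61.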

 

As a systems engineer I am used to building simulation models to reveal the complex and counter-intuitive behaviour of nonlinear systems using the methods first developed by Jay W. Forrester in the 1950’s.  And when I looked up the equations that describe epidemics (on Wikipedia) I saw that I could build a system dynamics model of a COVID-19 epidemic using no more than an Excel spreadsheet.

So I did.  And I got a nasty surprise.  Using the data emerging from China on the nature of the spread of the mutant virus, the incidence of severe illness and the mortality rate … my simple Excel model predicted that, if COVID-19 was left to run its natural course in the UK, then it would burn itself out over several months but the human cost would be 500,000 deaths and the NHS would be completely overwhelmed with a “tsunami of sick”.  And I could be one of them!  The fact that there is no treatment and no vaccine for this novel threat excluded those options.  My basic Excel model confirmed that the only effective option to mitigate this imminent catastrophe was to limit the spread of the virus through social engineering i.e. an immediate and drastic lock down.  Everyone who was not essential to maintaining core services should “Stay at home, Protect the NHS and Save lives”.  That would become the mantra.  And others were already saying this – epidemiologists whose careers are spent planning for this sort of eventuality.  But despite all this there still seemed to be little sense of urgency, perhaps because their super-sophisticated models predicted that the peak of the UK epidemic would be in mid-June so there was time to prepare.  My basic model predicted that the peak would be in mid-April, in about 4 weeks, and that it was already too late to prevent about 50,000 deaths.

It turns out I was right.  That is exactly what happened.  By mid-March London was already seeing an exponential rise in hospital admissions, intensive care admissions and deaths and suddenly the UK woke up and panicked.  By that time I had enlisted the help of a trusted colleague who is a public health doctor and who has studied epidemiology, and together we wrote up and published the emerging story as we saw it:

An Acute Hospital Demand Surge Planning Model for the COVID-19 Epidemic using Stock-and-Flow Simulation in Excel: Part 1. Journal of Improvement Science 2020: 68; 1-20.  The link to download the full paper is here.

I also shared the draft paper with another trusted friend and colleague who works for my local clinical commissioning group (CCG) and I asked “Has the CCG a sense of the speed and magnitude of what is about to happen and has it prepared for the tsunami of sick that primary care will need to see?”

What then ensued was an almost miraculous emergence of a coordinated and committed team of health care professionals and NHS managers with a single, crystal clear goal:  To design, build and deliver a high-flow, drive-through community-based facility to safely see-and-assess hundreds of patients per day with suspected COVID-19 who were too sick/worried to be managed on the phone, but not sick enough to go to A&E.  This was not a Nightingale Ward – that was a parallel, more public and much more expensive endeavour designed as a spillover for overwhelmed acute hospitals.  Our purpose was to help to prevent that and the time scale was short.  We had three weeks to do it because Easter weekend was the predicted peak of the COVID-19 surge if the national lock down policy worked as hoped.  No one really had an accurate estimate of how effective the lock down would be and how high the peak of the tsunami of sick would rise as it crashed into the NHS;  so we planned for the worst and hoped for the best.  The Covid Referral Centre (CRC) was an insurance policy and we deliberately over-engineered it to use every scrap of space we had been offered in a small car park on the south side of the NEC site.

The CRC needed to open by Sunday 12th April and we were ready, but the actual opening was delayed by NHS bureaucracy and politics.  It did eventually open on 22nd April, just four weeks after we started, and it worked exactly as designed.  The demand was, fortunately, less than our worst case scenario; partly because we had missed the peak by 10 days and we opened the gates to a falling tide; and partly because the social distancing policy had been more effective than hoped; and partly because it takes time for risk-averse doctors to develop trust and to change their ingrained patterns of working.  A drive-through COVID-19 See-and-Treat facility? That was innovative and untested!!

The CRC expected to see a falling demand as the first wave of COVID-19 washed over, and that is exactly what happened.  So, as soon as that prediction was confirmed, the CRC was progressively repurposed to provide other much needed services such as drive-through blood tests, drive-through urgent care, and even outpatient clinics in the indoor part of the facility.

The CRC closed its gates to suspected COVID-19 patients on 31st July, as planned and as guided by the simple Excel computer model.

This is health care systems engineering in action.

And the simple Excel model has been re-calibrated as fresh evidence has emerged.  The latest version predicts that a second peak of COVID-19 (that is potentially worse than the first) will happen in late summer or autumn if social distancing is relaxed too far (see below).

But we don’t know what “too far” looks like in practical terms.  Oh, and a second wave could kick off just when we expect the annual wave of seasonal influenza to arrive.  Or will it?  Maybe the effect of social distancing for COVID-19 in other countries will suppress the spread of seasonal flu as well?  We don’t know that either but the data on the incidence of flu from Australia certainly supports that hypothesis.

We may need a bit more health care systems engineering in the coming months. We shall see.

Oh, and if we are complacent enough to think a second wave could never happen in the UK … here is what is happening in Australia.

From Push to Pull

One of the most frequent niggles that I hear from patients is the difficulty they have getting an appointment with their general practitioner.  I too have personal experience of the distress caused by the ubiquitous “Phone at 8AM for an Appointment” policy, so in June 2018 when I was approached to help a group of local practices redesign their appointment booking system I said “Yes, please!”


What has emerged is a fascinating, enjoyable and rewarding journey of co-evolution of learning and co-production of an improved design.  The multi-skilled design team (MDT) we pulled together included general practitioners, receptionists and practice managers and my job was to show them how to use the health care systems engineering (HCSE) framework to diagnose, design, decide and deliver what they wanted: A safe, calm, efficient, high quality, value-4-money appointment booking service for their combined list of 50,000 patients.


This week they reached the start of the ‘decide and deliver’ phase.  We have established the diagnosis of why the current booking system is not delivering what we all want (i.e. patients and practices), and we have assembled and verified the essential elements of an improved design.

And the most important outcome for me is that the Primary Care MDT now feel confident and capable to decide what and how to deliver it themselves.   That is what I call embedded capability and achieving it is always an emotional roller coaster ride that we call The Nerve Curve.

What we are dealing with here is called a complex adaptive system (CAS) which has two main components: Processes and People.  Both are complicated and behave in complex ways.  Both will adapt and co-evolve over time.  The processes are the result of the policies that the people produce.  The policies are the result of the experiences that the people have and the explanations that they create to make intuitive sense of them.

But, complex systems often behave in counter-intuitive ways, so our intuition can actually lead us to make unwise decisions that unintentionally perpetuate the problem we are trying to solve.  The name given to this is a wicked problem.

A health care systems engineer needs to be able to demonstrate where these hidden intuitive traps lurk, and to explain what causes them and how to avoid them.  That is the reason the diagnosis and design phase is always a bit of a bumpy ride – emotionally – our Inner Chimp does not like to be challenged!  We all resist change.  Fear of the unknown is hard-wired into us by millions of years of evolution.

But we know when we are making progress because the “ah ha” moments signal a slight shift of perception and a sudden new clarity of insight.  The cognitive fog clears a bit and some more of the unfamiliar terrain ahead comes into view.  We are learning.

The Primary Care MDT have experienced many of these penny-drop moments over the last six months and unfortunately there is not space here to describe them all, but I can share one pivotal example.


A common symptom of a poorly designed process is a chronically chaotic queue.

[NB. In medicine the term chronic means “long standing”.  The opposite term is acute which means “recent onset”].

Many assume, intuitively, that the cause of a chronically chaotic queue is lack of capacity; hence the incessant calls for ‘more capacity’.  And it appears that we have learned this reflex response by observing the effect of adding capacity – which is that the queue and chaos abate (for a while).  So that proves that lack of capacity was the cause. Yes?

Well actually it doesn’t.  Proving causality requires a bit more work.  And to illustrate this “temporal association does not prove causality trap” I invite you to consider this scenario.

I have a headache => I take a paracetamol => my headache goes away => so the cause of my headache was lack of paracetamol. Yes?

Errr .. No!

There are many contributory causes of chronically chaotic queues and lack of capacity is not one of them because the queue is chronic.  What actually happens is that something else triggers the onset of chaos which then consumes the very resource we require to avoid the chaos.  And once we slip into this trap we cannot escape!  The chaos-perpetuating behaviour we observe is called fire-fighting and the necessary resource it consumes is called resilience.
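The trap can be illustrated with a toy queue in which fire-fighting eats into capacity.  This is a sketch under assumptions, not a model of any real service: the steady demand of 9 per day, the base capacity of 10 per day, and the rule that each queued item drains 0.1 units of daily capacity are all hypothetical.

```python
def simulate_queue(spike, days=60, demand=9.0, base_capacity=10.0,
                   drain_per_queued_item=0.1, spike_day=5):
    """Daily queue length when managing the queue consumes capacity.

    All parameter values are illustrative assumptions.
    """
    queue = 0.0
    history = []
    for day in range(days):
        arrivals = demand + (spike if day == spike_day else 0.0)
        # Fire-fighting: the longer the queue, the less capacity is left.
        capacity = max(0.0, base_capacity - drain_per_queued_item * queue)
        queue = max(0.0, queue + arrivals - capacity)
        history.append(queue)
    return history


small_shock = simulate_queue(spike=5.0)
large_shock = simulate_queue(spike=15.0)
print(f"Queue on the last day after a small one-off spike: {small_shock[-1]:.0f}")
print(f"Queue on the last day after a large one-off spike: {large_shock[-1]:.0f}")
```

In both runs the average capacity exceeds the average demand, yet a single large enough shock pushes the toy system past a tipping point from which it does not recover on its own.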


Six months ago, the Primary Care MDT believed that the cause of their chronic appointment booking chaos was a mismatch between demand and capacity – i.e. too much patient demand for the appointment capacity available.  So, there was a very reasonable resistance to the idea of making the appointment booking process easier for patients – they justifiably feared being overwhelmed by a tsunami of unmet need!

Six months on, the Primary Care MDT understand what actually causes chronic queues and that awareness has been achieved by a step-by-step process of explanation and experimentation in the relative safety of the weekly design sessions.

We played simulation games – lots of them.

One particularly memorable “Ah Ha!” moment happened when we played the Carveout Game which is done using dice, tiddly-winks, paper and coloured-pens.  No computers.  No statistics.  No queue theory gobbledygook.  No smoke-and-mirrors.  No magic.

What the Carveout Game demonstrates, practically and visually, is that an easy way to trigger the transition from calm-efficiency to chaotic-ineffectiveness is … to impose a carveout policy on a system that has been designed to achieve optimum efficiency by using averages.  Boom!  We slip on the twin banana skins of the Flaw-of-Averages and Sub-Optimisation, slide off the performance cliff, and career down the rocky slope of Chronic Chaos into the Depths of Despair – from which we cannot then escape.
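For readers without dice to hand, the effect can be mimicked with a toy simulation.  This is not the Carveout Game itself, whose rules are not described here; it is a hypothetical stand-in in which two patient streams each generate one die roll of demand per day (average 3.5 each) and there are 8 slots per day, either pooled or carved out as 4 reserved slots per stream.

```python
import random


def play(days=1000, slots_per_stream=4, seed=1):
    """Average backlog when two dice-driven streams share slots or not."""
    rng = random.Random(seed)
    pooled = 0          # one shared queue, all slots interchangeable
    carved = [0, 0]     # two queues, each with its own reserved slots
    pooled_total = carved_total = 0
    for _ in range(days):
        demand = [rng.randint(1, 6), rng.randint(1, 6)]  # one die per stream
        pooled = max(0, pooled + sum(demand) - 2 * slots_per_stream)
        for k in range(2):
            # A reserved slot left unused by one stream is lost, not shared.
            carved[k] = max(0, carved[k] + demand[k] - slots_per_stream)
        pooled_total += pooled
        carved_total += sum(carved)
    return pooled_total / days, carved_total / days


pooled_backlog, carved_backlog = play()
print(f"Average backlog with pooled slots:     {pooled_backlog:.2f}")
print(f"Average backlog with carved-out slots: {carved_backlog:.2f}")
```

Whatever the dice do, the carved-out backlog in this sketch can never be smaller than the pooled one, because a slot reserved for a quiet stream cannot help a busy one, even though the total number of slots is identical.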

This visual demonstration was a cognitive turning point for the MDT.  They now believed that there is a rational science to improvement and from there we were on the step-by-step climb to building the necessary embedded capability.


It now felt like the team were pulling what they needed to know.  I was no longer pushing.  We had flipped from push-to-pull.  That is called the tipping point.

And that is how health care systems engineering (HCSE) works.


Health care is a complex adaptive system, and what a health care systems engineer actually “designs” is a context-sensitive  incubator that nurtures the seeds of innovation that already exist in the system and encourages them to germinate, grow and become strong enough to establish themselves.

That is called “embedded improvement-by-design capability”.

And each incubator needs to be different – because each system is different.  One-solution-fits-all-problems does not work here just as it does not in medicine.  Each patient is both similar and unique.


Just as in medicine, first we need to diagnose the actual cause;  second we need to design some effective solutions; third we need to decide which design to implement and fourth we need to deliver it.

But the how-to-do-it feels a bit counter-intuitive, and if it were not we would already be doing it. But the good news is that anyone can learn how to do HCSE.

Reflect and Celebrate

As we approach the end of 2018 it is a good time to look back and reflect on what has happened this year.

It has been my delight to have had the opportunity to work with front-line teams at University Hospital of North Midlands (UHNM) and to introduce them to the opportunity that health care systems engineering (HCSE) offers.

This was all part of a coordinated, cooperative strategy commissioned by the Staffordshire Clinical Commissioning Groups, and one area we were asked to look at was unscheduled care.

It was not my brief to fix problems.  I was commissioned to demonstrate how a systems engineer might approach them.  The first step was to raise awareness, then develop some belief and then grow some embedded capability – in the system itself.

The rest was up to the teams who stepped up to the challenge.  So what happened?

Winter is always a tough time for the NHS and especially for unscheduled care so let us have a look  and compare UHNM with NHS England as a whole – using the 4 hour A&E target yield – and over a longer time period of 7 years (so that we can see some annual cycles and longer term trends).

The A&E performance for the NHS in England as a whole has been deteriorating at an accelerating pace over the 7 years.  This is a system-wide effect and there are a multitude of plausible causes.

The current UHNM system came into being at the end of 2014 with the merger of the Stafford and Stoke Hospital Trusts – and although their combined A&E performance dropped below average for England – the chart above shows that it did not continue to slide.

The NHS across the UK had a very bad time in the winter of 2017/18 – with a double whammy of sequential waves of Flu B and Flu A not helping!

But look at what happened at UHNM since Feb 2018.  Something has changed for the better and this is a macro system effect.  There has been a positive deviation from the expectation with about a 15% improvement in A&E 4-hr yield.  That is outstanding!

Now, I would say that news is worth celebrating and shouting “Well done everyone!” and then asking “How was that achieved?” and “What can we all learn that we can take forward into 2019 and build on?”

Merry Christmas.

Filter-Pull versus Push-Carveout

It is November 2018, the clocks have changed back to GMT, the trick-or-treats are done, the fireworks light the night skies and spook the hounds, and the seasonal aisles in the dwindling number of high street stores are already stocked for Christmas.

I have been a bit quiet on the blog front this year but that is because there has been a lot happening behind the scenes and I have had to focus.

One output is the recent publication of an article in Future Healthcare Journal on the topic of health care systems engineering (HCSE).  Click here to read the article and the rest of this excellent edition of FHJ that is dedicated to “systems”.

So, as we are back to the winter phase of the annual NHS performance cycle it is a good time to glance at the A&E Performance Radar and see who is doing well, and not-so-well.

Based on past experience, I was expecting Luton to be Top-of-the-Pops and so I was surprised (and delighted) to see that Barnsley have taken the lead.  And the chart shows that Barnsley has turned around a reasonable but sagging performance this year.

So I would be asking “What has happened at Barnsley that we can all learn from? What did you change and how did you know what and how to do that?”

To be sure, Luton is still in the top three and it is interesting to explore who else is up there and what their A&E performance charts look like.

The data is all available for anyone with a web-browser to view – here.

For completeness, this is the chart for Luton, and we can see that, although the last point is lower than Barnsley, the performance-over-time is more consistent and less variable. So who is better?

NB. This is a meaningless question and illustrates the unhelpful tactic of two-point comparisons with others, and with oneself. The better question is “Is my design fit-for-purpose?”

The question I have for Luton is different. “How do you achieve this low variation and how do you maintain it? What can we all learn from you?”

And I have some ideas about how they do that because in a recent HSJ interview they said “It is all about the filters”.


What do they mean by filters?

A filter is an essential component of any flow design if we want to deliver high safety, high efficiency, high effectiveness, and high productivity.  In other words, a high quality, fit-4-purpose design.

And the most important flow filters are the “upstream” ones.

The design of our upstream flow filters is critical to how the rest of the system works.  Get it wrong and we can get a spiralling decline in system performance because we can unintentionally trigger a positive feedback loop.

Queues cause delays and chaos that consume our limited resources.  So, when we are chasing cost improvement programme (CIP) targets using the “salami slicer” approach, and combine that with poor filter design … we can unintentionally trigger the perfect storm and push ourselves over the catastrophe cliff into perpetual, dangerous and expensive chaos.

If we look at the other end of the NHS A&E league table we can see typical examples that illustrate this pattern.  I have used this one only because it happens to be bottom this month.  It is not unique.

All other NHS trusts fall somewhere between these two extremes … stable, calm and acceptable and unstable, chaotic and unacceptable.

Most display the stable and chaotic combination – the “Zone of Perpetual Performance Pain”.

So what is the fundamental difference between the outliers that we can all learn from? The positive deviants like Barnsley and Luton, and the negative deviants like Blackpool.  I ask this because comparing the extremes is more useful than laboriously exploring the messy, mass-mediocrity in the middle.

An effective upstream flow filter design is a necessary component, but it is not sufficient. Triage (= French for sorting) is OK but it is not enough.  The other necessary component is called “downstream pull” and omitting that element of the design appears to be the primary cause of the chronic chaos that drags trusts and their staff down.

It is not just an error of omission though; the current design is actually an error of commission. It is anti-pull; otherwise known as “push”.


This year I have been busy on two complicated HCSE projects … one in secondary care and the other in primary care.  In both cases the root cause of the chronic chaos is the same.  They are different systems but have the same diagnosis.  What we have revealed together is a “push-carveout” design which is the exact opposite of the “upstream-filter-plus-downstream-pull” design we need.

And if an engineer wanted to design a system to be chronically chaotic then it is very easy to do. Here is the recipe:

a) Set a high average utilisation target for all resources as a proxy for efficiency to ensure everything is heavily loaded. Something between 80% and 100% usually does the trick.

b) Set a one-size-fits-all delivery performance target that is not currently being achieved and enforce it punitively.  Something like “>95% of patients seen and discharged or admitted in less than 4 hours, or else …”.

c) Divvy up the available resources (skills, time, space, cash, etc) into ring-fenced pots.

Chronic chaos is guaranteed.  The Laws of Physics decree it.
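
One way to glimpse why ingredient (c) matters is to borrow the Erlang loss model, in which a request that finds every bed full is turned away. The sketch below is an illustration only: the bed counts and loads are made-up numbers, the function name is hypothetical, and it assumes random (Poisson) arrivals. It compares one shared pool of 20 beds with the same 20 beds carved into two ring-fenced pots of 10.

```python
def erlang_b(beds: int, offered_load: float) -> float:
    """Erlang B: chance that a new request finds all beds occupied."""
    blocking = 1.0  # with no beds at all, every request is rejected
    for n in range(1, beds + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


# Illustrative numbers only (not taken from any real trust).
pooled = erlang_b(beds=20, offered_load=16.0)  # one shared pool of 20 beds
carved = erlang_b(beds=10, offered_load=8.0)   # each of two ring-fenced pots

print(f"pooled rejection rate:     {pooled:.1%}")
print(f"carved-out rejection rate: {carved:.1%}")
```

The beds, the total load and the 80% average loading are identical in both designs; only the ring-fencing differs, and the carved-out design turns away a larger share of requests.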


Unfortunately, the explanation of why this is the case is counter-intuitive, so it is actually better to experience it first, and then seek the explanation.  Reality first, reasoning second.

And, it is a bittersweet experience, so it needs to be done with care and compassion.

And that’s what I’ve been busy doing this year. Creating the experiences and then providing the explanations.  And if done gradually what then happens is remarkable and rewarding.

The FHJ article outlines one validated path to developing individual and organisational capability in health care systems engineering.

The 85% Optimum Bed Occupancy Myth

A few years ago I had a rant about the dangers of the widely promoted mantra that 85% is the optimum average measured bed-occupancy target to aim for.

But ranting is annoying, ineffective and often counter-productive.

So, let us revisit this with some calm objectivity and disprove this Myth a step at a time.

The diagram shows the system of interest (SoI) where the blue box represents the beds, the coloured arrows are the patient flows, the white diamond is a decision and the dotted arrow is information about how full the hospital is (i.e. full/not full).

A new emergency arrives (red arrow) and needs to be admitted. If the hospital is not full the patient is moved to an empty bed (orange arrow), the medical magic happens, and some time later the patient is discharged (green arrow).  If there is no bed for the emergency request then we get “spillover” which is the grey arrow, i.e. the patient is diverted elsewhere (n.b. these are critically ill patients …. they cannot sit and wait).


This same diagram could represent patients trying to phone their GP practice for an appointment.  The blue box is the telephone exchange and if all the lines are busy then the call is dropped (grey arrow).  If there is a line free then the call is connected (orange arrow) and joins a queue (blue box) to be answered some time later (green arrow).

In 1917, a Danish mathematician/engineer called Agner Krarup Erlang was working for the Copenhagen Telephone Company and was grappling with this very problem: “How many telephone lines do we need to ensure that dropped calls are infrequent AND the switchboard operators are well utilised?”

This is the perennial quality-versus-cost conundrum. The Value-4-Money challenge. Too few lines and the quality of the service falls; too many lines and the cost of the service rises.

Q: Is there a V4M “sweet spot” and if so, how do we find it? Trial and error?

The good news is that Erlang solved the problem … mathematically … and the not-so good news is that his equations are very scary to a non mathematician/engineer!  So this solution is not much help to anyone else.


Fortunately, we have a tool for turning scary-equations into easy-2-see-pictures; our trusty Excel spreadsheet. So, here is a picture called a heat-map, and it was generated from one of Erlang’s equations using Excel.

The Erlang equation is lurking in the background, safely out of sight.  It takes two inputs and gives one output.

The first input is the Capacity, which is shown across the top, and it represents the number of beds available each day (known as the space-capacity).

The second input is the Load (or offered load to use the precise term) which is down the left side, and is the number of bed-days required per day (e.g. if we have an average of 10 referrals per day each of whom would require an average 2-day stay then we have an average of 10 x 2 = 20 bed-days of offered load per day).

The output of the Erlang model is the probability that a new arrival finds all the beds are full and the request for a bed fails (i.e. like a dropped telephone call).  This average probability is displayed in the cell.  The colour varies between red (100% failure) and green (0% failure), with an infinite number of shades of red-yellow-green in between.
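
For readers who prefer code to spreadsheets, the same two-inputs-one-output calculation can be sketched in a few lines of Python. This is a minimal sketch that assumes the heat-map uses the standard Erlang B (loss) formula, computed here with its usual recurrence; the function name is only illustrative and is not taken from the spreadsheet.

```python
def erlang_b(beds: int, offered_load: float) -> float:
    """Probability that a new arrival finds every bed full (Erlang B).

    beds         -- capacity: number of beds available each day
    offered_load -- average bed-days required per day
    """
    blocking = 1.0  # zero beds: every request fails
    for n in range(1, beds + 1):
        # Standard recurrence: B(n) = A*B(n-1) / (n + A*B(n-1))
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


# 10 referrals per day, each needing an average 2-day stay, and 20 beds.
load = 10 * 2
print(f"rejection probability: {erlang_b(20, load):.0%}")
```

Each cell of a heat-map like the one described would then be one call to this function for a given capacity (column) and load (row).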

We can now use our visual heat-map in a number of ways.

a) We can use it to predict the average likelihood of rejection given any combination of bed-capacity and average offered load.

Suppose the average offered load is 20 bed-days per day and we have 20 beds then the heat-map says that we will reject 16% of requests … on average (bottom left cell).  But how can that be? Why do we reject any? We have enough beds on average! It is because of variation. Requests do not arrive in a constant stream equal to the average; there is random variation around that average.  Critically ill patients do not arrive at hospital in a constant stream; so our system needs some resilience and if it does not have it then failures are inevitable and mathematically predictable.

b) We can use it to predict how many beds we need to keep the average rejection rate below an arbitrary but acceptable threshold (i.e. the quality specification).

Suppose the average offered load is 20 bed-days per day, and we want to have a bed available more than 95% of the time (less than 5% failures) then we will need at least 25 beds (bottom right cell).

c) We can use it to estimate the maximum average offered load for a given bed-capacity and required minimum service quality.

Suppose we have 22 beds and we want a quality of >=95% (failure <5%) then we would need to keep the average offered load below 17 bed-days per day (i.e. by modifying the demand and the length of stay because average load = average demand * average length of stay).
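
The three uses (a), (b) and (c) can also be sketched in code. This assumes the standard Erlang B formula sits behind the heat-map; the helper names min_beds and max_load are hypothetical. Because the heat-map shows rounded percentages, a cell that displays 5% may sit a whisker either side of the threshold, so the bed count printed here can differ by one from the figure read off the picture.

```python
def erlang_b(beds: int, offered_load: float) -> float:
    """Probability that a new arrival finds every bed full (Erlang B)."""
    blocking = 1.0
    for n in range(1, beds + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def min_beds(offered_load: float, max_rejection: float) -> int:
    """Smallest bed count whose rejection probability is within the limit."""
    if not 0.0 < max_rejection < 1.0:
        raise ValueError("max_rejection must be between 0 and 1")
    beds = 0
    while erlang_b(beds, offered_load) > max_rejection:
        beds += 1
    return beds


def max_load(beds: int, max_rejection: float) -> float:
    """Largest average offered load (bed-days per day) within the limit."""
    if not 0.0 < max_rejection < 1.0:
        raise ValueError("max_rejection must be between 0 and 1")
    low, high = 0.0, 1.0
    while erlang_b(beds, high) < max_rejection:  # widen until limit exceeded
        high *= 2
    for _ in range(60):  # bisection: rejection rises as the load rises
        mid = (low + high) / 2
        if erlang_b(beds, mid) <= max_rejection:
            low = mid
        else:
            high = mid
    return low


print(f"(a) rejection with 20 beds, load 20: {erlang_b(20, 20.0):.0%}")
print(f"(b) beds needed for load 20 at 5%:   {min_beds(20.0, 0.05)}")
print(f"(c) max load for 22 beds at 5%:      {max_load(22, 0.05):.1f}")
```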


There is a further complication we need to be mindful of though … the measured utilisation of the beds is related to the successful admissions (orange arrow in the first diagram) not to the demand (red arrow).  We can illustrate this with a complementary heat map generated in Excel.

For scenario (a) above we have an offered load of 20 bed-days per day, and we have 20 beds, but we will reject 16% of requests so the accepted bed load is only 16.8 bed-days per day (i.e. (100%-16%) * 20), which is the reason that the average utilisation is only 16.8/20 = 84% (bottom left cell).

For scenario (b) we have an offered load of 20 bed-days per day, and 25 beds and will only reject 5% of requests but the average measured utilisation is not 95%, it is only 76% because we have more beds (the accepted bed load is 95% * 20 = 19 bed-days per day and 19/25 = 76%).

For scenario (c) the average measured utilisation would be about 74%.
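
The arithmetic behind these three utilisation figures can be checked with a short sketch. It assumes the Erlang B formula for the rejection rate and defines measured utilisation as accepted load divided by beds, as described above; the function names are illustrative.

```python
def erlang_b(beds: int, offered_load: float) -> float:
    """Probability that a new arrival finds every bed full (Erlang B)."""
    blocking = 1.0
    for n in range(1, beds + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def measured_utilisation(beds: int, offered_load: float) -> float:
    """Average bed utilisation based on accepted (not offered) load."""
    accepted_load = offered_load * (1.0 - erlang_b(beds, offered_load))
    return accepted_load / beds


# Scenarios (a), (b) and (c): (beds, offered load in bed-days per day)
for label, beds, load in (("a", 20, 20.0), ("b", 25, 20.0), ("c", 22, 17.0)):
    print(f"scenario ({label}): {measured_utilisation(beds, load):.0%}")
```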


So, now we see the problem more clearly … if we blindly aim for an average, measured, bed-utilisation of 85% with the untested belief that it is always the optimum … this heat-map says it is impossible to achieve and at the same time offer an acceptable quality (>95%).

We are trading safety for money and that is not an acceptable solution in a health care system.


So where did this “magic” value of 85% come from?

From the same heat-map perhaps?

If we search for the combination of >95% success (<5% fail) and 85% average bed-utilisation then we find it at the point where the offered load reaches 50 bed-days per day and we have a bed-capacity of 56 beds.

And if we search for the combination of >99% success (<1% fail) and 85% average utilisation then we find it with an average offered load of just over 100 bed-days per day and a bed-capacity around 130 beds.

H’mm.  “Houston, we have a problem”.


So, even in this simplified scenario the hypothesis that an 85% average bed-occupancy is a global optimum is disproved.

The reality is that the average bed-occupancy associated with delivering the required quality for a given offered load with a specific number of beds is almost never 85%.  It can range anywhere between 50% and 100%.  Erlang knew that in 1917.
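
To see how far the "right" occupancy moves with the size of the unit, the sketch below finds, for a few example loads, the smallest number of beds that keeps rejections at or below 5% and then reports the resulting measured utilisation. It assumes the Erlang B model used for the heat-maps; the loads are arbitrary examples and the names are hypothetical.

```python
def erlang_b(beds: int, offered_load: float) -> float:
    """Probability that a new arrival finds every bed full (Erlang B)."""
    blocking = 1.0
    for n in range(1, beds + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def beds_for_quality(offered_load: float, max_rejection: float = 0.05) -> int:
    """Smallest bed count with rejection probability at or below the limit."""
    beds = 0
    while erlang_b(beds, offered_load) > max_rejection:
        beds += 1
    return beds


rows = []  # (offered load, beds needed, measured utilisation)
for load in (5.0, 20.0, 50.0, 100.0):  # example loads in bed-days per day
    beds = beds_for_quality(load)
    utilisation = load * (1.0 - erlang_b(beds, load)) / beds
    rows.append((load, beds, utilisation))
    print(f"load {load:5.0f}  beds {beds:3d}  utilisation {utilisation:.0%}")
```

For the same quality specification, the utilisation that goes with it climbs from a little over half for the smallest load to around nine-tenths for the largest, so no single percentage fits every unit.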


So, if a one-size-fits-all optimum measured average bed-occupancy assumption is not valid then how might we work out how many beds we need and predict what the expected average occupancy will be?

We would design the fit-4-purpose solution for each specific context …
… and to do that we need to learn the skills of complex adaptive system design …
… and that is part of the health care systems engineering (HCSE) skill-set.

 

The Pathology of Variation II

It is that time of year – again.

Winter.

The NHS is struggling, front-line staff are having to use heroic measures just to keep the ship afloat, and less urgent work has been suspended to free up space and time to help man the emergency pumps.

And the finger-of-blame is being waggled by the army of armchair experts whose diagnosis is unanimous: “lack of cash caused by an austerity triggered budget constraint”.


And the evidence seems plausible.

The A&E performance data says that each year since 2009, the proportion of patients waiting more than 4 hours in A&Es has been increasing.  And the increase is accelerating. This is a progressive quality failure.

And health care spending since the NHS was born in 1948 shows a very similar accelerating pattern.    

So which is the chicken and which is the egg?  Or are they both symptoms of something else? Something deeper?


Both of these charts are characteristic of a particular type of system behaviour called a positive feedback loop.  And the cost chart shows what happens when someone attempts to control the cash by capping the budget:  It appears to work for a while … but the “pressure” is building up inside the system … and eventually the cash-limiter fails. Usually catastrophically. Bang!


The quality chart shows an associated effect of the “pressure” building inside the acute hospitals, and it is a very well understood phenomenon called an Erlang-Kingman queue.  It is caused by the inevitable natural variation in demand meeting a cash-constrained, high-resistance, high-pressure, service provider.  The effect is to amplify the natural variation and to create something much more dangerous and expensive: chaos.
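
The amplification described above can be made concrete with Kingman's approximation for the average wait in a single-server queue. This is a hedged sketch rather than a model of any real A&E: it assumes one server, and the function and parameter names are illustrative. It shows how the expected wait grows as utilisation approaches 100%, and how it scales with the variation in arrivals and service.

```python
def kingman_wait(utilisation: float, cv_arrivals: float,
                 cv_service: float, mean_service_time: float) -> float:
    """Kingman's approximation for the mean time spent waiting in the queue.

    wait ~ (rho / (1 - rho)) * ((ca^2 + cs^2) / 2) * mean service time
    where ca and cs are the coefficients of variation of the
    inter-arrival times and the service times.
    """
    if not 0.0 <= utilisation < 1.0:
        raise ValueError("utilisation must be in the range [0, 1)")
    load_factor = utilisation / (1.0 - utilisation)
    variation_factor = (cv_arrivals ** 2 + cv_service ** 2) / 2.0
    return load_factor * variation_factor * mean_service_time


# Mean service time of 1 unit, with moderate variation (cv = 1) in both
# arrivals and service.
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    wait = kingman_wait(rho, 1.0, 1.0, 1.0)
    print(f"utilisation {rho:.0%}: average wait about {wait:.1f} service times")
```

With no variation at all (both coefficients zero) the predicted wait is zero whatever the utilisation; it is the combination of variation and high loading that produces the queue.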


The simple line-charts above show the long-term, aggregated  effects and they hide the extremely complicated internal structure and the highly complex internal behaviour of the actual system.

One technique that system engineers use to represent this complexity is a causal loop diagram or CLD.

The arrows are of two types; green indicates a positive effect, and red indicates a negative effect.

This simplified CLD is dominated by green arrows all converging on “Cost of Care”.  They are the positive drivers of the relentless upward cost pressure.

Health care is a victim of its own success.

So, if the cash is limited then the naturally varying demand will generate the queues, delays and chaos that have such a damaging effect on patients, providers and purses.

Safety and quality are adversely affected. Disappointment, frustration and anxiety are rife. Expectation is lowered.  Confidence and trust are eroded.  But costs continue to escalate because chaos is expensive to manage.

This system behaviour is what we are seeing in the press.

The cost-constraint has, paradoxically, had exactly the opposite effect, because it is treating the effect (the symptom) and ignoring the cause (the disease).


The CLD has one negative feedback loop that is linked to “Efficiency of Processes”.  It is the only one that counteracts all of the other positive drivers.  And it is the consequence of the “System Design”.

What this means is: To achieve all the other benefits without the pressures on people and purses, all the complicated interdependent processes required to deliver the evolving health care needs of the population must be proactively designed to be as efficient as technically possible.


And that is not easy or obvious.  Efficient design does not happen naturally.  It is hard work!  It requires knowledge of the Anatomy and Physiology of Systems and of the Pathology of Variation.  It requires understanding how to achieve effectiveness and efficiency at the same time as avoiding queues and chaos.  It requires that the whole system is continually and proactively re-designed to remain reliable and resilient.

And that implies it has to be done by the system itself; and that means the NHS needs embedded health care systems engineering know-how.

And when we go looking for that we discover a sequence of gaps.

An Awareness gap, a Belief gap and a Capability gap. ABC.

So the first gap to fill is the Awareness gap.

H.R.O.

The New Year of 2018 has brought some unexpected challenges. Or were they?

We have belligerent bullies with their fingers on their nuclear buttons.

We have an NHS in crisis, with corridor-queues of urgent frail, elderly, unwell and a month of cancelled elective operations.

And we have winter storms, fallen trees, fractured power-lines, and threatened floods – all being handled rather well by people who are trained to manage the unexpected.

Which is the title of this rather interesting book that talks a lot about HROs.

So what are HROs?


“H” stands for High.  “O” stands for Organisation.

What does R stand for?  Rhetoric? Rigidity? Resistance?

Watching the news might lead one to suggest these words would fit … but they are not the answer.

“R” stands for Reliability and “R” stands for Resilience … and they are linked.


Think of a global system that is so reliable that we all depend on it, everyday.  The Global Positioning System or the Internet perhaps.  We rely on them because they serve a need and because they work. Reliably and resiliently.

And that was no accident.

Both the Internet and the GPS were designed and built to meet the needs of billions and to be reliable and resilient.  They were both created by an army of unsung heroes called systems engineers – who were just doing their job. The job they were trained to do.


The NHS serves a need – and often an urgent one, so it must also be reliable. But it is not.

The NHS needs to be resilient. It must cope with the ebb and flow of seasonal illness. But it does not.

And that is because the NHS has not been designed to be either reliable or resilient. And that is because the NHS has not been designed.  And that is because the NHS does not appear to have enough health care systems engineers trained to do that job.

But systems engineering is a mature discipline, and it works just as well inside health care as it does outside.


And to support that statement, here is evidence of what happened after a team of NHS clinicians and managers were trained in the basics of HCSE.

Monklands A&E Improvement

So the gap seems to be just an awareness/ability gap … which is a bridgeable one.


Who would like to train to be a Health Care Systems Engineer and to join the growing community of HCSE practitioners who have the potential to be the future unsung heroes of the NHS?

Click here if you are interested: http://www.ihcse.uk

PS. “Managing the Unexpected” is an excellent introduction to SE.

Diagnose-Design-Deliver

A story was shared this week.

A story of hope for the hard-pressed NHS, its patients, its staff and its managers and its leaders.

A story that says “We can learn how to fix the NHS ourselves”.

And the story comes with evidence; hard, objective, scientific, statistically significant evidence.


The story starts almost exactly three years ago when a Clinical Commissioning Group (CCG) in England made a bold strategic decision to invest in improvement, or as they termed it “Achieving Clinical Excellence” (ACE).

They invited proposals from their local practices with the “carrot” of enough funding to allow GPs to carve-out protected time to do the work.  And a handful of proposals were selected and financially supported.

This is the story of one of those proposals which came from three practices in Sutton who chose to work together on a common problem – the unplanned hospital admissions in their over 70’s.

Their objective was clear and measurable: “To reduce the cost of unplanned admissions in the 70+ age group by working with hospital to reduce length of stay.”

Did they achieve their objective?

Yes, they did.  But there is more to this story than that.  Much more.


One innovative step they took was to invest in learning how to diagnose why the current ‘system’ was costing what it was; then learning how to design an improvement; and then learning how to deliver that improvement.

They invested in developing their own improvement science skills first.

They did not assume they already knew how to do this and they engaged an experienced health care systems engineer (HCSE) to show them how to do it (i.e. not to do it for them).

Another innovative step was to create a blog to make it easier to share what they were learning with their colleagues; and to invite feedback and suggestions; and to provide a journal that captured the story as it unfolded.

And they measured stuff before they made any changes and afterwards so they could measure the impact, and so that they could assess the evidence scientifically.

And that was actually quite easy because the CCG was already measuring what they needed to know: admissions, length of stay, cost, and outcomes.

All they needed to learn was how to present and interpret that data in a meaningful way.  And as part of their IS training,  they learned how to use system behaviour charts, or SBCs.


By Jan 2015 they had learned enough of the HCSE techniques and tools to establish the diagnosis and start making changes to the parts of the system that they could influence.


Two years later they subjected their before-and-after data to robust statistical analysis and they had a surprise. A big one!

Reducing hospital mortality was not a stated objective of their ACE project, and they only checked the mortality data to be sure that it had not changed.

But it had, and the “p=0.014” part of the statement above means that, if nothing had really changed, the probability of seeing a reduction in hospital mortality at least as large as this 20.0% through random chance alone … is about 1.4%.  [This is well below the 5% threshold that we usually accept as “statistically significant” in a clinical trial.]

But …

This was not a randomised controlled trial.  This was an intervention in a complicated, ever-changing system; so they needed to check that the hospital mortality for comparable patients who were not their patients had not changed as well.

And the statistical analysis of the hospital mortality for the ‘other’ practices for the same patient group, and the same period of time confirmed that there had been no statistically significant change in their hospital mortality.

So, it appears that what the Sutton ACE Team did to reduce length of stay (and cost) had also, unintentionally, reduced hospital mortality. A lot!


And this unexpected outcome raises a whole raft of questions …


If you would like to read their full story then you can do so … here.

It is a story of hunger for improvement, of humility to learn, of hard work and of hope for the future.

Outliers

An effective way to improve is to learn from others who have demonstrated the capability to achieve what we seek.  To learn from success.

Another effective way to improve is to learn from those who are not succeeding … to learn from failures … and that means … to learn from our own failings.

But from an early age we are socially programmed with a fear of failure.

The training starts at school where failure is not tolerated, nor is challenging the given dogma.  Paradoxically, the effect of our fear of failure is that our ability to inquire, experiment, learn, adapt, and to be resilient to change is severely impaired!

So further failure in the future becomes more likely, not less likely. Oops!


Fortunately, we can develop a healthier attitude to failure and we can learn how to harness the gap between intent and impact as a source of energy, creativity, innovation, experimentation, learning, improvement and growing success.

And health care provides us with ample opportunities to explore this unfamiliar terrain. The creative domain of the designer and engineer.


The scatter plot below is a snapshot of the A&E 4 hr target yield for all NHS Trusts in England for the month of July 2016.  The required “constitutional” performance requirement is better than 95%.  The delivered whole system average is 85%.  The majority of Trusts are failing, and the Trust-to-Trust variation is rather wide. Oops!

This stark picture of the gap between intent (95%) and impact (85%) prompts some uncomfortable questions:

Q1: How can one Trust achieve 98% and yet another can do no better than 64%?

Q2: What can all Trusts learn from these high and low flying outliers?

[NB. I have not asked the question “Who should we blame for the failures?” because the name-shame-blame-game is also a predictable consequence of our fear-of-failure mindset.]


Let us dig a bit deeper into the information mine, and as we do that we need to be aware of a trap:

A snapshot-in-time tells us very little about how the system and the set of interconnected parts is behaving-over-time.

We need to examine the time-series charts of the outliers, just as we would ask for the temperature, blood pressure and heart rate charts of our patients.

Here are the last six years by month A&E 4 hr charts for a sample of the high-fliers. They are all slightly different and we get the impression that the lower two are struggling to stay aloft more than the upper two … especially in winter.


And here are the last six years by month A&E 4 hr charts for a sample of the low-fliers.  The Mark I Eyeball Test results are clear … these swans are falling out of the sky!


So we need to generate some testable hypotheses to explain these visible differences, and then we need to examine the available evidence to test them.

One hypothesis is “rising demand”.  It says that “the reason our A&E is failing is because demand on A&E is rising”.

Another hypothesis is “slow flow”.  It says that “the reason our A&E is failing is because of the slow flow through the hospital because of delayed transfers of care (DTOCs)”.

So, if these hypotheses account for the behaviour we are observing then we would predict that the “high fliers” are (a) diverting A&E arrivals elsewhere, and (b) reducing admissions to free up beds to hold the DTOCs.

Let us look at the freely available data for the highest flyer … the green dot on the scatter gram … code-named “RC9”.

The top chart is the A&E arrivals per month.

The middle chart is the A&E 4 hr target yield per month.

The bottom chart is the emergency admissions per month.

Both arrivals and admissions are increasing, while the A&E 4 hr target yield is rock steady!

And arranging the charts this way allows us to see the temporal patterns more easily (and the images are deliberately arranged to show the overall pattern-over-time).

Patterns like the change-for-the-better that appears in the middle of the winter of 2013 (i.e. when many other trusts were complaining that their sagging A&E performance was caused by “winter pressures”).

The objective evidence seems to disprove the “rising demand”, “slow flow” and “winter pressure” hypotheses!

So what can we learn from our failure to adequately explain the reality we are seeing?


The trust code-named “RC9” is Luton and Dunstable, and it is an average district general hospital, on the surface.  So to reveal some clues about what actually happened there, we need to read their Annual Report for 2013-14.  It is a public document and it can be downloaded here.

This is just a snippet …

… and there are lots more knowledge nuggets like this in there …

… it is a treasure trove of well-known examples of good system flow design.

The results speak for themselves!


Q: How many black swans does it take to disprove the hypothesis that “all swans are white”?

A: Just one.

“RC9” is a black swan. An outlier. A positive deviant. “RC9” has disproved the “impossibility” hypothesis.

And there is another flock of black swans living in the North East … in the Newcastle area … so the “Big cities are different” hypothesis does not hold water either.


The challenge here is a human one.  A human factor.  Our learned fear of failure.

Learning-how-to-fail is the way to avoid failing-how-to-learn.

And to read more about that radical idea I strongly recommend reading the recently published book called Black Box Thinking by Matthew Syed.

It starts with a powerful story about the impact of human factors in health care … and here is a short video of Martin Bromiley describing what happened.

The “black box” that both Martin and Matthew refer to is the one that is used in air accident investigations to learn from what happened, and to use that learning to design safer aviation systems.

Martin Bromiley has founded a charity to support the promotion of human factors in clinical training, the Clinical Human Factors Group.

So if we can muster the courage and humility to learn how to do this in health care for patient safety, then we can also learn how to do it for flow, quality and productivity.

Our black swan called “RC9” has demonstrated that this goal is attainable.

And the body of knowledge needed to do this already exists … it is called Health and Social Care Systems Engineering (HSCSE).




Postscript: And I am pleased to share that Luton & Dunstable features in the House of Commons Health Committee report entitled Winter Pressures in A&E Departments that was published on 3rd Nov 2016.

Here is part of what L&D shared to explain their deviant performance:

[Image: luton_nuggets]

These points describe rather well the essential elements of a pull design, which is the antidote to the rather more prevalent pressure cooker design.

Righteous Indignation

On 5th July 2018, the NHS will be 70 years old, and like many of those it was created to serve, it has become elderly and frail.

We live much longer, on average, than we used to and the growing population of frail elderly are presenting an unprecedented health and social care challenge that the NHS was never designed to manage.

The creases and cracks are showing, and each year feels more pressured than the last.


This week a story that illustrates this challenge was shared with me along with permission to broadcast …

“My mother-in-law is 91, in general she is amazingly self-sufficient, able to arrange most of her life with reasonable care at home via a council tendered care provider.

She has had Parkinson’s for years, needing regular medication to enable her to walk and eat (it affects her jaw and swallowing capability). So the care provision is time critical, to get up, have lunch, have tea and get to bed.

She’s also going deaf, profoundly in one ear, pretty bad in the other. She wears a single ‘in-ear’ aid, which has a micro-switch on/off toggle, far too small for her to see or operate. Most of the carers can’t put it in, and fail to switch it off.

Her care package is well drafted, but rarely adhered to. It should be 45 minutes in the morning, 30, 15, 30 through the day. Each time administering the medications from the dossette box. Despite the register in/out process from the carers, many visits are far less time than designed (and paid for by the council), with some lasting 8 minutes instead of 30!

Most carers don’t ensure she takes her meds, which sometimes leads to dropped pills on the floor, with no hope of picking them up!

While the care is supposedly ‘time critical’ the provider don’t manage it via allocated time slots, they simply provide lists, that imply the order of work, but don’t make it clear. My mother-in-law (Mum) cannot be certain when the visit will occur, which makes going out very difficult.

The carers won’t cook food, but will micro-wave it, thus if a cooked meal is to happen, my Mum will start it, with the view of the carers serving it. If they arrive early, the food is under-cooked (“Just put vinegar on it, it will taste better”) and if they arrive late, either she’ll try to get it out herself, or it will be dried out / cremated.

Her medication pattern should be every 4 to 5 hours in the day, with a 11:40 lunch visit, and a 17:45 tea visit, followed by a 19:30 bed prep visit, she finishes up with too long between meds, followed by far too close together. Her GP has stated that this is making her health and Parkinson’s worse.

Mum also rarely drinks enough through the day, in the hot weather she tends to dehydrate, which we try to persuade her must be avoided. Part of the problem is Parkinson’s related, part the hassle of getting to the toilet more often. Parkinson’s affects swallowing, so she tends to sip, rather than gulp. By sipping often, she deludes herself that she is drinking enough.

She also is stubbornly not adjusting methods to align to issues. She drinks tea and water from her lovely bone china cups. Because her grip is not good and her hand shakes, we can’t fill those cups very high, so her ‘cup of tea’ is only a fraction of what it could be.

As she can walk around most days, there’s no way of telling whether she drinks enough, and she frequently has several different carers in a day.

When Mum gets dehydrated, it affects her memory and her reasoning, similar to the onset of dementia. It also seems to increase her probability of falling, perhaps due to forgetting to be defensive.

When she falls, she cannot get up, thus usually presses her alarm dongle, resulting in me going round to get her up, check for concussion, and check for other injuries, prior to settling her down again. These can be ten weeks apart, through to a few in a week.

When she starts to hallucinate, we do our very best to increase drinking, seeking to re-hydrate.

On Sunday, something exceptional happened, Mum fell out of bed and didn’t press her alarm. The carer found her and immediately called the paramedics and her GP, who later called us in. For the first time ever she was not sufficiently mentally alert to press her alarm switch.

After initial assessment, she was taken to A&E, luckily being early on Sunday morning it was initially quite quiet.

Hospital

The Hospital is on the boundary between two counties, within a large town, a mixture of new build elements, between aging structures. There has been considerable investment within A&E, X-ray etc. due partly to that growth industry and partly due to the closures of cottage hospitals and reducing GP services out of hours.

It took some persuasion to have Mum put on a drip, as she hadn’t had breakfast or any fluids, and dehydration was a probable primary cause of her visit. They took bloods, an X-ray of her chest (to check for fall related damage) and a CT scan of her head, to see if there were issues.

I called the carers to tell them to suspend visits, but the phone simply rang without being answered (not for the first time).

After about six hours, during which time she was awake, but not very lucid, she was transferred to the day ward, where after assessment she was given some meds, a sandwich and another drip.

Later that evening we were informed she was to be kept on a drip for 24 hours.

The next day (Bank Holiday Monday) she was transferred to another ward. When we arrived she was not on a drip, so their decisions had been reversed.

I spoke at length with her assigned staff nurse, and was told the following: Mum could come out soon if she had a 24/7 care package, and that as well as the known issues mum now has COPD. When I asked her what COPD was, she clearly didn’t know, but flustered a ‘it is a form of heart failure that affects breathing’. (I looked it up on my phone a few minutes later.)

So, to get mum out, I had to arrange a 24/7 care package, and nowhere was open until the next day.

Trying to escalate care isn’t going to be easy, even in the short term. My emails to ‘usually very good’ social care people achieved nothing to start with on Tuesday, and their phone was on the ‘out of hours’ setting for evenings and weekends, despite being during the day of a normal working week.

Eventually I was told that there would be nothing to achieve until the hospital processed the correct exit papers to Social Care.

When we went in to the hospital (on Tuesday) a more senior nurse was on duty. She explained that mum was now medically fit to leave hospital if care can be re-established. I told her that I was trying to set up 24/7 care as advised. She looked through the notes and said 24/7 care was not needed, the normal 4 x a day was enough. (She was clearly angry).

I then explained that the newly diagnosed COPD may be part of the problem. She said that she’s worked with COPD patients for 16 years, and mum definitely doesn’t have COPD. While she was amending the notes, I noticed that mum’s allergy to aspirin wasn’t there, despite us advising that on entry. The nurse also explained that as the hospital is in one county, but almost half their patients are from another, they are always stymied on ‘joined up working’.

While we were talking with mum, her meds came round and she was only given paracetamol for her pain, but NOT her meds for Parkinson’s. I asked that nurse why that was the case, and she said that was not on her meds sheet. So I went back to the more senior nurse, she checked the meds as ordered and Parkinson’s was required 4 x a day, but it was NOT transferred onto the administration sheet. The doctor next to us said she would do it straight away, and I was told, “Thank God you are here to get this right!”

Mum was given her food, it consisted of some soup, which she couldn’t spoon due to lack of meds and a dry tough lump of gammon and some mashed sweet potato, which she couldn’t chew.

When I asked why meds were given at five, after the delivery of food, they said ‘That’s our system!’, when I suggested that administering Parkinson’s meds an hour before food would increase the ability to eat the food they said “that’s a really good idea, we should do that!”

On Wednesday I spoke with Social Care to try to re-start care to enable mum to get out. At that time the social worker could neither get through to the hospital nor the carers. We spoke again after I had arrived in hospital, but before I could do anything.

On arrival at the hospital I was amazed to see the white-board declaring that mum would be discharged for noon on Monday (in five days’ time!). I spoke with the assigned staff nurse who said, “That’s the earliest that her carers can re-start, and anyway it’s nearly the weekend”.

I said that “mum was medically OK for discharge on Tuesday, after only two days in the hospital, and you are complacent to block the bed for another six days, have you spoken with the discharge team?”

She replied, “No they’ll have gone home by now, and I’ve not seen them all day” I told her that they work shifts, and that they will be here, and made it quite clear if she didn’t contact SHEDs that I’d go walkabout to find them. A few minutes later she told me a SHED member would be with me in 20 minutes.

While the hospital had resolved her medical issues, she was stuck in a ward with no help to walk, the only TV via a complex pay-for system she had no hope of understanding, and no day room. So no entertainment, no exercise, just boredom: encouraged to lie in bed and wear a pad because she won’t be taken to the loo in time.

When the SHED worker arrived I explained the staff nurse’s attitude, and she said she would try to improve those thinking processes. She took lots of details, then said that so long as mum can walk with assistance, she could be released after noon, to have NHS carer support, 4 times a day, from the afternoon. Mum walked around the ward for the first time since being admitted, and while shaky was fine.

Hopefully all will be better now?”


This story is not exceptional … I have heard it many times from many people in many different parts of the UK.  It is the norm rather than the exception.

It is the story of a fragmented and fractured system of health and social care.

It is the story of frustration for everyone – patients, family, carers, NHS staff, commissioners, and tax-payers.  A fractured care system is unsafe, chaotic, frustrating and expensive.

There are no winners here.  It is not a trade off, compromise or best possible.

It is just poor system design.


What we want has a name … it is called a Frail Safe design … and this is not a new idea.  It is achievable. It has been achieved.

http://www.frailsafe.org.uk

So why is this still happening?

The reason is simple – the NHS does not know any other way.  It does not know how to design itself to be safe, calm, efficient, high quality and affordable.

It does not know how to do this because it has never learned that this is possible.

But it is possible to do, and it is possible to learn, and that learning does not take very long or cost very much.

And the return vastly outnumbers the investment.


The title of this blog is Righteous Indignation

… if your frail elderly parents, relatives or friends were forced to endure a system that is far from frail safe; and you learned that this situation was avoidable and that a safer design would be less expensive; and all you hear is “can’t do” and “too busy” and “not enough money” and “not my job” …  wouldn’t you feel a sense of righteous indignation?

I do.



Bloodsucking Bugs

This is a magnified picture of a blood sucking bug called a Red Poultry Mite.

They go red after having gorged themselves on chicken blood.

Their life-cycle is only 7 days so, when conditions are just right, they can quickly cause an infestation – and one that is remarkably difficult to eradicate!  But if it is not dealt with then chicken coop productivity will plummet.


We use the term “bug” for something else … a design error … in a computer program for example.  If the conditions are just right, then software bugs can spread too and can infest a computer system.  They feed on the hardware resources – slurping up processor time and memory space until the whole system slows to a crawl.


And one especially pernicious type of system design error is called an Error of Omission.  These are the things we do not do that would prevent the bloodsucking bugs from breeding and spreading.

Prevention is better than cure.


In the world of health care improvement there are some blood suckers out there, ones who home in on a susceptible host looking for a safe place to establish a colony.  They are masters of the art of mimicry.  They look like and sound like something they are not … they claim to be symbiotic whereas in reality they are parasitic.

The clue to their true nature is that their impact does not match their intent … but by the time that gap is apparent they are entrenched and their spores have already spread.

Unlike the Red Poultry Mites, we do not want to eradicate them … we need to educate them. They only behave like parasites because they are missing a few essential bits of software.  And once those upgrades are installed they can achieve their potential and become symbiotic.

So, let me introduce them, they are called Len, Siggy and Tock and here is their story:

Six Ways Not To Improve Flow

Crash Test Dummy

There are two complementary approaches to safety and quality improvement: desire and design.

In the improvement-by-desire world we use a suck-it-and-see approach to fix a problem.  It is called PDSA.

Sometimes this works and we pat ourselves on the back, and remember the learning for future use.

Sometimes it works for us but has a side effect: it creates a problem for someone else.  And we may not be aware of the unintended consequence unless someone shouts “Oi!” It may be too late by then of course.


The more parts in a system, and the more interconnected they are, the more likely it is that a well-intended suck-it-and-see change will create an unintended negative impact.

And in that situation our temptation is to … do nothing … and put up with the problems. It seems the safest option.


In the improvement-by-design world we choose to study first, and to find the causal roots of the system behaviour we are seeing.  Our first objective is a diagnosis.

With that we can propose rational design changes that we anticipate will deliver the improvement we seek without creating adverse effects.

But we have learned the hard way that our intuition can trick us … so we need a way to test our designs … a safe and controlled way.  We need a crash test dummy!


What they do is to deliberately experience our design in a controlled experiment, and what they generate for us is constructive feedback. What did work, and what did not.

A crash test dummy is tough and sensitive at the same time.  They do not break easily and yet they feel the pain and gain too.  They are resilient.


And with their feedback we can re-visit our design and improve it further, or we can use it to offer evidence-based assurance that our design is fit-for-purpose.

Safety and Quality Assurance is improvement-by-design. Diagnosis-and-treatment.

Safety and Quality Control is improvement-by-desire. Suck-and-see.

If you were a passenger or a patient … which option would you prefer?

Fragmentation Cost

The late Russell Ackoff used to tell a great story. It goes like this:

“A team set themselves the stretch goal of building the World’s Best Car.  So they put their heads together and came up with a plan.

First they talked to drivers and drew up a list of all the things that the World’s Best Car would need to have. Safety, speed, low fuel consumption, comfort, good looks, low emissions and so on.

Then they drew up a list of all the components that go into building a car. The engine, the wheels, the bodywork, the seats, and so on.

Then they set out on a quest … to search the world for the best components … and to bring the best one of each back.

Then they could build the World’s Best Car.

Or could they?

No.  All they built was a pile of incompatible parts. The WBC did not work. It was a futile exercise.


Then the penny dropped. The features in their wish-list were not associated with any of the separate parts. Their desired performance emerged from the way the parts worked together. The working relationships between the parts were as necessary as the parts themselves.

And a pile of average parts that work together will deliver a better performance than a pile of best parts that do not.

So the relationships were more important than the parts!


From this they learned that the quickest, easiest and cheapest way to degrade performance is to make working-well-together a bit more difficult.  Irrespective of the quality of the parts.


Q: So how do we reverse this degradation of performance?

A: Add more failure-avoidance targets of course!

But we just discovered that the performance is the effect of how the parts work well together?  Will another failure-metric-fueled performance target help? How will each part know what it needs to do differently – if anything?  How will each part know if the changes they have made are having the intended impact?

Fragmentation has a cost.  Fear, frustration, futility and ultimately financial failure.

So if performance is fading … the quality of the working relationships is a good place to look for opportunities for improvement.

Precious Life Time

Imagine this scenario:

You develop some non-specific symptoms.

You see your GP who refers you urgently to a 2 week clinic.

You are seen, assessed, investigated and informed that … you have cancer!


The shock, denial, anger, blame, bargaining, depression, acceptance sequence kicks off … it is sometimes called the Kübler-Ross grief reaction … and it is a normal part of the human psyche.

But there is better news. You also learn that your condition is probably treatable, but that it will require chemotherapy, and that there are no guarantees of success.

You know that time is of the essence … the cancer is growing.

And time has a new relevance for you … it is called life time … and you know that you may not have as much left as you had hoped.  Every hour is precious.


So now imagine your reaction when you attend your local chemotherapy day unit (CDU) for your first dose of chemotherapy and have to wait four hours for the toxic but potentially life-saving drugs.

They are very expensive and they have a short shelf-life so the NHS cannot afford to waste any.   The Aseptic Unit team wait until all the safety checks are OK before they proceed to prepare your chemotherapy.  That all takes time, about four hours.

Once the team get to know you it will go quicker. Hopefully.

It doesn’t.

The delays are not the result of unfamiliarity … they are the result of the design of the process.

All your fellow patients seem to suffer repeated waiting too, and you learn that they have been doing so for a long time.  That seems to be the way it is.  The waiting room is well used.

Everyone seems resigned to the belief that this is the best it can be.

They are not happy about it but they feel powerless to do anything.


Then one day someone demonstrates that it is not the best it can be.

It can be better.  A lot better!

And they demonstrate that this better way can be designed.

And they demonstrate that they can learn how to design this better way.

And they demonstrate what happens when they apply their new learning …

… by doing it and by sharing their story of “what-we-did-and-how-we-did-it”.

[Image: CDU waiting room]

If life time is so precious, why waste it?

And perhaps the most surprising outcome was that their safer, quicker, calmer design was also 20% more productive.

The Capstan

A capstan is a simple machine for combining the effort of many people and enabling them to achieve more than any of them could do alone.

The word appears to have come into English from the Portuguese and Spanish sailors at around the time of the Crusades.

Each sailor works independently of the others. There is no requirement for them to be equally strong because the capstan will combine their efforts.  And the capstan also serves as a feedback loop because everyone can sense when someone else pushes harder or slackens off.  It is an example of simple, efficient, effective, elegant design.


In the world of improvement we also need simple, efficient, effective and elegant ways to combine the efforts of many in achieving a common purpose.  Such as raising the standards of excellence and weighing the anchors of resistance.

In health care improvement we have many simultaneous constraints and we have many stakeholders with specific perspectives and special expertise.

And if we are not careful they will tend to pull only in their preferred direction … like a multi-way tug-o-war.  The result?  No progress and exhausted protagonists.

There are those focused on improving productivity – Team Finance.

There are those focused on improving delivery – Team Operations.

There are those focused on improving safety – Team Governance.

And we are all tasked with improving quality – Team Everyone.

So we need a synergy machine that works like a capstan-of-old, and here is one design.

It has four poles and it always turns in a clockwise direction, so the direction of push is clear.

And when all the protagonists push in the same direction, they will get their own ‘win’ and also assist the others to make progress.

This is how the sails of success are hoisted to catch the wind of change; and how the anchors of anxiety are heaved free of the rocks of fear; and how the bureaucratic bilge is pumped overboard to lighten our load and improve our speed and agility.

And the more hands on the capstan the quicker we will achieve our common goal.

Collective excellence.

Notably Absent

This week the King’s Fund published their Quality Monitoring Report for the NHS, and it makes depressing reading.

These highlights are a snapshot.

The website has some excellent interactive time-series charts that transform the deluge of data the NHS pumps out into pictures that tell a shameful story.

On almost all reported dimensions, things are getting worse and getting worse faster.

Which I do not believe is the intention.

But it is clearly the impact of the last 20 years of health and social care policy.


What is more worrying is the data that is notably absent from the King’s Fund QMR.

The first omission is outcome: How well did the NHS deliver on its intended purpose?  It is stated at the top of the NHS England web site …

[Image: NHS England purpose statement]

And let us be very clear here: dying, waiting, complaining, and over-spending are not measures of what we want: health and quality success metrics.  They are measures of what we do not want; they are failure metrics.

The fanatical focus on failure is part of the hyper-competitive, risk-averse medical mindset:

primum non nocere (first do no harm),

and as a patient I am reassured to hear that, but is no harm all I can expect?

What about:

tunc mederi (then do some healing)


And where is the data on dying in the King’s Fund QMR?

It seems to be notably absent.

And I would say that is a quality issue because it is something that patients are anxious about.  And that may be because they are given so much ‘open information’ about what might go wrong, not what should go right.


And you might think that sharp, objective data on dying would be easy to collect and to share.  After all, it is not conveniently fuzzy and subjective like satisfaction.

It is indeed mandatory to collect hospital mortality data, but sharing it seems to be a bit more of a problem.

The fear-of-failure fanaticism extends there too.  In the wake of humiliating, historical, catastrophic failures like Mid Staffs, all hospitals are monitored, measured and compared. And the negative deviants are named, shamed and blamed … in the hope that improvement might follow.

And to do the bench-marking we need to compare apples with apples; not peaches with lemons.  So we need to process the raw data to make it fair to compare; to ensure that factors known to be associated with higher risk of death are taken into account. Factors like age, urgency, co-morbidity and primary diagnosis.  Factors that are outside the circle-of-control of the hospitals themselves.

And there is an army of academics, statisticians, data processors, and analysts out there to help. The fruit of their hard work and dedication is called SHMI … the Summary Hospital-level Mortality Indicator.

[Image: SHMI specification]

Now, the most interesting paragraph is the third one which outlines what raw data is fed in to building the risk-adjusted model.  The first four are objective, the last two are more subjective, especially the diagnosis grouping one.

The importance of this distinction comes down to human nature: if a hospital is failing on its SHMI then it has two options:
(a) to improve its policies and processes to improve outcomes, or
(b) to manipulate the diagnosis group data to reduce the SHMI score.

And the latter is much easier to do, it is called up-coding, and basically it involves camping at the pessimistic end of the diagnostic spectrum. And we are very comfortable with doing that in health care. We favour the Black Hat.

And when our patients do better than our pessimistically-biased prediction, then our SHMI score improves and we look better on the NHS funnel plot.

We do not have to do anything at all about actually improving the outcomes of the service we provide, which is handy because we cannot do that. We do not measure it!
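To make the arithmetic of up-coding concrete, here is a minimal sketch in Python. It assumes the score behaves like a simple ratio of observed deaths to the deaths predicted by the risk model. The function name and all the numbers are made up for illustration; they are not taken from the SHMI specification or from any real hospital.

```python
def mortality_ratio(observed_deaths, expected_deaths):
    """Observed deaths divided by model-predicted deaths (assumed form of the score)."""
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    return observed_deaths / expected_deaths


# Hypothetical hospital: 110 deaths observed, 100 predicted by the model.
honest_coding = mortality_ratio(110, 100)

# Same patients, same care, same 110 deaths. Only the diagnosis coding is
# more pessimistic, so the model now predicts 115 deaths.
up_coded = mortality_ratio(110, 115)

print(f"honest coding: {honest_coding:.2f}")  # above 1: worse than predicted
print(f"up-coded:      {up_coded:.2f}")       # below 1: better than predicted
```

Nothing about the care changed between the two lines. Only the prediction moved.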


And what might be notably absent from the data fed in to the SHMI risk-model?  Data that is objective and easy to measure.  Data such as length of stay (LOS) for example?

Is there a statistical reason that LOS is omitted? Not really. Any relevant metric is a contender for pumping into a risk-adjustment model.  And we all know that the sicker we are, the longer we stay in hospital, and the less likely we are to come out unharmed (or at all).  And avoidable errors create delays and complications that imply more risk, more work and longer length of stay. Irrespective of the illness we arrived with.

So why has LOS been omitted from SHMI?

The reason may be more political than statistical.

We know that the risk of death increases with infirmity and age.

We know that if we put frail elderly patients into a hospital bed for a few days then they will decondition and become more frail, require more time in hospital, are more likely to need a transfer of care to somewhere other than home, are more susceptible to harm, and more likely to die.

So why is LOS not in the risk-of-death SHMI model?

And it is not in the King’s Fund QMR either.

Nor is the amount of cash being pumped in to keep the HMS NHS afloat each month.

All notably absent!

Undiscussables

Last week I shared a link to Dr Don Berwick’s thought provoking presentation at the Healthcare Safety Congress in Sweden.

Near the end of the talk Don recommended six books, and I was reassured that I already had read three of them. Naturally, I was curious to read the other three.

One of the unfamiliar books was “Overcoming Organizational Defenses” by the late Chris Argyris, a professor at Harvard.  I confess that I have tried to read some of his books before, but found them rather difficult to understand.  So I was intrigued that Don was recommending it as an ‘easy read’.  Maybe I am more of a dimwit than I previously believed!  So fear of failure took over my inner-chimp and I prevaricated. I flipped into denial. Who would willingly want to discover the true depth of their dimwittedness!


Later in the week, I was forwarded a copy of a recently published paper that was on a topic closely related to a key thread in Dr Don’s presentation:

understanding variation.

The paper was by researchers who had looked at the Board reports of 30 randomly selected NHS Trusts to examine how information on safety and quality was being shared and used.  They were looking for evidence that the Trust Boards understood the importance of variation and the need to separate ‘signal’ from ‘noise’ before making decisions on actions to improve safety and quality performance.  This was a point Don had stressed too, so there was a link.

The randomly selected Trust Board reports contained 1488 charts, of which only 88 demonstrated the contribution of chance effects (i.e. noise). Of these, 72 showed the Shewhart-style control charts that Don demonstrated. And of these, only 8 stated how the control limits were constructed (which is an essential requirement for the chart to be meaningful and useful).

That is a validity yield of 8 out of 1488, or 0.54%, which is for all practical purposes zero. Oh dear!
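For readers who have not met one, here is a minimal sketch of what “stating how the control limits were constructed” can look like. It uses one common recipe, the individuals (XmR) chart, where the limits are the mean plus and minus 2.66 times the average moving range. This is an assumption for illustration: the paper and the presentation may have used other chart types, and the weekly counts below are invented.

```python
from statistics import mean


def xmr_limits(values):
    """Return (centre, lower, upper) limits for an individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two data points")
    centre = mean(values)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is the standard XmR scaling constant (3 / 1.128).
    spread = 2.66 * mean(moving_ranges)
    return centre, centre - spread, centre + spread


def find_signals(values):
    """Return (index, value) pairs that fall outside the limits."""
    _, lower, upper = xmr_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Invented weekly counts of some safety event, for illustration only.
weekly_counts = [12, 15, 11, 14, 13, 12, 16, 13, 30, 14, 12, 15]

centre, lower, upper = xmr_limits(weekly_counts)
print(f"centre={centre:.2f} lower={lower:.2f} upper={upper:.2f}")
print("signals:", find_signals(weekly_counts))
```

Points inside the limits are treated as noise and left alone. Only the point outside them is flagged as a signal worth investigating.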


This chance combination of apparently independent events got me thinking.

Q1: What is the reason that NHS Trust Boards do not use these signal-and-noise separation techniques when it has been demonstrated, for at least 12 years to my knowledge, that they are very effective for facilitating improvement in healthcare? (e.g. Improving Healthcare with Control Charts by Raymond G. Carey was published in 2003).

Q2: Is there some form of “organizational defense” system in place that prevents NHS Trust Boards from learning useful ‘new’ knowledge?


So I surfed the Web to learn more about Chris Argyris and to explore in greater depth his concept of Single Loop and Double Loop learning.  I was feeling like a dimwit again because to me it is not a very descriptive title!  I suspect it is not to many others too.

I sensed that I needed to translate the concept into the language of healthcare and this is what emerged.

Single Loop learning is like treating the symptoms and ignoring the disease.

Double Loop learning is diagnosing the underlying disease and treating that.


So what are the symptoms?
The pain of NHS Trust failure on all dimensions – safety, delivery, quality and productivity (i.e. affordability for a not-for-profit enterprise).

And what are the signs?
The tell-tale sign is more subtle. It’s what is not present that is important. A serious omission. The missing bits are valid time-series charts in the Trust Board reports that show clearly what is signal and what is noise. This diagnosis is critical because the strategies for addressing them are quite different – as Julian Simcox eloquently describes in his latest essay.  If we get this wrong and we act on our unwise decision, then we stand a very high chance of making the problem worse, and demoralizing ourselves and our whole workforce in the process! Does that sound familiar?

And what is the disease?
Undiscussables.  Emotive subjects that are too taboo to table in the Board Room.  And the issue of what is discussable is one of the undiscussables so we have a self-sustaining system.  Anyone who attempts to discuss an undiscussable is breaking an unspoken social code.  Another undiscussable is behaviour, and our social code is that we must not upset anyone so we cannot discuss ‘difficult’ issues.  But by avoiding the issue (the undiscussable disease) we fail to address the root cause and end up upsetting everyone.  We achieve exactly what we are striving to avoid, which is the technical definition of incompetence.  And Chris Argyris labelled this as ‘skilled incompetence’.


Does an apparent lack of awareness of what is already possible fully explain why NHS Trust Boards do not use the tried-and-tested tool called a system behaviour chart to help them diagnose, design and deliver effective improvements in safety, flow, quality and productivity?

Or are there other forces at play as well?

Some deeper undiscussables perhaps?

Culture – cause or effect?

The Harvard Business Review is worth reading because many of its articles challenge deeply held assumptions, and then back up the challenge with the pragmatic experience of those who have succeeded in overcoming the limiting beliefs.

So the heading on the April 2016 copy that awaited me on my return from an Easter break caught my eye: YOU CAN’T FIX CULTURE.


 

[Image: Harvard Business Review, April 2016]

The successful leaders of major corporate transformations are agreed … the cultural change follows the technical change … and then the emergent culture sustains the improvement.

The examples presented include the Ford Motor Company, Delta Airlines, Novartis – so these are not corporate small fry!

The evidence suggests that the belief of “we cannot improve until the culture changes” is the mantra of failure of both leadership and management.


A health care system is characterised by a culture of risk avoidance. And for good reason. It is all too easy to harm while trying to heal!  Primum non nocere is a core tenet – first do no harm.

But, change and improvement implies taking risks – and those leaders of successful transformation know that the bigger risk by far is to become paralysed by fear and to do nothing.  Continual learning from many small successes and many small failures is preferable to crisis learning after a catastrophic failure!

The UK healthcare system is in a state of chronic chaos.  The evidence is there for anyone willing to look.  And waiting for the NHS culture to change, or pushing for culture change first appears to be a guaranteed recipe for further failure.

The HBR article suggests that it is better to stay focussed; to work within our circles of control and influence; to learn from others where knowledge is known, and where it is not – to use small, controlled experiments to explore new ground.


And I know this works because I have done it and I have seen it work.  Just by focussing on what is important to every member on the team; focussing on fixing what we could fix; not expecting or waiting for outside help; gathering and sharing the feedback from patients on a continuous basis; and maintaining patient and team safety while learning and experimenting … we have created a micro-culture of high safety, high efficiency, high trust and high productivity.  And we have shared the evidence via JOIS.

The micro-culture required to maintain the safety, flow, quality and productivity improvements emerged and evolved along with the improvements.

It was part of the effect, not the cause.


So the concept of ‘fix the system design flaws and the continual improvement culture will emerge’ seems to work at macro-system and at micro-system levels.

We just need to learn how to diagnose and treat healthcare system design flaws. And that is known knowledge.

So what is the next excuse?  Too busy?

FrailSafe Design

Safe means avoiding harm, and safety is an emergent property of a well-designed system.

Frail means infirm, poorly, wobbly and at higher risk of harm.

So we want our health care system to be a FrailSafe Design.

But is it? How would we know? And what could we do to improve it?


About ten years ago I was involved in a project to improve the safety design of a specific clinical stream flowing through the hospital that I work in.

The ‘at risk’ group of patients were frail elderly patients admitted as an emergency after a fall and who had suffered a fractured thigh bone. The neck of the femur.

Historically, the outcome for these patients was poor.  Many did not survive, and many of the survivors never returned to independent living. They became even more frail.


The project was undertaken during an organisational transition: the hospital was being ‘taken over’ by a bigger one.  This created a window of opportunity for some disruptive innovation, and the project was labelled as a ‘Lean’ one because we had been inspired by similar work done at Bolton some years before and Lean was the flavour of the month.

The actual change was small: it was a flow design tweak that cost nothing to implement.

First we asked two flow questions:
Q1: How many of these high-risk frail patients do we admit a year?
A1: About one per day on average.
Q2: What is the safety critical time for these patients?
A2: The first four days.  The sooner they have hip surgery and are actively mobilised, the better their outcome.

Second we applied Little’s Law which showed the average number of patients in this critical phase is four. This was the ‘work in progress’ or WIP.
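As a worked check of that arithmetic, Little’s Law relates the average work in progress to the average arrival rate and the average time spent in the system. The symbols below are our own labels for illustration and are not taken from the published paper:

```latex
\[
\text{WIP} \;=\; \lambda \times W
\;=\; 1\ \tfrac{\text{patient}}{\text{day}} \times 4\ \text{days}
\;=\; 4\ \text{patients}
\]
```

Here $\lambda$ is the average admission rate from A1 and $W$ is the safety critical time from A2. The law gives an average only; it says nothing about how much the number varies from day to day.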

And we knew that variation is always present, and we knew that having all these patients in one place would make it much easier for the multi-disciplinary teams to provide timely care and to avoid potentially harmful delays.

So we suggested that one six-bedded bay on one of the trauma wards be designated the Fractured Neck Of Femur bay.
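One way to see why a six-bedded bay is more comfortable than a four-bedded one is a rough sketch like the one below. It assumes, purely for illustration, that admissions arrive at random and independently, so that the number of patients in the critical phase follows a Poisson distribution with the mean given by Little’s Law. That assumption, and the function name `poisson_cdf`, are ours; this is not necessarily the reasoning the project team used.

```python
import math

def poisson_cdf(k, mean):
    """Probability that a Poisson count with the given mean is <= k."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

arrivals_per_day = 1.0   # from A1: about one admission per day
critical_days = 4.0      # from A2: the first four days
average_wip = arrivals_per_day * critical_days  # Little's Law

# Assumption: patients in the critical phase ~ Poisson(average_wip)
for beds in (4, 6):
    p = poisson_cdf(beds, average_wip)
    print(f"{beds} beds are enough about {p:.0%} of the time")
```

Under these assumptions a bay sized at the average of four would be enough only about 63% of the time, whereas six beds would be enough about 89% of the time. The extra two beds are the buffer that absorbs the variation.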

That was the flow diagnosis and design done.

The safety design was created by the multi-disciplinary teams who looked after these patients: the geriatricians, the anaesthetists, the perioperative emergency care team (PECT), the trauma and orthopaedic team, the physiotherapists, and so on.

They designed checklists to ensure that all #NOF patients got what they needed when they needed it and so that nothing important was left to chance.

And that was basically it.

And the impact was remarkable. The stream flowed. And one measured outcome was a dramatic and highly statistically significant reduction in mortality.

Injury_2011_Results
The full paper was published in Injury 2011; 42: 1234-1237.

We had created a FrailSafe Design … which implied that what was happening before was clearly not safe for these frail patients!


And there was an improved outcome for the patients who survived: A far larger proportion rehabilitated and returned to independent living, and a far smaller proportion required long-term institutional care.

By learning how to create and implement a FrailSafe Design we had added both years-to-life and life-to-years.

It cost nothing to achieve and the message was clear, as this quote from the 2011 paper illustrates …

Injury_2011_Message

What was a bit disappointing was the gap of four years between delivering this dramatic and highly significant patient safety and quality improvement and the sharing of the story.


What is more exciting is that the concept of FrailSafe is growing, evolving and spreading.

Grit in the Oyster

The word pearl is a metaphor for something rare, beautiful, and valuable.

Pearls are formed inside the shell of certain mollusks as a defense mechanism against a potentially threatening irritant.

The mollusk creates a pearl sac to seal off the irritation.


And so it is with change and improvement.  The growth of precious pearls of improvement wisdom – the ones that develop slowly over time – is triggered by an irritant.

Someone asking an uncomfortable question perhaps, or presenting some information that implies that an uncomfortable question needs to be asked.


About seven years ago a question was asked “Would improving healthcare flow and quality result in lower costs?”

It is a good question because some believe that it would and some believe that it would not.  So an experiment to test the hypothesis was needed.

The Health Foundation stepped up to the challenge and funded a three year project to find the answer. The design of the experiment was simple. Take two oysters and introduce an irritant into them and see if pearls of wisdom appeared.

The two ‘oysters’ were Sheffield Hospital and Warwick Hospital and the irritant was Dr Kate Silvester who is a doctor and manufacturing system engineer and who has a bit-of-a-reputation for asking uncomfortable questions and backing them up with irrefutable information.


Two rare and precious pearls did indeed grow.

In Sheffield, it was proved that by improving the design of their elderly care process they improved the outcome for their frail, elderly patients.  More went back to their own homes and fewer left via the mortuary.  That was the quality and safety improvement. They also showed a shorter length of stay and a reduction in the number of beds needed to store the work in progress.  That was the flow and productivity improvement.

What was interesting to observe was how difficult it was to get these profoundly important findings published.  It appeared that a further irritant had been created for the academic peer review oyster!

The case study was eventually published in Age and Ageing 2014; 43: 472-77.

The pearl that grew around this seed is the Sheffield Microsystems Academy.


In Warwick, it was proved that the A&E 4 hour performance could be improved by focussing on improving the design of the processes within the hospital, downstream of A&E.  For example, a redesign of the phlebotomy and laboratory process to ensure that clinical decisions on a ward round are based on today’s blood results.

This specific case study was eventually published as well, but by a different path – one specifically designed for sharing improvement case studies – JOIS 2015; 22:1-30

And the pearls of wisdom that developed as a result of irritating many oysters in the Warwick bed are clearly described by Glen Burley, CEO of Warwick Hospital NHS Trust in this recent video.


Getting the results of all these oyster bed experiments published required irritating the Health Foundation oyster … but a pearl grew there too and emerged as the full Health Foundation report which can be downloaded here.


So if you want to grow a fistful of improvement and a bagful of pearls of wisdom … then you will need to introduce a bit of irritation … and Dr Kate Silvester is a proven source of grit for your oyster!

Learning How To Manage …

Learning how to manage is as vital as learning how to lead.

by Julian Simcox

Recently I blogged to introduce the re-publication of my 10 year old essay:

“Intervening into Personal and Organisational Systems by Powerfully Leading and Wisely Managing”

The key ideas in that essay were seven fold:

  1. Aiming to develop Leadership separately from Management is likely to confuse anyone targeted by a separatist training programme, the reality being that everyone in organisational life is necessarily and simultaneously both Managing and Leading (M/L) and often desperately trying to integrate them as two very different action-logics.
  2. Managing and Leading are not roles but ways of thinking and acting that need to be intently chosen, according to the particular learning context (one of three) that any Managerial Leader (12) is facing.
  3. Like in Stephen Covey’s “Maturity Continuum” (8) M/L capability evolves over time (see the diagram below) and makes possible a transformational outcome, if supported in one’s organisation by sufficient and timely post-conventional thinking.
  4. Such an outcome (9,10,11,14,17,19,20,21,23) occurred in Toyota from 1950, making it possible for the organisation to evolve into what Peter Senge (18) calls a “Learning Organisation” – one in which improvement science (4) ensues continually from the bottom-up, within a structure that has evolved top-down.
  5. In Toyota’s case it was W. Edwards Deming who is most credited with having been the catalyst. Jim Collins (6) evidences eleven other examples of an organisational transformation sparked by an individual with a post-conventional world view that transcended a pre-existing conventional one.
  6. Deming talked a lot about ways of thinking – paradigms – that, like Euclidian geometry, make sense in their own world, but not outside it. When speaking with anyone in a client organisation he always aimed at being empathic to a person’s individual frame of reference. He was interested in how individuals make their own common sense because he had learned that it is this that often negatively impacts an individual’s decision-making process and hence their impact on an organisational system that needs to continually learn – a phenomenon he called “tampering”.
  7. The diagram seeks to capture the ways in which paradigms (world views) collectively and sequentially evolve. It combines the research of several practitioners (2,7,15,16) who sought to empirically trace the archetypal evolution of individual sense-making.

JS_Blog_20160307_Fig1

In 2013, Don Berwick (5) recommended to the UK government that, in order to prioritise quality and safety, the National Health Service must become a Deming-style learning organisation. The NHS however is not one single organisation, it is a thousand organisations – both privately and publicly owned.  Yet if structured with “Liberating Disciplines” (22) via appropriately set central standards (e.g. tools that prompt thinking that is scientifically methodical), each can be invited as a single organisation to transform itself into a body with learning as its core value. Berwick seems to appreciate that out of the apparently sufficient conventional thinking, enough post-conventional managerial leadership will then have a chance to take root, and in time bloom.

The purpose of this blog is to introduce a second essay:

“Managerial Leadership: Five action-logics viewed via two developmental lenses.”

In the first essay I used P-D-S-A as the integrative link between Managing and Leading – offering a total of just three learning contexts, but this always felt a little over-simplistic and in 2005 when coaching my daughter Josie – then in her sandwich year as an undergraduate trainee in the hospitality industry – I was persuaded by her to further sub-divide the two M/L modes – replacing two with four:

  1. maintaining
  2. continually improving
  3. innovating
  4. transforming.

Applying this new 4 action-logic model, Josie succeeded in transforming the fortunes of her hotel – winning a national award for her efforts – and this made me wonder if she might be on to something important?

I decided to use the new version of the model to explore what it would look like through first a “conventional” lens, and then second a “post-conventional” lens – illustrating the kinds of paradigm shifts that one might see in action when inside a learning organisation, in particular the way that accountabilities for performance are handled.

It is hard to describe a post-conventional way of seeing things to someone who developmentally has discovered only the conventional way – about 85% of adults. It is as if the instructions about how to get out of the box are on the outside. It is hoped that this essay may help some individuals unlock this conundrum. In a learning organisation for example it turns out that real-time data and feedback are essential for continually prompting individuals and organisations to rapidly evolve a new way of seeing.

BaseLine® for example is a tool that has been designed with this in mind. It allows conventional organisations and individuals, even those considering themselves relatively innumerate, to develop post-conventional habits; simply by using the time-series data that in many cases is already being collected – albeit usually for reasons of top-down accountability rather than methodical improvement. In this way, healthy developmental conversation gets sparked – and at all organisational levels: bottom, middle and top.

It also turns out that Continuous Improvement when seen through the second lens is not the same as Continual Improvement (mode 2) – and this is another one of the paradigm shifts that gets explained in the essay. Here is the model as it then appears:

JS_Blog_20160307_Fig2

Note that a fifth action-logic mode, modelling, is also now included. This emerged out of conversations I was having with Simon Dodds when writing the final draft in 2011. The essence of this mode is embodied in a phrase coined by the late Russell Ackoff – “idealized design” (1) – using modern computing technology to facilitate transformative change within tolerable levels of risk.

People often readily admit to spending much of their life in mode 1 (maintaining), whilst really preferring to be in mode 3 (innovating) – even admitting to seeing mode 1 as relatively boring, or at best as overly bureaucratic. Such individuals are especially prone to tampering, and may even shun regimes in which they feel overly controlled. What the post-conventional worldview offers however is not the prospect of being controlled, but the prospect of being in control – whilst simultaneously letting go – a paradox that is not easy to get unless developmentally ready – hence the 2005 essay. This goes for the tools too – especially when being deployed with the full cultural support that can flow from an organisation imbued with sufficient post-conventional design.

If the organisation can be designed to sufficiently support the right people to take control of each critical process or sub-system, at the right level (usually the lowest point in the hierarchy at which accountability may be accepted), so that they feel safely equipped to make sound decisions, then genuine empowerment becomes possible. Essentially, people then feel safe enough to self-empower and take charge of their system.

Toyota are an exemplar “learning organisation” – actually a system of organisations that work so harmoniously as a whole that by continually adapting to its changing environment, risk can be smoothly managed. Their preoccupation from bottom to top is understanding in real time what is changing so that changes (to the system) can then be proactively and wisely made. Each employee at each organisational level is educated to both manage and lead.

This approach has enabled them to grow to become the largest volume car maker in the world – and largely via organic growth alone. They have achieved this simply by constantly delivering what the customer wants with low variation (hence high reliability) and by continually studying that variation to uncover the real causes of problems. Performance is continually assessed over time and seen largely as pertaining to the system rather than being down to any one individual. Job hoppers – who though charismatic may also be practiced at being able to avoid having to live with the longer-term consequences of their actions – are not appointed to key roles.

Some will read the essay and say to themselves that little of this applies to me or my organisation – “we’re not Toyota, we’re not a private company, and we’re not even in manufacturing”. That however is likely to be a conventional view. The post-conventional principles described in the essay apply as much to service industries as to the public sector – both commissioners and providers – some of whom would intentionally evolve a post-conventional culture if given the space to do so.

At the very least I hope to have succeeded in convincing you, even if you don’t buy in to the notion of a Berwick-style learning system, that schooling people in management or leadership separately, or without a workable definition of each, is likely to be both cruel to the individual and to court dysfunction in the organisation.

References

  1. Ackoff R. Why so few organisations adopt systems thinking – 2007
  2. Beck D.E & Cowan C.C. – Spiral Dynamics – Mastering Values, Leadership, and Change – 1996
  3. Berwick D. – The Science of Improvement – 2008 : http://www.allhealth.org/BriefingMaterials/JAMA-Berwick-1151.pdf
  4. Berwick D. – The Science of Improvement – 2008 : http://www.allhealth.org/BriefingMaterials/JAMA-Berwick-1151.pdf
  5. Berwick Donald M. – Berwick Review into patient safety – 2013
  6. Collins J.C. – Level 5 Leadership: The triumph of Humility and Fierce Resolve – HBR Jan 2001
  7. Cook-Greuter. S. – Maps for living: ego-Development Stages Symbiosis to Conscious Universal Embeddedness – 1990
  8. Covey. S.R. – The 7 habits of Highly Effective People – 1989   (ISBN 0613191455)
  9. Delavigne K.T & Robertson J. D. – Deming’s profound changes – 1994
  10. Deming W. Edwards – Out of the Crisis – 1986 (ISBN 0-911379-01-0)
  11. Deming W.Edwards – The New Economics – 1993 (ISBN 0-911379-07-X) First edition
  12. Jaques. E. – Requisite Organisation: A Total System for Effective Managerial Organisation and Managerial Leadership for the 21st Century 1998 (ISBN 1886436045)
  13. Kotter. J. P. – A Force for Change: How Leadership Differs from Management – 1990
  14. Liker J.K & Meier D. – The Toyota Way Fieldbook – 2006
  15. Rooke D and Torbert W.R. – Organisational Transformation as a function of CEO’s Development Stage 1998 (Organisation Development Journal, Vol. 6.1)
  16. Rooke D and Torbert W.R. – Seven Transformations of Leadership – Harvard Business Review April 2005
  17. Scholtes Peter R. The Leader’s Handbook: Making Things Happen, Getting Things Done – 1998
  18. Senge. P. M. – The Fifth Discipline 1990 ISBN 10 – 0385260946
  19. Spear. S and Bowen H. K- Decoding the DNA of the Toyota Production System – Harvard Business Review Sept/Oct 1999
  20. Spear. S. – Learning to Lead at Toyota – Harvard Business Review – May 2004
  21. Takeuchi H, Osono E, Shimizu N. The contradictions that drive Toyota’s success. Harvard Business Review: June 2008
  22. Torbert W.R. & Associates – Action Inquiry – The secret of timely and transforming leadership – 2004
  23. Wheeler Donald J. – Advanced Topics in Statistical Process Control – the power of Shewhart Charts – 1995

 

The Cost of Chaos

This week I conducted an experiment – on myself.

I set myself the challenge of measuring the cost of chaos, and it was tougher than I anticipated it would be.

It is easy enough to grasp the concept that fire-fighting to maintain patient safety amidst the chaos of healthcare would cost more in terms of tears and time …

… but it is tricky to translate that concept into hard numbers; i.e. cash.


Chaos is an emergent property of a system.  Safety, delivery, quality and cost are also emergent properties of a system. We can measure cost, our finance departments are very good at that. We can measure quality – we just ask “How did your experience match your expectation”.  We can measure delivery – we have created a whole industry of access target monitoring.  And we can measure safety by checking for things we do not want – near misses and never events.

But while we can feel the chaos we do not have an easy way to measure it. And it is hard to improve something that we cannot measure.


So the experiment was to see if I could create some chaos, then if I could calm it, and then if I could measure the cost of the two designs – the chaotic one and the calm one.  The difference, I reasoned, would be the cost of the chaos.

And to do that I needed a typical chunk of a healthcare system: like an A&E department where the relationship between safety, flow, quality and productivity is rather important (and has been a hot topic for a long time).

But I could not experiment on a real A&E department … so I experimented on a simplified but realistic model of one. A simulation.

What I discovered came as a BIG surprise, or more accurately a sequence of big surprises!

  1. First I discovered that it is rather easy to create a design that generates chaos and danger.  All I needed to do was to assume I understood how the system worked and then use some averaged historical data to configure my model.  I could do this on paper or I could use a spreadsheet to do the sums for me.
  2. Then I discovered that I could calm the chaos by reactively adding lots of extra capacity in terms of time (i.e. more staff) and space (i.e. more cubicles).  The downside of this approach was that my costs sky-rocketed; but at least I had restored safety and calm and I had eliminated the fire-fighting.  Everyone was happy … except the people expected to foot the bill. The finance director, the commissioners, the government and the tax-payer.
  3. Then I got a really big surprise!  My safe-but-expensive design was horribly inefficient.  All my expensive resources were now running at rather low utilisation.  Was that the cost of the chaos I was seeing? But when I trimmed the capacity and costs the chaos and danger reappeared.  So was I stuck between a rock and a hard place?
  4. Then I got a really, really big surprise!!  I hypothesised that the root cause might be the fact that the parts of my system were designed to work independently, and I was curious to see what happened when they worked interdependently. In synergy. And when I changed my design to work that way the chaos and danger did not reappear and the efficiency improved. A lot.
  5. And the biggest surprise of all was how difficult this was to do in my head; and how easy it was to do when I used the theory, techniques and tools of Improvement-by-Design.
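The first surprise in the list above can be illustrated with a very small sketch. It is not the model used in the experiment; it is a single-queue toy with one clinician, and every name and number in it (`simulate_mean_wait`, a 10-minute average gap between arrivals, a 9-minute average treatment time) is a hypothetical chosen for illustration. It compares a design checked against averages only with the same design exposed to random variation around those averages.

```python
import random

def simulate_mean_wait(mean_gap, mean_service, n_patients, variable, seed=1):
    """Average wait (minutes) for one clinician seeing patients in arrival order."""
    rng = random.Random(seed)
    arrival = 0.0   # arrival time of the current patient
    free_at = 0.0   # time at which the clinician next becomes free
    total_wait = 0.0
    for _ in range(n_patients):
        if variable:
            # Assumption: exponentially distributed gaps and treatment times
            gap = rng.expovariate(1.0 / mean_gap)
            service = rng.expovariate(1.0 / mean_service)
        else:
            # "Averaged historical data": every patient is the average patient
            gap, service = mean_gap, mean_service
        arrival += gap
        start = max(arrival, free_at)
        total_wait += start - arrival
        free_at = start + service
    return total_wait / n_patients

on_paper = simulate_mean_wait(10.0, 9.0, 20000, variable=False)
in_reality = simulate_mean_wait(10.0, 9.0, 20000, variable=True)
print(f"Average wait with averages only: {on_paper:.1f} min")
print(f"Average wait with variation:     {in_reality:.1f} min")
```

With averages only, nobody waits, so the design looks safe on paper. With variation included, the same capacity produces long and erratic waits even though the clinician is, on average, only 90% busy. That gap between the paper design and the observed behaviour is one simple source of the chaos described above.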

So if you are curious to learn more … I have written up the full account of the experiment with rationale, methods, results, conclusions and references and I have published it here.

Does your job title say “Manager” or “Leader”?

by Julian Simcox

Actually, it doesn’t much matter because everyone needs to be able to choose between managing and leading – as distinct and yet mutually complementary action-logics – and to argue that one is better than the other, or worse to try to school people about just one of them on its own, is inane. The UK’s National Health Service for example is currently keen on convincing medics that they should become “clinical leaders”, the term “clinical manager” being rarely heard, yet if anything the NHS suffers more from a shortage of management skill.

It is not only healthcare that is short on management. In the first half of my career I held the title “manager” in seven different roles, and in three different organisations, and had even completed an Exec MBA, but still didn’t properly get what it meant. The people I reported into also had little idea about what “managing well” actually meant, and even if they had possessed an inclination to coach me, would have merely added to my confusion.

If however you are fortunate enough to be working in an organisation that over time has been purposefully developed as a “Learning Culture” you will have acquired an appreciation of the vital distinction between managing and leading, and just what a massive difference this makes to your effectiveness, for it requires you, before you act, to understand (11) how your system is really flowing and performing. Only then will you be ready to choose whether to manage or to lead.

It is therefore not your role’s title that matters but whether the system you are running is stable, and whether it is capable of producing the outcomes needed by your customers. It also matters how risk is to be handled by you and your organisation when you are making changes. Outcomes will depend heavily upon you and your team’s accumulated levels of learning – as well, as it turns out, upon your personal world view/ developmental stage (more of which later).

Here is a diagram that illustrates that there are three basic learning contexts that a “managerial leader” (7) needs to be adept at operating within if they are to be able to nimbly choose between them.

JS_Blog_20160221_Fig1

Depending on one’s definitions of the processes of managing and leading, most people would agree that the first learning context pertains to the process of managing, and the third to the process of leading. The second context (P-D-S-A), which helpfully for NHS employees is core to the NHS “Model of Improvement”, turns out to be especially vital for effective managerial leadership, for it binds the other two contexts together – as long as you know how.

Following the Mid-Staffs Hospital disaster, David Cameron asked Professor Don Berwick to recommend how to enhance public safety in the UK’s healthcare system. Unusually for a clinician he gets the importance of understanding your system and knowing moment-to-moment whether managing or leading is the right course of action. He recommends that to evolve a system to be as safe as it can be, all NHS employees should “Learn, master and apply the modern methods of quality control, quality improvement and quality planning” (1). He makes this recommendation because without the thinking that accompanies modern quality control methods, clinical managerial leadership is lame.

The Journal of Improvement Science has recently re-published my 10 year old essay called:

“Intervening into Personal and Organisational Systems by Powerfully Leading and Wisely Managing”

Originally written from the perspective of a practising executive coach, and as a retrospective on the work of W. Edwards Deming, the essay describes just what it is that a few extraordinary Managerial Leaders seem to possess that enables them to simultaneously Manage and Lead Transformation – first of themselves, and second of their organisation. The essay culminates in a comparison of “conventional” and “post-conventional” organisations. Toyota (9,12), in which Deming’s influence continues to be profound, is used as an example of the latter. Using the 3 generic intervention modes/learning contexts, and the way that these correspond to an executive’s evolving developmental stage, I illustrate how this works and with it what a massive difference it makes. It is only in the later (post-conventional) stages for example that the processes of managing and leading are seen as two sides of the same coin. Dee Hock (6) called these heightened levels of awareness “chaordic” and Jim Collins (2) calls the level of power this brings “Level 5 Leadership”.

JS_Blog_20160221_Fig2

Berwick, borrowing from Deming (4,5) knows that to be structured-to-learn organisations need systems thinking (11) – and that organisations need Managerial Leaders who are sufficiently developed to know how to think and intervene systemically – in other words he recognises the need for personally developing the capability to lead and manage.

Deming in particular seemed to understand the importance of developing empathy for different worldviews – he knew that each contains coherence, just as in its own flat-earth world Euclidian geometry makes perfect sense. When consulting he spent much of his time listening and asking people questions that might develop paradigmatic understanding – theirs and his. Likewise in my own work, primed with knowledge about the developmental stage of key individual players, I am more able to give my interventions teeth.

Possessing a definition of managerial leadership that can work at all the stages is also vital:

Managing =  keeping things flowing, and stable – and hence predictable – so you can consistently and confidently deliver what you’re promising. Any improvement comes from noticing what causes instability and eliminating that cause, or from learning what causes it via experimentation.

Leading  =  changing things, or transforming them, which risks a temporary loss of stability/ predictability in order to shift performance to a new and better level – a level that can then be managed and sustained.
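To make “stable and hence predictable” a little more concrete, here is a minimal sketch of one common way to test a time series for stability: an XmR (individuals and moving range) chart of the kind described by Wheeler. The function names and the weekly figures are hypothetical, and this is offered as one possible illustration rather than as the method used in the essay or in any particular tool.

```python
from statistics import mean

def xmr_limits(values):
    """Natural process limits: centre line +/- 2.66 x average moving range."""
    centre = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)
    return centre - spread, centre, centre + spread

def signals(values):
    """Indexes of points that fall outside the natural process limits."""
    lower, _, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical weekly counts, e.g. delayed discharges per week
weekly = [10, 12, 11, 13, 12, 11, 10, 12, 11, 30]
print(xmr_limits(weekly))
print(signals(weekly))
```

Points inside the limits are treated as routine variation, to be managed without tampering. A point outside the limits is a signal worth investigating, which is the “noticing what causes instability” part of the managing definition.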

If you resonate with the first essay you need to know that after publishing it I continued to develop the managerial leadership model into one that would work equally well for Managerial Leaders in either developmental epoch – conventional and post-conventional – whilst simultaneously balancing the level of change needed with the level of risk that’s politically tolerable – and all framed by the paradigm-shifts that typically characterise these two epochs. This revised model is described in detail in the essay:

Managerial Leadership: Five action logics viewed via two developmental lenses

– also soon to be made available via the Journal of Improvement Science.

References

  1. Berwick Donald M. – Berwick Review into patient safety (2013)
  2. Collins J.C. – Level 5 Leadership: The triumph of Humility and Fierce Resolve – HBR Jan 2001
  3. Covey. S.R. – The 7 habits of Highly Effective People – 1989 (ISBN 0613191455)
  4. Deming W. Edwards – Out of the Crisis – 1986   (ISBN 0-911379-01-0)
  5. Deming W.E – The New Economics – 1993 (ISBN 0-911379-07-X) First edition
  6. Hock. D. – The birth of the Chaordic Age 2000 (ISBN: 1576750744)
  7. Jaques. E. – Requisite Organisation: A Total System for Effective Managerial Organisation and Managerial Leadership for the 21st Century 1998 (ISBN 1886436045)
  8. Kotter. J. P. – A Force for Change: How Leadership Differs from Management – 1990
  9. Liker J.K & Meier D. – The Toyota Way Fieldbook. 2006
  10. Scholtes Peter R. The Leader’s Handbook: Making Things Happen, Getting Things Done. 1998
  11. Senge. P. M. – The Fifth Discipline 1990   ISBN 10-0385260946
  12. Spear. S. – Learning to Lead at Toyota – Harvard Business Review – May 2004

New Meat for Old Bones

Evolution is an amazing process.

Using the same building blocks that have been around for a long time, it cooks up innovative permutations and combinations that reveal new and ever more useful properties.

Very often a breakthrough in understanding comes from a simplification, not from making it more complicated.

Knowledge evolves in just the same way.

Sometimes a well understood simplification in one branch of science is used to solve an ‘impossible’ problem in another.

Cross-fertilisation of learning is a healthy part of the evolution process.


Improvement implies evolution of knowledge and understanding, and then application of that insight in the process of designing innovative ways of doing things better.


And so it is in healthcare.  For many years the emphasis on healthcare improvement has been the Safety-and-Quality dimension, and for very good reasons.  We need to avoid harm and we want to achieve happiness; for everyone.

But many of the issues that plague healthcare systems are not primarily SQ issues … they are flow and productivity issues. FP. The safety and quality problems are secondary – so only focussing on them is treating the symptoms and not the cause.  We need to balance the wheel … we need flow science.


Fortunately the science of flow is well understood … outside healthcare … but apparently not so well understood inside healthcare … given the queues, delays and chaos that seem to have become the expected norm.  So there is a big opportunity for cross fertilisation here.  If we choose to make it happen.


For example, from computer science we can borrow the knowledge of how to schedule tasks to make best use of our finite resources and at the same time avoid excessive waiting.

It is a very well understood science. There is comprehensive theory, a host of techniques, and fit-for-purpose tools that we can pick off the shelf and use. Today if we choose to.
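As a small taste of what that borrowed knowledge looks like, the sketch below compares two textbook scheduling policies for a single resource: first-come-first-served and shortest-task-first. The task durations are made-up numbers and the function name is ours. In a real clinical setting, urgency and safety would of course constrain the order as well, so this is an illustration of the principle and not a recommended policy.

```python
def mean_wait(durations):
    """Average time each task waits before starting, run in the given order."""
    waited = 0
    elapsed = 0
    for duration in durations:
        waited += elapsed      # this task waited for everything before it
        elapsed += duration
    return waited / len(durations)

# Hypothetical task durations in minutes, in order of arrival
tasks = [30, 5, 10, 5]

print("First come, first served:", mean_wait(tasks))
print("Shortest task first:     ", mean_wait(sorted(tasks)))
```

The total work is identical in both cases and no extra resource is added, yet the average wait falls from 27.5 minutes to 8.75 minutes simply by changing the order. That is the sense in which scheduling design, rather than capacity, can be the cause of a queue.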

So what are the reasons we do not?

Is it because healthcare is quite introspective?

Is it because we believe that there is something ‘special’ about healthcare?

Is it because there is no evidence … no hard proof … no controlled trials?

Is it because we assume that queues are always caused by lack of resources?

Is it because we do not like change?

Is it because we do not like to admit that we do not know stuff?

Is it because we fear loss of face?


Whatever the reasons, the evidence and experience show that most (if not all) of the queues, delays and chaos in healthcare systems are iatrogenic.

This means that they are self-generated. And that implies we can un-self-generate them … at little or no cost … if only we knew how.

The only cost is to our egos of having to accept that there is knowledge out there that we could use to move us in the direction of excellence.

New meat for our old bones?

The Bit In The Middle

 

A question that is often asked by doctors in particular is “What is the difference between Research, Audit and Improvement Science?”.

It is a very good question and the diagram captures the essence of the answer.

Improvement science is like a bridge between research and audit.

To understand why that is we first need to ask a different question: “What are the purposes of research, improvement science and audit? What do they do?”

In a nutshell:

Research provides us with new knowledge and tells us what the right stuff is.
Improvement Science provides us with a way to design our system to do the right stuff.
Audit provides us with feedback and tells us if we are doing the right stuff right.


Research requires a suggestion and an experiment to test it. A suggestion might be “Drug X is better than drug Y at treating disease Z”, and the experiment might be a randomised controlled trial (RCT).  The way this is done is that subjects with disease Z are randomly allocated to two groups, the control group and the study group.  A measure of ‘better’ is devised and used in both groups. Then the study group is given drug X and the control group is given drug Y and the outcomes are compared.  The randomisation is needed because there are always many sources of variation that we cannot control, and that variation almost guarantees that there will be some difference between our two groups. So then we have to use sophisticated statistical data analysis to answer the question “Is there a statistically significant difference between the two groups? Is drug X actually better than drug Y?”

And research is often a complicated and expensive process because to do it well requires careful study design, a lot of discipline, and usually large study and control groups. It is an effective way to help us to know what the right stuff is but only in a generic sense.


Audit requires a standard to compare with and to know if what we are doing is acceptable, or not. There is no randomisation between groups but we still need a metric and we still need to measure what is happening in our local reality.  We then compare our local experience with the global standard and, because variation is inevitable, we have to use statistical tools to help us perform that comparison.

And very often audit focuses on avoiding failure; in other words the standard is a ‘minimum acceptable standard’ and as long as we are not failing it then that is regarded as OK. If we are shown to be failing then we are in trouble!

And very often the most sophisticated statistical tool used for audit is called an average.  We measure our performance, we average it over a period of time (to remove the troublesome variation), and we compare our measured average with the minimum standard. And if it is below then we are in trouble and if it is above then we are not.  We have no idea how reliable that conclusion is though because we discounted any variation.
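To see why an average on its own can mislead, here is a minimal sketch. The two sets of weekly figures and the 95% standard applied to them are invented purely for illustration; they are not real audit data. Both series have the same average, so an audit-by-average treats them as identical, yet one of them fails the standard in half of the weeks.

```python
from statistics import mean, pstdev

# Hypothetical weekly performance figures (% within the standard).
# Both series are invented for illustration; they are not real audit data.
steady = [96, 96, 96, 96, 96, 96, 96, 96]
erratic = [100, 92, 100, 92, 100, 92, 100, 92]

MINIMUM_STANDARD = 95  # assumed 'minimum acceptable standard' for this sketch


def audit_by_average(series, standard=MINIMUM_STANDARD):
    """Return True if the averaged performance meets the standard."""
    return mean(series) >= standard


def weeks_below_standard(series, standard=MINIMUM_STANDARD):
    """Count the individual weeks that fell below the standard."""
    return sum(1 for value in series if value < standard)


for name, series in (("steady", steady), ("erratic", erratic)):
    print(
        name,
        "average:", mean(series),
        "spread:", round(pstdev(series), 1),
        "passes audit:", audit_by_average(series),
        "weeks below standard:", weeks_below_standard(series),
    )
```

The average says "OK" in both cases. Only when we keep the variation in view do the four failing weeks in the second series become visible.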


A perfect example of this target-driven audit approach is the A&E 95% 4-hour performance target.

The 4-hours defines the metric we are using; the time interval between a patient arriving in A&E and them leaving. It is called a lead time metric. And it is easy to measure.

The 95% defines the minimum acceptable proportion of people who are in A&E for less than 4 hours, and it is usually aggregated over three months. And it is easy to measure.

So, if about 200 people arrive in a hospital A&E each day and we aggregate for 90 days that is about 18,000 people in total so the 95% 4-hour A&E target implies that we accept as OK for about 900 of them to be there for more than 4-hours.

Do the 900 agree? Do the other 17,100?  Has anyone actually asked the patients what they would like?


The problem with this “avoiding failure” mindset is that it can never lead to excellence. It can only deliver just above the minimum acceptable. That is called mediocrity.  It is perfectly possible for a hospital to deliver 100% on its A&E 4 hour target by designing its process to ensure every one of the 18,000 patients is there for exactly 3 hours and 59 minutes. It is called a time-trap design.

We can hit the target and miss the point.
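The arithmetic behind the 900 patients, and the time-trap design, can be checked with a short sketch. The arrival rate, the 90 days, the 95% and the 4 hours come from the text; the function name `performance` and the list of lead times are just illustrative choices.

```python
ARRIVALS_PER_DAY = 200   # from the text
DAYS = 90                # from the text
TARGET_PERCENT = 95      # the 95% standard
LIMIT_MINUTES = 4 * 60   # the 4-hour limit

total_patients = ARRIVALS_PER_DAY * DAYS
# Integer arithmetic avoids floating-point rounding in the percentage.
allowed_breaches = total_patients * (100 - TARGET_PERCENT) // 100


def performance(lead_times_minutes, limit=LIMIT_MINUTES):
    """Fraction of patients whose lead time is under the limit."""
    within = sum(1 for t in lead_times_minutes if t < limit)
    return within / len(lead_times_minutes)


# A 'time-trap' design: every patient leaves after exactly 3 hours 59 minutes.
time_trap = [239] * total_patients

print(total_patients, allowed_breaches)  # 18000 900
print(performance(time_trap))            # 1.0
```

The time-trap design scores a perfect 100% even though every single patient waited almost the full four hours, which is exactly why the percentage alone tells us so little.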

And what is more the “4-hours” and the “95%” are completely arbitrary numbers … there is not a shred of research evidence to support them.

So just this one example illustrates the many problems created by having a gap between research and audit.


And that is why we need Improvement Science to help us to link them together.

We need improvement science to translate the global knowledge and apply it to deliver local improvement in whatever metrics we feel are most important. Safety metrics, flow metrics, quality metrics and productivity metrics. Simultaneously. To achieve system-wide excellence. For everyone, everywhere.

When we learn Improvement Science we learn to measure how well we are doing … we learn the power of measurement of success … and we learn to avoid averaging because we want to see the variation. And we still need a minimum acceptable standard because we want to exceed it 100% of the time. And we want continuous feedback on just how far above the minimum acceptable standard we are. We want to see how excellent we are, and we want to share that evidence and our confidence with our patients.

We want to agree a realistic expectation rather than paint a picture of the worst case scenario.

And when we learn Improvement Science we will see very clearly where to focus our improvement efforts.


Improvement Science is the bit in the middle.


Stop Press:  There is currently an offer of free on-line foundation training in improvement science for up to 1000 doctors-in-training … here  … and do not dally because places are being snapped up fast!

Turning the Corner

The emotional journey of change feels like a roller-coaster ride and if we draw it as an emotion-versus-time chart it looks like the diagram above.

The toughest part is getting past the low point called the Well of Despair and doing that requires a combination of inner strength and external support.

The external support comes from an experienced practitioner who has been through it … and survived … and has the benefit of experience and hindsight.

The Improvement Science coach.


What happens as we apply the IS principles, techniques and tools that we have diligently practised and rehearsed? We discover that … they work!  And all the fence-sitters and the skeptics see it too.

We start to turn the corner and what we feel next is that the back pressure of resistance falls a bit. It does not go away, it just gets less.

And that means that the next test of change is a bit easier and we start to add more evidence that the science of improvement does indeed work and moreover it is a skill we can learn, demonstrate and teach.

We have now turned the corner of disbelief and have started the long, slow, tough climb through mediocrity to excellence.


This is also a time of risks and there are several to be aware of:

  1. The objective evidence that dramatic improvements in safety, flow, quality and productivity are indeed possible and that the skills can be learned will trigger those most threatened by the change to fight harder to defend their disproved rhetoric. And do not underestimate how angry and nasty they can get!
  2. We can too easily become complacent and believe that the rest will follow easily. It doesn’t.  We may have nailed some of the easier niggles to be sure … but there are much more challenging ones ahead.  The climb to excellence is a steep learning curve … all the way. But the rewards get bigger and bigger as we progress so it is worth it.
  3. We risk over-estimating our capability and then attempting to take on the tougher improvement assignments without the necessary training, practice, rehearsal and support. If we do that we will crash and burn.  It is like a game of snakes and ladders.  Our IS coach is there to help us up the ladders and to point out where the slippery snakes are lurking.

So before embarking on this journey be sure to find a competent IS coach.

They are easy to identify because they will have a portfolio of case studies that they have done themselves. They have evidence of successful outcomes and evidence that they can walk-the-talk.

And avoid anyone who talks-the-walk but does not have a portfolio of evidence of their own competence. Their Siren song will lure you towards the submerged Rocks of Disappointment and they will disappear like morning mist when you need them most – when it comes to the toughest part – turning the corner. You will be abandoned and fall into the Well of Despair.

So ask your IS coach for credentials, case studies and testimonials and check them out.

A Case of Chronic A&E Pain: Part 6

Dr Bob runs a Clinic for Sick Systems and is sharing the Case of St Elsewhere’s® Hospital which is suffering from chronic pain in their A&E department.

The story so far: The history and examination of St.Elsewhere’s® Emergency Flow System have revealed that the underlying disease includes carveoutosis multiforme.  StE has consented to a knowledge transplant but is suffering symptoms of disbelief – the emotional rejection of the new reality. Dr Bob prescribed some loosening up exercises using the Carveoutosis Game.  This is the appointment to review the progress.


<Dr Bob> Hello again. I hope you have done the exercises as we agreed.

<StE> Indeed we have.  Many times in fact because at first we could not believe what we were seeing. We even modified the game to explore the ramifications.  And we have an apology to make. We discounted what you said last week but you were absolutely correct.

<Dr Bob> I am delighted to hear that you have explored further and I applaud you for the curiosity and courage in doing that.  There is no need to apologize. If this flow science was intuitively obvious then we would not be having this conversation. So, how have you used the new understanding?

<StE> Before we tell the story of what happened next we are curious to know where you learned about this?

<Dr Bob> The pathogenesis of carveoutosis spatialis has been known for about 100 years but in a different context.  The story goes back to the 1870s when Alexander Graham Bell invented the telephone.  He was not an engineer or mathematician by background; he was interested in phonetics and he was a pragmatist and experimented by making things. He invented the telephone and the Bell Telephone Co. was born.  This innovation spread like wildfire, as you can imagine, and by the early 1900s there were many telephone companies all over the world.  At that time the connections were made manually by telephone operators using patch boards and the growing demand created a new problem.  How many lines and operators were needed to provide a high quality service to bill paying customers? In other words … to achieve an acceptably low chance of hearing the reply “I’m sorry but all lines are busy, please try again later”.  Adding new lines and more operators was a slow and expensive business so they needed a way to predict how many would be needed – and how to do that was not obvious!  In 1917, a Danish mathematician, statistician and engineer called Agner Krarup Erlang published a paper with the solution.  It contained a complicated formula that described the relationship, and his Erlang B equation allowed telephone exchanges to be designed, built and staffed and to provide a high quality service at an acceptably low cost.  Mass real-time voice communication by telephone became affordable and has transformed the world.

<StE> Fascinating! We sort of sense there is a link here and certainly the “high quality and low cost” message resonates for us. But how does designing telephone exchanges relate to hospital beds?

<Dr Bob> If we equate an emergency admission needing a bed to a customer making a phone call, and we equate the number of telephone lines to the number of beds, then the two systems are very similar from the flow physics perspective. Erlang’s scary-looking equation can be used to estimate the minimum number of beds needed to achieve any specified level of admission service quality if you know the average rate of demand and the average length of stay.  That is how I made the estimate last week. It is this predictable-within-limits behaviour that you demonstrated to yourself with the Carveoutosis Game.
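For readers who want to try the estimate Dr Bob describes, here is a minimal sketch of the Erlang B calculation. It is not Dr Bob's own worksheet. The function names and the example unit (10 admissions a day, a 5-day average stay, a 1% limit) are assumptions made for illustration. The formula itself assumes that arrivals are random and independent and that an arrival who finds every bed full is turned away rather than queued.

```python
def erlang_b(offered_load, servers):
    """Blocking probability from the Erlang B formula.

    offered_load: average arrival rate multiplied by average holding time
                  (in erlangs). servers: number of lines, or beds.
    Uses the standard recurrence B(0) = 1,
    B(n) = A*B(n-1) / (n + A*B(n-1)), which avoids large factorials.
    """
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def minimum_beds(admissions_per_day, mean_stay_days, max_blocking):
    """Smallest number of beds whose blocking probability is within the limit."""
    offered_load = admissions_per_day * mean_stay_days
    beds = 0
    while erlang_b(offered_load, beds) > max_blocking:
        beds += 1
    return beds


# Hypothetical unit: 10 emergency admissions a day, 5-day average stay,
# and at most a 1% chance that an arriving patient finds every bed full.
print(minimum_beds(10, 5, 0.01))
```

Note that the answer is noticeably more than the 50 beds that the average demand alone would suggest. The difference is the resilience needed to absorb the variation.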

<StE> And this has been known for nearly 100 years but we have only just learned about it!

<Dr Bob> Yes. That is a bit annoying isn’t it?

<StE> And that explains why when we ‘ring-fence’ our fixed stock of beds the 4-hour performance falls!

<Dr Bob> Yes, that is a valid assertion. By doing that you are reducing your space-capacity resilience and the resulting danger, chaos, disappointment and escalating cost is completely predictable.

<StE> So our pain is iatrogenic as you said! We have unwittingly caused this. That is uncomfortable news to hear.

<Dr Bob> The root cause is actually not what you have done wrong, it is what you have not done right. It is an error of omission. You have not learned to listen to what your system is telling you. You have not learned how that can help you to deepen your understanding of how your system works. It is that information, knowledge, understanding and wisdom that you need to design a safer, calmer, higher quality and more affordable healthcare system.

<StE> And now we can see our omission … before it was like a blind spot … and now we can see the fallacy of our previously deeply held belief: that it was impossible to solve this without more beds, more staff and more money.  The gap is now obvious where before it was invisible. It is like a light has been turned on.  Now we know what to do and we are on the road to recovery. We need to learn how to do this ourselves … but not by guessing and meddling … we need to learn to diagnose and then to design and then to deliver safety, flow, quality and productivity.  All at the same time.

<Dr Bob> Welcome to the world of Improvement Science. And here I must sound a note of caution … there is a lot more to it than just blindly applying Erlang’s B equation. That will get us into the ball-park, which is a big leap forward, but real systems are not just simple, passive games of chance; they are complicated, active and adaptive.  Applying the principles of flow design in that context requires more than just mathematics, statistics and computer models.  But that know-how is available and accessible too … and waiting for when you are ready to take that leap of learning.

OK. I do not think you require any more help from me at this stage. You have what you need and I wish you well.  And please let me know the outcome.

<StE> Thank you and rest assured we will. We have already started writing our story … and we wanted to share that with you today … but with this new insight we will need to write a few more chapters first.  This is really exciting … thank you so much.
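The ring-fencing effect that StE noticed can be illustrated with the same Erlang B formula. This is a sketch under the usual Erlang B assumptions (random arrivals, no queueing), and the bed numbers and demand are hypothetical. It compares one pooled stock of beds with the same stock carved into two equal, ring-fenced halves that each serve half of the demand.

```python
def erlang_b(offered_load, servers):
    """Erlang B blocking probability via the standard recurrence."""
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


# Hypothetical numbers: 40 beds serving an offered load of 30 erlangs
# (for example 6 admissions a day with a 5-day average stay).
pooled = erlang_b(30, 40)

# The same beds and the same demand, ring-fenced into two equal halves.
ring_fenced = erlang_b(15, 20)

print(f"pooled: {pooled:.3f}, ring-fenced: {ring_fenced:.3f}")
```

Nothing has been added or taken away, yet the chance of an arriving patient finding no bed is higher in the carved-out design. That is the loss of space-capacity resilience that Dr Bob refers to.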


St.Elsewhere’s® is a registered trademark of Kate Silvester Ltd,  and to read more real cases of 4-hour A&E pain download Kate’s: The Christmas Crisis



The Catastrophe is Coming

[Figure: Monitor report summary graphic]


This week an interesting report was published by Monitor – about some possible reasons for the A&E debacle that England experienced in the winter of 2014.

Summary At A Glance

“91% of trusts did not meet the A&E 4-hour maximum waiting time standard last winter – this was the worst performance in 10 years”.


So it seems a bit odd that the very detailed econometric analysis and the testing of “Ten Hypotheses” did not look at the pattern of change over the previous 10 years … it just compared Oct-Dec 2014 with the same period for 2013! And the conclusion: “Hospitals were fuller in 2014”.  H’mm.


The data needed to look back 10 years is readily available on the various NHS England websites … so here it is plotted as simple time-series charts.  These are called system behaviour charts or SBCs. Our trusted analysis tools will be a Mark I Eyeball connected to the 1.3 kg of wetware between our ears that runs ChimpOS 1.0 …  and we will look back 11 years to 2004.

First we have the A&E Arrivals chart … about 3.4 million arrivals per quarter. The annual cycle is obvious … higher in the summer and falling in the winter. And when we compare the first five years with the last six years there has been a small increase of about 5% and that seems to be associated with a change of political direction in 2010.

So over 11 years the average A&E demand has gone up … a bit … but only by about 5%.


In stark contrast the number of A&E arrivals that are admitted to hospital has risen relentlessly over the same 11 year period by about 50% … that is about 5% per annum … ten times the increase in arrivals … and with no obvious step in 2010. We can see the annual cycle too.  It is like a ratchet. Click click click.


But that does not make sense. Where are these extra admissions going to? We can only conclude that over 11 years we have progressively added more places to admit A&E patients into.  More space-capacity to store admitted patients … so we can stop the 4-hour clock perhaps? More emergency assessment units perhaps? Places to wait with the clock turned off perhaps? The charts imply that our threshold for emergency admission has been falling: Admission has become increasingly the ‘easier option’ for whatever reason.  So why is this happening? Do more patients need to be admitted?


In a recent empirical study we asked elderly patients about their experience of the emergency process … and we asked them just after they had been discharged … when it was still fresh in their memories. A worrying pattern emerged. Many said that they had been admitted despite them saying they did not want to be.  In other words they did not willingly consent to admission … they were coerced.

This is anecdotal data so, by implication, it is wholly worthless … yes?  Perhaps from a statistical perspective but not from an emotional one.  It is a red petticoat being waved that should not be ignored.  Blissful ignorance comes from ignoring anecdotal stuff like this. Emotionally uncomfortable anecdotal stories. Ignore the early warning signs and suffer the potentially catastrophic consequences.


And here is the corresponding A&E 4-hour Target Failure chart.  Up to 2010 the imposed target was 98% success (i.e. 2% acceptable failure) and, after a bit of “encouragement” in 2004-5, this was actually achieved in some of the summer months (when the A&E demand was highest remember).

But with a change of political direction in 2010 the “hated” 4-hour target was diluted down to 95% … so a 5% failure rate was now ‘acceptable’ politically, operationally … and clinically.

So it is no huge surprise that this is what was achieved … for a while at least.

In the period 2010-13 the primary care trusts (PCTs) were dissolved and replaced by clinical commissioning groups (CCGs) … the doctors were handed the ignition keys to the juggernaut that was already heading towards the cliff.

The charts suggest that the seeds were already well sown by 2010 for an evolving catastrophe that peaked last year; and the changes in 2010 and 2013 may have just pressed the accelerator pedal a bit harder. And if the trend continues it will be even worse this coming winter. Worse for patients and worse for staff and worse for commissioners and worse for politicians. Lose lose lose lose.


So to summarise the data from NHS England’s own website:

1. A&E arrivals have gone up 5% over 11 years.
2. Admissions from A&E have gone up 50% over 11 years.
3. Since lowering the threshold for acceptable A&E performance from 98% to 95% the system has become unstable and “fallen off the cliff” … but remember, a temporal association does not prove causation.

So what has triggered the developing catastrophe?

Well, it is important to appreciate that when a patient is admitted to hospital it represents an increase in workload for every part of the system that supports the flow through the hospital … not just the beds.  Beds represent space-capacity. They are just where patients are stored.  We are talking about flow-capacity; and that means people, consumables, equipment, data and cash.

So if we increase emergency admissions by 50% then, if nothing else changes, we will need to increase the flow-capacity by 50% and the space-capacity to store the work-in-progress by 50% too. This is called Little’s Law. It is a mathematically proven Law of Flow Physics. It is not negotiable.
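Little's Law says that the average work-in-progress equals the average throughput multiplied by the average lead time. A minimal sketch of the reasoning above, using hypothetical figures (40 admissions a day and a 5-day average stay are assumptions, not NHS data):

```python
def work_in_progress(throughput_per_day, lead_time_days):
    """Little's Law: average work in progress = throughput x lead time."""
    return throughput_per_day * lead_time_days


# Hypothetical figures: 40 emergency admissions a day, 5-day average stay.
before = work_in_progress(40, 5)       # average occupied beds before
after = work_in_progress(40 * 1.5, 5)  # admissions up 50%, stay unchanged

# The average stay that would hold occupancy at the original level instead.
stay_needed = before / (40 * 1.5)

print(before, after, after / before)
print(round(stay_needed, 2))
```

With the length of stay unchanged, 50% more admissions means 50% more patients in beds at any moment. The only other way to balance the equation is to shorten the average stay by a third, which is a flow-design question rather than a bed-count one.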

So have we increased our flow-capacity and our space-capacity (and our costs) by 50%? I don’t know. That data is not so easy to trawl from the websites. It will be there though … somewhere.

What we have seen is an increase in bed occupancy (the red box on Monitor’s graphic above) … but not a 50% increase … that is impossible if the occupancy is already over 85%.  A hospital is like a rigid metal box … it cannot easily expand to accommodate a growing queue … so the inevitable result is an increase in the ‘pressure’ inside.  We have created an emergency care pressure cooker. Well lots of them actually.

And that is exactly what the staff who work inside hospitals say it feels like.

And eventually the relentless pressure and daily hammering causes the system to start to weaken and fail, gradually at first then catastrophically … which is exactly what the NHS England data charts are showing.


So what is the solution?  More beds?

Nope.  More beds will create more space and that will relieve the pressure … for a while … but it will not address the root cause of why we are admitting 50% more patients than we used to; and why we seem to need to increase the pressure inside our hospitals to squeeze the patients through the process and extrude them out of the various exit nozzles.

Those are the questions we need to have understandable and actionable answers to.

Q1: Why are we admitting 5% more of the same A&E arrivals each year rather than delivering what they need in 4 hours or less and returning them home? That is what the patients are asking for.

Q2: Why do we have to push patients through the in-hospital process rather than pulling them through? The staff are willing to work but not inside a pressure cooker.


A more sensible improvement strategy is to look at the flow processes within the hospital and ensure that all the steps and stages are pulling together to the agreed goals and plan for each patient. The clinical management plan that was decided when the patient was first seen in A&E. The intended outcome for each patient and the shortest and quickest path to achieving it.


Our target is not just a departure within 4 hours of arriving in A&E … it is a competent diagnosis (study) and an actionable clinical management plan (plan) within 4 hours of arriving; and then a process that is designed to deliver (do) it … for every patient. Right, first time, on time, in full and at a cost we can afford.

Q: Do we have that?
A: Nope.

Q: Is that within our gift to deliver?
A: Yup.

Q: So what is the reason we are not already doing it?
A: Good question.  Who in the NHS is trained how to do system-wide flow design like this?

Not as Easy as it Looks

One of the traps for the inexperienced Improvement Science Practitioner is to believe that applying the science in the real world is as easy as it is in the safety of the training environment.

It isn’t.

The real world is messier and more complicated and it is easy to get lost in the fog of confusion and chaos.


So how do we avoid losing our footing, slipping into the toxic emotional swamp of organisational culture and giving ourselves an unpleasant dunking?

We use safety equipment … to protect ourselves and others from unintended harm.

The Improvement-by-Design framework is like a scaffold.  It is there to provide structure and safety.  The techniques and tools are like the harnesses, shackles, ropes, crampons, and pitons.  They give us flexibility and security.

But we need to know how to use them. We need to be competent as well as confident.

We do not want to tie ourselves up in knots … and we do not want to discover that we have not tied ourselves to something strong enough to support us if we slip. Which we will.


So we need to learn and practise the basic skills to the point that they are second nature.

We need to learn how to tie secure knots, quickly and reliably.

We need to learn how to plan an ascent … identifying the potential hazards and designing around them.

We need to learn how to assemble and check what we will need before we start … not too much and not too little.

We need to learn how to monitor our progress against our planned milestones and be ready to change the plan as we go … and even to abandon the attempt if necessary.


We would not try to climb a real mountain without the necessary training, planning, equipment and support … even though it might look easy.

And we do not try to climb an improvement mountain without the necessary training, planning, tools and support … even though it might look easy.

It is not as easy as it looks.

The Five-day versus Seven-day Bun-Fight

There is a big bun-fight kicking off on the topic of 7-day working in the NHS.

The evidence is that there is a statistical association between mortality in hospital of emergency admissions and day of the week: and weekends are more dangerous.

There are fewer staff working at weekends in hospitals than during the week … and delays and avoidable errors increase … so risk of harm increases.

The evidence also shows that significantly fewer patients are discharged at weekends.


So the ‘obvious’ solution is to have more staff on duty at weekends … which will cost more money.


Simple, obvious, linear and wrong.  Our intuition has tricked us … again!


Let us unravel this Gordian Knot with a bit of flow science and a thought experiment.

1. The evidence shows that there are fewer discharges at weekends … and so demonstrates lack of discharge flow-capacity. A discharge process is not a single step, there are many things that must flow in sync for a discharge to happen … and if any one of them is missing or delayed then the discharge does not happen or is delayed.  The weakest link effect.

2. The evidence shows that the number of unplanned admissions varies rather less across the week; which makes sense because they are unplanned.

3. So add those two together and at weekends we see hospitals filling up with unplanned admissions – not because the sick ones are arriving faster – but because the well ones are leaving slower.

4. The effect of this is that at weekends the queue of people in beds gets bigger … and they need looking after … which requires people and time and money.

5. So the number of staffed beds in a hospital must be enough to hold the biggest queue – not the average or some fudged version of the average like a 95th percentile.

6. So a hospital running a 5-day model needs more beds because there will be more variation in bed use and we do not want to run out of beds and delay the admission of the newest and sickest patients. The ones at most risk.

7. People do not get sicker because there is better availability of healthcare services – but saying we need to add more unplanned care flow capacity at weekends implies that they do.  What is actually required is that the same amount of flow-resource that is currently available Mon-Fri is spread out Mon-Sun. The flow-capacity is designed to match the customer demand – not the convenience of the supplier.  And that means for all parts of the system required for unplanned patients to flow.  What, where and when. It costs the same.

8. Then what happens is that the variation in the maximum size of the queue of patients in the hospital will fall and empty beds will appear – as if by magic.  Empty beds that ensure there is always one for a new, sick, unplanned admission on any day of the week.

9. And empty beds that are never used … do not need to be staffed … so there is a quick way to reduce expensive agency staff costs.

So with a comprehensive 7-day flow-capacity model the system actually gets safer, less chaotic, higher quality and less expensive. All at the same time. Safety-Flow-Quality-Productivity.
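The thought experiment above can be run as a deliberately simple, deterministic sketch. All the numbers are hypothetical: 10 unplanned admissions every day, and the same weekly total of 70 discharges delivered either Monday to Friday only or spread evenly across seven days. Real systems also have random variation, which this sketch leaves out on purpose.

```python
ADMISSIONS_PER_DAY = 10  # hypothetical, and held constant across the week

# Two discharge patterns with the same weekly total of 70 (Monday first).
FIVE_DAY = [14, 14, 14, 14, 14, 0, 0]
SEVEN_DAY = [10, 10, 10, 10, 10, 10, 10]


def occupancy_history(discharges_by_day, start=100, weeks=4):
    """End-of-day bed occupancy for a deterministic admit/discharge cycle."""
    occupied = start
    history = []
    for day in range(7 * weeks):
        occupied += ADMISSIONS_PER_DAY - discharges_by_day[day % 7]
        history.append(occupied)
    return history


def swing(history):
    """Difference between the fullest and the emptiest day."""
    return max(history) - min(history)


for label, pattern in (("5-day", FIVE_DAY), ("7-day", SEVEN_DAY)):
    history = occupancy_history(pattern)
    print(label, "peak", max(history), "trough", min(history),
          "swing", swing(history))
```

Same admissions, same weekly discharge capacity, same cost. The 5-day pattern needs 20 more staffed beds at its Sunday peak than at its Friday trough, while the 7-day pattern needs no such buffer.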

What is Productivity?

It was time for Bob and Leslie’s regular coaching session. Bob was already online when Leslie dialled in to the teleconference.

<Leslie> Hi Bob, sorry I am a bit late.

<Bob> No problem Leslie. What aspect of improvement science shall we explore today?

<Leslie> Well, I’ve been working through the Safety-Flow-Quality-Productivity cycle in my project and everything is going really well.  The team are really starting to put the bits of the jigsaw together and can see how the synergy works.

<Bob> Excellent. And I assume they can see the sources of antagonism too.

<Leslie> Yes, indeed! I am now up to the point of considering productivity and I know it was introduced at the end of the Foundation course but only very briefly.

<Bob> Yes, productivity was described as a system metric. A ratio of a stream metric and a stage metric … what we get out of the streams divided by what we put into the stages.  That is a very generic definition.

<Leslie> Yes, and that I think is my problem. It is too generic and I get it confused with concepts like efficiency.  Are they the same thing?

<Bob> A very good question and the short answer is “No”, but we need to explore that in more depth.  Many people confuse efficiency and productivity and I believe that is because we learn the meaning of words from the context that we see them used in. If others use the words imprecisely then it generates discussion, antagonism and confusion and we are left with the impression that it is a ‘difficult’ subject.  The reality is that it is not difficult when we use the words in a valid way.

<Leslie> OK. That reassures me a bit … so what is the definition of efficiency?

<Bob> Efficiency is a stream metric – it is the ratio of the minimum cost of the resources required to complete one task divided by the actual cost of the resources used to complete one task.

<Leslie> Um.  OK … so how does time come into that?

<Bob> Cost is a generic concept … it can refer to time, money and lots of other things.  If we stick to time and money then we know that if we have to employ ‘people’ then time will cost money because people need money to buy essential stuff that they need for survival. Water, food, clothes, shelter and so on.

<Leslie> So we could use efficiency in terms of resource-time required to complete a task?

<Bob> Yes. That is a very useful way of looking at it.

<Leslie> So how is productivity different? Completed tasks out divided by cash in to pay for resource time would be a productivity metric. It looks the same.

<Bob> Does it?  The definition of efficiency is possible cost divided by actual cost. It is not the same as our definition of system productivity.
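
To make the distinction concrete, here is a minimal sketch in Python of the two ratios as Bob defines them. The function names and the example figures (30 and 40 minutes of resource-time, 120 tasks, £6,000) are illustrative assumptions, not data from the course.

```python
def efficiency(minimum_cost, actual_cost):
    """Stream metric: minimum cost to complete one task / actual cost used."""
    if actual_cost <= 0:
        raise ValueError("actual_cost must be positive")
    return minimum_cost / actual_cost


def productivity(stream_output, stage_input):
    """System metric: what we get out of the streams / what we put into the stages."""
    if stage_input <= 0:
        raise ValueError("stage_input must be positive")
    return stream_output / stage_input


if __name__ == "__main__":
    # Hypothetical task: 30 minutes is the least resource-time it could take,
    # but 40 minutes were actually used.
    print("efficiency:", efficiency(30, 40))
    # Hypothetical system: 120 completed tasks for 6000 pounds of resource time.
    print("productivity (tasks per pound):", productivity(120, 6000))
```

Efficiency compares two costs of the same kind, so it has no units and cannot exceed 1. Productivity compares an output with an input, so it carries units (here, tasks per pound) and says nothing by itself about how much of the input was wasted.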

<Leslie> Ah yes, I see. So do others define productivity the same way?

<Bob> Try looking it up on Wikipedia …

<Leslie> OK … here we go …

“Productivity is an average measure of the efficiency of production. It can be expressed as the ratio of output to inputs used in the production process, i.e. output per unit of input”.

Now that is really confusing!  It looks like efficiency and productivity are the same. Let me see what the Wikipedia definition of efficiency is …

“Efficiency is the (often measurable) ability to avoid wasting materials, energy, efforts, money, and time in doing something or in producing a desired result”.

But that is closer to your definition of efficiency – the actual cost is the minimum cost plus the cost of waste.

<Bob> Yes.  I think you are starting to see where the confusion arises.  And this is because there is a critical piece of the jigsaw missing.

<Leslie> Oh …. and what is that?

<Bob> Worth.

<Leslie> Eh?

<Bob> Efficiency has nothing to do with whether the output of the stream has any worth.  I can produce a worthless product with low waste … in other words very efficiently.  And what if we have the situation where the output of my process is actually harmful.  The more efficiently I use my resources the more harm I will cause from a fixed amount of resource … and in that situation it is actually safer to have an inefficient process!

<Leslie> Wow!  That really hits the nail on the head … and the implications are … profound.  Efficiency is objective and relates only to flow … and between flow and productivity we have to cross the Safety-Quality line. Productivity also includes the subjective concept of worth or value. That all makes complete sense now. A productive system is a subjectively and objectively win-win-win design.

<Bob> Yup.  Get the safety, flow and quality perspectives of the design in synergy and productivity will sky-rocket. It is called a Fit-4-Purpose design.

Excellent or Mediocre?

Many organisations proclaim that their mission is to achieve excellence but then proceed to deliver mediocre performance.

Why is this?

It is certainly not from lack of purpose, passion or people.

So the flaw must lie somewhere in the process.


The clue lies in how we measure performance … and to see the collective mindset behind the design of the performance measurement system we just need to examine the key performance indicators or KPIs.

Do they measure failure or success?


Let us look at some from the NHS …. hospital mortality, hospital acquired infections, never events, 4-hour A&E breaches, cancer wait breaches, 18 week breaches, and so on.

In every case the metric reported is a failure metric. Not a success metric.

And the focus of action is getting away from failure.

Damage mitigation, damage limitation and damage compensation.


So we have the answer to our question: we know we are doing a good job when we are not failing.

But are we?

When we are not failing we are not doing a bad job … is that the same as doing a good job?

Q: Does excellence  = not excrement?

A: No. There is something between these extremes.

The succeed-or-fail dichotomy is a distorting simplification created by applying an arbitrary threshold to a continuous measure of performance.


And how, specifically, have we designed our current system to avoid failure?

Usually by imposing an arbitrary target connected to a punitive reaction to failure. Management by fear.

This generates punishment-avoidance and back-covering behaviour which is manifest as a lot of repeated checking and correcting of the inevitable errors that we find.  A lot of extra work that requires extra time and that requires extra money.

So while an arbitrary-target-driven-check-and-correct design may avoid failing on safety, the additional cost may cause us to then fail on financial viability.

Out of the frying pan and into the fire.

No wonder Governance and Finance come into conflict!

And if we do manage to pull off an uneasy compromise … then what level of quality are we achieving?


Studies show that if we take a random sample of 100 people from the pool of ‘disappointed by their experience’ and we ask if they are prepared to complain then only 5% will do so.

So if we use complaints as our improvement feedback loop and we react to that and make changes that eliminate these complaints then what do we get? Excellence?

Nope.

We get what we designed … just good enough to avoid the 5% of complaints but not the 95% of disappointment.
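
The arithmetic behind that conclusion can be sketched in a few lines of Python. The 5% complaint rate is the figure quoted above; the function name and the sample size of 100 are only there for illustration.

```python
def complaint_feedback(disappointed, complaint_rate):
    """Split a pool of disappointed people into those we hear from and those we do not."""
    heard = round(disappointed * complaint_rate)
    unheard = disappointed - heard
    return heard, unheard


if __name__ == "__main__":
    heard, unheard = complaint_feedback(disappointed=100, complaint_rate=0.05)
    print(f"complaints received: {heard}")
    print(f"disappointed but silent: {unheard}")
```

A feedback loop driven only by the first number is blind to the second, however diligently we respond to each complaint.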

We get mediocrity.


And what do we do then?

We start measuring ‘customer satisfaction’ … which is actually asking the question ‘did your experience meet your expectation?’

And if we find that satisfaction scores are disappointingly low then how do we improve them?

We have two choices: improve the experience or reduce the expectation.

But as we are very busy doing the necessary checking-and-correcting then our path of least resistance to greater satisfaction is … to lower expectations.

And we do that by donning the black hat of the pessimist and we lay out the risks and dangers.

And by doing that we generate anxiety and fear.  Which was not the intended outcome.


Our mission statement proclaims ‘trusted to achieve excellence’ not ‘designed to deliver mediocrity’.

But mediocrity is what the evidence says we are delivering. Just good enough to avoid a smack from the Regulators.

And if we are honest with ourselves then we are forced to conclude that:

A design that uses failure metrics as the primary feedback loop can achieve no better than mediocrity.


So if we choose  to achieve excellence then we need a better feedback design.

We need a design that uses success metrics as the primary feedback loop and we use failure metrics only in safety critical contexts.

And the ideal people to specify the success metrics are those who feel the benefit directly and immediately … the patients who receive care and the staff who give it.

Ask a patient what they want and they do not say “To be treated in less than 18 weeks”.  In fact I have yet to meet a patient who has even heard of the 18-week target!

A patient will say ‘I want to know what is wrong, what can be done, when it can be done, who will do it, what do I need to do, and what can I expect to be the outcome’.

Do we measure any of that?

Do we measure accuracy of diagnosis? Do we measure use of best evidenced practice? Do we know the possible delivery time (not the actual)? Do we inform patients of what they can expect to happen? Do we know what they can expect to happen? Do we measure outcome for every patient? Do we feed that back continuously and learn from it?

Nope.


So …. if we choose and commit to delivering excellence then we will need to start measuring-4-success and feeding what we see back to those who deliver the care.

Warts and all.

So that we know when we are doing a good job, and we know where to focus further improvement effort.

And if we abdicate that commitment and choose to deliver mediocrity-by-default then we are the engineers of our own chaos and despair.

We have the choice.

We just need to make it.

The Improvement Pyramid

Developing productive improvement capability in an organisation is like building a pyramid in the desert.

It is not easy and it takes time before there is any visible evidence of success.

The height of the pyramid is a measure of the level of improvement complexity that we can take on.

An improvement of a single step in a system would only require a small pyramid.

Improving the whole system will require a much taller one.


But if we rush and attempt to build a sky-scraper on top of the sand then we will not be surprised when it topples over before we have made very much progress.  The Egyptians knew this!

First, we need to dig down and to lay some foundations.  Stable enough and strong enough to support the whole structure.  We will never see the foundations so it is easy to forget them in our rush but they need to be there and they need to be there first.

It is the same when developing improvement science capability  … the foundations are laid first and when enough of that foundation knowledge is in place we can start to build the next layer of the pyramid: the practitioner layer.


It is the Improvement Science Practitioners (ISPs) who start to generate tangible evidence of progress.  The first success stories help to spur us all on to continue to invest effort, time and money in widening our foundations to be able to build even higher – more layers of capability – until we can realistically take on a system wide improvement challenge.

So sharing the first hard evidence of improvement is an important milestone … it is proof of fitness for purpose … and that news should be shared with those toiling in the hot desert sun and with those watching from the safety of the shade.

So here is a real story of a real improvement pyramid achieving this magical and motivating milestone.


Political Purpose

The question that is foremost in the mind of a designer is “What is the purpose?”   It is a future-focussed question.  It is a question of intent and outcome. It raises the issues of worth and value.

Without a purpose it is impossible to answer the question “Is what we have fit-for-purpose?”

And without a clear purpose it is impossible for a fit-for-purpose design to be created and tested.

In the absence of a future-purpose all that remains are the present-problems.

Without a future-purpose we cannot be proactive; we can only be reactive.

And when we react to problems we generate divergence.  We observe heated discussions. We hear differences of opinion as to the causes and the solutions.  We smell the sadness, anger and fear. We taste the bitterness of cynicism. And we are touched to our core … but we are paralysed.  We cannot act because we cannot decide which is the safest direction to run to get away from the pain of the problems we have.


And when the inevitable catastrophe happens we look for somewhere and someone to place and attribute blame … and high on our target-list are politicians.


So the prickly question of politics comes up and we need to grasp that nettle and examine it with the forensic lens of the system designer and we ask “What is the purpose of a politician?”  What is the output of the political process? What is their intent? What is their worth? How productive are they? Do we get value for money?

They will often answer “Our purpose is to serve the public”.  But serve is a verb so it is a process and not a purpose … “To serve the public for what purpose?” we ask. “What outcome can we expect to get?” we ask. “And when can we expect to get it?”

We want a service (a noun) and as voters and tax-payers we have customer rights to one!

On deeper reflection we see a political spectrum come into focus … with Public at one end and Private at the other.  A country generates wealth through commerce … transforming natural and human resources into goods and services. That is the Private part and it has a clear and countable measure of success: profit.  The Public part is the redistribution of some of that wealth for the benefit of all – the tax-paying public. Us.

Unfortunately the Public part does not have quite the same objective test of success: so we substitute a different countable metric: votes. So the objectively measurable outcome of a successful political process is the most votes.

But we are still talking about process … not purpose.  All we have learned so far is that the politicians who attract the most votes will earn for themselves a temporary mandate to strive to achieve their political purpose. Whatever that is.

So what do the public, the voters, the tax-payers (and remember whenever we buy something we pay tax) … the customers of this political process … actually get for their votes and cash?  Are they delighted, satisfied or disappointed? Are they getting value-for-money? Is the political process fit-for-purpose? And what is the purpose? Are we all clear about that?

And if we look at the current “crisis” in health and social care in England then I doubt that “delight” will feature high on the score-sheet for those who work in healthcare or for those that they serve. The patients. The long-suffering tax-paying public.


Are politicians effective? Are they delivering on their pledge to serve the public? What does the evidence show?  What does their portfolio of public service improvement projects reveal?  Welfare, healthcare, education, police, and so on.

Well the actual evidence is rather disappointing … a long trail of very expensive taxpayer-funded public service improvement failures.

And for an up-to-date list of some of the “eye-wateringly” expensive public sector improvement train-wrecks just read The Whitehall Effect.

But lurid stories of public service improvement failures do not attract precious votes … so they are not aired and shared … and when they are exposed our tax-funded politicians show their true skills and real potential.

Rather than answering the questions they filter, distort and amplify the questions and fire them at each other.  And then fall over each other avoiding the finger-of-blame and at the same time create the next deceptively-plausible election manifesto.  Their food source is votes so they have to tickle the voters to cough them up. And they are consummate masters of that art.

Politicians sell dreams and serve disappointment.


So when the-most-plausible with the most votes earn the right to wield the ignition keys for the engine of our national economy they deflect future blame by seeking the guidance of experts. And the only place they can realistically look is into the private sector who, in manufacturing anyway, have done a much better job of understanding what their customers need and designing their processes to deliver it. On-time, first-time and every-time.

Politicians have learned to be wary of the advice of academics – they need something more pragmatic and proven.  And just look at the remarkable rise of the manufacturing phoenix of Jaguar-Land-Rover (JLR) from the politically embarrassing ashes of the British car industry. And just look at Amazon to see what information technology can deliver!

So the way forward is blindingly obvious … combine manufacturing methods with information technology and build a dumb-robot manned production-line for delivering low-cost public services via a cloud-based website and an outsourced mega-call-centre manned by standard-script-following low-paid operatives.


But here we hit a bit of a snag.

Designing a process to deliver a manufactured product for a profit is not the same as designing a system to deliver a service to the public.  Not by a long chalk.  Public services are an example of what is now known as a complex adaptive system (CAS).

And if we attempt to apply the mechanistic profit-focussed management mantras of “economy of scale” and “division of labour” and “standardisation of work” to the messy real-world of public service then we actually achieve precisely the opposite of what we intended. And the growing evidence is embarrassingly clear.

We all want safer, smoother, better, and more affordable public services … but that is not what we are experiencing.

Our voted-in politicians have unwittingly commissioned complicated non-adaptive systems that ensure we collectively fail.

And we collectively voted the politicians into power and we are collectively failing to hold them to account.

So the ball is squarely in our court.


Below is a short video that illustrates what happens when politicians and civil servants attempt complex system design. It is called the “Save the NHS Game” and it was created by a surgeon who also happens to be a system designer.  The design purpose of the game is to raise awareness. The fundamental design flaw in this example is “financial fragmentation” which is the use of specific budgets for each part of the system together with a generic, enforced, incremental cost-reduction policy (the shrinking budget).  See for yourself what happens …


In health care we are in the improvement business and to do that we start with a diagnosis … not a dream or a decision.

We study before we plan, and we plan before we do.

And we have one eye on the problem and one eye on the intended outcome … a healthier patient.  And we often frame improvement in the negative as ‘we do not want a sicker patient’ … physically or psychologically. Primum non nocere.  First do no harm.

And 99.9% of the time we do our best given the constraints of the system context that the voted-in politicians have created for us; and that their loyal civil servants have imposed on us.


Politicians are not designers … that is not their role.  Their part is to create and sell realistic dreams in return for votes.

Civil servants are not designers … that is not their role.  Their part is to enact the policy that the vote-seeking politicians cook up.

Doctors are not designers … that is not their role.  Their part is to make the best possible clinical decisions that will direct actions that lead, as quickly as possible, to healthier and happier patients.

So who is doing the complex adaptive system design?  Whose role is that?

And here we expose a gap.  No one.  For the simple reason that no one is trained to … so no one is tasked to.

But there is a group of people who are perfectly placed to create the context for developing this system design capability … the commissioners, the executive boards and the senior managers of our public services.

So that is where we might reasonably start … by inviting our leaders to learn about the science of complex adaptive system improvement-by-design.

And there are now quite a few people who can teach this science … they are the ones who have done it and can demonstrate and describe their portfolios of successful and sustained public service improvement projects.

Would you vote for that?

Righteous Indignation

This heading in the newspaper today caught my eye.

Reading the rest of the story triggered a strong emotional response: anger.

My inner chimp was not happy. Not happy at all.

So I took my chimp for a walk and we had a long chat and this is the story that emerged.

The first trigger was the eye-watering fact that the NHS is facing something like a £26 billion litigation cost.  That is about a quarter of the total NHS annual budget!

The second was the fact that the litigation bill has increased by over £3 billion in the last year alone.

The third was that the extra money will just fall into a bottomless pit – the pockets of legal experts – not to where it is intended, to support overworked and demoralised front-line NHS staff. GPs, nurses, AHPs, consultants … the ones that deliver care.

That is why my chimp was so upset.  And it sounded like righteous indignation rather than irrational fear.


So what is the root cause of this massive bill? A more litigious society? Ambulance chasing lawyers trying to make a living? Dishonest people trying to make a quick buck out of a tax-funded system that cannot defend itself?

And what is the plan to reduce this cost?

Well in the article there are three parts to this:
“apologise and learn when you’re wrong,  explain and vigorously defend when we’re right, view court as a last resort.”

This sounds very plausible but to achieve it requires knowing when we are wrong or right.

How do we know?


Generally we all think we are right until we are proved wrong.

It is the way our brains are wired. We are more sure about our ‘rightness’ than the evidence suggests is justified. We are naturally optimistic about our view of ourselves.

So to be proved wrong is emotionally painful and to do it we need:
1) To make a mistake.
2) For that mistake to lead to psychological or physical harm.
3) For the harm to be identified.
4) For the cause of the harm to be traced back to the mistake we made.
5) For the evidence to be used to hold us to account, (to apologise and learn).

And that is all hunky-dory when we are individually inept and we make avoidable mistakes.

But what happens when the harm is the outcome of a combination of actions that individually are harmless but which together are not?  What if the contributory actions are sensible and are enforced as policies that we dutifully follow to the letter?

Who is held to account?  Who needs to apologise? Who needs to learn?  Someone? Anyone? Everyone? No one?

The person who wrote the policy?  The person who commissioned the policy to be written? The person who administers the policy? The person who follows the policy?

How can that happen if the policies are individually harmless but collectively lethal?


The error here is one of a different sort.

It is called an ‘error of omission’.  The harm is caused by what we did not do.  And notice the ‘we’.

What we did not do is to check the impact on others of the policies that we write for ourselves.

Example:

The governance department of a large hospital designs safety policies that if not followed lead to disciplinary action and possible dismissal.  That sounds like a reasonable way to weed out the ‘bad apples’ and the policies are adhered to.

At the same time the operations department designs flow policies (such as maximum waiting time targets and minimum resource utilisation) that if not followed lead to disciplinary action and possible dismissal.  That also sounds like a reasonable way to weed out the layabouts whose idleness causes queues and delays and the policies are adhered to.

And at the same time the finance department designs fiscal policies (such as fixed budgets and cost improvement targets) that if not followed lead to disciplinary action and possible dismissal. Again, that sounds like a reasonable way to weed out money wasters and the policies are adhered to.

What is the combined effect? The multiple safety checks take more time to complete, which puts extra workload on resources and forces up utilisation. As the budget ceiling is lowered the financial and operational pressures build, the system heats up, stress increases, corners are cut, errors slip through the safety checks. More safety checks are added and the already over-worked staff are forced into an impossible position.  Chaos ensues … more mistakes are made … patients are harmed and justifiably seek compensation by litigation.  Everyone loses (except perhaps the lawyers).


So why was my inner chimp really so unhappy?

Because none of this is necessary. This scenario is avoidable.

Reducing the pain of complaints and the cost of litigation requires setting realistic expectations to avoid disappointment and it requires not creating harm in the first place.

That implies creating healthcare systems that are inherently safe, not made not-unsafe by inspection-and-correction.

And it implies measuring and sharing intended and actual outcomes not  just compliance with policies and rates of failure to meet arbitrary and conflicting targets.

So if that is all possible and all that is required then why are we not doing it?

Simple. We never learned how. We never knew it is possible.

Fit-4-Purpose

We all want a healthcare system that is fit for purpose.

One which can deliver diagnosis, treatment and prognosis where it is needed, when it is needed, with empathy and at an affordable cost.

One that achieves intended outcomes without unintended harm – either physical or psychological.

We want safety, delivery, quality and affordability … all at the same time.

And we know that there are always constraints we need to work within.

There are constraints set by the Laws of the Universe – physical constraints.

These are absolute,  eternal and are not negotiable.

Dr Who’s fantastical tardis is fictional. We cannot distort space, or travel in time, or go faster than light – well not with our current knowledge.

There are also constraints set by the Laws of the Land – legal constraints.

Legal constraints are rigid but they are also adjustable.  Laws evolve over time, and they are arbitrary. We design them. We choose them. And we change them when they are no longer fit for purpose.

The third limit is often seen as the financial constraint. We are required to live within our means. There is no eternal font of  limitless funds to draw from.  We all share a planet that has finite natural resources  – and ‘grow’ in one part implies ‘shrink’ in another.  The Laws of the Universe are not negotiable. Mass, momentum and energy are conserved.

The fourth constraint is perceived to be the most difficult yet, paradoxically, is the one that we have most influence over.

It is the cultural constraint.

The collective, continuously evolving, unwritten rules of socially acceptable behaviour.


Improvement requires challenging our unconscious assumptions, our beliefs and our habits – and selectively updating those that are no longer fit-4-purpose.

To learn we first need to expose the gaps in our knowledge and then to fill them.

We need to test our hot rhetoric against cold reality – and when the fog of disillusionment forms we must rip up and rewrite what we have exposed to be old rubbish.

We need to examine our habits with forensic detachment and we need to ‘unlearn’ the ones that are limiting our effectiveness, and replace them with new habits that better leverage our capabilities.

And all of that is tough to do. Life is tough. Living is tough. Learning is tough. Leading is tough. But it is energising too.

Having a model-of-effective-leadership to aspire to and a peer-group for mutual respect and support is a critical piece of the jigsaw.

It is not possible to improve a system alone. No matter how smart we are, how committed we are, or how hard we work.  A system can only be improved by the system itself. It is a collective and a collaborative challenge.


So with all that in mind let us sketch a blueprint for a leader of systemic cultural improvement.

What values, beliefs, attitudes, knowledge, skills and behaviours would be on our ‘must have’ list?

What hard evidence of effectiveness would we ask for? What facts, figures and feedback?

And with our check-list in hand would we feel confident to spot an ‘effective leader of systemic cultural improvement’ if we came across one?


This is a tough design assignment because it requires the benefit of  hindsight to identify the critical-to-success factors: our ‘must have and must do’ and ‘must not have and must not do’ lists.

H’mmmm ….

So let us take a more pragmatic and empirical approach. Let us ask …

“Are there any real examples of significant and sustained healthcare system improvement that are relevant to our specific context?”

And if we can find even just one Black Swan then we can ask …

Q1. What specifically was the significant and sustained improvement?
Q2. How specifically was the improvement achieved?
Q3. When exactly did the process start?
Q4. Who specifically led the system improvement?

And if we do this exercise for the NHS we discover some interesting things.

First let us look for exemplars … and let us start using some official material – the Monitor website (http://www.monitor.gov.uk) for example … and let us pick out ‘Foundation Trusts’ because they are the ones who are entrusted to run their systems with a greater degree of capability and autonomy.

And what we discover is a league table where those FTs that are OK are called ‘green’ and those that are Not OK are coloured ‘red’.  And there are some that are ‘under review’ so we will call them ‘amber’.

The criteria for deciding this RAG rating are embedded in a large balanced scorecard of objective performance metrics linked to a robust legal contract that provides the framework for enforcement.  Safety metrics like standardised mortality ratios, flow metrics like 18-week and 4-hour target yields, quality metrics like the friends-and-family test, and productivity metrics like financial viability.

A quick tally revealed 106 FTs in the green, 10 in the amber and 27 in the red.
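
As a rough sketch, the tally can be turned into proportions with a few lines of Python. The three counts are the ones quoted above; the variable and function names are illustrative.

```python
def rag_shares(counts):
    """Return each rating's share of the total as a percentage, rounded to 1 decimal place."""
    total = sum(counts.values())
    return {rating: round(100 * n / total, 1) for rating, n in counts.items()}


if __name__ == "__main__":
    tally = {"green": 106, "amber": 10, "red": 27}
    print("total FTs:", sum(tally.values()))
    for rating, share in rag_shares(tally).items():
        print(f"{rating}: {share}%")
```

So roughly three quarters of the FTs are rated ‘green’ and a little under a fifth are ‘red’, which tells us how many are judged to be failing but nothing about who has improved.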

But this is not much help with our quest for exemplars because it is not designed to point us to who has improved the most, it only points to who is failing the most!  The league table is a name-and-shame motivation-destroying cultural-missile fuelled by DRATs (delusional ratios and arbitrary targets) and armed with legal teeth.  A projection of the current top-down, Theory-X, burn-the-toast-then-scrape-it management-of-mediocrity paradigm. Oh dear!

However,  despite these drawbacks we could make better use of this data.  We could look at the ‘reds’ and specifically at their styles of cultural leadership and compare with a random sample of all the ‘greens’ and their models for success. We could draw out the differences and correlate with outcomes: red, amber or green.

That could offer us some insight and could give us a head start with our blueprint and check-list.


It would be a time-consuming and expensive piece of work and we do not want to wait that long. So what other avenues are there we can explore now and at no cost?

Well there are unofficial sources of information … the ‘grapevine’ … the stuff that people actually talk about.

What examples of effective improvement leadership in the NHS are people talking about?

Well a little blue bird tweeted one in my ear this week …

And specifically they are talking about a leader who has learned to walk-the-improvement-walk and is now talking-the-improvement-talk: and that is Sir David Dalton, the CEO of Salford Royal.

Here is a copy of the slides from Sir David’s recent lecture at the King’s Fund … and it is interesting to compare and contrast it with the style of NHS Leadership that led up to the Mid Staffordshire Failure, and to the Francis Report, and to the Keogh Report and to the Berwick Report.

Chalk and cheese!


So if you are an NHS employee would you rather work as part of an NHS Trust where the leaders walk-DD’s-walk and talk-DD’s-talk?

And if you are an NHS customer would you prefer that the leaders of your local NHS Trust walked Sir David’s walk too?


We are the system … we get the leaders that we deserve … we make the  choice … so we need to choose wisely … and we need to make our collective voice heard.

Actions speak louder than words.  Walk works better than talk.  We must be the change we want to see.

A Little Law and Order

[Bing bong]. The sound heralded Leslie logging on to the weekly Webex coaching session with Bob, an experienced Improvement Science Practitioner.

<Bob> Good afternoon Leslie.  How has your week been and what topic shall we explore today?

<Leslie> Hi Bob. Well in a nutshell, the bit of the system that I have control over feels like a fragile oasis of calm in a perpetual desert of chaos.  It is hard work keeping the oasis clear of the toxic sand that blows in!

<Bob> A compelling metaphor. I can just picture it.  Maintaining order amidst chaos requires energy. So what would you like to talk about?

<Leslie> Well, I have a small shoal of FISHees who I am guiding through the foundation shallows and they are getting stuck on Little’s Law.  I confess I am not very good at explaining it and that suggests to me that I do not really understand it well enough either.

<Bob> OK. So shall we link those two themes – chaos and Little’s Law?

<Leslie> That sounds like an excellent plan!

<Bob> OK. So let us refresh the foundation knowledge. What is Little’s Law?

<Leslie> It is a fundamental Law of process physics that relates flow, lead time and work in progress.

<Bob> Good. And specifically?

<Leslie> Average lead time is equal to the average flow multiplied by the average work in progress.

<Bob> Yes. And what are the units of flow in your equation?

<Lesley> Ah yes! That is  a trap for the unwary. We need to be clear how we express flow. The usual way is to state it as number of tasks in a defined period of time, such as patients admitted per day.  In Little’s Law the convention is to use the inverse of that which is the average interval between consecutive flow events. This is an unfamiliar way to present flow to most people.
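The 'interval' form of Little's Law that Lesley describes can be sketched in a few lines of Python. This is only an illustration: the function names and the ward figures (20 admissions per day, 60 patients in beds) are hypothetical and do not come from the conversation.

```python
def interval_from_rate(tasks_per_period: float) -> float:
    """Convert a flow rate (tasks per period) into the average interval between tasks."""
    return 1.0 / tasks_per_period


def lead_time(work_in_progress: float, average_interval: float) -> float:
    """Little's Law, interval form: average lead time = average WIP x average interval."""
    return work_in_progress * average_interval


# Hypothetical ward: 20 patients admitted per day, 60 patients in beds on average.
interval_days = interval_from_rate(20)          # average days between admissions
average_stay = lead_time(60, interval_days)     # average length of stay in days
print(f"average interval {interval_days:.2f} days, average lead time {average_stay:.1f} days")
```

Stating flow as an interval keeps all three quantities in units that can be compared directly with the takt time and the cycle time.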

<Bob> Good. And what is the reason that we use the ‘interval between events’ form?

<Leslie> Because it is easier to compare it with two critically important  flow metrics … the takt time and the cycle time.

<Bob> And what is the takt time?

<Leslie> It is the average interval between new tasks arriving … the average demand interval.

<Bob> And the cycle time?

<Leslie> It is the shortest average interval between tasks departing … and is determined by the design of the flow constraint step.

<Bob> Excellent. And what is the essence of a stable flow design?

<Lesley> That the cycle time is less than the takt time.

<Bob> Why less than? Why not equal to?

<Leslie> Because all realistic systems need some flow resilience to exhibit stable and predictable-within-limits behaviour.
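The need for some flow resilience can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real service: a single process step, arrival intervals that vary randomly (exponentially) around the takt time, and a fixed cycle time. The function name and all the numbers are hypothetical.

```python
import random


def mean_wait(takt_time: float, cycle_time: float, n_tasks: int = 2000, seed: int = 1) -> float:
    """Average queueing delay per task for a single-step process.

    Assumptions: arrival intervals are exponentially distributed with a mean
    equal to the takt time, and every task takes exactly one cycle time.
    """
    rng = random.Random(seed)   # fixed seed: the same arrival pattern for every design
    wait = 0.0                  # delay experienced by the current task
    total_wait = 0.0
    for _ in range(n_tasks):
        total_wait += wait
        gap = rng.expovariate(1.0 / takt_time)   # interval until the next arrival
        # The next task waits for whatever backlog is left when it arrives.
        wait = max(0.0, wait + cycle_time - gap)
    return total_wait / n_tasks


for cycle in (8.0, 10.0, 12.0):
    print(f"takt 10, cycle {cycle}: mean wait {mean_wait(10.0, cycle):.1f}")
```

Because all three designs see the same arrival pattern, the average wait can only stay the same or grow as the cycle time increases. With the cycle time above the takt time the backlog never clears and the waits keep growing for as long as the simulation runs.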

<Bob> Excellent. Now describe the design requirements for creating chronically chaotic system behaviour?

<Leslie> This is a bit trickier to explain. The essence is that for chronically chaotic behaviour to happen then there must be two feedback loops – a destabilising loop and a stabilising loop.  The destabilising loop creates the chaos, the stabilising loop ensures it is chronic.

<Bob> Good … so can you give me an example of a destabilising feedback loop?

<Leslie> A common one that I see is when there is a long delay between detecting a safety risk and the diagnosis, decision and corrective action.  The risks are often transitory so if the corrective action arrives long after the root cause has gone away then it can actually destabilise the process and paradoxically increase the risk of harm.

<Bob> Can you give me an example?

<Leslie> Yes. Suppose a safety risk is exposed by a near miss.  A delay in communicating the niggle and a root cause analysis means that the specific combination of factors that led to the near miss has gone. The holes in the Swiss cheese are not static … they move about in the chaos.  So the action that follows the accumulation of many undiagnosed near misses is usually the non-specific mantra of adding yet another safety-check to the already burgeoning check-list. The longer check-list takes more time to do, and is often repeated many times, so the whole flow slows down, queues grow bigger, waiting times get longer and as pressure comes from the delivery targets corners start being cut, and new near misses start to occur; on top of the other ones. So more checks are added and so on.

<Bob> An excellent example! And what is the outcome?

<Leslie> Chronic chaos which is more dangerous, more disordered and more expensive. Lose lose lose.

<Bob> And how do the people feel who work in the system?

<Leslie> Chronically naffed off! Angry. Demotivated. Cynical.

<Bob> And those feelings are the key symptoms.  Niggles are not only symptoms of poor process design, they are also symptoms of a much deeper problem: a violation of values.

<Leslie> I get the first bit about poor design; but what is that second bit about values?

<Bob> We all have a set of values that we learned when we were very young and that have been shaped by life experience.  They are our source of emotional energy, and our guiding lights in an uncertain world. Our internal unconscious check-list.  So when one of our values is violated we know because we feel angry. How that anger is directed varies from person to person … some internalise it and some externalise it.

<Leslie> OK. That explains the commonest emotion that people report when they feel a niggle … frustration which is the same as anger.

<Bob> Yes.  And we reveal our values by uncovering the specific root causes of our niggles.  For example if I value ‘Hard Work’ then I will be niggled by laziness. If you value ‘Experimentation’ then you may be niggled by ‘Rigid Rules’.  If someone else values ‘Safety’ then they may value ‘Rigid Rules’ and be niggled by ‘Innovation’ which they interpret as risky.

<Leslie> Ahhhh! Yes, I see.  This explains why there is so much impassioned discussion when we do a 4N Chart! But if this behaviour is so innate then it must be impossible to resolve!

<Bob> Understanding  how our values motivate us actually helps a lot because we are naturally attracted to others who share the same values – because we have learned that it reduces conflict and stress and improves our chance of survival. We are tribal and tribes share the same values.

<Leslie> Is that why different  departments appear to have different cultures and behaviours and why they fight each other?

<Bob> It is one factor in the Silo Wars that are a characteristic of some large organisations.  But Silo Wars are not inevitable.

<Leslie> So how are they avoided?

<Bob> By everyone knowing what the common purpose of the organisation is and by being clear about what values are aligned with that purpose.

<Leslie> So in the healthcare context one purpose is avoidance of harm … primum non nocere … so ‘safety’ is a core value.  Which implies anything that is felt to be unsafe generates niggles and well-intended but potentially self-destructive negative behaviour.

<Bob> Indeed so, as you described very well.

<Leslie> So how does all this link to Little’s Law?

<Bob> Let us go back to the foundation knowledge. What are the four interdependent dimensions of system improvement?

<Leslie> Safety, Flow, Quality and Productivity.

<Bob> And one measure of  productivity is profit.  So organisations that have only short term profit as their primary goal are at risk of making poor long term safety, flow and quality decisions.

<Leslie> And flow is the key dimension – because profit is just  the difference between two cash flows: income and expenses.

<Bob> Exactly. One way or another it all comes down to flow … and Little’s Law is a fundamental Law of flow physics. So if you want all the other outcomes … without the emotionally painful disorder and chaos … then you cannot avoid learning to use Little’s Law.

<Leslie> Wow!  That is a profound insight.  I will need to lie down in a darkened room and meditate on that!

<Bob> An oasis of calm is the perfect place to pause, rest and reflect.

Firewall

Fires are destructive, indifferent, and they can grow and spread very fast.

The picture is of  the Buncefield explosion and conflagration that occurred on 11th December 2005 near Hemel Hempstead in the UK.  The root cause was a faulty switch that failed to prevent tank number 912 from being overfilled. This resulted in an initial 300 gallon petrol spill which created the perfect conditions for an air-fuel explosion.  The explosion was triggered by a spark and devastated the facility. Over 2000 local residents needed to be evacuated and the massive fuel fire took days to bring under control. The financial cost of the accident has been estimated to run into tens of millions of pounds.

The Great Fire of London in September 1666 led directly to the adoption of new building standards – notably brick and stone instead of wood because they are more effective barriers to fire.

A common design to limit the spread of a fire is called a firewall.

And we use the same principle in computer systems to limit the spread of damage when a computer system goes out of control.


Money is the fuel that keeps the wheels of healthcare systems turning.  And healthcare is an expensive business so every drop of cash-fuel is precious.  Healthcare is also a risky business – from both a professional and a financial perspective. Mistakes can quickly lead to loss of livelihood, expensive recovery plans and huge compensation claims. The social and financial equivalent of a conflagration.

Financial fires spread just like real ones – quickly. So it makes good sense not to have all the cash-fuel in one big pot.  It makes sense to distribute it to smaller pots – in each department – and to distribute the cash-fuel intermittently. These cash-fuel silos are separated by robust financial firewalls and they are called Budgets.

The social sparks that ignite financial fires are called ‘Niggles‘.  They are very numerous but we have effective mechanisms for containing them. The problem happens when multiple sparks happen at the same time and place and together create a small chain reaction. Then we get a complaint. A ‘Not Again‘.  And we are required to spend some of our precious cash-fuel investigating and apologizing.  We do not deal with the root cause, we just scrape the burned toast.

And then one day the chain reaction goes a bit further and we get a ‘Near Miss‘.  That has a different  reporting mechanism so it stimulates a bigger investigation and it usually culminates in some recommendations that involve more expensive checking, documenting and auditing of the checking and documentation.  The root cause, the Niggles, go untreated – because there are too many of them.

But this check-and-correct reaction is also  expensive and we need even more cash-fuel to keep the organizational engine running – but we do not have any more. Our budgets are capped. So we start cutting corners. A bit here and a bit there. And that increases the risk of more Niggles, Not Agains, and Near Misses.

Then the ‘Never Event‘ happens … a Safety and Quality catastrophe that triggers the financial conflagration and toasts the whole organization.


So although our financial firewalls, the Budgets, are partially effective they also have downsides:

1. Paradoxically they can create the perfect condition for a financial conflagration when too small a budget leads to corner-cutting on safety.

2. They lead to ‘off-loading’ which means that too-expensive-to-solve problems are chucked over the financial firewalls into the next department.  The cost is felt downstream of the source – in a different department – and is often much larger. The sparks are blown downwind.

For example: a waiting list management department is under financial pressure and is running short staffed as a recruitment freeze has been imposed. The overburdening of the remaining staff leads to errors in booking patients for operations. The knock-on effect is that patients are cancelled on the day and the allocated operating theatre time is wasted.  The additional cost of wasted theatre time is orders of magnitude greater than the cost-saving achieved in the upstream stage.  The result is a lower quality service, a greater cost to the whole system, and the risk that safety corners will be cut leading to a Near Miss or a Never Event.

The nature of real systems is that small perturbations can be rapidly amplified by a ‘tight’ financial design to create a very large and expensive perturbation called a ‘catastrophe’.  A silo-based financial budget design with a cost-improvement thumbscrew feature increases the likelihood of this universally unwanted outcome.

So if we cannot use one big fuel tank or multiple, smaller, independent fuel tanks then what is the solution?

We want to ensure smooth responsiveness of our healthcare engine, we want healthcare  cash-fuel-efficiency and we want low levels of toxic emissions (i.e. complaints) at the same time. How can we do that?

Fuel-injection.

Electronic Fuel Injection (EFI) designs have now replaced the old-fashioned, inefficient, high-emission carburettor-based engines of the 1970’s and 1980’s.

The safer, more effective and more efficient cash-flow design is to inject the cash-fuel where and when it is needed and in just the right amount.

And to do that we need to have a robust, reliable and rapid feedback system that controls the cash-injectors.

But we do not have such a feedback system in healthcare so that is where we need to start our design work.

Designing an automated cash-injection system requires understanding how the Seven Flows of any  system work together and the two critical flows are Data Flow and Cash Flow.

And that is possible.

Reducing Avoidable Harm

“Primum non nocere” is Latin for “First do no harm”.

It is a warning mantra that has been repeated by doctors for thousands of years and for good reason.

Doctors  can be bad for your health.

I am not referring to the rare case where the doctor deliberately causes harm.  Such people are criminals and deserve to be in prison.

I am referring to the much more frequent situation where the doctor has no intention to cause harm – but harm is the outcome anyway.

Very often the risk of harm is unavoidable. Healthcare is a high risk business. Seriously unwell patients can be very unstable and very unpredictable.  Heroic efforts to do whatever can be done can result in unintended harm and we have to accept those risks. It is the nature of the work.  Much of the judgement in healthcare is balancing benefit with risk on a patient by patient basis. It is not an exact science. It requires wisdom, judgement, training and experience. It feels more like an art than a science.

The focus of this essay is not the above. It is on unintentionally causing avoidable harm.

Or rather unintentionally not preventing avoidable harm which is not quite the same thing.

Safety means prevention of avoidable harm. A safe system is one that does that. There is no evidence of harm to collect. A safe system does not cause harm. Never events never happen.

Safe systems are designed to be safe.  The root causes of harm are deliberately designed out one way or another.  But it is not always easy because to do that we need to understand the cause-and-effect relationships that lead to unintended harm.  Very often we do not.


In 1847 a doctor called Ignaz Semmelweis made a very important discovery. He discovered that if the doctors and medical students washed their hands in disinfectant when they entered the labour ward, then the number of mothers and babies who died from infection was reduced.

And the number dropped a lot.

It fell from an annual average of 10% to less than 2%!  In really bad months the rate was 30%.

The chart below shows the actual data plotted as a time-series chart. The yellow flag in 1848 is just after Semmelweis enforced a standard practice of hand-washing.

[Chart: Vienna maternal mortality, 1785-1848]

Semmelweis did not know the mechanism though. This was not a carefully designed randomised controlled trial (RCT). He was desperate. And he was desperate because this horrendous waste of young lives was only happening on the doctors’ ward.  On the nurses’ ward, which was just across the corridor, the maternal mortality was less than 2%.

The hospital authorities explained it away as ‘bad air’ from outside. That was the prevailing belief at the time. Unavoidable. A risk that had to be just accepted.

Semmelweis could not do a randomized controlled trial because the method was not invented until a century later.

And Semmelweis suspected that the difference between the mortality on the nurses’ and the doctors’ wards was something to do with the Mortuary. Only the doctors performed the post-mortems and the practice of teaching anatomy to medical students using post-mortem dissection was an innovation pioneered in Vienna in 1823 (the first yellow flag on the chart above). But Semmelweis did not have this data in 1847.  He collated it later and did not publish it until 1861.

What Semmelweis demonstrated was that the unintended and avoidable deaths were caused by ignorance of the mechanism of how microorganisms cause disease. We know that now. He did not.

It would be another 20 years before Louis Pasteur demonstrated the mechanism using the famous experiment with the swan neck flask. Pasteur did not discover microorganisms;  he proved that they did not appear spontaneously in decaying matter as was believed. He proved that by killing the bugs by boiling, the broth in the flask  stayed fresh even though it was exposed to the air. That was a big shock but it was a simple and repeatable experiment. He had a mechanism. He was believed. Germ theory was born. A Scottish surgeon called Joseph Lister read of this discovery and surgical antisepsis was born.

Semmelweis suspected that some ‘agent’ may have been unwittingly transported from the dead bodies to the live mothers and babies on the hands of the doctors.  It was a deeply shocking suggestion that the doctors were unwittingly killing their patients.

The other doctors did not take this suggestion well. Not well at all. They went into denial. They discounted the message and they discharged the messenger. Semmelweis never worked in Vienna again. He went back to Hungary and repeated the experiment. It worked.


Even today the message that healthcare practitioners can unwittingly bring avoidable harm to their patients is disturbing. We still seek solace in denial.

Hospital acquired infections (HAI) are a common cause of harm and many are avoidable using simple, cheap and effective measures such as hand-washing.

The harm does not come from what we do. It comes from what we do not do. It happens when we omit to follow the simple safety measures that have been proven to work. Scientifically. Statistically Significantly. Understood and avoidable errors of omission.


So how is this “statistically significant scientific proof” acquired?

By doing experiments. Just like the one Ignaz Semmelweis conducted. But the improvement he showed was so large that it did not need statistical analysis to validate it.  And anyway such analysis tools were not available in 1847. If they had been he might have had more success influencing his peers. And if he had achieved that goal then thousands, if not millions, of deaths from hospital acquired infections may have been prevented.  With the clarity of hindsight we now know this harm was avoidable.

No. The problem we have now is because the improvement that follows a single intervention is not very large. And when the causal mechanisms are multi-factorial we need more than one intervention to achieve the improvement we want. The big reduction in avoidable harm. How do we do that scientifically and safely?


About 20% of hospital acquired infections occur after surgical operations.

We have learned much since 1847 and we have designed much safer surgical systems and processes. Joseph Lister ushered in the era of safe surgery, and much has happened since.

We routinely use carefully designed, ultra-clean operating theatres, sterilized surgical instruments, gloves and gowns, and aseptic techniques – all to reduce bacterial contamination from outside.

But surgical site infections (SSIs) are still commonplace. Studies show that 5% of patients on average will suffer this complication. Some procedures are much higher risk than others, despite the precautions we take.  And many surgeons assume that this risk must just be accepted.

Others have tried to understand the mechanism of SSI and their research shows that the source of the infections is the patients themselves. We all carry a ‘bacterial flora’ and normally that is no problem. Our natural defense – our skin – is enough.  But when that biological barrier is deliberately breached during a surgical operation then we have a problem. The bugs get in and cause mischief. They cause surgical site infections.

So we have done more research to test interventions to prevent this harm. Each intervention has been subject to well-designed, carefully-conducted, statistically-valid and very expensive randomized controlled trials.  And the results are often equivocal. So we repeat the trials – bigger, better controlled trials. But the effects of the individual interventions are small and they easily get lost in the noise. So we pool the results of many RCTs in what is called a ‘meta-analysis’ and the answer from that is very often ‘not proven’ – either way.  So individual surgeons are left to make the judgement call and not surprisingly there is wide variation in practice.  So is this the best that medical science can do?

No. There is another way. What we can do is pool all the learning from all the trials and design a multi-faceted intervention. A bundle of care. And the idea of a bundle is that the separate small effects will add or even synergise to create one big effect.  We are not so much interested in the mechanism as the outcome. Just like Ignaz Semmelweis.

And we can now do something else. We can test our bundle of care using statistically robust tools that do not require an RCT.  They are just as statistically valid as an RCT but a different design.

And the appropriate tool for this is to measure the time interval between the adverse events – and then to plot this continuous metric as a time-series chart.

But we must be disciplined. First we must establish the baseline average interval and then we introduce our bundle and then we just keep measuring the intervals.

If our bundle works then the interval between the adverse events gets longer – and we can easily prove that using our time-series chart. The longer the interval the more ‘proof’ we have.  In fact we can even predict how long we need to observe to prove that ‘no events’ is a statistically significant improvement. That is an elegant and efficient design.
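One common way to set up such a chart is an individuals (XmR) chart of the intervals, where the upper limit is the baseline average plus 2.66 times the average moving range. The sketch below assumes that method; this essay does not say which calculation was actually used, and the baseline intervals are made-up numbers for illustration only.

```python
from statistics import mean


def xmr_limits(intervals):
    """Return (centre line, upper limit) for an XmR chart of the given intervals."""
    centre = mean(intervals)
    # Moving range: absolute difference between consecutive intervals.
    moving_ranges = [abs(b - a) for a, b in zip(intervals, intervals[1:])]
    upper = centre + 2.66 * mean(moving_ranges)   # 2.66 is the standard XmR constant
    return centre, upper


# Hypothetical baseline: days between consecutive adverse events (illustrative only).
baseline_intervals = [5, 20, 9, 30, 3, 17, 12, 25, 6, 13]
centre, upper = xmr_limits(baseline_intervals)

# After the bundle is introduced, an interval beyond the upper limit is a signal.
new_interval = 90
print(f"baseline average {centre:.1f} days, upper limit {upper:.1f} days")
print("signal of improvement" if new_interval > upper else "no signal yet")
```

The upper limit also answers the 'how long do we need to observe' question: once the time since the last event exceeds it, an event-free run is already longer than the baseline process would be expected to produce.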


Here is a real and recent example.

The time-series chart below shows the interval in days between surgical site infections following routine hernia surgery. These are not life threatening complications. They rarely require re-admission or re-operation. But they are disruptive for patients. They cause pain, require treatment with antibiotics, and they delay recovery and return to normal activities. So we would like to avoid them if possible.

[Chart: interval in days between surgical site infections after hernia surgery, before and after the care bundle]

The green and red lines show the baseline period. The  green line says that the average interval between SSIs is 14 days.  The red line says that an interval more than about 60 days would be surprisingly long: valid statistical evidence of an improvement.  The end of the green and red lines indicates when the intervention was made: when the evidence-based designer care bundle was adopted together with the discipline of applying it to every patient. No judgement. No variation.

The chart tells the story. No complicated statistical analysis is required. It shows a statistically significant improvement.  And the SSI rate fell by over 80%. That is a big improvement.

We still do not know how the care bundle works. We do not know which of the seven simultaneous simple and low-cost interventions we chose are the most important or even if they work independently or in synergy.  Knowledge of the mechanism was not our goal.

Our goal was to improve outcomes for our patients – to reduce avoidable harm – and that has been achieved. The evidence is clear.

That is Improvement Science in action.

And to read the full account of this example of the Science of Improvement please go to:

http://www.journalofimprovementscience.org

It is essay number 18.

And avoid another error of omission. If you have read this far please share this message – it is important.

The Battle of the Chimps

Improvement implies change.
Change implies action.
Action implies decision.

So how is the decision made?
With Urgency?
With Understanding?

Bitter experience teaches us that often there is an argument about what to do and when to do it.  An argument between two factions. Both are motivated by a combination of anger and fear. One side is motivated more by anger than fear. They vote for action because of the urgency of the present problem. The other side is motivated more by fear than anger. They vote for inaction because of their fear of future failure.

The outcome is unhappiness for everyone.

If the ‘action’ party wins the vote and a failure results then there is blame and recrimination. If the ‘inaction’ party wins the vote and a failure results then there is blame and recrimination. If either party achieves a success then there is both gloating and resentment. Lose Lose.

The issue is not the decision and how it is achieved. The problem is the battle.

Dr Steve Peters is a psychiatrist with 30 years of clinical experience.  He knows how to help people succeed in life through understanding how the caveman wetware between their ears actually works.

In the run up to the 2012 Olympic games he was the sports psychologist for the multiple-gold-medal winning UK Cycling Team.  The World Champions. And what he taught them is described in his book – “The Chimp Paradox“.

Steve brilliantly boils the current scientific understanding of the complexity of the human mind down into a simple metaphor.

One that is accessible to everyone.

The metaphor goes like this:

There are actually two ‘beings’ inside our heads. The Chimp and the Human. The Chimp is the older, stronger, more emotional and more irrational part of our psyche. The Human is the newer, weaker, logical and rational part.  Also inside there is the Computer. It is just a memory where both the Chimp and the Human store information for reference later. Beliefs, values, experience. Stuff like that. Stuff they use to help them make decisions.

And when some new information arrives through our senses – sight and sound for example – the Chimp gets first dibs and uses the Computer to look up what to do.  Long before the Human has had time to analyse the new information logically and rationally. By the time the Human has even started on solving the problem the Chimp has come to a decision and signaled it to the Human and associated it with a strong emotion. Anger, Fear, Excitement and so on. The Chimp operates on basic drives like survival-of-the-self and survival-of-the-species. So if the Chimp gets spooked or seduced then it takes control – and it is the stronger so it always wins the internal argument.

But the Human is responsible for the actions of the Chimp. As Steve Peters says ‘If your dog bites someone you cannot blame the dog – you are responsible for the dog‘.  So it is with our inner Chimps. Very often we end up apologising for the bad behaviour of our inner Chimp.

Because our inner Chimp is the stronger we cannot ‘control’ it by force. We have to learn how to manage the animal. We need to learn how to soothe it and to nurture it. And we need to learn how to remove the Gremlins that it has programmed into the Computer. Our inner Chimp is not ‘bad’ or ‘mad’ it is just a Chimp and it is an essential part of us.

Real chimpanzees are social, tribal and territorial.  They live in family groups and the strongest male is the boss. And it is now well known that a troop of chimpanzees in the wild can plan and wage battles to acquire territory from neighbouring troops. With casualties on both sides.  And so it is with people when their inner Chimps are in control.

Which is most of the time.

Scenario:
A hospital is failing one of its performance targets – the 18 week referral-to-treatment one – and is being threatened with fines and potential loss of its autonomy. The fear at the top drives the threat downwards. Operational managers are forced into action and do so using strategies that have not worked in the past. But they do not have time to learn how to design and test new ones. They are bullied into Plan-Do mode. The hospital is also required to provide safe care and the Plan-Do knee-jerk triggers fear-of-failure in the minds of the clinicians who then angrily oppose the diktat or quietly sabotage it.

This lose-lose scenario is being played out  in  100’s if not 1000’s of hospitals across the globe as we speak.  The evidence is there for everyone to see.

The inner Chimps are in charge and the outcome is a turf war with casualties on all sides.

So how does The Chimp Paradox help dissolve this seemingly impossible challenge?

First it is necessary to appreciate that both sides are being controlled by their inner Chimps who are reacting from a position of irrational fear and anger. This means that everyone’s behaviour is irrational and their actions likely to be counter-productive.

What is needed is for everyone to be managing their inner Chimps so that the Humans are back in control of the decision making. That way we get wise decisions that lead to effective actions and win-win outcomes. Without chaos and casualties.

To do this we all need to learn how to manage our own inner Chimps … and that is what “The Chimp Paradox” is all about. That is what helped the UK cyclists to become gold medalists.

In the scenario painted above we might observe that the managers are more comfortable in the Pragmatist-Activist (PA) half of the learning cycle. The Plan-Do part of PDSA  – to translate into the language of improvement. The clinicians appear more comfortable in the Reflector-Theorist (RT) half. The Study-Act part of PDSA.  And that difference of preference is fueling the firestorm.

Improvement Science tells us that to achieve and sustain improvement we need all four parts of the learning cycle working  smoothly and in sequence.

So what at first sight looks like it must be a pitched battle which will result in two losers could in reality be a three-legged race that will result in everyone winning. But only if synergy between the PA and the RT halves can be achieved.

And that synergy is achieved by learning to respect, understand and manage our inner Chimps.

Rocket Science

This is a picture of Chris Hadfield. He is an astronaut and to prove it here he is in the ‘cupola’ of the International Space Station (ISS). Through the windows is a spectacular view of the Earth from space.

Our home seen from space.

What is remarkable about this image is that it even exists.

This image is tangible evidence of a successful outcome of a very long path of collaborative effort by 100’s of 1000’s of people who share a common dream.

That if we can learn to overcome the challenge of establishing a permanent manned presence in space then just imagine what else we might achieve?

Chris is unusual for many reasons.  One is that he is Canadian and there are not many Canadian astronauts. He is also the first Canadian astronaut to command the ISS.  Another claim to fame is that when he recently lived in space for 5 months on the ISS, he recorded a version of David Bowie’s classic song – for real – in space. To date this has clocked up 21 million YouTube hits and has helped to bring the inspiring story of space exploration back to the public consciousness.

Especially the next generation of explorers – our children.

Chris has also written a book ‘An Astronaut’s Guide to Life on Earth’ that tells his story. It describes how he was inspired at a young age by seeing the first man to step onto the Moon in 1969.  He overcame seemingly impossible obstacles to become an astronaut, to go into space, and to command the ISS.  The image is tangible evidence.

We all know that space is a VERY dangerous place.  I clearly remember the two space shuttle disasters. There have been many other much less public accidents.  Those tragic events have shocked us all out of complacency and have created a deep sense of humility in those who face up to the task of learning to overcome the enormous technical and cultural barriers.

Getting six people into space safely, staying there long enough to conduct experiments on the long-term effects of weightlessness, and getting them back again safely is a VERY difficult challenge.  And it has been overcome. We have the proof.

Many of the seemingly impossible day-to-day problems that we face seem puny in comparison.

For example: getting every patient into hospital, staying there just long enough to benefit from cutting edge high-technology healthcare, and getting them back home again safely.

And doing it repeatedly and consistently so that the system can be trusted and we are not greeted with tragic stories every time we open a newspaper. Stories that erode our trust in the ability of groups of well-intended people to do anything more constructive than bully, bicker and complain.

So when the exasperated healthcare executive exclaims ‘Getting 95% of emergency admissions into hospital in less than 4 hours is not rocket science!’ – then perhaps a bit more humility is in order. It is rocket science.

Rocket science is Improvement science.

And reading the story of a real-life rocket-scientist might be just the medicine our exasperated executives need.

Because Chris explains exactly how it is done.

And he is credible because he has walked-the-talk so he has earned the right to talk-the-walk.

The least we can do is listen and learn.

Here is Chris answering the question ‘How to achieve an impossible dream?’

Navigating the Nerve Curve

The emotional roller-coaster ride that is associated with change, learning and improvement is called the Nerve Curve.

We are all very familiar with the first stages – of Shock, Denial, Anger, Bargaining, Depression and Despair.  We are less familiar with the stages associated with the long climb out to Resolution, because most improvement initiatives fail for one reason or another.

The critical first step is to “Disprove Impossibility” and this is the first injection of innovation. Someone (the ‘innovator’) discovers that what was believed to be impossible is not. They only have to have one example too. One Black Swan.

The tougher task is to convince those languishing in the ‘Depths of Despair’ that there is hope and that there is a ‘how’. This is not easy because cynicism is toxic to innovation.  So an experienced Improvement Science Practitioner (ISP) bypasses the cynics and engages with the depressed-but-still-healthy-skeptics.

The challenge now is how to get a shed load of them up the hill.

When we first learn to drive we start on the flat, not on hills,  for a very good reason. Safety.

We need to learn to become confident with the controls first. The brake, the accelerator, the clutch and the steering wheel.  This takes practice until it is comfortable, unconscious and almost second nature. We want to achieve a smooth transition from depression to delight, not chaotic kangaroo jumps!

Only when we can do that on the flat do we attempt a hill-start. And the key to a successful hill start is the sequence.  Hand brake on  for safety, out of gear, engine running, pointing at the goal. Then we depress the clutch and select a low gear – we do not want to stall. Speed is not the goal. Safety comes first. Then we rev the engine to give us the power we need to draw on. Then we ease the clutch until the force of the engine has overcome the force of gravity and we feel the car wanting to move forward. And only then do we ease the handbrake off, let the clutch out more and hit the gas to keep the engine revs in the green.

So when we are planning to navigate a group of healthy skeptics up the final climb of the Nerve Curve we need to plan and prepare carefully.

What is least likely to be successful?

Well, if all we have is our own set of wheels,  a cheap and cheerful mini-motor, then it is not going to be a good idea to shackle a trailer to it; fill the trailer with skeptics and attempt a hill start. We will either stall completely or burn out our clutch. We may even be dragged backwards into the Cynic Infested Toxic Swamp.

So what if we hire a bus, load up our skeptical passengers, and have a go.  We may be lucky –  but if we have no practice doing hill starts with a full bus then we could be heading for disappointment; or disaster.

So what is a safer plan?
1) First we need to go up the mountain ourselves to demonstrate it is possible.
2) Then we take one or two of the least skeptical up in our car to show it is safe.
3) We then invite those skeptics with cars to learn how to do safe hill starts.
4) Finally we ask the ex-skeptics to teach the fresh-skeptics how to do it.

Brmmmm Brmmmm. Off we go.

Jiggling

[Dring] Bob’s laptop signaled the arrival of Leslie for their regular ISP remote coaching session.

<Bob> Hi Leslie. Thanks for emailing me with a long list of things to choose from. It looks like you have been having some challenging conversations.

<Leslie> Hi Bob. Yes indeed! The deepening gloom and the last few blog topics seem to be polarising opinion. Some are claiming it is all hopeless and others, perhaps out of desperation, are trying the FISH stuff for themselves and discovering that it works.  The ‘What Ifs’ are engaged in a war of words with the ‘Yes Buts’.

<Bob> I like your metaphor! Where would you like to start on the long list of topics?

<Leslie> That is my problem. I do not know where to start. They all look equally important.

<Bob> So, first we need a way to prioritise the topics to get the horse-before-the-cart.

<Leslie> Sounds like a good plan to me!

<Bob> One of the problems with the traditional improvement approaches is that they seem to start at the most difficult point. They focus on ‘quality’ first – and to be fair that has been the mantra from the gurus like W.E.Deming. ‘Quality Improvement’ is the Holy Grail.

<Leslie> But quality IS important … are you saying they are wrong?

<Bob> Not at all. I am saying that it is not the place to start … it is actually the third step.

<Leslie> So what is the first step?

<Bob> Safety. Eliminating avoidable harm. Primum Non Nocere. The NoNos. The Never Events. The stuff that generates the most fear for everyone. The fear of failure.

<Leslie> You mean having a service that we can trust not to harm us unnecessarily?

<Bob> Yes. It is not a good idea to make an unsafe design more efficient – it will deliver even more cumulative harm!

<Leslie> OK. That makes perfect sense to me. So how do we do that?

<Bob> It does not actually matter.  Well-designed and thoroughly field-tested checklists have been proven to be very effective in the ‘ultra-safe’ industries like aerospace and nuclear.

<Leslie> OK. Something like the WHO Safe Surgery Checklist?

<Bob> Yes, that is a good example – and it is well worth reading Atul Gawande’s book about how that happened – “The Checklist Manifesto”.  Gawande is a surgeon who had published a lot on improvement and even so was quite skeptical that something as simple as a checklist could possibly work in the complex world of surgery. In his book he describes a number of personal ‘Ah Ha!’ moments that illustrate a phenomenon that I call Jiggling.

<Leslie> OK. I have made a note to read Checklist Manifesto and I am curious to learn more about Jiggling – but can we stick to the point? Does quality come after safety?

<Bob> Yes, but not immediately after. As I said, Quality is the third step.

<Leslie> So what is the second one?

<Bob> Flow.

There was a long pause – and just as Bob was about to check that the connection had not been lost – Leslie spoke.

<Leslie> But none of the Improvement Schools teach basic flow science.  They all focus on quality, waste and variation!

<Bob> I know. And attempting to improve quality before improving flow is like papering the walls before doing the plastering.  Quality cannot grow in a chaotic context. The flow must be smooth before that. And the fear of harm must be removed first.

<Leslie> So the ‘Improving Quality through Leadership’ bandwagon that everyone is jumping on will not work?

<Bob> Well that depends on what the ‘Leaders’ are doing. If they are leading the way to learning how to design-for-safety and then design-for-flow then the bandwagon might be a wise choice. If they are only facilitating collaborative agreement and group-think then they may be making an unsafe and ineffective system more efficient which will steer it over the edge into faster decline.

<Leslie> So, if we can stabilize safety using checklists do we focus on flow next?

<Bob> Yup.

<Leslie> OK. That makes a lot of sense to me. So what is Jiggling?

<Bob> This is Jiggling. This conversation.

<Leslie> Ah, I see. I am jiggling my understanding through a series of ‘nudges’ from you.

<Bob> Yes. And when the learning cogs are a bit rusty, some Improvement Science Oil and a bit of Jiggling is more effective and much safer than whacking the caveman wetware with a big emotional hammer.

<Leslie> Well the conversation has certainly jiggled Safety-Flow-Quality-and-Productivity into a sensible order for me. That has helped a lot. I will sort my to-do list into that order and start at the beginning. Let me see. I have a plan for safety, now I can focus on flow. Here is my top flow niggle. How do I design the resource capacity I need to ensure the flow is smooth and the waiting times are short enough to avoid ‘persecution’ by the Target Time Police?

<Bob> An excellent question! I will send you the first ISP Brainteaser that will nudge us towards an answer to that question.

<Leslie> I am ready and waiting to have my brain-teased and my niggles-nudged!
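
As a minimal sketch of what Leslie proposes, sorting a to-do list into Safety-Flow-Quality-Productivity order might look something like this in Python. The topic titles and category labels are invented for illustration; only the ordering itself comes from the conversation above.

```python
# Priority order from the conversation: Safety, then Flow, then Quality, then Productivity.
# The dictionary name and the lower-case labels are assumptions made for this sketch.
SFQP_ORDER = {"safety": 0, "flow": 1, "quality": 2, "productivity": 3}


def sort_by_sfqp(topics):
    """Return (title, category) pairs sorted into SFQP order.

    Python's sort is stable, so topics in the same category
    keep the order they had in the original list.
    """
    return sorted(topics, key=lambda topic: SFQP_ORDER[topic[1]])


# A hypothetical to-do list, in the jumbled order it might arrive in.
todo = [
    ("Reduce cost per case", "productivity"),
    ("Shorten waiting times", "flow"),
    ("Improve patient feedback scores", "quality"),
    ("Eliminate near misses", "safety"),
]

ordered = sort_by_sfqp(todo)
for title, category in ordered:
    print(f"{category:>12}: {title}")
```

The point of the sketch is only that the category, not the perceived urgency, decides where a topic sits in the list.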

The Speed of Trust

Systems are built from intersecting streams of work called processes.

This iconic image of the London Underground shows a system map – a set of intersecting transport streams.

Each stream links a sequence of independent steps – in this case the individual stations.  Each step is a system in itself – it has a set of inner streams.

For a system to exhibit stable and acceptable behaviour the steps must be in synergy – literally ‘together work’. The steps also need to be in synchrony – literally ‘same time’. And to do that they need to be aligned to a common purpose.  In the case of a transport system the design purpose is to get from A to B safely, quickly, in comfort and at an affordable cost.

In large socioeconomic systems called ‘organisations’ the steps represent groups of people with special knowledge and skills that collectively create the desired product or service.  This creates an inevitable need for ‘handoffs’ as partially completed work flows through the system along streams from one step to another. Each step contributes to the output. It is like a series of baton passes in a relay race.

This creates the requirement for a critical design ingredient: trust.

Each step needs to be able to trust the others to do their part:  right-first-time and on-time.  All the steps are directly or indirectly interdependent.  If any one of them is ‘untrustworthy’ then the whole system will suffer to some degree. If too many generate dis-trust then the system may fail and can literally fall apart. Trust is like social glue.

So a critical part of people-system design is the development and the maintenance of trust-bonds.

And it does not happen by accident. It takes active effort. It requires design.

We are social animals. Our default behaviour is to trust. We learn distrust by experiencing repeated disappointments. We are not born cynical – we learn that behaviour.

The default behaviour for inanimate systems is disorder – and it has a fancy name – it is called ‘entropy’. There is a Law of Physics that says that ‘the average entropy of a system will increase over time’. The critical word is ‘average’.

So, if we are not aware of this and we omit to pay attention to the hand-offs between the steps we will observe increasing disorder which leads to repeated disappointments and erosion of trust. Our natural reaction then is ‘self-protect’ which implies ‘check-and-reject’ and ‘check and correct’. This adds complexity and bureaucracy and may prevent further decline – which is good – but it comes at a cost – quite literally.

Eventually an equilibrium will be achieved where our system performance is limited by the amount of check-and-correct bureaucracy we can afford.  This is called a ‘mediocrity trap’ and it is very resilient – which means resistant to change in any direction.


To escape from the mediocrity trap we need to break into the self-reinforcing check-and-reject loop and we do that by developing a design that challenges ‘trust eroding behaviour’.  The strategy is to develop a skill called  ‘smart trust’.

To appreciate what smart trust is we need to view trust as a spectrum: not as a yes/no option.

At one end is ‘nonspecific distrust’ – otherwise known as ‘cynical behaviour’. At the other end is ‘blind trust’ – otherwise known as ‘gullible behaviour’.  Neither of these is what we need.

In the middle is the zone of smart trust that spans healthy scepticism  through to healthy optimism.  What we need is to maintain a balance between the two – not to eliminate them. This is because some people are ‘glass-half-empty’ types and some are ‘glass-half-full’. And both views have a value.

The action required to develop smart trust is to respectfully challenge every part of the organisation to demonstrate ‘trustworthiness’ using evidence.  Rhetoric is not enough. Politicians always score very low on ‘most trusted people’ surveys.

The first phase of this smart trust development is for steps to demonstrate trustworthiness to themselves using their own evidence, and then to share this with the steps immediately upstream and downstream of them.

So what evidence is needed?

Safety comes first. If a step cannot be trusted to be safe then that is the first priority. Safe systems need to be designed to be safe.

Flow comes second. If the streams do not flow smoothly then we experience turbulence and chaos, which increases stress and the risk of harm, and creates disappointment for everyone. Smooth flow is the result of careful flow design.

Third is Quality which means ‘setting and meeting realistic expectations’.  This cannot happen in an unsafe, chaotic system.  Quality builds on Flow which builds on Safety. Quality is a design goal – an output – a purpose.

Fourth is Productivity (or profitability) and that does not automatically follow from the other three as some QI Zealots might have us believe. It is possible to have a safe, smooth, high quality design that is unaffordable.  Productivity needs to be designed too.  An unsafe, chaotic, low quality design is always more expensive.  Always. Safe, smooth and reliable can be highly productive and profitable – if designed to be.

So whatever the driver for improvement the sequence of questions is the same for every step in the system: “How can I demonstrate evidence of trustworthiness for Safety, then Flow, then Quality and then Productivity?”

And when that happens improvement will take off like a rocket. That is the Speed of Trust.  That is Improvement Science in Action.

Find and Fill

Many barriers to improvement are invisible.

This is because they are caused by what is not present rather than what is.  They are gaps or omissions.

Some gaps are blindingly obvious.  This is because we expect to see something there so we notice when it is missing. We would notice if a rope bridge across a chasm were missing because only the end posts would be visible.

Many gaps are not obvious. This is because we have no experience or expectation.  The gap is invisible.  We are blind to the omission.

These are the gaps that we accidentally stumble into. Such as a gap in our knowledge and understanding that we cannot see. These are the gaps that create the fear of failure. And the fear is especially real because the gap is invisible and we only know when it is too late.

It is like walking across an emotional minefield.  At any moment we could step on an ignorance mine and our confidence would be blasted into fragments.

So our natural and reasonable reaction is to stay outside the emotional minefield and inside our comfort zones – where we feel safe.  We give up trying to learn and trying to improve. Every-one hopes that Some-one or Any-one will do it for us.  No-one does.

The path to Improvement is always across an emotional minefield because improvement implies unlearning. So we need a better design than blundering about hoping not to fall into an invisible gap.  We need a safer design.

There are a number of options:

Option 1. Ask someone who knows the way across the minefield and can demonstrate it. Someone who knows where the mines are and knows how to avoid them. Someone to tell us where to step and where not to.

Option 2. Clear a new path and mark it clearly so others can trust that it is safe.  Remove the ignorance mines. Find and Fill the knowledge map.

Option 1 is quicker but it leaves the ignorance mines in place.  So sooner or later someone will step on one. Boom!

We need to be able to do Option 2.

The obvious  strategy for Option 2 is to clear the ignorance mines.  We could do this by deliberately blundering about setting off the mines. We could adopt the burn-and-scrape or learn-from-mistakes approach.

Or we could detect, defuse and remove them.

The former requires people willing to take emotional risks; the latter does not require such a sacrifice.

And “learn-by-mistakes” only works if people are able to make mistakes visibly so everyone can learn. In an adversarial, competitive, distrustful context this cannot happen: and the result is usually for the unwilling troops to be forced into the minefield with the threat of a firing-squad if they do not!

And where a mistake implies irreversible harm it is not acceptable to learn that way. Mistakes are covered up. The ignorance mines are re-set for the next hapless victim to step on. The emotional carnage continues. Any chance of sustained, system-wide improvement is blocked.

So in a low-trust cultural context the detect-defuse-and-remove strategy is the safer option.

And this requires a proactive approach to finding the gaps in understanding; a proactive approach to filling the knowledge holes; and a proactive approach to sharing what was learned.

Or we could ask someone who knows where the ignorance mines are and work our way through finding and filling our knowledge gaps. By that means any of us can build a safe, effective and efficient path to sustainable improvement.

And the person to ask is someone who can demonstrate a portfolio of improvement in practice – an experienced Improvement Science Practitioner.

And we can all learn to become an ISP and then guide others across their own emotional minefields.

All we need to do is take the first step on a well-trodden path to sustained improvement.

Taming the Wicked Bull and the OH Effect

“Take the bull by the horns” is a phrase that is often heard in Improvement circles.

The metaphor implies that the system – the bull – is an unpredictable, aggressive, wicked, wild animal with dangerous sharp horns.

“Unpredictable” and “Dangerous” is certainly what the newspapers tell us the NHS system is – and this generates fear.  Fear-for-our-safety and fear drives us to avoid the bad tempered beast.

It creates fear in the hearts of the very people the NHS is there to serve – the public.  It is not the intended outcome.

“Bullish” is a phrase we use for “aggressive behaviour” and it is disappointing to see those accountable behave in a bullish manner – aggressive, unpredictable and dangerous.

We are taught that bulls are to be avoided and we are told not to wave red flags at them! For our own safety.

But that is exactly what must happen for Improvement to flourish.  We all need regular glimpses of the Red Flag of Reality.  It is called constructive feedback – but it still feels uncomfortable.  Our natural reaction to being shocked out of our complacency is to get angry and to swat the red flag waver.  And the more powerful we are, the sharper our horns are, the more swatting we can do and the more fear we can generate.  Often intentionally.

So inexperienced improvement zealots are prodded into “taking the executive bull by the horns” – but it is poor advice.

Improvement Scientists are not bull-fighters. They are not fearless champions who put themselves at personal risk for personal glory and the entertainment of others.  That is what Rescuers do. The fire-fighters; the quick-fixers; the burned-toast-scrapers; the progress-chasers; and the self-appointed-experts. And they all get gored by an angry bull sooner or later.  Which is what the crowd came to see – Bull Fighter Blood and Guts!

So attempting to slay the wicked bullish system is not a realistic option.

What about taming it?

This is the game of Bucking Bronco.  You attach yourself to the bronco like glue and wear it down as it tries to throw you off and trample you under hoof. You need strength, agility, resilience and persistence. All admirable qualities. Eventually the exhausted beast gives in and does what it is told. It is now tamed. You have broken its spirit.  The stallion is no longer a passionate leader; it is just a passive follower. It has become a Victim.

Improvement requires spirit – lots of it.

Improvement requires the spirit-of-courage to challenge dogma and complacency.
Improvement requires the spirit-of-curiosity to seek out the unknown unknowns.
Improvement requires the spirit-of-bravery to take calculated risks.
Improvement requires the spirit-of-action to make  the changes needed to deliver the improvements.
Improvement requires the spirit-of-generosity to share new knowledge, understanding and wisdom.

So taming the wicked bull is not going to deliver sustained improvement.  It will only achieve stable mediocrity.

So what next?

What about asking someone who has actually done it – actually improved something?

Good idea! Who?

What about someone like Don Berwick – founder of the Institute for Healthcare Improvement in the USA?

Excellent idea! We will ask him to come and diagnose the disease in our system – the one that led to the Mid-Staffordshire septic safety carbuncle, and the nasty quality rash in 14 Trusts that Professor Sir Bruce Keogh KBE uncovered when he lifted the bed sheet.

[Click HERE to see Dr Bruce’s investigation].

We need a second opinion because the disease goes much deeper – and we need it from a credible, affable, independent, experienced expert. Like Dr Don B.

So Dr Don has popped over the pond,  examined the patient, formulated his diagnosis and delivered his prescription.

[Click HERE to read Dr Don’s prescription].

Of course if you ask two experts the same question you get two slightly different answers.  If you ask ten you get ten.  This is because if there was only one answer that everyone agreed on then there would be no problem, no confusion, and no need for experts. The experts know this of course. It is not in their interest to agree completely.

One bit of good news is that the reports are getting shorter.  Mr Robert’s report on the failing of one hospital is huge and has 209 recommendations.  A bit of a bucketful.  Dr Bruce’s report is specific to the Naughty Fourteen who have strayed outside the statistical white lines of acceptable mediocrity.

Dr Don’s is even shorter and it has just 10 recommendations. One for each finger – so easy to remember.

1. The NHS should continually and forever reduce patient harm by embracing wholeheartedly an ethic of learning.

2. All leaders concerned with NHS healthcare – political, regulatory, governance, executive, clinical and advocacy – should place quality of care in general, and patient safety in particular, at the top of their priorities for investment, inquiry, improvement, regular reporting, encouragement and support.

3. Patients and their carers should be present, powerful and involved at all levels of healthcare organisations from wards to the boards of Trusts.

4. Government, Health Education England and NHS England should assure that sufficient staff are available to meet the NHS’s needs now and in the future. Healthcare organisations should ensure that staff are present in appropriate numbers to provide safe care at all times and are well-supported.

5. Mastery of quality and patient safety sciences and practices should be part of initial preparation and lifelong education of all health care professionals, including managers and executives.

6. The NHS should become a learning organisation. Its leaders should create and support the capability for learning, and therefore change, at scale, within the NHS.

7. Transparency should be complete, timely and unequivocal. All data on quality and safety, whether assembled by government, organisations, or professional societies, should be shared in a timely fashion with all parties who want it, including, in accessible form, with the public.

8. All organisations should seek out the patient and carer voice as an essential asset in monitoring the safety and quality of care.

9. Supervisory and regulatory systems should be simple and clear. They should avoid diffusion of responsibility. They should be respectful of the goodwill and sound intention of the vast majority of staff. All incentives should point in the same direction.

10. We support responsive regulation of organisations, with a hierarchy of responses. Recourse to criminal sanctions should be extremely rare, and should function primarily as a deterrent to wilful or reckless neglect or mistreatment.

The meat in the sandwich is recommendations 5 and 6 that together say “Learn Improvement Science”.

And what happens when we commit and engage in that learning journey?

Steve Peak has described what happens in this very blog. It is called the OH effect.

OH stands for “Obvious-in-Hindsight”.

Obvious means “understandable” which implies visible, sensible, rational, doable and teachable.

Hindsight means “reflection” which implies having done something and learning from reality.

So if you would like to have a sip of Dr Don’s medicine and want to get started on the path to helping to create a healthier healthcare system you can do so right now by learning how to FISH – the first step to becoming an Improvement Science Practitioner.

The good news is that this medicine is neither dangerous nor nasty tasting – it is actually fun!

And that means it is OK for everyone – clinicians, managers, patients, carers and politicians.  All of us.

 

Burn-and-Scrape


[Ring Ring]

<Bob> Hi Leslie, how are you today?

<Leslie> I am good thanks Bob and looking forward to today’s session. What is the topic?

<Bob> We will use your Niggle-o-Gram® to choose something. What is top of the list?

<Leslie> Let me see.  We have done “Engagement” and “Productivity” so it looks like “Near-Misses” is next.

<Bob> OK. That is an excellent topic. What is the specific Niggle?

<Leslie> “We feel scared when we have a safety near-miss because we know that there is a catastrophe waiting to happen.”

<Bob> OK so the Purpose is to have a system that we can trust not to generate avoidable harm. Is that OK?

<Leslie> Yes – well put. When I asked myself the purpose question I got a “do” answer rather than a “have” one. The word trust is key too.

<Bob> OK – what is the current safety design used in your organisation?

<Leslie> We have a computer system for reporting near misses – but it does not deliver the purpose above. If the issue is ranked as low harm it is just counted, if medium harm then it may be mentioned in a report, and if serious harm then all hell breaks loose and there is a root cause investigation conducted by a committee that usually results in a new “you must do this extra check” policy.

<Bob> Ah! The Burn-and-Scrape model.

<Leslie> Pardon? What was that? Our Governance Department call it the Swiss Cheese model.

<Bob> Burn-and-Scrape is where we wait for something to go wrong – we burn the toast – and then we attempt to fix it – we scrape the burnt toast to make it look better. It still tastes burnt though and badly burnt toast is not salvageable.

<Leslie> Yes! That is exactly what happens all the time – most issues never get reported – we just “scrape the burnt toast” at all levels.

<Bob> One flaw with the Burn-and-Scrape design is that harm has to happen for the design to work.

It is all reactive.

Another design flaw is that it focuses attention on the serious harm first – avoidable mortality for example.  Counting the extra body bags completely misses the purpose.  Avoidable death means avoidably shortened lifetime.  Avoidable non-fatal harm will also shorten lifetime – and it is even harder to measure.  Just consider the cumulative effect of all that non-fatal life-shortening avoidable-but-ignored harm.

Much of the reason that we live longer today is that we have removed a lot of lifetime shortening hazards – like infectious disease and severe malnutrition.

Take health care as an example – accurately measuring avoidable mortality in an inherently high-risk system is rather difficult.  And to conclude “no action needed” from “no statistically significant difference in mortality between us and the global average” is invalid and it leads to a complacent delusion that what we have is good enough.  When it comes to harm it is never “good enough”.

<Leslie> But we do not have the resources to investigate the thousands of cases of minor harm – we have to concentrate on the biggies.

<Bob> And do the near misses keep happening?

<Leslie> Yes – that is why they are top-ranked on the Niggle-o-Gram®.

<Bob> So the Burn-and-Scrape design is not fit-for-purpose.

<Leslie> So it seems. But what is the alternative? If there was one we would be using it – surely?

<Bob> Look back Leslie. How many of the Improvement Science methods that you have already learned are business-as-usual?

<Leslie> Good point. Almost none.

<Bob> And do they work?

<Leslie> You betcha!

<Bob> This is another example.  It is possible to design systems to be safe – so the frequent near misses become rare events.

<Leslie> Is it?  Wow! That know-how would be really useful to have. Can you teach me?

<Bob> Yes. First we need to explore what the benefits would be.

<Leslie> OK – well first there would be no avoidable serious harm and we could trust in the safety of our system – which is the purpose.

<Bob> Yes …. and?

<Leslie> And … all the effort, time and cost spent “scraping the burnt toast” would be released.

<Bob> Yes …. and?

<Leslie> The safer-by-design processes would be quicker and smoother, a more enjoyable experience for both customers and suppliers, and probably less expensive as well!

<Bob> Yes. So what does that all add up to?

<Leslie> A win-win-win-win outcome!

<Bob> Indeed. So a one-off investment of effort, time and money in learning Safety-by-Design methods would appear to be a wise business decision.

<Leslie> Yes indeed!  When do we start?

<Bob> We have already started.


For a real-world example of this approach delivering a significant and sustained improvement in safety click here.

Do Not Give Up Too Soon

Tangible improvement takes time. Sometimes it takes a long time.

The more fundamental the improvement the more people are affected. The more people involved the greater the psychological inertia. The greater the resistance the longer it takes to show tangible effects.

The advantage of deep-level improvement is that the cumulative benefit is greater – the risk is that the impatient Improvementologist may give up too early – sometimes just before the benefit becomes obvious to all.

The seeds of change need time to germinate and to grow – and not all good ideas will germinate. The green shoots of innovation do not emerge immediately – there is often a long lag and little tangible evidence for a long time.

This inevitable  delay is a source of frustration, and the impatient innovator can unwittingly undo their good work.  By pushing too hard they can drag a failure from the jaws of success.

Q: So how do we avoid this trap?

The trick is to understand the effect of the change on the system.  This means knowing where it falls on our Influence Map that is marked with the Circles of Control, Influence and Concern.

Our Circle of Concern includes all those things that we are aware of that present a threat to our future survival – such as a chunk of high-velocity space rock smashing into the Earth and wiping us all out in a matter of milliseconds. Gulp! Very unlikely but not impossible.

Some concerns are less dramatic – such as global warming – and collectively we may have more influence over changing that. But not individually.

Our Circle of Influence lies between the limit of our individual control and the limit of our collective control. This is a broad scope because “collective” can mean two, twenty, two hundred, two thousand, two million, two billion and so on.

Making significant improvements is usually a Circle of Influence challenge and only collectively can we make a difference.  But to deliver improvement at this level we have to influence others to change their knowledge, understanding, attitudes, beliefs and behaviour. That is not easy and that is not quick. It is possible though – with passion, plausibility, persistence, patience – and an effective process.

It is here that we can become impatient and frustrated and are at risk of giving up too soon – and our temperaments influence the risk. Idealists are impatient for fundamental change. Rationals, Guardians and Artisans do not feel the same pain – and it is a rich source of conflict.

So if we need to see tangible results quickly then we have to focus closer to home. We have to work inside our Circle of Individual Influence and inside our Circle of Control.  The scope of individual influence varies from person to person but our Circle of Control is the same for all of us: the outer limit is our skin.  We all choose our behaviour and it is that which influences others: for better or for worse.  It is not what we think, it is what we do. We cannot read or control each other’s minds. We can all choose our attitudes and our actions.

So if we want to see tangible improvement quickly then we must limit the scope of our action to our Circle of Individual Influence and get started.  We do what we can and as soon as we can.

Choosing what to do and what not to do requires wisdom. That takes time to develop too.


Making an impact outside the limit of our Circle of Individual Influence is more difficult because it requires influencing many other people.

So it is especially rewarding to see examples of how individual passion, persistence and patience have led to profound collective improvement.  It proves that it is still possible. It provides inspiration and encouragement for others.

One example is the recently published Health Foundation Quality, Cost and Flow Report.

This was a three-year experiment to test if the theory, techniques and tools of Improvement Science work in healthcare: specifically in two large UK acute hospitals – Sheffield and Warwick.

The results showed that Improvement Science does indeed work in healthcare and it worked for tough problems that were believed to be very difficult if not impossible to solve. That is very good news for everyone – patients and practitioners.

But the results have taken some time to appear in published form – so it is really good news to report that the green shoots of improvement are now there for all to see.

The case studies provide hard evidence that win-win-win outcomes are possible and achievable in the NHS.

The Impossibility Hypothesis has been disproved. The cynics can step off the bus. The skeptics have their evidence and can now become adopters.

And the report offers a lot of detail on how to do it including two references that are available here:

  1. A Recipe for Improvement PIE
  2. A Study of Productivity Improvement Tactics using a Two-Stream Production System Model

These references both describe the fundamentals of how to align financial improvement with quality and delivery improvement to achieve the elusive win-win-win outcome.

A previously invisible door has opened to reveal a new Land of Opportunity. A land inhabited by Improvementologists who mark the path to learning and applying this new knowledge and understanding.

There are many who do not know what to do to solve the current crisis in healthcare – they now have a new vista to explore.

Do not give up too soon –  there is a light at the end of the dark tunnel.

And to get there safely and quickly we just need to learn and apply the Foundations of Improvement Science in Healthcare – and we first learn to FISH in our own ponds.


Creep-Crack-Crunch

The current crisis of confidence in the NHS has all the hallmarks of a classic system behaviour called creep-crack-crunch.

The first obvious crunch may feel like a sudden shock but it is usually not a complete surprise and it is actually one of a series of cracks that are leading up to a BIG CRUNCH. These cracks are an early warning sign of pressure building up in parts of the system and causing localised failures. These cracks weaken the whole system. The underlying cause is called creep.

[Image: panorama of San Francisco after the earthquake]

Earthquakes are a perfect example of this phenomenon. Geological time scales are measured in thousands of years and we now know that the surface of the earth is a dynamic structure with vast continent-sized plates of solid rock floating on a liquid core of molten magma. Over millions of years the continents have moved huge distances and the world we see today on our satellite images is just a single frame in a multi-billion year geological video.  That is the geological creep bit. The cracks first appear at the edges of these tectonic plates where they smash into each other, grind past each other or are pulled apart from each other.  The geological hot-spots are marked out on our global map by lofty mountain ranges, fissured earthquake zones, and deep mid-ocean trenches. And we know that when a geological crunch arrives it happens in a blink of the geological eye.

The panorama above shows the devastation of San Francisco caused by the 1906 earthquake. San Francisco is built on the San Andreas Fault – the junction between the Pacific plate and the North American plate. The dramatic volcanic eruption in Iceland in 2010 came and went in a matter of weeks but the irreversible disruption it caused for global air traffic will be felt for years. The undersea earthquakes that caused the devastating tsunamis in 2004 and 2011 lasted only a few minutes; the deadly shock waves crossed an ocean in a matter of hours; and when they arrived the silent killer wiped out whole shoreside communities in seconds. Tens of thousands of lives were lost and the social after-shocks of that geological crunch will be felt for decades.

These are natural disasters. We have little or no influence over them. Human-engineered disasters are a different matter – and they are just as deadly.

The NHS is an example. We are all painfully aware of the recent crisis of confidence triggered by the Francis Report. Many could see the cracks appearing and tried to blow their warning whistles but with little effect – they were silenced with legal gagging clauses and the opening cracks were papered over. It was only after the crunch that we finally acknowledged what we already knew and we started to search for the creep. Remorse and revenge do not bring back those who have been lost.  We need to focus on the future and not just point at the past.

[Chart: UK population pyramid, 2013]

Socio-economic systems evolve at a pace that is measured in years. So when a social crunch happens it is necessary to look back several decades for the tell-tale symptoms of creep and the early signs of cracks appearing.

Two objective measures of a socio-economic system are population and expenditure.

Population is people-in-progress; and national expenditure is the flow of the cash required to keep the people-in-progress watered, fed, clothed, housed, healthy and occupied.

The diagram above is called a population pyramid and it shows the distribution by gender and age of the UK population in 2013. The wobbles tell a story. It does rather look like the profile of a bushy-eyebrowed, big-nosed, pointy-chinned old couple standing back-to-back and maybe there is a hidden message for us there?

The “eyebrow” between ages 67 and 62 is the increase in births that happened 62 to 67 years ago: between 1946 and 1951. The post WWII baby boom.  The “nose” of 42-52 year olds are the “children of the 60’s” which was a period of rapid economic growth and new optimism. The “upper lip” at 32-42 correlates with the 1970’s, which was a period of stagnant growth, high inflation, strikes, civil unrest and the dark threat of global thermonuclear war. This “stagflation” is now believed to have been triggered by political meddling in the Middle-East that led to the 1974 OPEC oil crisis and culminated in the “winter of discontent” in 1979.  The “chin” signals there was another population expansion in the 1980s when optimism returned (SALT-II was signed in 1979) and the economy was growing again. Then the “neck” contraction in the 1990’s after the 1987 Black Monday global stock market crash.  Perhaps the new optimism of the Third Millennium led to the “chest” expansion but the financial crisis that followed the bursting of the sub-prime bubble in 2008 has yet to show its impact on the population chart. This static chart only tells part of the story – the animated chart reveals a significant secondary expansion of the 20-30 year old age group over the last decade. This cannot have been caused by births and is evidence of immigration of a large number of young couples – probably from the expanding European Union.

If this “yo-yo” population pattern is repeated then the current economic downturn will be followed by a contraction at the birth end of the spectrum and possibly also net emigration. And that is a big worry because each population wave takes 100 years to propagate through the system. The most economically productive population – the 20-60 year olds – are the ones who pay the care bills for the rest. So having a population curve with lots of wobbles in it causes long term socio-economic instability.

With this big-picture, long-timescale perspective, the evidence of an NHS safety and quality crunch, and the silenced voices warning of cracks being papered over, let us look for the historical evidence of the creep.

Nowadays the data we need is literally at our fingertips – and there is a vast ocean of it to swim around in – and to drown in if we are not careful.  The Office for National Statistics (ONS) is a rich mine of UK socioeconomic data – it is the source of the histogram above.  The trick is to find the nuggets of knowledge in the haystack of facts and then to convert the tables of numbers into something that is a bit more digestible and meaningful. This is what Russ Ackoff describes as the difference between Data and Information. The data-to-information conversion needs context.

Rule #1: Data without context is meaningless – and is at best worthless and at worst dangerous.

With respect to the NHS there is a Minotaur’s Labyrinth of data warehouses – it is fragmented but it is out there – in cyberspace. The Department of Health publishes some on public sites but it is a bit thin on context so it can be difficult to extract the meaning.

Relying on our memories to provide the necessary context is fraught with problems. Memories are subject to a whole range of distortions, deletions, denials and delusions.  The NHS has been in existence since 1948 and there are not many people who can personally remember the whole story with objective clarity.  Fortunately cyberspace again provides some of what we need and with a few minutes of surfing we can discover something like a website that chronicles the history of the NHS in decades from its creation in 1948 – http://www.nhshistory.net/ – created and maintained by one person and a goldmine of valuable context. The decade that is of particular interest is 1998-2007 – Chapter 6

With just some data and some context it is possible to pull together the outline of the bigger picture of the decade that led up to the Mid Staffordshire healthcare quality crunch.

We will look at this as an NHS system evolving over time within its broader UK context. Here is the time-series chart of the population of England – the source of the demand on the NHS.

[Chart: population of England over time]

This shows a significant and steady increase in population – 12% overall between 1984 and 2012.

This aggregate hides a 9% increase in the under 65 population and 29% growth in the over 65 age group.
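The three growth figures above can be cross-checked against each other, because the overall growth is a weighted average of the two age-group growth rates. The minimal Python sketch below works out what share of the starting population the over 65 group would have to be for the quoted percentages to be consistent. The function name is illustrative and the result is only as good as the rounded percentages in the text.

```python
def implied_older_share(total_growth: float, younger_growth: float, older_growth: float) -> float:
    """Share of the starting population in the older group, implied by
    total = (1 - share) * younger + share * older."""
    return (total_growth - younger_growth) / (older_growth - younger_growth)


# Figures quoted in the text: 12% overall, 9% under 65, 29% over 65.
share = implied_older_share(total_growth=0.12, younger_growth=0.09, older_growth=0.29)
print(f"Implied over 65 share of the starting population: {share:.0%}")
```

If the implied share looked implausible (say, above half the population) that would be a signal to re-check the source tables before drawing conclusions.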

This is hard evidence of demographic creep – a ticking health and social care time bomb. And the curve is getting steeper. The pressure is building.

The next bit of the map we need is a measure of the flow through hospitals – the activity – and this data is available as the annual HES (Hospital Episode Statistics) reports.  The full reports are hundreds of pages of fine detail but the headline summaries contain enough for our present purpose.

[Chart: NHS hospital admissions from HES, 1997-2011]

The time-series chart shows a steady increase in hospital admissions. Drilling into the summaries reveals that just over a third are emergency admissions and the rest are planned or maternity.

In the decade from 1998 to 2008 there was a 25% increase in hospital activity. This means more work for someone – but how much more and who for?

But does it imply more NHS beds?

Beds require wards, buildings and infrastructure – but it is the staff that deliver the health care. The bed is just a means of storage.  One measure of capacity and cost is the number of staffed beds available to be filled.  But this is like measuring the number of spaces in a car park – it does not say much about flow – it is just a measure of maximum possible work in progress – the available space to hold the queue of patients who are somewhere between admission and discharge.

Here is the time series chart of the number of NHS beds from 1984 to 2006. There was a big fall in the number of beds in the decade after 1984 [Why was that?]

[Chart: number of NHS beds, 1984-2006]

Between 1997 and 2007 there was about a 10% fall in the number of beds. The NHS patient warehouse was getting smaller.

But the activity – the flow – grew by 25% over the same time period: so the Laws Of Physics say that the flow must have been faster.

The average length of stay must have been falling.
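The "Laws Of Physics" argument here can be read as Little's Law (average time in a system = work in progress / flow rate); naming it that way is an interpretation, not something the text states. The sketch below applies it to the quoted changes, under the loud assumption that bed occupancy stayed roughly constant so that occupied beds scale with available beds. All values are indices relative to the start of the decade (1.0 = no change).

```python
def length_of_stay_index(beds_index: float, flow_index: float) -> float:
    """Little's Law on indices: time in system = work in progress / flow.
    Assumes the occupancy fraction did not change over the period."""
    return beds_index / flow_index


# Quoted in the text: about 10% fewer beds, 25% more activity.
los_index = length_of_stay_index(beds_index=0.90, flow_index=1.25)
print(f"Length-of-stay index: {los_index:.2f} (a fall of about {1 - los_index:.0%})")
```

If occupancy actually rose over the decade, the fall in average length of stay would be smaller than this estimate.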

This insight has another implication – fewer beds must mean smaller hospitals and lower costs – yes?  After all everyone seems to equate beds-to-cost; more-beds-cost-more less-beds-cost-less. It sounds reasonable. But higher flow means more demand and more workload so that would require more staff – and that means higher costs. So which is it? Less, the same or more cost?

[Chart: NHS employees, 1996-2007]

The published data says that staff headcount went up by 25% – which correlates with the increase in activity. That makes sense.

And it looks like it “jumped” up in 2003 so something must have triggered that. More cash pumped into the system perhaps? Was that the effect of the Wanless Report?

But what type of staff? Doctors? Nurses? Admin and Clerical? Managers?  The European Working Time Directive (EWTD) forced junior doctors’ hours down and prompted an expansion of consultants to take on the displaced service work. There was also a gradual move towards specialisation and multi-disciplinary teams. What impact would that have on cost? Higher most likely. The system is getting more complex.

Of course not all costs have the same impact on the system. About 4% of staff are classified as “management” and it is this group that are responsible for strategic and tactical planning. Managers plan the work – workers work the plan.  The cost and efficiency of the management component of the system is not as useful a metric as the effectiveness of its collective decision making. Unfortunately there does not appear to be any published data on management decision making quality and effectiveness. So we cannot estimate cost-effectiveness. Perhaps that is because it is not as easy to measure effectiveness as it is to count admissions, discharges, head counts, costs and deaths. Some things that count cannot easily be counted. The 4% number is also meaningless. The human head represents about 4% of the bodyweight of an adult person – and we all know that it is not the size of our heads that is important, it is the effectiveness of the decisions that it makes which really counts!  Effectiveness, efficiency and costs are not the same thing.

Back to the story. The number of beds went down by 10% and number of staff went up by 25% which means that the staff-per-bed ratio went up by nearly 40%.  Does this mean that each bed has become 25% more productive or 40% more productive or less productive? [What exactly do we mean by “productivity”?]
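The "nearly 40%" figure comes from dividing one index by the other, not from adding the two percentages. A minimal sketch of that arithmetic, using the rounded changes quoted in the text (variable names are illustrative):

```python
# Indices relative to the start of the decade (1.0 = no change).
staff_index = 1.25  # staff headcount up 25%
beds_index = 0.90   # beds down 10%

staff_per_bed_index = staff_index / beds_index
print(f"Staff-per-bed index: {staff_per_bed_index:.3f} "
      f"(up about {staff_per_bed_index - 1:.0%})")
```

Note that this ratio says nothing yet about productivity: it compares two inputs and contains no measure of output or cost.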

To answer that we need to know what the beds produced – the discharges from hospital and not just the total number, we need the “last discharges” that signal the end of an episode of hospital care.

[Chart: NHS last discharges, 1998-2011]

The time-series chart of last-discharges shows the same pattern as the admissions: as we would expect.

This output has two components – patients who leave alive and those who do not.

So what happened to the number of deaths per year over this period of time?

That data is also published annually in the Hospital Episode Statistics (HES) summaries.

This is what it shows ….

[Chart: absolute NHS hospital deaths, 1998-2011]

The absolute hospital mortality is reducing over time – but not steadily. It went up and down between 2000 and 2005 – and has continued on a downward trend since then.

And to put this into context – the UK annual mortality is about 600,000 per year. That means that only about 40% of deaths happen in hospitals. UK annual mortality is falling and births are rising so the population is growing bigger and older.  [My head is now starting to ache trying to juggle all these numbers and pictures in it].

This is not the whole story though – if the absolute hospital activity is going up and the absolute hospital mortality is going down then this raw mortality number may not be giving us the whole picture. To correct for those effects we need the ratio – the Hospital Mortality Ratio (HMR).

[Chart: NHS hospital mortality ratio, 1998-2011]

This is the result of combining these two metrics – a 40% reduction in the hospital mortality ratio.

Does this mean that NHS hospitals are getting safer over time?

This observed behaviour can be caused by hospitals getting safer – it can also be caused by hospitals doing more low-risk work that creates a dilution effect. We would need to dig deeper to find out which. But that will distract us from telling the story.
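The dilution effect is easy to demonstrate with a toy model. In the sketch below the hospital treats two groups of patients with fixed, different risks of death; every number is hypothetical and chosen only for illustration, not taken from HES. Doubling the low-risk workload lowers the overall mortality ratio even though neither group has become any safer.

```python
def mortality_ratio(groups: list[tuple[int, float]]) -> float:
    """Overall deaths per discharge for a case mix given as
    (number_of_discharges, risk_of_death) pairs."""
    deaths = sum(n * risk for n, risk in groups)
    discharges = sum(n for n, _ in groups)
    return deaths / discharges


# Hypothetical case mix: (discharges, risk of death per discharge).
before = [(1000, 0.05), (1000, 0.001)]  # high-risk group, low-risk group
after = [(1000, 0.05), (2000, 0.001)]   # same risks, twice the low-risk work

hmr_before = mortality_ratio(before)
hmr_after = mortality_ratio(after)
print(f"Before: {hmr_before:.2%}  After: {hmr_after:.2%}")
```

Separating a genuine safety gain from dilution would need mortality broken down by risk group, which the headline summaries used here do not provide.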

Back to productivity.

The other part of the productivity equation is cost.

So what about NHS costs?  A bigger, older population, more activity, more staff, and better outcomes will all cost more taxpayer cash, surely! But how much more?  The activity and head count have gone up by 25% so has cost gone up by the same amount?

[Chart: NHS annual spend]

This is the time-series chart of the cost per year of the NHS and, because buying power changes over time, it has been adjusted using the Consumer Price Index with 2009 as the reference year – so the historical cost is roughly comparable with current prices.
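For readers who want to repeat the price adjustment, the usual approach is to scale each year's spend by the ratio of the reference-year index to that year's index. The sketch below shows the calculation; the spend and index values are made-up placeholders, not real NHS or CPI figures, and the function name is illustrative.

```python
def to_reference_prices(nominal_spend: float, cpi_in_year: float, cpi_reference: float) -> float:
    """Convert spend in a given year's prices into reference-year prices."""
    return nominal_spend * cpi_reference / cpi_in_year


# Hypothetical values: 50 units spent in a year when the index stood at 80,
# expressed in the prices of a reference year whose index is 100.
real_spend = to_reference_prices(nominal_spend=50.0, cpi_in_year=80.0, cpi_reference=100.0)
print(f"Spend in reference-year prices: {real_spend:.1f}")
```

Only after this adjustment is it meaningful to compare spend across a decade and describe the change as a real increase.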

The cost has gone up by 100% in one decade!  That is a lot more than 25%.

The published financial data for 2006-2010 shows that the proportion of NHS spending that goes to hospitals is about 50% and this has been relatively stable over that period – so it is reasonable to say that the increase in cash flowing to hospitals has been about 100% too.

So if the cost of hospitals is going up faster than the output then productivity is falling – and in this case it works out as a 37% drop in productivity (25% increase in activity for 100% increase in cost = 37% fall in productivity).
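The productivity figure is again a ratio of indices: output divided by cost. A minimal sketch using the rounded changes quoted in the text, and treating "productivity" simply as activity per unit of inflation-adjusted cost (a working definition assumed here, since the text leaves the term open):

```python
# Indices relative to the start of the decade (1.0 = no change).
activity_index = 1.25  # hospital activity up 25%
cost_index = 2.00      # inflation-adjusted hospital cost up 100%

productivity_index = activity_index / cost_index
print(f"Productivity index: {productivity_index:.3f} "
      f"(a fall of {1 - productivity_index:.1%})")
```

The exact result is 37.5%, which the text rounds to 37%. It does not credit any improvement in outcomes, so it is a measure of volume per pound and not of value per pound.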

So the available data, which anyone with a computer, an internet connection and some curiosity can get, and with a bit of spreadsheet noggin can turn into pictures, shows that over the decade of growth that led up to the Mid Staffs crunch we had:

1. A slightly bigger population; and a
2. significantly older population; and a
3. 25% increase in NHS hospital activity; and a
4. 10% fall in NHS beds; and a
5. 25% increase in NHS staff; which gives a
6. 40% increase in staff-per-bed ratio; and an
7. 8% reduction in absolute hospital mortality; which gives a
8. 40% reduction in relative hospital mortality; and a
9. 100% increase in NHS  hospital cost; which gives a
10. 37% fall in “hospital productivity”.

An experienced Improvement Scientist knows that a system that has been left to evolve by creep-crack-and-crunch can be re-designed to deliver higher quality and higher flow at lower total cost.

The safety creep at Mid-Staffs is now there for all to see. A crack has appeared in our confidence in the NHS – and raises a couple of crunch questions:

Where Has All The Extra Money Gone?

How Will We Avoid The BIG CRUNCH?

The huge increase in NHS funding over the last decade was the recommendation of the Wanless Report but the impact of implementing the recommendations has never been fully explored. Healthcare is a service system that is designed to deliver two intangible products – health and care. So the major cost is staff-time – particularly the clinical staff.  A 25% increase in head count and a 100% increase in cost implies that the heads are getting more expensive.  Either a higher proportion of more expensive clinically trained and registered staff, or more pay for the existing staff or both.  The evidence shows that about 50% of NHS Staff are doctors and nurses and over the last decade there has been a bigger increase in the number of doctors than nurses. Added to that the Agenda for Change programme effectively increased the total wage bill and the new contracts for GPs and Consultants added more upward wage pressure.  This is cost creep and it adds up over time. The King’s Fund looked at the impact in 2006 and suggested that, in that year alone, 72% of the additional money was sucked up by bigger wage bills and other cost-pressures! The previous year they estimated 87% of the “new money” had disappeared the same way. The extra cash is gushing through the cracks in the bottom of the fiscal bucket that had been clumsily papered-over. And these are recurring revenue costs so they add up over time into a future financial crunch.  The biggest one may be yet to come – the generous final-salary pensions that public-sector employees enjoy!
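How much more expensive the "heads" became can be estimated from the same two indices. This is a rough sketch only: it assumes that total cost and headcount can be compared directly, whereas in reality part of the spend goes on drugs, buildings and equipment and not on staff.

```python
# Indices relative to the start of the decade (1.0 = no change).
cost_index = 2.00       # inflation-adjusted cost up 100%
headcount_index = 1.25  # staff headcount up 25%

cost_per_head_index = cost_index / headcount_index
print(f"Cost-per-head index: {cost_per_head_index:.2f} "
      f"(up about {cost_per_head_index - 1:.0%})")
```

On these rounded figures the cost per head rose by roughly 60% in real terms, which is consistent with the text's explanation of a richer skill mix, higher pay, or both.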

So it is even more important that the increasingly expensive clinical staff are not being forced to spend their time doing work that has no direct or indirect benefit to patients.

Trying to do a good job in a poorly designed system is both frustrating and demotivating – and the outcome can be a cynical attitude of “I only work here to pay the bills”. But as public sector wages go up and private sector pensions evaporate the cynics are stuck in a miserable job that they cannot afford to give up. And their negative behaviour poisons the whole pool. That is the long term cumulative cultural and financial cost of poor NHS process design. That is the outcome of not investing earlier in developing an Improvement Science capability.

The good news is that the time-series charts illustrate that the NHS is behaving like any other complex, adaptive, human-engineered value system. This means that the theory, techniques and tools of Improvement Science and value system design can be applied to answer these questions. It means that the root causes of the excessive costs can be diagnosed and selectively removed without compromising safety and quality. It means that the savings can be wisely re-invested to improve the resilience of some parts and to provide capacity in other parts to absorb the expected increases in demand that are coming down the population pipe.

This is Improvement Science. It is a learnable skill.

18/03/2013: Update

The question “Where Has The Money Gone?” has now been asked at the Public Accounts Committee.