December 2013 – HCSE Blog

28/12/2013 | 13/08/2024

The Time Trap

[Hmmmmmm]

The desk amplified the vibration of Bob’s smartphone as it signaled the time for his planned e-mentoring session with Leslie.

<Bob> Hi Leslie, right-on-time, how are you today?

<Leslie> Good thanks Bob. I have a specific topic to explore if that is OK. Can we talk about time traps?

<Bob> OK – do you have a specific reason for choosing that topic?

<Leslie> Yes. The blog last week about ‘Recipe for Chaos’ set me thinking and I remembered that time-traps were mentioned in the FISH course but I confess, at the time, I did not understand them. I still do not.

<Bob> Can you describe how the ‘Recipe for Chaos’ blog triggered this renewed interest in time-traps?

<Leslie> Yes – the question that occurred to me was: ‘Is a time-trap a recipe for chaos?’

<Bob> A very good question! What do you feel the answer is?

<Leslie> I feel that time-traps can and do trigger chaos but I cannot explain how. I feel confused.

<Bob> Your intuition is spot on – so can you localize the source of your confusion?

<Leslie> OK. I will try. I confess I got the answer to the MCQ correct by guessing – and I wrote down the answer when I eventually guessed correctly – but I did not understand it.

<Bob> What did you write down?

<Leslie> “The lead time is independent of the flow”.

<Bob> OK. That is accurate – though I agree it is perhaps a bit abstract. One source of confusion may be that there are different causes of time-traps and there is a lot of overlap with other chaos-creating policies. Do you have a specific example we can use to connect theory with reality?

<Leslie> OK – that might explain my confusion. The example that jumped to mind is the RTT target.

<Bob> RTT?

<Leslie> Oops – sorry – I know I should not use undefined abbreviations. Referral to Treatment Time.

<Bob> OK – can you describe what you have mapped and measured already?

<Leslie> Yes. When I plot the lead-time for patients in date-of-treatment order the process looks stable but the histogram is multi-modal with a big spike just underneath the RTT target of 18 weeks. What you describe as the ‘Horned Gaussian’ – the sign that the performance target is distorting the behaviour of the system and the design of the system is not capable on its own.

<Bob> OK, and have you investigated why there is not just one spike?

<Leslie> Yes – the factor that best explains that is the ‘priority’ of the referral. The ‘urgents’ jump in front of the ‘soons’ and both jump in front of the ‘routines’. The chart has three overlapping spikes.

<Bob> That sounds like a reasonable policy for mixed-priority demand. So what is the problem?

<Leslie> The ‘Routine’ group is the one that clusters just underneath the target. The lead time for routines is almost constant but most of the time those patients sit in one queue or another being leap-frogged by other higher-priority patients. Until they become high-priority – then they do the leap frogging.

<Bob> OK – and what is the condition for a time trap again?

<Leslie> That the lead time is independent of flow.

<Bob> Which implies?

<Leslie> Um. Let me think. That the flow can be varying but the lead time stays the same?

<Bob> Yup. So is the flow of routine referrals varying?

<Leslie> Not over the long term. The chart is stable.

<Bob> What about over the short term? Is demand constant?

<Leslie> No of course not – it varies – but that is expected for all systems. Constant means ‘over-smoothed data’ – the Flaw of Averages trap!

<Bob> OK. And how close is the average lead time for routines to the RTT maximum allowable target?

<Leslie> Ah! I see what you mean. The average is about 17 weeks and the target is 18 weeks.

<Bob> So, what is the flow variation on a week-to-week time scale?

<Leslie> Demand or Activity?

<Bob> Both.

<Leslie> H’mm – give me a minute to re-plot flow as a weekly-aggregated chart. Oh! I see what you mean – the weekly activity and demand are both varying widely and they are not in sync with each other. Work in progress must be wobbling up and down a lot! So how can the lead time variation be so low?

<Bob> What do the flow histograms look like?

<Leslie> Um. Just a second. That is weird! They are both bi-modal with peaks at the extremes and not much in the middle – the exact opposite of what I expected to see! I expected a centered peak.

<Bob> What you are looking at is the characteristic flow fingerprint of a chaotic system – it is called ‘thrashing’.

<Leslie> So, I was right!

<Bob> Yes. And now you know the characteristic pattern to look for. So, what is the policy design flaw here?

<Leslie> The DRAT – the delusional ratio and arbitrary target?

<Bob> That is part of it – that is the external driver policy. The one you cannot change easily. What is the internally driven policy? The reaction to the DRAT?

<Leslie> The policy of leaving routine patients until they are about to breach then re-classifying them as ‘urgent’.

<Bob> Yes! It is called a ‘Prevarication Policy’ and it is surprisingly and uncomfortably common. Ask yourself – do you ever prevaricate? Do you ever put off ‘lower priority’ tasks until later and then not fill the time freed up with ‘higher priority’ tasks?

<Leslie> OMG! I do that all the time! I put low priority and unexciting jobs on a ‘to do later’ heap but I do not sit idle – I do then focus on the high priority ones.

<Bob> High priority for whom?

<Leslie> Ah! I see what you mean. High priority for me. The ones that give me the biggest reward! The fun stuff or the stuff that I get a pat on the back for doing or that I feel good about.

<Bob> And what happens?

<Leslie> The heap of ‘no-fun-for-me-to-do’ jobs gets bigger and I await the ‘reminders’ and then have to rush round in a mad panic to avoid disappointment, criticism and blame. It feels chaotic. I get grumpy. I make more mistakes and I deliver lower-quality work. If I do not get a reminder I assume that the job was not that urgent after all and if I am challenged I claim I am too busy doing the other stuff.

<Bob> And have you avoided disappointment?

<Leslie> Ah! No – that I needed to be reminded meant that I had already disappointed. And not getting a reminder does not prove that I have not disappointed either. Most people blame rather than complain. I have just managed to erode other people’s trust in my reliability. I have disappointed myself. I have achieved exactly the opposite of what I intended. Drat!

<Bob> So, what is the reason that you work this way? There will be a reason. A good reason.

<Leslie> That is a very good question! I will reflect on that because I believe it will help me understand why others behave this way too.

<Bob> OK – I will be interested to hear your conclusion. Let us return to the question. What is the downside of a ‘Prevarication Policy’?

<Leslie> It creates stress, chaos, fire-fighting, last minute changes, increased risk of errors, more work and it erodes quality, confidence and trust.

<Bob> Indeed so – and the impact on productivity?

<Leslie> The activity falls, the system productivity falls, revenue falls, queues increase, waiting times increase and the chaos increases!

<Bob> And?

<Leslie> We treat the symptoms by throwing resources at the problem – waiting list initiatives – and that pushes our costs up. Either way we are heading into a spiral of decline and disappointment. We do not address the root cause.

<Bob> So what is the way out of chaos?

<Leslie> Reduce the volume on the destabilizing feedback loop? Stop the managers meddling!

<Bob> Or?

<Leslie> Eh? I do not understand what you mean. The blog last week said management meddling was the problem.

<Bob> It is a problem. How many feedback loops are there?

<Leslie> Two – that need to be balanced.

<Bob> So, what is another option?

<Leslie> OMG! I see. Turn UP the volume of the stabilizing feedback loop!

<Bob> Yup. And that is a lot easier to do in reality. So, that is your other challenge to reflect on this week. And I am delighted to hear you using the terms ‘stabilizing feedback loop’ and ‘destabilizing feedback loop’.

<Leslie> Thank you. That was a lesson for me after last week – when I used the terms ‘positive and negative feedback’ it was interpreted in the emotional context – positive feedback as encouragement and negative feedback as criticism. So ‘reducing positive feedback’ in that sense is the exact opposite of what I was intending. So I switched my language to using ‘stabilizing and destabilizing’ feedback loops that are much less ambiguous and the confusion and conflict disappeared.

<Bob> That is very useful learning Leslie … I think I need to emphasize that distinction more in the blog. That is one advantage of online media – it can be updated!

<Leslie> Thanks again Bob! And I have the perfect opportunity to test a new no-prevarication-policy design – in part of the system that I have complete control over – me!
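The time-trap that Bob and Leslie uncovered can be sketched in a few lines of Python. This is only an illustrative toy model, not the analysis Leslie ran: the demand range, the weekly capacity of 10, the 17-week trigger and the function name `simulate` are all assumptions made for the example. It compares a ‘prevaricate’ policy (leave routine patients until they are about to breach, then treat them all) with a simple first-come-first-served policy that has a fixed weekly capacity.

```python
import random
from statistics import pstdev

TRIGGER_WEEKS = 17  # assumed: routines are expedited one week before the 18-week target


def simulate(policy, weeks=200, capacity=10, seed=1):
    """Return (lead_times, weekly_activity, weekly_wip) for one policy."""
    rng = random.Random(seed)
    queue = []  # arrival week of each waiting patient, oldest first
    lead_times, activity, wip = [], [], []
    for week in range(weeks):
        # Hypothetical demand: varies from week to week around the capacity.
        queue.extend([week] * rng.randint(capacity - 5, capacity + 5))
        if policy == "prevaricate":
            # Treat only the patients about to breach, however many there are.
            n = sum(1 for arrived in queue if week - arrived >= TRIGGER_WEEKS)
        else:
            # "fifo": treat the longest waiters, up to a fixed weekly capacity.
            n = min(capacity, len(queue))
        lead_times.extend(week - arrived for arrived in queue[:n])
        del queue[:n]
        activity.append(n)
        wip.append(len(queue))
    return lead_times, activity, wip


if __name__ == "__main__":
    for policy in ("prevaricate", "fifo"):
        lead, act, wip = simulate(policy)
        print(policy,
              "| lead-time spread:", round(pstdev(lead), 2),
              "| activity spread:", round(pstdev(act), 2),
              "| WIP range:", (min(wip), max(wip)))
```

Under the ‘prevaricate’ policy every patient in this toy model waits exactly 17 weeks, whatever the demand and activity are doing: the lead time is independent of the flow. The sketch does not attempt to reproduce the bi-modal ‘thrashing’ histograms; it only shows how a near-constant lead time can sit on top of widely varying flow and work in progress.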

21/12/2013 | 06/01/2024

The Recipe for Chaos

There are only four ingredients required to create Chaos.

The first is Time.

All processes and systems are time-dependent.

The second ingredient is a Metric of Interest (MoI).

That means a system performance metric that is important to all – such as a Safety or Quality or Cost; and usually all three.

The third ingredient is a feedback loop of a specific type – it is called a Negative Feedback Loop. The NFL is one that tends to adjust, correct and stabilise the behaviour of the system.

Negative feedback loops are very useful – but they have a drawback. They resist change and they reduce agility. The name is also a disadvantage – the term ‘negative feedback’ is often associated with criticism.

The fourth and final ingredient in our Recipe for Chaos is also a feedback loop but one of a different design – a Positive Feedback Loop (PFL) – one that amplifies variation and change.

Positive feedback loops are also very useful – they are required for agility – quick reactions to unexpected events. Fast reflexes.

The downside of a positive feedback loop is that it increases instability.

The name is also confusing – ‘positive feedback’ is associated with encouragement and praise.

So, in this context it is better to use the terms ‘stabilizing feedback’ and ‘destabilizing feedback’ loops.

When we mix these four ingredients in just the right amounts we get a system that may behave chaotically. That is surprising and counter-intuitive. But it is how the Universe works.

For example:

Suppose our Metric of Interest is the amount of time that patients spend in an Accident and Emergency Department. We know that the longer this time is the less happy they are and the higher the risk of avoidable harm – so it is a reasonable goal to reduce it.

Longer-than-necessary waiting times have many root causes – it is a non-specific metric. That means there are many things that could be done to reduce waiting time and the most effective actions will vary from case-to-case, day-to-day and even minute-to-minute. There is no one-size-fits-all solution.

This implies that those best placed to correct the causes of these delays are the people who know the specific system well – because they work in it. Those who actually deliver urgent care. They are the stabilizing ingredient in our Recipe for Chaos.

The destabilizing ingredient is the hit-the-arbitrary-target policy which drives a performance management feedback loop.

This policy typically involves:
(1) Setting a performance target that is desirable but impossible for the current design to achieve reliably;
(2) inspecting how close to the target we are; then
(3) using the real-time data to justify threats of dire consequences for failure.

Now we have a perfect Recipe for Chaos.

The higher the failure rate the more inspections, reports, meetings, exhortations, threats, interruptions, and interventions that are generated. Fear-fuelled management meddling. This behaviour consumes valuable time – so leaves less time to do the worthwhile work. Less time to devote to safety, flow, and quality. The queues build and the pressure increases and the system becomes hyper-sensitive to small fluctuations. Delays multiply and errors are more likely and spawn more workload, more delays and more errors. Tempers become frayed and molehills are magnified into mountains. Irritations become arguments. And all of this makes the problem worse rather than better. Less stable. More variable. More chaotic. More dangerous. More expensive.

It is actually possible to write a simple equation that captures this complex dynamic behaviour characteristic of real systems. And that was a very surprising finding when it was discovered in 1976 by a mathematician called Robert May.

This equation is called the logistic equation.

Here is the abstract of his seminal paper.

Nature 261, 459-467 (10 June 1976)

Simple mathematical models with very complicated dynamics

First-order difference equations arise in many contexts in the biological, economic and social sciences. Such equations, even though simple and deterministic, can exhibit a surprising array of dynamical behaviour, from stable points, to a bifurcating hierarchy of stable cycles, to apparently random fluctuations. There are consequently many fascinating problems, some concerned with delicate mathematical aspects of the fine structure of the trajectories, and some concerned with the practical implications and applications. This is an interpretive review of them.

The fact that this chaotic behaviour is completely predictable and does not need any ‘random’ element was a big surprise. Chaotic is not the same as random. The observed chaos in the urgent healthcare system is the result of the design of the system – or more specifically the current healthcare system management policies.
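The logistic equation is usually written as a one-line difference equation: x(n+1) = r · x(n) · (1 − x(n)). The short Python sketch below iterates it for three values of the parameter r. The function names and the particular values of r (2.8, 3.2 and 3.9) are chosen only for illustration, and reading r as the ‘gain’ of a destabilizing feedback loop is an analogy rather than a model of any real healthcare system.

```python
def logistic_trajectory(r, x0=0.2, steps=200):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def distinct_tail(r, keep=8, digits=4):
    """Distinct (rounded) values among the last `keep` points of a long run."""
    tail = logistic_trajectory(r, steps=2000)[-keep:]
    return sorted({round(x, digits) for x in tail})


if __name__ == "__main__":
    for r in (2.8, 3.2, 3.9):
        print("r =", r, "->", distinct_tail(r))

    # No random numbers are used anywhere, yet at r = 3.9 a tiny change
    # in the starting value produces a very different trajectory.
    a = logistic_trajectory(3.9, x0=0.2)
    b = logistic_trajectory(3.9, x0=0.2 + 1e-9)
    print("largest gap:", max(abs(p - q) for p, q in zip(a, b)))
```

At r = 2.8 the trajectory settles on a single value, at r = 3.2 it alternates between two values, and at r = 3.9 it never settles. The rule is the same in all three cases; only the one parameter changes.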

This has a number of profound implications – the most important of which is this:

If the chaos we observe in our health care systems is the predictable and inevitable result of the management policies we ourselves have created and adopted – then eliminating the chaos will only require us to re-design these policies.

In fact we only need to tweak one of the ingredients of the Recipe for Chaos – such as to reduce the strength of the destabilizing feedback loop. The gain. The volume control on the variation amplifier!

This is called the MM factor – otherwise known as ‘Management Meddling’.

We need to keep all four ingredients though – because we need our system to have both agility and stability. It is the balance of ingredients that is critical.

The flaw is not the Managers themselves – it is their learned behaviour – the Meddling. This is learned so it can be unlearned. We need to keep the Managers but “tweak” their role slightly. As they unlearn their old habits they move from being ‘Policy-Enforcers and Fire-Fighters’ to becoming ‘Policy-Engineers and Chaos-Calmers’. They focus on learning to understand the root causes of variation that come from outside the circle of influence of the non-Managers. They learn how to rationally and radically redesign system policies to achieve both agility and stability.

And doing that requires developing systemic-thinking and learning Improvement Science skills – because the causes of chaos are counter-intuitive. If it were intuitively-obvious we would have discovered the nature of chaos thousands of years ago. The fact that it was not discovered until 1976 demonstrates this fact.

It is our homo sapiens intuition that got us into this mess! The inherent flaws of the chimp-ware between our ears. Our current management policies are intuitively-obvious, collectively-agreed, rubber-stamped and wrong! They are part of the Recipe for Chaos.

And when we learn to re-design our system policies and upload the new system software then the chaos evaporates as if a magic wand had been waved.

And that comes as a really BIG surprise!

What also comes as a big surprise is just how small the counter-intuitive policy design tweaks often are.

Safe, smooth, efficient, effective, and productive flow is restored. Calm confidence reigns. Safety, Flow, Quality and Productivity all increase – at the same time. The emotional storm clouds dissipate and the prosperity sun shines again.

Everyone feels better. Everyone. Patients, managers, and non-managers.

This is Win-Win-Win improvement by design. Improvement Science.

14/12/2013 | 12/03/2023

Unknown-Knowns

If we were exploring the corridors in an unfamiliar building and our way forward was blocked by a door that looked like this … we would suspect that something of value lay beyond.

We know there is an unknown.

The puzzle we have to solve to release the chain tells us this. This is called an “affordance” – the design of the lock tells us what we need to do.

More often what we need to know to move forward is unknown to us, and the problems we face afford us no clues as to how to solve them. Worse than that – the clues they do offer are misleading. Our intuition is tricked. We do the ‘intuitively obvious’ thing and the problem gets worse.

It is easy to lose confidence, become despondent, and even to start to believe there is no solution. We begin to believe that the problem is impossible for us to solve.

Then one day someone shows us how to solve an “impossible” problem. And with the benefit of our new perspective the solution looks simple, and how it works is now obvious. But only in retrospect.

Our unknown was known all along. But not by us. We were ignorant. We were agnostic.

And our intuitions are sometimes flaky, forgetful and fickle. They are not to be blindly trusted. And our egos are fragile too – we do not like to feel flaky, forgetful and fickle. So, we lie to ourselves and we confuse obvious-in-hindsight with obvious-in-foresight.

They are not the same.

Suppose we now want to demonstrate our new understanding to someone else – to help them solve their “impossible” problem. How do we do that?

Do we say “But it is obvious – if you cannot see it you must be blind or stupid!”

How can we say that when it was not obvious to us only a short time ago? Is our ego getting in the way again? Can our intuition or ego be trusted at all?

To help others gain insight and to help them deepen their understanding we must put ourselves back into the shoes we used to be in: and we need to look at the problem again from their perspective. With the benefit of the three views of the problem: our old one, their current one and our new one we may be able to then see where the Unknown-Known is for them – because it might be different.

Only then can we help them discover it for themselves; and then they can help others discover their Unknown-Knowns. That is how knowledge and understanding spreads.

Understanding is the bridge between Knowledge and Wisdom.

And it is a wonderful thing to see someone move from conflict, through confusion to clarity by asking them just the right question, at just the right time, in just the right way. For them.

Socrates, the Greek philosopher and teacher, knew how to do this a long time ago – which is why it is called the Socratic Method.

07/12/2013

Software First

A healthcare system has two inter-dependent parts. Let us call them the ‘hardware’ and the ‘software’ – terms we are more familiar with when referring to computer systems.

In a computer the critical-to-success software is called the ‘operating system’ – and we know that by the brand labels such as Windows, Linux, MacOS, or Android. There are many.

It is the O/S that makes the hardware fit-for-purpose. Without the O/S the computer is just a box of hot chips. A rather expensive room heater.

All the programs and apps that we use to deliver our particular information service require the O/S to manage the actual hardware. Without a coordinator there would be chaos.

In a healthcare system the ‘hardware’ is the buildings, the equipment, and the people. They are all necessary – but they are not sufficient on their own.

The ‘operating system’ in a healthcare system is the set of management policies: the ‘instructions’ that guide the ‘hardware’ to do what is required, when it is required and sometimes how it is required. These policies are created by managers – they are the healthcare operating system design engineers so-to-speak.

Change the O/S and you change the behaviour of the whole system – it may look exactly the same – but it will deliver a different performance. For better or for worse.

In the 1950’s the transistor (invented in 1947) led to the first commercially viable computers. They were faster, smaller, more reliable, cheaper to buy and cheaper to maintain than their predecessors. They were also programmable. And with many separate customer programs demanding hardware resources – an effective and efficient operating system was needed. So the understanding of “good” O/S design developed quickly.

In the 1960’s the first integrated circuits appeared and the computer world became dominated by mainframe computers. They filled air-conditioned rooms with gleaming cabinets tended lovingly by white-coated technicians carrying clipboards. Mainframes were, and still are, very expensive to build and to run! The valuable resource that was purchased by the customers was ‘CPU time’. So the operating systems of these machines were designed to squeeze every microsecond of value out of the expensive-to-maintain CPU: for very good commercial reasons. Delivering the “data processing jobs” right, on-time and every-time was paramount.

The design of the operating system software was critical to the performance and to the profit. So a lot of brain power was invested in learning how to schedule jobs; how to orchestrate the parts of the hardware system so that they worked in harmony; how to manage data buffers to smooth out flow and priority variation; how to design efficient algorithms for number crunching, sorting and searching; and how to switch from one task to the next quickly and without wasting time or making errors.

Every modern digital computer has inherited this legacy of learning.

In the 1970’s the first commercial microprocessors appeared – which reduced the size and cost of computers by orders of magnitude again – and increased their speed and reliability even further. Silicon Valley blossomed and although the first micro-chips were rather feeble in comparison with their mainframe equivalents they ushered in the modern era of the desktop-sized personal computer.

In the 1980’s players such as Microsoft and Apple appeared to exploit this vast new market. The key difference between them was that Microsoft offered just the operating system (called MS-DOS) for the new IBM-PC hardware; while Apple created both the hardware and software as a tightly integrated system – the Apple I.

The ergonomic-seamless-design philosophy at Apple led to the Apple Mac which revolutionised personal computing. It made computers usable by people who had no interest in the innards or in programming. The Apple Macs were the “designer” computers and were reassuringly more expensive. The innovations that Apple designed into the Mac are now expected in all personal computers as well as the latest generations of smartphones and tablets.

Today we carry more computing power in our top pocket than a mainframe of the 1970’s could deliver! The design of the operating system has hardly changed though.

It was the O/S design that leveraged the maximum potential of the very expensive hardware. And that is still the case – but we take it completely for granted.

Exactly the same principle applies to our healthcare systems.

The only difference is that the flow is not 1’s and 0’s – it is patients and all the things needed to deliver patient care. The ‘hardware’ is the expensive part to assemble and run – and the largest cost is the people. Healthcare is a service delivered by people to people. Highly-trained nurses, doctors and allied healthcare professionals are expensive.

So the key to healthcare system performance is high quality management policy design – the healthcare operating system (HOS).

And here we hit a snag.

Our healthcare management policies have not been designed using the same rigor as the operating systems for our computers. They have not been designed using the well-understood principles of flow physics. The various parts of our healthcare system do not work well together. The flows are fractured. The silos work independently. And the ubiquitous symptom of this dysfunction is confusion, chaos and conflict. The managers and the doctors are at each other’s throats. And this is because the management policies have evolved through a largely ineffective and very inefficient strategy called “burn-and-scrape”. Firefighting.

The root cause of the poor design is that neither healthcare managers nor the healthcare workers are trained in operational policy design. Design for Safety. Design for Quality. Design for Delivery. Design for Productivity.

And we are all left with a lose-lose-lose legacy: a system that is no longer fit-for-purpose and a generation of managers and clinicians who have never learned how to design the operational and clinical policies that ensure the system actually delivers what the ‘hardware’ is capable of delivering.

For example:

Suppose we have a simple healthcare system with three stages called A, B and C. All the patients flow through A, then to B and then to C. Let us assume these three parts are managed separately as departments with separate budgets and that they are free to use whatever policies they choose so long as they achieve their performance targets – which are (a) to do all the work and (b) to stay in budget and (c) to deliver on time. So far so good.

Now suppose that the work that arrives at Department B from Department A is not all the same and different tasks require different pathways and different resources. A Radiology, Pathology or Pharmacy Department for example.

Sorting the work into separate streams and having expensive special-purpose resources sitting idle waiting for work to arrive is inefficient and expensive. It will push up the unit cost – the total cost divided by the total activity. This is called ‘carve-out’.

Switching resources from one pathway to another takes time and that change-over time implies some resources are not able to do the work for a while. These inefficiencies will contribute to the total cost and therefore push up the “unit-cost”. The total cost for the department divided by the total activity for the department.

So Department B decides to improve its “unit cost” by deploying a policy called ‘batching’. It starts to sort the incoming work into different types of task and when a big enough batch has accumulated it then initiates the change-over. The cost of the change-over is shared by the whole batch. The “unit cost” falls because Department B is now able to deliver the same activity with fewer resources because they spend less time doing the change-overs. That is good. Isn’t it?

But what is the impact on Departments A and C and what effect does it have on delivery times and work in progress and the cost of storing the queues?

Department A notices that it can no longer pass work to B when it wants because B will only start the work when it has a full batch of requests. The queue of waiting work sits inside Department A. That queue takes up space and that space costs money but the queue cost is incurred by Department A – not Department B.

What Department C sees is the order of the work changed by Department B to create a bigger variation in lead times for consecutive tasks. So if the whole system is required to achieve a delivery time specification – then Department C has to expedite the longest waiters and delay the shortest waiters – and that takes work, time, space and money. That cost is incurred by Department C not by Department B.

The unit costs for Department B go down – and those for A and C both go up. The system is less productive as a whole. The queues and delays caused by the policy change mean that work cannot be completed reliably on time. The blame for the failure falls on Department C. Conflict between the parts of the system is inevitable. Lose-Lose-Lose.
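The effect of the batching policy can be illustrated with a small Python sketch. It is a deliberately simplified toy model, not data from a real department: the two task types, the work time of 0.5, the change-over time of 0.4, the batch size of 4 and the function name `run_department_b` are all hypothetical. The waiting that the text places in Department A’s queue appears here simply as part of each task’s lead time.

```python
from statistics import mean, pstdev


def run_department_b(batch_size, n_tasks=40, work=0.5, changeover=0.4):
    """Tasks of type X and Y arrive alternately, one per time unit.

    Department B starts a type only when `batch_size` tasks of that type
    are waiting. Returns (busy_time, lead_times).
    """
    waiting = {"X": [], "Y": []}
    clock = 0.0     # time at which B's resource is next free
    current = None  # task type B is currently set up for
    busy = 0.0      # resource time used by B (work plus change-overs)
    lead_times = []
    for i in range(n_tasks):
        arrival = float(i)
        kind = "XY"[i % 2]
        waiting[kind].append(arrival)
        if len(waiting[kind]) == batch_size:
            clock = max(clock, arrival)  # cannot start before the batch is full
            if current != kind:
                clock += changeover
                busy += changeover
                current = kind
            for arrived in waiting[kind]:
                clock += work
                busy += work
                lead_times.append(clock - arrived)
            waiting[kind].clear()
    return busy, lead_times


if __name__ == "__main__":
    for size in (1, 4):
        busy, lead = run_department_b(size)
        print("batch size", size,
              "| B busy time per task:", round(busy / len(lead), 2),
              "| mean lead time:", round(mean(lead), 2),
              "| lead-time spread:", round(pstdev(lead), 2))
```

With these assumed numbers, batching cuts the resource time that Department B spends per task because it does fewer change-overs, while the average lead time and the variation in lead times both go up. Department B’s ‘unit cost’ improves and the rest of the system pays for it.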

And conflict is always expensive – on all dimensions – emotional, temporal and financial.

The policy design flaw here looks like it is ‘batching’ – but that policy is just a reaction to a deeper design flaw. It is a symptom. The deeper flaw is not even the use of ‘unit costing’. That is a useful enough tool. The deeper flaw is the incorrect assumption that improving the unit costs of the stages independently will always improve whole-system productivity.

This is incorrect. This error is the result of ‘linear thinking’.

The Laws of Flow Physics do not work like this. Real systems are non-linear.

To design the management policies for a non-linear system using linear-thinking is guaranteed to fail. Disappointment and conflict are inevitable. And that is what we have. As system designers we need to use ‘systems-thinking’.

This discovery comes as a bit of a shock to management accountants. They feel rather challenged by the assertion that some of their cherished “cost improvement policies” are actually making the system less productive. Precisely the opposite of what they are trying to achieve.

And it is the senior management that decide the system-wide financial policies so that is where the linear-thinking needs to be challenged and the ‘software patch’ applied first.

It is not a major management software re-write. Just a minor tweak is all that is required.

And the numbers speak for themselves. It is not a difficult experiment to do.

So that is where we need to start.

We need to learn Healthcare Operating System design and we need to learn it at all levels in healthcare organisations.

And that system-thinking skill has another name – it is called Improvement Science.

The good news is that it is a lot easier to learn than most people believe.

And that is a big shock too – because how to do this has been known for 50 years.

So if you would like to see a real and current example of how poor policy design leads to falling productivity and then how to re-design the policies to reverse this effect have a look at Journal Of Improvement Science 2013:8;1-20.

And if you would like to learn how to design healthcare operating policies that deliver higher productivity with the same resources then the first step is FISH.