A pitch for sparse models – Part 5 of a 10-part series on how complexity can produce better insight on what programs do, and why

Common Introduction to all sections

This is part 5 of 10 blog posts I’m writing to convey the information that I present in various workshops and lectures that I deliver about complexity. I’m an evaluator so I think in terms of evaluation, but I’m convinced that what I’m saying is equally applicable for planning.

I wrote each post to stand on its own, but I designed the collection to provide a wide-ranging view of how research and theory in the domain of “complexity” can contribute to the ability of evaluators to show stakeholders what their programs are producing, and why. I’m going to try to produce a YouTube video on each section. When (if?) I do, I’ll edit the post to include the YouTube URL.

Part Title Approximate post date
1 Complex systems or complex behavior? up
2 Complexity has awkward implications for program designers and evaluators up
3 Ignoring complexity can make sense up
4 Complex behavior can be evaluated using comfortable, familiar methodologies up
5 A pitch for sparse models up
6 Joint optimization of unrelated outcomes 7/8
7 Why should evaluators care about emergence? 7/16
8 Why might it be useful to think of programs and their outcomes in terms of attractors? 7/19
9 A few very successful programs, or many, connected, somewhat successful programs? 7/24
10 Evaluating for complexity when programs are not designed that way 7/31

Models

I’ll start with my take on the subject of “models”. I do not think of models exclusively in terms of traditional evaluation logic models, or in terms of the box-and-arrow graphics that we use to depict program theory. Rather, I think in terms of how “models” function in the process of scientific inquiry. Table 1 summarizes how I engage models when I do evaluation. [Some writings that influenced my thinking about this topic: 1) Evaluation as technology, not science (Morell), 2) Models in Science (Frigg and Hartmann), 3) The Model Thinker: What You Need to Know to Make Data Work for You (Page), 4) Timelines as evaluation logic models (Morell).]

Table 1: How Jonny Thinks About Models

Simplification: A model is a simplification of reality that deliberately omits some aspects of a phenomenon’s functioning in order to highlight others. Simplification is required because, without it, no methodology could cover all relevant factors.
Ubiquity: Because evaluation is an analytical exercise, there is always a need for some kind of model. That model may be implicit or explicit, detailed or sparse, composed of qualitative or quantitative concepts, and designed to drive any number of qualitative or quantitative ways of understanding a program. Models also vary in their half-lives. Some remain relatively constant over an entire evaluation; some change with each new piece of data or each new analysis. But there will always be more going on than any analysis can manage, and hence always a need to decide what to strip out in order to discern relationships among the elements that remain.
Ignorance: No matter how smart we are, we will never know all the relevant factors. We cannot have a complete model no matter how hard we try.
Choice: Models can be cast in different forms and at different levels of detail. The appropriate form is the one that works best for a particular inquiry.
Multiple forms: There is no reason to restrict an inquiry to only one model, or one form of model. In fact, there are many good reasons to use multiple models.
Wrong but useful: George Box was right: “All models are wrong, but some are useful.” (Go here for a dated but public version, or here for the journal version.)
Outcome focus: I use models to guide decisions about what methodology to employ, what data to collect, and how to interpret the data. I tend not to use models to explain a program; if I did, I would include more detail than I could handle in an evaluation exercise. Nor do I use models for program advocacy; if I did, I would use less detail.

A common view of models in evaluation

Considering the above, what should evaluation models look like? That question is unanswerable, but I do have a strong opinion as to what a model should not look like. It should not look like almost all the models I have ever seen. It should not look like Figure 1. I know that no model used by evaluators looks exactly like this, but almost all the models I have ever seen share a similar core logic. Qualitatively, they are all the same. I do not like these models.

Figure 1: A common, way over-specified model

One reason I do not like these models is that they do not recognize complex behavior. Here are some examples of complex behaviors that these kinds of models miss.

  • Even a single feedback loop can result in non-linear behavior (see the sketch after this list).
  • Small perturbations in any part of the model may result in a major change in the model’s trajectory.
  • The model as a whole, or regions of it, may combine to generate effects that are not attributable to any single element in the model.
  • Models like the one in Figure 1 are drawn as networks, but they are not treated as networks that can exhibit network behavior.
  • The model asserts that intermediate outcomes can be identified, as can paths through those outcomes. It is entirely possible that the precise path cannot be predicted, but that the long-term outcome can.
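
To make the first bullet concrete, here is a minimal sketch of how a single feedback loop yields non-linear behavior. It is my illustration, not anything from the original post: the logistic map is a standard toy model in which each period’s outcome feeds back into the next period’s growth, and the parameter values are arbitrary.

```python
# Minimal sketch: one feedback loop producing non-linear behavior.
# The logistic map is a standard toy model, used here for illustration only.

def logistic_map(r, x0, steps):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

# The same feedback rule yields qualitatively different behavior as the
# feedback strength r changes: a steady state, oscillation, or chaos.
for r in (2.8, 3.2, 3.9):
    trajectory = logistic_map(r, x0=0.4, steps=50)
    print(f"r={r}: last five values {[round(x, 3) for x in trajectory[-5:]]}")
```

At r = 2.8 the trajectory settles to a fixed point, at 3.2 it oscillates between two values, and at 3.9 it never settles. The relationship between the feedback strength and the long-run behavior is anything but proportional.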

Another reason I do not like these models is that they are not modest. Read on.

Recognizing ignorance

Give all the specific detail in Figure 1 a good look. Give it the sniff test. Is it plausible that we know enough about how the program works to specify it at that level of detail? I suppose it’s possible, but I bet not.

Figure 2: Models with successive degrees of ignorance

As an aside, I also think that if models like this are used, they should include information that they always lack. Here are two examples. 1) Are all those arrows equally important? 2) Do the connections represent “and” or “or” relationships? It makes a difference, because too many “and” requirements almost certainly portend that the program will fail. These are my favorites from a long list I developed for an analysis of implicit assumptions. If you want them all, go to: Revealing Implicit Assumptions: Why, Where, and How?
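
To see why stacked “and” requirements are so corrosive, consider a back-of-the-envelope calculation. The 90% figure and the independence assumption are mine, purely for illustration:

```python
# Illustrative arithmetic: with "and" logic, every precondition must hold
# at once, so even reliable links multiply into likely failure.

p_single = 0.9  # assumed chance that any one precondition holds (illustrative)

for n in (2, 5, 10):
    p_all = p_single ** n  # assumes the preconditions are independent
    print(f"{n} 'and' preconditions -> {p_all:.0%} chance that all hold")

# Output: 2 -> 81%, 5 -> 59%, 10 -> 35%
```

Even when every single link is 90% reliable, a chain of ten “and” links succeeds barely a third of the time.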

My preference is to use models along the lines of those in Figure 2. From top to bottom, they capture a greater sense of what we do not know because we have not done enough research, or what we cannot know because of the workings of complex behaviors.

Blue model: The story in this model is that there are outcomes that matter, but whose precise relationships cannot be identified. (See the ovals in the “later” column.) The best we can do is think of these outcomes in groups, such that if something happens in one group, something will happen in the subsequent group. We cannot specify relationships among single outcomes within each group, or between specific outcomes across groups. It is also possible that for each replication of the program, the 1:1 relationships within and across groups may differ. Or there may be no 1:1 relationships at all; rather, emergent behavior in one group affects the other. Put more simply, the best we can say is that “if stuff happens here, stuff will happen there”.

Green model: The story in the middle acknowledges an even greater degree of ignorance. The intermediate outcomes are still there, but the model acknowledges that much else unrelated to the program might be affecting the long-range outcome. Still, that long-range outcome can be isolated and identified. This may seem odd, but I believe it is quite possible. (See Part 8: Why might it be useful to think of programs and their outcomes in terms of attractors?)

Yellow model: The story at the bottom acknowledges more ignorance still. There, not only are the intermediate outcomes tangled with other activity, but the long-range outcome is as well.

I have no a priori preference for any of these models. The choice would depend on how much we know about the program, what the outcomes were, how much uncertainty we could tolerate, what data were available, what methodologies were available, the actual time frames for “later” and “much later”, and the needs of the stakeholders. What matters, though, is that thinking of models in this way acknowledges the effects of complex behavior on program outcomes, and recognizes how little we know about the details of why a program does what it does. I also do not claim that these models are the only ones possible. They are, as they say, for illustrative purposes only. Evaluators can and should be creative in fashioning models that serve the needs of their customers.

Locally right but not globally right

Models can have the odd characteristic of being everywhere locally correct but not globally correct. I tried to illustrate this with the green rectangle in Figure 3. Imagine moving that rectangle over the model. The relationships shown within the rectangle may well behave as the model depicts them, but as the size of the rectangle grows to overlap with the entire model, the fit between model and reality may fade. Several aspects of complex behavior explain why this is so.

Figure 3: Models can be everywhere correct locally but wrong globally

  • Multiple interacting elements may exhibit global behavior that cannot be explained in terms of the sum of their parts. This is the phenomenon of emergence. (See Part 7: Why should evaluators care about emergence?)
  • The model is a network, and networks can adapt and change as communication runs along their edges.
  • Because of sensitive dependence, small changes in any part of a system can result in long-term change as the system evolves. The direction of that evolution cannot be predicted. To know it, the system must run and its behavior be observed. (See the sketch after this list.)
  • All those feedback loops can result in non-linear change.
  • Collections of entities and relationships like this can result in phase shifts, a phenomenon in which the characteristics of a system change almost instantaneously.
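
The sensitive-dependence bullet is easy to demonstrate. Here is a minimal sketch (my illustration, not part of the original post) that runs two copies of the same simple system from starting points that differ by one part in a million; the logistic map again stands in for “a system”, and the numbers are arbitrary.

```python
# Minimal sketch of sensitive dependence: two near-identical starting
# points, the same rule, and rapidly diverging trajectories.

def step(x, r=3.9):
    return r * x * (1 - x)

a, b = 0.400000, 0.400001  # initial conditions differing by one millionth
for t in range(1, 41):
    a, b = step(a), step(b)
    if t % 10 == 0:
        print(f"t={t}: |a - b| = {abs(a - b):.6f}")

# The gap grows by orders of magnitude within a few dozen steps. To know
# where either trajectory ends up, you must run the system and watch.
```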

Summary of common themes

There are two common themes that run through everything I have said in this post.

  • The models limit detail, either by removing specific element-to-element relationships, or by limiting the number and range of elements under investigation.
  • They portray scenarios in which complex behavior is affecting program outcomes.

These two themes are related. One of the reasons we should use sparse models is that complex behavior makes it inappropriate to specify too much detail.

Complex systems or complex behavior? Part 1 of a 10-part series on how complexity can produce better insight on what programs do, and why

Common Introduction to all sections

This is part 1 of 10 blog posts I’m writing to convey the information that I present in various workshops and lectures that I deliver about complexity. I’m an evaluator so I think in terms of evaluation, but I’m convinced that what I’m saying is equally applicable for planning.

I wrote each post to stand on its own, but I designed the collection to provide a wide-ranging view of how research and theory in the domain of “complexity” can contribute to the ability of evaluators to show stakeholders what their programs are producing, and why. I’m going to try to produce a YouTube video on each section. When (if?) I do, I’ll edit the post to include the YouTube URL.

Part Title Approximate post date
1 Complex systems or complex behavior? up
2 Complexity has awkward implications for program designers and evaluators 6/14
3 Ignoring complexity can make sense 6/21
4 Complex behavior can be evaluated using comfortable, familiar methodologies 6/28
5 A pitch for sparse models 7/1
6 Joint optimization of unrelated outcomes 7/8
7 Why should evaluators care about emergence? 7/16
8 Why might it be useful to think of programs and their outcomes in terms of attractors? 7/19
9 A few very successful programs, or many, connected, somewhat successful programs? 7/24
10 Evaluating for complexity when programs are not designed that way 7/31

Complex systems or complex behavior?

There are two reasons why I am uncomfortable talking about complex systems. One is that I have never been able to find an unambiguous definition that everyone (or at least most people) can agree on, and which also captures the range of topics that I think are useful in evaluation. The second is that even if I knew what a complex system was, I would have no idea what to do with it when designing or conducting an evaluation.

What I do find useful is a focus on what complex systems do, on how they behave. Those behaviors are something I can work with. To telegraph an example I’ll use in Part 7 (Why should evaluators care about emergence?): when there is emergent behavior, a whole cannot be understood in terms of its parts. Were I to suspect such behavior, my program models would be less granular, my methodology would address different constructs, and my data interpretation would ignore fine-level detail.

Table 1: Cross reference, complexity themes and complex behaviors that are useful in evaluation

Themes in complexity science (table columns): Pattern, Predictability, How change happens

Complex behaviors that may be useful in evaluation (table rows):
  • Attractors
  • Emergence
  • Sensitive dependence
  • Unpredictable outcome chains
  • Network effects among outcomes
  • Joint optimization of uncorrelated outcomes

Not all complex behaviors are useful in evaluation, but some are. Appreciating the application of complexity in evaluation also means attending to themes that cut across the writings in fields such as biology, meteorology, physics, mathematics, economics, and many others. For doing evaluation, I find it useful to think in terms of three themes: 1) pattern, 2) predictability, and 3) how change happens. When I do evaluation, I try to think about how invoking complex behaviors can help me understand a program in terms of those three themes. Table 1 shows the cross-references. In any given evaluation some cells will have content, and some will be empty.

Complexity has awkward implications for program designers and evaluators – Part 2 of a 10-part series on how complexity can produce better insight on what programs do, and why

Common Introduction to all sections

This is part 2 of 10 blog posts I’m writing to convey the information that I present in various workshops and lectures that I deliver about complexity. I’m an evaluator so I think in terms of evaluation, but I’m convinced that what I’m saying is equally applicable for planning.

I wrote each post to stand on its own, but I designed the collection to provide a wide-ranging view of how research and theory in the domain of “complexity” can contribute to the ability of evaluators to show stakeholders what their programs are producing, and why. I’m going to try to produce a YouTube video on each section. When (if?) I do, I’ll edit the post to include the YouTube URL.

Part Title Approximate post date
1 Complex systems or complex behavior? up
2 Complexity has awkward implications for program designers and evaluators up
3 Ignoring complexity can make sense 6/21
4 Complex behavior can be evaluated using comfortable, familiar methodologies 6/28
5 A pitch for sparse models 7/1
6 Joint optimization of unrelated outcomes 7/8
7 Why should evaluators care about emergence? 7/16
8 Why might it be useful to think of programs and their outcomes in terms of attractors? 7/19
9 A few very successful programs, or many, connected, somewhat successful programs? 7/24
10 Evaluating for complexity when programs are not designed that way 7/31


Complexity has awkward implications for evaluators and stakeholders

Complex behavior is problematic because it has implications for program outcomes that either do not conform to common sense, or challenge accepted processes of program design, or both. Table 1 shows some examples. Table 2 outlines the complex behaviors that explain these outcome patterns.

Table 1: Examples of complexity-driven program behaviors that have awkward implications for program designers

(Each entry pairs a program behavior with its implications for common sense and/or for design and funding logic.)

1) Benefits are highly skewed toward a small number of service recipients.

Implications: This pattern may not be an aberration or a fault of the program design. Rather, it may be fundamental to the program and the conditions in which it is operating. Despite this inevitability, both politics and ideology favor a reasonably balanced distribution of program effects. It’s not pleasant to contemplate that the benefits of an innovation, no matter how valuable that innovation may be, will be distributed in a highly unequal manner.

2) Program effects defy understanding in terms of the outcome chains identified in the program theory.

Implications: Psychology first: as humans, we have a natural desire to take things apart and see how the pieces fit back together. It’s unsettling to think that looking at the pieces will not help us understand what we have taken apart. Then there are political, economic, and social realities: we live in a world where incremental efforts are needed when a long-term objective is pursued. That reality makes it difficult to admit that we can’t explain how incremental change adds up.

3) A program can be relied upon to produce long-term outcomes, but a chain of intermediate outcomes cannot be identified in advance.

Implications: The difficulty here is like the one above. It does not make sense that we can reliably predict where a program will end up but cannot identify the intermediate steps. And selling that assertion to funders can be no small effort.

4) Achieving program goals induces dysfunctional change in related programs.

Implications: Our funding mechanisms are “stovepiped”, and thus optimized to achieve a single outcome, or at least a set of highly correlated outcomes. It is disconcerting to contemplate the possibility that success within those stovepipes will, of necessity, breed undesirable change in other activities that we care about.

Table 2: Complex behaviors that explain the program performance

1) Benefits are highly skewed toward a small number of service recipients.

Complex behavior = preferential attachment

Preferential attachment refers to a process by which one “entity” connects with another based on “size”.  One frequently cited example is the Internet, with its set of larger hubs connecting to smaller ones. Snowflakes also follow this pattern, as does wealth.

What do these seemingly different constructs have in common? They can all be thought of as random processes. If you are going to link to a URL, is there a greater chance that you will know about larger or smaller URL possibilities? If you are an ice particle, are you more or less likely to find and bind to a larger or smaller collection of particles? If you are a business opportunity, are you more or less likely to seek larger or smaller centers of partnering potential? The direction in these examples can be turned around. If you are a large URL, is there a higher or lower chance that you will attract potential connections? If you are a snowflake, is there a higher or lower chance that you will bump into ice particles? If you are known to possess resources, is there a higher or lower probability that you will attract partners? Other characteristics of these patterns are that they are fractal and that the sizes of their connections are power-law distributed. (A small simulation sketch appears after this table.)

2) Program effects defy understanding in terms of the outcomes identified in the program theory.

Complex behavior = emergence

Emergence is a phenomenon in which the functioning of an entire unit cannot be explained in terms of how its individual parts interact. Contrast an automobile engine with a beehive, a traffic jam, or an economy. I could identify each part of the engine, explain its construction, discuss how an internal combustion engine works, and describe what role that part plays in the operation of the engine. The whole engine may be greater than the sum of its parts, but the unique role of each part remains. No such part-by-part contribution exists for beehives, traffic jams, or economies. With these, it may be possible to identify the rules of interaction that have to be in place for emergence to manifest, but it would still be impossible to identify the unique contribution of each part.

3) A program can be relied upon to produce long-term outcomes, but a chain of intermediate outcomes cannot be identified in advance.

Complex behaviors = sensitive dependence and attractors

“Sensitive dependence” refers to a phenomenon in which small perturbations can result in a radical change in the trajectory of a system. An “attractor” is a set of conditions that constrains the states in which a system can exist. It is well within the bounds of possibility that sensitive dependence precludes identifying how an entity will move within its attractor, but that the attractor will still constrain the conditions in which the system can find itself.

4) Achieving program goals induces dysfunctional change in related programs.

Complex behaviors = evolution / adaptation within an ecosystem

One can think of programs as organisms attempting to maximize their viability on a fitness landscape. In that sense a program can be thought of as competing with the other programs with which it shares an environment. In situations like this, anything that changes the allocation of resources among the organisms will result in adaptations to the new realities. Put in language that resonates more with planners and evaluators: there will be unexpected (and probably undesirable) consequences of implementing an effort to maximize only one outcome. As an example, consider a suite of health care services: AIDS, prenatal care, women’s health, tertiary care, and so on. What would happen to the other elements of that suite if money, the most interesting jobs, planners’ intellectual effort, and networks of informal relationships all flowed into AIDS efforts?
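
As promised above, here is a small simulation sketch of preferential attachment, the “rich get richer” process described in row 1 of Table 2. It is my illustration, not code from the post; the network size and random seed are arbitrary.

```python
# Minimal sketch of preferential attachment: each new node links to an
# existing node with probability proportional to that node's degree.

import random
from collections import Counter

random.seed(1)                 # arbitrary seed, for reproducibility
degrees = {0: 1, 1: 1}         # start with two nodes joined by one edge
endpoints = [0, 1]             # every edge lists both endpoints; sampling
                               # from this list is degree-proportional

for new_node in range(2, 5000):
    target = random.choice(endpoints)   # "rich get richer" choice
    degrees[new_node] = 1
    degrees[target] += 1
    endpoints += [new_node, target]

print("Top 5 hubs (node, degree):", Counter(degrees).most_common(5))
print("Median degree:", sorted(degrees.values())[len(degrees) // 2])
# A handful of hubs accumulate most of the links while the typical node
# keeps one or two: the skew described for URLs, snowflakes, and wealth.
```

Nothing in the rule singles out any node in advance; the extreme skew is a product of the attachment process itself, which is exactly why skewed benefits may be fundamental to a program rather than a design fault.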

From the point of view of doing evaluation, none of these complex behaviors, or their consequences for programs, are difficult to address. (See Part 4: Complex behavior can be evaluated using comfortable, familiar methodologies.) Getting program designers on board, however, is a different matter. One of my intentions in this blog series is to convince evaluators that program designers are acting rationally when they ignore complexity, but that productive dialogue about the complex can still be had. (See Part 3: Ignoring complexity can make sense.) And in any case, it is possible and worthwhile to evaluate based on complexity even when programs are not designed that way. (See Part 10: Evaluating for complexity when programs are not designed that way.)