Common Introduction to all 6 Posts
History and Context
These blog posts are an extension of my efforts to convince evaluators to shift their focus from complex systems to specific behaviors of complex systems. We need to make this switch because there is no practical way to apply the notion of a “complex system” to decisions about program models, metrics, or methodology. But we can make practical decisions about models, metrics, and methodology if we attend to the things that complex systems do. My current favorite list of complex system behaviors that evaluators should attend to is:
| Complexity behavior | Posting date |
| --- | --- |
| Emergence | up |
| Power law distributions | up |
| Network effects and fractals | Sept. 28 |
| Unpredictable outcome chains | Oct. 5 |
| Consequence of small changes | Oct. 12 |
| Joint optimization of uncorrelated outcomes | Oct. 19 |
For a history of my activity on this subject see: PowerPoint presentations 1, 2, and 3; fifteen-minute AEA “Coffee Break” videos 4, 5, and 6; and a long comprehensive video, 7.
Since I began thinking about complexity and evaluation in this way, I have been uncomfortable with the idea of just having a list of seemingly unconnected items. I have also been unhappy because presentations and lectures are not good vehicles for developing lines of reasoning. I wrote these posts to address both dissatisfactions.
From my reading in complexity I have identified four themes that seem relevant for evaluation.
- Pattern
- Predictability
- How change happens
- Adaptive and evolutionary behavior
Others may pick out different themes, but these are the ones that work for me. Boundaries among these themes are not clean, and connections among them abound. But treating them separately works well enough for me, at least for right now.
Figure 1 is a visual depiction of my approach to this subject.
Figure 1: Complex Behaviors and Complexity Themes
- The black rectangles on the left depict a scenario that pairs a well-defined program with a well-defined evaluation, resulting in a clear understanding of program outcomes. I respect evaluation like this. It yields good information, and there are compelling reasons for working this way. (For reasons why I believe this, see 1 and 2.)
- The blue region indicates that no matter how clear-cut the program and the evaluation, both are embedded in a web of entities (programs, policies, culture, regulation, legislation, etc.) that interact with our program in unknown and often unknowable ways.
- The green region depicts what happens over time. The program may be intact, but the contextual web has evolved in unknown and often unknowable ways. Such are the ways of complex systems.
- Recognizing that we have a complex system, however, does not by itself help us develop program theory, formulate methodology, or analyze and interpret data. For that, we need to focus on the behaviors of complex systems, as depicted in the red text in the table. Note that the complex behaviors form the rows of the table. The columns show the complexity themes. The Xs in the cells show which themes relate to which complexity behaviors.
| | Pattern | Predictability | How change happens | Adaptive and evolutionary behavior |
| --- | --- | --- | --- | --- |
| Emergence | | | | |
| Power law distributions | X | | | |
| Network effects and fractals | | | | |
| Unspecifiable outcome chains | | | | |
| Consequence of small changes | | | | |
| Joint optimization of uncorrelated outcomes | | | | |

Power Law Distributions
It’s hard to read much complexity literature without running into power law distributions. Whether these distributions are formally power laws, or just power law-like, is subject to debate on theoretical grounds and testing via empirical methods. There is also the question of why these kinds of distributions show up in the first place. There is a lot of deep thinking on these matters, much of which I’ll admit to not understanding very well. But it’s still true that the pattern keeps coming up when complex phenomena are investigated. What I do understand are the implications of this pattern for program theory and for the conversations that evaluators need to have with their stakeholders.
The essential difficulty is that if benefits are power law distributed, then an astoundingly successful program may provide very large benefits to a few, and almost nothing to most. This pattern is not because program designers do not try hard enough, or because they do not pay enough attention to equity. Rather, it’s because the distribution is baked into the way the world works. (For good examples of power laws in human-scale phenomena, try organizations, business, and cities. If you really want to get into the question of whether true power laws are really so common, try scant evidence of power laws.)
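To make the “large benefits to a few, almost nothing to most” pattern concrete, here is a minimal simulation sketch (not from the original post; the benefit numbers are hypothetical). It contrasts a symmetric (normal) distribution of program benefits with a power-law-like (Pareto) one, and computes the share of total benefit captured by the top 20% of recipients:

```python
import random

random.seed(42)
N = 10_000

# Hypothetical "benefit" values under two program theories.
# Symmetric theory: gains cluster around a mean (normal distribution).
symmetric = [max(0.0, random.gauss(100, 15)) for _ in range(N)]

# Power-law-like theory: Pareto-distributed gains. An exponent near
# 1.16 corresponds roughly to the classic 80/20 split.
pareto = [random.paretovariate(1.16) for _ in range(N)]

def top_share(values, frac=0.2):
    """Fraction of total benefit captured by the top `frac` of recipients."""
    ordered = sorted(values, reverse=True)
    k = int(len(ordered) * frac)
    return sum(ordered[:k]) / sum(ordered)

print(f"Top 20% share, symmetric benefits: {top_share(symmetric):.2f}")
print(f"Top 20% share, power-law benefits: {top_share(pareto):.2f}")
```

Under the symmetric theory the top fifth of recipients hold only slightly more than a fifth of the total benefit; under the power-law theory they hold the large majority of it, even though both “programs” were run identically.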
Figure 2: Power and Symmetrical Outcome Distributions
From an analytical point of view, the data are easy enough to handle. There are plenty of statisticians who know what to do. But getting program designers to realize that their program theory precludes a reasonably equitable distribution of benefits may not be such a comfortable conversation. Comfortable or not, I do think that evaluators should be sensitive to the possibility of power law distributions of outcomes when they work with their customers to develop program theory.
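As one illustration of what the statisticians would do, here is a sketch (my addition, not from the post) of the standard maximum-likelihood estimate of a Pareto tail exponent, applied to observations above an assumed cutoff `xmin`:

```python
import math
import random

def hill_alpha(values, xmin):
    """Maximum-likelihood estimate of the Pareto tail exponent
    (the exponent of the survival function) for observations >= xmin."""
    tail = [v for v in values if v >= xmin]
    if not tail:
        raise ValueError("no observations at or above xmin")
    return len(tail) / sum(math.log(v / xmin) for v in tail)

# Quick self-check on synthetic data drawn with a known exponent of 1.5;
# the estimate should come out near 1.5 for a sample this large.
random.seed(0)
sample = [random.paretovariate(1.5) for _ in range(50_000)]
print(f"estimated tail exponent: {hill_alpha(sample, xmin=1.0):.2f}")
```

In practice, choosing `xmin` and deciding whether a power law fits better than, say, a lognormal is the hard part, which is exactly why the debate about “true” power laws mentioned above exists.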
To illustrate the issue, I like to use an agricultural extension example that a friend of mine who does this kind of work assures me is not too far off the wall. The example shown in Figure 2 begins with an agricultural extension education program designed to induce farmers to employ a crop management method. The original program theory is shown in red at the left. It is straightforward. Develop the curriculum. Train farmers. Measure crop yield.
This program theory contains the implicit assumption that impact will be more or less symmetrically distributed around some mean. (It also assumes that “yield” is the only thing that matters, leaving out factors such as taste and other food attributes. But that’s a complication I don’t want to deal with.) Due to differences in farms and farmers’ capabilities, some farmers will gain more than others, but more or less, the value of the program will be experienced by all. That’s a reasonable program theory and a reasonable assumption about the distribution of outcomes. It’s also comfortable for people who are responsible for spending public money.
But what might happen if we expand the program theory to include a larger range of outcomes, as shown in yellow? These outcomes are added after a bit of thinking about the consequences of improved crop yield. The extended model adds “family living standard”, which in turn results in better living conditions in the entire village. To top it off, there is a feedback loop to indicate that as conditions in the village improve, family living standards improve.
Also note that the family/village dynamic is independent of crop yield. This disconnect implies that while the “living standard” part of the program theory is in this case related to the agricultural program, many different interventions might also activate that cycle. That raises a host of interesting questions about the best intervention to use if in fact the “living standard” outcomes are more important than the “crop yield” outcome. It’s also worth pondering whether I should have added a feedback loop between “standard of living” and “crop yield”. After all, a higher standard of living may allow a farmer to hire more help or to engage in other activities that will increase crop yield. I did not think that would be a major factor in explaining results, but others may disagree. In any case, who says that a theory of change can depict only a single hypothesis as to how a program will work?
Now the program theory gets interesting for evaluators and uncomfortable for program funders. It’s a wonderful thing that family and village conditions improve. However, I find it entirely plausible that those benefits will be power law distributed, as are income and many other resource allocations in society, along the lines of the Pareto principle. This makes for a delicate problem for evaluators. On the one hand, we revel in getting our customers to think more carefully about program theory. On the other hand, as we expand the universe of possible outcomes, we run the risk of our customers having to confront the unpleasant reality that some of the most important outcomes will be unequally distributed.
Thanks to Guy Sharrock for pointing out the error of my ways on previous drafts of this post.