I’m doing a presentation at the U.S. Department of State’s Fourth Annual Conference on Program Evaluation: “Diplomacy, Development, and Defense – Working Together to Achieve Foreign Policy Goals” June 7-8, 2011

I’m looking for comments and critique on this draft of my presentation slides.


Purpose and Assumptions

The focus of the proposed presentation is methodology to evaluate unanticipated and unintended consequences of program action. It is based on an evaluation theory and a set of case studies developed by the presenter.[1] The specific track for this proposal is “building evaluation capacity”, with an immediate impact on “development”, and a longer term impact on the “building democracy” element of the “diplomacy” theme. The proposed presentation is based on two principles.

§         Good evaluation often depends on maintaining the integrity of an evaluation design over time. For instance, it may be necessary to interview service recipients within a narrow window of opportunity after they received a service, to apply a previously validated scale over a protracted period of time, or to maintain good relations with program staff so that observations of their work can be made.

§         In order to assure the integrity of an evaluation’s design, scarce resources are needed to build and maintain an “evaluation infrastructure”. To continue the previous example, it takes time and effort for evaluators to maintain a set of agreements with program managers and policy makers, or to develop and validate scales. Once these resources are used, there is that much less opportunity to adjust the evaluation design in the face of unexpected program behavior.

These principles are challenged by the fact that the consequences of programs are often different from what planners expected, and therefore different from the outcomes that evaluators had planned to measure. How, then, to best maintain the power of an evaluation design and still be responsive to changing needs? This is the question that the panel will address. As the answer unfolds it will have major consequences for how research questions are formulated, how strategy is planned, and how programs are monitored. Consequences for question formulation and strategic planning derive from unintended outcomes whose roots are in narrow conceptualizations of program theory and outcome. Consequences for monitoring derive from the fact that “lead time” for detecting incipient program change is a critical element in evaluation response to unanticipated change. Good monitoring can increase lead time.

Unexpected Outcomes in Development and Democracy

The techniques to be advocated are particularly relevant in development settings because programs in those contexts are extremely prone to outcomes that were not anticipated by planners and policy makers. This uncertainty exists because development programs often involve rich, tight linkages that affect many aspects of the systems in which they reside, and also because the environments in which they exist can be unstable.

The role of unexpected program outcome is particularly important when trying to assess the relationship between program outcome and democracy because of a mismatch between what is known about the relationship between democracy and development on the one hand, and the causal path between a development program and the precursors to democracy on the other. We have a good idea about what brings about democracy:

The good news, however, is that the conditions conducive to democracy can and do emerge—and the process of “modernization,” according to abundant empirical evidence, advances them. Modernization is a syndrome of social changes linked to industrialization. Once set in motion, it tends to penetrate all aspects of life, bringing occupational specialization, urbanization, rising educational levels, rising life expectancy, and rapid economic growth. These create a self-reinforcing process that transforms social life and political institutions, bringing rising mass participation in politics and—in the long run—making the establishment of democratic political institutions increasingly likely. Today, we have a clearer idea than ever before of why and how this process of democratization happens.[2]

From the point of view of program theory and evaluation though, any given development program may affect any democracy precursor in many ways, not all of which will be foreseen by planners and evaluators. Moreover, the evaluation challenge is compounded in the typical situation where many different development programs coexist within the same system boundaries. Finally, given the many paths and interactions that may develop, the same result may come about through many different paths. To illustrate:

Example: Difficulty of Relating Programs to the Long-Term Goal of Promoting Democracy

Imagine an overly simple scenario in which two programs are operating, one whose primary goal is job training, and one that provides affordable cell phone service in rural areas. Again to oversimplify for the sake of illustration, we may state a program theory for each program. The job training program leads directly to “occupational specialization”. The cell phone program increases the richness of social contact and the ability of farmers to peg their prices to world commodity prices. These are reasonable immediate outcomes to expect, and they do need to be justified (i.e. evaluated) as such. Understanding their consequences for democracy, however, is complicated. One problem is that many possible interactions can take place. Will the richer social networks make it easier for people to find ways to specialize their work? Will better commodity pricing allow more people to take advantage of the job training? Will unequally distributed rising levels of wealth and education support or upset the existing social structure? Might the changed social structure set in motion its own undesirable consequences? Any of these outcomes are possible, as are many more that one could conjure with a little bit more time and imagination. From the point of view of evaluation these uncertain outcomes are problematic for methodological reasons. Again, to take two overly simple examples: What if, after the evaluation was established, we suspected an unanticipated interaction between occupational specialization and social relationships? Or what if we suspected that agricultural pricing in the location of the cell phone intervention was having larger scale impact because of its contribution to a tipping point change in a region that was experiencing the effects of other development programs? Suppose both were occurring and we cared about their relative impact? The first example is about micro-level short term change. The second is longer term and larger scale. They may or may not interact.
What they have in common is that the original evaluation infrastructure would be inadequate to determine the state of affairs. Different people would need to be interviewed. Different statistical data bases would have to be queried. Different data collection timelines would have to be worked out. Different comparison groups would be needed. None of this is cheap or easy.

Methods for Evaluating Unintended Program Impact

The approach to be advocated in the proposed presentation is set out in the writing of the presenter, and can be summarized in the graphic below.[1]


§         There is a continuum that ranges from events that “could reasonably be foreseen” to those that are impossible to predict because they emanate from the operations of complex systems.

§         Different evaluation tools are differentially useful at different points along this continuum. For instance, applying diverse program theories is particularly useful for anticipating outcomes; monitoring and evaluation are useful for early detection of unexpected change; and multiple data sources make the evaluation capable of measuring a wider range of program behavior.

§         It is impossible to deal with all eventualities, but there are ways to “chip away” at the problem and by so doing to make evaluation more robust in the face of program change.

§         Any design choice to make an evaluation more robust in the face of change carries its own potentially negative consequences, so choices have to be made carefully. For instance, the resources needed to maintain multiple data sources may diminish resources for interaction with stakeholders, or effort devoted to analysis.
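The “early detection” point above lends itself to a small illustration. The sketch below is hypothetical, not part of the presentation: the indicator series, the six-period baseline window, and the two-sigma threshold are all assumptions chosen for the example. It flags observations that drift outside a rolling baseline band, which is the kind of early signal that buys an evaluation lead time before a design change becomes necessary.

```python
from statistics import mean, stdev

def drift_alerts(series, window=6, threshold=2.0):
    """Flag points that fall outside a rolling baseline band.

    Returns the indices where an observation deviates from the mean of
    the preceding `window` points by more than `threshold` standard
    deviations of that same baseline.
    """
    alerts = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(series[i] - mu) > threshold * sigma:
            alerts.append(i)
    return alerts

# Hypothetical monthly indicator (say, job placements) that shifts
# abruptly at month 10, an early signal worth investigating.
indicator = [50, 52, 49, 51, 50, 53, 51, 50, 52, 51, 70, 72]
print(drift_alerts(indicator))  # → [10, 11]
```

Any real monitoring scheme would of course need indicators chosen from the program theory and thresholds tuned to the data; the point is only that a cheap, routine check of this sort can surface unexpected change while there is still time to adapt the evaluation.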

[1] Morell, J. A. (2005). “Why Are There Unintended Consequences of Program Action, and What Are the Implications for Doing Evaluation?” American Journal of Evaluation 26(4): 444-463.

Morell, J. A. (2010). Evaluation in the Face of Uncertainty: Anticipating Surprise and Responding to the Inevitable. New York, Guilford.

Workshop: http://www.jamorell.com/documents/UC_AEA_CDC.pdf

[2] Inglehart, R. and C. Welzel (2009). “How Development Leads to Democracy: What We Know About Modernization.” Foreign Affairs 88(2): 33-49.

6 thoughts on “Evaluating the relationship between development and democracy. Comments sought on draft of a presentation”

  1. I think there’s a fundamental disconnect between your introduction – which focuses on the collective impact of multiple programs (presumably over some extended period of time) – and your subsequent discussion of unintended consequences, which, as I read it, sort of takes for granted that there is a program being evaluated which has planned consequences (and therefore may have unintended consequences). The basket of programs which impact development do not have an intention, and therefore cannot have an “unintention.” I think you were on to something in the beginning of the presentation – it’s certainly a reality that no single program is going to have much impact on development – and I think you should work from that premise, to see where it leads, rather than jumping to an existing theory you’ve been working on for a while. I just don’t see the fit right now.

  2. Jonathan,

    1. I don’t see how your examples of how evaluation designs must preserve integrity over time make the point.

    2. You ask “How then, to best maintain the power of an evaluation design and still be responsive to changing needs?” One response to this, albeit one that needs to attend to other issues such as lead time, is to not use an outcome-based design. Wouldn’t a “developmental evaluation” design fit better here if you could develop a system for monitoring critical indicators as a formative task? You state “Good monitoring can increase lead time.” But there are different ways to monitor and different aspects of a program/policy to monitor. What these are defines what lead time you can expect to achieve.

    3. You make the observation: This uncertainty exists because development programs often involve rich, tight linkages that affect many aspects of the systems in which they reside, and also because the environments in which they exist can be unstable. However, I think a stronger point of some type needs to be made here because many of the programs we evaluate these days, e.g., education, health, have rich tight linkages in the systems in which they are being implemented and are being implemented in unstable, rapid change environments. I’m not sure this is as unique as you make it sound to the programs to which you are referring. Something that is perhaps more unique in the foreign relations arena is that, in the areas creating the most focus for us, the rich tight linkages are less known and less understood, and the instability of the environments has regional if not global repercussions.

    4. You state: Today, we have a clearer idea than ever before of why and how this process of democratization happens.[2] I think this was true when international relations focused on the nation state as the decision, action and change agent. I think, and have read (but durn it, don’t have the references), that this process of democratization has changed dramatically in areas fragmented by socio-cultural differences and most importantly, absolutist religious/political beliefs. Change agents now cross national borders and make it difficult to implement the policies and practices that advance industrialization and democracy. This view smacks of the presumption that everyone in the world buys off on the notion of “the good life” or “the good society” as being one in which productivity pays off economically. But this view is not in fact shared globally or by trans-national groups who have political and military clout.

    5. In your first simplified example of job training and cell phone programs, you state: “From the point of view of evaluation these uncertain outcomes are problematic for methodological reasons.” But a huge reason these uncertain outcomes are problematic has nothing to do with evaluation but rather the equation of industrialization and democracy. You have suggested the former leads to the latter but right or wrong, the two are not the same set of principles and processes.

    6. There is much of value in your last section. But I do take issue with your characterization of program theory. Within the intersection of diplomacy, development, and foreign relations, “program theories” are inextricably related to larger, strategic political theories which are usually related to economic theories. In thinking about development programs that ultimately connect with this intersection, we must clarify these broader theories and then build evaluations that not only assess micro-policies but provide the framework for seeing and anticipating contradictions with normative dimensions of variation.

    Really interesting task you have here! I’ll be interested to keep reading about your progress on your presentation!


    1. Hi Joanne –
      You have given me a lot to think about. I guess that is what blogs are for. I see two major themes in your comments – 1) evaluation methods and 2) development and democracy.

      As for whether my example makes the point, Mitch hit on a similar topic. I’ll need to give that some thought and figure out what to do to bring out the ideas I want to highlight.

      As for an outcome based design, I make no apologies. I think it is entirely reasonable to have a particular outcome in mind, and to care about whether that outcome is achieved. Does a STEM initiative increase the representation of women in science and engineering? Does a public health program increase the ability of diabetics to maintain their weight? Does abstinence only sex education decrease teenagers’ sexual activity? Does a change in safety culture decrease accidents and derailments in the railroad industry? And so on. I don’t dispute that a developmental approach can often be useful in these cases. But I also believe something else.

      Answering these questions is really hard and to say that we will only use developmental approaches with a monitoring and formative emphasis abandons a host of powerful evaluation tools. And a big subset of those evaluation tools require design integrity over time. We may need to conduct data collection during narrow windows of opportunity. (E.g. just before entry into treatment, at defined times after treatment, etc.) We may need control groups, and control groups often need tending to assure their availability. We may need carefully validated instruments that take time and effort to develop and which cannot be changed easily. We may need considerable effort to get access to corporate or agency data, and changing the data request mid-course may not succeed. Etc. It’s the very fact that this bind exists – between unexpected behavior and the value of many powerful evaluation tools – that makes this whole question so interesting for me.

      As for using a developmental approach to help with monitoring and lead time. Yup. That is a great idea and everyone should do it.

      In that same paragraph you touch on the notion that there are different aspects of programs and policies to monitor. To me that is an exceedingly interesting notion because it gets to the methodology needed for monitoring. In a way the question poses the very same issues that I write about in my book relative to evaluation as a whole. (One might say the same dynamics scale, kind of like fractals.) What needs to be monitored? Some indicators can be predicted in advance, and some can’t. It’s the same continuum that I set up for all of evaluation, from the highly predictable to that which for theoretical reasons (complexity etc.) can never be anticipated.

      In the example I use in the presentation, I make a specific point that what I’m presenting does NOT purport to measure the impact of development on democracy. I believe that there is such a relationship, but that it’s not a practical evaluation question. I do claim that there is a theory of democratization that posits “political participation” as a precursor to democracy, and that based on that theory (and the data behind it), it is reasonable to constrain the problem by looking only at political participation.

      One of the big challenges I had in developing this presentation was that the conference is interested in evaluating how development affects democracy. We have been (and could continue on and on forever) explaining how problematic and impractical it would be for evaluators to show that relationship. That’s why I constrained the conversation to an outcome short of democracy. Of course other factors are important, e.g. legal protections for minorities (else we have tyranny of the majority, which does not make for democracy), corruption (which decreases the legitimacy of political participation as a change mechanism), the position of the elites about the value of democracy, cultural beliefs about authoritarianism, the depth of rifts in the society (e.g. ethnicity, religion), commodity prices, and much else besides. I could have worked out an example that included all that, but it seemed a bit much for the point I was trying to make.

      Your comment about transnational groups opposing democracy is well founded. (More is the pity.) But as far as I am concerned this is just one more of the multitude of forces opposing democracy, many of which are internal to whatever nation state we may be talking about. There are ideological reasons to oppose democracy. There are selfish reasons to oppose democracy. And there is certainly no historical determinism operating here.

      I most certainly do not believe that “the good life” or “the good society” has to be one in which “productivity pays off economically”. But I do believe that material wealth spread beyond the elites is an important factor in raising education levels, providing people with the “leisure”, self interest and skills to get involved politically, facilitating interest groups, and a variety of other factors that affect the likelihood that democracy will emerge. As for what “democracy” means, and what forms it can take, that is a whole other question.

      I really appreciate your making the effort to respond to my blog post. I can’t figure this stuff out by myself.

  3. Jonathan,

    I pretty much agree with your responses to my comments. My comments were intended seriously but also to prod your thinking. One thing comes to mind. You say: “We have been (and could continue on and on forever) explaining how problematic and impractical it would be for evaluators to show that relationship.” Let me see if I can get clear on what I’m thinking here. I think I agree that it is both problematic and impractical for evaluators to try to evaluate how development affects democracy. But I think as evaluation theorists we can analyze how different theories of development have different implications for a preferred vision of development, which inevitably leads to a preferred vision of how we are to organize ourselves politically, especially when we analyze the implications different theories of development have for the other end or ends of the theoretical dimension of development. Now this conference may not be interested in this question but it should be, as there are different theoretical visions of development and they do have different implications for governance, political decision making, etc. Just an aside.

    As I said, I’ll look forward to how you progress with the presentation! I’m very impressed you were invited to present!


  4. Dear Jonathan,

    It was not really easy to grasp the ideas on the basis of the PowerPoint presentation and the text added. I thus do not really know whether my comments will be of any value.

    -I think that your presentation is applicable to more issues (and maybe better applicable) than to the relationship between development and democracy, on which there is a lot of discussion and no final ‘program theory’. It is also not clear to me why you are choosing the specific program theory, which is mainly based on one article. It could be interesting to apply programme theory evaluation (cf. White and Carvalho), where you consider for each of the steps in the programme theory possible alternative theories (which you may derive from other articles that deal with the relationship between development and democracy, see e.g. Collier).

    -What I found interesting is the idea of using different evaluation techniques at different stages. You might also have a look at the articles on ‘shoestring evaluation’ (Bamberger) which also deal with the problem of evaluating under constraints of time, resources and complex settings.

    -an issue which I found missing is the idea that the context in which you are doing an evaluation might also influence the evaluation itself (politics of evaluation). I add an article of a colleague and myself on this (influence of Political Opportunity Structure on different dimensions of evaluation).

    I would of course be interested to see further versions of your ppt. (and maybe an article which could make the ideas easier to grasp).

    Kind regards,

  5. Hi Nathalie,

    Your post raised a few thoughts in my mind. First, my general approach is certainly applicable to a wide variety of subjects. This presentation just happens to be on a topic of interest to the attendees at a particular conference. In fact one of the reasons I was attracted to this meeting was a chance to apply my ideas to yet another domain. As for my approach being better applicable to other settings, you are not alone in this belief. Mitch was certainly clear in his opinion about that. I do think my approach is applicable, but clearly I have to give it more thought, which I will.

    As for why I chose the program theory I did, the decision was based on two factors. First, it made sense to me based on my reading. Second, I felt that all I needed was one good theory to apply as an example. For my purposes “good” meant plausible and useful for making the points I wanted to make. I think the one I chose is theoretically and methodologically defensible, and it works well to set up a scenario I can use. Whether that particular theory is really the one (or one of a set) that actually explains the formation of democracy – well that is an entirely different matter.
    Your point about multiple program theories is very well taken. It has many advantages, not the least of which is that it forces all involved to think harder about what the research says and about what assumptions they are making. It often bothers me that evaluations are based on “program theory” as defined by a consensus of stakeholders. A logic model is built and off everyone goes, happily marching down a single well defined evaluation path. That path may go to the right place, but it can also lead off cliffs and into swamps. That’s one of the reasons I put so much emphasis on alternate evaluation theories in my book as a way of anticipating “unforeseen” consequences of program behavior. I also think that political ideology has a major effect on program theory, and hence, on evaluation. I’m hoping to stir up a bit of trouble at next year’s AEA meeting on this topic: “Surprise in Evaluation: Values and Valuing as Expressed in Political Ideology, Program Theory, Metrics, and Methodology” (AEA 2011 think tank proposal).

    I also think that different program theories have different implications for methodology, and that is a topic worthy of more discussion than it gets. To take a simple example, imagine two program theories about the same program, one of which posits an interaction between two outcomes, and one of which does not contain that interaction. An evaluation methodology based on each program theory would be quite different.
