Monday, December 28, 2015

Aiming for the stars versus "the adjacent possible"


Background: I have been exploring the uses of a new Excel application, provisionally called EvalC3, that I have been developing with the help of Aptivate. You can find out more about it here: http://evalc3.net/

If you have a data set that describes a range of attributes of a set of projects, plus an outcome measure for these projects which is of interest, you may be able to identify a set of attributes (aka a model) which best predicts the presence of the outcome.

In one small data experiment I used a randomly generated data set, with 100 cases and 10 attributes. Using EvalC3 I found that the presence of attributes "A" and "I" best predicted the presence of the outcome with an accuracy of 65%. In other words, of all the cases with these attributes 65% also had the outcome present.
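
The calculation behind that 65% figure can be sketched in a few lines of Python. This is only an illustration and not how EvalC3 itself works: the function names, the random seed and the data layout (one dict of 0/1 values per case) are my assumptions, and the random data here will not reproduce the numbers above.

```python
import random

def make_cases(n_cases=100, attributes="ABCDEFGHIJ", seed=0):
    """Build a random data set: each case has ten 0/1 attributes and a 0/1 outcome."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n_cases):
        case = {name: rng.randint(0, 1) for name in attributes}
        case["outcome"] = rng.randint(0, 1)
        cases.append(case)
    return cases

def model_score(cases, required):
    """Share of the cases having every attribute in `required` that also have the outcome.

    Returns None when no case in the data set has all the required attributes.
    """
    matching = [c for c in cases if all(c[a] == 1 for a in required)]
    if not matching:
        return None
    return sum(c["outcome"] for c in matching) / len(matching)

cases = make_cases()
score = model_score(cases, ["A", "I"])
print("Model A + I:", score)
```

Note that this score only looks at the cases where the model's attributes are present, which is the sense in which "accuracy" is used in the paragraph above.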

Imagine I am running a project with the attributes D and J but not A or I. In the data set this set of attributes was associated with the presence of the outcome in 49% of the cases. That is not very good, so I probably need to make some changes to the project design. But if I want to do the best possible, according to the data analysis so far, I will need to ditch the core features of my current project (D and J) and replace them with the new features (A and I). This sounds like a big risk to me.

Alternatively, I could explore what Stuart Kauffman has called "the adjacent possible". In other words, I could make small changes to my project design that might improve its likelihood of success, even though the improvements might fall well short of the optimum level shown by the analysis above (i.e. 65%).

If data were available on a wide range of projects I could do this exploration virtually, in the sense of finding other projects with similar but different attributes to mine, and seeing how well they performed. In my data-based experiment my existing project had attributes D and J. Using EvalC3 I then carried out a systematic search for a better set of attributes that kept these two original attributes but introduced one extra attribute. This is what could be called a conservative innovation strategy. The search process found that including a particular extra attribute in the design improved the accuracy of my project model from 49% to 54%. Then introducing another particular attribute improved it to 59%.
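
A conservative search of this kind can be sketched as a simple one-step-at-a-time procedure. The Python below is a minimal illustration under my own assumptions, not EvalC3's search algorithm: the function names are hypothetical, the data are random, and the search simply keeps the existing attributes, tries each single extra attribute, and adopts the best one only if it improves the score.

```python
import random

ATTRIBUTES = list("ABCDEFGHIJ")  # ten hypothetical attribute names

def make_cases(attributes, n_cases=100, seed=1):
    """Random data set: each case is a dict of 0/1 attributes plus a 0/1 outcome."""
    rng = random.Random(seed)
    return [
        {**{a: rng.randint(0, 1) for a in attributes}, "outcome": rng.randint(0, 1)}
        for _ in range(n_cases)
    ]

def score(cases, required):
    """Share of cases with all `required` attributes that also have the outcome."""
    matching = [c for c in cases if all(c[a] == 1 for a in required)]
    if not matching:
        return 0.0  # no comparable cases, so treat the design as unsupported
    return sum(c["outcome"] for c in matching) / len(matching)

def best_single_addition(cases, current, attributes):
    """Try each attribute not yet in the design; return the best (attribute, score)."""
    candidates = [a for a in attributes if a not in current]
    scored = [(a, score(cases, current + [a])) for a in candidates]
    return max(scored, key=lambda pair: pair[1])

def conservative_search(cases, start, attributes, steps=2):
    """Add one attribute per step, keeping the existing ones; stop if nothing improves."""
    design = list(start)
    path = [(design, score(cases, design))]
    for _ in range(steps):
        if len(design) == len(attributes):
            break  # nothing left to add
        added, new_score = best_single_addition(cases, design, attributes)
        if new_score <= path[-1][1]:
            break  # no adjacent design does better
        design = design + [added]
        path.append((design, new_score))
    return path

for design, value in conservative_search(make_cases(ATTRIBUTES), ["D", "J"], ATTRIBUTES):
    print(design, round(value, 2))
```

One caution with any search like this: each extra attribute shrinks the number of matching cases, so later scores rest on less evidence than earlier ones.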

So what? Firstly, if you are running an existing project and there is a real-life data set of reasonably comparable (but not identical) projects, you would be able to explore relatively low-risk ways of improving your performance. The findings from the same data set on the model which produced the best possible performance (65% in the example above) might be more relevant to those designing new projects from scratch. Secondly, your subsequent experience with these cautious experiments could be used to update and extend the project database with extra data on what is effectively a new case, i.e. a project with a new set of attributes slightly different from its previous status.

The connection with evolutionary theory: On a more theoretical level you may be interested in the correspondence of this approach with evolutionary strategies for innovation. As I have explained elsewhere: "Evolution may change speed (e.g. as in punctuated equilibrium), but it does not make big jumps. It progresses through numerous small moves, exploring adjacent spaces of what else might be possible. Some of those spaces lead to better fitness, some to less. This is low cost exploration, big mutational jumps involve much more risk that the changes will be dysfunctional, or even terminal". A good read on how innovation arises from such re-iterated local searches is Andreas Wagner's recent book "Arrival of the Fittest".

Fitness landscapes: There is another concept from evolutionary theory that is relevant here. This is the metaphor of a "fitness landscape". Any given position on the landscape represents, in simplified form, one of many possible designs in what is in reality a multidimensional space of possible designs. The height of any position on the landscape represents the relative fitness of that design, higher being more fit. Fitness in the example above is the performance of the model in accurately predicting whether an outcome is present or not.

An important distinction that can be made between fitness landscapes, or parts thereof, is whether they are smooth or rugged. A smooth landscape means the transition from the fitness of one design (a point in the landscape) to that of another very similar design located next door is not sudden but gradual, like a gentle slope on a real landscape. A rugged landscape is the opposite. The fitness of one design may be very different from the fitness of a design immediately next door (i.e. very similar). Metaphorically speaking, immediately next door there may be a sinkhole or a mountain. A conservative innovation strategy as described above will work better on a smooth landscape, where there are no sudden surprises.

With data sets of the kind described above it may be possible to measure how smooth or rough a fitness landscape is, and thus make informed choices about the best innovation strategy to use. As mentioned elsewhere on this website, the similarity of the attributes of two cases can be measured using Hamming distance, expressed here as the proportion of all their attributes which are different from each other. If each case in a data set is compared to all other cases in the same data set then each case can be described in terms of its average similarity with all other cases. In a smooth landscape very similar cases should have a similar fitness level, i.e. be of similar "height", but the more dissimilar cases should have more disparate fitness levels. In a rugged landscape the differences in fitness will have no relationship to similarity measures.
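
One possible way of turning this idea into a number is sketched below in Python. It is an illustration under my own assumptions, not a method taken from EvalC3: it assumes each case has a numeric fitness value, compares every pair of cases, and correlates their Hamming distance with the size of the gap in their fitness. A clearly positive correlation would suggest a smoother landscape, and a correlation near zero a more rugged one. The function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

def hamming(a, b):
    """Proportion of attribute positions on which two equal-length cases differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series has no variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def smoothness(cases, fitness):
    """Correlate pairwise Hamming distance with pairwise difference in fitness."""
    pairs = list(combinations(range(len(cases)), 2))
    distances = [hamming(cases[i], cases[j]) for i, j in pairs]
    gaps = [abs(fitness[i] - fitness[j]) for i, j in pairs]
    return pearson(distances, gaps)

# Four toy cases with three attributes each, and a made-up fitness value per case.
cases = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
fitness = [0, 1, 2, 3]
print(smoothness(cases, fitness))
```

In this toy example fitness rises by one with each attribute added, so more distant cases always differ more in fitness. Real data would sit somewhere between that extreme and no relationship at all.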

Postscript: In my 2015 analysis of Civil Society Challenge Fund data it seemed that there were often adjacent designs that did almost as well as the best performing designs that could be found. This finding suggests that we should be cautious about research-based or evaluation-based claims about "what works" that are too dogmatic and exclusive of other possibly relevant versions.