Can An Evolutionary Process Generate English Text? Guest Post by David Bailey!

The Mormon Organon welcomes again guest blogger David H. Bailey! David is a researcher at the High-Performance Computational Research Department at the Lawrence Berkeley Laboratory in Berkeley, California. He is a leading figure in the field of high-performance scientific computing. He has over 100 scientific papers in that area, but to Mormon audiences he is best known for his insightful writings about Mormonism and Science issues. Welcome David!


A central precept of evolutionary biology is that a combination of random variation and natural selection is the fundamental driving force for evolution. The consensus of the vast majority of biologists is that over the course of many generations, species have diverged and adapted to their local environments, thus producing the remarkable variety of life presently seen on earth. In contrast, skeptics of evolution, including many in the creationist and intelligent design communities, assert that whereas natural biological processes may result in minor changes in a single species over time, nothing fundamentally new can arise from “random” evolution.

Some writers have drawn the analogy to English text. For example, David Foster, in a book skeptical of evolution, discusses and then refutes an argument he attributed to Thomas Huxley, namely that a few monkeys typing randomly for millions of millions of years would type all the books in the British Museum. Foster asserts that even a single line of 50 characters could not be produced in this way, since there are at least 8.5 x 10^(49) alphabetic strings of length 50; thus generating a specific given string “at random” is unlikely even over billions of years.

In response to Foster, biologist Gert Korthof points out that Huxley could not possibly have told this story in 1860, because typewriters were not commercially available until 1874. Furthermore, as both Gert Korthof and Peter Olofsson have noted, this type of argument fails to define precisely what should truly be counted as “surprising.” To correctly assess the odds of such an occurrence, one should not calculate the probability of a single specific event (every such event may have the same probability), but instead the probability of all events in a given class.

Along this line, Oxford biologist Richard Dawkins has described a simple computer program he wrote to generate the Shakespearean sentence “Methinks it is like a weasel,” starting from a randomly generated character string. His program achieved its goal in 41 evolution-like iterations, where at each iteration the population of “sentences” was scored based on how many letters agreed with the target phrase. Selective “breeding” improved the score of the best sentence until there were no errors.
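A minimal Python sketch of a weasel-style program may make the idea concrete. It is not Dawkins’ actual code: the brood size, mutation rate, seed, and the choice to keep the parent in the pool are all assumptions made for illustration.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "


def score(candidate: str) -> int:
    """Count positions where the candidate agrees with the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(parent: str, rate: float, rng: random.Random) -> str:
    """Copy the parent, replacing each character with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def weasel(offspring: int = 100, rate: float = 0.05, seed: int = 0,
           max_generations: int = 10_000) -> tuple[str, int]:
    """Breed toward TARGET; return the final string and generations used."""
    rng = random.Random(seed)
    best = "".join(rng.choice(ALPHABET) for _ in TARGET)
    generation = 0
    while best != TARGET and generation < max_generations:
        generation += 1
        brood = [mutate(best, rate, rng) for _ in range(offspring)]
        # Assumption: the parent stays in the pool, so the best score
        # never decreases from one generation to the next.
        best = max(brood + [best], key=score)
    return best, generation


if __name__ == "__main__":
    print(weasel())
```

The generation count depends on the assumed parameters and seed, so it should not be expected to reproduce the 41 iterations quoted above.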

While this is an interesting exercise, it has significant flaws, some of which Dawkins himself acknowledged. To begin with, his experiment involved only a single “species.” Secondly, Dawkins’ process was defined by a single pre-specified target, whereas biological evolution is governed instead by a complicated “fitness landscape” involving hundreds of interacting factors. Finally, Dawkins’ experiment progressed to a fixed future goal, whereas real biological evolution does not operate with any future goal in mind – each step must bestow some advantage.

A Computational Experiment

I thought it would be interesting to explore whether an evolutionary computing approach can generate not just a single, short, targeted phrase of the kind Dawkins produced, but a significant volume of text segments that are typical, say, of some genre of English literature. To that end, I wrote a computer program that begins by constructing a set of 1024 segments of text, each 64 characters long. The individual characters are chosen at random according to the natural distribution of individual characters in Charles Dickens’ novel Great Expectations. Some examples:

o ao ,fludoy aocueu feidh,iaemehaiheyh daneny shpesaems y nhte
nrtnnbaa.nn hymeo t fiilunnw nt t,ntehg eu y’ t h l dieosea ii
mbdsoee lueleciro ,ynaeenetg itln h srw l,pn uf svee,ee a’l sl
snd etke snoymnra lhs gdnu,nmrs e trlhueafpraa.c.ys f yjser g
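The initialization step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `random_segments`, the seed, and the one-sentence stand-in text are hypothetical, whereas the original program drew its character frequencies from the full novel.

```python
import random
from collections import Counter


def random_segments(reference: str, count: int = 1024, length: int = 64,
                    seed: int = 0) -> list[str]:
    """Draw segments whose characters follow the reference's frequencies."""
    rng = random.Random(seed)
    freq = Counter(reference)
    chars = list(freq)
    weights = [freq[c] for c in chars]
    return [
        "".join(rng.choices(chars, weights=weights, k=length))
        for _ in range(count)
    ]


# Stand-in text (an assumption); the post used all of Great Expectations.
sample = "my father's family name being pirrip, and my christian name philip"
segments = random_segments(sample)
print(segments[0])
```

Because each character is drawn independently, the output has realistic letter frequencies but no word structure, much like the examples above.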

The program then finds the longest consecutive match of a given segment in character position 1 up through position 16 to any 16-long segment in the text of Great Expectations. This check is then repeated for positions 2 through 17 of the segment, then for positions 3 through 18, and on until the end of the segment is reached. The sum of the match lengths for these checks is the score for the given 64-long segment. Note that this scoring function has no specific future target, but only measures how typical the given segment is of text in Great Expectations. In other words, Great Expectations plays the role of “fitness landscape.”
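One possible reading of this scoring rule is sketched below in Python. It assumes that the “longest consecutive match” for a window is the longest prefix of that window occurring anywhere in the reference text, and that only full 16-character windows are scored; the names `match_length` and `segment_score` are hypothetical, and the original program may differ in these details.

```python
def match_length(window: str, reference: str) -> int:
    """Length of the longest prefix of `window` that occurs in `reference`."""
    for k in range(len(window), 0, -1):
        if window[:k] in reference:
            return k
    return 0


def segment_score(segment: str, reference: str, window: int = 16) -> int:
    """Sum the match lengths over every full `window`-long slice."""
    return sum(
        match_length(segment[i:i + window], reference)
        for i in range(len(segment) - window + 1)
    )


# Toy reference text (an assumption) standing in for the novel.
reference = "the quick brown fox jumps over the lazy dog"
print(segment_score(reference[:20], reference))  # 5 windows x 16 = 80
print(segment_score("#" * 20, reference))        # no character matches: 0
```

Under this reading, a 64-character segment has 49 windows, so its score can be at most 49 × 16 = 784. The score rewards resemblance to any part of the reference, not to one chosen target.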

Evolutionary iterations are then initiated: First, the top-scoring segments are permitted to “mate” (i.e., randomly exchange 4-long character strings, beginning at positions 1, 5, 9, etc.) with another segment chosen at random from the top-scoring segments. Then random changes are made to these strings, much in the spirit of mutations observed in real biology. After these “mutations” have been performed, each resulting segment is scored, and the segments are sorted according to their new scores. This cycle repeats until 10,000 iterations have been performed. At the end of these iterations, the highest-scoring segment is taken to be the result of the trial, and the other 1023 segments are discarded. The computer program ran for 24,576 repetitions of the process described above, thus generating 24,576 segments of length 64 characters each.
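The mate, mutate, score, and sort cycle can be sketched in Python as below. Several details are assumptions not stated in the post: the fraction of segments treated as top-scoring (`keep`), the mutation rate, the seed, and the choice to carry the top-scoring segments over unchanged. The fitness function is passed in, so any scoring rule can be plugged in.

```python
import random
from typing import Callable, List


def crossover(a: str, b: str, rng: random.Random, block: int = 4) -> str:
    """Build a child by taking each `block`-long chunk from either parent."""
    return "".join(
        (a if rng.random() < 0.5 else b)[i:i + block]
        for i in range(0, len(a), block)
    )


def mutate(segment: str, alphabet: str, rate: float, rng: random.Random) -> str:
    """Replace each character with a random one with probability `rate`."""
    return "".join(
        rng.choice(alphabet) if rng.random() < rate else ch for ch in segment
    )


def evolve(population: List[str], fitness: Callable[[str], int],
           alphabet: str, iterations: int, keep: float = 0.25,
           rate: float = 0.01, seed: int = 0) -> str:
    """Run the cycle and return the highest-scoring segment."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(iterations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[:max(2, int(size * keep))]  # assumed cutoff
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng),
                   alphabet, rate, rng)
            for _ in range(size - len(elite))
        ]
        population = elite + children
    return max(population, key=fitness)
```

A toy run with a trivial fitness function (counting the letter “a”) shows the mechanics; the experiment described above would instead use 1024 segments, 10,000 iterations, and the match-based score.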

Many segments generated by the program, such as these four examples, contain syntax errors and nonsensical or misspelled words:

had i learn a lesson – looked at the stars, and held the gate.
i felt as if he were a surgeon or a dentistrate in the table.
did, in a comfortable about it and hear a triale beside her.
he is sure to be executed on mond another in the mire of time.

Many other segments, such as these four, are syntactically acceptable but don’t make much sense:

and gloves, and as there no one and between his countenance.
asked me why i wanted it and at her, said i, almost in a french
for three in the station that he was in it rather resented.
at remained, all these reasons for my part, he were a file.

But other segments are entirely reasonable, and could easily pass as fragments of literary text. Along this line, I constructed the quiz below, then had it administered to some college students at a large university. They were told only that some of these twenty segments of English text are extracted from the writings of Charles Dickens, and some are computer generated.

1. up at it for an instant. but he was down on the rank wet grass,
2. or do any such job, i was favoured with the employment. in order,
3. at the fire as she took up her work again, and said she would be
4. the monster was even careless as to the word that i had him so.
5. as to go with him to his father’s house on a visit, that i might
6. fitted it to nothing and get the ashes between me to the last.
7. as no relation into another that it is the same room – a little
8. a separation to be made for the desolater, like the man he was.
9. we said that as you put it in your pocket very glad to get it, you
10. that he had treated him to a little bee, he was to call the
11. if he had for a time such an interest here and contented me.
12. great iron coat-tails, as he had done, and then ran to that.
13. he saw me going to ask him anything, he looked at me with his glass
14. on my objecting to this retreat, he took us into another room with
15. been born on there, or that i had the greatest indignature.
16. the chimney as though it could not bear to go out into such a night
17. later to settle to anything i had hesitated as to the sound.
18. the greatest slight and injury that could be done to the many far
19. of it on the hearth close to the fear that she had done rather
20. out of my thoughts for a few moments together since the hiding had

The reader is invited to try to identify which of these are authentic snippets of Dickens’ writings and which are computer-generated segments produced by the scheme described above, without consulting any references. The answers are given in the Appendix below.

Looking collectively at the 66 sets of responses that I received for this quiz, the average number of correct responses per item is 40 out of 66 (60.6%), which is not a great deal higher than the 33 correct responses (50%) that one would expect from random guessing. If we look at “majority vote” statistics, the majority of the 66 responses is correct for most items, but it is wrong for items #8, 9, 11, 13, and 20. All of the computer-generated items had at least 18 incorrect responses out of 66.

It is important to note that none of the 24,576 segments produced by the computer program coincides with any 64-character segment of Great Expectations. In other words, the computer program is not merely “regurgitating” portions of the input text file. What’s more, none of these 24,576 computer-generated segments coincides with any other of the 24,576 segments in more than 17 consecutive characters, even when shifts are allowed – all 24,576 generated segments are substantially distinct. In addition, the computer program constructed numerous legitimate English words that do not appear anywhere in Great Expectations. Some examples:

administer, agitate, allowing, arrangers, assail, assessed, attenuated,
attraction, auctioned, baroness, batter, bellow, breather, chastened,
coached, conspire, contentions, credited, deceived, descension, despot,
detained, detriment, discriminate, dispensable, dispenses, distances,
easiness, elected, enhance, formations, foundered, generate, generation,
gentile, glisten, gradation, handler, hitches, inconvenient, increase,
intentionally, intentioned, intimations, iterate, lacerate, liberate,
liberated, likened, mattered, mediated, migration, ministered, mission,
necessitated, operated, positioned, possibilities, powered, prostrate,
releases, remonstration, renderings, retirements, retreated, searches,
session, silenced, simmer, situations, slinging, soothings, spheres,
statements, steamed, steers, straits, stratified, stressed, teased,
tendered, termination, thickens, threatenings, threshes, torments,
traitors, trench, utters, wandered, wither, weathers
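The distinctness check described above (no two segments sharing more than 17 consecutive characters, at any shift) amounts to a longest-common-substring computation. The Python sketch below shows one standard way to do it; the function name `longest_common_run` is hypothetical and this is not necessarily how the original program performed the check.

```python
def longest_common_run(a: str, b: str) -> int:
    """Length of the longest run of consecutive characters shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous character pair.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# The shared run here is " the gate" (9 characters, including spaces).
print(longest_common_run("held the gate", "at the gate."))
```

Applying this to every pair of the 24,576 segments takes roughly 300 million comparisons, so a production check would likely use a faster indexing scheme.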

Recall that 24,576 distinct 64-long segments of text were generated by the computer program, for a total of 1,572,864 bytes. Note that this figure is higher than the length of the computer program (17,622 bytes) plus the length of the Great Expectations input file (994,587 bytes), which total 1,012,209 bytes. In other words, the computer program generated 1.55 times as much text as the combined input data file and computer program. After compressing these files using a well-known compression utility (as a measure of underlying information), the ratio is still 1.46.
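The size arithmetic can be checked with a few lines of Python. The `compressed_size` helper is only a sketch of the compression-based measure: the post does not name the utility used, so `zlib` from the standard library is an assumed stand-in, and the 1.46 figure cannot be reproduced without the original files.

```python
import zlib

generated_bytes = 24_576 * 64          # segments x characters per segment
program_bytes = 17_622
input_bytes = 994_587
combined = program_bytes + input_bytes
ratio = generated_bytes / combined

print(generated_bytes, combined, round(ratio, 2))  # 1572864 1012209 1.55


def compressed_size(data: bytes) -> int:
    """Size after zlib compression, a rough proxy for information content."""
    return len(zlib.compress(data, 9))
```

With the actual files in hand, the compressed ratio would be `compressed_size(output) / (compressed_size(program) + compressed_size(novel))`, where the three byte strings are the file contents.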


A computer program based on methodology developed in the genetic programming community is indeed able to generate English text segments reminiscent of Dickens literature. At the least, some of the better resulting text segments are sufficiently good to fool human judges in an informal test – college students were correct in distinguishing true Dickens from computer-generated segments only about 61% of the time (on average).

Obviously a full-scale computer simulation of biological evolution would have to be much more sophisticated. It would have to incorporate thousands of species and millions of individual organisms, together with full details of a complicated and changing environment. Such a simulation is well beyond the scope of what could be done today even on the most powerful supercomputers. Nonetheless, it is clear that if the claims of creationist and intelligent design scholars (namely that “random” evolution cannot generate truly novel information) have any substance, we should be able to see evidence of this phenomenon even in modest simulations of evolutionary processes, such as the one described in this note.

But we do not see the claimed effect. Instead, we see results very much in keeping with principles of evolution that have been established in the field for many years, harkening back to the original mathematical models of evolution presented by Fisher back in the 1920s. Evolution does generate novel information.


Appendix

In the exercise presented above, these items are authentic Dickens:
1, 2, 3, 5, 9, 13, 14, 16, 18, 20
These items are produced by the computer program:
4, 6, 7, 8, 10, 11, 12, 15, 17, 19

Full details and references are available here:


25 Responses to Can An Evolutionary Process Generate English Text? Guest Post by David Bailey!

  1. S.Faux says:

    Seconds after the randomness created by the “big bang” the universe contained all the tools it needed for matter, water, life, TV, computers, and the internet. If the creation were a billiards game, then the observers would have exclaimed to the player God, “Heckuva break.”

  2. b says:

    I loved the ‘mating phrases’ bit.

    Isn’t this publishable research (likely with a little polishing and validation)?

  3. David Bailey says:

    I have submitted the full version to a journal. We’ll see what they say. The full version is available at the link at the end of the article.

  4. Mark D. says:

    This demonstration is rather dubious. Suppose instead of Great Expectations, one chose a contemporary human genome as a reference model. Then, starting with some random DNA sequence, one followed the same procedure as described above. It would hardly be surprising if the result contains lots of substrings that look vaguely like human DNA. After all, you used human DNA as a reference(!).

    The problem, of course, is that the natural landscape contains nothing remotely as specific as the human genome or the text of Great Expectations to serve as a step by step reference model.

    To suppose otherwise is to assert the existence of natural Platonic forms of enormous complexity and sophistication without any actual evidence that this is the case.

    A pre-biotic natural landscape is an exceedingly information poor environment. You have the laws of physics, perhaps an optimal temperature, stable physical environment, and some simple molecules floating around. So where in this pre-biotic soup does one get a step by step “fitness landscape” as rich as the text of Great Expectations?

  5. David Bailey says:

    It is true that the exercise above applies to an environment that already has a “computer” (e.g., DNA-replication mechanism) in place. But the point is to demonstrate that an evolutionary process can indeed produce information — Mt Improbable can be climbed — contrary to the claims of “intelligent design” scholars that evolution can produce nothing new.

    Indeed, biogenesis is a major challenge to biology. But even here, inroads have been made. In the latest Scientific American, there is an article about “peptide nucleic acid” (PNA) molecules, which may have played a role in biogenesis. Some recent re-analyses of Miller-Urey-type experiments have produced some interesting results. Other researchers are investigating whether a self-catalyzing chain of chemical reactions was the initial step to life. I don’t know how all this will turn out. But is anyone willing to stick their neck out and insist, fighting to the last soldier, that science will never be able to find a plausible scenario? I, for one, wouldn’t place such a bet, especially given the progress already made.

  6. Mark D. says:

    Suppose we move up a step and start with a single-celled organism. It still, in combination with its environment, does not have a “fitness landscape” that is remotely as rich as the text of Great Expectations. The only thing that has been added is a copy of its own DNA, and that is part of the initial state, not the transition rules. The transition rules (laws of physics mostly) are as information poor as before.

    In addition, the idea that this example demonstrates the creation of new information is incorrect. The independent information content of the result is guaranteed to be less than the sum of the information content of the initial state of the system and the information content of the transition rules.

    Standard statistical mechanics. Information is a measure of what you know about the state of the system. Entropy is a measure of what you don’t know. A random stochastic process imports entropy into the subsystem (which is the generated text in this case). It can also import information from the transition rules (which include the text of Great Expectations in this case).

    The resulting text you demonstrate here is certainly not as rich or as structured as the text of Great Expectations by any conceivable independent standard. It is just a mixture of the pseudo-entropy injected by the random number generator and information extracted from the reference text. The latter is the only source of structure in the result. No matter how much entropy you inject, you will (statistically) never get something with greater structure than Great Expectations.

    Working backwards, you cannot possibly arrive at such a result from much simpler structure unless either the laws of physics are billions of times richer than presently known, or the fitness landscape already has structure greater than or equal to what you expect to generate.

    In other words, it is statistically rational to believe that bacteria can evolve from humans, but not the other way around, unless there is a third entity with greater structure than humans have to import information from. No plausible fitness landscape has that kind of information.

  7. David Bailey says:

    I’m not so sure. Keep in mind that using a state-of-the-art data compression program as an objective measure of information content, the output of my program was larger than the input Dickens file.

    Also, keep in mind that there are text analysis schemes much more powerful than what I used, such as the system behind Google’s language translation facility. This recently won a U.S. government-sponsored competition to translate Arabic to English, as I recall, even though no one on the Google translation team knew any Arabic! Ditto for numerous other languages.

    According to Franz-Josef Och of Google, their text analysis system is based on utilizing a bilingual text collection of at least one million words and two monolingual collections of roughly one billion words each. Statistical models obtained from this data are then used to translate between the pair of languages.

    In other words, their system is doing the same sort of mundane statistical transition analysis as my program does, except that theirs is working word-to-word instead of character-to-character, and they are using a much, much larger sample of text.

    I’ll bet an evolution-style English (or French or German or Chinese or Arabic) text generation program, based on the Google analyzer, would do extremely well in producing extended amounts of believable text.

  8. SteveP says:


      “The only thing that has been added is a copy of its own DNA, and that is part of the initial state, not the transition rules. The transition rules (laws of physics mostly) are as information poor as before.”

    Actually, its presence changes the environment, creating a new one. This creates a back and forth between the environment and whatever is there. This idea is known as “niche construction.” It’s fairly well understood in evolutionary ecology. This creates new fitness landscapes almost every generation on which selection can play out.

      “In other words, it is statistically rational to believe that bacteria can evolve from humans, but not the other way around, unless there is third entity with greater structure than humans have to import information from. No plausible fitness landscape has that kind of information.”

    This mistakes how entropy plays out in these systems. These are not closed systems. There are energy inputs and therefore complexity can and will increase. David’s example here clearly and very nicely demonstrates complexity increases above what was input.

    Mark, you are mistaking thermodynamically closed systems for open ones. Different rules apply.

  9. Mark D. says:

    David B.,

    Entropy and information are measured in the same units. A data compression program does not measure information content in a relevant manner for this experiment.

    The output of a random number generator has maximal information content granted perfect knowledge _after_ the fact. Before the fact, the projected output has maximal *entropy*.

    To do a viable probability calculation, you cannot assume knowledge of the desired result. You have to calculate the ratio of the number of microstates in which the end system is in a desired (rich, structured, reproducing) state to the number of microstates in which the end system is in an undesirable state (fuzz, emptiness, death).

    Statistical thermodynamics partitions microstates of a system by total energy. Thermal entropy is a logarithmic measure of how many different microstates all look like they have the same temperature. The whole point is that you can measure and project it without knowledge of exactly which microstate (fuzz arrangement) a system is in.

    It is easy to see that the number of microstates of the system that look like nothing more than random fuzz is many orders of magnitude higher than the number of microstates of the system that are as rich and structured as the text of Great Expectations. That is why states like the latter are known as the “mountain of improbability”.

    I don’t think you can expect to convince anyone that the result of your experiment is richer or more structured than the text of Great Expectations by sight alone. You have to come up with a plausible metric for classifying the desirability (evidence of life, richness, structure, whatever) of an arbitrary state. Incompressibility (random fuzz) does not desirability make.

    A usable biological information metric has to divide microstates in a radically different fashion. You don’t have perfect knowledge in the real world, so you would have to do a Monte Carlo simulation of your algorithm and demonstrate that the probability of the state being “desirable” is a monotonically increasing function of time.

    My claim is that your system demonstrates a temporary increase that levels off at a relatively low level, as the structural information flow (measured according to any viable richness metric) from Great Expectations into the evolving text reaches equilibrium.

    Steve P., Nowhere do I assume that any relevant subsystem is closed. I do assume that the universe is a closed system however.

  10. SteveP says:

    So Mark, you don’t believe that the cultural complexity or information content of the 20th century is greater than the 7th? All of your arguments would seem to have to hold for cultural information content as well. Yet that seems an absurd claim. I see our culture as being much richer in information than theirs.

    All of your arguments from thermodynamics assume closed system thermodynamics so you are making that assumption implicitly if not explicitly.

  11. Mark D. says:

    Steve P,

    I have yet to make an argument from the thermodynamics of closed systems and have said very little about thermodynamics except by analogy. The gist of my last comment is that thermal entropy is an irrelevant metric here – it can’t distinguish between a bird and a brick, for example.

    There are at least three different senses of entropy in common use (thermal, Bayesian, algorithmic). However, as nearly everyone who hears the term thinks “thermal” and doesn’t appreciate the relation with other kinds I will avoid the term from here on out.

    As cultural complexity goes: You make an argument by contradiction. I am doing the same. I argue that generation of the proper kind of complexity is statistically impossible in a world where all causation is either deterministic or random.

    On the contrary, cultural complexity is much better evidence of the reality of consciousness and libertarian free will, the exception to the clockwork model of physics, the dreaded “ghost in the machine” that most biologists mock.

  12. Rob Osborn says:

    I don’t know a lot of big biology words or scientific terms, but I am somewhat familiar with the whole argument at hand.

    Marvelous mathematics work I say! But then that is where the marvelous stops. The problem here is that you are using an “intelligent design” element in generating the program to fit your desired results or outcome. In this sense you have already broken your own rule! You already have a measuring device in place (yourself as the programmer) to run an outcome satisfactory to a preset quantity. Basically what I am saying is that all you have created is an intelligent program that runs sets of anomalous measurements that register within the program and then keeps running the better ones (the ones that are preprogrammed) until it achieves your preset goal. You have thus created an “intelligent design” program, not a random evolutionary model program.

    It would not be that hard at all to create a computer program that could actually make up complete sentences that made sense and even make the sentences coincide with each other and make sense, creating a paragraph. From there one could conceive a short novel to be made along the same lines. There is nothing marvelous or eye-popping miraculous as far as evolutionary science is concerned though with that feat. You have just used the intelligent design model with pre-set conditions in a closed environment where everything is controlled by the designer himself! There is truly nothing random about it because it is intentionally designed and created to achieve the results you are looking for.

    Although your work is good it completely misses the mark where it matters most. The program never generated words on its own, no, it already had the design element already imbedded in it. Try something of real novelty- create a program that makes programs. Or would that be impossible because you would always have to have more intelligence for input than output!

    It all comes down to intelligence- How is intelligence in nature created. It certainly isn’t no random process- that is for darn sure! Try creating a computer that recognizes itself and feels for itself and then you just may be able to do what you really propose to do- have a seemingly random process create something tangible and intelligent.

  13. Cynthia L. says:

    Just stumbled on this video, which is a lot of fun:

    They evolve neural nets for “fish.”

  14. David Bailey says:

    “The problem here is that you are using an ‘intelligent design’ element in generating the program to fit your desired results or outcome. … Basically what I am saying is that all you have created is an intelligent program that runs sets of anomalous measurements that register within the program and then keeps running the better ones (the ones that are preprogrammed) until it achieves your preset goal.”

    Not at all — keep in mind that the program starts with random gibberish, and then proceeds to refine them according to a very complicated “fitness environment” (concordance in a specific sense with Dickens text) — definitely not selecting ones that are “preprogrammed”. In addition, the total amount of text generated exceeds that of the input.

    “Try something of real novelty- create a program that makes programs.”

    But this has been done many, many times in the genetic algorithm field (also known as “evolutionary programming” field), using a methodology very similar to mine. In numerous cases, such programs have devised computer programs or engineering designs superior to the best state-of-the-art human-assisted versions. See

  15. David Bailey says:

    Yes, the early earth was information-poor. But it was energy rich, with quadrillions of joules coming from the sun every day, not to mention heat generated by radioactive elements (which continues today) and things like occasional meteorite impacts.

    Just today Scientific American has added to their website an article about a recent study of the shock of a meteorite hitting the earth’s ocean. Researchers found that the impact generated a large number of exotic carbon-based compounds, including at least one amino acid. This is an interesting follow-up to the famous Miller-Urey experiment (recently updated by some additional studies) wherein amino acids were found in a brew subjected to lighting, etc. See

    In each case, one can ask whether the conditions assumed by the experimenters were the same as the early earth, etc. But the point of these studies is the same: thermodynamic processes (including the formation of higher-information chemical compounds) can definitely run uphill.

  16. David Bailey says:

    I meant “lightning”, not “lighting”.

  17. Brad W. says:

    I challenge any of you Darwinists who consider yourselves intellectually honest to watch this one hour documentary:

    and then tell me specifically why Intelligent Design is not science and intellectually challenging.

  18. SteveP says:

    Brad W., This is the regular nonsense from the regular crowd. Watch this one from evolutionary theist Ken Miller. He dismantles this kind of silliness quite handily:

    Why do the ID people have to engage in disingenuousness to promote their ideas? The data they ignore and misinterpret can only be called dishonest. Which is what the Dover Judge did call it.

  19. Brad W. says:

    You didn’t watch the full video, did you, Steve? I can tell because you’re still throwing out the old, worn out, cliched arguments that atheists have against Intelligent Design. I challenge you to ask your students to watch this video and then have a discussion about it in your classes. Would you be open-minded (and daring) enough to do that?

  20. Brad W. says:

    “The brethren have been talking about temporal things. We cannot talk about spiritual things (religion) without connecting with them temporal things (science), neither can we talk about temporal things (science) without connecting spiritual things (religion) with them. They are inseparably connected.” 10:329.
    –Brigham Young

  21. SteveP says:

    ‘cliched arguments that atheists have against Intelligent Design’ Ken Miller (Catholic) is a believing scientist, as I am. The ID arguments have failed all tests, both from believing scientists like Ken Miller and myself and from those who do not believe. Pretending it’s science does not make it so. Did you watch the video I put up? Apparently not, or you would know Ken Miller is a man of faith.

  22. Ben says:

    I am one of the students who took this quiz. I am an English major (this was not an English class), and I have studied every major work by Dickens extensively. It was very difficult to differentiate between the actual Dickens text and the computer generated stuff. I’m not even sure that I got all of them right.

    While this study does not constitute conclusive evidence, I think it makes a valid point.

  23. Ben says:

    As a follow-up to my last post, I would like to say that I have recently been “converted” to an acceptance of the theory of evolution, largely because I have come to know what the theory actually entails. In addition to being an English major I will graduate in one week with a BS in Genetics. I have traditionally been skeptical of evolution, which has made for some interesting mental gymnastics throughout my college career. After reading Ken Miller’s books I finally asked myself, “What would it change if the theory of evolution were true?” – meaning the theory in its pure form, not the dogmatism often spouted by those like Richard Dawkins (Dawkins is a great scientist, but his presentation, as Miller points out, is often clouded by his own beliefs) – and I came to understand that it would change nothing. Read Miller’s books, take a real course in Genetics, and then start having some informed discussion. I promise your perspective will change.

  24. Sebbe says:

    Hi David, I’ve recently read your article and wonder if you could post the program you used?

  25. Alex says:

    Not sure why this program is relevant, as it doesn’t model evolution due to your use of a pseudorandom number generator.
