drcabana.org
Computational divertissements


How to build a quine generator

Posted on

The plan per the previous posts has been to:

  1. Develop a way to write quines by hand, described in the first post.
  2. Use the lessons learned to write a quine generator.
  3. Use the further lessons learned to write a quine relay generator.

Step 2 is the subject of this post. So far we have seen a pair of quines, one in python and one in F#. Each consists of data and logic used to control the printing of said data. The plan here is to repackage the data and the logic. A single python program will handle the printing logic across all languages. The data representing the various quines will live in in python lists, one per language.

Let's start by looking at (part of) the data for one language:

fsharp = [
    "let data = [",
    "]",
    "let boundary = 1",
    "let a, b, c = 0, boundary, data.Length",
    "let quote, space, separator = string(char 34), string(char 32), string(char 59)",
    "for k in a..b-1 do",
    "   System.Console.WriteLine(data.[k])",
    "for d in data do",
    "   System.Console.WriteLine(space + quote + d + quote + separator)",
    "for k in b..c-1 do",
    "   System.Console.WriteLine(data.[k])"
    ]

Starting with that data, here is the basic idea for the printing logic.

def quote(line, separator):
    q = '"'
    return ' ' + q + line + q + separator

def quine():
    code = fsharp
    boundary = 1
    data = [quote(line, ';') for line in code]
    return  code[:boundary] + data + code[boundary:]

for line in quine():
    print(line)

The quine() function returns a string representing a quine. As seen in the return statement, that string is built by concatenating three pieces. Those three pieces correspond exactly to the three print loops in our earlier hand-written quines.

The sketch above needs to be generalized so that it can generate quines in multiple languages. We'll modify quine() to take an argument specifying which language to use. We'll have to look up language specific details, such as the character used to separate elements of a list. Here's the revised idea:

def quine(lang):
    code = template(lang)
    bnd = boundary(lang)
    sep = separator(lang)
    data = [quote(line, sep) for line in code]
    return code[:bnd] + data + code[bnd:]

That's pretty much the final version, except for the glue code used to implement the various lookups. The complete code is available here. It includes language templates for python, F#, lua, ruby, java, and babashka. Also included is a slightly whimsical test harness that uses python to generate bash scripts that test whether a purported quine is actually a quine. It just felt right to use generated test code to test generated code.

One last thing. Consider these lines from the F# template above:

    "let a, b, c = 0, boundary, data.Length",
    "for k in a..b-1 do",
    "   System.Console.WriteLine(data.[k])",
    "for d in data do",
    "   System.Console.WriteLine(space + quote + d + quote + separator)",
    "for k in b..c-1 do",
    "   System.Console.WriteLine(data.[k])"

Why take the trouble to declare variables a,b,c and then write the print loops in terms of those variables? Why not instead use the known values of the variables in the loop indices? Those values are known at design time when generating quines, but they will have to be computed at runtime when generating quine relays. In that scenario it will be convenient to have parameterized loop indices. More on that in the next post.

The supporting code repo is drcabana/quines.