drcabana.org
Computational divertissements


How to build a quine relay generator

Posted on

This is the fourth post in a series. The goal here is to extend the quine generator built in the previous post so that it can generate quine relays. Let's get started.

The quine relay generator is a python program called polyquine. It expects one or more string arguments, selected from java, python, ruby, lua, fsharp, or babashka. A single argument results in a standard quine. It is OK to repeat arguments.

Below is terminal output from a sample run. We ask for a quine relay using java, babaskha, and F#. The output is a java program, call it J. Running J would result in a babashka program, B. Running B would result in an F# program, F. Running F would result in the original program, J.

❯ ./polyquine.py java babashka fsharp
public class quine {
 public static void main(String[] args) {
  String[] data = {
 "public class quine {",
 " public static void main(String[] args) {",
 "  String[] data = {",
 "  };",
 "  int a=14; int b=15; int c=21; int sep=32;",
 "  char q = (char) 34; char space = (char) 32; char separator = (char) sep;",
 "  for (int i = a; i < b; i++)",
 "    System.out.println(data[i]);",
 "  for (int i = 0; i < data.length; i++)",
 "    System.out.println(String.valueOf(space) + q + data[i] + q + separator);",
 "  for (int i = b; i < c; i++)",
 "    System.out.println(data[i]);",
 " }",
 "}",
 "(let [data [",
 " ]",
 " [a b c sep] [21 22 31 59]",
 " q (char 34), space (char 32), separator (char sep)]",
 " (doseq [k (range a b)]           (println (nth data k)))",
 " (doseq [k (range (count data))] (println (str space q (nth data k) q separator)))",
 " (doseq [k (range b c)]           (println (nth data k))))",
 "let data = [",
 "]",
 "let a, b, c, sep = 0, 3, 14, 44",
 "let q, space, separator = string(char 34), string(char 32), string(char sep)",
 "for i in a..b-1 do",
 "   System.Console.WriteLine(data.[i])",
 "for i in 0..data.Length-1 do",
 "   System.Console.WriteLine(space + q + data.[i] + q + separator)",
 "for i in b..c-1 do",
 "   System.Console.WriteLine(data.[i])",
  };
  int a=14; int b=15; int c=21; int sep=32;
  char q = (char) 34; char space = (char) 32; char separator = (char) sep;
  for (int i = a; i < b; i++)
    System.out.println(data[i]);
  for (int i = 0; i < data.length; i++)
    System.out.println(String.valueOf(space) + q + data[i] + q + separator);
  for (int i = b; i < c; i++)
    System.out.println(data[i]);
 }
}

The output should seem familiar if you read the preceding posts. We see a list of strings followed by three print loops. The print loops are controlled by the indices a,b, and c. The trick is to compute appropriate values for those indices.

Let's look at some of the data used to generate quine relays. It is almost the same as the data used by the quine generator from the previous post. Here is the template used to generate java code:

java = [
[   "public class quine {",
    " public static void main(String[] args) {",
    "  String[] data = {",
    "  };", ],
[   "  char q = (char) 34; char space = (char) 32; char separator = (char) sep;",
    "  for (int i = a; i < b; i++)",
    "    System.out.println(data[i]);",
    "  for (int i = 0; i < data.length; i++)",
    "    System.out.println(String.valueOf(space) + q + data[i] + q + separator);",
    "  for (int i = b; i < c; i++)",
    "    System.out.println(data[i]);",
    " }",
    "}", ]

The analogous data in the quine generator was a list of strings. This is a list containing two lists of strings. What is the reason for the change? Compare the template to the java strings in the output from the polyquine run. There is a line in the output unaccounted for in the template. The template contains no string declaring the variables a,b,c.

That line will be constructed at runtime, and inserted into the template. Where? Exactly between the two lists that initially make up the template. The syntax for the line to be constructed comes from the parms found in data representing facts about the various languages:

facts = { "babashka" :{"boundary":1 , "template":babashka , "separator":32 ,
                       "size": 1 + len(babashka[0]) + len(babashka[1]),
                       "parms": " [a b c sep] [{} {} {} {}]"},

         "fsharp"  :{"boundary":1 , "template":fsharp ,  "separator":59 ,
                     "size": 1 + len(fsharp[0]) + len(fsharp[1]),
                     "parms": "let a, b, c, sep = {}, {}, {}, {}"},

         "java"    :{"boundary":3  ,"template":java ,  "separator":44 ,
                     "size": 1 + len(java[0]) + len(java[1]),
                     "parms": "  int a={}; int b={}; int c={}; int sep={};"},

         "lua"     :{"boundary":1 , "template":lua ,  "separator":44 ,
                     "size": 1 + len(lua[0]) + len(lua[1]),
                     "parms": "a, b, c, sep= {}, {}, {}, {}"},

         "python"  :{"boundary":1 , "template":python,  "separator":44 ,
                     "size": 1 + len(python[0]) + len(python[1]),
                     "parms": "a, b, c, sep = {}, {}, {}, {}"},

         "ruby"    :{"boundary":1 , "template":ruby,  "separator":44 ,
                     "size": 1 + len(ruby[0]) + len(ruby[1]),
                     "parms": "a, b, c, sep = {}, {}, {}, {}" }

Let's take a look at the mechanics of computing a,b,c for langs = ['java', 'babashka', 'fsharp'], as in the sample run. Here is the code.

def populate_templates(langs):
    ''' Compute required offsets per language choices, inject them into raw templates. '''
    offsets = partial_sums([ size(l) for l in langs[:-1] ])
    templates = []
    num_langs = len(langs)
    for k in range(num_langs):
        langX = langs[k]
        langY = langs[(k+1) % num_langs]
        a = offsets[(k+1) % num_langs]
        b = a + boundary(langY)
        c = a + size(langY)
        parms = parameters(langX).format(a, b, c, separator(langY))
        top,bottom = template(langX)
        templates.append(top + [parms] + bottom)
    return templates[0],concat(templates)

The for loop will run three times, because we use three languages. Each pass handles two of the languages, langX and langY. In the first pass these are java and babashaka. In the second pass they are babashka and F#, and in the final pass F# and java. This reflects the goal of building a java program that emits a babashka program that emits an F# program that emits the original java program.

Consider the first pass. There langX is java, langY is bashska. Imagine that we have concatenated the templates for our three languages into a single list. The variable a is the offset of langY in that list. By construction, all three indices a,b,c will point into the langY strings. Those indices will be used as values in the langX template. Why? Because we are generating a langX program that emits a langY program.

The populate_templates function returns a pair of values. The first value is the completed langX template. The polyquine function will need that because it will construct a program in langX. The second value is the concatenation of all the completed language templates. That list is exactly the data seen in the sample quine relay.

With those two values in hand, constructing the quine relay is almost identical to the way the earlier quine generator constructed quines. Here it is:

def polyquine(langs):
    ''' Generate a polyquine in the specified languages. '''
    code,payload = populate_templates(langs)
    bnd = boundary(langs[0])
    sep = chr(separator(langs[0]))
    data = [quote(line, sep) for line in payload]
    return code[:bnd] + data + code[bnd:]

Testing

The repo for the polyquine project includes a test harness. It uses python to generate bash scripts that invoke the various language runtimes and the diff utility. I was on a code generation roll, and decided to run with it.

Running the tests will likely require customization of the both the host system and the test script generator. The host needs to have the various languages installed. The test script generator needs to know the paths on the host to the language runtime executables. I decided it was out of scope to explain how to install the various languages, instead leaving it as an exercise for the motivated reader.

I built and tested this code on macOS Catalina. I suspect it could easily be made to run on linux, but have not attempted it. On Windows, who knows? There I'd try to use the WSL. I toyed with the idea of setting up a container or a VM with the various language runtimes installed, but decided that was one yak too many to shave.

Concluding remarks

If you read all four posts in this series, thank you. I applaud your patience and hope you enjoyed them. I tried to make a tricky bit of programming as clear and intelligible as I could. I wanted to explain how the quine relay trick is done, and so demonstrate that it is simpler than would first appear. Whether I succeeded is for others to decide.