Realizing a Quine

March 4, 2019

A quine is a source program, that when compiled and executed, will produce as output an exact copy of its source. Of all languages I know, I am most well acquainted with C++, so I decided to realize a quine in it only out of curiosity, since quines have little practical significance. I first read about quines in the 1983 Turing Award paper, and as soon as I did, I quickly got down to working on it. This made me come up with the following program:


#include <iostream>
#include <fstream>
using namespace std;

int main()
{

    ifstream f;
    f.open("quine.cpp");

    string line;

    while(getline(f,line))
        cout << line << endl;

    f.close();
    return 0;
}

As I later realized, this obviously wasn't anything profound, so after a while, I came across the fact that "stepping outside the program" is considered as cheating. I had to create a program that took no input, and printed itself as an output. The first thing that came to mind was a single cout statement.

Input:


#include <iostream>
#include <string>
int main()
{
    std::cout << "#include \n#include \nint main()\n{\n\tstd::cout << \"";
    return 0;
}

Output:


#include <iostream>
#include <string>
int main()
{
	std::cout << "

But you can only go as far as that. Whatever you write inside the cout within the quotation marks, has to be printed as it is a part of the source itself. So, if you start writing the string #include <ios..., just after the original string ended, by the time you get to the end of the modified string, you will have more characters to add. And when you add those characters, you will reach the end of the string again only to find more characters to add, and so on, it will stretch out till infinity.

Thankfully, C++ helps us out, by allowing us to store strings in memory. What this means is that instead of writing a string directly into the console, we can first store it in a variable. And when that variable's declaration is encountered in code, it can use its own content to print itself, thereby solving the recursive nature of the problem. So I came up with the following code:


#include <iostream>
#include <string>
int main()
{
    std::string s = "#include <iostream>\n#include <string>\nint main()\n{\n\tstd::string s = \"";
    std::string r = "\n\tstd::cout << s << s << \";\n\tstd::string r = \" << r << r;\n\treturn 0;\n}";
    std::cout << s << s << ";\n\tstd::string r = " << r << r;
    return 0;
}

Here string s stores the source code up until its own declaration and string r stores the source code after its own declaration. When the two strings are finally printed using a cout statement, immediately after the printing of the string, the contents of the strings are printed as well.

So this allows us to print the source as well as the statement that prints the source. The only drawback here is that when we are printing for the second time, characters get escaped, even though they shouldn't be. Now, if we remove the use of escape characters, we can have the entire code in one line, but even then, the include requires a newline character, so the escapes will need to be there.

This is where I got stuck and started looking for solutions when I found something called raw string literals (I had heard of raw strings in Python but not in C++). These strings help you put the characters as is - you can actually hit enter in the code, instead of typing in "\n". So if you put in raw strings instead of regular strings, there is no difference between writing in code, and displaying on the console.

So finally, a quine I could understand:


#include <iostream>
#include <string>
int main()
{
std::string s = R"(#include <iostream>
#include <string>
int main()
{
    std::string s = )";
    std::string r = R"(std::cout << s << "R\"(" << s << ")\";\nstd::string r = R\"(" << r << ")\";\n" << r;})";
    std::cout << s << "R\"(" << s << ")\";\nstd::string r = R\"(" << r << ")\";\n" << r;
}

As I delved deeper into this topic I found some theoretical basis around quines as opposed to the ad hoc approach outlined in this post.