Monday, July 2, 2007

Compile-time reflection

This post originally appeared on the digitalmars.D newsgroup.


The subject of compile-time reflection has been an important one to me. I have been musing on it since about the time I started writing Pyd. Here is the current state of my thoughts on the matter.

Functions

When talking about functions, a given symbol may refer to multiple functions:

void foo() {}
void foo(int i) {}
void foo(int i, int j, int k=20) {}

The first thing a compile-time reflection mechanism needs is a way to, given a symbol, derive a tuple of the signatures of the function overloads. There is no immediately obvious syntax for this.

The is() expression has so far been the catch-all location for many of D's reflection capabilities. However, is() operates on types, not arbitrary symbols.

A property is more promising. Re-using the .tupleof property is one idea:

foo.tupleof ⇒ Tuple!(void function(), void function(int), void function(int, int, int))

However, I am not sure how plausible it is to have a property on a symbol like this. Another alternative is to have some keyword act as a function (as typeof and typeid do, for instance). I propose adding "tupleof" as an actual keyword:

tupleof(foo) ⇒ Tuple!(void function(), void function(int), void function(int, int, int))

I will be using this syntax throughout the rest of this post. For the sake of consistency, tupleof(Foo) should do what Foo.tupleof does now.

To umabiguously refer to a specific overload of a function, two pieces of information are required: The function's symbol, and the signature of the overload. When doing compile-time reflection, one is typically working with one specific overload at a time. While a function pointer does refer to one specific overload, it is important to note that function pointers are not compile-time entities! Therefore, the following idiom is common:

template UseFunction(alias func, func_t) {}

That is, any given template that does something with a function requires both the function's symbol and the signature of the particular overload to operate on to be useful.

It should be clear, then, that automatically deriving the overloads of a given function is very important. Another piece of information that is useful is whether a given function has default arguments, and how many. The tupleof() syntax can be re-used for this:

tupleof(foo, void function(int, int, int)) ⇒ Tuple!(void function(int, int))

Here, we pass tupleof() the symbol of a function, and the signature of a particular overload of that function. The result is a tuple of the various signatures it is valid to call the overload with, ignoring the actual signature of the function. The most useful piece of information here is the number of elements in the tuple, which will be equal to the number of default arguments supported by the overload.

One might be tempted to place these additional function signatures in the original tuple derived by tupleof(foo). However, this is not desirable. Consider: We can say any of the following:

void function() fn1 = &foo;
void function(int) fn2 = &foo;
void function(int, int, int) fn3 = &foo;

But we cannot say this:

void function(int, int) fn4 = &foo; // ERROR!

A given function-symbol therefore has two sets of function signatures associated with it: The actual signatures of the functions, and the additional signatures it may be called with due to default arguments. These two sets are not equal in status, and should not be treated as such.

Member functions

Here is where things get really complicated.

class A {
    void bar() {}
    void bar(int i) {}
    void bar(int i, int j, int k=20) {}

    void baz(real r) {}

    static void foobar() {}
    final void foobaz() {}
}

class B : A {
    void foo() {}
    override void baz(real r) {}
}

D does not really have pointers to member functions. It is possible to fake them with some delegate trickery. In particular, there is no way to directly call an alias of a member function. This is important, as I will get to later.

The first mechanism needed is a way to get all of the member functions of a class. I suggest the addition of a .methodsof class property, which will derive a tuple of aliases of the class's member functions.

A.methodsof ⇒ Tuple!(A.bar, A.baz, A.foobar, A.foobaz)
B.methodsof ⇒ Tuple!(A.bar, A.foobar, A.foobaz, B.foo, B.baz)

The order of the members in this tuple is not important. Inherited member functions are included, as well. Note that these are tuples of symbol aliases! Since these are function symbols, all of the mechanisms suggested earlier for regular function symbols should still work!

tupleof(A.bar) ⇒ Tuple!(void function(), void function(int), void function(int, int, int))

And so forth.

There are three kinds of member functions: virtual, static, and final. The next important mechanism that is needed is a way to distinguish these from each other. An important rule of function overloading works in our favor, here: A given function symbol can only refer to functions which are all virtual, all static, or all final. Therefore, this should be considered a property of the symbol, as opposed to one of the function itself.

The actual syntax for this mechanism needs to be determined. D has 'static' and 'final' keywords, but no 'virtual' keyword. Additionally, the 'static' keyword has been overloaded with many meanings, and I hesitate suggesting we add another. Nonetheless, I do.

static(A.bar == static) == false
static(A.bar == final) == false
static(A.bar == virtual) == true

The syntax is derived from that of the is() expression. The grammar would look something like this:

StaticExpression:
    static ( Symbol == SymbolSpecialization )

SymbolSpecialization:
    static
    final
    virtual

Here, 'virtual' is a context-sensitive keyword, not unlike the 'exit' in 'scope(exit)'. If the Symbol is not a member function, it is an error.

A hole presents itself in this scheme. We can get all of the function symbols of a class's member functions. From these, we can get the signatures of their overloads. From these, can get get pointers to the member functions, do some delegate trickery, and actually call them. This is all well and good.

But there is a problem when a method has default arguments. As explained earlier, we can't do this:

// Error! None of the overloads match!
void function(int, int) member_func = &A.bar;

Even though we can say:

A a = new A;
a.bar(1, 2);

The simplest solution is to introduce some way to call an alias of a method directly. There are a few options. My favorite is to take a cue from Python, and allow the following:

alias A.bar fn;
A a = new A;
fn(a, 1, 2);

That is, allow the user to explicitly call the method with the instance as the first parameter. This should be allowed generally, as in:

A.bar(a);
A.baz(a, 5.5);

Given these mechanisms, combined with the existing mechanisms to derive the return type and parameter type tuple from a function type, D's compile-time reflection capabilities would be vastly more powerful.

No comments: