CRTP in C♯: static polymorphism and friendship

The hero image is a screenshot of Wikipedia article.

Update on 4 October. I added a micro-benchmark.

Consider the following C++ code snippet that defines a code generator with the curiously recurring template pattern:

#include<cstdio>

template <typename TWorkItem>
struct WorkItem
{
  TWorkItem &Process()
  {
    // __assume(this != nullptr);
    static_cast<TWorkItem *>(this)->Preprocess();
    std::puts("Processing work item...");
    static_cast<TWorkItem *>(this)->Postprocess();
    return *static_cast<TWorkItem *>(this);
  }
private:
  friend TWorkItem;
  ~WorkItem() = default;
};

Why struct? Because I think it is more important to default to public inheritance than private members. What is that funny __assume? It’s a hint helping Visual C++ compiler to eliminate the nullptr check for static_cast (yes it surprisingly needs such help for this as of the writing of this blog entry).

The above is essentially template method pattern without the overhead of virtual dispatching. It can be consumed as follows:

struct LoggedItem : WorkItem<LoggedItem>
{
private:
  friend WorkItem<LoggedItem>;
  void Preprocess() { std::puts("VERBOSE: Process begins."); }
  void Postprocess() { std::puts("VERBOSE: Process finishes."); }
};

struct UnprocessableItem : WorkItem<UnprocessableItem>
{
private:
  friend WorkItem<UnprocessableItem>;
  void Preprocess() { throw "This work item is unprocessable."; }
  void Postprocess() { }
};

int main(void)
{
  LoggedItem().Process();
  try
  {
    UnprocessableItem().Process();
  }
  catch (char const *message)
  {
    std::puts(message);
  }
}

How can we emulate that in C♯?

Defining the goal

To emulate that in C♯, we need to know what it means by ‘that’. The goal I am aiming for is the following:

No overhead of virtual dispatching.
Hiding the implementation details from outside, or at least prevent the outside potentially from breaking invariants of the classes. (For example, something similar to friendship might be needed.)

An obvious non-solution to the first problem is to use (protected) virtual methods, i.e., reverting back to template method pattern.

An inexperienced person (me a few weeks ago) might think the following code helps:

// This code does NOT help!

public abstract class WorkItem<TWorkItem>
  // 2, 3
  where TWorkItem : WorkItem<TWorkItem>, WorkItem<TWorkItem>.IOverrides
{
  public interface IOverrides // 3
  {
    void Preprocess();
    void Postprocess();
  }

  protected WorkItem() { } // 1

  public TWorkItem Process()
  {
    ((TWorkItem)this).Preprocess(); // 2
    Console.WriteLine("Processing work item...");
    ((TWorkItem)this).Postprocess(); // 2
    return (TWorkItem)this;
  }
}

This is very problematic. First, the code at 1 does not prevent WorkItem<MyWorkItem> from being inherited by some completely unrelated NonWorkItem class.

For the code at 2, there are many problems. The calls to Preprocess and Postprocess are interface dispatching, which are worse than virtual dispatching. Note that generics doesn’t help here because one piece of code is shared by all instantiation with reference types. Moreover, the derived class loses control over whether Preprocess and Postprocess are sealed (unless the derived class itself is sealed), see this blog entry in Chinese or the MSDN documentation:

A class might include an interface multiple times through base classes that it inherits or through interfaces that other interfaces inherit. However, the class can provide an implementation of an interface only one time and only if the class declares the interface as part of the definition of the class (class ClassName : InterfaceName).

Worse yet, the code at 3, by the rules

base class cannot be less accessible that derived class,
types in generic constraints cannot be less accessible than the generic type itself, and
the accessibility of a instantiated generic type is the least among its type parameters and the generic type itself, …

…the interface IOverrides is necessarily visible to everyone with access to MyWorkItem, and anyone can cast MyWorkItem to IOverrides and access those ‘step methods’ (implementation details) in arbitrary manner that can break the invariants maintained by MyWorkItem.

In summary, this is a more terrible solution than giving up the performance.

Solution

Lo’ and behold. The show starts!

public abstract class WorkItem<TWorkItem, TWorkItemOverrides>
  where TWorkItem : WorkItem<TWorkItem, TWorkItemOverrides>
  // 1
  where TWorkItemOverrides : struct, WorkItem<TWorkItem, TWorkItemOverrides>.IOverrides
{
  public interface IOverrides
  {
    // 2
    TWorkItem That { get; }
    void SetThat(TWorkItem that);
    void Preprocess();
    void Postprocess();
  }

  protected TWorkItemOverrides MyOverrides;
  // 3
  protected WorkItem() { MyOverrides.SetThat((TWorkItem)this); }

  public TWorkItem Process()
  {
    MyOverrides.Preprocess();
    Console.WriteLine("Processing work item...");
    MyOverrides.Postprocess();
    return MyOverrides.That;
  }
}

What’s happening here? The generic parameters involve 2 types, one the derived class, the other what I call a companion overriding provider structure. The trick is to move all ‘virtual methods’ into this struct. We’re exploiting the fact that the compiler today will generate a separate copy of code for each value type substituted into a generic type, and every hole of optimisation is open. This explains the generic constraints at 1.

You might wonder what is the use of 2. For one thing, the base class could just cast itself every time it wants to invoke a ‘virtual method’ and pass it into the interface method, e.g., default(TWorkItemOverrides).Preprocess((TWorkItem)this) instead of MyOverrides.Preprocess(). And it saves a pointer from the total size of the class!

I believe casting is more expensive than field look up plus null check, so it might not be the best idea to cast for every call of those ‘virtual methods’. Yet the biggest problem is still about breaking encapsulation — remember that anyone can call Preprocess on the struct due to the accessibility (visibility) issue.

How can we avoid the outside from calling arbitrary methods on my struct? The answer is to make such calls ineffective rather than impossible. The structure must hold a reference to TWorkItem, and the interface methods should delegate its work to the held TWorkItem. So the safety of encapsulation boils down to protecting the integrity of the TWorkItem reference held by TWorkItemOverrides.

The constructors (or lack thereof) of TWorkItemOverrides must not set arbitrary reference to be held. That’s the easy part. However, it seems quite contradictory that the generic type WorkItem can use SetThat to set the reference while no one outside TWorkItem should be able to set the reference. What if someone does this:

MyWorkItem item = new MyWorkItem();
// Suppose this is the companion structure.
MyWorkItem.Overrides overrides = default(MyWorkItem.Overrides);
overrides.SetThat(item);
// BOOM!
overrides.Postprocess();

Well, we’ll soon see how to prevent this and why the SetThat method is not made the set-accessor of That property.

Lastly, let me mention that the cast in 3 prevents any ‘usefully wrong’ derivation from WorkItem<MyWorkItem, ...>, because any attempt to create such an instance will not succeed. Folks familiar with C++ but not with C♯ need to know that unlike C++, in C♯ (and Java) the object identity does not change during construction. This means that the type of the object in the constructor of WorkItem is already TWorkItem, and that the cast will succeed (barring incorrect inheritance).

Correct consumption of the solution

Here is an example of how to correctly derive from WorkItem<...>:

public class LoggedItem : WorkItem<LoggedItem, LoggedItem.Overrides>
{
  public struct Overrides : IOverrides
  {
    public LoggedItem That { get; private set; }
    // 1
    public void SetThat(LoggedItem that) { that.MyOverrides.That = that; }
    // 2
    public void Preprocess() { That.Preprocess(); }
    public void Postprocess() { That.Postprocess(); }
  }

  public LoggedItem() { }
  private void Preprocess() { Console.WriteLine("VERBOSE: Process begins."); }
  private void Postprocess() { Console.WriteLine("VERBOSE: Process finishes."); }
}

Note how 1 and 2 are implemented.

SetThat method is more or less a static method, in that it does not set the That property of the instance on which it is called, but that inside the instance referred to by that. This implementation ensures that no outsider can obtain a useful Overrides instance with a non-null That.

The reason why SetThat is not the set-accessor of That is simply that it does not match the semantics of a property setter.

The code at 2 always forwards the call to the instance, even if the methods happen to be instance-independent. The automatic null check prevents the call into the actual code when a mischievous outsider tries to break our encapsulation. Furthermore, if you want the methods to be virtual, just make the non-forwarder method in the class virtual.

Exercise. Implement UnprocessableItem.

Conclusion

We did not fully solve the problem (but at least mostly). This ‘dual parameter’ approach is useful in implementing static polymorphism. It eliminates the overhead of virtual dispatching and implements a limited form of friendship. With a canonical way to implement the companion structure, it gives the base generic type access to an interface of the derived type while no outsider gets such access, and the two types do not need to reside in the same assembly or friend assemblies.

Benchmark

A commenter suggested that I do a benchmark, so I made a non-trivial one. In the code, we use CRTP to implement a for-loop — the derived class implements 4 methods corresponding to for (init; cond; iter) body; and the CRTP base will compose them into the for-loop. Then we implement the same Newton’s method for computing the square root three times:

Use companion struct and do not use virtual method.
Use companion struct and use virtual method (the struct forwards the call into a virtual method first declared in the derived class; this demonstrates the flexibility that the derived class can elect to use or not use a virtual method).
Do not use companion struct and use virtual method (template method pattern).

To benchmark the code, we compute the square root of 9237598425.139847934 for 1000000 times with 1000 preheating executions. We also permute the order of the three implementations and consider the sum of time of each approach in all 6 possible order.

The benchmark is done with my Surface Book 2, in Release mode for both x86 and x64, and with .NET Core 3.1 and .NET Framework 4.7.2. The results are as follows, which shows that avoiding virtual dispatching has an advantage:

milliseconds	struct, non-virtual	struct, virtual	virtual
x86 Core 3.1	1065	1405	1431
x64 Core 3.1	1015	1371	1418
x86 Fx 4.7.2	1160	1501	1490
x64 Fx 4.7.2	1051	1364	1418

The code is posted as a GitHub gist and available under the MIT license.