Equivalent of class loaders in .NET

Does anyone know whether it is possible to define the equivalent of a Java custom class loader in .NET?

To give a small background:

I am developing a new programming language that targets the CLR, called Freedom. One of the features of the language is the ability to define "type constructors", which are methods that the compiler executes at compile time and that generate types as output. They are a kind of generalization of generics (the language also has normal generics) and allow code such as the following (in Freedom syntax):

    var t as tuple<i as int, j as int, k as int>;
    t.i = 2;
    t.j = 4;
    t.k = 5;

Where the "tuple" is defined as follows:

    public type tuple(params variables as VariableDeclaration[]) as TypeDeclaration
    {
        //...
    }

In this particular example, the tuple type constructor provides something similar to anonymous types in VB and C#.

However, unlike anonymous types, "tuples" have names and can be used inside public method signatures.

This means that I need a way for the types ultimately emitted by the compiler to be shared across multiple assemblies. For example, I want

tuple<x as int> defined in assembly A to end up being the same type as tuple<x as int> defined in assembly B.

The problem, of course, is that assembly A and assembly B are compiled at different times, which means that each of them would emit its own incompatible version of the tuple type.

I looked into using some kind of type erasure for this, so that I would have a shared library with a bunch of types like these (in Freedom syntax):

    class tuple<T>
    {
        public Field1 as T;
    }

    class tuple<T, R>
    {
        public Field1 as T;
        public Field2 as R;
    }

and then simply redirect access to the tuple fields i, j and k to Field1, Field2 and Field3.

However, this is not really a viable option. It would mean that tuple<x as int> and tuple<y as int> are different types at compile time, while at run time they would be treated as the same type. That would cause a lot of problems for things like equality and type identity. It is too leaky an abstraction for my tastes.
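The identity problem can be sketched in C#. This is only an illustration under assumptions: `ErasedTuple<T>` and `ErasureDemo` are hypothetical names standing in for the shared erased library, not part of any real design.

```csharp
using System;

// Hypothetical erased representation: one shared class per arity,
// with positional field names instead of the source-level names.
public class ErasedTuple<T>
{
    public T Field1;
}

public static class ErasureDemo
{
    // Returns true when both values have the same runtime type.
    public static bool SameRuntimeType(object a, object b)
    {
        return a.GetType() == b.GetType();
    }

    public static void Main()
    {
        // Source level: tuple<x as int>; the compiler would rewrite t.x to Field1.
        var tx = new ErasedTuple<int> { Field1 = 1 };
        // Source level: tuple<y as int>; the compiler would rewrite t.y to Field1.
        var ty = new ErasedTuple<int> { Field1 = 2 };

        // Distinct types to the compiler, but a single type to the CLR.
        Console.WriteLine(SameRuntimeType(tx, ty));
    }
}
```

Anything that relies on runtime type identity (casts, `is` checks, overload resolution through reflection) would see the two source-level types as one.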

Another possible option would be to use "state bag" objects. However, using a state bag would defeat the whole purpose of supporting "type constructors" in the language. The idea is to enable "custom language extensions" that generate new types at compile time, which the compiler can then type-check statically.

In Java, this can be done using custom class loaders. Basically, code that uses tuple types can be emitted without actually defining the type on disk. A custom class loader can then be defined that dynamically generates the tuple type at run time. This allows static type checking inside the compiler and unifies the tuple types across compilation boundaries.

Unfortunately, however, the CLR does not support custom class loading. All loading in the CLR is done at the assembly level. It would be possible to define a separate assembly for each "constructed type", but that would very quickly lead to performance problems (having many assemblies with only one type in them would use too many resources).

So I want to know:

Is it possible to simulate something like Java class loaders in .NET, where I can emit a reference to a non-existent type and then dynamically generate that type at run time, before the code that needs it runs?

Note:

* I actually already know the answer to this question, which I provide as an answer below. However, it took me about 3 days of research and quite a bit of IL hacking to come up with a solution. I thought it would be a good idea to document it here in case anyone else runs into the same problem. *

+41
compiler-construction programming-languages clr language-features
Oct 09 '08 at 3:31
2 answers

The answer is yes, but the solution is a bit complicated.

The System.Reflection.Emit namespace defines types that allow you to create assemblies dynamically. They also allow the generated assemblies to be defined incrementally. In other words, you can add types to a dynamic assembly, execute the generated code, and then add more types to the same assembly.
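A minimal sketch of that incremental behaviour follows. The names `IncrementalEmitDemo`, `DefineEmptyClass`, `First` and `Second` are made up for illustration, and the sketch uses `AssemblyBuilder.DefineDynamicAssembly`, which is available on both .NET Framework 4.5+ and newer runtimes.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class IncrementalEmitDemo
{
    // Defines an empty public class in the given module and completes it.
    public static Type DefineEmptyClass(ModuleBuilder module, string name)
    {
        TypeBuilder builder = module.DefineType(name, TypeAttributes.Public | TypeAttributes.Class);
        // CreateType supplies a default constructor when none was defined.
        return builder.CreateType();
    }

    public static void Main()
    {
        var assemblyName = new AssemblyName("IncrementalDemo");
        AssemblyBuilder assembly =
            AssemblyBuilder.DefineDynamicAssembly(assemblyName, AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("IncrementalDemo");

        // Step 1: add a type and use it right away.
        Type first = DefineEmptyClass(module, "First");
        object firstInstance = Activator.CreateInstance(first);

        // Step 2: the same module still accepts new types after code has run.
        Type second = DefineEmptyClass(module, "Second");

        Console.WriteLine(firstInstance.GetType().Name);
        Console.WriteLine(second.Name);
    }
}
```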

The System.AppDomain class also defines an AssemblyResolve event that fires whenever the framework fails to load an assembly. By adding a handler for this event, you can define a single "runtime" assembly into which all "constructed" types are placed. Code generated by the compiler that uses a constructed type refers to the type in the runtime assembly. Because the runtime assembly does not actually exist on disk, the AssemblyResolve event fires the first time the compiled code tries to access a constructed type. The event handler then generates the dynamic assembly and returns it to the CLR.

Unfortunately, there are a few tricky points in getting this to work. The first problem is ensuring that the event handler is always installed before the compiled code runs. With a console application this is easy: the code that attaches the event handler can simply be added to the Main method before any other code runs. Class libraries, however, have no Main method. A DLL may be loaded as part of an application written in another language, so it is not possible to assume that there is always a Main method available for hooking up the event handler.

The second problem is ensuring that the referenced types are all inserted into the dynamic assembly before any code that references them is used. The System.AppDomain class also defines a TypeResolve event that fires whenever the CLR cannot resolve a type in a dynamic assembly. This gives the event handler a chance to define the type inside the dynamic assembly before the code that uses it runs. However, that event does not work in this case. The CLR does not fire the event for assemblies that statically reference other assemblies, even if the referenced assembly is defined dynamically. This means that we need a way to run code before any other code in the compiled assembly runs, and have it dynamically inject the types it needs into the runtime assembly if they have not already been defined. Otherwise, when the CLR tries to load those types, it will notice that the dynamic assembly does not contain them and will throw a type load exception.

Fortunately, the CLR offers a solution to both problems: module initializers. A module initializer is the equivalent of a "static class constructor", except that it initializes an entire module, not just a single class. Basically, the CLR will:

  • Run the module constructor before any types inside the module are accessed.
  • Ensure that only the types accessed directly by the module constructor are loaded while the constructor is running.
  • Not allow code outside the module to access any of its members until the constructor has finished.

It does this for all assemblies, including class libraries and executables, and for EXE starts the module constructor before executing the Main method.

See the blog post for more information on module constructors.
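For reference, C# 9.0 and later (on .NET 5 or newer) expose this CLR feature through the `ModuleInitializer` attribute in `System.Runtime.CompilerServices`. The sketch below assumes such a toolchain; `RuntimeBootstrap` and its `Initialized` flag are hypothetical names used only to make the effect observable.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RuntimeBootstrap
{
    public static bool Initialized { get; private set; }

    // The C# compiler emits a call to this method from the module initializer.
    // The method must be static, parameterless, return void, and be accessible
    // from the containing module.
    [ModuleInitializer]
    public static void Initialize()
    {
        // A compiler for the language described here would define its
        // constructed types at this point (placeholder for illustration).
        Initialized = true;
    }
}

public static class Program
{
    public static void Main()
    {
        // By the time Main runs, the module initializer has already executed.
        Console.WriteLine(RuntimeBootstrap.Initialized);
    }
}
```

Because the initializer also runs for class libraries, the same hook works when the assembly is loaded by a host written in another language.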

In any case, a complete solution to my problem requires several parts:

  • The following class definition, defined inside a "language runtime" DLL that is referenced by all assemblies produced by the compiler (this is C# code).

        using System;
        using System.Collections.Generic;
        using System.Reflection;
        using System.Reflection.Emit;

        namespace SharedLib
        {
            public class Loader
            {
                private Loader(ModuleBuilder dynamicModule)
                {
                    m_dynamicModule = dynamicModule;
                    m_definedTypes = new HashSet<string>();
                }

                private static readonly Loader m_instance;
                private readonly ModuleBuilder m_dynamicModule;
                private readonly HashSet<string> m_definedTypes;

                static Loader()
                {
                    var name = new AssemblyName("$Runtime");
                    var assemblyBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
                    var module = assemblyBuilder.DefineDynamicModule("$Runtime");
                    m_instance = new Loader(module);
                    AppDomain.CurrentDomain.AssemblyResolve += new ResolveEventHandler(CurrentDomain_AssemblyResolve);
                }

                static Assembly CurrentDomain_AssemblyResolve(object sender, ResolveEventArgs args)
                {
                    if (args.Name == Instance.m_dynamicModule.Assembly.FullName)
                    {
                        return Instance.m_dynamicModule.Assembly;
                    }
                    else
                    {
                        return null;
                    }
                }

                public static Loader Instance
                {
                    get { return m_instance; }
                }

                public bool IsDefined(string name)
                {
                    return m_definedTypes.Contains(name);
                }

                public TypeBuilder DefineType(string name)
                {
                    //in a real system we would not expose the type builder.
                    //instead an AST for the type would be passed in, and we would just create it.
                    var type = m_dynamicModule.DefineType(name, TypeAttributes.Public);
                    m_definedTypes.Add(name);
                    return type;
                }
            }
        }

    The class defines a singleton that holds a reference to the dynamic assembly in which the constructed types will be created. It also holds a "hash set" that stores the set of types that have already been generated dynamically, and finally defines a member that can be used to define a type. This example simply returns a System.Reflection.Emit.TypeBuilder instance, which can then be used to define the class being generated. In a real system, the method would probably take an AST representation of the class and do the generation itself.

  • Compiled assemblies that emit the following two references (shown in ILASM syntax):

        .assembly extern $Runtime
        {
            .ver 0:0:0:0
        }
        .assembly extern SharedLib
        {
            .ver 1:0:0:0
        }

    Here, "SharedLib" is the language's predefined runtime library, which includes the "Loader" class defined above, and "$Runtime" is the dynamic runtime assembly into which the constructed types will be inserted.

  • A "module constructor" inside every assembly compiled by the language.

    As far as I know, there are no .NET languages that allow you to define module constructors in source. The C++/CLI compiler is the only compiler I know of that generates them. In IL they look like this, defined directly in the module rather than inside any type definition:

        .method privatescope specialname rtspecialname static
                void .cctor() cil managed
        {
            //generate any constructed types dynamically here...
        }

    For me, it is not a problem that I have to write custom IL to get this to work. I am writing a compiler, so code generation is not an issue.

    For an assembly that uses the types tuple<i as int, j as int> and tuple<x as double, y as double, z as double>, the module constructor would need to generate types such as the following (shown here in C# syntax):

        class Tuple_i_j<T, R>
        {
            public T i;
            public R j;
        }

        class Tuple_x_y_z<T, R, S>
        {
            public T x;
            public R y;
            public S z;
        }

    The tuple classes are generated as generic types to get around accessibility issues. This allows code in the compiled assembly to use tuple<x as Foo>, where Foo is some non-public type.

    The body of a module constructor that does this (only one type is shown here, written in C# syntax) would look like this:

        var loader = SharedLib.Loader.Instance;
        lock (loader)
        {
            if (!loader.IsDefined("$Tuple_i_j"))
            {
                //create the type.
                var Tuple_i_j = loader.DefineType("$Tuple_i_j");
                //define the generic parameters <T,R>
                var genericParams = Tuple_i_j.DefineGenericParameters("T", "R");
                var T = genericParams[0];
                var R = genericParams[1];
                //define the field i
                var fieldX = Tuple_i_j.DefineField("i", T, FieldAttributes.Public);
                //define the field j
                var fieldY = Tuple_i_j.DefineField("j", R, FieldAttributes.Public);
                //create the default constructor.
                var constructor = Tuple_i_j.DefineDefaultConstructor(MethodAttributes.Public);
                //"close" the type so that it can be used by executing code.
                Tuple_i_j.CreateType();
            }
        }
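To check that a type built this way is usable, the same emit calls can be run stand-alone and the result consumed through reflection. This is a sketch under assumptions: `TupleEmitDemo`, `DefineTupleIJ` and the "TupleDemo" assembly name are illustrative, and the type is defined directly on a `ModuleBuilder` instead of going through the `Loader` class.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class TupleEmitDemo
{
    // Defines the open generic type "$Tuple_i_j"<T, R> with public fields i and j,
    // mirroring the module constructor body shown above.
    public static Type DefineTupleIJ(ModuleBuilder module)
    {
        TypeBuilder tuple = module.DefineType("$Tuple_i_j", TypeAttributes.Public);
        GenericTypeParameterBuilder[] genericParams = tuple.DefineGenericParameters("T", "R");
        tuple.DefineField("i", genericParams[0], FieldAttributes.Public);
        tuple.DefineField("j", genericParams[1], FieldAttributes.Public);
        tuple.DefineDefaultConstructor(MethodAttributes.Public);
        return tuple.CreateType();
    }

    public static void Main()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("TupleDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("TupleDemo");

        // Close the generic type over <int, int>, as tuple<i as int, j as int> would.
        Type open = DefineTupleIJ(module);
        Type closed = open.MakeGenericType(typeof(int), typeof(int));

        object t = Activator.CreateInstance(closed);
        closed.GetField("i").SetValue(t, 2);
        closed.GetField("j").SetValue(t, 4);

        Console.WriteLine("{0}, {1}",
            closed.GetField("i").GetValue(t),
            closed.GetField("j").GetValue(t));
    }
}
```

Compiled code would reference the fields directly in IL instead of going through reflection; reflection is used here only so that the sketch is self-contained.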

So, anyway, that is the mechanism I was able to come up with to enable a rough equivalent of custom class loaders in the CLR.

Does anyone know an easier way to do this?

+50
Oct 09 '08 at 3:39

I think this is the kind of thing the DLR is supposed to provide in C# 4.0. Information is still hard to come by, but perhaps we will learn more at PDC08. I look forward to seeing your C# 3 solution, though... I assume it uses anonymous types.

-5
Oct 09 '08 at 3:38