Saturday 20 February 2016

Runtime Code Generation, Classes

I've always found the idea of generating code at runtime fascinating. I don't think it can get more powerful or easier than JavaScript's eval (the evaluated code runs just as if it had been written in place, with access to the local variables and so on!). For conventional languages this has always been a bit more complicated. Things in C# have changed a bit over the years (I already talked about it a long while ago), and I have partially forgotten some of the old things and not learnt the new ones, so I thought I would do a quick review of what we have available.

In principle the ideal way to generate code at runtime would be something like JavaScript's eval: pass a string of code and get it executed. Notice that, given JavaScript's nature, if that string is a function declaration or a "class declaration" (pre-ES6 style), executing it makes the declaration available for later invocation. So you can just run some immediate instructions, or create "code blocks" to be used (invoked) later.
In C# there's a clear difference between defining classes and methods, and executing code. So I will write separate posts about creating new classes, creating new "reusable code blocks" (delegates pointing to code created at runtime) and executing "one shot" statements.

Creating classes at runtime
You have 3 options that I know of, but in the end all 3 come down to the same thing: getting the compiler to generate an assembly from the provided source, loading that Assembly into memory, and obtaining a Type object from that Assembly. Finally, you'll create instances of that Type by using Activator.CreateInstance.

  • The first and oldest option is using the CodeDOM. This has 2 big disadvantages. First, it launches the compiler (csc.exe) under the covers (as I recently mentioned in this post, I cannot understand why the pre-Roslyn compilers were shipped as an executable and could not be used as a library). Second, because of this, the assembly can't be generated in memory, and it gets written to disk (C:\temp\randomName.dll). Both things are a bit ugly in terms of performance. This reminds me that the first time I read about runtime code generation in C# (sometime in the summer of 2002, if my memory serves me well), the authors were directly invoking the CSC compiler (Process.Start("csc.exe"...)). The CodeDOM is just a layer on top of that.
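    // Needs: using System.CodeDom.Compiler; using Microsoft.CSharp;
    private static Type CreateTypeViaCodeDom(string code, string typeName, IEnumerable<string> assemblyNames)
    {
        CodeDomProvider cpd = new CSharpCodeProvider();
        var cp = new CompilerParameters();
        foreach (string assemblyName in assemblyNames)
        {
            cp.ReferencedAssemblies.Add(assemblyName);
        }

        // Build a library (dll), not an executable
        cp.GenerateExecutable = false;
        CompilerResults cr = cpd.CompileAssemblyFromSource(cp, code);
        // the assembly gets written to disk (c:\Temp\randomName.dll)

        // Fail early with a clear message instead of an error when reading CompiledAssembly
        if (cr.Errors.HasErrors)
        {
            throw new InvalidOperationException("Compilation failed for " + typeName);
        }

        Type tp = cr.CompiledAssembly.GetType(typeName);
        return tp;
    }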
    
  • The modern way to do it is using Roslyn (Microsoft.CodeAnalysis...). As you know, Roslyn is based on the Compiler as a Service idea. The compiler lives in a set of libraries (so csc.exe is not invoked under the covers), and the Assembly can be generated in memory.
    // Needs: using Microsoft.CodeAnalysis; using Microsoft.CodeAnalysis.CSharp;
    private static Type CreateTypeViaRoslyn(string code, string typeName, IEnumerable<Assembly> references)
    {
        var assemblyName = Path.GetRandomFileName();
        var syntaxTree = CSharpSyntaxTree.ParseText(code);

        List<MetadataReference> metadataReferences = references
            .Select(it => (MetadataReference)MetadataReference.CreateFromFile(it.Location))
            .ToList();
        var compilation = CSharpCompilation.Create(assemblyName, new[] { syntaxTree },
            metadataReferences,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // Emit the assembly into memory; nothing is written to disk
        using (MemoryStream stream = new MemoryStream())
        {
            var result = compilation.Emit(stream);

            if (!result.Success)
            {
                var failures = result.Diagnostics.Where(diagnostic =>
                    diagnostic.IsWarningAsError ||
                    diagnostic.Severity == DiagnosticSeverity.Error);

                // One line per error, e.g. "CS0103: The name 'x' does not exist..."
                var message = string.Join("\n", failures.Select(x => $"{x.Id}: {x.GetMessage()}"));
                throw new InvalidOperationException("Compilation failures!\n\n" + message + "\n\nCode:\n\n" + code);
            }

            // ToArray copies the whole buffer regardless of the stream position
            Assembly asm = Assembly.Load(stream.ToArray());
            return asm.GetType(typeName);
        }
    }
    
  • The third option is using Mono's Compiler as a Service. It had been around for many years before Roslyn, and it is what makes Mono's REPL possible. Mono's CaaS does not depend on other Mono-specific libraries, so you can run it on Microsoft's Framework; you just need to reference Mono.CSharp.dll, which can be obtained via NuGet. Most samples of this CaaS deal with running small "one shot" pieces of code, not with creating new classes for later reuse. The way I've found to use it for this second purpose is a bit bizarre, but it works, so here it goes:
    // Needs: using Mono.CSharp; (from Mono.CSharp.dll)
    private static Type CreateTypeViaMono(string code, string typeName, IEnumerable<Assembly> references)
    {
        var evaluator = new Evaluator(new CompilerContext(
            new CompilerSettings(),
            new ConsoleReportPrinter()
        ));

        // Reference the given assemblies so the generated code can use their types
        foreach (Assembly assembly in references)
        {
            evaluator.ReferenceAssembly(assembly);
        }

        // Feed it the class declaration
        evaluator.Compile(code);
        // Evaluating typeof(...) hands back the new Type, and from it we reach the generated Assembly
        Assembly asm = ((Type)evaluator.Evaluate("typeof(" + typeName + ");")).Assembly;
        return asm.GetType(typeName);
    }
    

The signatures for methods 2 and 3 are the same; for method 1 we have to pass the names of the assemblies to be referenced, rather than the assemblies themselves. If we have an IFormatter interface and we want to create a new formatter class implementing it, we would do something like this:

Type upperCaseFormatterType = CreateTypeViaXXX(@"
    namespace Formatters
    {
        public class UpperCaseFormatter: Formatters.IFormatter
        {
            public string Format(string s) { return s.ToUpper(); }
        }
    }",
    "Formatters.UpperCaseFormatter",
    new List<Assembly>()
    {
        typeof(Formatters.IFormatter).Assembly
    }
    // for the CodeDOM case pass the names instead: new List<string>() { "Formatters.dll" }
    // Roslyn adds no default references, so that case typically also needs typeof(object).Assembly
);

// The generated class implements IFormatter, so we can cast and call it normally
IFormatter formatter = (IFormatter)(Activator.CreateInstance(upperCaseFormatterType));
Console.WriteLine(formatter.Format("abc"));
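
The cast above works because the generated class implements an interface that the host code already knows. When there is no such shared interface, reflection is one way to call into the new type. What follows is only a minimal sketch, assuming the type exposes a public Format(string) method like the one above. InvokeFormat and StandInFormatter are hypothetical names of mine, and the stand-in class merely takes the place of the runtime-generated type so that the snippet can run on its own:

```csharp
using System;
using System.Reflection;

// Stand-in for a type produced at runtime (hypothetical, for illustration only)
public class StandInFormatter
{
    public string Format(string s) { return s.ToUpper(); }
}

public static class ReflectionInvocationSketch
{
    // Creates an instance of the given type and calls its Format(string) method by name
    public static string InvokeFormat(Type formatterType, string input)
    {
        object instance = Activator.CreateInstance(formatterType);

        // Look the method up by name and parameter types; null means it was not found
        MethodInfo format = formatterType.GetMethod("Format", new[] { typeof(string) });
        if (format == null)
        {
            throw new MissingMethodException(formatterType.FullName, "Format");
        }

        return (string)format.Invoke(instance, new object[] { input });
    }
}
```

With a Type returned by any of the three methods, the call would look like ReflectionInvocationSketch.InvokeFormat(upperCaseFormatterType, "abc"). When an interface is available, casting to it is still the nicer option, since the call is then checked at compile time.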

Part of this post is based on information and code that I got from this post, this post, and this question.
