Intercepting method calls using IL

Interception: the word is heard most often in Canadian or American football, when a forward pass is caught by a player of the opposing team (taken from Wikipedia).  It is an event that changes or diverts the usual course of action, and the object causing the diversion (a person, in the case of a football game) is called the interceptor.  In software development, interception is a process that changes the steps of a method, function or flow of execution.

In this article, we will learn one way of injecting code that intercepts the execution flow (using IL instead of C#, so that it is language independent).  We will cover one of the more complicated examples: a method that takes an input parameter and returns an object.  If we can successfully inject an interceptor into this method, we should be able to do it in any method.  This article will not discuss the C# code needed to emit this IL into existing code.  It will only discuss the IL OpCodes that you need to emit in order to inject code.  However, to give you a complete recipe, you will find at the end some references to articles that cover using C# to emit the IL.

A method with an input parameter & returning an object

 

    public BaseShape GetShapeObject(Type shapeType)
    {
        var baseShape = (BaseShape)Activator.CreateInstance(shapeType);
        baseShape.GenerateNewId();
        return baseShape;
    }

When you reflect this code using Reflector, JustDecompile or ILSpy, the code gets translated to

    public BaseShape GetShapeObject(Type shapeType)
    {
        BaseShape shape = (BaseShape) Activator.CreateInstance(shapeType);
        shape.GenerateNewId();
        BaseShape shape2 = shape;
        return shape2;
    }

When converting the above code into IL, we get


  1. .method public hidebysig instance class [lib]BaseShape
  2.     GetShapeObject(class [mscorlib]System.Type shapeType) cil managed
  3. {
  4.     .maxstack 1
  5.     .locals init (
  6.         [0] class [lib]BaseShape shape,
  7.         [1] class [lib]BaseShape shape2)
  8.     L_0000: nop
  9.     L_0001: ldarg.1
  10.     L_0002: call object [mscorlib]System.Activator::CreateInstance(class [mscorlib]System.Type)
  11.     L_0007: castclass [lib]BaseShape
  12.     L_000c: stloc.0
  13.     L_000d: ldloc.0
  14.     L_000e: callvirt instance void [lib]BaseShape::GenerateNewId()
  15.     L_0013: nop
  16.     L_0014: ldloc.0
  17.     L_0015: stloc.1
  18.     L_0016: br.s L_0018
  19.     L_0018: ldloc.1
  20.     L_0019: ret
  21. }

 

Understanding the IL & OpCodes

If you are new to IL and would like to go deep into it, I recommend going through

1. MSDN documentation – OpCodes and their explanation

2. Introduction to MSIL – A series of articles on MSIL by Kenny Kerr

However, wherever applicable, links to the OpCodes used in this example are provided in this article.

Local variables of a method are declared with the keyword .locals init (…), as in Lines 5-7.  Line 9 pushes the method's first declared argument, shapeType, onto the evaluation stack (ldarg.1; argument 0 is the implicit this reference of an instance method).  Line 10 calls Activator.CreateInstance, which consumes that argument and leaves the new object on the stack.  Line 11 casts it to BaseShape, and Line 12 pops the result into the local variable shape (stloc.0).  Line 13 pushes shape back onto the stack (ldloc.0) and Line 14 calls GenerateNewId on it (using callvirt rather than call).  Lines 16-17 then copy the value of shape into shape2, and Lines 19-20 push shape2 onto the stack and return it to the calling method.
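To see the same OpCode sequence run, here is a minimal sketch that hand-builds an equivalent method with System.Reflection.Emit.  It does not inject anything into an existing assembly; it only replays the walk-through above.  The BaseShape class below is a hypothetical stand-in for the one used in the article, and because a DynamicMethod is static, shapeType is argument 0 here rather than argument 1.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical stand-in for the article's BaseShape class.
public class BaseShape
{
    public Guid Id { get; private set; }
    public virtual void GenerateNewId() { Id = Guid.NewGuid(); }
}

public static class GetShapeObjectSketch
{
    public static Func<Type, BaseShape> Build()
    {
        var method = new DynamicMethod("GetShapeObject", typeof(BaseShape), new[] { typeof(Type) });
        ILGenerator il = method.GetILGenerator();
        LocalBuilder shape = il.DeclareLocal(typeof(BaseShape)); // local slot 0

        // Static method: shapeType is argument 0 (there is no 'this').
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Call, typeof(Activator).GetMethod("CreateInstance", new[] { typeof(Type) }));
        il.Emit(OpCodes.Castclass, typeof(BaseShape));
        il.Emit(OpCodes.Stloc, shape);  // pop the cast object into the local
        il.Emit(OpCodes.Ldloc, shape);  // push it back as the call target
        il.Emit(OpCodes.Callvirt, typeof(BaseShape).GetMethod("GenerateNewId"));
        il.Emit(OpCodes.Ldloc, shape);  // push the return value
        il.Emit(OpCodes.Ret);

        return (Func<Type, BaseShape>)method.CreateDelegate(typeof(Func<Type, BaseShape>));
    }
}
```

Calling GetShapeObjectSketch.Build()(typeof(BaseShape)) should return a BaseShape whose Id has been generated.  The sketch skips the nop, the second local and the br.s that the debug build in the listing contains, since they do not change the result.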

Forming IL of an external method call

 

The first step is to identify the additional local variables required to call the external method.

If the external method you are calling is a static method

  1. External static method matches the signature MyClassName.MyStaticMethod(): such a call typically needs a single call instruction (not callvirt), as in Line 10 of the sample above
  2. External static method matches the signature MyClassName.MyStaticMethod(param1, param2, param3): such calls are more complex than the first one.  The steps would be
    • Create an entry in .locals init for param1, param2 and param3
    • Assign a value to param1, param2 and param3
    • Load param1, param2 and param3 onto the evaluation stack, in order, and call the static method
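The second case can be sketched with System.Reflection.Emit as follows.  This is only an illustration under assumptions: the MyClassName.MyStaticMethod below is a hypothetical method that adds three integers, and the constant values are arbitrary.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical external static method.
public static class MyClassName
{
    public static int MyStaticMethod(int param1, int param2, int param3)
    {
        return param1 + param2 + param3;
    }
}

public static class StaticCallSketch
{
    public static Func<int> Build()
    {
        var method = new DynamicMethod("CallStatic", typeof(int), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        // Step 1: one local per parameter.
        LocalBuilder p1 = il.DeclareLocal(typeof(int));
        LocalBuilder p2 = il.DeclareLocal(typeof(int));
        LocalBuilder p3 = il.DeclareLocal(typeof(int));

        // Step 2: assign a value to each local.
        il.Emit(OpCodes.Ldc_I4, 10); il.Emit(OpCodes.Stloc, p1);
        il.Emit(OpCodes.Ldc_I4, 20); il.Emit(OpCodes.Stloc, p2);
        il.Emit(OpCodes.Ldc_I4, 30); il.Emit(OpCodes.Stloc, p3);

        // Step 3: push the values in parameter order, then 'call' (static, so no callvirt).
        il.Emit(OpCodes.Ldloc, p1);
        il.Emit(OpCodes.Ldloc, p2);
        il.Emit(OpCodes.Ldloc, p3);
        il.Emit(OpCodes.Call, typeof(MyClassName).GetMethod("MyStaticMethod"));
        il.Emit(OpCodes.Ret); // the int left on the stack is the return value

        return (Func<int>)method.CreateDelegate(typeof(Func<int>));
    }
}
```

With these values, StaticCallSketch.Build()() returns 60.  The locals are not strictly required, since the constants could be pushed directly before the call, but they mirror the steps listed above.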

If the external method is an instance method of a non-static class, i.e. the call looks something like

    MyClassName classObject = new MyClassName();
    classObject.MyMethod();
    classObject.MyMethodWithParameters(param1, param2, param3);

The steps for constructing IL would be

  1. Define an object reference of MyClassName and the parameters of the method being called in the .locals init section
        .locals init (
            [0] class [Namespace]Namespace.ParameterType param1,
            [1] class [Namespace]Namespace.MyClassName className,
            …)

     

  2. Create an object of MyClassName and the parameters of the method (preferably in the same order as declared in Step 1) and store each one in its local variable
        L_0000: newobj instance void [Namespace]Namespace.ParameterType::.ctor()
        L_0005: stloc.s param1
        L_0007: newobj instance void [Namespace]Namespace.MyClassName::.ctor()
        L_000c: stloc.s className

     

  3. Next, push the object reference className and then param1, …, paramN from their local variables onto the evaluation stack, and call the method of the object className
        L_004e: ldloc.s className
        L_0050: ldloc.s param1
        L_0052: callvirt instance void [Namespace]Namespace.MyClassName::MyMethodWithParameters(class [Namespace]Namespace.ParameterType)
        L_0057: nop
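The three steps for a non-static call can be tried out with the sketch below, again using System.Reflection.Emit to build a fresh method rather than to inject into an existing one.  ParameterType, MyClassName and the LastSeen field are hypothetical and exist only so that the call has a visible effect.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical parameter and target types.
public class ParameterType
{
    public string Text = "hello";
}

public class MyClassName
{
    public static string LastSeen; // records the call so it can be observed

    public void MyMethodWithParameters(ParameterType param1)
    {
        LastSeen = param1.Text;
    }
}

public static class InstanceCallSketch
{
    public static Action Build()
    {
        var method = new DynamicMethod("CallInstance", typeof(void), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        // Step 1: declare the locals.
        LocalBuilder param1 = il.DeclareLocal(typeof(ParameterType));
        LocalBuilder className = il.DeclareLocal(typeof(MyClassName));

        // Step 2: construct the objects and pop each into its local.
        il.Emit(OpCodes.Newobj, typeof(ParameterType).GetConstructor(Type.EmptyTypes));
        il.Emit(OpCodes.Stloc, param1);
        il.Emit(OpCodes.Newobj, typeof(MyClassName).GetConstructor(Type.EmptyTypes));
        il.Emit(OpCodes.Stloc, className);

        // Step 3: push the target object first, then the argument, then callvirt.
        il.Emit(OpCodes.Ldloc, className);
        il.Emit(OpCodes.Ldloc, param1);
        il.Emit(OpCodes.Callvirt, typeof(MyClassName).GetMethod("MyMethodWithParameters"));
        il.Emit(OpCodes.Ret);

        return (Action)method.CreateDelegate(typeof(Action));
    }
}
```

After InstanceCallSketch.Build()() has run, MyClassName.LastSeen holds "hello".  The order of the two ldloc instructions matters: the object reference must be on the stack below the arguments.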

 

The placeholder to inject this IL

 

Even if you have formed the IL correctly, you can still easily break the executable if you are not sure where the IL should be injected.  And how do you know whether you inserted it correctly?  Well, that's not difficult to find out!  When you run the EXE or reference the DLL, the .NET Framework reports that the executable is corrupt and cannot be loaded, so mistakes are easy to spot.  What needs an extra bit of care is finding an appropriate "placeholder" for the injection.  Here is a list of things to check when you are injecting any code

  1. The target method or class should not be less visible (in the terms of scope: public, private, internal) than the code to be injected
  2. The injected code must declare all the local variables it needs in the .locals init section of the method.  Locals belong to the method as a whole, so once declared they are available to every instruction in the body.
    One way to do this is to insert your variables at the front (0th position onwards) of the method's variable list, which pushes the existing variables back.  Be aware that in compiled IL, locals are addressed by slot index rather than by name (the names used with ldloc.s in step 3 for non-static calls are a convenience of the textual form), so existing instructions such as ldloc.0 or stloc.1 must be adjusted if their slots move.  Appending the new variables after the existing ones avoids this.
  3. Decide whether you need a call or a callvirt instruction to call the to-be-injected method.  Add an additional Nop OpCode after the method call instruction.  If you are wondering why, there is an interesting discussion on StackOverflow on the purpose of the Nop OpCode that you can read.
  4. For each parameter of the method, make sure that you load the correct local variable or argument and use the appropriate OpCode (NewObj, NewArr) to create any object you need.  The samples below store an argument into element 0 of an array held in a local variable; choose the one that matches your parameter:
        /* Sample instruction set: for simple PARAMETER */
        L_0036: ldloc.s obj
        L_0038: ldc.i4 0
        L_003d: ldarg obj
        L_0041: box Int32       // any type can appear here
        L_0046: stelem.ref

        /* Sample instruction set: for simple ARRAY */
        L_0036: ldloc.s objArray
        L_0038: ldc.i4 0
        L_003d: ldarg objArray
        L_0041: box string[]    // any array type can appear here
        L_0046: stelem.ref

     

  5. If you want to call your method just before the target method exits, you need to detect the exit instructions of the method (there may be one or more exit paths).  Find all the instructions with the Ret OpCode.  Then find the instruction before each Ret; it is usually a Nop (for a plain return;) or an ldloc instruction such as ldloc.0 (for return xyz;, where the value of xyz is loaded from a local variable).  Your method call should be placed before this (Nop or ldloc) instruction.
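Items 3 to 5 of the checklist can be seen together in the sketch below.  It is not an injector: it builds a small method from scratch with System.Reflection.Emit and places the "injected" instructions where the checklist says they belong, just before the ldloc/ret pair.  The Interceptor.OnExit hook and the Square method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical hook that the injected instructions call.
public static class Interceptor
{
    public static object[] LastArgs;
    public static void OnExit(object[] args) { LastArgs = args; }
}

public static class ExitHookSketch
{
    // Roughly: int Square(int x) { int r = x * x; Interceptor.OnExit(new object[] { x }); return r; }
    public static Func<int, int> Build()
    {
        var method = new DynamicMethod("Square", typeof(int), new[] { typeof(int) });
        ILGenerator il = method.GetILGenerator();
        LocalBuilder result = il.DeclareLocal(typeof(int));    // local of the "original" body
        LocalBuilder args = il.DeclareLocal(typeof(object[])); // extra local, appended after it

        // "Original" body: result = x * x
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Mul);
        il.Emit(OpCodes.Stloc, result);

        // "Injected" code, item 4: args = new object[1]; args[0] = (object)x
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Newarr, typeof(object));
        il.Emit(OpCodes.Stloc, args);
        il.Emit(OpCodes.Ldloc, args);
        il.Emit(OpCodes.Ldc_I4_0);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Box, typeof(int)); // value types are boxed before stelem.ref
        il.Emit(OpCodes.Stelem_Ref);

        // "Injected" code, item 3: static hook, so 'call' rather than 'callvirt'
        il.Emit(OpCodes.Ldloc, args);
        il.Emit(OpCodes.Call, typeof(Interceptor).GetMethod("OnExit"));

        // "Original" exit path, item 5: the hook sits before this ldloc/ret pair
        il.Emit(OpCodes.Ldloc, result);
        il.Emit(OpCodes.Ret);

        return (Func<int, int>)method.CreateDelegate(typeof(Func<int, int>));
    }
}
```

Calling ExitHookSketch.Build()(7) returns 49 and leaves a one-element array containing the boxed 7 in Interceptor.LastArgs.  Note that the evaluation stack is empty at the point where the hook is called, which is why placing the call before the ldloc that loads the return value is the safer position.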

 

Emitting the IL using C# and Reflection

 

There are several articles on the Internet that provide guidelines on emitting IL using C# and Reflection, so they will not be repeated here.  The references below can be read and tried out, but if you find any link broken or not working, please feel free to report it by commenting on the article

I hope this article has given you some insight into explicitly generating IL.

The above code has been taken from Open Source – CInject

Punit Ganshani

Punit Ganshani, based out of Singapore, is Microsoft C# MVP and specializes in Microsoft technology stack and performance engineering. He is an open-source contributor at Codeplex, CodeProject, DZone MVB, has several apps on Windows Store, author of book, a gadget fan and an evangelist.
