
Wednesday, May 23, 2012

Passing Events to a Virtual Machine

The source code for this article may be found here.

Virtual machines and software frameworks are an integral part of our digital life. There are complex ones and simple ones. Two earlier articles (Simple Virtual Machine and Simple Runtime Framework by Example) show how easy it may be to implement one yourself. I did my best to describe the way VM code may interact with native code and the Operating System; however, the reverse interaction, from the Operating System back to the VM code, is still left unexplained. This article is going to fix this omission.

As usual - note for nerds:
The source code given in this article is for example purposes only. I know that this framework is far from being perfect, therefore, this article is not a howto or tutorial - just an explanation of principle. Error checks are omitted on purpose. You want to implement a real framework - do it yourself, including error checks.
By "VM's code" I do not mean the implementation of the virtual machine, but the pseudo code that runs inside it.


Architecture Overview
Needless to say, the ability to pass events/signals to code executed by the virtual machine implies a more complex VM architecture. While all previous examples were based on a single function responsible for the execution, adding events means not only adding another function, but also introducing threads into our implementation.

At least two threads are needed:

  1. Actual VM - this thread is responsible for the execution of the VM's executable code and for dispatching the event queue (processor);
  2. Event Listener - this thread is responsible for collecting relevant events from the Operating System and adding them to the VM's event queue (listener).

Fig.1 VM Architecture with Event Listener

You may see that the Core() function, in the attached source code, creates an additional thread.







Event Listener

This thread collects events from the Operating System (mouse move, key up/down, etc.) and adds a new entry to the list of EVENT structures.

typedef struct _EVENT
{
   struct _EVENT* next_event; // Pointer to the next event in the queue
   int            code;       // Code of the event
   unsigned int   data;       // Either unsigned int data or the address of the buffer
                              // containing information to be passed to the handler
}EVENT;

The code for the listener is quite simple:

while(WAIT_TIMEOUT == WaitForSingleObject(processor_thread, 1))
{
   // Check for events from the OS
   if(event_present)
   {
      // Fill in the event completely before publishing it
      event = (EVENT*)malloc(sizeof(EVENT));
      event->next_event = NULL;
      event->code = whatever_code_is_needed;
      event->data = whatever_data_is_relevant;

      // Only the shared queue needs the lock
      EnterCriticalSection(&cs);
      add_event(event_list, event);
      LeaveCriticalSection(&cs);
   }
}

The code is fairly self-explanatory. First, it checks for available events (this part is omitted and replaced by a comment). If there is a new event to pass to the VM, it adds it to the queue. While in this example event collection is implemented as a loop, in real life you may implement it with callbacks and use the loop above just to wait for the processor thread to exit.
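The add_event() helper is not shown in the article, so here is a minimal sketch of how such a queue might look. The signature taking a pointer to the head of the list, and the new_event() and pop_event() helpers, are my assumptions for illustration and not the code from the attached sources. Locking is left to the caller, as in the listener loop above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct _EVENT
{
   struct _EVENT* next_event; // Pointer to the next event in the queue
   int            code;       // Code of the event
   unsigned int   data;       // Event data or address of a buffer
}EVENT;

// Hypothetical helper: allocates an event with every field initialized.
EVENT* new_event(int code, unsigned int data)
{
   EVENT* event = (EVENT*)malloc(sizeof(EVENT));
   if(NULL == event)
      return NULL;
   event->next_event = NULL;
   event->code = code;
   event->data = data;
   return event;
}

// Appends the event to the tail, so events are handled in arrival order.
void add_event(EVENT** head, EVENT* event)
{
   EVENT** link = head;
   while(NULL != *link)
      link = &(*link)->next_event;
   event->next_event = NULL;
   *link = event;
}

// Detaches and returns the oldest event, or NULL if the queue is empty.
EVENT* pop_event(EVENT** head)
{
   EVENT* event = *head;
   if(NULL != event)
   {
      *head = event->next_event;
      event->next_event = NULL;
   }
   return event;
}
```

Walking the list on every append is fine for a handful of events; a real framework would probably keep a tail pointer as well.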


Processor

Obviously, the "processor" thread is going to be a bit more complicated than in the previous article (Simple Runtime Framework by Example), as in addition to running the run_opcode(CPU**) function, it has to check for pending events and pass the control flow to the associated handler in the VM code.

typedef struct _EVENT_HANDLER
{
   struct _EVENT_HANDLER* next_handler; // Pointer to the next handler
   int                    event_code;   // Code of the event
   unsigned int           handler_base; // Address of the handler in the VM's code
}EVENT_HANDLER;

DWORD WINAPI RunningThread(void* param)
{
   CPU*            cpu = (CPU*)param;
   EVENT*          event;
   EVENT_HANDLER*  handler;

   do{
      EnterCriticalSection(&cs);
      if(NULL != events)
      {
         event = events;
         events = events->next_event;

         // Look up the handler registered for this event code
         handler = handlers;
         while(NULL != handler && event->code != handler->event_code)
               handler = handler->next_handler;

         // Events without a registered handler are simply dropped
         if(NULL != handler)
         {
            // Save current context by pushing VM registers to VM's stack

            cpu->regs[REG_A] = (unsigned int)event->code;
            cpu->regs[REG_B] = event->data;
            cpu->regs[REG_IP] = handler->handler_base;
         }

         free(event);
      }
      LeaveCriticalSection(&cs);

   }while(0 != run_opcode(&cpu));
   return cpu->regs[REG_A];
}

We are almost done. Our framework already knows how to pass events to the correct handler in the VM's code. Two more things remain to be covered - registering a handler and returning from a handler.


Returning from Handler

Because an event handler is not a regular routine, we cannot return from it using the regular RET instruction. Instead, let's introduce another instruction - IRET. As an event actually interrupts the execution flow of the program, IRET (interrupt return) is exactly what we need. The source code that handles this instruction is so simple that there is no need to give it here in the text of the article. All it does is restore the context of the VM's code by popping the registers previously pushed onto the stack.
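For those who would still like to see the idea in code, here is a rough sketch of saving the context on handler entry and restoring it on IRET. It is not taken from the attached sources. The register order in the enum, the stack growing downwards, SP being an index into the stack array, and the names vm_push(), vm_pop(), enter_handler() and iret() are all assumptions made for illustration.

```c
#include <assert.h>

// Assumed register layout: four general purpose registers, then SP and IP.
enum { REG_A, REG_B, REG_C, REG_D, REG_SP, REG_IP, REG_COUNT };

typedef struct
{
   unsigned int  regs[REG_COUNT];
   unsigned int* stack;
}CPU;

// Assumption: SP is an index into the stack array and grows downwards.
void vm_push(CPU* cpu, unsigned int value)
{
   cpu->stack[--cpu->regs[REG_SP]] = value;
}

unsigned int vm_pop(CPU* cpu)
{
   return cpu->stack[cpu->regs[REG_SP]++];
}

// Handler entry: save the interrupted context, then jump to the handler
// with the event code in A and the event data in B.
void enter_handler(CPU* cpu, unsigned int handler_base,
                   int code, unsigned int data)
{
   vm_push(cpu, cpu->regs[REG_IP]);
   vm_push(cpu, cpu->regs[REG_A]);
   vm_push(cpu, cpu->regs[REG_B]);
   vm_push(cpu, cpu->regs[REG_C]);
   vm_push(cpu, cpu->regs[REG_D]);

   cpu->regs[REG_A]  = (unsigned int)code;
   cpu->regs[REG_B]  = data;
   cpu->regs[REG_IP] = handler_base;
}

// IRET: pop the registers in the reverse order of enter_handler().
void iret(CPU* cpu)
{
   cpu->regs[REG_D]  = vm_pop(cpu);
   cpu->regs[REG_C]  = vm_pop(cpu);
   cpu->regs[REG_B]  = vm_pop(cpu);
   cpu->regs[REG_A]  = vm_pop(cpu);
   cpu->regs[REG_IP] = vm_pop(cpu);
}
```

The only thing that really matters is that IRET pops in exactly the reverse order of the pushes; once the five values are popped, SP is back where it was before the event.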


Registering an Event Handler

The last thing left is to "teach" the program written in pseudo assembly to register a handler for a given event type. In order to do this, we need to add one simple system call - SYS_ADD_LISTENER. This system call accepts two parameters:
  1. Code of the event to handle;
  2. Address of the handler function.

loadi  A, 0             ;Code of the event
loadi  B, handler       ;Address of the handler subroutine
_int   sys_add_listener ;Register the handler
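On the framework side, the system call only has to record the pair (event code, handler address) in the list of EVENT_HANDLER structures. The following is a sketch of how that might be done. The add_listener() and find_handler() names, their signatures, and the choice to replace an already registered handler for the same code are assumptions of mine, not the attached implementation.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct _EVENT_HANDLER
{
   struct _EVENT_HANDLER* next_handler; // Pointer to the next handler
   int                    event_code;   // Code of the event
   unsigned int           handler_base; // Address of the handler in VM code
}EVENT_HANDLER;

// Returns the handler registered for event_code, or NULL if there is none.
EVENT_HANDLER* find_handler(EVENT_HANDLER* handlers, int event_code)
{
   while(NULL != handlers && event_code != handlers->event_code)
      handlers = handlers->next_handler;
   return handlers;
}

// What SYS_ADD_LISTENER might do with A (event code) and B (address).
// Returns 0 on success and -1 if memory could not be allocated.
int add_listener(EVENT_HANDLER** handlers, int event_code,
                 unsigned int handler_base)
{
   EVENT_HANDLER* handler = find_handler(*handlers, event_code);
   if(NULL == handler)
   {
      handler = (EVENT_HANDLER*)malloc(sizeof(EVENT_HANDLER));
      if(NULL == handler)
         return -1;
      handler->event_code = event_code;
      handler->next_handler = *handlers; // Prepend to the list
      *handlers = handler;
   }
   handler->handler_base = handler_base; // New or replaced handler address
   return 0;
}
```

Since the processor thread walks the same list while dispatching events, a real implementation would have to guard registration with the same critical section.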


Example Code

The example code attached to this article is the implementation of all of the above. It does the following:
  1. Registers an event handler;
  2. Enters an infinite loop printing out '.' every several milliseconds;
  3. The first thread (the listener) waits a bit and generates an event;
  4. The event handler terminates the infinite loop and returns;
  5. The program prints out a message and exits.


I hope this post was helpful or, at least, interesting.

See you at the next.






Saturday, May 19, 2012

Simple Runtime Framework by Example

Source code for this article may be found here.

These days we are simply surrounded by different software frameworks. Just to name a few: Java, .Net and, actually, many more. Have you ever wondered how those work or have you ever wanted or needed to implement one? In this article, I will cover a simple or even trivial runtime framework.

As usual - note for nerds:
The source code given in this article is for example purposes only. I know that this framework is far from being perfect, therefore, this article is not a howto or tutorial - just an explanation of principle. Error checks are omitted on purpose. You want to implement a real framework - do it yourself, including error checks.

Now, let's get down to business.

Software Framework
Wikipedia gives the following definition of the term "Software Framework" - "A software framework is a universal, reusable software platform used to develop applications, products and solutions. Software Frameworks include support programs, compilers, code libraries, an application programming interface (API) and tool sets that bring together all the different components to enable development of a project or solution". As you can see, a software framework is quite a complex thing. However, let's simplify it and see how it basically works.

Figure 1. Software Framework

The diagram in Figure 1 may give you a good understanding of what a Software Framework is and what role it performs. Simply put, it is a shim between the user application and the Operating System. There are at least two types of Software Frameworks:

  1. Application Programming Interface (API) - if we take a look at the Windows API, we may see that it is a framework as well. However, it may be bypassed or, at least, a programmer may choose to decrease the interaction with it by, for example, using functions from ntdll.dll instead of those provided by kernel32.dll, or even "talk" to the Windows kernel directly through interrupts (strongly discouraged, but sometimes unavoidable).
  2. .Net like framework - total isolation of user code from the operating system. Such frameworks are mostly virtual machines totally isolating the user application from the operating system and hardware. However, such a framework has to provide the application with all the services available in the Operating System. This is the type of framework we are going to build in this article.




Virtual Machine
The basics of building a simple virtual machine are covered in the Simple Virtual Machine article, so I will only give a brief explanation here. Our VM in this example will consist of the following components:
  1. Virtual CPU
    A structure that represents a CPU - basically, has 6 registers and a pointer to the stack:

    typedef struct
    {
       unsigned int  regs[6];
       unsigned int* stack;
    }CPU;

    The 6 registers are the general purpose A, B, C and D (where A is also used to store the system call return value and C is used as a counter for the LOOP instruction), the STACK POINTER (SP) and the INSTRUCTION POINTER (IP).
  2. Instruction Interpreter
    A function or a set of functions which is responsible for the interpretation of the pseudo assembly (or call it intermediate assembly language) designed for this virtual machine (in this case 14 instructions).
  3. System Call Handler
    This component provides the means for the user application to interact with the Operating System (in this case 2 system calls: sys_write and sys_exit).

Core Function
The name of the function speaks for itself. This is the first function of the framework implementation which gains control. In this particular case, it does not have too many things to do - initialization of the virtual CPU and execution of the command interpreter, until the user application exits (signals the framework to terminate the execution).
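To make the interplay between the virtual CPU, the interpreter and the core function a bit more tangible, here is a deliberately tiny sketch. The instruction encoding (three words per instruction), the opcode numbers, the four opcodes shown and the core() signature are invented for this illustration; the real framework has 14 instructions and its own encoding, and its run_opcode() takes different parameters.

```c
#include <assert.h>

// Assumed register layout: four general purpose registers, then SP and IP.
enum { REG_A, REG_B, REG_C, REG_D, REG_SP, REG_IP, REG_COUNT };
// Hypothetical opcodes; every instruction is {opcode, operand1, operand2}.
enum { OP_LOADI, OP_ADD, OP_LOOP, OP_INT };
// Hypothetical system call number.
enum { SYS_EXIT };

typedef struct
{
   unsigned int  regs[REG_COUNT];
   unsigned int* stack;
}CPU;

// Executes one instruction; returns 0 once the program asks to exit.
int run_opcode(CPU* cpu, const unsigned int* code)
{
   const unsigned int* ins = code + cpu->regs[REG_IP];
   cpu->regs[REG_IP] += 3; // Advance to the next instruction

   switch(ins[0])
   {
      case OP_LOADI: // reg = immediate value
         cpu->regs[ins[1]] = ins[2];
         break;
      case OP_ADD:   // reg1 += reg2
         cpu->regs[ins[1]] += cpu->regs[ins[2]];
         break;
      case OP_LOOP:  // Decrement C, jump to the target while C is not zero
         if(0 != --cpu->regs[REG_C])
            cpu->regs[REG_IP] = ins[1];
         break;
      case OP_INT:   // System call handler (only exit in this sketch)
         if(SYS_EXIT == ins[1])
            return 0;
         break;
      default:       // Unknown opcode: stop the machine
         return 0;
   }
   return 1;
}

// Core: initialize the virtual CPU and interpret until the program exits.
unsigned int core(const unsigned int* code)
{
   CPU cpu = { {0}, 0 }; // All registers zero, no stack in this sketch
   while(0 != run_opcode(&cpu, code))
      ;
   return cpu.regs[REG_A];
}
```

Feeding core() a short program that loads 5 into B, 3 into C, and then adds B to A inside a LOOP leaves 15 in A by the time the program calls exit.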

Implementation
It is a common practice to implement a framework as a DLL (dynamic link library), for example, mscoree.dll - the core of the .Net framework. I do not see any reason to reinvent the wheel, therefore, this framework will be implemented as a DLL as well.

All is fine, you may say, but how should we pass the compiled pseudo assembly code to the framework? Well, I bet, most of you know how to do that. In case you don't - no worries, just keep reading.

In the case of the .Net framework (at least as far as I know), the loader identifies a file as a .Net executable, reads in the meta header, and initializes mscoree.dll appropriately. We will not go through all those complications and will use a regular PE file:


Figure 2. Customized PE file.

  • PE Header - regular PE Header, no modification needed;
  • Code Section - simply invokes the core function of the framework:

    push pseudo_code_base_address
    call [core]

  • Import Section - regular import section that only imports one function from the framework.dll - framework.core(unsigned int);
  • Data Section - this section contains the actual compiled pseudo assembly code and whatever headers you may come up with, that may instruct the core() function to correctly initialize the application.






Example Executable Source Code
The following is the source code of the example executable. It may be compiled with FASM (Flat Assembler).

include 'win32a.asm' ;we need the 'import' macro
include 'asm.asm'    ;pseudo assembly commands and constants

format PE console
entry start

section '.text' readable executable
start:
   push _base
   call [core_func]

section '.idata' data import writeable
library  framework, 'framework.dll'

import framework,\
   core_func, 'Core'

section '.data' readable writeable
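_base:
   loadi A, _base
   loadi B, 0x31
   _add A, B                   ;A = _base + 0x31
   loadr B, A                  ;B = key (dword stored at _base + 0x31)
   loadi A, _data.string       ;A = address of the encrypted string
   loadi C, _data.string_len   ;C = length of the string in dwords
   _call _func                 ;Decode the string in place
   loadi A, 1
   loadi B, _data.string
   loadi C, _data.str_len      ;C = length of the string in bytes
   _int sys_write              ;Print the decoded string
   loadi A, 1
   loadi B, _data.msg
   loadi C, _data.msg_len
   _int sys_write              ;Print the second message
   _int sys_exit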


_func:
   ; A = string address
   ; B = key
   ; C = counter
.decode:
   loadr D, A
   xorr D, B
   storr A, D
   loadi D, 4
   _add A, D
   _loop .decode
   _ret



_data:
.string db 'Hello, developer!', 10, 13
.str_len = $-.string
db 0
.string_len = ($-.string)/4
.msg db 'The program will now exit.', 10, 13
.msg_len = $-.msg

;Encrypt one string
load k dword from _base + 0x31
repeat 5
load a dword from _data.string + (% - 1) * 4
a = a xor k
store dword a at _data.string + (% - 1) * 4
end repeat



The code above produces a tiny executable which invokes the framework's core() function. The pseudo assembly code simply prints two messages (the first one is decoded prior to being printed). Full sources are attached to this article (see the very first line).
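The decoding in _func relies on the fact that XOR with the same key is its own inverse: the FASM "repeat" block encrypts the string at compile time, and the pseudo assembly applies the same operation at run time. Here is the same idea sketched in C. The xor_dwords() name and the key value used in any test are made up for the example; in the executable above the key is simply the dword found at _base + 0x31.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// XORs `count` consecutive 32-bit words of `buffer` with `key`, in place.
// Applying it twice with the same key restores the original contents.
void xor_dwords(unsigned char* buffer, size_t count, uint32_t key)
{
   size_t i;
   for(i = 0; i < count; ++i)
   {
      uint32_t word;
      // memcpy avoids unaligned access on a byte buffer
      memcpy(&word, buffer + i * 4, sizeof(word));
      word ^= key;
      memcpy(buffer + i * 4, &word, sizeof(word));
   }
}
```

Note that the buffer length has to be a multiple of four bytes, which is why the string in the example is padded with an extra zero byte before .string_len is calculated.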

The good thing is that you do not have to start the interpreter and load this executable (or specify it as a command line parameter) - you may simply run this executable, and the Windows loader will bind it with framework.dll automatically. The bad thing is that you would, most probably, have to write your own compiler, because writing assembly is fun, dealing with pseudo assembly is fun as well, BUT only when done for fun. It is not as pleasant when dealing with production code.


Possible uses
Unless you are trying to create a framework that would compete with existing software frameworks, you may use such an approach to increase the protection of your applications by, for example, virtualizing cryptography algorithms or any other part of your program which is not critical in terms of execution speed, but represents sensitive intellectual property.

Hope you find this article helpful.

See you at the next!