Dynamic Extensions

Our focus this week is on the creation of software that can be modified while it is running. The idea is not necessarily that we want to change existing functionality, but rather that we want to add functionality. To this end, we will look at how to extend a program by loading code directly into it. We'll start with specific mechanisms for C and Java. Then we'll look at a more generic mechanism that can be achieved using the operating system. Finally, we'll look at techniques that bridge multiple languages.

Let's start with an example. Suppose that you have a web server, and you want to allow that web server to execute custom code in order to handle requests. One way to do this is to have the web server run a separate program in response to any request for a specific resource. This is how the web worked in its earliest days: the cgi-bin folder stored scripts (often written in Perl), and whenever a request was made for a resource in the cgi-bin folder, the web server would start the program corresponding to that resource, pass it the details of the request, wait for its output, and then pass that output back to the client.
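To make the "start a program, wait for its output, pass it along" idea concrete, here is a minimal C++ sketch of just that step. It assumes a POSIX system (it relies on popen), and the name run_and_capture is made up for illustration; a real server would also have to pass the request details to the program and would never hand untrusted input to a shell.

```cpp
#include <stdio.h>   // popen, pclose, fgets (POSIX)
#include <stdexcept>
#include <string>

// Hypothetical helper: runs `command` through the shell, waits for it to
// finish, and returns everything the program wrote to its stdout.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) {
        throw std::runtime_error("popen failed");
    }
    std::string output;
    char buffer[256];
    // Read the child's output one chunk at a time until it closes stdout.
    while (fgets(buffer, sizeof buffer, pipe) != nullptr) {
        output += buffer;
    }
    pclose(pipe);  // reaps the child process
    return output;
}
```

A server built this way would send the returned string back to the client. Note that every request pays the cost of starting a brand new process.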

To be clear, that technique works, and we'll do it in a later step of the tutorial. But first, let's try to do things in a very efficient way: by loading code directly into a running program.

The example that we will use (abuse?) throughout this tutorial is a program that manipulates and prints text. It doesn't sound very special, but it will be enough to demonstrate the mechanics of extensibility.

Since we're working in C/C++, let's start by going over some basics. First of all, it's important that you understand the following syntax:

void* (*funct)(int, char);

This line is defining a variable that is a function pointer. It says that funct is a pointer to a function that takes two parameters, the first of which is an int and the second of which is a char. The function returns a void*.
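To see that declaration in use, here is a minimal sketch. The function names (repeat_char, padded_char, choose) and the alias funct_t are my own inventions for illustration; the only thing taken from the text above is the signature.

```cpp
#include <cstdlib>
#include <cstring>

// An alias for the pointer type declared above: void* (*)(int, char).
using funct_t = void* (*)(int, char);

// Returns a heap-allocated, NUL-terminated string holding n copies of c.
// The caller must release it with std::free.
void* repeat_char(int n, char c) {
    if (n < 0) n = 0;
    char* buf = static_cast<char*>(std::malloc(static_cast<std::size_t>(n) + 1));
    if (buf == nullptr) return nullptr;
    std::memset(buf, c, static_cast<std::size_t>(n));
    buf[n] = '\0';
    return buf;
}

// Returns a heap-allocated string holding n spaces followed by c.
// The caller must release it with std::free.
void* padded_char(int n, char c) {
    if (n < 0) n = 0;
    char* buf = static_cast<char*>(std::malloc(static_cast<std::size_t>(n) + 2));
    if (buf == nullptr) return nullptr;
    std::memset(buf, ' ', static_cast<std::size_t>(n));
    buf[n] = c;
    buf[n + 1] = '\0';
    return buf;
}

// Both functions match the signature, so either can be handed back
// through the same pointer type.
funct_t choose(bool pad) {
    return pad ? padded_char : repeat_char;
}
```

With these in place, a caller can write void* (*funct)(int, char) = choose(true); and then call funct(2, 'y') without knowing or caring which function it got.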

If we had a bunch of functions in our program that matched the above signature, we could make this pointer refer to any of them. If you're thinking "oh, wow, I could avoid a lot of switch statements and if/else blocks by using function pointers", you're right. Here's a trivial example:

You can compile that code with the command g++ simple.cc -o simple. You should invoke it a few times, and you should be sure you understand every line of the code. Here are a few runs I did:

Once you are comfortable with the idea of function pointers, we can move on to the next step: dynamically loading code into a running program.

The main question we now wish to tackle is how to load code into a running program. You should be familiar with the idea that we can compile code into a .o file, using a command like this: g++ -c file.cc -o file.o. We typically call a .o file an "object file", and it contains machine code that needs to be statically linked into an executable file.

In Unix, there are also "shared object" files (.so files). The Windows counterpart is the "DLL" (Dynamic-Link Library). Shared objects are different from regular object files in that their code can be dynamically linked at runtime. The machine code has a slightly different form (it is position-independent), so that it can be loaded anywhere into the process image, instead of into a fixed place. It's actually quite easy to compile a shared object: g++ file.cc -o file.so -fPIC -shared.

Once we have a shared object, we can dynamically load it into a running program using two steps. First, we use dlopen to load the .so into the address space of the current process. Then we use dlsym to find a function's pointer, using the function name.
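Those two steps can be sketched as follows. This is only a sketch under assumptions: the helper name load_text_function and the alias text_fn are hypothetical, and I'm assuming the function we want takes a const char* and returns nothing. The calls themselves (dlopen, dlsym, dlerror, dlclose) are the standard ones from dlfcn.h.

```cpp
#include <dlfcn.h>
#include <string>

// Assumed signature of the function we expect to find in the .so.
using text_fn = void (*)(const char*);

// Hypothetical helper: loads the shared object at `path`, then looks up
// `symbol` in it. Returns nullptr and fills `error` if either step fails.
text_fn load_text_function(const std::string& path,
                           const std::string& symbol,
                           std::string& error) {
    // Step 1: map the .so into this process's address space.
    void* handle = dlopen(path.c_str(), RTLD_NOW);
    if (handle == nullptr) {
        const char* msg = dlerror();
        error = (msg != nullptr) ? msg : "unknown dlopen error";
        return nullptr;
    }

    // Step 2: find the function by name. A symbol's address can legitimately
    // be null, so the reliable failure check is dlerror(), cleared first.
    dlerror();
    void* sym = dlsym(handle, symbol.c_str());
    const char* msg = dlerror();
    if (msg != nullptr) {
        error = msg;
        dlclose(handle);
        return nullptr;
    }

    // The handle is deliberately left open: the code must stay loaded for
    // as long as the returned pointer might be called.
    return reinterpret_cast<text_fn>(sym);
}
```

On systems with an older glibc, a program that uses these calls needs -ldl on its link line.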

One thing to be careful about, though, is that C++ does not always name things the way you would expect. Consider the function int foo(int, char). If we compile it with a C compiler like gcc, and then run nm on the resulting .o file, we'll see that there is a function called "foo". If we repeat this process, but compile with a C++ compiler, the function name will be "mangled" to _Z3fooic.

The details of why C++ does this, and what it means, are quite cool. C++ can distinguish between different functions with the same name, based on the types of their parameters. The "i" and "c" in the name refer to the parameter types (int and char), and the "_Z3foo" indicates that there is a function whose name is 3 characters long, named "foo". If you don't care about all of that, it suffices to know that you can create a function with a C-style name by wrapping the declaration in an extern "C" block:

extern "C" { int foo(int, char); }

This is very important for us: we want to have functions that are easy to find, but we don't want to give up on all the benefits of using C++.
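Here is a small sketch that puts both kinds of naming side by side. The name foo_plain is made up for illustration, and the mangled names in the comments are what g++ on Linux produces; other compilers may mangle differently.

```cpp
// Ordinary C++ functions: overloading works because each overload gets its
// own mangled symbol (with g++ on Linux: _Z3fooic and _Z3fooii).
int foo(int n, char c) { return n + c; }
int foo(int a, int b) { return a * b; }

// C linkage: the symbol is exactly "foo_plain", so it is easy to look up
// by name. Only the name is C-style; the body is still ordinary C++, so
// it can call overloaded functions, use classes, and so on.
extern "C" int foo_plain(int n, char c) {
    return foo(n, c);
}
```

If you compile this with g++ -c and run nm on the result, you should see the two mangled names and the unmangled foo_plain.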

Now that we've worked through that, let's make a very simple function that takes a string as input and prints it. I named this file simple_print.cc:
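As a rough sketch of what such a file might contain: I'm assuming the exported function is named simple_print and takes a const char*; both are assumptions, so adjust them to match whatever your loader looks up.

```cpp
#include <iostream>

// Exported with C linkage so that its symbol name is exactly
// "simple_print" (an assumed name), with no C++ mangling.
extern "C" void simple_print(const char* text) {
    if (text == nullptr) {
        return;  // nothing to print
    }
    std::cout << text << std::endl;
}
```

Note that there is no main function: this file is meant to be built with -fPIC -shared and loaded by another program.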

We'll ultimately compile this as a shared object. But for now, let's make the code that will do the dynamic loading. Let's start with the menu code. Here's menu.cc:

Here's menu.h:

This is not a very good interface, but it will suffice for our demo. The user can choose "4" to enter some text, then "1" to load a function from a .so, and then "3" to run that function on the text that was input. I'm going to take a shortcut and assume only one function will be loaded from any particular .so. That's not always the case.

We're going to attach a name to each function we load, and we'll store <name, function> pairs in a map data structure. The whole program (except for the menu) looks like this. We'll call it demo.cc:
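The bookkeeping part of that idea can be sketched on its own. The names here (text_fn, registry, add_function, run_function) are hypothetical, and the sketch leaves out the loading itself; it only shows how the <name, function> pairs might be stored and used.

```cpp
#include <map>
#include <string>

// Assumed signature for every loaded function.
using text_fn = void (*)(const char*);

// The table of <name, function> pairs. A function-local static avoids
// any question about when the map gets constructed.
std::map<std::string, text_fn>& registry() {
    static std::map<std::string, text_fn> table;
    return table;
}

// Stores `fn` under `name`, replacing any earlier entry with that name.
// Returns false if there is no function to store.
bool add_function(const std::string& name, text_fn fn) {
    if (fn == nullptr) {
        return false;
    }
    registry()[name] = fn;
    return true;
}

// Runs the function stored under `name` on `text`.
// Returns false if nothing has been stored under that name.
bool run_function(const std::string& name, const std::string& text) {
    auto it = registry().find(name);
    if (it == registry().end()) {
        return false;
    }
    it->second(text.c_str());
    return true;
}
```

In the demo, the pointer passed to add_function would come from dlsym, and the name would be whatever the user typed at the menu.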

If you're wondering why I put the menu in a separate .cc file, it's so that the Makefile for this example can be more complex. Really. I've found that students often don't know how to write a good Makefile. You should strive to understand every line of demo.cc, and also every line of this Makefile. In particular, note that it creates its own dependency information, so that you never need to re-compile things unless there is a chance that the new outcome will differ from the previous outcome.

Once you've put everything together, you should be able to create .so files and load them. To be sure that you've got it all working, make a new .so that uses std::toupper() to uppercase the input. You should be able to start the demo running, then compile only your new function, and still be able to load it into your running program.

Extending Java works in mostly the same way. The main difference is that we don't use function pointers. Instead, we use interfaces and classes.

In the following Loadable.java file, we define a very simple interface: there is just one function, called "act()", which takes a string as a parameter.

It should be easy to see how this is going to be just like a function pointer. We can have a collection of Strings and Loadables, and that will let us dynamically load code and use it.

Here's a really simple example of a class that implements the Loadable interface:

Here's a really simple driver for the demo we'll run in this section of the tutorial:

(Again, I'm splitting code into multiple files, so that I can demonstrate some build tools later on).

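Now it's time to implement the code that will load classes on the fly. There are two key steps. First, we use Class.forName() to load a .class file by name. Then we use Class.newInstance() to instantiate the class. (Class.newInstance() is deprecated as of Java 9; on newer versions, the usual replacement is getDeclaredConstructor().newInstance().)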

Usually people don't use make to build Java programs. One popular alternative is ant. Here's an ant buildfile. You should save it as "build.xml".

To build, you can type "ant compile". To clean, you can type "ant clean". Note that the "run" target doesn't work so well. You're better off typing "java Demo".

To finish this step of the tutorial, you should do three more things:

Implement some sort of interesting additional Loadable.
Enhance your build.xml file so that there is a target for building Loadable classes that is different from the "compile" target. Doing this correctly should enable you to start your program, compile a new class, and still be able to load it.
If you understand how the HTML code of this tutorial works, you'll discover that there is a missing code listing. It's my code listing for a Loadable that translates to Pig Latin. Find it, and incorporate it into your program!

Early in this tutorial, I mentioned that the "old" way of extending a web server was to have the server create a new process every time there was a request for a resource that corresponded to code. We can do this in Node.js. Keeping up with our Pig Latin theme, here's a remarkably simple Perl script (call it "pig.pl") that does a reasonable job of converting text to Pig Latin:

The code clearly has its flaws (specifically, relating to capitalization and punctuation), but it uses a regular expression to succinctly capture the essence of Pig Latin.

Within Node.js, we can invoke a script like this by using the exec function of the built-in "child_process" module:

This is not a proper web server, just a command-line script that you can run via node demo.js. When you do, you'll see that the sample text is being sent to the Perl script, which is producing a response. Your first task in this step of the tutorial is to make a web server with a form that submits to a route called "pig" (using GET is fine), where the "pig" route turns its input string into Pig Latin by execing the above Perl script.

Of course, it's really inefficient to start a new Perl interpreter every time we want some Pig Latin. And we have some decent Pig Latin code written in Java (if you finished the previous step). Wouldn't it be nice to have a long-running process that we could send text to, and the process would send us translated text back? For this role, Node.js has the spawn mechanism.

In the following example, we're not using spawn to its full power, because the program we spawn terminates. However, it helps to demonstrate that spawn is a little different from exec. With exec, we provide one callback that runs after the child finishes and receives all of its output at once. With spawn, we instead provide a callback that runs any time the spawned process produces text.

To go all the way with this example, we'll want to write to the child's stdin, so that the child can read data and then produce new output on its stdout. There is a little catch: when Node.js spawns a child process, it pauses the child's stdin, so we need to manually resume it.

In the following example, we use the standard Unix utility tee, which sends its input both to a file and to stdout. If we run it without a file name, it just sends its input back to stdout, and then keeps running, waiting for more inputs to process.

Since tee doesn't terminate of its own accord, we will need to manually kill it. We can do that by using the (dangerous) kill function. This is not the right way to manage processes, but it works for our example.

As before, this isn't a web server, just a script. But it shows that the child is running on its own. To finish this step of the tutorial, use spawn instead of exec in your Node.js server's pig route, so that a single long-running process does the translations.

Once you have completed the previous step, you can, in theory, invoke any arbitrary code. The catch is that you will be writing to the stdin of a child process, and if that child process isn't ready to handle binary data, you could have a problem. For example, later in the semester we're going to make image filters that alter a photo after the user uploads it. Sure, you could save the photo to the file system, then write the file name and desired function to the stdin of a process for image manipulation, but that gets tedious. Our best bet is to write C or Java code that we can load directly into our node server.

We're going to do C/C++ first. Our strategy will be to use the node-ffi package. I'm not going to tell you how... it's time for you to figure it out on your own. Visit node-ffi on GitHub to get started.

In the same spirit as the C/C++ step, I'm going to leave this step pretty unspecified. Use the node-java package and see how far you can get.

There's only one next-step for this week. Show that you can get a good sense for the workflow of an extension by setting up a server that works as follows:

The user can upload a text file.
After the file is uploaded, an extension is called.
The extension is given the name of the file.
The extension converts the file to Pig Latin and saves it.
The extension notifies the server of the new file name.
The server notifies the user of the file name.
The user can enter the file name to serve the translated file.

CSE 398
