Over time, we hope these tools will grow into a complete, free evaluation system for IOI-like competitions. Comments are welcome.
We have implemented a correction driver that should compile on any system implementing POSIX.2 (e.g., Linux). It tests a program on several inputs, and you can limit the program's resource usage. In particular, you can limit the amount of CPU time the program actually uses (you can also set memory limits, but because of some known limitations they will not work under Linux).
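To illustrate the kind of CPU-time limiting described above, here is a minimal sketch that relies on the POSIX `setrlimit` facility through Python's `resource` module. It is not the driver's code: the function name `run_with_cpu_limit` and its return convention are assumptions made for this example, and it presumes a Unix-like system.

```python
import resource
import subprocess


def run_with_cpu_limit(cmd, input_data, cpu_seconds):
    """Run cmd with input_data on its standard input under a CPU-time limit.

    Returns (returncode, stdout_bytes). A negative returncode means the
    child was terminated by a signal, which is what happens when it
    exceeds the limit.
    """

    def set_limit():
        # Executed in the child between fork and exec. Crossing the soft
        # limit raises SIGXCPU; the hard limit, one second later, is a
        # safety net that results in SIGKILL.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds + 1))

    proc = subprocess.run(
        cmd,
        input=input_data,
        stdout=subprocess.PIPE,
        preexec_fn=set_limit,
    )
    return proc.returncode, proc.stdout
```

The limit counts CPU time actually consumed, not wall-clock time, so a program that merely sleeps or waits for input is not penalized.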
You can write a comparator that the driver will use to check whether the output of the program is correct; the comparator can be a simple shell script, or a complex program.
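As an example of what a simple comparator might look like, the sketch below accepts an output if it contains the same whitespace-separated tokens as the expected output. The calling convention is an assumption made for illustration only (two file names as arguments, exit status 0 for a correct output); the interface the driver actually expects may differ, and the names `outputs_match` and `main` are hypothetical.

```python
def outputs_match(expected_text, actual_text):
    """Return True when both texts hold the same whitespace-separated tokens."""
    return expected_text.split() == actual_text.split()


def main(argv):
    """Hypothetical entry point: argv[1] is the expected output file,
    argv[2] is the file produced by the program under test.

    Returns 0 for a correct output, 1 for a wrong one, 2 for bad usage.
    """
    if len(argv) != 3:
        return 2
    with open(argv[1]) as expected, open(argv[2]) as actual:
        return 0 if outputs_match(expected.read(), actual.read()) else 1
```

Used as a script, this would be invoked with `sys.exit(main(sys.argv))`. Comparing tokens rather than raw bytes tolerates differences in spacing and trailing newlines, which is often desirable but is a design choice, not a requirement.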
Testing game-playing programs can be cumbersome, as the shell offers no way of connecting the standard input and output of two processes circularly. We have written a referee that feeds its standard input to two players and then lets them play.
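The circular connection can be sketched with two pipes, one for each direction. The code below is an illustration under stated assumptions, not the referee itself: the function name `play` is hypothetical, the initial data is passed as a byte string instead of being read from the referee's standard input, and that data is assumed to be small enough to fit in a pipe buffer.

```python
import os
import subprocess


def play(cmd_a, cmd_b, setup=b""):
    """Start two players with their standard streams cross-connected.

    Player A's standard output becomes player B's standard input and
    vice versa. Returns the pair of exit statuses (a_status, b_status).
    """
    a_to_b_r, a_to_b_w = os.pipe()
    b_to_a_r, b_to_a_w = os.pipe()

    # Queue the initial data in both pipes so that each player reads it
    # before anything sent by its opponent. Assumes the data fits in the
    # pipe buffer; a larger write would block, as nobody is reading yet.
    if setup:
        os.write(a_to_b_w, setup)
        os.write(b_to_a_w, setup)

    player_a = subprocess.Popen(cmd_a, stdin=b_to_a_r, stdout=a_to_b_w)
    player_b = subprocess.Popen(cmd_b, stdin=a_to_b_r, stdout=b_to_a_w)

    # Close the parent's copies so that a player sees end-of-file as soon
    # as its opponent exits.
    for fd in (a_to_b_r, a_to_b_w, b_to_a_r, b_to_a_w):
        os.close(fd)

    return player_a.wait(), player_b.wait()
```

Each player must flush its output after every move; otherwise both sides can end up waiting for data that is still sitting in the other's buffer.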
At some point the driver will also be able to evaluate games, with an opponent supplied as the comparator.