The Penguin Model for Secure Distributed Internet Scripting

Humberto Ortiz Zuazaga
humberto -at- hpcf.upr.edu
December 18, 1996

Abstract

The success of the World Wide Web (WWW) has created considerable interest in distributed scripting languages, which could be used (for example) to execute code stored on a WWW server in the environment of the browser. Additionally, the concept of ``distributed agents'' roaming the Internet on behalf of users has surfaced time and again. Penguin, a set of user-contributed additions to perl, purports to allow secure distributed execution of perl programs. In this paper, I present the Penguin security model, demonstrate Penguin's use, and contrast Penguin with other distributed scripting languages and systems such as safe-tcl, Python, Java, Legion, and Inferno.

Introduction

The explosive growth of the Internet, particularly the World Wide Web, coupled with advances in wireless technology and hand-held computing devices, has resulted in a renewed interest in distributed scripting languages. Developers envision small programs (agents or applets) traveling from computer to computer, collecting data, utilizing specialized resources, and reporting back to the user. Sun Microsystems' Java language [5] is probably the most visible of the projects currently underway, but industry giants like Microsoft and Bell Labs are rumored to have high-priority projects like Blackbird and Inferno in the works. At the same time, grass-roots projects to extend Tcl [3], Python [9], and Perl [6] are well underway on the Internet.

The Perl user community has developed a set of modules for perl that enable it to execute digitally signed source code from trusted sources in restricted environments. Access to each of perl's operators and built-in functions can be controlled independently for each entity executing code on your machine.

In this paper I present an overview of the Penguin [2] system for secure distributed applications, demonstrate its use and usefulness, and compare its security features and models with those of other Internet scripting languages.

An Overview of Penguin

Penguin is a module for perl 5.002 and above that, together with the additional modules IO and Safe and the public key encryption program pgp [10], allows computers to exchange digitally signed perl source code and execute the code in a restricted perl environment. Penguin is currently in the early beta test cycle, and is distributed with minimal documentation. The distribution contains the Penguin modules, pointers to patched versions of the IO and Safe modules, and two sets of example applications. The first pair is a set of client-server programs: the client packages perl source as a Penguin ``frame'' and transmits it to the server; the server verifies the digital signature, executes the code in an appropriately restricted environment, and returns the result of the execution to the client. This pair of programs could be used as a basis for secure distributed computing servers.

The second pair of programs is more suitable for creating a Penguin-enabled WWW browser or mail user agent. The makeapplet program creates a Penguin frame and writes it out to the file system. The runapplet program then reads the frame and executes it in a safe compartment as before. Presumably, the frames could be transmitted to another machine over the Internet (via mail or http or ...) and the web browser or mail user agent could have the runapplet code embedded into it.

In both sample applications, the basic operation of Penguin remains the same. Penguin frames are created by the client, transmitted to the server over the Internet and executed on remote machines. A secure Penguin frame contains the perl code digitally signed using pgp's implementation of the RSA public key algorithm (the code is actually signed with pgp and copied into the frame). The digital signature on the frame is an unforgeable proof of the identity of the sender, and also a guarantee that the code has not been tampered with (via a cryptographically secure message digest). The code can thus be considered to be strongly authenticated.
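
The tamper-detection half of this guarantee can be illustrated without pgp. The following is only a sketch of the message-digest idea, not of Penguin's or pgp's implementation: it uses the Digest::SHA module from current perl distributions in place of the digest pgp computes, it omits the RSA signature that binds the digest to the signer, and code_is_intact is a hypothetical helper name.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: true if $code still matches the digest recorded
# by the sender. A real Penguin frame relies on pgp's RSA signature
# over the digest, which this sketch does not attempt to reproduce.
sub code_is_intact {
    my ($code, $recorded_digest) = @_;
    return sha256_hex($code) eq $recorded_digest;
}

my $original = '$foo = 5; $bar = 6; $foo * $bar;';
my $digest   = sha256_hex($original);    # computed by the sender

# Simulate a single character being altered in transit.
(my $tampered = $original) =~ s/6/7/;

print "original intact: ", (code_is_intact($original, $digest) ? "yes" : "no"), "\n";
print "tampered intact: ", (code_is_intact($tampered, $digest) ? "yes" : "no"), "\n";
```

A digest alone only detects accidental or naive modification; it is the signature over the digest that prevents an attacker from simply recomputing it.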

On the server, the signer's pgp public key is used to look up a set of allowed operations in a ``rights'' database. This database contains a set of (signer, operations) pairs. The database can also contain a default set of operations to be allowed to any signer not explicitly named in the database. The operations allowed are used to initialize a ``Safe compartment'', a restricted perl environment defined by the Safe module. Safe restricts access to only the operations allowed in the rights database by trapping the compilation of all other perl operations. This is a very flexible authorization mechanism that lets the server's administrator specify how far each signer is trusted with the server's resources.

Perl is an interpreted language, but perl code is compiled by the interpreter into a syntax tree, and then optimizations are performed on the tree before actual execution. Safe hooks into the compilation phase, and causes compilation of all forbidden operations to fail. Code which uses the forbidden operations will not even begin to execute, as the entire perl file is compiled before execution begins.
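
This compile-time trapping can be observed with the Safe module alone. The following is a minimal sketch, not Penguin's own code: it assumes the Safe interface found in current perl distributions (Safe->new, permit_only, reval), which may differ from the patched 1996 module, and make_compartment and run_restricted are hypothetical helper names.

```perl
use strict;
use warnings;
use Safe;

# Build a compartment that permits only the listed operators and tags.
sub make_compartment {
    my @allowed = @_;
    my $cpt = Safe->new;
    $cpt->permit_only(@allowed);
    return $cpt;
}

# Compile and run $code inside the compartment; return (result, error).
sub run_restricted {
    my ($cpt, $code) = @_;
    my $result = $cpt->reval($code);
    return ($result, $@);
}

my $cpt = make_compartment(':default');

# Plain arithmetic uses only permitted operators.
my ($value, $err) = run_restricted($cpt, '$foo = 5; $bar = 6; $foo * $bar;');
print "arithmetic result: $value\n";

# The whole string is compiled before anything runs, so the forbidden
# open prevents even the harmless first statement from executing.
my ($unused, $open_err) =
    run_restricted($cpt, '$ran = 1; open(FILE, "</etc/passwd"); 2;');
print "open attempt: $open_err";

# $ran was never assigned inside the compartment.
my ($ran_value) = run_restricted($cpt, '$ran');
print "first statement ran: ", (defined $ran_value ? "yes" : "no"), "\n";
```

The error text reported for the second call has the same form as the ``trapped by operation mask'' message in the execution logs later in this paper.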

Perl contains an ``eval'' statement, which could potentially be a loophole in the Safe security implementation. eval can be used to evaluate arbitrary user data, including data transmitted by covert channels. However, the code to be eval'ed will also undergo the compile-execute cycle at the time of the call to eval (i.e., the original code's run-time), and Safe will trap any forbidden operations in this compilation phase as well.
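
A small experiment along these lines, again using the Safe module directly rather than Penguin, is sketched below. It assumes the current Safe interface, and it assumes that string eval must be granted explicitly through the entereval operator because the default operator set does not include it.

```perl
use strict;
use warnings;
use Safe;

my $cpt = Safe->new;           # starts with the :default operator set
$cpt->permit('entereval');     # additionally allow string eval itself

# The outer code compiles cleanly: it contains only an eval of a string
# constant, so the open operator stays hidden until run time.
my $code = q{
    my $ok = eval 'open(FILE, "</etc/passwd"); 1';
    $ok ? "inner code ran" : "inner code trapped";
};

my $verdict = $cpt->reval($code);
die "outer code failed: $@" if $@;
print "$verdict\n";
```

The inner string is compiled only when the eval executes, but that compilation still happens inside the compartment, so the forbidden operator is caught there.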

An Example

I modified the makeapplet.pl and runapplet.pl from the Penguin distribution to digitally sign the code using pgp instead of sending it with no signature. These programs now implement the same security mechanism as the distributed client-server programs. Here is the modified makeapplet:

<makeapplet source code>=
use Penguin;
use Penguin::Frame::Code;
use Penguin::Frame::Data;
use Penguin::Wrapper::PGP;
use Penguin::Wrapper::Transparent;

my $filename = shift;
my $password = shift;
open(CODEFILE, "<$filename") or die "cannot read $filename: $!";
{local $/ = undef; $codetosend = <CODEFILE>} # slurp the whole file
close(CODEFILE);

print("assembling...\n");
$frame = new Penguin::Frame::Code Wrapper => 'Penguin::Wrapper::PGP';

assemble $frame Password => $password, 
                Text     => $codetosend,
                Title    => "untitled program",
                Name     => "Humberto Ortiz Zuazaga";

open(FRAMEFILE, ">$filename.pen") or die "cannot write $filename.pen: $!";
print FRAMEFILE $frame->contents();
close FRAMEFILE;
print("...done\n");

Here is the modified code for runapplet:

<runapplet source code>=
use Penguin;
use Penguin::Rights;
use Penguin::Frame::Code;
use Penguin::Frame::Data;
use Penguin::Wrapper::PGP;
use Penguin::Wrapper::Transparent;
use Penguin::Compartment;

my $filename = shift;
my $password = shift; # bad idea, of course

open(PENGUINFILE, "<$filename") or die "cannot read $filename: $!";
{local $/ = undef; $penguinframe = <PENGUINFILE>} # slurp the whole frame
close(PENGUINFILE);

print("penguinframe is $penguinframe\n");
$frame = new Penguin::Frame::Code Text => $penguinframe;

($title, $signer, $wrapmethod, $code) = $frame->disassemble(
                                             Password => $password);

$rightsdb = new Penguin::Rights;

get $rightsdb;

$userrights = getrights $rightsdb User => $signer;

print<<"ENDOFFORM";
Title: $title
Signer: $signer
Rights: $userrights
Wrap Method: $wrapmethod
Code
--------------
$code
--------------
ENDOFFORM

$compartment = new Penguin::Compartment;
$compartment->initialize( Operations => $userrights );

$result = $compartment->execute( Code => $code );

if ($@) { # illegal code tried to execute
    $result = $@;
}

print "-------result was--------\n$result\n";

These modified programs were then used to test the Penguin modules by running several toy applications signed using several different pgp keys. The rights database used during these tests follows; the users are ordered from highest privilege to lowest. Names preceded by a colon are aliases for common groups of operators; the others are the actual perl operations of the same name (e.g., require).

<sample rights database>=
[Humberto Ortiz Zuazaga <humberto@momo.uthscsa.edu>]
:default :base_math :filesys_open require 
[Penguin Test Key]
:default
[default]
:base_core
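
How such a database maps a signer to concrete operators can be sketched in a few lines. This is not the Penguin::Rights implementation: parse_rights and rights_for are hypothetical helpers that assume the bracketed layout shown above, and the colon-prefixed aliases are expanded with the Opcode module shipped with current perl distributions.

```perl
use strict;
use warnings;
use Opcode qw(opset opset_to_ops);

# Parse text in the layout of the sample database: a "[signer]" header
# line followed by whitespace-separated operations and :tags.
sub parse_rights {
    my ($text) = @_;
    my (%rights, $signer);
    for my $line (split /\n/, $text) {
        if ($line =~ /^\[(.+)\]\s*$/) {
            $signer = $1;
            $rights{$signer} = [];
        }
        elsif (defined $signer) {
            push @{ $rights{$signer} }, split ' ', $line;
        }
    }
    return \%rights;
}

# Look up a signer, falling back to the [default] entry.
sub rights_for {
    my ($rights, $signer) = @_;
    my $entry = exists $rights->{$signer} ? $rights->{$signer}
                                          : $rights->{default};
    return @{ $entry || [] };
}

my $db = parse_rights(<<'END');
[Humberto Ortiz Zuazaga <humberto@momo.uthscsa.edu>]
:default :base_math :filesys_open require 
[Penguin Test Key]
:default
[default]
:base_core
END

my @signers = (
    'Humberto Ortiz Zuazaga <humberto@momo.uthscsa.edu>',
    'Penguin Test Key',
    'Somebody Unlisted',
);

for my $who (@signers) {
    my @granted = rights_for($db, $who);
    # Expand tags such as :filesys_open into individual operator names.
    my %op = map { $_ => 1 } opset_to_ops(opset(@granted));
    print $who, ' => ', join(' ', @granted), ' => open ',
        ($op{open} ? 'allowed' : 'trapped'), "\n";
}
```

A signer that is not listed falls through to the [default] entry, which is what makes the database usable for anonymous code.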

Makeapplet takes a perl source code file and signs it using pgp with the supplied password. I control the actual private key used to sign the code using the PGPPATH environment variable. In this first example, I use the key for ``Penguin Test Key'' to sign a perl script doing some arithmetic operations. These operations would be allowed by the :base_core set of permissions, so Penguin could easily be used to implement the client-server arithmetic programs from CS5523 assignment 1 in a strongly authenticated and secure fashion.

The output of runapplet shows the frame as received, the signer and associated set of rights, the code, and the results of the execution.

<Execution log for client-server arithmetic example>=
$ makeapplet foobar.perl mypenguinpassword
assembling...
...done
$ runapplet foobar.perl.pen
penguinframe is Penguin 2.99 P3.0
checksum
untitled program
Penguin Test Key
PGP
%%%delimiter%%%
-----BEGIN PGP MESSAGE-----
Version: 2.6.2

owHrZAhlZmUw7PU5sydSVNXoXrEnI6NLE+P/iMRHRRnBy/tnb/OJdI96cPpVRKZN
482L7Oen9WlUXd46JyY9Vjyq/MIWsUv9kkYCIgd5tkhc/KQ/d2HWFpM7XNcX2K/R
TWItLknJzGMAApW0/HwFWwVTay6VpMQiIMvMmosLLKilABKx5gIA
=CbP4
-----END PGP MESSAGE-----
%%%delimiter%%%

Title: untitled program
Signer: Penguin Test Key
Rights: :default
Wrap Method: PGP
Code
--------------
$foo = 5;
$bar = 6;

$foo * $bar;

--------------
-------result was--------
30
$ 

The next Penguin frame is signed using the Penguin Test Key, but the name in the frame is ``Humberto Ortiz Zuazaga'' (i.e., the frame is forged). The Penguin Test user does not have :filesys_open permissions, but I do. Because rights are looked up by the verified signer rather than by the name claimed in the frame, the code's attempt to open a file on the server should not be allowed.

<Execution of forbidden operations>=
$ makeapplet naughty.perl mypenguinpassword
assembling...
...done
$ runapplet naughty.perl.pen
penguinframe is Penguin 2.99 P3.0
checksum
untitled program
Humberto Ortiz Zuazaga
PGP
%%%delimiter%%%
-----BEGIN PGP MESSAGE-----
Version: 2.6.2

owHrZAhlZmUw7PUz2BMpqmp0r9iTkbGpl4lh1rTD99ff5Y67uKR7G2NI1KWzYsuF
pa5aydxd3Ch8+Zb1nznMtroZSwrntU7/myW4Yo6jkHCatAFHy1Pro1+P6XB6M6wx
S2ItLknJzGMAgvyC1DwNN08fVx0FJf3UkmT9gsTi4vIUJU1rLoeUxJJEBVsFG5C0
nTUXAA==
=If6/
-----END PGP MESSAGE-----
%%%delimiter%%%

Title: untitled program
Signer: Penguin Test Key
Rights: :default
Wrap Method: PGP
Code
--------------
open(FILE, "/etc/passwd");
@data = <FILE>;

--------------
-------result was--------
open trapped by operation mask at (eval 2) line 1.

$

I, however, am allowed to open any file on the server that the owner of the runapplet process has access to (for reading or writing).

<Obtaining the server's load average>=
penguinframe is Penguin 2.99 P3.0
checksum
untitled program
Humberto Ortiz Zuazaga
PGP
%%%delimiter%%%
-----BEGIN PGP MESSAGE-----
Version: 2.6.2

owHrZJjKzMpg2BtgtL5j0zQf0ztXGRm9a5j/6RZ0ru1fvWzLKv6uDfYlnE9etHqt
5nvSbqreUMtk4eDzZanLqV0XJ5dYN1n/5BZfUyl88L/fK63Vojo+ZpFcU9v3xTPM
WpDtdGlq03sbQ4Vp3rzm3CuYhf79qTh29X1c5ab1+fKvjrouuS531zR9huaWb69c
9Q+yLb224K7X928qqzq9Op8UrGlNYi0uScnMYwCC/ILUPAUNH39HFx0FJf2Covxk
/Zz8xJTEsnQlTWsuFRBbwVbBBqTAzporOSO/QAEsaM2loZKfl6qjoJKWWQah0kpS
U/M0gaqLC3IySxT0oxViSmLyYrX1dWBauIpSS0qL8hRAOq0B
=XB+C
-----END PGP MESSAGE-----
%%%delimiter%%%

Title: untitled program
Signer: Humberto Ortiz Zuazaga <humberto@momo.uthscsa.edu>
Rights: :default :base_math :filesys_open require  
Wrap Method: PGP
Code
--------------
open (LOAD, "/proc/loadavg");
$load = <LOAD>;
chop $load;
($one, $five, $fifteen) = split /[ \t\n]+/, $load;

return $one;
--------------
-------result was--------
0.10

If you want to allow any process to obtain the load averages (perhaps to implement a secure distributed scheduler), you can create a perl function that computes the load average, and export the function to the default Safe compartment. Since this code is compiled outside of the compartment, it has no restrictions on the operations it can execute. In a similar way, one could export special ``trusted'' versions of unsafe operations, such as reading or writing files. These trusted versions could prompt the user for confirmation when called from a web browser, or do further authentication. If these library functions were implemented correctly, allowing access to them would not in any way compromise the server. In particular, allowing access to a correctly implemented trusted function that writes files does not imply allowing the client to write arbitrary files. Perl's long history of use in setuid root programs has produced an extensive library of routines for handling ``tainted'' data, which could be used to vet arguments and data passed to the ``trusted'' routines.
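
The following sketch shows one way such a trusted routine might be exported, using the share method of the Safe module in current perl distributions rather than anything specific to Penguin. The routine name load_average is hypothetical, and a temporary file stands in for /proc/loadavg so that the sketch runs on any system.

```perl
use strict;
use warnings;
use Safe;
use File::Temp qw(tempfile);

# Stand-in for /proc/loadavg so the sketch does not depend on Linux.
my ($tmp_fh, $load_file) = tempfile(UNLINK => 1);
print {$tmp_fh} "0.10 0.20 0.30 1/100 12345\n";
close $tmp_fh;

# Trusted routine, compiled outside any compartment, so it may use open.
# It takes no arguments: the caller cannot choose which file is read.
sub load_average {
    open(my $fh, '<', $load_file) or return;
    my $line = <$fh>;
    close $fh;
    my ($one) = split ' ', $line;
    return $one;
}

my $cpt = Safe->new;             # default operator set: no file access
$cpt->share('&load_average');    # make the trusted routine visible inside

# Untrusted code may call the routine...
my $via_trusted = $cpt->reval('load_average()');
print "via trusted routine: $via_trusted\n";

# ...but still may not open files itself.
my $direct       = $cpt->reval('open(LOAD, "</proc/loadavg"); <LOAD>');
my $direct_error = $@;
print "direct open: $direct_error";
```

Giving the routine no parameters is the simplest way to keep the client from steering it toward arbitrary files; a routine that does accept arguments would need to validate them before use.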

Other Distributed Languages

Even in its current beta-test form, Penguin is the most secure and convenient of the distributed scripting language security models. Its combination of strong authentication and flexible authorization schemes is currently unparalleled.

In Java, applets are verified at compile time, then transmitted as compiled byte code to the remote computer. The byte code is executed in a restricted environment, and subjected to several verification steps before executing and at run-time. Currently, Java supports neither digital signatures nor encryption, although the developers have stated both will be supported in the future (the current version of Penguin does not support pgp encryption, although it was present in previous alpha versions, and the developer has said it will be supported). In addition, you cannot specify which Java opcodes an untrusted program can execute; it is an all-or-nothing affair. Recently, researchers at Princeton have uncovered a number of security holes in the current implementations of Java [4]. The researchers suppose that other languages would also be found to contain similar bugs if subjected to the same levels of scrutiny.

The current work on Penguin grew out of the author's experiments with safe-tcl, an extension to the tcl language to support remote execution of MIME messages. The ideas in safe-tcl were recently (as of tcl 7.5) folded back into the primary tcl distribution. Tcl now allows you to define multiple tcl interpreters, executing in parallel with the main interpreter. Any of these interpreters may be restricted, which means that only a default set of commands (similar to Penguin's :default operations) is visible in the interpreter's name space. Additional tcl commands may be made visible to the safe-tcl interpreter, including ``trusted'' commands, as in Penguin. As of this writing, there is no built-in way to add authentication to safe-tcl, although in principle a port of Penguin to tcl or the creation of another equivalent interface to pgp should not be a problem.

The interpreted language Python is also becoming a popular choice for system administration scripts and WWW cgi-bin programs. There is currently a WWW browser available called Grail, which is written in Python and can load and execute Python applets from WWW pages. Grail also uses this notion of a safe Python interpreter; in Python this is called the Restricted Execution Mode (REM). In REM, the Python interpreter enforces a set of standard restrictions (again by editing the interpreter's name space), but an unrestricted Python interpreter running the browser itself can make additional functionality visible to applets running inside REM. Python has no built-in support for authentication, although it does have a rudimentary authorization system where applets loaded from the same page can share information with each other based on the DNS domain they originated from. The user-contributed Python code library contains a Python class encapsulating the user interface to pgp. This library could easily be adapted to provide authentication as in Penguin.

Perhaps the most interesting entries into the distributed scripting fray are not scripting languages, but full-blown distributed operating systems. This makes technical sense, since all of the scripting language designers' goals seem to lead to an Internet where code travels from machine to machine and executes uniformly on all of them. The final solution should then be considered more as an Internet-wide operating system providing a seamless set of resources.

Legion [8] is an extension of the Mentat [7] distributed system. It is designed to be a world-wide distributed operating system where many users can share computing resources. The operating system is designed around distributed objects, and the security system will also revolve around objects. Legion objects export a public interface, and only the exported interface is accessible to other objects. In Legion there will be no system-wide notion of security; instead, each object can select the security policy and mechanism it will implement, and this policy could even vary on a method-by-method basis. It is hoped that several common security policies will be shipped as run-time classes that user-written code can inherit from. Thus it should be possible to inherit from a public-key strongly authenticated class. It is not clear what the current state of the Legion security implementation is, although the white paper is very encouraging.

The other contender in this category, Inferno, is even more mysterious. Bell Labs (now a part of Lucent) began work on this system apparently in response to Sun's Java. A white paper on Inferno is now available on the Internet [1]. Apparently, the Plan 9 team from Lucent is now working on a stand-alone operating system for use in PCs, palmtops, and TVs, and runnable as a user process under UNIX and NT. In all cases, Inferno apps will see a consistent environment, based on and extending UNIX and Plan 9's ``everything is a file'' metaphor, coupled with a uniform communication protocol to request both local and remote services. RPC is accomplished by writing requests to the file associated with the remote service. Security will be achieved by digitally signing the files associated with the service (it is unclear if requests will also be signed). Communications channels can be digested and encrypted (using public key cryptography) as well.

Summary

Secure distributed scripting languages are still an unrealized goal. Bugs and design flaws exist in all current implementations. Independent security audits have not been performed on any language other than Java. If distributed Internet applications are to become a reality, clearly stated security policies and trusted implementations are necessary. Although Sun has been willing to provide source licenses to researchers, it is likely that publicly available sources, like those for perl, tcl, Python, and Legion, will garner more trust over time.

Penguin, by being based on the well-trusted pgp program and other publicly available code, makes for a very good prototype system. Its authorization flexibility and strong authentication make it an excellent choice for the diverse needs of a general-purpose Internet scripting language. Perl's domination of the setuid root and cgi-bin programming market provides excellent verification of the security of well-designed perl programs. Safe and Penguin should be subjected to tests like those inflicted upon Java, to make sure the implementations correctly enforce the policies they purport to carry out.

In the future, distributed operating systems such as Legion and Inferno, especially if they can be run as user-level Unix processes, may well become the platform of choice, particularly if they provide easy access to authentication and encryption routines.

References

[1] AT&T, Inferno. Information available from the Internet at: <URL:http://plan9.bell-labs.com/inferno/infernosum.html>

[2] Felix Gallo, Penguin! Information available from the Internet at: <URL:http://www.eden.com/~fsg/penguin.html>

[3] John Ousterhout, Tcl. Information available from the Internet at: <URL:http://www.sunlabs.com:80/research/tcl/>

[4] Princeton University, Secure Internet Programming. Information available from the Internet at: <URL:http://www.cs.princeton.edu/sip/>

[5] Sun Microsystems, Java. Information available from the Internet at: <URL:http://java.sun.com/>

[6] Larry Wall, Perl. Information available from the Internet at: <URL:http://www.perl.com/>

[7] University of Virginia, Mentat. Information available from the Internet at: <URL:http://www.cs.virginia.edu/~mentat/>

[8] University of Virginia, Legion. Information available from the Internet at: <URL:http://www.cs.virginia.edu/~legion/>

[9] Guido van Rossum, Python. Information available from the Internet at: <URL:http://www.python.org/>

[10] Phil Zimmermann, PGP. Information available from the Internet at: <URL:http://www.ifi.uio.no/pgp/>


Most recent change: 2007/9/3 at 22:05