Researcher: Karl Crary
Research Area: Survivable Distributed Systems
Proposal: Trust-Free Garbage Collection for Mobile Code
In today's networked world, mobile code is all around us. Increasingly, our information infrastructure relies on code that is installed or updated in real time on devices for which downtime cannot be tolerated. This arrangement poses considerable security challenges, and those challenges are certain to grow as network-attached devices and other shared configurable systems become more ubiquitous. In most cases, these challenges are addressed today using what we may call extrinsic verification. Typically, the producer of a mobile software component will sign it, thereby making it possible for any device to verify that the component indeed was produced by the person or agency claimed. That person or agency is trusted to have implemented to component correctly and benignly, and therefore it may be safety installed.
This model of certification is “extrinsic" in the sense that it says nothing about the code itself, only about its producer. Extrinsic certification works satisfactorily today, but it is fundamentally unscalable. We cannot expect that all the mobile code we use will be produced by a small number of trusted authorities. Rapid growth in the software industry, in the open source movement, and in the prevalence of end-user programming is leading to an explosion in the number of software producers. Moreover, even trusted software producers have been known to make mistakes.
To achieve scalable and robust security in the presence of mobile code, we must move to intrinsic verification, wherein security is assured by mechanically inspecting the code itself. Intrinsic verification is scalable and robust because there is no need to trust anyone but the developers of the verification tools. Intrinsic verification is implemented in a variety of systems, including Java, Proof-Carrying Code, and Typed Assembly Language. Each of these systems is based on the observation that verifying safety unaided is too hard, and therefore the code producer should some information to assist the verifier. This information, which we will call a certificate, is tantamount to a machine-readable proof of safety. A software component accompanied by such a certificate is referred to as certified code.
All the certified code systems developed to date share one significant weakness: none of them allow certified software to manage its own memory. In most cases these systems mandate the use of particular trusted code to allocate memory and collect garbage. (Other systems impose even more severe restrictions, such as disallowing dynamic allocation entirely.) This is unfortunate, as not all applications are suitable for garbage collection, and even those that are may want to use a collector tailored for their own application, or simply a collector more modern than was available when the certified code system was deployed. Thus, the future of scalable mobile code is likely to require the ability to certify memory management code for safety. However, the technical obstacles are such that no one has yet succeeded in certifying the safety of any remotely practical garbage collector.
The proposed research will develop technology for practical trust-free memory management. Since garbage collectors are the most technically challenging memory managers to address, the proposed research will focus on type-safe garbage collection. In particular, it will develop a low-level type system for memory management code, implement in it at least one practical garbage collector suitable for use with high-level garbage-collected languages such as ML and Java, and deploy the collector in a grid computing setting.
Most prior work on type-safe garbage collection (but not all) has been based on high-level primitives tailored for garbage collection and on type structure designed to support those primitives. We agree that type theory is the right framework in which to consider the problem, but we suggest that it is profitable to begin at a very low level.
Type systems based on high-level GC primitives face two opposing pressures: They must be simple, because complicated type systems are difficult to use. On the other hand, their nature makes them complex, because they must internalize the complex invariants on which the primitives depend, and secondly, because they must provide invariants for all phases in a single type system. The need for simplicity limits the degree to which the type system can grow to provide more powerful GC primitives. Existing work has yet to produce a practical typed collector.
Instead, we propose to begin at a low level with a simple but powerful proof system in which it is possible to encode the invariants of a practical collector. Thus, GC invariants become part of the program (as they should be), rather than part of the type system. Moreover, there is no need to deal with invariants for distinct phases at the same time; each phase may state its own invariants.
The proposed effort will proceed in three stages: