Open Source and Outsourced Software Acceptance Evaluation and Assurance
According to data collected by CERT, 90% of reported security exploits involve engineering flaws in released software. Adi Shamir (the “S” in RSA), for example, stated that it is almost always easier—much easier—to attack bad software at the ends of an encrypted channel than to attack the channel itself. The CERT vulnerability data suggest, additionally, that the “badness” of the software more often resides at the lower “mechanical” levels of design and code rather than at the level of architecture or requirements. Indeed, the current Common Criteria (ISO 15408) provides a source of confidence concerning those higher levels, while not assisting significantly at the lower levels.
When source code is generally available, as in the case of open source systems, there is a perception of heightened vulnerability: adversaries appear able to use access to the source code to identify vulnerabilities and develop exploits. Open source or not, bad software is a major source of security risk. There are also significant concerns regarding acceptance evaluation for outsourced software development, whether offshore or not. Assurances based on process measures, complexity metrics, and defect counts are insufficient to establish the absence of vulnerabilities and malicious code. Indeed, some malicious code takes the appearance of an unintentionally introduced vulnerability, which creates plausible deniability for the perpetrator. This heightens the challenge of effective direct software evaluation.
Direct software evaluation includes three aspects: (1) means to develop a precise understanding of the code; (2) means to provide direct assurance of critical security and dependability attributes; and (3) process involvement or visibility by the acceptance evaluators in the development of the code sufficient to enable the first two aspects. (Insufficient visibility is a frequent complaint about the Common Criteria process, ISO 15408; DoD program managers, for example, call the desired property “visible development.”)
Successful open source systems are widely adopted and serve extensively in mission-critical roles for government and industry. The security concerns associated with them are substantive. For example, the open source Apache web server is the most widely adopted web server in the world (67% of all web sites); it is widely used both in government agencies and in e-commerce. There are many significant Java projects, including e-commerce and enterprise frameworks such as the Apache Jakarta project (e.g., Tomcat and Struts) and the widely adopted J2EE implementation JBoss (for e-commerce applications), as well as infrastructure tools such as Ant, JUnit, NetBeans, and Eclipse.
Among some open source proponents, there is a perception (“many eyes”) that a larger community of developers and expert users will contribute to making the code safe. This perception may be partly true, but “many eyes” and other code inspection approaches must be complemented by assurance approaches better suited to properties that are not locally manifest in code, which include many security-related properties. Security and dependability properties tend to have a non-local and sometimes nondeterministic character, and they can therefore defy traditional testing, inspections, and walk-throughs.
Proposed effort: We propose to address the challenge of bad software in the following ways:
1. Techniques and tools for models and assurance for exceptional conditions. We are developing a tool to ensure that exceptional conditions are handled consistently in security-critical systems. Correctly managing exceptional conditions is one of the most significant challenges of building secure software. Unexpected exceptions can directly lead to denial-of-service vulnerabilities if they can be triggered by input from an untrusted party. Furthermore, unexpected exceptions can interfere with program execution in more general ways, leading to broader vulnerabilities when exceptions disable security checks or leak secure information. We are building a tool that allows engineers to express and enforce high-level exception handling policies. These policies describe, at component granularity, which exceptions a component may raise. Policy enforcement ensures that components do not raise unexpected exceptions and that any exceptions raised are properly handled by clients.
Our tool goes significantly beyond the exception specifications in languages such as Java: it focuses on component-level policies rather than individual functions, supports both checked and unchecked exception types, and provides advanced tools for querying and refactoring the exception structure of a program. In this project, we propose extending the tool to focus on exception-related security vulnerabilities and applying it to find potential vulnerabilities in open-source software.
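To illustrate the policy idea described above, the sketch below (all names hypothetical, not our tool's actual syntax) shows a parsing component whose boundary permits only one declared exception type to escape, so that untrusted input cannot trigger the unexpected unchecked exceptions that lead to denial of service:

```java
// A minimal sketch of a component-level exception policy. The component
// QuoteParser and its ParseError type are hypothetical examples; the
// policy is "only ParseError may escape this component."
public class QuoteParser {

    /** The one exception type the component's policy permits to escape. */
    public static class ParseError extends Exception {
        ParseError(String msg, Throwable cause) { super(msg, cause); }
    }

    // Unchecked exceptions escaping here (NumberFormatException on bad
    // digits, NullPointerException on null input) are exactly the
    // denial-of-service risk described in the text.
    static int parseUnsafe(String untrusted) {
        return Integer.parseInt(untrusted.trim());
    }

    // Policy-conforming boundary: every failure is translated to the one
    // declared type, so clients can handle it uniformly.
    public static int parseChecked(String untrusted) throws ParseError {
        try {
            return parseUnsafe(untrusted);
        } catch (RuntimeException e) {
            throw new ParseError("rejected untrusted input", e);
        }
    }
}
```

The point of enforcement is the boundary method: a checker can verify statically that no `RuntimeException` escapes `parseChecked`, whereas Java's own `throws` clauses say nothing about unchecked types.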
2. Application of software assurance tools to security evaluation. Many security-critical systems are built from reusable components (especially libraries and frameworks), and in any such system, the security of the system as a whole depends on correct usage of the component parts. For example, the Java security model requires that SecurityManager functions be invoked before any use of protected resources. Other applications may require that an authentication function be invoked before security-critical operations are performed. Unfortunately, these requirements are often poorly documented and are not enforced by tools, leading to failures when the components are used incorrectly. We are building a tool that both aids engineers in documenting these requirements and automatically checks that a component is used correctly. This proposal will support the application of that tool to eliminate component-usage-related vulnerabilities in security-relevant open source code.
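The "authenticate before use" requirement mentioned above can be sketched as follows (the component and its method names are hypothetical illustrations, not an API from the proposal): the component tracks its own protocol state, so that misuse fails fast at the call site instead of silently skipping the security check.

```java
// A minimal sketch of a component-usage rule: authenticate() must be
// called (and succeed) before send(). SecureChannel is a hypothetical
// example component.
public class SecureChannel {
    private boolean authenticated = false;

    // In a real component this would validate the credential against an
    // authentication service; that logic is elided in this sketch.
    public void authenticate(String credential) {
        if (credential != null && !credential.isEmpty()) {
            authenticated = true;
        }
    }

    // Documented precondition: authenticate() must have succeeded first.
    public String send(String message) {
        if (!authenticated) {
            throw new IllegalStateException("send() called before authenticate()");
        }
        return "sent:" + message;
    }
}
```

A usage-checking tool of the kind described would aim to report the ordering violation statically, at the client call site, rather than relying on the runtime check shown here.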
Our approach is to augment existing, widely adopted tools with modeling and analysis capability, delivered in a manner that seems incremental or even trivial with respect to the experience of the working developer on deadline. We are guided in our design by three principles related to practicability: (1) Incrementality and early gratification. Any increment of effort we ask programmers to undertake should yield an essentially immediate reward in the form of bug finding, assurance creation, guidance in evolution, or model expression. (2) Familiar expression. Properties should be expressed tersely, using terminology already familiar to programmers. (3) Analysis cut points and composability. Components can be assured separately, and the assurances linked into chains of evidence.
3. Rigorous automated data collection. Open source server-side tools already embody substantial capability to automatically capture data and create rich links and metadata in the server-side code database. For example, every line of code in Mozilla (formerly Netscape) is flagged with the version in which it was last updated and the identity of the developer who committed the change. Developers transparently post changes to the server without needing to participate in this bookkeeping explicitly. With small enhancements, these capabilities, already in place, could be augmented to provide more effective traceability of code to developer actions, bugs, and, most importantly from our standpoint, models. (Tim Halloran, an ABD grad student working with us, has already done extensive preliminary feasibility studies.) Potential enhancements range from better tracking of the identities of developers to more precise linking of change management, bug/issue management, and version control. Augmentations such as these would enable more automatic tracking of the dispersed models, assertions, and code segments involved in reasoning about dependability and security properties.
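The per-line metadata described above is already emitted by CVS-style annotation tools; the sketch below extracts the revision/committer pair for each line. The line format here loosely follows `cvs annotate` output and should be treated as illustrative rather than exact:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A minimal sketch: parse annotation lines of the form
//   "<revision>   (<committer>   <date>): <source line>"
// into structured per-line provenance. The exact format varies by tool;
// this regex is an illustrative assumption, not a guaranteed schema.
public class Annotate {

    public record LineInfo(String revision, String committer, String code) {}

    private static final Pattern LINE =
        Pattern.compile("^(\\S+)\\s+\\((\\S+)\\s+\\S+\\):\\s?(.*)$");

    public static LineInfo parse(String annotated) {
        Matcher m = LINE.matcher(annotated);
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognized line: " + annotated);
        }
        return new LineInfo(m.group(1), m.group(2), m.group(3));
    }
}
```

Once parsed, these records are the raw material for linking lines of code to developer actions, bug reports, and models.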
4. Empirical and analytical evaluation of flaws. By analyzing the unusually rich data available in many projects, correlations and causal links can be identified between security flaws (or assurance challenges) and particular engineering practices, code patterns, APIs, and other engineering elements. For example, an ongoing study involving the PIs is identifying the particular semantic character of “hard bugs”: those that attract the extended attention of the most senior engineers and are the most challenging to diagnose and/or resolve. Our conjecture is that uninformed resolution of hard bugs is a common cause of unintended vulnerabilities and model violations. We will focus on the special issues of higher-level languages such as Java, C#, and Ada95. Proponents of these languages can claim that buffer overflows, for example, are prevented by memory management policies, but there can be other, subtler problems, for example with plugin frameworks (e.g., for Eclipse) and with explicit security policy management (e.g., implementations of Java's security manager abstraction and its checking policy for stack frames).
5. Targeted Security Dispatch. The rich data collected, including models and chain-of-evidence style assurance cases for security and dependability, can additionally support a new capability called Targeted Security Dispatch (TSD). The idea of TSD is to support early detection and management of dependability- and security-related flaws or issues. The database associates code committers with individual lines of code. It identifies the individuals involved in bug/issue diagnosis and resolution. It identifies the individuals creating models and building individual links in chains of assurance evidence (i.e., composable proof fragments). When a flaw is detected in a build (e.g., through testing or analysis) or reported in a bug/issue report, automated tools can therefore immediately contact exactly the individuals (and their managers) most likely to be involved in resolving the issue. Additionally, the tools can direct the attention of those individuals directly to the relevant code segments, bug/issue reports, and associated prior history and change management records. Underlying TSD are the data collection technologies described in item 3 above and the software understanding technologies described in item 6 below.
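The core TSD lookup can be sketched as follows (names and data are hypothetical illustrations): given the lines flagged by a build-time analysis or bug report, return the set of recorded committers to notify. A real system would also pull in bug/issue resolvers, model authors, and managers from the same database.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// A minimal sketch of Targeted Security Dispatch: map flagged lines of a
// file back to the developers recorded as having committed those lines.
public class Dispatch {
    // file name -> (line number -> committer recorded for that line)
    private final Map<String, Map<Integer, String>> committers = new HashMap<>();

    public void recordCommit(String file, int line, String committer) {
        committers.computeIfAbsent(file, f -> new HashMap<>()).put(line, committer);
    }

    // "A flaw was detected on these lines -- who should be contacted?"
    public Set<String> notifyFor(String file, Collection<Integer> flaggedLines) {
        Set<String> targets = new TreeSet<>();
        Map<Integer, String> byLine = committers.getOrDefault(file, Map.of());
        for (int line : flaggedLines) {
            String who = byLine.get(line);
            if (who != null) {
                targets.add(who);
            }
        }
        return targets;
    }
}
```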
6. Software understanding for security evaluation. Software structure deteriorates over time, increasing the difficulty of evaluating the software and also increasing the likelihood of introducing defects as the software is maintained and enhanced. Many development teams address this problem through aggressive code inspections and by refactoring modules and components that are frequently changed. Refactoring, however, introduces new risks, since it is itself an error-prone task.
The essential problem of code inspection is that code often contains complex dependencies of various kinds that are difficult to fully understand and take into account. We propose to consider the value of aggressive visualization tools for inspection and refactoring. Refactoring, because of its error-prone nature, serves as a kind of “test” of the power of the tools. Previous research on software visualization has generally been technology-driven: novel visualizations are designed and then, with a solution in hand, the research looks for problems for which the visualization provides assistance. It has also generally focused on individual, isolated visualizations rather than on a collection of linked visualizations that together form an environment for solving particular tasks, particularly attribute-focused inspection. Finally, visualizations have mostly been based only on surface structure, failing to take advantage of other sources of data, such as change history and deeper, more semantic analysis results (which we can extract from the Fluid tool), that contain rich, if latent, information about code dependencies. These dependencies can include, for example, pathways of information flow, both desired and undesired.
Our approach to the problem is to exploit new sources of information about dependencies and to create a collection of linked-view creation tools. How do developers and evaluators search for hidden dependencies? What sorts of relations do they exploit to find related code? What tools could they use, and how could they use them? What problems do they have when refactoring, and what sorts of dependencies do they overlook? Exploratory studies will guide this work. These tools, and the diverse information sources they rely on, will provide a technology basis for the ideas of Targeted Security Dispatch.
We will exploit the code change history as a source of information about dependencies. When files or functions have frequently been changed together in the past, this suggests the existence of unknown dependencies. We will provide a visualization that supports exploration of this information with linked-view creation tools. For example, after using the change history visualization to identify possible dependencies that represent points of evaluation interest, one might select the relevant files or functions and generate views of the selection in order to explore possible dependencies in more detail. The developer may want to create persistent views to refer to throughout the task. We will study how developers use the evolving tool set in order to identify particularly useful configurations and views that seem especially helpful in uncovering particular kinds of dependencies.
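The co-change heuristic described above can be sketched simply (the data model here is an illustrative assumption): count how often each pair of files appears in the same commit; pairs with high counts are candidates for hidden dependencies worth inspecting.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// A minimal sketch of co-change coupling: each commit is modeled as the
// set of files it touched, and every pair of files within a commit
// increments that pair's coupling count.
public class CoChange {
    public static Map<Set<String>, Integer> couplings(List<Set<String>> commits) {
        Map<Set<String>, Integer> counts = new HashMap<>();
        for (Set<String> commit : commits) {
            List<String> files = new ArrayList<>(commit);
            Collections.sort(files);   // deterministic pair enumeration
            for (int i = 0; i < files.size(); i++) {
                for (int j = i + 1; j < files.size(); j++) {
                    counts.merge(Set.of(files.get(i), files.get(j)), 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

A visualization layer would then render the high-count pairs, letting the evaluator drill down from a suspicious coupling to the underlying commits and code.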