Studying Software Vulnerabilities

Software vulnerabilities allow an attacker to reduce a system's Confidentiality, Availability, and Integrity by exposing information, executing malicious code, and undermining system functions that contribute to the overall system purpose and need. With new vulnerabilities discovered every day in a variety of applications and user environments, a systematic study of their characteristics is urgently needed for the following reasons:

  1. Information about past and new vulnerabilities accumulates at such a high rate that it is difficult to absorb and comprehend.

  2. Similar types of vulnerabilities are observed repeatedly, which indicates that past mistakes are not being learned from.

  3. As the scale and complexity of current software grow, developers will need better mental models to sense where vulnerabilities are likely to occur.

While the software development community has put significant effort into capturing the artifacts related to a discovered vulnerability in organized repositories, much of this information is not amenable to meaningful analysis and requires deep, manual inspection. The software assurance community has developed a body of knowledge that enumerates common weaknesses, but it is not readily usable for the study of vulnerabilities in specific projects and user environments.

This research combines the information sources from these two communities in a way that facilitates the study of vulnerabilities recorded in large software repositories. We introduce the notion of a semantic template to integrate the scattered information relevant to understanding and discovering vulnerabilities. We evaluate semantic templates by applying them to analyze vulnerabilities, both reported and hidden, as recorded in the software repositories of the Apache Web Server project. We use the term software repositories in a general sense that includes source code, version control data, bug reports, developer mailing lists and project development websites. We derive semantic templates from community standards such as the Common Weakness Enumeration (CWE) and Common Vulnerabilities and Exposures (CVE). We rely on standards in order to facilitate the adoption, sharing and interoperability of semantic templates.

CWE Formalization

Alloy Models of CWEs

Using GrammaTech CodeSurfer to explore test case structures

Injection Semantic Template, CWE v1.6

A. Preparation and collection phase:

A.1 Selection of content:

Since the CWE is continuously evolving, the template is based on version 1.6. The CWE uses “views” to integrate multiple categorizations of weaknesses that share several CWE categories. We use the two most prominent views of the CWE: the Development view (CWE-699), which is suited for practitioners in the SDLC, and the Research view (CWE-1000), which is suited for research purposes as it has a deep and abstract hierarchical structure.

A.2 Extraction of relevant weaknesses:

The next step is to identify the CWE category that describes the weakness of interest at the most abstract level. For the “Injection” weakness, CWE-74 “Failure to Sanitize Data into a Different Plane ('Injection')” is such a category. We call it the root category. Starting with the root category, we adopt four strategies to gather weaknesses related to it in the CWE Research and Development views.

a.  Navigate hierarchical relationships of the root category (“Parent” and “Child Of”).

b.  Navigate non-taxonomical relationships such as “Can Precede”, “Can Follow” and “Peer Of” in the CWE hyperlinked document [1].

c.  Keyword search on the CWE document [1] for weaknesses whose primary or extended description mentions the injection weakness. The keyword search is followed by exploration of the parent, sibling and child categories of each discovered CWE category, for relevance to the root category.

d.  Visualization of the root category and its related weaknesses, identified by automatically parsing the CWE specification, which is available in eXtensible Markup Language (XML) format [1].

Injection related CWEs

Dynamic visualization of injection related CWEs


While applying each strategy, the subject matter expert must use heuristics and some degree of judgment to decide whether to include a CWE category in the pool of relevant weaknesses.
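Strategies (a) through (c) amount to a graph walk plus a text search over the CWE specification. The following is a minimal sketch in Python, not the tooling used in this project. The XML layout, its element and attribute names (`Weakness`, `Relation`, `nature`, `target`), and the function names are hypothetical simplifications and do not follow the actual CWE schema. Only the ID and name of CWE-74 come from the text above; entries 9001 to 9003 and all descriptions are invented placeholders.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Toy catalog in a simplified, hypothetical layout (NOT the real CWE schema).
# Only entry 74 is taken from the text; IDs 9001-9003 are invented placeholders.
TOY_CATALOG = """
<Catalog>
  <Weakness id="74" name="Failure to Sanitize Data into a Different Plane ('Injection')">
    <Description>Placeholder description of the root category.</Description>
  </Weakness>
  <Weakness id="9001" name="Placeholder child weakness">
    <Description>A more specific placeholder entry.</Description>
    <Relation nature="ChildOf" target="74"/>
  </Weakness>
  <Weakness id="9002" name="Placeholder preceding weakness">
    <Description>A placeholder that can lead to injection.</Description>
    <Relation nature="CanPrecede" target="9001"/>
  </Weakness>
  <Weakness id="9003" name="Placeholder unrelated weakness">
    <Description>Nothing to do with the root category.</Description>
  </Weakness>
</Catalog>
"""


def load_catalog(xml_text):
    """Parse the toy XML into names, descriptions and a relationship graph."""
    root = ET.fromstring(xml_text)
    names, descriptions, edges = {}, {}, {}
    for weakness in root.iter("Weakness"):
        wid = weakness.get("id")
        names[wid] = weakness.get("name")
        descriptions[wid] = weakness.findtext("Description", default="")
        edges.setdefault(wid, set())
        for rel in weakness.iter("Relation"):
            nature, target = rel.get("nature"), rel.get("target")
            edges[wid].add((nature, target))
            # Record the inverse edge so a parent can reach its children.
            edges.setdefault(target, set()).add(("inverse:" + nature, wid))
    return names, descriptions, edges


def related_weaknesses(edges, root_id, natures):
    """Breadth-first walk from root_id along the allowed relationship natures."""
    allowed = set(natures) | {"inverse:" + n for n in natures}
    seen, queue = {root_id}, deque([root_id])
    while queue:
        current = queue.popleft()
        for nature, neighbor in edges.get(current, ()):
            if nature in allowed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {root_id}


def keyword_hits(names, descriptions, keyword):
    """Return IDs whose name or description mentions the keyword."""
    keyword = keyword.lower()
    return {
        wid for wid in names
        if keyword in names[wid].lower() or keyword in descriptions[wid].lower()
    }


names, descriptions, edges = load_catalog(TOY_CATALOG)
# Strategy (a): hierarchical relationships only.
print(sorted(related_weaknesses(edges, "74", ["ChildOf"])))                # ['9001']
# Strategies (a) and (b): add a non-taxonomical relationship.
print(sorted(related_weaknesses(edges, "74", ["ChildOf", "CanPrecede"])))  # ['9001', '9002']
# Strategy (c): keyword search over names and descriptions.
print(sorted(keyword_hits(names, descriptions, "injection")))              # ['74', '9002']
```

The output of such a walk is only a candidate pool. As noted above, the subject matter expert still decides which candidates belong in the template.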

Injection Semantic Template

Apache HTTP Server injection CVE Annotated View

Annotated Chrome Browser CVE-2010-2301


Buffer Overflow Semantic Template, CWE v1.6

Buffer overflow related CWEs

Dynamic visualization of buffer overflow related CWEs

Buffer overflow Semantic Template

Apache HTTP Server buffer overflow CVE annotated view

Annotated Chrome Browser CVE-2010-1773

Buffer Overflow Prezi

Buffer Overflow Semantic Template, CWE v2.0
(Under construction)

Buffer overflow semantic template, CWE v2.0

Information Leak Semantic Template

Information Leak Semantic Template

Information Leak related CWEs visualization 1

Information Leak related CWEs visualization 2

Information Leak related CWEs visualization 3

Recent Experiments with X3D and WebGL

Class level CWE visualization

Full CWE visualization

Full CWE visualization with text

CWE analytics (coming soon!)

Want hands-on experience?

Here are some example vulnerabilities. Why not fill in the semantic templates to study them?

CVE Samples for study using the Buffer Overflow semantic template
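One way to picture filling in a template is as a small record with one slot per aspect of a vulnerability. The sketch below assumes four slots, named after the aspects this page lists for a CVE: the software fault, the weakness characteristics, the resources and locations affected, and the consequences. The class name, field names, methods, the `CVE-YYYY-NNNN` placeholder and the example evidence string are all hypothetical and chosen for illustration; they are not the project's own template format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical slot names, one per aspect of a vulnerability.
SLOTS = ("software_fault", "weakness", "resource_location", "consequence")

Annotation = Tuple[str, Optional[str]]  # (evidence text, related CWE ID if known)


@dataclass
class SemanticTemplate:
    """A fill-in record for studying one CVE (illustrative structure only)."""
    cve_id: str
    software_fault: List[Annotation] = field(default_factory=list)
    weakness: List[Annotation] = field(default_factory=list)
    resource_location: List[Annotation] = field(default_factory=list)
    consequence: List[Annotation] = field(default_factory=list)

    def annotate(self, slot: str, evidence: str, cwe_id: Optional[str] = None) -> None:
        """Attach a piece of evidence (e.g. from a bug report or diff) to a slot."""
        if slot not in SLOTS:
            raise ValueError(f"unknown slot: {slot!r}")
        getattr(self, slot).append((evidence, cwe_id))

    def unfilled(self) -> List[str]:
        """List the slots that still have no evidence attached."""
        return [slot for slot in SLOTS if not getattr(self, slot)]


# Placeholder identifier and evidence; substitute a real sample when studying one.
template = SemanticTemplate("CVE-YYYY-NNNN")
template.annotate("software_fault", "placeholder: evidence copied from the fix commit")
print(template.unfilled())  # ['weakness', 'resource_location', 'consequence']
```

Tracking which slots remain empty gives a simple checklist of what still has to be located in the repository artifacts for a given sample.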



Presentation at the SwA Working Group Sessions - Summer 2011. June 28-30, 2011 at MITRE-1, 7525 Colshire Drive, McLean, VA 22102-7539.
Wednesday June 29 - Track B (1H300)
Session 1B: Helping Programmers Understand and Study Software Security Weaknesses: Semantic Templates – Robin Gandhi
To cope with growing software complexity, programmers need better mental models to sense the possibility of a vulnerability. There is no shortage of weakness enumerations and categorizations, but they are not in a form that facilitates human understanding and recall. For example, the CWE contains over 50 highly interrelated weakness definitions that must be understood just to comprehend the possibility of buffer overflows. This session introduces the development of Semantic Templates for the study of software vulnerabilities. Work on Semantic Templates has been ongoing at the University of Nebraska Omaha since 2010. Experiments indicate that using Semantic Templates definitely improves the programmer's ability to understand the CWEs related to the underlying software fault, the weakness characteristics, the resources and locations affected, and the consequences of a given CVE. Input from attendees will be used to guide the adoption of semantic templates in education and research while soliciting avenues for further growth.

Presentation at FISSEA’s 24th Annual Conference: "Bridging to the Future – Emerging Trends in Cybersecurity", March 15 - 17, 2011, National Institute of Standards and Technology, Gaithersburg, Maryland. Dr. Gandhi participated with Joe Jarzombek, Director for Software Assurance, National Cyber Security Division, Office of the Assistant Secretary for Cybersecurity and Communications. pptx

Siy, H., Wu, Y., Gandhi, R.A., Empirical Results on the Study of Software Vulnerabilities (NIER Track). In Proceedings of the 33rd International Conference on Software Engineering (ICSE 2011), Waikiki, Honolulu, Hawaii, May 21-28, 2011.

Gandhi, R.A., Siy, H., Wu, Y., Studying Security Vulnerabilities, CrossTalk, The Journal of Defense Software Engineering, Sept/Oct issue 2010.

Presentation at NebraskaCERT Cyber Security Forum, Standards in reporting software flaws: SCAP, CVE and CWE - Part 2, July 2010 Slides

Wu, Y., Gandhi, R.A., and Siy, H., “Using Semantic Templates to Study Vulnerabilities Recorded in Large Software Repositories”, Proc. of the 6th International Workshop on Software Engineering for Secure Systems (SESS'10) at the 32nd International Conference on Software Engineering (ICSE 2010), Cape Town, South Africa, 2010.

Project Members

Gandhi, Robin
Siy, Harvey
Wu, Yan


Website Maintained By: Robin Gandhi, Last updated on 14th July, 2011