Open Source Quality: Application Architecture Data from Over 2,500 Projects Made Available

Coverity, a company specializing in software integrity products and testing services, today announced the publication of application architecture data from over 2,500 popular open source software projects at http://www.scan.coverity.com.

All data is provided under a Creative Commons license, as an extension of Coverity’s work with the U.S. Department of Homeland Security (DHS), so individual developers and open source vendors can freely benefit from it.

Today the ‘state of the art’ in viewing code hierarchies is pretty poor: dependencies and structure are not easily visible. Coverity Architecture Analyzer provides visual, actionable data by reporting hierarchies and dependencies.

David Maxwell, Open Source Strategist at Coverity, gave me some background on the Architecture Library, a project that is part of the ‘Open Source Hardening Project’, and on Coverity’s Scan.

DHS pursues many avenues to improve software security, including reactive ones like firewalls, intrusion detection, and virus detection, but in our case they were looking for a proactive approach. How do you make software more stable, secure, and reliable, so that you don’t have to deal with exploits after they happen?

The Coverity portion was to run our static analysis software on open source projects and give the results to the developers who work on those projects, so they could fix the bugs we found. Our part of the contract was $100,000 per year for 3 years (just ending now).

So far, over 8,500 bugs identified by our analysis have been fixed in open source software. Some of the most recent examples were noted in the release of Wine 1.1.5 last week.

Only a fraction of the total funding ($100,000 out of $1.24 million) allowed Coverity to enlist more than 250 projects on the Scan Ladder, find bugs, make them available to the developers, track changes made to the software, and eventually confirm the fixes through ongoing analysis of the code as it is updated.

How are projects in the Scan Ladder selected?

The initial selection was made manually, by choosing the most popular and widely used projects: Linux, Apache, Samba, OpenSSL… After the project was up and running with the first 35 or so of those, we had many requests from projects that wanted to be included in the Scan.

Selection was done that way because bug fixes in widely used projects benefit more users. Of course, the analysis is only valuable if the bugs get fixed, so working with projects that express an interest makes good sense.

Given the results, I wonder if open source vendors will apply.