User Guide for CCCC

Table of Contents

Introduction

Report Contents

Counting Methods

Command line syntax

Configuration

Disclaimers

Getting CCCC

Introduction

CCCC is a tool for the analysis of source code in various languages (primarily C++), which generates a report in HTML format on various measurements of the code processed. Although the tool was originally implemented to process C++ and ANSI C, the present version is also able to process Java source files, and support has been present in earlier versions for Ada95. The name CCCC stands for 'C and C++ Code Counter'.

Measurements of source code of this kind are generally referred to as 'software metrics', or more precisely 'software product metrics' (as the term 'software metrics' also covers measurements of the software process, which are called 'software process metrics'). There is a reasonable consensus among modern opinion leaders in the software engineering field that measurement of some kind is probably a Good Thing, although there is less consensus on what is worth measuring and what the measurements mean.

CCCC has been developed as freeware, and is released in source code form. Users are encouraged to compile the program themselves, and to modify the source to reflect their preferences and interests.

The simplest way of using CCCC is just to run it with the names of a selection of files on the command line like this:

cccc my_types.h big.h small.h *.cc

Alternatively, for a complex hierarchy, the user could enter a command like this:

find . | cccc - (on Unix family platforms)

or

dir /b/s | cccc - (on DOS/Windows family platforms)

CCCC will process each of the files specified on the command line (using standard wildcard processing where appropriate), or, if the '-' option is specified, each of the files named in the standard input stream. For each file named, CCCC will examine the extension of the filename, and if the extension is recognized as indicating a supported language, the appropriate parser will run on the file. As each file is parsed, recognition of certain constructs will cause records to be written into an internal database. When all files have been processed, a report on the contents of the internal database will be generated in HTML format. By default the main summary HTML report is generated to the file cccc.html in a subdirectory called .cccc of the current working directory, with detailed reports on each module (i.e. C++ or Java class) identified by the analysis run.
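
The 'find . | cccc -' pattern can also be driven from a script. The following Python sketch is illustrative only: it walks a directory tree, keeps files whose extensions appear in a small hand-picked set, and formats the list one path per line for CCCC's standard input. The function names and the extension set are assumptions made for this example (CCCC applies its own extension mapping), and run_cccc assumes that a cccc executable is available on the PATH.

```python
import os
import subprocess

# Extensions kept by this sketch (an assumption made for illustration;
# CCCC itself decides what to parse from its own extension mapping).
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx", ".h", ".hh", ".hpp", ".java"}


def collect_source_files(root):
    """Return a sorted list of files under root with a listed extension."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1] in SOURCE_EXTENSIONS:
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def file_list_for_stdin(paths):
    """Format paths one per line, as 'find .' would print them."""
    return "".join(path + "\n" for path in paths)


def run_cccc(root):
    """Run 'cccc -' with the file list on standard input (needs cccc on PATH)."""
    payload = file_list_for_stdin(collect_source_files(root))
    return subprocess.run(["cccc", "-"], input=payload, text=True, check=False)
```

Filtering before the hand-off is optional, since CCCC checks each extension itself, but it keeps the list short on large trees.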

In addition to the summary and detailed HTML reports, the run will cause generation of corresponding summary and detailed reports in XML format, and a further file called cccc.db to be created. cccc.db will contain a dump of the internal database of the program in a format delimited with the character '@' (chosen because it is one of the few characters which cannot legally appear in C/C++ non-comment source code).

The report contains a number of tables identifying the modules in the files submitted and covering a range of measures; the tables and metrics are described in the Report Contents section below.

Some of the data presented in the report may be displayed in an emphasized form (either with a bold or italic font, or with a red or yellow background). These are items which have been identified as lying outside ranges which have been laid down as desirable for the particular items. A bold font or red background indicates a value which exceeds a threshold defined as being dangerous for that measure, while italic fonts and yellow backgrounds indicate values below the danger threshold but still above a second lower threshold which has been laid down to indicate cause for concern. The two thresholds are configurable by the user of the tool: see the section below on configuring metric treatment for more details.
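
The two-threshold rule described above can be sketched in a few lines of Python. This is not CCCC's own code: the function name and the returned labels are hypothetical, and treating a value exactly equal to a threshold as not exceeding it is an assumption. The sample thresholds (10 and 30) are the defaults listed for MVGf in the configuration dump later in this guide.

```python
def emphasis(value, lower, upper):
    """Classify a metric value against its two thresholds.

    Boundary handling (strictly greater than) is an assumption of this
    sketch, not something stated by the CCCC documentation.
    """
    if value > upper:
        return "danger"   # shown in bold or on a red background
    if value > lower:
        return "concern"  # shown in italics or on a yellow background
    return "normal"       # shown without emphasis


# Sample values checked against the default MVGf thresholds.
for mvg in (4, 12, 35):
    print(mvg, emphasis(mvg, lower=10.0, upper=30.0))
```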

Report Contents

The report generated by CCCC normally consists of six tables plus a table of contents at the beginning and some informational material about CCCC itself at the end.

Tables generated

Table name

Description

Project Summary

This table presents summary values of various measures over the body of source code submitted.

Procedural Summary

This table presents values of procedural measures summed for each module identified in the code submitted.

Procedural Details

This table presents values of the same procedural measures covered in the procedural summary report, but this time broken down within each module into the contributions of each member function of the module.

Structural Summary

This table presents counts of fan-in and fan-out relationships to each module identified, and a derived metric called the Henry-Kafura/Shepperd measure, which is calculated as the square of the product of the fan-in and fan-out counts.

Structural Details

This table presents lists of the modules contributing to the relationship counts reported in the structural summary.

Rejected Extents

This table presents a list of code regions which the analyser was unable to parse.

Metrics displayed

Tag

Metric Name

Description

LOC

Lines of Code

This metric counts the lines of non-blank, non-comment source code in a function (LOCf), module (LOCm), or project (LOCp). LOC was one of the earliest metrics to come into use (principally because it is straightforward to measure). It has an obvious relation to the size or complexity of a piece of code, and can be calibrated for use in prediction of maintenance effort, although concern has been expressed that use of this metric as a measure of programmer productivity may tend to encourage verbose programming practices and discourage desirable simplification.

MVG

McCabe's Cyclomatic Complexity

A measure of a body of code based on analysis of the cyclomatic complexity of the directed graph which represents the flow of control within each function. First proposed as a measure of the minimum number of test cases to ensure all parts of each function are exercised, it is now widely accepted as a measure for the detection of code which is likely to be error-prone and/or difficult to maintain.

COM

Comment Lines

A crude measure comparable to LOC of the extent of commenting within a region of code. Not very meaningful in isolation, but sometimes used in ratio with LOC or MVG to ensure that comments are distributed proportionately to the bulk or complexity of a region of code.

L_C,M_C

LOC/COM, MVG/COM

See above

FO,FOc,FOv
FI,FIc,FIv

Fan-out, Fan-in

For a given module A, the fan-out is the number of other modules which the module A uses, while the fan-in is the number of other modules which use A.
See the section below on counting methods for a discussion of the distinction between the variants on each of these measures.

HKS, HKSv, HKSc

Henry-Kafura/Shepperd measure

This metric is derived by squaring the product of the fan-in and fan-out of each module. The original Henry-Kafura measure, which has been described as a measure of 'information flow complexity', includes a term for the length of the module under consideration, but CCCC uses the measure as modified by Shepperd, which omits this term on the basis that it debases the measure by combining two attributes which can and should be separately measured.
Corresponding to the variants on the fan-in and fan-out measures described above, similar variants are calculated on this metric.

NOM

Number of modules

Number of modules identified in the project. See discussion below about what constitutes a module.

WMC

Weighted methods per class

This measure, proposed by Chidamber and Kemerer, is a count of the number of functions defined in a module multiplied by a weighting factor. The only weighting algorithm suggested in the original formulation is a uniform weighting of one unit per function.

REJ

Rejected lines

This is a measure of the number of non-blank, non-comment lines of code which were not successfully analysed by the parser. This is more of a validity check on the report generated than a metric of the code submitted: if the amount of code rejected was more than a small fraction (say 10%) of the total code processed, the meaningfulness of the numbers generated by the run must be in doubt.
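
The derived measures in the table above follow directly from the base counts. The short Python sketch below works through the arithmetic for a hypothetical module; the function names and the sample figures are invented for illustration, and returning None when there are no comment lines is an assumption of this sketch rather than documented CCCC behaviour.

```python
def hks(fan_in, fan_out):
    """Henry-Kafura/Shepperd measure: the square of fan-in times fan-out."""
    return (fan_in * fan_out) ** 2


def ratio(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is zero.

    How CCCC reports a ratio with no comment lines is not covered here;
    None is simply this sketch's placeholder.
    """
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical module: 120 lines of code, 40 comment lines,
# cyclomatic complexity 16, used by 3 modules and using 4.
loc, com, mvg = 120, 40, 16
print("L_C =", ratio(loc, com))  # LOC/COM
print("M_C =", ratio(mvg, com))  # MVG/COM
print("HKS =", hks(3, 4))        # (3 * 4) squared
```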

Counting methods

CCCC implements simple algorithms to calculate each of the measures presented. The algorithms are intended to present a useful approximation to the underlying quantities, rather than meticulously exact counting: in general, manual counts based on the same definitions should agree with CCCC to within 2-3%. If larger discrepancies are discovered, or if this level of agreement is not considered adequate, users are welcome to modify the source code to implement closer agreement, or to change the counting behaviour to reflect a desired basis of calculation. The basic definitions of each count are as follows:

Command-line syntax

The command line flags supported by CCCC are defined in the file ccccmain.cc. A brief usage message can be generated on standard output by entering the command 'cccc --help'. The same message will be generated on standard error if an invalid command line is entered. As of version 3.pre57 the text generated by this command is as follows:

Usage: 
cccc [options] file1.c ...  
Process files listed on command line.
If the filenames include '-', read a list of files from standard input.
This program is work in progress and is not well documented.
Please be prepared to refer to the source code for the 
meaning of some options.
Options:
--help                   * generate this help message
--outdir=<dname>         * directory for generated files
                           (default=.cccc)
--html_outfile=<fname>   * name of primary HTML report generated 
                           (default=<outdir>/cccc.html)
--xml_outfile=<fname>    * name of primary XML report generated 
                           (default=<outdir>/cccc.xml)
--db_infile=<fname>      * preload internal database from named file
                           (default=no initial content)
--db_outfile=<fname>     * save internal database to named file
                           (default=<outdir>/cccc.db)
--opt_infile=<fname>     * load options from named file
                           (default=use compiled-in option values, 
                           refer to cccc_opt.cc for option information)
--opt_outfile=<fname>    * save options to named file
                           (default=<outdir>/cccc.opt)
--lang=<string>          * use language specified for files specified 
                           after this option 
                           languages supported are c,c++,java
                           (default=use language/extension mapping 
                           controlled by options)
--report_mask=<hex>      * control report content 
                           (refer to ccccmain.cc for mask values) 
--debug_mask=<hex>       * control debug output content 
                           (refer to ccccmain.cc for mask values)
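
When CCCC is invoked from a script, the options above can be assembled programmatically. The Python sketch below is a hypothetical helper, not part of CCCC: it uses only the --outdir and --lang options shown in the usage message, and places --lang before the file names because the message says that it applies to files specified after it.

```python
def build_cccc_command(files, outdir=None, lang=None):
    """Assemble an argument list for cccc from the documented options."""
    command = ["cccc"]
    if outdir is not None:
        command.append("--outdir=" + outdir)
    if lang is not None:
        # --lang applies to the files named after it, so it precedes them.
        command.append("--lang=" + lang)
    command.extend(files)
    return command


# Example: treat two files as C++ and write the reports to 'reports'.
print(build_cccc_command(["widget.inc", "gadget.inc"],
                         outdir="reports", lang="c++"))
```

The resulting list can be passed to subprocess.run, which avoids shell quoting problems with unusual file names.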

Configuration

Much of the behaviour of CCCC can be controlled by a stream of configuration data. The file cccc_opt.cc contains the default value for this stream, which can be dumped using the --opt_outfile qualifier. The resulting file can then be edited to customize behaviour and loaded into a later run of cccc using the --opt_infile qualifier. As of version 3.pre57, the contents of the file dumped by the command cccc --opt_outfile=cccc.opt are as follows:

CCCC_FileExt@.C@c++.ansi@
CCCC_FileExt@.CC@c++.ansi@
CCCC_FileExt@.CPP@c++.ansi@
CCCC_FileExt@.CXX@c++.ansi@
CCCC_FileExt@.H@c++.ansi@
CCCC_FileExt@.H++@c++.ansi@
CCCC_FileExt@.HH@c++.ansi@
CCCC_FileExt@.HPP@c++.ansi@
CCCC_FileExt@.HXX@c++.ansi@
CCCC_FileExt@.J@java@
CCCC_FileExt@.JAV@java@
CCCC_FileExt@.JAVA@java@
CCCC_FileExt@.c@c.ansi@
CCCC_FileExt@.c++@c++.ansi@
CCCC_FileExt@.cc@c++.ansi@
CCCC_FileExt@.cpp@c++.ansi@
CCCC_FileExt@.cxx@c++.ansi@
CCCC_FileExt@.h@c++.ansi@
CCCC_FileExt@.h++@c++.ansi@
CCCC_FileExt@.hh@c++.ansi@
CCCC_FileExt@.hpp@c++.ansi@
CCCC_FileExt@.hxx@c++.ansi@
CCCC_FileExt@.j@java@
CCCC_FileExt@.jav@java@
CCCC_FileExt@.java@java@
CCCC_MetTmnt@8.3@999999.000000@999999.000000@0@8@3@General format for fixed precision 3 d.p.@
CCCC_MetTmnt@CBO@12.000000@30.000000@0@6@0@Coupling between objects@
CCCC_MetTmnt@COM@999999.000000@999999.000000@0@6@0@Comment lines@
CCCC_MetTmnt@COMper@999999.000000@999999.000000@0@6@3@Comment lines (averaged)@
CCCC_MetTmnt@DIT@3.000000@6.000000@0@6@0@Depth of Inheritance Tree@
CCCC_MetTmnt@FI@12.000000@20.000000@0@6@0@Fan in (overall)@
CCCC_MetTmnt@FIc@6.000000@12.000000@0@6@0@Fan in (concrete uses only)@
CCCC_MetTmnt@FIv@6.000000@12.000000@0@6@0@Fan in (visible uses only)@
CCCC_MetTmnt@FO@12.000000@20.000000@0@6@0@Fan out (overall)@
CCCC_MetTmnt@FOc@6.000000@12.000000@0@6@0@Fan out (concrete uses only)@
CCCC_MetTmnt@FOv@6.000000@12.000000@0@6@0@Fan out (visible uses only)@
CCCC_MetTmnt@IF4@100.000000@1000.000000@0@6@0@Henry-Kafura/Shepperd measure (overall)@
CCCC_MetTmnt@IF4c@30.000000@100.000000@0@6@0@Henry-Kafura/Shepperd measure (concrete)@
CCCC_MetTmnt@IF4v@30.000000@100.000000@0@6@0@Henry-Kafura/Shepperd measure (visible)@
CCCC_MetTmnt@LOCf@30.000000@100.000000@0@6@0@Lines of code/function@
CCCC_MetTmnt@LOCm@500.000000@2000.000000@0@6@0@Lines of code/single module@
CCCC_MetTmnt@LOCp@999999.000000@999999.000000@0@6@0@Lines of code/project@
CCCC_MetTmnt@LOCper@500.000000@2000.000000@0@6@3@Lines of code/average module@
CCCC_MetTmnt@L_C@7.000000@30.000000@20@6@3@LOC/COM Lines of code/comment line@
CCCC_MetTmnt@MVGf@10.000000@30.000000@0@6@0@Cyclomatic complexity/function@
CCCC_MetTmnt@MVGm@200.000000@1000.000000@0@6@0@Cyclomatic complexity/single module@
CCCC_MetTmnt@MVGp@999999.000000@999999.000000@0@6@0@Cyclomatic complexity/project@
CCCC_MetTmnt@MVGper@200.000000@1000.000000@0@6@3@Cyclomatic complexity/average module@
CCCC_MetTmnt@M_C@5.000000@10.000000@5@6@3@MVG/COM McCabe/comment line@
CCCC_MetTmnt@NOC@4.000000@15.000000@0@6@0@Number of children@
CCCC_MetTmnt@WMC1@30.000000@100.000000@0@6@0@Weighting function=1 unit per method@
CCCC_MetTmnt@WMCv@10.000000@30.000000@0@6@0@Weighting function=1 unit per visible method@
CCCC_Dialect@c++.mfc@BEGIN_MESSAGE_MAP@start_skipping@
CCCC_Dialect@c++.mfc@END_MESSAGE_MAP@stop_skipping@
CCCC_Dialect@c++.stl@__STL_BEGIN_NAMESPACE@ignore@
CCCC_Dialect@c++.stl@__STL_END_NAMESPACE@ignore@

Configuration of the extension/language mapping

Records in the configuration stream of type CCCC_FileExt control the mapping of file extensions to languages.
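
As an illustration of how these records are laid out, the Python sketch below splits an '@'-delimited stream into fields and builds an extension-to-language table from the CCCC_FileExt records. It is a reading aid rather than CCCC's own parser, and the function names are hypothetical. The dump lists upper-case and lower-case extensions separately, so this sketch uses an exact-match lookup; whether CCCC matches in the same way is an assumption.

```python
import os

SAMPLE = """\
CCCC_FileExt@.c@c.ansi@
CCCC_FileExt@.cc@c++.ansi@
CCCC_FileExt@.h@c++.ansi@
CCCC_FileExt@.java@java@
"""


def parse_records(text):
    """Split an '@'-delimited configuration stream into lists of fields."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split("@")
        if fields[-1] == "":
            fields.pop()  # each record in the dump ends with a trailing '@'
        records.append(fields)
    return records


def extension_map(records):
    """Build {extension: language} from the CCCC_FileExt records."""
    return {f[1]: f[2] for f in records
            if f[0] == "CCCC_FileExt" and len(f) >= 3}


def language_for(filename, mapping):
    """Look up the language for a filename, or None if it is unmapped."""
    return mapping.get(os.path.splitext(filename)[1])


mapping = extension_map(parse_records(SAMPLE))
for name in ("main.c", "Widget.java", "notes.txt"):
    print(name, language_for(name, mapping))
```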

Treatment of metric values

Records in the configuration stream of type CCCC_MetTmnt control the treatment of values for each of the metrics defined by CCCC.
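
The following Python sketch shows one way to read a CCCC_MetTmnt record. The field meanings are inferred from the dump above, not taken from the CCCC source, and should be treated as assumptions: the third and fourth fields are read as the lower and upper thresholds described in the Introduction, the sixth and seventh as a display width and number of decimal places (the '8.3' record carries the values 8 and 3), and the last as a description. The fifth field is kept as text and left uninterpreted. The names used are hypothetical.

```python
from collections import namedtuple

# Field names are this sketch's guesses at the record layout.
MetricTreatment = namedtuple(
    "MetricTreatment",
    "tag lower upper field5 width precision description",
)


def parse_treatment(line):
    """Parse one CCCC_MetTmnt record into a MetricTreatment tuple."""
    fields = line.strip().split("@")
    if fields[0] != "CCCC_MetTmnt" or len(fields) < 8:
        raise ValueError("not a CCCC_MetTmnt record: " + line)
    return MetricTreatment(
        tag=fields[1],
        lower=float(fields[2]),
        upper=float(fields[3]),
        field5=fields[4],  # meaning not inferred here
        width=int(fields[5]),
        precision=int(fields[6]),
        description=fields[7],
    )


def format_value(treatment, value):
    """Format a value using the (assumed) width and precision fields."""
    return "{:{w}.{p}f}".format(value, w=treatment.width, p=treatment.precision)


record = "CCCC_MetTmnt@MVGf@10.000000@30.000000@0@6@0@Cyclomatic complexity/function@"
treatment = parse_treatment(record)
print(treatment.tag, treatment.lower, treatment.upper, treatment.description)
```

To change a threshold, edit the corresponding number in the file written by --opt_outfile and load the file into a later run with --opt_infile.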

Ignoring compiler-specific keywords

Records in the configuration stream of type CCCC_Dialect control the handling of dialect-specific pseudo keywords by the CCCC parsers.

Disclaimers

CCCC was produced as an artifact of an academic research project. The primary motivation was to provide a platform for the exploration of issues related to metrics. The program is not now, and will never become, a commercial standard supported product. While CCCC attempts to recover from parse failures, there are some language constructs which cause crashes, and others which result in code going unanalyzed. If CCCC does not report syntax errors and terminates normally, it is likely that all files have been analyzed; otherwise it is strongly recommended that the user perform some kind of independent check on the quantity of code ignored, rather than relying on CCCC's own report on 'Rejected Extents'.

There is one further important disclaimer. As noted above, the primary motivation for the development of CCCC was to aid an academic project to investigate the use of metrics. Over the five years or so the project was running, various research activities were conducted, culminating in a practical experiment into the value of metric analysis data in a simulated software engineering task. The aim of this final experiment was to attempt to demonstrate a benefit from the use of such data; its conclusion was that, at best, the presence of a benefit was "not proven". The research project, including the design and outcomes of the final experiment, is described at http://www.fchs.ecu.edu.au/~tlittlef, which includes a link to download a PDF of the final PhD thesis arising from the project.

The CCCC project is now dormant. There are no plans for new releases, either to add new features to the program or to fix existing defects. The current version of the program is released under the GNU General Public License, giving users the right to work on the source code to address any specific issues they have. The project is hosted on sourceforge.net. The primary developer, Tim Littlefair, can be contacted by email and will be happy to provide advice and encouragement. Contact details appear on the SourceForge website.

Getting CCCC

The best place to look for information about CCCC is the CCCC home page at http://cccc.sourceforge.net.

CCCC downloads are accessible via the standard SourceForge project hierarchy starting at http://sourceforge.net/projects/cccc. SourceForge also hosts mailing lists where new versions are announced and a bug tracker database for the project.

The CCCC distribution includes a version of the Purdue Compiler Construction Toolset (PCCTS) originally created by Terence Parr and coworkers at Purdue University, later maintained by Tom Moog. Many thanks to Terence, his colleagues and Tom for developing this excellent tool, and for releasing it under terms which make it possible for it to be included in the CCCC distribution.

The Win32 installer package for CCCC is created using version 2.0.18 of the "My Inno Setup Extensions" package by Martijn Laan, based on "Inno Setup" by Jordan Russell. This package can be downloaded from http://www.wintax.nl/isx

The program will also require a C++ compiler to build. Past versions have been buildable with various versions of the GNU C++ compiler and/or Microsoft Visual C++, although the code is intended to be portable to a range of modern C++ compilers (with a bit of work in some cases, as the original code base dates back to times before the ANSI standardisation of the C++ language). The reference build tools for the current version are GCC version 3.3 and the freely distributed Microsoft Visual C++ Toolkit 2003. See http://msdn.microsoft.com/visualc/vctoolkit2003/ for details.