At the class level we look not only at metrics that measure aspects of the class itself, but also at metrics that give us information on the interaction between classes. Metrics that measure these class interactions tell us far more about our design than about our code. Some of the metrics tell us how good our ‘division of labour’ is between our methods, while others tell us how much a change to a particular class will affect code in other classes. The ideal is that a change to one class should affect as few other classes as possible, and affect them as little as possible. Where classes do have a high level of dependency on one another they should be in the same package – but we’ll talk about that when we come to look at the package level metrics.
At a basic level we are interested in metrics accumulated from the method-related metrics, e.g. the numbers of Methods and Statements in the class. We may also be interested in the Total, Average (per method) and Maximum Cyclomatic Complexity and the Total Halstead Effort. We would also be interested in the Maintainability Index at this level. We use these metrics simply as a ‘targeting’ device to direct us towards classes that probably contain methods that contribute heavily to them. Generally the changes that we need to make to improve these metrics will be at the individual method level.
Examples of the core ‘intra-class’ metrics that we might collect are –
LCOM – Lack of Cohesion of Methods – This metric measures the correlation between the methods and the local instance variables of a class. High cohesion indicates good class subdivision. Lack of cohesion, or low cohesion, increases complexity. Classes with low cohesion could probably be subdivided into two or more subclasses with increased cohesion. It is calculated as the ratio of methods in a class that do not access a specific data field, averaged over all data fields in the class. The LCOM version that we use in JHawk is LCOM* as defined by Henderson-Sellers. This algorithm should produce results in the range 0 to 1, with zero representing perfect cohesion (each method accesses all attributes). However, we have noticed that some values exceed 1. This can occur in three scenarios –
- Where there are no instance attributes defined in the class (i.e. it inherits them all from superclasses). We view this as legitimate since the inheriting class may be modifying or adding behaviour – in this case we set LCOM to 0.
- Where there are no references to instance attributes but there are instance attributes defined. We view this as bad – we have created attributes that are not referred to within the class. This can only mean that they are either not used or are used in a subclass, and if the latter then they should be defined in the subclass. We therefore set LCOM to the number of instance variables. The only other case where attributes are defined but not used is a class that defines static constants for use by other classes. We don’t make an exception for this, but examination of the class by eye (or of the class name – most people put ‘Constants’ in the name of such classes) should confirm if this is the case. Some implementations of LCOM don’t count static variables – we took the decision to leave them in (mainly because naming conventions cover the legitimate case above).
- Where the number of references to attributes is less than the number of attributes. In this case some of the attributes are not being referenced. This is really a less extreme case of the second scenario, so we let the algorithm take its course and let LCOM ride up above 1. This situation could arise where subclasses of the class in question share a number of attributes. This class is the logical place to put these attributes, but it is possible that only the subclasses will ever reference them. This issue will not arise if accessors are used for the variables and are placed in the same class.
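The three scenarios above can be illustrated with a short sketch. It assumes the commonly published form of the Henderson-Sellers formula, LCOM* = (m - mean(mu)) / (m - 1), where m is the number of methods and mu is the number of methods that access a given attribute. The class name LcomSketch, the method lcomStar and its input format are hypothetical, and the handling of a class with a single method is our own assumption, so treat this as an illustration rather than as JHawk's actual code.

```java
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; not JHawk's implementation. */
final class LcomSketch {

    /**
     * @param methodCount           number of methods in the class (m)
     * @param accessorsPerAttribute one entry per attribute: the names of the
     *                              methods that access that attribute
     */
    static double lcomStar(int methodCount, List<Set<String>> accessorsPerAttribute) {
        int attributeCount = accessorsPerAttribute.size();

        // Scenario 1: no instance attributes defined in the class -> 0.
        if (attributeCount == 0) {
            return 0.0;
        }

        int totalAccesses = 0;
        for (Set<String> accessors : accessorsPerAttribute) {
            totalAccesses += accessors.size();
        }

        // Scenario 2: attributes defined but never referenced
        // -> LCOM is set to the number of instance variables.
        if (totalAccesses == 0) {
            return attributeCount;
        }

        // Assumption: the formula divides by (m - 1), so it is undefined
        // for a single method; we report 0 in that case.
        if (methodCount <= 1) {
            return 0.0;
        }

        // Scenario 3 needs no special handling: attributes that are never
        // accessed pull the mean down and the result rises above 1.
        double meanAccessors = (double) totalAccesses / attributeCount;
        return (methodCount - meanAccessors) / (methodCount - 1.0);
    }
}
```

For example, a class with three methods and two attributes, where one attribute is accessed by a single method and the other by none, gives a mean of 0.5 and an LCOM* of (3 - 0.5) / (3 - 1) = 1.25.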
UWCS – Unweighted Class Size – This is calculated as the number of methods plus the number of attributes of a class. Smaller class sizes usually indicate a better-designed system with better-distributed responsibilities. In other words, you didn’t just stuff all the functionality into one big class. It’s difficult to set hard and fast rules about this, but you should look carefully at classes where UWCS is above 100.
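As a rough illustration of the UWCS count, the sketch below uses Java reflection on a compiled class. This is an assumption made for brevity: JHawk analyses source code, and the names UwcsSketch, Account and uwcs are hypothetical. We also assume that constructors are not counted as methods and that compiler-generated (synthetic) members are skipped.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

/** Illustrative sketch only; not JHawk's implementation. */
final class UwcsSketch {

    /** A small sample class to measure: 2 attributes and 3 methods. */
    static class Account {
        String owner;
        long balance;

        void deposit(long amount) { balance += amount; }
        void withdraw(long amount) { balance -= amount; }
        long balance() { return balance; }
    }

    /** UWCS = declared methods + declared attributes of the class itself. */
    static int uwcs(Class<?> type) {
        int count = 0;
        for (Method method : type.getDeclaredMethods()) {
            if (!method.isSynthetic()) {
                count++;
            }
        }
        for (Field field : type.getDeclaredFields()) {
            if (!field.isSynthetic()) {
                count++;
            }
        }
        return count;
    }
}
```

Calling UwcsSketch.uwcs(UwcsSketch.Account.class) counts the three methods and two attributes of the sample class, far below the threshold of 100 mentioned above.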
From a design perspective the ‘inter-class’ metrics are more interesting. The first group of these metrics are those related to Coupling, i.e. how much one class is dependent on others and vice versa. Excessive coupling increases sensitivity to changes in other parts of the design and makes a module harder to understand, and therefore to change, since it is interrelated with other modules. Designing systems to reduce coupling between modules improves modularity and promotes encapsulation. Coupling measurements give us a sense of how difficult it will be to make changes to our system. If coupling is high we run a greater risk of ‘breaking’ something in one area of our code while we are trying to ‘improve’ something in another area.
One measure of Coupling is RFC (Response For Class). This measures the complexity of the class in terms of method calls. It is calculated by adding the number of methods in the class (not including inherited methods) to the number of distinct methods called by the methods in the class (each called method is counted only once, even if it is called from several different methods). There is only one other common approach to RFC, and that is a version which includes the full call graph of any method called from the originating method, i.e. method calls are totalled recursively until the end of the call stack is reached. This introduces difficulties in deciding which methods should be included (e.g. should those in the JDK or similar libraries be included?). JHawk follows the standard, non-recursive definition.
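The standard calculation described above can be sketched as follows. This is a minimal illustration, not JHawk's code: the names RfcSketch and rfc and the map-based input are hypothetical, and we assume that the calls made by each method have already been extracted from the source. The sketch follows the wording above literally, so it simply adds the two counts and does not check whether a called method is also one of the class's own methods.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; not JHawk's implementation. */
final class RfcSketch {

    /**
     * @param callsByMethod maps each method declared in the class (inherited
     *                      methods excluded) to the methods that it calls
     */
    static int rfc(Map<String, List<String>> callsByMethod) {
        // Collect every called method once, however many callers it has.
        Set<String> distinctCalls = new HashSet<>();
        for (List<String> calls : callsByMethod.values()) {
            distinctCalls.addAll(calls);
        }
        // RFC = methods in the class + distinct methods called.
        return callsByMethod.size() + distinctCalls.size();
    }
}
```

For a class with two methods, save (calling Logger.info and Dao.write) and load (calling Logger.info and Dao.read), there are three distinct called methods, so RFC is 2 + 3 = 5.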
Message Passing Coupling (MPC) – This metric measures the number of messages passing among objects of the class. A larger number indicates increased coupling between this class and other classes in the system. This makes the classes more dependent on each other, which increases the overall complexity of the system and makes the class more difficult to change.
There are five other coupling metrics about which there is considerable confusion in the available academic literature. All of them relate to coupling at the class level. Coupling is a representation of the references between one class and another. A class can reference another by accessing the class itself, an instance of the class or a variable related to the class. A coupling can exist because a class references another or because a class is referenced by another. If a class is referenced a number of times it is still only counted once. The five metrics whose definitions are apparently in dispute are –
- CBO – Coupling Between Objects
- Fan Out
- Fan In
- Efferent Coupling (Ce)
- Afferent Coupling (Ca)
The original CBO definition by Chidamber & Kemerer was the number of classes that a class referenced plus the number of classes that referenced the class. If a class appeared among both the referenced and the referencing classes it was only counted once. There have been a number of interpretations of CBO in recent years (including some generated by the original authors) – the most commonly used of these is that CBO is the number of classes that a class references (i.e. Fan Out as defined below).
Fan Out is defined as the number of other classes referenced by a class. Most differences in interpretation hang on the definition of ‘references’ – this is often looser than the tight definition outlined in our initial paragraph above. Fan In is the number of other classes that reference a class.
Efferent Coupling is viewed as equivalent to Fan Out and Afferent Coupling to Fan In. Definitions of Afferent and Efferent Coupling tend to be stricter than those for Fan In and Fan Out.
JHawk provides Fan In and Fan Out measures and takes the view that CBO is equivalent to Fan Out. We have also included CBO as per the ‘classic’ Chidamber & Kemerer definition. We follow the definition of ‘references’ as outlined in the first paragraph of this section.
The primary reason for taking the view that Fan Out is equivalent to CBO is that when we list metrics we are viewing them from the class perspective, and we will take action based on the metrics that we see. If we measure coupling as the total of the classes referenced by a class and the classes that reference it, then a high (‘classic’) CBO figure could simply be the result of a large number of classes referencing it. If you are the programmer responsible for that class but not for the referencing classes, then there is nothing that you can do to the class itself to reduce the CBO figure. Using Fan Out shows those classes that make a large number of class references and allows you to tackle the issue by reducing the number of classes that the class references. There is also a view that high Fan Out represents coupling to other classes/objects and is thus undesirable, while high Fan In represents good object design and a high level of reuse. Taking this view would suggest that metrics which combine the two additively are meaningless, as the two measures effectively negate each other.
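The difference between Fan Out, Fan In and the ‘classic’ CBO can be made concrete with a small sketch. It is an illustration under stated assumptions rather than JHawk's code: the names CouplingSketch, fanOut, fanIn and classicCbo are hypothetical, the reference map is assumed to have been extracted already, and references from a class to itself are ignored.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; not JHawk's implementation. */
final class CouplingSketch {

    /** Other classes referenced by cls (each counted once). */
    static Set<String> fanOut(String cls, Map<String, Set<String>> references) {
        Set<String> out = new HashSet<>(references.getOrDefault(cls, Set.of()));
        out.remove(cls); // assumption: self-references are not couplings
        return out;
    }

    /** Other classes that reference cls (each counted once). */
    static Set<String> fanIn(String cls, Map<String, Set<String>> references) {
        Set<String> in = new HashSet<>();
        for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
            if (!entry.getKey().equals(cls) && entry.getValue().contains(cls)) {
                in.add(entry.getKey());
            }
        }
        return in;
    }

    /** Classic CBO: referenced plus referencing classes, duplicates counted once. */
    static int classicCbo(String cls, Map<String, Set<String>> references) {
        Set<String> coupled = new HashSet<>(fanOut(cls, references));
        coupled.addAll(fanIn(cls, references));
        return coupled.size();
    }
}
```

Suppose Order references Customer and Item, while Customer and Invoice both reference Order. Order then has a Fan Out of 2 and a Fan In of 2, but a classic CBO of 3, because Customer appears in both directions and is counted once.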
There are a number of other useful metrics at the class level –
- Reuse Ratio – Calculated as (number of superclasses above this class in the class hierarchy)/(total number of classes in the class hierarchy).
- Specialization Ratio – Calculated as (number of subclasses below this class in the class hierarchy)/(number of superclasses above this class in the class hierarchy).
- No. of External Methods Called, No. of Methods Called in class hierarchy and No. of local methods called – these give some idea of the level and complexity of interaction between the class and other classes (both in the class’s hierarchy and external to it). These figures are used in calculating metrics such as RFC, LCOM, MPC, Fan In and Fan Out.
- No. of instance variables, No. of modifiers, No. of interfaces implemented and No. of packages imported – these give additional information about the class’s level of semantic complexity. As with methods, large values for these can suggest that a class is doing too much.
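The two ratios in the list above are simple divisions, sketched here with hypothetical names (HierarchyRatios, reuseRatio, specializationRatio). The counts are assumed to be known already. How a class with no superclasses should be treated is not stated above, so returning NaN in that case is purely our assumption.

```java
/** Illustrative sketch only; not JHawk's implementation. */
final class HierarchyRatios {

    /** Superclasses above this class / total classes in the hierarchy. */
    static double reuseRatio(int superclassesAbove, int classesInHierarchy) {
        if (classesInHierarchy <= 0) {
            throw new IllegalArgumentException("hierarchy must contain at least one class");
        }
        return (double) superclassesAbove / classesInHierarchy;
    }

    /** Subclasses below this class / superclasses above this class. */
    static double specializationRatio(int subclassesBelow, int superclassesAbove) {
        if (superclassesAbove == 0) {
            return Double.NaN; // assumption: undefined when nothing is above
        }
        return (double) subclassesBelow / superclassesAbove;
    }
}
```

A class with 2 superclasses above it in a hierarchy of 8 classes has a Reuse Ratio of 2/8 = 0.25. If it also has 3 subclasses below it, its Specialization Ratio is 3/2 = 1.5.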