Measuring Software
This page describes the kinds of projects I (Ewan Tempero) am interested in
supervising. It is not exclusive - I will consider projects in most areas of
software engineering - but the kinds of projects I describe here will tend
to get priority. See my list of current and past students to get an idea of
projects I have supervised in the past.
My main area of research is in measuring software.
This includes figuring out what is interesting or useful to measure about code,
figuring out how to take those measurements, figuring out what the
measurements mean in terms of how software is developed and the quality of
the code, and figuring out what visualisations of these measurements can
help improve software development.
There are a number of topics that must be addressed. I discuss them in
more detail below, and include some ideas for projects that could be done in
those topics. A research project may address just one or several of these
topics. I include references to research publications and student research
that have resulted from study in these topics.
Metric development
There are many features of code that can be measured, and comparatively
little work has been done in this area. Even the simplest of features seems
to require many different metrics because of the interactions between
features. For example, to measure "inheritance", we must consider whether a
class is inheriting from another class developed for the same application,
from one of the standard library classes, or from some third-party library.
We also need to consider whether (in Java or C#) inheritance between
interfaces differs from inheritance between classes, and whether a class
inheriting from another class should be treated differently from a class
implementing an interface. These questions were discussed in a publication
[TNM08].
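To make the distinctions above concrete, here is a minimal sketch in Python that counts inheritance edges separately by kind (class extension versus interface implementation) and by where the parent type comes from. The `TypeDecl` record, the `origin_of` helper, the package prefixes, and the decision to ignore `java.lang.Object` are all hypothetical choices made for illustration; they are not the definitions used in [TNM08].

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed package prefixes that mark a parent type as "standard library".
STANDARD_PREFIXES = ("java.", "javax.")


@dataclass
class TypeDecl:
    """A toy model of one Java class declaration (illustrative only)."""
    name: str
    extends: Optional[str] = None
    implements: List[str] = field(default_factory=list)


def origin_of(parent: str, app_prefix: str) -> str:
    """Classify a parent type as application, standard, or third-party."""
    if parent.startswith(app_prefix):
        return "application"
    if parent.startswith(STANDARD_PREFIXES):
        return "standard"
    return "third-party"


def count_inheritance_edges(decls, app_prefix):
    """Count edges keyed by (kind, origin) rather than as a single number."""
    counts = Counter()
    for decl in decls:
        # Assumption: the implicit parent java.lang.Object is not counted.
        if decl.extends is not None and decl.extends != "java.lang.Object":
            counts[("extends", origin_of(decl.extends, app_prefix))] += 1
        for iface in decl.implements:
            counts[("implements", origin_of(iface, app_prefix))] += 1
    return counts


# Made-up declarations for an imaginary application.
decls = [
    TypeDecl("org.example.app.Shape"),
    TypeDecl("org.example.app.Circle", extends="org.example.app.Shape",
             implements=["java.lang.Comparable"]),
    TypeDecl("org.example.app.Worker", extends="java.lang.Thread"),
    TypeDecl("org.example.app.Handler", implements=["com.vendor.lib.Callback"]),
]
edge_counts = count_inheritance_edges(decls, "org.example.app.")
```

Even in this toy model a single "inheritance" count splits into several separate numbers, which is the point made above about needing more than one metric for one feature.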
Example projects on this topic are:
-
Develop new metrics that potentially provide useful information
about some feature of code (see for example [TCN10], [Tem09], [YT07],
[MT07a], [MT07b], [CT07], [Choi2006], [Yang], [Melton], [Riaz]).
-
For any existing metric, develop a variation appropriate to a
different language. For example, C++ and CLOS have multiple
inheritance, and so different inheritance metrics may be needed for those
languages (see for example [Leonhardt2006], [KimUmeda2007]).
-
There are many metrics that have been proposed but have never
been properly thought out, such as "coupling" and "cohesion",
metrics for inheritance, even "size". Develop well-defined metrics for
these ideas (see for example [TNM08], [Yang], [Melton]).
Measuring Code
Metrics by themselves are not very useful. They need to be used to take
measurements of actual code. This is an important step as it is only when
measurements are taken that we really get to understand what the metrics
might tell us (to say nothing of whether or not the metrics are
well-defined).
This involves building instruments that provide
measurements for the metrics, which is usually a non-trivial process. As
well as figuring out how to produce accurate measurements, we must consider
when the measurements are to be taken, as this may affect an instrument's
design. For example, measurements may be taken on past releases, on nightly
builds, when code is checked into version control, or as the code is being
written.
New metrics, whose characteristics we don't yet fully understand, will
typically be applied to many (100-500) different applications. The
instruments will need to be designed to make it easy to try variations on
the metrics as our understanding improves. Measurements taken on nightly
builds or version control check-in require instruments that are very stable
and robust and that can be integrated into different companies' source code
repositories. Instruments that take measurements as code is being written
need to integrate into the integrated development environment (IDE) and be
fast enough not to interfere with programming activities.
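One way to keep an instrument easy to vary is to treat each metric as a small function registered by name, so that a new variation can be added without touching the code that walks the applications. The sketch below is only an illustration of that idea: the in-memory `corpus` of per-class method counts, the metric names, and the `measure_corpus` function are all hypothetical and do not describe any existing tool.

```python
from statistics import median

# Registry of metric functions. Each takes the per-class method counts
# of one application and returns a single number. (Hypothetical design.)
METRICS = {}


def metric(name):
    """Decorator that registers a metric function under the given name."""
    def register(func):
        METRICS[name] = func
        return func
    return register


@metric("classes")
def number_of_classes(method_counts):
    return len(method_counts)


@metric("median_methods")
def median_methods_per_class(method_counts):
    return median(method_counts)


def measure_corpus(corpus):
    """Apply every registered metric to every application in the corpus."""
    return {
        app: {name: func(counts) for name, func in METRICS.items()}
        for app, counts in corpus.items()
    }


# Made-up data: methods per class for two imaginary applications.
corpus = {"app-a": [3, 5, 8], "app-b": [1, 2, 2, 10]}
results = measure_corpus(corpus)
```

Trying a variation of a metric then means adding one more decorated function and re-running `measure_corpus` over the same applications.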
Example projects on this topic are:
-
Develop instruments for measuring existing or new metrics and
carry out empirical studies with them (see for example [TCN10], [Tem09],
[BT07], [MT07a], [Tem08], [TNM08], [MPT+08], [BFN+06], [Yang], [Melton],
[Leonhardt2006], [Choi2006]).
-
Develop a framework for integrating measuring instruments into different
kinds of source code repositories.
-
Develop new instruments for, or integrate existing instruments into, an IDE,
such as a plug-in for Eclipse (see for example [MT07c], [Zhang2007]).
-
Develop the infrastructure for making measurements in an IDE that will allow
new metrics to be used without the need for new plug-ins or add-ons.
Visualising Measurements
Taking measurements usually produces a lot of data. The challenge is then to
present this data in a way that allows someone to interpret it sensibly,
and to answer the questions being asked. Visualisation is a successful
technique for presenting data in other disciplines, but it is still early
days for data that comes from code. Exactly what kinds of visualisation will
be successful depends on the metrics being used, the questions being asked,
and the goals of taking the measurements in the first place. Presenting
measurements from the same metric across many applications for understanding
the characteristics of a metric will likely have to be done differently than
presenting data from multiple metrics for the same application during the
nightly build as part of an organisation's quality control processes.
Visualisations that might be feasible for off-line presentations may not
work for being integrated into an IDE because they are too slow to create.
Example projects on this topic are:
-
Develop novel visualisations for different metrics (see for
example [ANMT08b], [Zhang2007], [Kim2007]).
-
Develop different ways to present the visualisations (see for
example [ANMT08a]).
Interpreting Measurements
Once the data can be presented, it is time to figure out what it all
means. What does it tell us about the current state of the code? Is
everything going to plan? What decisions need to be made?
Example projects on this topic are:
-
Determine how characteristics of some measurements relate
to quality attributes of the code, such as modifiability,
understandability, or testability (see for example [Yang2009], [MT07d]).
Supporting Software Metrics Research
Much of the research I have been doing has been developing new metrics and
taking measurements with them. This requires something to actually
measure. We can learn more about an individual metric if we can relate its
measurements to those from another metric, but this requires that we
always measure the same thing. An important product that supports
my research is a standard software corpus. This is a
collection of open-source software that has been organised to allow
(relatively) large-scale empirical studies. It is also being distributed to
other research groups. More information is available here.
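Relating two metrics is only meaningful when both were measured on the same applications. As a rough illustration, the sketch below computes a Spearman rank correlation between two sets of measurements keyed by application name, using only the applications that both have in common. The application names and values are made up, and the choice of Spearman correlation is an assumption for this example rather than the analysis used in the publications listed below.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(metric_a, metric_b):
    """Spearman correlation over the applications measured by both metrics."""
    shared = sorted(set(metric_a) & set(metric_b))
    xs = [metric_a[app] for app in shared]
    ys = [metric_b[app] for app in shared]
    return pearson(ranks(xs), ranks(ys))


# Made-up measurements; only app-a, app-b and app-c appear in both.
size = {"app-a": 120, "app-b": 450, "app-c": 300, "app-d": 80}
coupling = {"app-a": 2.1, "app-b": 6.4, "app-c": 3.0, "app-e": 9.9}
rho = spearman(size, coupling)
```

Applications measured by only one of the two metrics are silently dropped here, which shows why a fixed, shared corpus matters: the fewer applications two studies have in common, the less the comparison can say.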
Example projects on this topic are:
-
Determine what quality control measures are necessary to ensure that
the corpus meets the requirements of good empirical software
engineering research (see for example Han2008).
Relevant Publications
This section lists publications that come directly from this research. Many
of the publications are derived from research by students. The
theses and reports are listed below.
Research Publications
- [TCN10]
-
Ewan Tempero, Steve Counsell and James Noble 'An Empirical Study of
Overriding in Open Source Java' Thirty-Third Australasian Computer Science
Conference (ACSC2010), January 2010
- [Tem09]
-
Ewan Tempero 'How Fields are Used in Java: An Empirical Study' Australian
Software Engineering Conference (ASWEC)
April 2009
- [MPT+08]
-
Radu Muschevici, Alex Potanin, Ewan Tempero and James Noble 'Multiple
Dispatch in Practice' ACM SIGPLAN International Conference on
Object-Oriented Programming, Systems, Languages, and Applications,
October 2008.
- [TNM08]
-
Ewan Tempero, James Noble and Hayden Melton 'How do Java Programs Use
Inheritance? An Empirical Study of Inheritance in Java Software' 22nd
European Conference on Object-Oriented Programming (ECOOP), Springer
Berlin / Heidelberg, Paphos, Cyprus. July 2008. pp. 667-691. [TR]
[Publisher]
- [MAT08]
-
Homan Ma, Robert Amor and Ewan Tempero 'Indexing the Java API Using
Source Code' Australian Software Engineering Conference (ASWEC),
Perth, Australia. March 2008. pp. 451-460. [TR]
[DOI]
- [YMT08]
-
Hong Yul Yang, Hayden Melton and Ewan Tempero 'An Empirical Study into
Use of Dependency Injection in Java' 19th Australian Software
Engineering Conference (ASWEC), Software Engineering Research Report,
University of Auckland, Perth, Australia. March 2008. pp. 239-247. [TR]
[DOI]
- [ANMT08b]
-
Craig Anslow, James Noble, Stuart Marshall and Ewan Tempero 'Visualizing
the Word Structure of Java Class Names' OOPSLA 2008 Poster,
October 2008.
- [ANMT08a]
-
Craig Anslow, James Noble, Stuart Marshall and Ewan Tempero 'Towards
End-User Web Software Visualization' Graduate Consortium at the IEEE
Symposium on Visual Languages and Human Centric Computing (VLHCC),
Herrsching am Ammersee, Germany. September
2008. [PDF]
- [Tem08]
-
Ewan Tempero 'An Empirical Study of Unused Design Decisions in Open-source
Java Software' UoA-SE-2008-1, Software Engineering Research Report,
University of Auckland. June 2008. [TR]
- [MT07a]
-
Hayden Melton and Ewan Tempero 'An Empirical Study of Cycles among Classes
in Java' Empirical Software Engineering, 12:4, Springer Netherlands.
August 2007. pp. 389-415. [TR]
[DOI]
- [BT07]
-
Richard Barker and Ewan Tempero 'A Large-Scale Empirical Comparison of
Object-Oriented Cohesion Metrics' Fourteenth Asia-Pacific Software
Engineering Conference, Nagoya, Japan. December
2007. pp. 414-421. [TR]
[DOI]
- [MT07d]
-
Hayden Melton and Ewan Tempero 'Static Members and Cycles in Java Software'
1st International Symposium on Empirical Software Engineering and
Measurement (ESEM), September 2007. pp. 136-145. [PDF]
[DOI]
- [YT07]
-
Hong Yul Yang and Ewan Tempero 'Measuring the Strength of Indirect
Coupling' Australian Software Engineering Conference, IEEE Computer
Society, Melbourne, Australia. April 2007. pp. 319-328. [TR]
[DOI]
- [MT07b]
-
Hayden Melton and Ewan Tempero 'The CRSS Metric for Package Design Quality'
Australasian Computer Science Conference, Published as CRPIT 62.
Australian Computer Science Communications, Ballarat,
Australia. January 2007. pp. 201-210. [TR]
[Publisher]
- [MT07c]
-
Hayden Melton and Ewan Tempero 'JooJ: Real-time Support for Avoiding Cyclic
Dependencies' Australasian Computer Science Conference, Published
as CRPIT 62. Australian Computer Science Communications. January
2007. pp. 87-95. [TR]
[Publisher]
- [CT07]
-
Kelvin H T Choi and Ewan Tempero 'Dynamic Measurement of Polymorphism'
Australasian Computer Science Conference, Published as CRPIT 62.
Australian Computer Science Communications, Ballarat,
Australia. January 2007. pp. 211-220. [Publisher]
- [MAT06]
-
Homan Ma, Robert Amor and Ewan Tempero 'Usage Patterns of the Java Standard
API' Thirteenth Asia Pacific Software Engineering Conference
(APSEC06), IEEE Computer Society, Bangalore,
India. December 2006. pp. 342-349. [DOI]
- [BFN+06]
-
Gareth Baxter, Marcus Frean, James Noble, Mark Rickerby, Hayden Smith, Matt
Visser, Hayden Melton and Ewan Tempero 'Understanding the Shape of Java
Software' ACM SIGPLAN International Conference on Object-Oriented
Programming, Systems, Languages, and Applications, Portland, OR,
U.S.A. October 2006. pp. 397-412. [DOI]
- [MT06]
-
Hayden Melton and Ewan Tempero 'Identifying Refactoring Opportunities by
Identifying Dependency Cycles' Twenty-Ninth Australasian Computer
Science Conference, Published as CRPIT 48. Hobart, Tasmania,
Australia. January 2006. pp. 35-42. [Publisher]
- [YTB05]
-
Hong Yul Yang, Ewan Tempero and Rebecca Berrigan 'Detecting Indirect
Coupling' The Australian Software Engineering Conference, IEEE
Computer Society, Brisbane, Australia. March
2005. pp. 212-221. [DOI]
- [Yang2009]
-
Hong Yul Yang, PhD 2009.
Measuring Indirect Coupling
- [Melton]
-
Hayden Melton, PhD. Measuring the
Effect of Refactoring and Design Patterns on Software Quality (in
progress).
- [Riaz]
-
Mehwish Riaz, PhD. Understanding the impact of database design on the
design quality of software systems (in progress).
- [Han2008]
-
Ted Pei-Hsuan Han, Summer Research Scholarship 2007-2008.
Improving a Software Corpus
- [KimUmeda2007]
-
Misun Kim and Taiga Umeda, BE(SE) Part IV project 2007.
Mozilla Source Code Analysis.
- [Barker2007]
-
Richard Barker, ME(SE) 2007.
An Empirical Study of Cohesion Metrics
- [Ma2007]
-
Homan Ma, ME(SE) 2007.
Using Variable Identifiers to Index the Java 1.4.2 API
- [Kim2007]
-
Misun Kim, Summer Research Scholarship 2006-2007.
X3D Visualisation of Software Metrics Data
- [Zhang2007]
-
Huinan Zhang, Summer Research Scholarship 2006-2007.
An Eclipse interface for JooJ
- [Leonhardt2006]
-
Enrico Leonhardt, BSc Postgraduate Project 2006.
An empirical study of power-laws and cycles in C# applications
- [Choi2006]
-
Hio Tong (Kelvin) Choi, ME(SE) 2006.
Dynamic Reuse Metrics