Data Structures and Algorithms
|
Experimental Design
|
Designing Experiments
Designing a good experiment is not a trivial task:
it needs some thought and care if the results of your
work are going to lead to valid conclusions.
While there are many variations of good strategies,
the following basic steps will generally lead to a good
procedure:
- Form a hypothesis which your experiment(s) will
test,
- Design some experiments to test this hypothesis,
- Run the experiments and collect the data,
- Analyze your data:
- Transform your raw experimental data into a form which
permits readily verifying or disproving the hypothesis,
- Estimate the experimental error in the original data and
transformed data,
- Compare the transformed data with expectations produced
by the hypothesis.
- If the experimental data and expected results are in accord
(within the experiment's error limits),
then you have a basis for claiming the original hypothesis
to be true.
- If there is a discrepancy, you have a number of strategies:
- Analyse the experiments for sources of error,
- Repeat the experiments to confirm the results (not usually
very profitable for computer based experiments!)
- Re-design the experiments or
- Form a new hypothesis.
Of course, if you are doing original research (that is,
you are testing some new hypothesis for the first time),
then you now need to form alternative hypotheses and see
whether the experimental data would match them also
and repeat the experiments (or formulate new ones) to
ensure that you can reproduce the original results.
This cycle is potentially never-ending!
However, for the experiments you will need to do in this course,
the results are well-known and one 'pass' through this process
should suffice!
Rules
There are a few very important rules:
- You must report all experimental results,
unless they are clearly wrong (eg your program had
an error).
- You may not reject any measurement unless
you can provide an iron-clad argument to
justify its rejection.
An example of an acceptable reason for rejecting or ignoring
a result is:
The computer's timer has a minimum resolution of 10ms,
so all runs with times less than 1s were ignored
because they could have had errors >1%.
Of course, a really careful experimenter would not discount
these results either, but confirm that they also matched
the hypothesis - making allowance for the known error.
However, in the interests of efficiency, this level of
precision will not be expected here!
Reporting results
It is usual to transform your results to some form which
makes it easy to verify that they conform to the hypothesis.
The most common strategy is linearisation - plotting results
on a graph with axes designed to produce a straight line.
Another good strategy is normalisation - reduction of all
results to a constant value.
This normalisation strategy is a good one to employ here:
assume that you are verifying the time complexity of
quicksort. This is usually a O(nlogn) algorithm.
-
Your hypothesis is that quicksort takes n logn time,
or more formally:
T(n) <= c n log n
where T(n) is the time to sort n items
and c is a constant.
-
So time a series of sorts of increasingly larger collections
of items.
-
Divide the times by nlogn.
If you have a set of constants, c +/- some suitably small variation,
then the running time may be expressed as:
T(n) = c n log n
verifying your hypothesis.
Note that you will need to ensure the values of n chosen
for the experiment span a suitably wide range -
so that cn logn, the expected running time also spans
a reasonable range.
For the sort experiment,
this will be satisfied if the largest value of n is
10 times the smallest one - giving times differing by a
factor of ~30 ( 10log210 ).
However, if you were timing a search algorithm with O(logn)
time complexity, then values of n ranging over one
order of magnitude will only produce times varying by a factor
of just over 3 (log2n).
This will generally be insufficient to provide a convincing verification
(over such a small range, other functions of n will also
fit quite well!), so you would need to try to have values of
n ranging over, say, three orders of magnitude, to give
times ranging over one order of magnitude (log21000 ~ 10).
Lecture slides
You will find some additional points and examples in the
lecture slides.
© , 1998