Vision Guided Control
COMPSCI 773 S1 T
Introduction
This course introduces computational methods and techniques used in vision-based real-time control. Many topics are covered only in overview, but a number of interesting theoretical and practical problems are analysed in detail. Do not expect the exciting things found in sci-fi books or movies such as "Terminator": you will soon find out that even a seemingly simple automatic real-world action may present a real challenge.
The design of modern control systems involves a range of mathematical tools, especially optimization techniques, matrix analysis, and analytic 2D/3D geometry. Some tools will be explained briefly in the lectures. Still, you are expected to learn these methods in detail and use them to complete the assignments.
Assessment
Assessment is based on 60% course work (30% group work, 30% individual work) and a 40% open-book final examination. Course work includes a one-to-one oral test and assignments that use the hardware (digital cameras and PCs) available in the CITR Robotics Lab at Tamaki (room 731.234).
For each assignment, each group will have to write a report which
should be organised as follows:
- Each member of the group works on a distinct part of the assignment and writes an individual report.
- Each group provides a group report presenting the group solution and achievements for each assignment. The group report should consist of an introduction to the problem and the different solutions proposed.
- Both the individual and group reports should show students' abilities to:
  - analyse a problem
  - propose feasible solutions based on material taught during the lectures or learnt while reading research papers
  - use statistical tools to assess their experimental results
Course work
A particular feature of the course work is the emphasis on
complete
system design. Therefore, instead of picking a small part of the
material
covered in lectures as assignment tasks, the project in this paper has
the aim of developing a complete system to perform a specified task.
The
individual assignments present intermediate steps toward achieving this
goal. At the end of the paper, there will be a competition to evaluate
your project.
The equipment in the CITR Active Vision Lab consists of a number of PCs running Linux/Windows and a few digital video cameras to be used for hand-gesture recognition and user face authentication. There are also two pan-tilt digital cameras forming a stereo system for sensing and describing a 3D working space where hand gestures should be recognised and traced.
Human-Computer Interaction is currently an active research topic. It consists mainly of extracting information (from audio-visual speech, visual expression, hand signs, body expression) to interact efficiently with a robot or a machine via a computer. Potential applications range from automatic speech recognition (ASR), videoconferencing, virtual reality, communication for disabled people, and user verification and recognition (audiovisual biometric features) to remote control of robots, vehicles, and devices.
This year's course projects will encompass topics such as stereo vision and 3D positioning, feature extraction, and classification, with a focus on real-time processes for efficient interaction. You will have to:
- Design an interface to acquire images and videos from 2 USB cameras.
- Find the 3D trajectory of a hand motion and specify particular objects using stereo vision and hand gesture recognition.
  - Your task will be to track your hand movement along an arbitrary path with mandatory check points in a 3D field. You will have to use a specific gesture to name each check point and create a meaningful 3D description of the work field.
  - You will have to identify a set of predefined hand postures using advanced statistical analysis techniques such as PCA and LDA.
- Authenticate faces in the lab.
  - Extract faces from images and identify them using advanced statistical analysis techniques such as PCA and LDA.
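The check-point part of the task can be illustrated with a minimal sketch. It assumes the 3D hand trajectory is already available as a list of points and that a check point counts as visited when the hand comes within a fixed radius of it. The function name `visited_checkpoints`, the radius, and the sample coordinates are hypothetical and are not part of the assignment specification.

```python
import math

def visited_checkpoints(trajectory, checkpoints, radius):
    """Return check point names in the order the hand first reaches them.

    trajectory  -- list of (x, y, z) hand positions, oldest first
    checkpoints -- dict mapping a name to its (x, y, z) position
    radius      -- distance below which a check point counts as visited
                   (an assumed criterion, chosen for illustration)
    """
    visited = []
    for point in trajectory:
        for name, centre in checkpoints.items():
            if name not in visited and math.dist(point, centre) <= radius:
                visited.append(name)
    return visited

# Hypothetical data: a hand moving along three straight segments.
trajectory = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0),
              (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
checkpoints = {"A": (1.0, 0.0, 0.0), "B": (1.0, 1.0, 1.0), "C": (5.0, 5.0, 5.0)}

print(visited_checkpoints(trajectory, checkpoints, radius=0.1))  # ['A', 'B']
```

In a real system the radius would have to reflect the accuracy of the stereo reconstruction, and the gesture used to name a check point would replace the fixed dictionary keys.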
The work is subdivided into three assignments covering the
following
parts of the project:
- Calibration of stereo cameras for computing the 3D position of a desired item in the cameras' field of view by intersecting optical rays, and for visualising 3D movements of the item. In this assignment you will use the existing Tsai calibration software and also do several programming tasks in networking and client-server camera access. You will build a GUI which acquires images and video streams from USB cameras and displays the 3D positions and trajectories of any object in the field.
- Hand gesture recognition: hand localisation, hand mask extraction, and hand sign recognition in real time.
- Face authentication: face localisation, face mask extraction, and face recognition in real time.
- Whole system testing. You will integrate your previous work to:
  - effectively recognise hand signs and use them to indicate objects
  - display a sequence of check point objects visited by a moving hand
  - effectively extract faces from video images
  - recognise faces
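The idea of computing a 3D position by intersecting optical rays, mentioned for the calibration assignment, can be sketched as follows. With noisy measurements the two rays rarely meet exactly, so this sketch returns the midpoint of the shortest segment joining them; that choice is an assumption made here, not necessarily the method used in the course. The camera centres and ray directions would come from calibration; the values below are made up, and `triangulate_midpoint` is a hypothetical name.

```python
def triangulate_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Estimate a 3D point from two optical rays c1 + s*d1 and c2 + t*d2.

    Returns the midpoint of the shortest segment between the two lines.
    The rays are treated as infinite lines, so s or t may be negative.
    Raises ValueError if the rays are (nearly) parallel.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are parallel; no unique closest point")
    s = (b * e - c * d) / denom   # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom   # parameter of the closest point on ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))

# Hypothetical setup: two camera centres 1 unit apart, both looking at (1, 2, 5).
left_centre, left_dir = (0.0, 0.0, 0.0), (1.0, 2.0, 5.0)
right_centre, right_dir = (1.0, 0.0, 0.0), (0.0, 2.0, 5.0)

print(triangulate_midpoint(left_centre, left_dir, right_centre, right_dir))
```

The length of the segment between the two closest points is a useful by-product: a large gap suggests a poor calibration or a wrong correspondence between the two images.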
The schedule of these assignments is as follows:
| | Theme | Due date |
| Assignment 1 | Camera calibration, line detection, and 3D GUI | April 1 (preamble) and April 15 (main part), 2005 |
| Assignment 2 | 2D/3D hand posture and face recognition | TBA |
| Assignment 3 | Whole system testing | TBA |
Basic Topics of the Course
- Camera calibration and projective geometry (for handouts: go to "Lectures")
  - 2D and 3D vision geometry
  - Single camera calibration
  - Stereo calibration: epipolar geometry, triangulation
- Low-level image processing (for handouts: go to "Lectures")
  - Colour detection and discrimination
  - Binary image segmentation
  - Image matching
- Video sequence processing
  - Object tracking and image filtering
  - Real-time image processing
- Gesture and face recognition
  - Feature extraction: hand and face
  - Feature classification (PCA, LDA)
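As a rough illustration of PCA-based feature classification, the sketch below projects feature vectors onto their first principal component and assigns a query to the class with the nearest projected mean. It is a toy under stated assumptions: the principal axis is found by power iteration rather than a full eigendecomposition, only one component is kept, and the two-dimensional "fist" and "palm" feature vectors are invented. All function names are hypothetical.

```python
import math

def first_principal_component(samples, iterations=100):
    """Return (mean, unit axis) of the dominant principal direction."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(dim)]
    centred = [[s[j] - mean[j] for j in range(dim)] for s in samples]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Power iteration; it can fail if the start vector happens to be
    # orthogonal to the dominant eigenvector.
    axis = [1.0 / (j + 1) for j in range(dim)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:  # all samples identical: no principal direction
            break
        axis = [v / norm for v in nxt]
    return mean, axis

def project(x, mean, axis):
    """Scalar coordinate of x along the principal axis."""
    return sum((xi - mi) * ai for xi, mi, ai in zip(x, mean, axis))

def classify(x, training, mean, axis):
    """Label whose projected class mean is nearest to the projection of x."""
    px = project(x, mean, axis)
    centres = {label: sum(project(s, mean, axis) for s in vecs) / len(vecs)
               for label, vecs in training.items()}
    return min(centres, key=lambda label: abs(centres[label] - px))

# Invented 2D feature vectors for two hand postures.
training = {
    "fist": [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0)],
    "palm": [(3.0, 3.1), (3.2, 2.9), (2.9, 3.0)],
}
all_samples = [s for vecs in training.values() for s in vecs]
mean, axis = first_principal_component(all_samples)

print(classify((2.8, 3.0), training, mean, axis))
```

Real hand or face features have far more dimensions, so several principal components would normally be kept, and LDA would use the class labels when choosing the projection.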