The University of Auckland - COMPSCI 773 S1 T

Computer Science

Vision Guided Control (Early applied vision)

Introduction

This course introduces computational methods and techniques used in popular vision-based research areas such as 2D/3D face recognition and 3D scene reconstruction. Many topics are only overviewed, but a number of interesting theoretical and practical problems are analysed in detail. Although most seemingly simple automatic real-world actions present a real challenge, you will acquire knowledge used in the latest technological advances. This course is a must for students eager to pursue postgraduate studies and/or a career in applied computer vision.

Design of modern control systems involves different mathematical tools, especially optimization techniques, matrix analysis, and analytic 2D/3D geometry. Some tools will be explained briefly in the lectures. Still, you are expected to learn these methods in detail and use them to complete assignments.
Programming will be undertaken in C, C++, or C#. You are expected to be proficient in at least one of these programming languages or to show a strong willingness to learn. Java-only students with strong motivation should be able to make sufficient progress in any of these languages within the first few weeks of the course.

Assessment

Assessment is based on 60% course work (30% group work, 30% individual work) and a 40% open-book final examination. Course work includes a one-to-one oral test and assignments that exploit the hardware (digital cameras and PCs) available in our research labs at Tamaki (room 731.234). For each assignment, each group will have to write a report, which should be organised as follows:

Each member of the group works on a distinct part of the assignment and writes an individual report.
Each group provides a group report presenting the group's solution and achievements for each assignment. The group report should consist of an introduction to the problem and the different solutions proposed.
Both the individual and group reports should show students' abilities to:

analyse a problem
propose feasible solutions based on materials taught during the lectures or learnt while reading research papers
use statistical tools to assess their experimental results

Course work

A particular feature of the course work is the emphasis on complete system design. Therefore, instead of picking a small part of the material covered in lectures as assignment tasks, the project in this paper has the aim of developing a complete system to perform a specified task. The individual assignments present intermediate steps toward achieving this goal. At the end of the paper, there will be a competition to evaluate your project.

The equipment in the CITR Active Vision Lab consists of a number of PCs running Windows and a few web cameras for HCI applications (face recognition, dynamic 3D face animation, facial expression recognition). We may also use our 3D scanner and stereo-vision systems for 3D face acquisition.

Human-Computer Interaction is currently a hot research topic. It consists mainly of extracting information (from audio-visual speech, visual expression, hand signs, body expression) to interact efficiently with a robot or a machine via a computer. Potential applications range from automatic speech recognition (ASR), videoconferencing, virtual reality, communication for disabled people, and user verification and recognition (audiovisual biometric features) to remote control of robots, vehicles, and devices.

This year's course projects will encompass topics such as stereo vision, 3D positioning, feature extraction, and classification, with a focus on real-time processes for efficient interaction. You will have to:

Design an interface to acquire synchronized images and videos from 2 USB cameras.
Find a limited set of face features using stereo vision and potentially markers on the face.

Your task will be to track the movement of your face (and of a subset of face markers).

Authenticate faces in the labs

Extract faces from images and identify them using advanced statistical analysis techniques such as PCA and LDA.
Fuse stereo-vision face data (depth map) and readily available face texture for 2D+3D face recognition.
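
As a rough illustration of the PCA-based identification step, the sketch below projects face vectors onto a few principal components and labels a probe by its nearest neighbour in that subspace. It is a minimal sketch under stated assumptions, not the method prescribed for the assignments: the function names (`top_components`, `project`, `identify`), the tiny 4-value "face vectors", and the use of power iteration to find the components are all hypothetical choices made to keep the example dependency-free. Real face images would be flattened pixel arrays of far higher dimension.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_components(samples, k, iters=200):
    """Return (mean, components): the k leading principal directions,
    found by power iteration on the sample covariance matrix."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(dim)]
    centred = [[s[j] - mean[j] for j in range(dim)] for s in samples]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    comps = []
    for _ in range(k):
        v = [float(j + 1) for j in range(dim)]  # fixed, non-random start
        for _ in range(iters):
            # Remove directions already found so we converge to the next one.
            for u in comps:
                p = dot(v, u)
                v = [a - p * b for a, b in zip(v, u)]
            v = [dot(row, v) for row in cov]
            norm = math.sqrt(dot(v, v))
            if norm < 1e-12:  # no variance left to explain
                v = None
                break
            v = [a / norm for a in v]
        if v is None:
            break
        comps.append(v)
    return mean, comps


def project(x, mean, comps):
    """Coordinates of x in the PCA subspace."""
    centred = [a - m for a, m in zip(x, mean)]
    return [dot(centred, u) for u in comps]


def identify(probe, gallery, mean, comps):
    """Label of the gallery vector closest to the probe in PCA space.
    gallery is a list of (label, vector) pairs."""
    p = project(probe, mean, comps)
    best = min(gallery, key=lambda item: math.dist(p, project(item[1], mean, comps)))
    return best[0]


# Hypothetical toy data: two people, two "images" each.
gallery = [("alice", [10, 10, 0, 0]), ("alice", [9, 11, 0, 1]),
           ("bob", [0, 0, 10, 10]), ("bob", [1, 0, 9, 11])]
mean, comps = top_components([vec for _, vec in gallery], k=1)
print(identify([10, 9, 1, 0], gallery, mean, comps))
```

LDA differs in that it uses the class labels to choose the projection, whereas PCA only looks at overall variance.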

The work is subdivided into three assignments covering the following parts of the project:

Calibration of stereo cameras for computing 3D positions of a desired item in the cameras' field of view by intersecting optical rays, and visualising 3D movements of the item. In this assignment you will use the existing Tsai calibration software (or any suitable calibration method you might have researched). You will also do several programming tasks in camera control, video-stream synchronisation, video display, etc.
You will use statistical analysis techniques to recognise faces. You will track and display face features using set markers.

Face authentication: face localisation, face mask extraction, and face recognition in real time.
3D face authentication (fusing depth map and face texture images).

Whole system testing. You will integrate your previous work to:

Effectively track faces and face features from synchronised video-images.
Effectively recognise faces in 2D and 3D.
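
The idea of computing a 3D position by intersecting optical rays, mentioned for the first assignment, can be sketched as follows. With noisy measurements the two rays rarely meet exactly, so a common choice is to take the midpoint of the shortest segment joining them. This is a minimal sketch, assuming each calibrated camera already gives you a ray as an optical centre plus a direction in a shared world frame; the function name `triangulate_midpoint` and the example coordinates are hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Midpoint of the shortest segment between two 3D rays.

    c1, c2: optical centres; d1, d2: ray directions (need not be unit length).
    Raises ValueError when the rays are (nearly) parallel.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b  # zero when the directions are parallel
    if abs(denom) < eps:
        raise ValueError("rays are parallel; no unique closest point")
    # Parameters minimising |(c1 + s*d1) - (c2 + t*d2)|
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * k for o, k in zip(c1, d1)]
    p2 = [o + t * k for o, k in zip(c2, d2)]
    return [(x + y) / 2.0 for x, y in zip(p1, p2)]


# Hypothetical setup: two cameras 1 unit apart, both looking at the same point.
target = [0.5, 0.2, 2.0]
left, right = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
ray_l = [t - o for t, o in zip(target, left)]
ray_r = [t - o for t, o in zip(target, right)]
print(triangulate_midpoint(left, ray_l, right, ray_r))
```

The distance between the two closest points is also a useful sanity check on calibration quality: a large gap suggests a calibration or matching error.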

The schedule of these assignments is as follows:

Assignment	Theme	Due date
Assignment 1	Camera calibration, USB camera image/video acquisition, and 3D GUI	21 March 2008 (preamble) and 11 April 2008 (full assignment)
Assignment 2	3D face features tracking, 3D face animation, 2D face and face expression recognition	16 May 2008
Assignment 3	Whole System Testing (live demo)	Last day of lectures of the semester

Basic Topics of the Course

Note that there are no lectures during Graduation Week, on Friday 9 May 2008.

Camera calibration and projective geometry (for handouts: go to "Lectures")

2D and 3D vision geometry
Single camera calibration
Stereo calibration: epipolar geometry, triangulation

Low-level Image processing (for handouts: go to "Lectures")

Colour detection and discrimination
Binary image segmentation
Image matching

Video sequence processing

Object tracking and image filtering
Real-time image processing

Face expression and face recognition

Feature extraction: face
Feature classification (PCA, LDA)

Stereo-vision

Epipolar geometry, rectification
Stereo matching

Elements of real-time control

Kalman filtering
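
To give a flavour of the Kalman filtering topic, here is a minimal scalar sketch, assuming a random-walk state model (the tracked value drifts slowly) observed through noisy measurements. The function name `kalman_1d` and the default noise variances `q` and `r` are hypothetical; a tracker for face markers would more likely use a multi-dimensional state such as position plus velocity.

```python
def kalman_1d(measurements, x0=0.0, p0=1.0, q=1e-4, r=0.1):
    """Scalar Kalman filter for the model x_k = x_{k-1} + w, z_k = x_k + v.

    q: process noise variance (assumed), r: measurement noise variance (assumed).
    Returns the list of filtered estimates and the final estimate variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is unchanged, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates, p


# Hypothetical noisy readings of a marker coordinate near 5.0
readings = [5.2, 4.9, 5.1, 4.8, 5.0, 5.3, 4.7, 5.0]
filtered, variance = kalman_1d(readings, x0=readings[0])
print(filtered[-1], variance)
```

A larger `q` makes the filter follow the measurements more closely; a larger `r` makes it smoother but slower to react.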

773 Groups

Groups	1	2	3	4
Student name / UPI	TBA	TBA	TBA	TBA
