CS498
Project

Introduction

The purpose of this assignment is to give you the opportunity to explore a computer-vision topic of your own choosing.

Initial discussion with professor (Due in Week 7 Lab)

During the Week 6 or Week 7 lab period, talk with your instructor about ideas you are considering for your project.

Project Deliverables

There are four check-points: Friday of Week 7, the lab of Week 8, the lab of Week 9, and the lab of Week 10.

Informal discussion with professor (Due Friday of Week 7)

Please talk with me about the ideas you are considering. If you are considering an ambitious project, I may ask you to do a quick experiment to see whether it is likely to work.

Preliminary Proposal (Due at start of Week 8 Lab)

Submit in hard copy a brief proposal describing what you hope to accomplish. The title information should include your name, the date, the course number & title, and the assignment name. The text of the proposal should answer four questions: (1) What will your program do when it is complete? (2) What part of the project will you implement? (3) What part of the project is already implemented in available code? (4) What will you learn about computer vision while implementing the project? You may answer all of these questions in one or two paragraphs, but be sure to answer every question, and in this order. (10% of final project grade)

Based on your preliminary proposal, I will answer three questions. First: What is the quality of the proposal? (Does it meet the requirements above?) This determines your grade for the proposal. Second: Do I think you can realistically accomplish the project in three weeks of lab? If the answer is "No", I will request a revised version, emailed to me by 11pm on Tuesday of Week 8, and will provide suggestions on how to reduce the project's scope. Third: Do I think there is enough computer-vision-related coding in your project? If not, I will suggest some ways to make your project include more computer vision.

Preliminary Implementation (Due at start of lab, Week 9)

Submit a preliminary version of your program that performs the most central part of your project.

Since you will be writing the report during the second week, this should be an 80%-95% complete version of the project.

Submit your code through esubmit. (esubmit page TBA.)

Preliminary Demo (Due in Week 9 Lab)

Demonstrate in lab the working code that you submitted at the start of lab. Tell me about any changes made since submission.

Final Submission (Due in Week 10 Lab)

Submit a report describing your final project. The report should include the same title information as the preliminary proposal and should follow the Word template provided, which includes the following sections, with at least a paragraph of text in each: (1) What does your program do? (2) What did you learn about computer vision while implementing the project?

In addition, the report should include high-level documentation of your project. For Matlab projects, include a call graph generated by m2html, with a high-level discussion of the important functions of your program. For object-oriented languages, include a UML class diagram with a high-level discussion of the important classes and methods of your program. Please note that I am not looking for a play-by-play description of what happens when your program runs, or a class-by-class description of your classes. Rather, I am looking for the "big picture": What really matters in your program? How is it accomplished? What are the most important methods/arrays/classes in your program? How do they work?

Your project should use multiple Matlab (or Java, C, ...) functions (and possibly classes) to organize your code. These should be documented as demonstrated in the labs throughout the quarter.
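
For example, here is a minimal sketch of one documented helper function (the function and all of its names are illustrative, not required):

    function edges = detectEdges(img, threshold)
    % DETECTEDGES Find edge pixels in a grayscale image.
    %   edges = detectEdges(img, threshold) returns a logical mask that is
    %   true wherever the gradient magnitude of img exceeds threshold.
    %
    %   img       - grayscale image as a double array
    %   threshold - gradient-magnitude cutoff, e.g. 0.1
        [gx, gy] = gradient(img);          % finite-difference gradients
        edges = hypot(gx, gy) > threshold; % threshold the magnitude
    end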

Additional notes on m2html: You may find the command m2html('mfiles','.','htmldir','doc','graph','on'); useful for generating this graph. You will need Graphviz installed for this to work, and dot.exe needs to be on the path (it is in Graphviz's bin folder). Instructions for adding to the path are here. (The "Path editor 2" in those instructions is obsolete, since Windows offers a nice GUI for editing the path as of Windows 10.)
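
If you prefer not to edit the Windows path, one session-local workaround is to prepend the Graphviz bin folder to the path inside Matlab before calling m2html. This sketch assumes the default Graphviz install location; adjust the folder to match your machine:

    % Make dot.exe visible to this Matlab session only.
    setenv('PATH', [getenv('PATH') ';C:\Program Files\Graphviz\bin']);
    % Then generate the documentation and the call graph:
    m2html('mfiles','.', 'htmldir','doc', 'graph','on');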

You only need to describe one key aspect of your approach in detail, and how this connects with the rest of your program. For example, if you implemented the SIFT feature-point descriptor, you might choose to describe in detail only how you achieve rotation invariance by sampling a rotated image patch at the interest point. Or, you might choose to describe in detail how you compute a weighted gradient histogram in each cell of the grid arranged around the interest point. You would not need to describe both.

Before this detailed description, give a high-level overview of the algorithm. For example, for SIFT feature-point description, you might say "Our implementation of SIFT features describes an image patch around a point using a histogram of gradients. In order for this histogram to match between images, we must ..."
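
To make the flavor of such a description concrete, here is a rough sketch of a magnitude-weighted orientation histogram for a single patch. This is illustrative only: real SIFT uses a 4x4 grid of such histograms, Gaussian weighting, and rotation of the patch to its dominant orientation.

    function h = orientationHist(img, r, c)
    % ORIENTATIONHIST 8-bin, magnitude-weighted gradient-orientation
    % histogram of the 16x16 patch centered near interest point (r,c).
    % Sketch only; not a full SIFT descriptor.
        patch = img(r-7:r+8, c-7:c+8);   % 16x16 patch around (r,c)
        [gx, gy] = gradient(patch);      % image gradients
        mag   = hypot(gx, gy);           % gradient magnitudes
        theta = atan2(gy, gx);           % orientations in (-pi, pi]
        bin   = mod(floor((theta + pi) / (2*pi/8)), 8) + 1;  % 8 bins
        h     = accumarray(bin(:), mag(:), [8 1]);  % magnitude-weighted votes
    end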

In addition to the report, submit working source code for your project. As with the preliminary submission, this should be a zip file that, when unzipped, produces a directory with a program immediately inside it called main.m or Main.java, and a text file called README.txt if required. Please submit all the individual files needed to run your program through esubmit, along with a PDF of your final report.
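
For example, the unzipped submission might be laid out like this (all names other than main.m and README.txt are illustrative):

    myproject/
        main.m          (entry point, required name)
        findEdges.m     (supporting functions, as needed)
        matchPoints.m
        README.txt      (test instructions, if required)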

Final Demonstration (Due in Week 10 Lab)

Each student will get up to 10 minutes to demonstrate their project to others in the class (as well as the instructor and possibly visitors). No PowerPoint slides are required, but please verbally describe (1) what your program does and (2) what you learned about computer vision in the process.

Potential Project Ideas

  • Do something toward constructing a 3-D model of a flexible-tube protein with the Center for Biomolecular Modeling
  • Implement the K-D tree and test its performance properties with a large database of SIFT features
  • Detect objects with the Kinect sensor and distinguish between a cube and a sphere
  • Track objects with the Kinect sensor, re-displaying object centroids and tracking trails in a separate view
  • Find correspondences between two manually-cropped Kinect sensor images by detecting interest points (or use SIFT plus a 3D transform instead of the planar projection that we will use in lab)
  • Train a real-time face detector in OpenCV (or possibly Matlab)
  • Recognize faces with eigenfaces (PCA/LDA)
  • Find pictures of the Eiffel Tower in a personal photo collection using SIFT feature matching against a variety of Eiffel Tower pictures. (Perhaps a different object than the Eiffel Tower would work better.)
  • Detect line segments using the Hough transform (implemented from scratch)
  • Study the geometry of multiple-camera calibration. Calibrate several cameras intrinsically using the Matlab camera calibration toolbox. Calibrate multiple cameras given manually-marked feature points. Report on the reprojection error achieved for points not used during calibration.
  • Align two photographs taken with different camera settings, then combine them to produce an improved image, e.g. flash/no-flash photography. Szeliski Sec. 10.2.2, pp. 434-435.
  • View morphing. Find the 3D correspondence between two camera images, then simulate moving the camera from one image to the other. Sec. 7.2.3, p. 315 gives a sketch of how you might do this. Instead of finding a homography between images, you could find an epipolar transform.
  • Implement Dr. Chad Aeschliman's fast multi-scale edge detector
  • (Don't know how much is involved...) Create a Google Cardboard app.
  • Do something for the Mars semi-autonomous competition that Dr. Bill Farrow is leading.
  • Perform background subtraction using a simple per-pixel Gaussian model for the color of each pixel. (This is the first step toward an interesting project with the Center for Biomolecular Modeling; see the sketch after this list.)
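
For the background-subtraction idea above, a minimal sketch of the per-pixel Gaussian model might look like the following (array shapes and the 3-sigma threshold are illustrative):

    function fg = gaussianBackgroundMask(frames, img)
    % GAUSSIANBACKGROUNDMASK Per-pixel Gaussian background subtraction.
    %   frames - H x W x 3 x N stack of background training frames (double)
    %   img    - new H x W x 3 frame (double)
    %   fg     - H x W logical foreground mask
    % Sketch only; a real system would update the model over time and
    % clean up the mask with morphological operations.
        mu    = mean(frames, 4);           % per-pixel mean color
        sigma = std(frames, 0, 4) + 1e-3;  % per-pixel std; avoid divide-by-zero
        z     = abs(img - mu) ./ sigma;    % per-channel z-scores
        fg    = any(z > 3, 3);             % >3 sigma in any channel
    end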

Ambitious (If you are interested in these projects, please talk to me about Undergraduate Research or Independent Study)

  • Write a stereo-correspondence algorithm from scratch for a well-fixed camera pair
  • Polish and re-submit Dr. Chad Aeschliman's fast multi-scale edge detector for publication
  • Implement the HoG detector. (This is a very challenging project.)
  • Automatically calibrate two cameras based on detected SIFT features
  • Create an automatic exploration of images in a room based only on the images and their SIFT-feature correspondences. This is challenging because SIFT features are often very noisy for a room environment (as opposed to a still-life toy collection)
  • Attempt to build a 3-D model by tracking objects with the Kinect sensor
  • Implement a QR code reader (the part that reads the bits off of the screen)
  • Re-create something like the features shown on the cover of the book using SIFT image correspondences or edge-detection correspondences.