CS498
Project

Introduction

The purpose of this assignment is to give you the opportunity to explore a computer-vision topic of your own choosing.

Project Deliverables

There are three checkpoints: the Week 8 lab, the Week 9 lab, and Friday of Week 10. In some weeks, an electronic version is due the evening before the class/lab period.

Preliminary Proposal (Due at start of Week 8 Lab ("Lab 7"))

Submit in hard copy a brief proposal describing what you hope to accomplish. The title information should include your name, the date, the course number & title, and the assignment name. The text of the proposal should answer four questions: (1) What will your program do when it is complete? (2) What part of the project will you implement? (3) What part of the project is already implemented in available code? (4) What will you learn about computer vision while implementing the project? You may answer all these questions in one or two paragraphs, but be sure to answer them in this order, and to answer all the questions. (10% of final project grade)

Based on your preliminary proposal, I will make two decisions. The first question: What is the quality of the proposal? (Does it meet the requirements above?) This determines your grade for the proposal. The second question: Do I think you can realistically accomplish it in a three-week lab? If the answer is "No", I will request a revised version, emailed to me by 11pm on Tuesday of Week 8, and will provide suggestions on how to reduce the project's scope.

Preliminary Implementation (Due electronically, Monday of Week 9, 11pm)

Submit a preliminary version of your program that accomplishes at least part of your goal for the project. This should be a zip file that, when unzipped, produces a folder with your main program (called main.m, Main.java, etc.) immediately inside it; running that file should run your program. If your program requires configuration to run, please include a text file called README.txt describing the required configuration. (10% of final project grade)

Preliminary Demo (Due in Week 9 Lab ("Lab 8"))

Demonstrate in lab the working code that you submitted the night before. Tell me about any changes made since submission. (10% of final project grade)

Final Submission (Due Thursday of Week 10, 11pm)

Submit on the final upload page a PDF report describing your final project. This report should include the same title information as the preliminary proposal. The report should follow the Word template provided, which includes the following sections, with at least a paragraph of text in each section: (1) What does your program do? (2) What did you learn about computer vision while implementing the project?

In addition, the report should include high-level documentation of your project. For Matlab projects, include a call graph generated by m2html, with a high-level discussion of the important functions of your program. For projects in object-oriented languages, include a UML class diagram with a high-level discussion of the important functions of your program. Please note that I am not looking for a play-by-play description of what happens when your program runs, or a class-by-class description of your classes. Rather, I am looking for the "big picture." What really matters in your program? How is it accomplished? What are the most important methods/arrays/classes in your program? How do they work?

Additional notes on m2html: you may find the command

    m2html('mfiles','.','htmldir','doc','graph','on');

useful for generating this graph. You will need Graphviz installed for this to work, and dot.exe must be on the path (it is in Graphviz's bin folder). Instructions for adding to the path are here.

You only need to describe one key aspect of your approach in detail, and how this connects with the rest of your program. For example, if you implemented the SIFT feature-point descriptor, you might choose to only describe in detail how you achieve rotation invariance by sampling a rotated image patch at the interest point. Or, you might choose to describe in detail how you compute a weighted edge histogram in each bin, arranged in a grid around the interest point. You would not need to describe both.

Before this detailed description, you might give a high-level overview of the algorithm, e.g. "To describe a feature, we first determine the principal orientation of the point from a large histogram. Then we sample an image patch rotated to that orientation around the point. Our final feature is a set of histograms computed in bins arranged in a grid around the point."
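
As a concrete illustration, here is a minimal MATLAB sketch of the orientation step in that overview. This is a sketch under assumptions, not part of the assignment: it assumes the Image Processing Toolbox, a grayscale double image img, and an interest point (x, y) well inside the image; the window size and bin count are illustrative choices.

    % Assumed inputs: img (grayscale double), interest point (x, y).
    [gx, gy] = imgradientxy(img);                  % image gradients (Sobel)
    mag = hypot(gx, gy);                           % gradient magnitude
    ang = atan2(gy, gx);                           % gradient orientation

    % Magnitude-weighted 36-bin orientation histogram around the point;
    % its peak gives the principal orientation.
    r = 8;                                         % illustrative window radius
    winMag = mag(y-r:y+r, x-r:x+r);
    winAng = ang(y-r:y+r, x-r:x+r);
    edges36 = linspace(-pi, pi, 37);
    [~, ~, bin] = histcounts(winAng(:), edges36);
    hist36 = accumarray(bin, winMag(:), [36 1]);
    [~, peak] = max(hist36);
    theta = (edges36(peak) + edges36(peak+1)) / 2; % principal orientation

    % Sampling the descriptor patch in a frame rotated by -theta is what
    % makes the final descriptor rotation-invariant.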

In addition to the report, submit working source code for your project. As with the preliminary submission, this should be a zip file that, when unzipped, produces a directory with a program called main.m or Main.java immediately inside it, and a text file called README.txt if required.

Final Demonstration (Due in final class Week 10)

Each team will get up to 10 minutes to demonstrate their project to others in the class (as well as the instructor and possibly visitors). No PowerPoint slides are required, but please verbally describe (1) what your program does and (2) what you learned about computer vision in the process.

Potential Project Ideas

  • Implement the K-D tree and test its performance properties with a large database of SIFT features
  • Detect objects with the Kinect sensor and distinguish between a cube and a sphere
  • Track objects with the Kinect sensor, re-displaying object centroids and tracking trails in a separate view
  • Find correspondences between two manually-cropped Kinect sensor images by detecting interest-points (or using SIFT+3D transform instead of planar projection that we will use in lab)
  • Train a real-time face detector in OpenCV (or possibly Matlab)
  • Recognize faces with eigenfaces (PCA/LDA); see the PCA sketch after this list
  • Find pictures of the Eiffel Tower in a personal photo collection using SIFT feature matching against a variety of Eiffel Tower pictures. (Perhaps a different object than the Eiffel Tower would work better.)
  • Detect edge segments using the Hough transform (from scratch); see the voting sketch after this list
  • Study the geometry of multiple-camera calibration. Calibrate several cameras intrinsically using the Matlab camera calibration toolbox. Calibrate multiple cameras given manually-marked feature points. Report on the reprojection error achieved for points not used during calibration.
  • Align two photographs taken with different camera settings, then combine them to produce an improved image, e.g. flash/no-flash photography. Szeliski Sec. 10.2.2, pp. 434-435.
  • View morphing. Find the 3D correspondence between two camera images, then simulate moving the camera from one image to the other. Sec. 7.2.3, p. 315 gives a sketch of how you might do this. Instead of finding a homography between images, you could find an epipolar transform.
  • Implement Dr. Chad Aeschliman's fast multi-scale edge detector
  • (Don't know how much is involved...) Create a Google Cardboard app.
  • Do something for the Mars semi-autonomous competition that Dr. Bill Farrow is leading.
  • Do something toward constructing a 3-D model of a flexible-tube protein with the Center for Biomolecular Modeling
  • Perform background subtraction using a simple per-pixel Gaussian model for the color of each pixel; see the sketch after this list. (This is the first step toward an interesting project with the Center for Biomolecular Modeling.)
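
For the eigenfaces idea above, here is a minimal MATLAB sketch of the PCA step. The matrix names, the number of components kept, and the nearest-neighbor matching rule are illustrative assumptions, not part of the assignment.

    % Assumed inputs: X, an N x D matrix of N vectorized grayscale training
    % faces; xNew, a 1 x D vectorized test face. Assumes k <= min(N, D).
    mu = mean(X, 1);                    % mean face
    Xc = X - mu;                        % center the data
    [~, ~, V] = svd(Xc, 'econ');        % columns of V are the eigenfaces
    k = 20;                             % keep the top-k principal components
    W = Xc * V(:, 1:k);                 % project training faces into face space

    % Recognize a new face as the nearest training face in the subspace.
    wNew = (xNew - mu) * V(:, 1:k);
    [~, idx] = min(vecnorm(W - wNew, 2, 2));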
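
For the Hough transform idea above, here is a minimal from-scratch MATLAB sketch of the voting step, using the rho = x*cos(theta) + y*sin(theta) line parameterization. The edge map, angle resolution, and variable names are illustrative assumptions.

    % Assumed input: edges, a logical edge map (e.g. from your own detector).
    [H, W] = size(edges);
    thetas = deg2rad(-90:89);                    % one-degree angle steps
    rhoMax = ceil(hypot(H, W));                  % largest possible |rho|
    acc    = zeros(2*rhoMax + 1, numel(thetas)); % accumulator array

    [ys, xs] = find(edges);                      % coordinates of edge pixels
    for i = 1:numel(xs)
        for t = 1:numel(thetas)
            rho = round(xs(i)*cos(thetas(t)) + ys(i)*sin(thetas(t)));
            acc(rho + rhoMax + 1, t) = acc(rho + rhoMax + 1, t) + 1;
        end
    end
    % Large values in acc are lines with many supporting edge pixels;
    % tracing the supporting pixels along each such line yields segments.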
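
For the background-subtraction idea above, here is a minimal MATLAB sketch of the per-pixel Gaussian model. The frame names and the threshold k are illustrative assumptions; for color images you would apply the same test per channel.

    % Assumed inputs: bgFrames, a cell array of grayscale double frames of
    % the static background; frame, a new frame to classify.
    stack = cat(3, bgFrames{:});            % H x W x N stack of frames
    mu    = mean(stack, 3);                 % per-pixel mean
    sigma = std(stack, 0, 3) + 1e-3;        % per-pixel std (epsilon avoids /0)

    % A pixel is foreground if it lies more than k standard deviations
    % from its background mean.
    k = 3;
    foreground = abs(frame - mu) > k * sigma;
    imshow(foreground)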

Submission Form for Dr. Yoder