Henry S. Baird Fall 2009 Course
CRNs: 43875 (326); 43876 (426)
Note to CSE PhD Graduate Students: this course fulfills two 'Core Areas': (1) Computer Applications, and (2) Theory.
Note to undergraduates: you are very welcome in this course; you'll do the same programming exercises as the grad students, but shorter HWs and exams.
An introduction to the state of the art of pattern recognition and document image processing, and the machine-learning theory, algorithms, and systems architectures that underlie them.
Theoretical topics will include Bayesian decision theory, statistically trainable vector-space classifiers, parametric classifiers (for, e.g., likelihoods with Gaussian densities), non-parametric classifiers (e.g. nearest neighbors), Perceptrons, generalized linear discriminants, kernel-based methods, decision trees, support-vector machines, neural nets, ensembles, and randomized classifiers. Also, we study general methodological issues, including best practices for statistical training and testing, the curse of dimensionality, and feature selection.
The last 1/3 of the course focuses on engineering challenges illustrated in applications chosen from the document image understanding R&D literature. These reflect state-of-the-art approaches to segmentation, contextual analysis (including syntax and semantics), autonomous adaptation, style-conscious recognition, and anytime algorithms.
Weekly written homeworks or short programming exercises. A midterm exam. Students will select a set of related research papers (or a dissertation) from the recent literature and present a short talk in class summarizing and critiquing them. There is a choice between (1) a final exam and (2) a software project on a cutting-edge research problem from digital libraries or Web security (e.g. CAPTCHAs: vision-based Turing tests to tell computers and humans apart).
On completing this course, students will be sufficiently familiar with the theory, notation, and vocabulary of pattern recognition and machine learning to be able to pursue matters of interest in the current technical literature. They will also have a grasp of key engineering issues arising in applications.
Further, this course serves as an introduction to the state of the art of Document Image Analysis, which is an essential technology in digital libraries, web-based search of scholarly materials, intelligence analysis, office automation, and web-based security. These topics are being actively researched in Lehigh's Pattern Recognition Research laboratory.
Textbook: Pattern Classification (2nd Ed.), R. O. Duda, P. E. Hart, & D. G. Stork, John Wiley & Sons, October 2000. 680 pages. ISBN 0-471-05669-3. If there are no copies available in the Univ. bookstore, email me immediately.
Lectures: Tuesdays & Thursdays, 10:45 AM - 12:00 noon, Packard Lab 258. First meeting: Tuesday, August 26. (Last meeting: Thursday, Dec. 4.)
Instructor: Prof. Henry Baird, firstname.lastname@example.org. Office: Packard Lab 514C. Office Hours: Thursdays 12-1 PM, or by appointment.
BlackBoard site: we will use BlackBoard site CSE-426-326-00-FL09 to distribute lecture slides, homework assignments, and data sets. Please browse bb.lehigh.edu and try to log in to this site; if you cannot, send me email.
Prerequisites:
CSE 340: Algorithms -- or comparable background in basic algorithms and data structures.
Math 205: Linear Algebra etc -- or similar familiarity with linear algebra & matrices.
Math 231 or Math 309 or CSC 450: Applied Probability & Statistics -- or some background in discrete probability and applied statistics & data analysis.
CSE 109: Programming in C++ -- or enough experience with C++, Java, C, or MatLab to complete a small software project without faculty supervision.
Accommodations for Students
If you have a disability for which you are or may be requesting
accommodations, please contact both your instructor and the Office of
Academic Support Services, University Center 212 (610-758-4152) as
early as possible in the semester. You must have documentation from
the Academic Support Services office before accommodations can be
granted.
If you experience flu-like symptoms, please stay in your room and contact the Health Center. Do not come to class! Email me as soon as you know that you are ill and will need to miss one or more class meetings. [Please obtain an excuse from the Office of Academic Support in the Dean of Students Office (Dean Lantz's office, UC 210) when you come back to class if your flu-related absence caused you to miss more than a week of class meetings, if you have more than one period of absence due to the flu during the term, if your absence was within three days of Pacing Break or Thanksgiving break, or if you miss an examination.]
If you have any questions, ask the instructor: email@example.com.