

Henry S. Baird
Fall 2005 Course
Pattern Recognition & Document Analysis

Note to CSE Grad Students: this course fulfills two 'Core Areas': (1) Computer Applications, and (2) Theory.

Note to undergraduates: you are welcome in this course; you'll do the same HWs, programming exercises, and exams as the grad students, but you'll report on fewer research papers.

An introduction to the state of the art of pattern recognition and document image processing, and the machine learning theory, algorithms, and systems architectures that underlie them. Theoretical topics will include Bayesian decision theory, nonparametric methods, linear discriminant functions, neural nets, and algorithm-independent machine learning. Engineering challenges, including trainable classification, segmentation, contextual analysis, autonomous adaptation, and 'anytime' algorithms, will be illustrated by high-performance computer vision systems selected mainly from document image analysis applications.

Weekly written homeworks or short programming exercises. A midterm exam. Students will select a set of related research papers (or a dissertation) from the recent literature and present a short talk in class summarizing and critiquing them. There is a choice between (1) a final exam or (2) a software project on a cutting-edge research problem from digital libraries or Web security (e.g. CAPTCHAs: vision-based Turing tests to tell computers and humans apart).

Course objective

On completing this course, students will be sufficiently familiar with the theory, notation, and vocabulary of pattern recognition and machine learning to be able to pursue matters of interest in the current technical literature. They will also have a grasp of key engineering issues arising in application.
Further, this course serves as an introduction to the state of the art of Document Image Analysis, which is an essential technology in digital libraries, web-based search of scholarly materials, intelligence analysis, office automation, and web-based security. These topics are being actively researched in Lehigh's Pattern Recognition Research laboratory.

Textbook: Pattern Classification (2nd Ed.), R. O. Duda, P. E. Hart, & D. G. Stork, John Wiley & Sons, October 2000. 680 pages. ISBN 0471056693.

Prerequisites:

CSE 340: Algorithms, or comparable background in basic algorithms and data structures.
Math 205: Linear Algebra etc., or similar familiarity with linear algebra & matrices.
Math 231 or Math 309 or CSC 450: Applied Probability & Statistics, or some background in discrete probability and applied statistics & data analysis.
CSE 109: Programming in C++, or enough experience with C++, Java, C, or MATLAB to complete a small software project without faculty supervision.

If you have any questions about prerequisites, ask the instructor: hsb2@lehigh.edu.


