# CSE 347 Data Mining (3)

### Instructor:

Ting Wang (Fall 2017)

### Current Catalog Description

Overview of modern data mining techniques: data cleaning; attribute and subset selection; model construction, evaluation and application. Fundamental mathematics and algorithms for decision trees, covering algorithms, association mining, statistical modeling, linear models, neural networks, instance-based learning and clustering are covered. Practical design, implementation, application and evaluation of data mining techniques in class projects. Credit will not be given for both CSE 347 and CSE 447. Prerequisites: CSE 17 and (CSE 160 or CSE 326) and (MATH 231 or ECO 045 or ISE 121).

### Textbook

Pang-Ning Tan, "Introduction to Data Mining", 2nd Ed., Addison-Wesley (2013), ISBN 978-0133128901

Leskovec, Rajaraman & Ullman, "Mining of Massive Datasets", 2nd Ed., Cambridge University Press (2014), ISBN 978-1107077232

### References

None

### COURSE OUTCOMES

### Students will:

- Understand the principles of data mining.
- Be aware of the challenges that arise in data mining.
- Know a range of techniques for data mining and where they can be applied.
- Become aware of ethical issues that are present in data mining applications.

### RELATIONSHIP BETWEEN COURSE OUTCOMES AND STUDENT ENABLED CHARACTERISTICS

### CSE 347 substantially supports the following student enabled characteristics

**A.** An ability to apply knowledge of computing and mathematics appropriate to the discipline

**B.** An ability to analyze a problem and identify and define the computing requirements appropriate to its solution

**I.** An ability to use current techniques, skills, and tools necessary for computing practices

**J.** An ability to apply mathematical foundations, algorithmic principles, and computer science theory in the modeling and design of computer-based systems in a way that demonstrates comprehension of the tradeoffs involved in design choices

**K.** An ability to apply design and development principles in the construction of software systems of varying complexity

### Major Topics Covered in the Course

- machine learning for data mining
- input: concepts, instances, and attributes
- output: knowledge representation
- statistical modeling
- constructing decision trees
- mining association rules
- linear models
- instance-based learning
- clustering
- predicting performance
- data mining ethics

### Assessment Plan for the Course

The students are given eight homework assignments that relate to the assigned readings and material presented in lectures. There is one midterm exam about halfway through the course on the topics covered up to that point. There is a final course project to identify and tackle a "real-life" data mining problem. Students are required to submit a write-up and make a short oral presentation describing their work on their final projects.

### How Data in the Course are Used to Assess Program Outcomes: (unless adequately covered already in the assessment discussion under Criterion 4)

Each semester I include the above data from the assessment plan for the course in my self-assessment of the course. This report is reviewed, in turn, by the Curriculum Committee.