CSE 432/532 Machine Learning
Catalog description:
This course introduces the process, methods, and computing tools fundamental to machine learning. Students will write code that works on large real-world datasets to accomplish tasks such as predicting outcomes, discovering associations, and identifying similar groups. Students will complete a term project showcasing the different steps of the machine learning process, from data cleaning to the extraction of accurate models and the visualization of results.
Prerequisite:
CSE 274
Required topics (approximate weeks allocated):
- Introduction to the course, including logistics and syllabus (0.5)
- Setting up the technologies used in this course, including Anaconda and Jupyter notebooks (0.5)
- Principles of Python programming in a professional environment (2)
  - Control flow in Python (loops, functions, conditionals)
  - Handling of data through files or in-memory structures (lists, associative arrays)
  - Use of the Pandas library for mapping and filtering
- Essentials of data cleaning and transformation (2.5)
  - Detecting outliers
  - Filling in missing values
  - Feature engineering
  - Dimensionality reduction
  - Data balancing
- Overview of key machine learning tasks, e.g. classification, clustering (0.5)
- Standard techniques for classification (3)
  - Decision trees
  - Support vector machines
  - Ensemble learning (e.g. random forests)
- Overview of possible course projects (0.5)
- Unsupervised learning (3)
  - Artificial neural networks on TensorFlow
  - Clustering
- Visualizing machine learning models (1.5)
  - Principles of scientific visualization applied to machine learning
  - Programming visualizations within a machine learning workflow
Learning outcomes:
- Describe how to create accurate and generalizable models from large and messy datasets.
- Implement code to clean data and derive a model using an appropriate machine learning algorithm.
- Present solutions to stakeholders using visualizations and professional machine learning workflows.
- Write machine learning applications using techniques learned independently from online resources (graduate students only).