Serial Number
38061
Course Number
IMPS5010
Course Identifier
H41 U0120
- Class 01
- 3 Credits
Elective
Master Program in Statistics of National Taiwan University
- CHEN, YAN-BIN
COMMON GENERAL EDUCATION CENTER Master Program in Statistics of National Taiwan University
yanbin@ntu.edu.tw
- Room 212, Chee-Chun Leung Cosmology Hall (次震宇宙館 212室)
02-33664688
Website
https://sites.google.com/view/yan-bin/home
- Tue 7, 8, 9
新401
Type 2
15 Student Quota
NTU 15
No Specialization Program
- English
- NTU COOL
- Notes: The course is conducted in English.
NTU Enrollment Status
Enrolled 0/15, Other Depts 0/0, Remaining 0, Registered 0
- Course Description
== Fall 2024 ==
This course offers practical training in data science, focusing on high-dimensional data computing and dimension reduction algorithms. Its distinguishing features are hands-on experience with high-performance computers and the examination of real data from a statistical perspective. Practical exercises will be conducted on high-performance GPU servers in the cloud, possibly utilizing resources such as the NVIDIA V100 from NTU or Google Colab. In addition to the hands-on exercises, statistical theories related to dimension reduction algorithms, data visualization, and data interpretation will be introduced. Python programming skills will be taught during the first month as a combined quick recap. The course is taught in English, but bilingual Q&A sessions are acceptable.
Teaching methods each week:
50 mins: Lecture.
90 mins: Students engage in hands-on exercises and paper presentations.
10 mins: Conclusion of hands-on exercises and fundamental knowledge.
*** Notice ***
Please note that there is no need to send me an email for course enrollment. If you would like to take the course but were unable to enroll, please come to class in the first week, when authorization codes may be distributed. Unsuccessful enrollments will be announced after the preliminary course selection on August 29th.
- Course Objective
Students will learn the inherent characteristics of high-dimensional data and dimension reduction techniques. They will also gain hands-on experience in accessing and operating on high-dimensional data on high-performance GPU servers. Students will be expected to complete projects that involve preprocessing, computing, and operating on high-dimensional data on these servers.
- Course Requirement
1. Students should have very basic Python programming skills before taking the course.
2. Students should bring their laptops to each class session.
- Expected weekly study hours before and/or after class
3 hours
- Office Hour
*This office hour requires an appointment
- Designated Reading
Month 1: Book 1, Chapters 3, 5, 9
Month 2: Book 2, Chapters 1, 2
Month 3: Book 2, Chapters 5, 6
Month 4: Paper study
- References
Book 1: Python for Data Analysis, 3E: Data Wrangling with Pandas, NumPy, and Jupyter, 2022, by Wes McKinney
Book 2: Nonlinear Dimensionality Reduction Techniques: A Data Structure Preservation Approach, 2021, by Sylvain Lespinats, Benoit Colange, Denys Dutykh
- Grading
10% In class: Exercise in class session
40% Midterm: Paper presentation
50% Final: Final project
- Adjustment methods for students
Adjustment Method: Description
A3: Provide students with flexible ways of attending courses
B6: Mutual agreement between students and instructors to present in other ways
C2: Written (oral) reports replace exams
- Make-up Class Information
- Course Schedule
Week 1 (9/03): Introduction
Week 2 (9/10): [Part 1: A Quick Recap of Python] Python Environment Setup
Week 3 (9/17): Public holiday
Week 4 (9/24): Data Structures and Functions, Pandas
Week 5 (10/01): Plot and Visualization
Week 6 (10/08): [Part 2: Dimensionality Reduction Techniques] Similarity Measure and Distance Function
Week 7 (10/15): Nearest Neighbors in Scikit-learn
Week 8 (10/22): Machine Learning for Artificial Intelligence
Week 9 (10/29): Supervised Learning
Week 10 (11/05): Unsupervised Dimensionality Reduction: PCA, t-SNE
Week 11 (11/12): Deep Learning: CNN
Week 12 (11/19): Natural Language Processing: NLTK
Week 13 (11/26): Research Issue: Feature Representation Learning
Week 14 (12/03): Final Project Presentation I
Week 15 (12/10): Final Project Presentation II
Week 16 (12/17): Real Case Study and Discussion
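For orientation only, the sketch below illustrates the kind of material the Week 6 and Week 7 topics (similarity measures, distance functions, nearest neighbors) refer to. It is not taken from the course materials: the function names, the toy data, and the brute-force search are all assumptions made for illustration, and the course itself lists scikit-learn for nearest-neighbor search rather than hand-written code.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_nearest(query, points, k):
    """Indices of the k points closest to `query`, found by brute-force search."""
    order = sorted(
        range(len(points)),
        key=lambda i: euclidean_distance(query, points[i]),
    )
    return order[:k]


# Hypothetical toy data: four points in a 5-dimensional space.
points = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0, 0.0],
    [10.0, 10.0, 10.0, 10.0, 10.0],
]
query = [0.9, 0.1, 0.0, 0.0, 0.0]

# Indices of the two stored points nearest to the query.
nearest = k_nearest(query, points, k=2)
print(nearest)
```

Brute-force search compares the query against every stored point, which is simple but scales linearly with the number of points; library implementations typically offer tree-based or approximate alternatives for larger data sets.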