Serial Number
66456
Course Number
Data5010
Course Identifier
946 U0100
No Class
- 3 Credits
Elective
Data Science Degree Program
Data Science Degree Program
Elective
- Ming-Chung Chang
- Wed 6, 7, 8
新504
Type 1
30 Student Quota
NTU 30
No Specialization Program
- Chinese
- NTU COOL
- Core Capabilities and Curriculum Planning
- Notes
- Limits on Course Adding / Dropping
Restriction: MA students and beyond
NTU Enrollment Status
Enrolled 0/30, Other Depts 0/0, Remaining 0, Registered 0
- Course Description
Week 1: Introduction to Data Science and Matrix Algebra
Week 2: Data Collection: Survey sampling
Week 3: Data Collection: Factorial design and Space-filling design
Week 4: Data Analysis: Supervised Learning I -- Linear model and Generalized linear model
Week 5: Data Analysis: Supervised Learning II -- Nonparametric regression
Week 6: Data Analysis: Supervised Learning III -- Gaussian process regression
Week 7: Data Analysis: Supervised Learning IV -- Discriminant analysis
Week 8: Data Analysis: Supervised Learning V -- Support vector machine
Week 9: Midterm exam week
Week 10: Data Analysis: Supervised Learning VI -- Bagging, Random forests, Boosting
Week 11: Data Analysis: Supervised Learning VII -- Deep neural networks
Week 12: Data Analysis: Unsupervised Learning I -- Principal component analysis, Factor analysis, Canonical correlation analysis
Week 13: Data Analysis: Unsupervised Learning II -- Clustering methods
Week 14: Big Data Issue I (p>>n): Feature screening
Week 15: Big Data Issue II (n>>p): Subdata selection
Week 16: Final exam week
Week 17: Flexible teaching
Week 18: Flexible teaching
- Course Objective
The aim of this course is to introduce a variety of statistical and machine learning methods for data analysis. Three core techniques for data science, namely data collection, supervised learning, and unsupervised learning, are introduced in detail. Some recent developments in big and high-dimensional data are also covered. The software I will be using for the course is R (website: https://www.r-project.org/).
- Course Requirement
Basic statistical concepts and theory, along with programming skills, are required; the course Data5004 Statistical Foundations of Data Science (I) is helpful for understanding the material in this course. Grading is based on homework assignments (20%), project presentations (40%), and paper presentations (40%). Students may use any software, not limited to R, for programming in their project presentations. Students are advised to check last year's course content on the instructor's personal website to decide whether the course is a suitable fit: https://sites.google.com/view/mcchang/teaching?authuser=0
- Expected weekly study hours before and/or after class
- Office Hour
*This office hour requires an appointment
- Designated Reading
- References
Textbook:
1. James, G., Witten, D., Hastie, T., and Tibshirani, R. (2021), An Introduction to Statistical Learning: with Applications in R, 2nd Edition, Springer.
Reference books:
1. Johnson, R.A. and Wichern, D.W. (2007), Applied Multivariate Statistical Analysis, 6th Edition, Prentice Hall.
2. Hastie, T., Tibshirani, R., and Friedman, J. (2009), The Elements of Statistical Learning, 2nd Edition, Springer.
3. Fan, J., Li, R., Zhang, C.-H., and Zou, H. (2020), Statistical Foundations of Data Science, CRC Press.
4. Selected papers.
- Grading
- NTU has not set an upper limit on the percentage of A+ grades.
- NTU uses a letter grade system for assessment. The grade percentage ranges and the single-subject grade conversion table in the National Taiwan University Regulations Governing Academic Grading are for reference only. Instructors may adjust the percentage ranges according to the grade definitions. For more information, see the Assessment for Learning Section.
- Adjustment methods for students
- Make-up Class Information
- Course Schedule