Deep Learning for Music Analysis and Generation

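Serial number
32271
Course number
CommE5070
Course identifier
942 U0840
No class section
3 credits
Elective
Graduate Institute of Electrical Engineering / Graduate Institute of Communication Engineering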
Yi-Hsuan Yang (楊奕軒)
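- Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Thursday, periods 6, 7, 8
Please contact the department office
Type 2
Enroll after obtaining an authorization code from the instructor
Total enrollment limit: 60
NTU students: 60
No specialization program
Taught in Chinese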
NTU COOL
Chart of core competencies and curriculum planning

Notes
Class location: 學新 118
Course overview
"Music Information Research" (MIR) is an interdisciplinary research field concerned with the analysis, retrieval, processing, and generation of musical content or information. Researchers involved in MIR may have a background in signal processing, machine learning, information retrieval, human-computer interaction, musicology, psychoacoustics, psychology, or some combination of these. In this course, we are mainly interested in the application of machine learning, in particular deep learning, to music-related problems. Specifically, the course is divided into two parts: analysis and generation. The first part covers the analysis of musical audio signals, with topics such as feature extraction and representation learning for musical audio, music audio classification, melody extraction, automatic music transcription, and musical source separation. The second part covers the generation of musical material, including symbolic-domain data such as MIDI or tablatures, and audio-domain music signals such as singing voices and instrumental music. This involves deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), Transformers, and diffusion models.

Here is a tentative schedule of the course:
W1. Introduction to the course
W2. Fundamentals & Music representation
W3. Analysis I (timbre): Automatic music classification and representation learning (HW1: to be announced)
W4. Generation I: Source separation
W5. Generation II: GAN & Vocoders
W6. Generation III: Synthesis of notes and loops (HW2: to be announced)
W7. Analysis II (pitch): Music transcription, Melody extraction, and Chord Recognition
W8. Generation IV: Symbolic MIDI generation
W9. Generation V: Symbolic MIDI generation: Advanced Topics (HW3: to be announced)
W10. Generation VI: Singing voice generation
W11. Generation VII: Text-to-music generation
W12. Proposal of ideas of final projects
W13. Generation VIII: Differentiable DSP models and automatic mixing
W14. Miscellaneous Topics
W15. Break
W16. Oral presentation of final projects
Course objectives
1. Understanding of different aspects of music (timbre, rhythm, pitch, harmony, and structure) and the use of domain knowledge for the corresponding music signal analysis tasks.
2. Understanding of and hands-on experience with applying deep learning techniques to music audio signal analysis.
3. Understanding of and hands-on experience with deep generative models for both musical audio and text-like music data such as MIDI.
4. A taste of the fun of research.
Course requirements
I would assume that students taking this course:
- have a good background in machine learning and mathematics (e.g., have taken courses such as Machine Learning, Deep Learning, Signals and Systems, Digital Signal Processing, Linear Algebra, Probability and Statistics)
- have good coding experience in Python and a deep learning framework such as PyTorch
- have a great interest in music
Expected weekly study hours before and/or after class
Office hours
Required reading
References
Jakub M. Tomczak, Deep Generative Modeling. Springer, 2022. ISBN 978-3-030-93158-2.
Grading
Homework (60%): 3-4 coding assignments related to building ML/DL models
Final project (40%): for teams of 2 or 3; oral presentation + written report
Accommodations for students facing difficulties
Description of accommodations
A3
Provide students with flexible ways of attending courses
Make-up class information
Course schedule