NTU Course

Natural Language Processing

Offered in 112-2
  • Serial Number

    13328

  • Course Number

    CSIE5042

  • Course Identifier

    922 U0670

  • No Class

  • 3 Credits
  • Elective

    DEPARTMENT OF COMPUTER SCIENCE & INFOR / PROGRAM FOR KNOWLEDGE MANAGEMENT / Intelligent Medicine Program / GRADUATE INSTITUTE OF NETWORKING AND MULTIMEDIA / GRADUATE INSTITUTE OF COMPUTER SCIENCE & INFORMATION ENGINEERING


  • HSIN-HSI CHEN
    • COLLEGE OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE DEPARTMENT OF COMPUTER SCIENCE & INFOR

    • hhchen@ntu.edu.tw

    • Room 311, Computer Science Building (資訊館), College of Electrical Engineering and Computer Science
    • 02-33664888-311

  • Thu 2, 3, 4
  • Room 105, Computer Science Building (資105)

  • Type 2

  • 55 Student Quota

    NTU 55

  • Specialization Program

    Natural Language Processing

  • Chinese
  • NTU COOL
  • Core Capabilities and Curriculum Planning
  • Notes

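    PROGRAM FOR KNOWLEDGE MANAGEMENT: elective course in the systems area of the Knowledge Management Program.
    Intelligent Medicine Program: course in the data area of the College of Electrical Engineering and Computer Science, under the Intelligent Medicine credit program.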

  • Limits on Course Adding / Dropping
    • Restriction: juniors and beyond

  • NTU Enrollment Status

    Enrolled
    0/55
    Other Depts
    0/0
    Remaining
    0
    Registered
    0
  • Course Description
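    Human language is an important medium for interaction and information exchange between people, and human knowledge is recorded in language and text. Computer science research has long regarded the ability to process human language as one of the key indicators of whether a computer possesses human-like intelligence. Natural Language Processing (NLP), also called Computational Linguistics (CL) or Human Language Technology (HLT), studies the analysis and generation of human language. Its ultimate goal is for computers and users to interact directly in human language.

    The history of natural language processing can be roughly divided into the following seven stages:
    Stage 1 (1950-1965): Research focused mainly on machine translation, trying both statistical and symbolic methods.
    Stage 2 (1965-1975): Theory-driven approaches were tried, introducing Transformational Grammar for syntax and using Finite State Automata for processing.
    Stage 3 (1975-1985): Concepts such as Situation Semantics, DRT, Frames, Semantic Nets, and Conceptual Dependency were introduced, and the Augmented Transition Network was used.
    Stage 4 (1985-1995): Two methodologies were adopted, one theory-oriented and one experiment-oriented. The former includes the grammars HPSG, GPSG, and other PSGs, processed with unification. The latter includes information extraction and the construction of the Penn Treebank and WordNet.
    Stage 5 (1995-2005): Statistical methods were the mainstream of this period, and machine learning was applied to a wide range of tasks.
    Stage 6 (2005-2015): Deep learning was the mainstream of this period. NLP research started over almost from scratch, and various deep learning architectures were proposed.
    Stage 7 (2015-present): The Transformer was introduced, and pre-trained models such as BERT and large language models such as GPT-3 and ChatGPT were proposed. The "pre-train, fine-tune" and "pre-train, prompt, predict" paradigms became mainstream. The design focus shifted from the feature engineering of Stage 5 and the architecture engineering of Stage 6 to prompt engineering.

    On November 30, 2022, OpenAI released ChatGPT, which drew intense use worldwide and moved the field of natural language processing into a new era. This course has two parts. The first part covers fundamental algorithms, from traditional count-based N-gram language models, to neural-network-based language models, and on to large language models. The second part covers basic natural language processing tasks, from syntax and semantics to pragmatics. Students who take this course will learn the knowledge of Stages 5 through 7 and be able to apply it in different domains.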
  • Course Objective
    Part 1. Fundamental Algorithms
    1. Words, Collocations and Multiword Expressions
    2. N-Gram Language Models
    3. Vector Semantics and Embeddings
    4. Neural Networks and Neural Language Models
    5. Sequence Labeling for Parts of Speech and Named Entities
    6. Deep Learning Architecture for Sequence Processing
    7. RNNs and LSTMs
    8. Transformers and Pre-trained Language Models
    9. Fine-Tuning and Masked Language Models
    10. Prompting and Instruct Tuning

    Part 2. Annotating Linguistic Structure (optional)
    11. Constituency Grammars and Parsing
    12. Dependency Parsing
    13. Logical Representation of Sentence Meanings
    14. Relation and Event Extraction
    15. Time and Temporal Reasoning
    16. Word Senses and WordNet
    17. Semantic Role Labelling and Argument Structure
    18. Coreference Resolution
    19. Discourse Coherence
    20. NLP Applications
  • Course Requirement
    Midterm exam, final exam, term project, final report
  • Expected weekly study hours before and/or after class
  • Office Hour
  • Designated Reading
  • References
    Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, arXiv:1810.04805v2 [cs.CL] 24 May 2019.
    Kenneth Ward Church, Xiaopeng Yuan, Sheng Guo, Zewu Wu, Yehua Yang, and Zeyu Chen, Emerging trends: Deep Nets for Poets, Natural Language Engineering (2021), 27, pp. 631–645.
    Kenneth Ward Church, Zeyu Chen, and Yanjun Ma, Emerging trends: A Gentle Introduction to Fine-tuning, Natural Language Engineering (2021), 27, pp. 763–778.
    Pengfei Liu, et al., Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing, arXiv:2107.13586v1 [cs.CL] 28 Jul 2021.
    Jason Wei, et al., Emergent Abilities of Large Language Models, arXiv:2206.07682v2 [cs.CL] 26 Oct 2022.
    Jason Wei, et al., Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, arXiv:2201.11903v5 [cs.CL] 10 Oct 2022.
    Ziwei Ji, et al., Survey of Hallucination in Natural Language Generation, arXiv:2202.03629v5 [cs.CL] 7 Nov 2022.
    Yejin Bang, et al., A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity, arXiv:2302.04023v1 [cs.CL] 8 Feb 2023.
  • Grading
    1. NTU has not set an upper limit on the percentage of A+ grades.
    2. NTU uses a letter grade system for assessment. The grade percentage ranges and the single-subject grade conversion table in the NATIONAL TAIWAN UNIVERSITY Regulations Governing Academic Grading are for reference only. Instructors may adjust the percentage ranges according to the grade definitions. For more information, see the Assessment for Learning Section.
  • Adjustment methods for students
  • Make-up Class Information
  • Course Schedule