Digital Humanities Technologies and Applications Project

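Serial number: 60337
Course number: LING5003
Course identifier: 142 U1040
Class section: none
Credits: 3
Elective
Graduate Institute of Linguistics / no designated target students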
Instructor: 蔡宗翰
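Time: Thursday, periods 6, 7, 8
Classroom: 綜202
Add method: Type 2
Add the course after obtaining an authorization code from the instructor
Total enrollment limit: 20
NTU students: 20
No specialization program
Taught in Chinese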
NTU COOL

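Notes
Not open during the preliminary course selection round. Students who wish to take this course should fill out the questionnaire at https://reurl.cc/1v610W. Admitted students will receive an authorization code for course registration.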
Course Overview
The Digital Humanities Technologies and Applications Project course at National Taiwan University's Graduate Institute of Linguistics merges digital technology with the humanities. Focusing on natural language processing (NLP) within textual data, it covers topics like ethics in automation, algorithmic biases, and digital conservation. Through lectures, seminars, and practical workshops, students gain skills in text mining, sentiment analysis, and applying NLP to linguistic data. The course emphasizes hands-on projects, encouraging interdisciplinary research and the practical application of digital humanities methods. Students will leave with a strong foundation in digital humanities, ready to contribute to academic and professional discussions on the integration of technology and humanistic studies.
Course Objectives
The course has the following core objectives and expected learning outcomes:
1. Digital Humanities Insight: Understand the integration and impact of digital technologies in humanistic studies, focusing on ethical and technical challenges.
2. NLP Techniques: Acquire practical skills in natural language processing (NLP) for textual analysis, including text mining and sentiment analysis.
3. Critical Thinking: Develop critical perspectives on digital technology's role in cultural preservation and historical research.
4. Interdisciplinary Collaboration: Participate in interdisciplinary projects, applying digital humanities approaches to linguistic data.
5. Communication Skills: Enhance abilities in presenting research findings effectively to academic and professional audiences.
Upon completion, students will be prepared to contribute meaningfully to digital humanities discussions and projects.
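Course Requirements
1. Weekly in-class practice: topic-based, hands-on exercises applying the natural language processing methods taught each week.
2. A final project completed jointly by the whole class, the "large history model": a study of Ming-dynasty salt merchant networks and social mobility.
◆ Learning objective: carry out a historical research project using natural language processing techniques.
◆ Practical focus: through task design and text analysis techniques, fine-tune an LLM-based pretrained model for classical Chinese to analyze the social networks and social relations of Ming-dynasty salt merchants.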
Expected weekly study hours before and/or after class
Office Hours
Required Reading
To be announced
References
To be announced
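Grading
NTU has not set an upper limit on the proportion of A+ grades.
NTU uses a letter-grade system. The percentage score ranges and the course grade conversion table in the student grading regulations are for reference only, and instructors may adjust the score ranges according to the grade definitions. See the Learning Assessment section for details.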

Adjustments available to students facing difficulties

Adjustment method	Description
B6	Alternative presentation format by mutual agreement between student and instructor
C2	Written (oral) reports replace exams

Make-up Class Information

Course Schedule

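Week 1	9/05	※ Course introduction and final project introduction: building a history model; steps for constructing an LLM - Course overview
Week 2	9/12	※ Data collection and preprocessing - Devise a strategy and select appropriate text sources to ensure the data are relevant and diverse - Clean, standardize, and convert the format of the collected raw texts in preparation for later analysis
Week 3	9/19	※ Basic principles and techniques of prompt engineering - Structure and components of a prompt - Common prompt patterns and techniques - Advanced prompt engineering techniques
Week 4	9/26	※ Prompt engineering practice with ChatGPT-4 - Use prompts to have ChatGPT complete different tasks - Consider the weaknesses of ChatGPT-4 and how it could be optimized for certain humanities analysis tasks
Week 5	10/03	※ Task design: design your own ChatGPT - Design and define the specific tasks the history model must complete (e.g. 10 tasks in total)
Week 6	10/10	National Day
Week 7	10/17	※ Steps of model evaluation - Learn how to evaluate a model - Team assignments for the history model: data team and evaluation team
Week 8	10/24	※ Prompt engineering practice (1) - Design prompts for the tasks designed in Week 5 (e.g. tasks 1 to 5)
Week 9	10/31	※ Prompt engineering practice (2) - Design prompts for the tasks designed in Week 5 (e.g. tasks 6 to 10)
Week 10	11/07	※ Create the gold-standard answer set; review and submit prompts - Create a paired gold-standard answer set for the designed prompts
Week 11	11/14	※ Create the test set - Split the dataset
Week 12	11/21	※ First training results and discussion of adjustments - Model performance analysis, error analysis, and how to adjust
Week 13	11/28	※ Data augmentation - How to augment data to improve model training when data are insufficient
Week 14	12/05	※ Second training results and discussion of adjustments - Model performance analysis, error analysis, and how to adjust
Week 15	12/12	※ Final training results and discussion of adjustments - Model performance analysis, error analysis, and how to adjust
Week 16	12/19	※ Final project presentations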
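
The schedule for Weeks 2, 10, 11, and 12 covers text cleaning, a paired gold-standard answer set, a dataset split, and performance analysis. The following is a minimal Python sketch of how those steps might fit together, not the course's actual pipeline. The function names (`clean_text`, `split_dataset`, `exact_match_accuracy`), the 20% test ratio, the whitespace-stripping rule, and exact match as the metric are all assumptions made for illustration. The syllabus does not specify any of them.

```python
import random
import unicodedata


def clean_text(raw: str) -> str:
    """Apply Unicode NFKC normalization and remove all whitespace.

    Assumption: in classical Chinese source texts, whitespace is treated
    as layout residue rather than meaningful content.
    """
    normalized = unicodedata.normalize("NFKC", raw)
    return "".join(normalized.split())


def split_dataset(examples, test_ratio=0.2, seed=0):
    """Shuffle prompt/answer pairs reproducibly and return (train, test).

    The 0.2 ratio and the fixed seed are hypothetical defaults.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def exact_match_accuracy(predictions, gold_answers):
    """Return the fraction of predictions equal to the gold answer after cleaning."""
    if len(predictions) != len(gold_answers):
        raise ValueError("predictions and gold_answers must have the same length")
    if not gold_answers:
        return 0.0
    hits = sum(
        clean_text(pred) == clean_text(gold)
        for pred, gold in zip(predictions, gold_answers)
    )
    return hits / len(gold_answers)


# Hypothetical prompt/answer pairs standing in for the Week 10 answer set.
pairs = [{"prompt": f"task prompt {i}", "answer": f"gold answer {i}"} for i in range(10)]
train_set, test_set = split_dataset(pairs)
```

Exact match is the simplest possible metric and is likely too strict for open-ended tasks. For the error analysis in Weeks 12 to 15, a task-specific metric would probably be chosen per task.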