CMPT 839 (Spring 2025): Advanced Natural Language Processing and Understanding

Overview

This course is a graduate-level research course covering advanced topics in NLP, introducing the state-of-the-art methods for computational understanding, analysis, and generation of natural language text. In recent years, advances in deep learning models for NLP has transformed the ability of computers to converse with human using language, giving us multi-lingual, multi-models that are capable of answering questions, composing message, translating and summarizing documents. The development of large language models (LLMs) are built on top of neural models such as transformers, that allows for the scaling up of models and training with large amounts of data. In this course, we will focus on current state-of-the-art methods in NLP including how to do parameter efficient fine-tuning, techniques for scaling models to long sequences, etc. We will also go beyond transformers to learn alternative architectures such as state-space models.

Students are expected to have prior experience with deep learning concepts and framework (Pytorch, Tensorflow, etc), and should also be familiar with basic natural language processing.

Each week, students will read papers in a particular area of natural language processing, and discuss the contributions, limitations and interconnections between the papers. Students will also work on a research project during the course, culminating in a final presentation and written report. The course aims to provide practical experience in comprehending, analyzing and synthesizing research in natural language processing and understanding.

Note: This course is NOT an introductory course to natural language processing. If you are interested in taking a introductory course about natural language processing, please take CMPT 413/713.

Background

There are no formal prerequisites for this class. However, you are expected to be familiar with the following:

Natural language Processing
Deep learning

Topics

Transformer models in NLP
Model design choices
Pre-training and fine-tuning LMs
Advanced inference techniques
LLM agents
Multimodal models

Quick info

Instructor: Angel Chang
TA: Yiming Zhang
Lectures: Mondays 12:30pm to 2:20pm and Wednesdays 12:30pm to 1:20pm (SWH10061).
Office hours:
- Angel: Wed 1:30pm to 2:30pm
- Yiming: Mon 3:00pm - 4:00pm

Syllabus

Below is a tentative outline for the course.

R: Readings, BG: (Optional) Background material / reading for deeper understanding. Provided for reference.

Date	Topic	Notes
Jan 6	Introduction, NLP review & logistics	BG: Stanford CS324 intro to LLMs
Jan 8	Transformers and LLMs	BG: Attention is all you need
Jan 13	Training LLMs (pre-training, post-training)	BG: Instruction tuning survey
Jan 15	Paper discussion 0	D: LLaMa D: InstructGPT
Jan 20	Fine-tuning LLMs (PEFT)
Jan 22	Ensembling and mixture-of-experts	BG: Review of mixture of experts
Jan 27	Using LLMs (Prompting and retrieval) Potential project topics	BG: Prompting survey BG: ACL 2023 Text generation tutorial BG: ACL 2023 RAG tutorial
Jan 29	Paper discussion 1a	D: Towards a Unified View of Parameter-Efficient Transfer Learning
Feb 3	Efficient LLMs: Pruning, quantization, distillation	BG: Inference optimization (Lillian Weng)
Feb 5	Paper discussion 1b	D: RAG for knowledge intensive NLP
Feb 10	Handling long context / State space models Paper discussion 2a	BG: Efficient Transformers BG: UniMem LolCats
Feb 12	Paper discussion 2b	D: Mamba
Feb 17	No class - Reading break
Feb 19	No class - Reading break
Feb 24	Reasoning Project proposals
Feb 26	Project proposals	Due: Project proposal
Mar 3	Multimodal models
Mar 5	Paper discussion 3a	D: DeepSeek-R1
Mar 10	LLM agents Paper discussion 3b	D: ViperGPT
Mar 12	Paper discussion 4a	D: ChemCrow
Mar 17	LLM attacks and social impact Project milestone presentations
Mar 19	Project milestone presentations	Due: Project milestone
Mar 24	Guest lecturer: Hassan Shavarani - LLMs for information extraction
Mar 26	Guest lecturer: Linyi Li - Data centric experiences on LLMs	R: Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
Mar 31	Student tutorial: Michael Xu - Short intro to RL for LLMs
Apr 2	Paper discussion 4b	D: Sparse Autoencoders Find Highly Interpretable Features in Language Models
Apr 7	Project presentations
Apr 9	Project presentations + Conclusion
Apr 10		Due: Project final report

Grading

30% paper reading and reports
30% paper presentations, discussions and participation
40% course project
- 10% proposal: 5% presentation + 5% written proposal
- 10% milestone: 5% presentation + 5% report
- 10% final presentation
- 10% final report

General policies

Academic integrity

SFU’s Academic Integrity web site is filled with information on what is meant by academic dishonesty, where you can find resources to help with your studies and the consequences of cheating. Check out the site for more information and videos that help explain the issues in plain English.

Each student is responsible for his or her conduct as it affects the University community. Academic dishonesty, in whatever form, is ultimately destructive of the values of the University. Furthermore, it is unfair and discouraging to the majority of students who pursue their studies honestly. Scholarly integrity is required of all members of the University. Please refer to this web site.