CMPT 983 (Spring 2021): Special topics in Artificial Intelligence - Grounded Natural Language Understanding

Overview

This course is a graduate-level, seminar-oriented research course covering topics at the intersection of language, vision, graphics, and robotics. The class focuses on the grounding of language to various representations and modalities. Students are expected to have prior experience with deep learning concepts and frameworks (PyTorch, TensorFlow, etc.), and should also be familiar with at least one of the following areas: natural language processing, vision, graphics, or robotics.

Each week, students will read papers in a particular area of language grounding, and discuss the contributions, limitations and interconnections between the papers. Students will also work on a research project during the course, culminating in a final presentation and written report. The course aims to provide practical experience in comprehending, analyzing and synthesizing research in grounded natural language understanding.

Note: This course is NOT an introductory course to natural language processing. If you are interested in learning about natural language processing, CMPT 413/825 is offered in the fall.

Background

There are no formal prerequisites for this class. However, you are expected to be familiar with the following:

Natural language processing
Deep learning

For some topics that we will cover in the class, it is also helpful to be familiar with:

3D representations
Reinforcement learning

Topics

Grounded language acquisition and understanding
Grounding of language to machine interpretable programs (semantic parsing)
Visual grounding of language and tasks (captioning, VQA models, referring expressions)
Interpretation of language commands for embodied navigation and interaction
Interactive language learning through language games and dialogue
Grounded knowledge representations for mapping language to the 3D world
Generative models for content creation from text

Quick info

Instructor: Angel Chang
TA: Sonia Raychaudhuri
Lectures: Mondays 4:30pm to 6:20pm and Thursdays 4:30pm to 5:20pm, online on Canvas (https://canvas.sfu.ca/courses/56701). Sessions will be “live” (synchronous).
Office hours:
- Angel: Thursdays 3:00pm to 4:00pm
- Sonia: Tuesdays 3:00pm to 4:00pm

Syllabus

Below is a tentative outline for the course.

R: Readings. BG: (Optional) background material or reading for deeper understanding, provided for reference. V: Video.

Date	Topic	Notes
Jan 11	Introduction to grounding & logistics [slides]	BG: The Symbol Grounding Problem; BG: Six lessons from babies; V: How language shapes the way we think
Jan 14	How to read papers & project overview [slides]	BG: How to read a paper
Jan 18	Review of basic deep learning models [slides]	BG: Deep learning; BG: Contextual word representations
Jan 21	Multimodal embeddings [slides]	BG: Multimodal Machine Learning
Jan 25	Paper discussion 1	R: DeViSE; R: Deep Multimodal Embedding: Manipulating Novel Objects with Point-clouds, Language and Trajectories
Jan 28	Attention for multimodal grounding [slides]	BG: Attention? Attention!
Feb 1	Paper discussion 2	R: Show, Attend and Tell: Neural Image Caption; R: MAttNet: Modular Attention Network for Referring Expression Comprehension
Feb 4	Pre-training with transformers [slides]	BG: Attention Is All You Need; BG: The Illustrated Transformer
Feb 8	Paper discussion 3	R: ViLBERT; R: CLIP
Feb 11	Compositional grounding and structured representations [slides]	BG: Linguistic generalization and compositionality in modern artificial neural networks; BG: Relational inductive biases, deep learning, and graph networks
Feb 15	No class - Reading break
Feb 18	No class - Reading break
Feb 22	Paper discussion 4	Project proposal due; R: Grounded Compositional Semantics For Finding And Describing Images With Sentences; R: Learning to Represent Image and Text with Denotation Graph
Feb 25	Semantic parsing for grounding [slides]	BG: Semantic parsers; BG: Language to Logical Form with Neural Attention
Mar 1	Paper discussion 5	R: Learning to compose neural networks for question answering; R: Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Mar 4	Speaker-listener models [slides]	BG: Rational Speech Acts
Mar 8	Paper discussion 6	R: ShapeGlot: Learning Language for Shape Differentiation; R: Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog
Mar 11	Instruction following (intro to RL) [slides]	BG: Deep RL; BG: A (Long) Peek into Reinforcement Learning
Mar 15	Paper discussion 7	R: Mapping Instructions and Visual Observations to Actions with Reinforcement Learning; R: Learning Interpretable Spatial Operations in a Rich 3D Blocks World
Mar 18	Instruction following (visual language navigation) [slides]	BG: Visual language Navigation
Mar 22	Paper discussion 8	Project milestone due; R: Sub-Instruction Aware Vision-and-Language Navigation; R: RMM: A Recursive Mental Model for Dialogue Navigation
Mar 25	Instruction following (rearrangement) [slides]	BG: Rearrangement
Mar 29	Paper discussion 9	R: Language-Conditioned Imitation Learning for Robot Manipulation Tasks; R: Few-shot Object Grounding and Mapping for Natural Language Robot Instruction Following
Apr 1	Interactive language learning [slides]	BG: Power to the people
Apr 5	No class - Easter Holiday
Apr 8	Text conditioned content generation 1 [slides]	BG: Generative models
Apr 12	Text conditioned content generation 2 [slides]	BG: Text to image survey; BG: 3D generative models
Apr 15	Project presentations and conclusion [slides]	Project writeup due

Grading

35% paper reading and reports
10% paper presentations
15% paper discussions and participation
40% course project
- 10% proposal: 5% presentation + 5% written proposal
- 10% milestone: 5% presentation + 5% report
- 10% final presentation
- 10% final report

General policies

Academic integrity

SFU’s Academic Integrity web site is filled with information on what is meant by academic dishonesty, where you can find resources to help with your studies and the consequences of cheating. Check out the site for more information and videos that help explain the issues in plain English.

Each student is responsible for his or her conduct as it affects the University community. Academic dishonesty, in whatever form, is ultimately destructive of the values of the University. Furthermore, it is unfair and discouraging to the majority of students who pursue their studies honestly. Scholarly integrity is required of all members of the University. Please refer to this web site.