CSCI 699: Trustworthy Large Foundation Models

Spring 2025, Monday 3:30 - 6:50 pm PST. GFS 213
Instructor: Jieyu Zhao

Teaching Assistants: Taiwei Shi


Introduction

Although there have been impressive advancements in large foundation models (LFMs, e.g., LLMs and VLMs), several studies have reported that NLP models contain social biases. Worse, these models risk further amplifying stereotypes and causing harm to people. As LFMs continue to advance and are integrated into domains such as healthcare, finance, marketing, and social media, they raise important ethical concerns that need to be addressed. In this course, students will critically examine the ethical implications of NLP, including issues related to bias/fairness, privacy, safety, and social impact. Through discussions, case studies, and guest lectures, students will explore the ethical challenges associated with AI models and develop a deep understanding of the ethical considerations that arise when designing, implementing, and deploying large foundation models.

Students will gain a broad understanding of the issues in current large foundation models and of how current research tries to alleviate them. This class will equip students with the ability to read and write critical reviews of research papers. At the same time, they will learn how to conduct research related to AI fairness, interpretability, and robustness.

News:

Course Staff

Jieyu Zhao

Office Hours: directly after class

Taiwei Shi

Office Hours: TBD

Logistics

Prerequisites

Schedule

All assignments are due by 11:59pm on the indicated date.

Sign up for the paper presentation & project team members here.

Check the spreadsheet for the papers assigned for each week.

| Week | Date | Topic | Related Readings | Assignments |
|------|-------|-------|------------------|-------------|
| 1 | 01/13 | Introduction, Logistics, and Biases in NLP | Introduction, NLP Bias | |
| 2 | 01/20 | MLK Holiday. No class | | |
| 3 | 01/27 | LLMs and VLMs; Guest Lecture (Sidi Lu): Beyond autoregressive language models: insertion-based models, diffusion text model and parallel decoding | Check the signup sheet. | Find your project team members. |
| 4 | 02/03 | VLMs & Biases in LFMs; Guest Lecture (Zhecan Wang): TBD | | |
| 5 | 02/10 | Safety in LLMs; Guest Lecture (Weijia Shi): Beyond Monolithic Language Models | | |
| 6 | 02/17 | Presidents' Day. No class | | Project Proposal Due |
| 7 | 02/24 | Harms in downstream tasks; Guest Lecture (Yue Huang): Trust LLMs | | |
| 8 | 03/03 | Human-AI Alignment; Guest Lecture (Xiaoyuan Yi): AI Alignment | | |
| 9 | 03/10 | Human-AI Alignment (cont'd) + Privacy; Guest Lecture (Yiren Feng): TBD | | Midterm Report Due |
| 10 | 03/17 | Spring recess. No class | | |
| 11 | 03/24 | Project Midterm Presentation Workshop I; Guest Lecture (Ninghao Liu): Mechanistic interpretability for AI Safety | | |
| 12 | 03/31 | Project Midterm Presentation Workshop II; Guest Lecture (Zining Zhu): TBD | | |
| 13 | 04/07 | AI Agents; Guest Lecture (Xuezhe Ma): TBD | | |
| 14 | 04/14 | LFMs + X; Guest Lecture (Oliver Liu): Auditable decision making under uncertainty | | |
| 15 | 04/21 | Final Project Presentations; Guest Lecture (Zhenyuan Qin): AI + medicine | | |
| 16 | 04/28 | Final Project Presentations; Guest Lecture (Xinyi Wang): Understanding Large Language Models from Pretraining Data Distribution | | Final Report Due |

Grading

Grades will be based on attendance (10%), paper presentation (30%), and a course project (60%).

Attendance and Discussion (10% total):

Paper Reading and Discussion (30% total):

Course Project (60% total):

Late days

You have 4 late days you may use on any assignment **excluding the final report**. Each late day allows you to submit the assignment 24 hours later than the original deadline. You may use a maximum of 2 late days per assignment. If you are working in a group for the project, submitting the project proposal or midterm report one day late means that each member of the group spends a late day.

Paper Presentation

Paper presentations will help students develop the skills to give research talks. Each student will present 2 papers to the class, preparing the slides for each paper and leading the discussion. Each week, another student will sign up as the feedback provider (reviewer); the reviewer will provide feedback to the instructor or the TAs. Grading rubric: correctness of the content (40%), clarity (20%), discussion (20%), slides & presentation skills (20%).

Final Project

The final project can be done individually or in groups of up to 3. This is your chance to freely explore machine learning methods and how they can be applied to a task of your choice. You will also learn best practices for developing machine learning methods: inspecting your data, establishing baselines, and analyzing your errors.

Each group needs to complete one research project related to the class topics. The project should produce a "deliverable" result, meaning it should be self-contained, reproducible, and scientifically correct. A typical successful project could be: 1) a novel and sound solution to an interesting research problem; 2) correct and meaningful comparisons among baselines and existing approaches; or 3) an application of existing techniques to a new problem. We will not penalize negative results, as long as your proposed approach is thoroughly explored and justified. Overall, the project should showcase the student's ability to think critically, conduct rigorous research, and apply the concepts learned in the course to address a relevant problem in the field of NLP ethics.

Students should use the standard *ACL paper submission template to write their final report for the course project.

Resources

The following courses are relevant: