CSCI 699: Trustworthy Large Foundation Models

Spring 2025, Monday 3:30 - 6:50 pm PST. GFS 213
Instructor: Jieyu Zhao

Teaching Assistants: Taiwei Shi


Introduction

Although there have been impressive advancements in large foundation models (LFMs, e.g., LLMs and VLMs), several studies have reported that NLP models contain social biases. Worse, these models risk further amplifying stereotypes and causing harm to people. As LFMs continue to advance and are integrated into domains such as healthcare, finance, marketing, and social media, they raise important ethical concerns that need to be addressed. In this course, students will critically examine the ethical implications of NLP, including issues related to bias/fairness, privacy, safety, and social impact. Through discussions, case studies, and guest lectures, students will explore the ethical challenges associated with AI models and develop a deep understanding of the ethical considerations that arise when designing, implementing, and deploying large foundation models.

Students will gain a broad understanding of the issues in current large foundation models and of how current research tries to alleviate them. This class will equip students with the ability to read and write critical reviews of research papers. At the same time, they will learn how to conduct research related to AI fairness, interpretability, and robustness.

News:

Course Staff

Jieyu Zhao

Office Hours: directly after class

Taiwei Shi

Office Hours: TBD

Logistics

Prerequisites

Schedule

All assignments are due by 11:59pm on the indicated date.

Sign up for the paper presentation & project team members here.

Check the spreadsheet for the papers assigned for each week.

| Week | Date | Topic | Related Readings | Assignments |
|------|-------|-------|------------------|-------------|
| 1 | 01/13 | Introduction, Logistics, and Biases in NLP | Introduction, NLP Bias | |
| 2 | 01/20 | MLK Holiday. No class | | |
| 3 | 01/27 | LLMs and VLMs; Guest Lecture (Sidi Lu): Beyond autoregressive language models: insertion-based models, diffusion text model and parallel decoding | Check the signup sheet. | Find your project team members. |
| 4 | 02/03 | VLMs & Biases in LFMs; Guest Lecture (Zhecan Wang): TBD | | |
| 5 | 02/10 | Safety in LLMs; Guest Lecture (Weijia Shi): Beyond Monolithic Language Models | | |
| 6 | 02/17 | Presidents' Day. No class | | Project Proposal Due |
| 7 | 02/24 | Harms in downstream tasks; Guest Lecture (Yue Huang): Trust LLMs | | |
| 8 | 03/03 | Human-AI Alignment; Guest Lecture (Xiaoyuan Yi): AI Alignment | | |
| 9 | 03/10 | Human-AI Alignment (cont'd) + Privacy; Guest Lecture (Yiren Feng): TBD | | Midterm Report Due |
| 10 | 03/17 | Spring recess. No class | | |
| 11 | 03/24 | Project Midterm Presentation Workshop I; Guest Lecture (Ninghao Liu): Mechanistic interpretability for AI Safety | | |
| 12 | 03/31 | Project Midterm Presentation Workshop II; Guest Lecture (Zining Zhu): TBD | | |
| 13 | 04/07 | AI Agents; Guest Lecture (Xuezhe Ma): TBD | | |
| 14 | 04/14 | LFMs + X; Guest Lecture (Oliver Liu): Auditable decision making under uncertainty | | |
| 15 | 04/21 | Final Project Presentations; Guest Lecture (Zhenyuan Qin): AI + medicine | | |
| 16 | 04/28 | Final Project Presentations; Guest Lecture (Xinyi Wang): Understanding Large Language Models from Pretraining Data Distribution | | Final Report Due |

Grading

Grades will be based on attendance (10%), paper presentation (30%), and a course project (60%).

Attendance and Discussion (10% total):

Paper Reading and Discussion (30% total):

Course Project (60% total):

Late days

You have 4 late days you may use on any assignment **excluding the final report**. Each late day allows you to submit the assignment 24 hours later than the original deadline. You may use a maximum of 2 late days per assignment. If you are working in a group for the project, submitting the project proposal or midterm report one day late means that each member of the group spends a late day.

Paper Presentation

Paper presentations will help students develop the skills to give research talks. Each student will present 2 papers to the class, preparing the slides for each paper and leading the discussion. Each week, another student will sign up as the feedback provider (reviewer); the reviewer will provide feedback to the instructor or the TAs. Grading rubric: correctness of the content (40%), clarity (20%), discussion (20%), slides & presentation skills (20%).

Final Project

The final project can be done individually or in groups of up to 3. This is your chance to freely explore machine learning methods and how they can be applied to a task of your choice. You will also learn best practices for developing machine learning methods: inspecting your data, establishing baselines, and analyzing your errors.

Each group needs to complete one research project related to the class topics. The project should produce a "deliverable" result, meaning it should be self-contained, reproducible, and scientifically correct. A typical successful project could be: 1) a novel and sound solution to an interesting research problem; 2) correct and meaningful comparisons among baselines and existing approaches; or 3) an application of existing techniques to a new problem. We will not penalize negative results, as long as your proposed approach is thoroughly explored and justified. Overall, the project should showcase the student's ability to think critically, conduct rigorous research, and apply the concepts learned in the course to address a relevant problem in the field of NLP ethics.

Students should use the standard *ACL paper submission template to write their final report for the course project.

Resources

The following courses are relevant: