Overview
This seminar discusses advanced topics at the interface of LLMs and KAAI. It is a block seminar and will take place on two consecutive days in the summer term 2025. There will also be two meetings at the beginning of the semester, for which participation is mandatory.
- Type: Seminar (0/2/0)
- Teacher: Simon Razniewski (lecturer; advisor for topics 3-7 and 9-16), Yujia Hu (advisor for topics 1, 2, and 8)
- Modules: CMS-SEM, CMS-LM-ADV, CMS-LM-AI, INF-PM-FOR, INF-VERT2, INF-AQUA
Registration
- The number of participants is limited, with preference given to Master's students
- To express interest, send an email to the lecturer (simon.razniewski@tu-dresden.de), including a short motivation statement and your transcript
- Places will be allocated based on background match (courses taken) and motivation
Topics
Group 1: Knowledge and representation
How LLMs acquire, store, and represent knowledge (factual, cultural, physical, ontological).
- 1. Facts and hallucinations in LLMs
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Javier Ferrando, Oscar Obeso, Senthooran Rajamanoharan, Neel Nanda
Locating and Editing Factual Associations in GPT
K Meng, D Bau, A Andonian, Y Belinkov
- 2. Physical world knowledge
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
Wei Chow, Jiageng Mao, Boyi Li, Daniel Seita, Vitor Guizilini, Yue Wang
- 3. Knowledge representation in LLMs
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Lukas Berglund et al
Supposedly Equivalent Facts That Aren't? Entity Frequency in Pre-training Induces Asymmetry in LLMs
Y He et al
- 4. Entities and LLMs
EntGPT: Linking generative large language models with knowledge bases
Y Ding, A Poudel, Q Zeng, T Weninger, B Veeramani, S Bhattacharya
Instructed language models with retrievers are powerful entity linkers
Z Xiao, M Gong, J Wu, X Zhang, L Shou, J Pei, D Jiang
- 5. Cultural knowledge
Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking
Yi Fung, Ruining Zhao, Jae Doo, Chenkai Sun, Heng Ji
CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge
Yu Ying Chiu et al.
- 6. Taxonomy construction and refinement with LLMs
Refining Wikidata taxonomy using large language models
Y Peng, T Bonald, M Alam
Towards ontology construction with language models
M Funk, S Hosemann, JC Jung, C Lutz
Group 2: Reasoning and Simulation
How LLMs think, reason step-by-step, simulate humans, and form internal models of the world.
- 7. Chain of thought reasoning
Chain-of-thought prompting elicits reasoning in large language models
J Wei et al
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
Sachit Menon, Richard Zemel, Carl Vondrick
- 8. World model emergence in LLMs
Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
Kenneth Li, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data
Treutlein et al.
- 9. Theory of mind
How FaR Are Large Language Models From Agents with Theory-of-Mind?
Pei Zhou et al.
Testing theory of mind in large language models and humans
JWA Strachan et al.
- 10. Cognitive Psychology on LLMs
Using cognitive psychology to understand GPT-3
Marcel Binz, Eric Schulz
- 11. LLM versus human language learning
BabyLM challenge
Charpentier et al.
- 12. Simulated humans
Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?
J. J. Horton
Generative Agents: Interactive Simulacra of Human Behavior
J. S. Park, J. O’Brien, C. J. Cai, M. R. Morris, P. Liang, and M. S. Bernstein
Out of One, Many: Using Language Models to Simulate Human Samples
L. P. Argyle, E. C. Busby, N. Fulda, J. R. Gubler, C. Rytting, and D. Wingate
Group 3: Evaluation, Bias, and Contextual Limits
How we evaluate LLMs, understand their uncertainties, detect biases, and identify their limitations.
- 13. Confidence calibration
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, Bryan Hooi
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian, Eric Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning
- 14. LLM biases
Bias runs deep: Implicit reasoning biases in persona-assigned LLMs
S Gupta, V Shrivastava, A Deshpande, A Kalyan
Investigating subtler biases in LLMs: Ageism, beauty, institutional, and nationality bias in generative models
M Kamruzzaman, MMI Shovon, GL Kim
- 15. Evaluating factual knowledge
How Reliable are LLMs as Knowledge Bases? Re-thinking Factuality and Consistency
Danna Zheng, Mirella Lapata, Jeff Z. Pan
Head-to-Tail: How knowledgeable are large language models (LLMs)? AKA will LLMs replace knowledge graphs?
K Sun, YE Xu, H Zha, Y Liu, XL Dong
- 16. Historical perspective
As We May Think
Vannevar Bush
(Own topic suggestions are welcome as well.)
Deliverables
There are five deliverables. To pass the course, all must be submitted on time. Percentages in brackets denote the contribution to the final grade.
- Outline of report (5%)
- Report 1st version (0%)*
- Reviews on two other reports (15%)
- Report final version (40%)
- Presentation (40%)
* The 1st version is not graded, but it is the prime chance to obtain feedback from your advisor and peers.
Tentative timeline
- Mon 7.4.: Application deadline
- Wed 9.4.: Notification of accepted participants
- Wed 16.4., 10am-12: "Introduction to KAAI" lecture, location S14-745
- Fr 25.4., 10am-12: "Seminar survival skills" lecture + topic assignment, location S14-745
- Wed 7.5.: 1st deliverable due
- 12.-20.5.: Meetings with advisors
- Wed 11.6.: 2nd deliverable due + submit bids for reviewing other papers
- Thu 19.6.: 3rd deliverable due
- Thu 3.7.: 4th deliverable due
- Mon, Tu 14./15.7.: Block seminar presentations (S14-621 and S14-745)
Material
- Slides 1st meeting
- Slides 2nd meeting
- Report template
- EasyChair for reviewing