Large Language Models and Knowledge-aware AI (Seminar SS2026)
Seminar: Large Language Models and Knowledge-Aware AI
TU Dresden — Summer Term 2026
Overview
Large Language Models are the most powerful knowledge systems ever built — yet they remain, in many ways, black boxes. GPT-4, Llama, and their peers have silently absorbed vast swaths of human knowledge during training, encoding billions of facts across their parameters. But what exactly do they know? How much of it is right? And can we trust it?
This seminar tackles these questions across eight tracks:
Track 1 — Knowledge Auditing & Probing examines how we systematically audit what a model believes, from cloze-style queries to full-scale knowledge materialization, and how we benchmark what LLMs actually know versus what they merely appear to know.
Track 2 — Hallucination & Factual Reliability investigates how and why LLMs fabricate — from invented entities to confident but false claims — and whether detection or prevention is even possible, including recent theoretical arguments that hallucination may be an innate architectural limitation.
Track 3 — Knowledge Editing & Consistency explores whether targeted, surgical correction of LLM knowledge is feasible, or whether cascading inconsistencies and catastrophic forgetting make it a losing battle at scale.
Track 4 — Interpretability & Knowledge Mechanisms digs into how transformers store and recall facts — through feed-forward key-value memories, knowledge neurons, and the architectural constraints that produce phenomena like the Reversal Curse.
Track 5 — LLM Biases, Language Learning & Knowledge Construction addresses systematic distortions in what LLMs learn, including implicit biases introduced by persona assignment, and examines how structured knowledge can be constructed from or around LLMs.
Track 6 — LLM Limitations takes a broader look at fundamental shortcomings of current architectures — including failures in compositionality, long-context reasoning, counterfactual tasks, and self-correction — connecting these limitations back to the core challenge of reliable knowledge retrieval.
Track 7 — Responsible AI: Safety, Privacy & Misuse covers jailbreaking, training data extraction, memorization, and the structural reasons safety training can fail, examining both attacks and defenses.
Track 8 — Can We Afford the Perfect Prompt? examines the economics of prompting, asking whether state-of-the-art techniques like chain-of-thought are worth their computational cost, and what scaling laws apply to compound inference systems.
Logistics
| Type | Seminar (0/2/0) |
| Instructors | Simon Razniewski, Luca Giordano, Yujia Hu, Muhammed Saeed |
Registration: The number of participants is limited to 12, with priority given to Master's students. To express interest, send an email to muhammed.saeed@tu-dresden.de with a short motivation statement and your transcript. Places will be allocated based on background match (courses taken) and motivation.
Core Papers
These two papers form the backbone of the seminar. All participants should read them as shared reference points.
| Paper | Authors | Venue |
|---|---|---|
| GPTKB: Comprehensive General Knowledge from a Large Language Model | Yujia Hu, Shrestha Mohanty, Manish Shrivastava, Simon Razniewski | ACL 2025 |
| Foundations of LLM Knowledge Materialization: Termination, Reproducibility, Robustness | Luca Giordano, Simon Razniewski | EACL Findings 2026 |
GPTKB materialized 101 million triples from GPT-4o-mini via recursive prompting — creating the largest LLM-derived knowledge base to date. It revealed that LLMs can serve as massive knowledge bases, but also exposed systemic issues: ~7% false triples, fabricated entities, and deep structural inconsistencies (e.g., only 8K of 318K spouse relations are symmetric). Foundations takes the next step, formally analyzing the theoretical properties of knowledge materialization — when does it terminate, how reproducible is it across runs, and how robust is it to perturbation?
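To make the symmetry figure above concrete: 8K symmetric out of 318K spouse relations is roughly 2.5%. The sketch below shows one way such a rate could be computed over a set of triples. It is an illustration only, not the procedure used in the GPTKB paper; the function name `symmetry_rate`, the triple layout, and the toy data are assumptions made for this example.

```python
from typing import Iterable, Tuple

# Assumed layout for illustration: (subject, predicate, object)
Triple = Tuple[str, str, str]


def symmetry_rate(triples: Iterable[Triple], predicate: str) -> float:
    """Fraction of `predicate` triples (s, p, o) whose reverse (o, p, s) is also present."""
    # Collect distinct (subject, object) pairs for the chosen predicate.
    pairs = {(s, o) for s, p, o in triples if p == predicate}
    if not pairs:
        return 0.0
    # A pair counts as symmetric if the swapped pair is in the set too.
    symmetric = sum(1 for s, o in pairs if (o, s) in pairs)
    return symmetric / len(pairs)


# Toy data, made up for illustration (not taken from GPTKB).
toy_kb = [
    ("Marie Curie", "spouse", "Pierre Curie"),
    ("Pierre Curie", "spouse", "Marie Curie"),
    ("Ada Lovelace", "spouse", "William King-Noel"),  # reverse triple missing
    ("Marie Curie", "field", "physics"),              # other predicate, ignored
]

rate = symmetry_rate(toy_kb, "spouse")
print(f"symmetric spouse triples: {rate:.2%}")
```

In the toy data, two of the three spouse triples have a matching reverse, so the rate is two thirds. For a relation that should be symmetric by definition, a low rate points to missing or inconsistent triples.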
Topics
Papers are organized by thematic track. Each participant selects one paper. You are also welcome to suggest a topic of your own.
Track 1: Knowledge Auditing & Probing
What do LLMs know — and how do we find out?
| # | Paper | Venue |
|---|---|---|
| 1 | How Can We Know What Language Models Know? — Jiang et al. | TACL 2020 |
| 2 | Head-to-Tail: How Knowledgeable are Large Language Models? — Sun et al. | NAACL 2024 |
| 3 | Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing | EMNLP 2025 Findings |
| 4 | Unexpected Knowledge: Auditing Wikipedia and Grokipedia Search Recommendations | — |
| 5 | Historical Perspective: As We May Think — Vannevar Bush | The Atlantic, 1945 |
| 6 | The Reversal Curse: LLMs Trained on "A is B" Fail to Learn "B is A" — Berglund et al. | ICLR 2024 |
Track 2: Hallucination & Factual Reliability
LLMs fabricate. Can we detect it — and is it fixable?
| # | Paper | Venue |
|---|---|---|
| 7 | HALoGEN: Fantastic LLM Hallucinations and Where to Find Them | — |
| 8 | FActScore: Fine-grained Atomic Evaluation of Factual Precision — Min et al. | EMNLP 2023 |
| 9 | SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection — Manakul et al. | EMNLP 2023 |
| 10 | Hallucination is Inevitable: An Innate Limitation of LLMs — Xu et al. | arXiv 2024 |
| 11 | Do Large Language Models Know What They Don't Know? — Yin et al. | ACL 2023 Findings |
| 12 | Why Language Models Hallucinate | arXiv 2025 |
Track 3: Knowledge Editing & Consistency
When LLMs are wrong, can we fix them?
| # | Paper | Venue |
|---|---|---|
| 13 | Locating and Editing Factual Associations in GPT (ROME) — Meng et al. | NeurIPS 2022 |
| 14 | Mass-Editing Memory in a Transformer (MEMIT) — Meng et al. | ICLR 2023 |
| 15 | Evaluating the Ripple Effects of Knowledge Editing — Cohen et al. | TACL / EMNLP 2024 |
| 16 | Why Does New Knowledge Create Messy Ripple Effects in LLMs? — Qin et al. | EMNLP 2024 |
| 17 | WikiBigEdit: Understanding the Limits of Lifelong Knowledge Editing in LLMs | — |
| 18 | Model Editing at Scale Leads to Gradual and Catastrophic Forgetting — Gupta et al. | ACL 2024 Findings |
| 19 | WISE: Rethinking the Knowledge Memory for Lifelong Model Editing — Wang et al. | NeurIPS 2024 |
Track 4: Interpretability & Knowledge Mechanisms
How do transformers store and recall facts — and what are the limits?
A good starting point for getting an overview of interpretability: https://thegradient.pub/explain-yourself/
| # | Paper | Venue |
|---|---|---|
| 20 | Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet | Anthropic 2024 |
| 21 | Transformer Feed-Forward Layers Are Key-Value Memories — Geva et al. | EMNLP 2021 |
| 22 | Dissecting Recall of Factual Associations in Auto-Regressive LMs — Geva et al. | EMNLP 2023 |
| 23 | Knowledge Neurons in Pretrained Transformers — Dai et al. | ACL 2022 |
| 24 | Unveiling Factual Recall Behaviors of LLMs through Knowledge Neurons | EMNLP 2024 |
| 25 | Physics of Language Models (Storage Capacity) — Allen-Zhu & Li | ICML 2024 |
Track 5: LLM Biases, Language Learning & Knowledge Construction
Broader questions about how LLMs learn, what they get wrong systematically, and how to build structured knowledge.
| # | Paper | Venue |
|---|---|---|
| 26 | Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs — Gupta et al. | ICLR 2024 |
| 27 | Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias — Kamruzzaman et al. | EMNLP 2024 |
| 28 | FoodTaxo: Generating Food Taxonomies with Large Language Models | ACL Industry 2025 |
| 29 | Extract, Define, Canonicalize: An LLM-Based Framework for Knowledge Graph Construction | EMNLP 2024 |
Track 6: LLM Limitations
A broader look at the fundamental shortcomings of current LLM architectures — in compositionality, long-context reasoning, counterfactual tasks, and self-correction — and what these limitations mean for reliable knowledge use.
| # | Paper | Venue |
|---|---|---|
| 30 | NYT-Connections: A Deceptively Simple Text Classification Task that Stumps System-1 Thinkers | COLING 2025 |
| 31 | Mission: Impossible Language Models | ACL 2024 |
| 32 | Faith and Fate: Limits of Transformers on Compositionality — Dziri et al. | NeurIPS 2023 |
| 33 | Dissociating Language and Thought in LLMs: A Cognitive Perspective — Mahowald et al. | Trends in Cognitive Sciences 2024 |
| 34 | Large Language Models Cannot Self-Correct Reasoning Yet — Huang et al. | ICLR 2024 |
| 35 | Reasoning or Reciting? Exploring the Capabilities and Limitations of LLMs Through Counterfactual Tasks — Wu et al. | NAACL 2024 |
Track 7: Responsible AI — Safety, Privacy & Misuse
| # | Paper | Venue |
|---|---|---|
| 36 | Jailbroken: How Does LLM Safety Training Fail? — Wei et al. | NeurIPS 2023 |
| 37 | Universal and Transferable Adversarial Attacks on Aligned LLMs — Zou et al. | arXiv 2023 |
| 38 | Extracting Training Data from Large Language Models — Carlini et al. | USENIX Security 2021 |
| 39 | Quantifying Memorization Across Neural Language Models — Carlini et al. | ICLR 2023 |
| 40 | Privacy in Large Language Models: Attacks, Defenses and Future Directions | arXiv 2023 |
| 41 | Do Anything Now: Characterizing and Evaluating In-The-Wild Jailbreak Prompts | — |
Track 8: Can We Afford the Perfect Prompt?
Inspired by the EPI paper (McDonald et al.), this track examines the economics and efficiency of prompting — asking whether state-of-the-art prompting techniques are worth their computational cost.
| # | Paper | Venue |
|---|---|---|
| 42 | Can We Afford the Perfect Prompt? Balancing Cost and Accuracy with the Economical Prompting Index — McDonald et al. | ACL 2025 |
| 43 | Chain-of-Thought Prompting Elicits Reasoning in LLMs — Wei et al. | NeurIPS 2022 |
| 44 | Large Language Models are Zero-Shot Reasoners ("Think step by step") — Kojima et al. | NeurIPS 2022 |
| 45 | Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems | — |
Background Reading
These papers provide excellent overviews for seminar preparation:
- A Review of Knowledge in Language Models — AlKhamissi et al. (arXiv 2022) — Comprehensive survey on how knowledge is stored, probed, and edited in LLMs.
- A Survey on Hallucination in Large Language Models — Huang et al. (arXiv 2023) — Taxonomy, challenges, and open questions.
- Editing Large Language Models: A Survey — Yao et al. (arXiv 2023) — Covers ROME, MEMIT, MEND, and more.
- Knowledge Mechanisms in Large Language Models: A Survey — Wang et al. (EMNLP 2024 Findings) — Theoretical grounding for storage, retrieval, and consistency.
- A Comprehensive Study of Knowledge Editing for LLMs (KnowEdit) — Zhang et al. (arXiv 2024) — Benchmark and survey for knowledge editing.
Grading
The final grade consists of:
- Report (33%): A written report (max. 4 pages, ACL-style)
- Presentation (33%): 20-minute presentation
- Q&A (33%): 15-minute Q&A session (5 min by peers, 10 min by course team). Each participant asks questions of two peers, who are assigned randomly on the seminar days.
Tentative Timeline
| Date | Event |
|---|---|
| Wed 8.4. | Application deadline (extended by one day, from 7 April to 8 April) |
| Fri 10.4. | Notification of placement |
| Wed 22.4., 09:20 | "Introduction to KAAI" lecture — Location: S14-745 |
| Wed 29.4., 09:20 | "Seminar survival skills" lecture + topic assignment — Location: S14-745 |
| May | Meet with advisor |
| Mon 22.6. | Reports due |
| Mon 29.6. | Slides due |
| Mon–Tue 6.–7.7. | Block seminar (full day) |
Topic assignment
| Topic | Student | Advisor |
|---|---|---|
| Track 1 - Historical Perspective: As We May Think — Vannevar Bush | Angela | Simon |
| Track 4 - Knowledge Neurons in Pretrained Transformers — Dai et al. | Ishrak | Luca |
| Track 3 - Mass-Editing Memory in a Transformer (MEMIT) — Meng et al. | Surjo | Yujia |
| Track 5 - Extract, Define, Canonicalize: An LLM-Based Framework for Knowledge Graph Construction | Abdul | Elza |
| Track 3 - Locating and Editing Factual Associations in GPT (ROME) — Meng et al. | Karunesh | Yujia |
| Track 8 – 44 - Large Language Models are Zero-Shot Reasoners ("Think step by step") — Kojima et al. | Biswajyoti | Simon |
| Track 2 – 7 - HALoGEN: Fantastic LLM Hallucinations and Where to Find Them | Johann | Muhammed |
| Track 2 – 9 - SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection — Manakul et al. | Gulce | Elza |
| Track 7 – 37 - Universal and Transferable Adversarial Attacks on Aligned LLMs — Zou et al. | Siamion | Muhammed |
| Track 8 – 42 - Can We Afford the Perfect Prompt? Balancing Cost and Accuracy with the Economical Prompting Index — McDonald et al. | Aziz | Luca |
| Track 6 – 31 - Mission: Impossible Language Models | Abdu | Luca |
| Track 7 – 40 - Privacy in Large Language Models: Attacks, Defenses and Future Directions | Kevin | Muhammed |
| Track 4 - Physics of Language Models (Storage Capacity) — Allen-Zhu & Li | Kiril | Yujia |
Material
This seminar discusses advanced topics at the interface of LLMs and knowledge-aware AI (KAAI).