Misinformation as a harm: structured approaches for fact-checking prioritization

Multiple AI models help robots execute complex plans more transparently

Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models

A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing

Fast Detection of Phase Transitions with Multi-Task Learning-by-Confusion

AI can now attend a meeting and write code for you – here’s why you should be cautious

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

MIT engineers are on a failure-finding mission

On the notion of Hallucinations from the lens of Bias and Validity in Synthetic CXR Images