Large language models (LLMs) like ChatGPT show reasoning errors across many domains. Identifying vulnerabilities is good for public safety, industry, and the scientists making these models. The human ...
Abstract: We present a multi-way parallel corpus of Math Word Problems (MWPs) in nine languages, including six low-resource languages. To date, this is the largest multilingual MWP dataset available.
Do you stare at a math word problem and feel completely stuck? You're not alone. These problems mix reading comprehension ...
The president doesn’t use numbers and statistics like an adult; he uses numbers and statistics that he thinks sound good and make him feel better.
These low-floor, high-ceiling problems support differentiation, challenging all students by encouraging flexible thinking and allowing for multiple solution paths.
Abstract: Vector addition systems with states (VASS), also known as Petri nets, are a popular model of concurrent systems. Many problems from many areas reduce to the reachability problem for VASS, ...