As a high school exchange student in Brazil many years ago, I fell in love with the country and its people. So when reports emerged in 2014 of babies born with microcephaly (abnormally small heads causing irreversible damage) in one Brazilian region, defects attributed to the Zika virus, I paid close attention. But the story didn’t add up. Why would Zika, endemic across South America, cause birth defects in just one area? The question stuck with me, and a few weeks ago I turned to the large language model (LLM) Grok to investigate.
I chose Grok because it has fewer guardrails than other LLMs. As I expected, it initially echoed the official narrative, shaped by publicly available materials and language frequency. But after a couple of hours of asking very specific questions and drilling down on inconsistencies, we uncovered a confluence of events that sketched a potential explanation that did make sense: preparations for the Rio Olympics and worry about public perceptions of Brazil, a president facing impeachment, a larvicide never tested on humans introduced into water supplies, local Brazilian news reports of untrained workers overdosing tanks, residents’ concerns about the water’s appearance, a damning absence of the required water testing, reports of pressure on health officials to avoid contrary investigations, and a dismissed rat study linking the larvicide to microcephaly-like defects. Wow.
I’m not prepared to say that I really know what caused those birth defects, but I think I have a pretty likely hypothesis. (Not having the legal budget of a large investigative newspaper, I’m not going to take the story any further, but my view of the world and how it works has been enlightened.)
This investigation inspired me to create prompt guidelines for using LLMs to counter the “Overton window” effect of dominant narratives, to spot misinformation, and to recognize the cognitive biases exploited in propaganda. More on that to come.
In this short post, however, I want to focus on what I learned about AI’s struggles with extrapolation, which is one of several reasoning tasks LLMs are not built for, alongside causal, abductive, analogical, counterfactual, and critical reasoning.
Historical and investigative research often involves piecing together incomplete or contradictory data to hypothesize motives or connect dots. This requires extrapolation. LLMs can summarize known details and identify patterns, but they falter at reasoning beyond their training and at discerning causality. Their language fluency can mislead users (maybe especially students) into mistaking polished answers for insight, potentially reinforcing manipulated narratives instead of uncovering truths. History shows that official stories frequently diverge from likely events, a nuance that LLMs struggle to capture.
Recognizing this limitation actually offers an opportunity. Educators can design questions and exercises that highlight AI’s reasoning weaknesses, thereby fostering the human reasoning skills that lie at the heart of a good education: extrapolation, critical thinking, and synthesis. By understanding what AI cannot do, we can better appreciate what makes human inquiry unique.