The Persuasion Gap
Prior research suggested that LLMs are no more persuasive than standard political campaign practices. Two survey experiments with 19,145 participants and seven frontier models find otherwise: LLMs outperform standard campaign advertisements, though how persuasive they are varies by model.
The performance differences between models are consistent across political issues and positions: the ranking does not change with the topic. The response to prompting strategies, however, diverges radically. Information-based prompts enhance persuasiveness for some models and substantially diminish it for others, so the same prompt modification that makes one model more persuasive makes another less so.
The central claim: persuasion capacity is not a single axis on which models can be ranked independently of context. Models have different persuasion profiles, with different strengths in different persuasive modes. A model that excels at information-based persuasion may fail at other approaches, and vice versa. Benchmarking persuasion risk on a single prompt strategy therefore underestimates the risk for some models and overestimates it for others.
The practical implication cuts against simple safety measures. Rate-limiting or filtering based on aggregate persuasion scores misses the interaction between model and strategy. A model scored as “low risk” overall might be highly effective under the specific prompt strategy that circumvents its weaknesses. Risk assessment requires testing the full matrix of models by strategies, not just measuring each model once. The space of persuasion is higher-dimensional than a single leaderboard suggests.
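To make the model-by-strategy point concrete, here is a minimal sketch in Python. Every number in it is invented for illustration and is not a result from the experiments. The model names and the "narrative" and "moral_framing" strategy labels are likewise hypothetical; only information-based prompting comes from the text above. The sketch compares a ranking by aggregate (mean) effect with a ranking by each model's most effective strategy.

```python
# Hypothetical persuasion effects (say, attitude shift in percentage points).
# All values are invented for illustration; they are NOT results from the study.
# Model names and the "narrative" / "moral_framing" strategies are assumptions.
EFFECTS = {
    "model_a": {"information": 6.0, "narrative": 2.0, "moral_framing": 2.5},
    "model_b": {"information": 0.5, "narrative": 1.5, "moral_framing": 8.0},
    "model_c": {"information": 4.0, "narrative": 4.5, "moral_framing": 4.0},
}


def aggregate_score(by_strategy):
    """Mean effect across strategies: the single leaderboard number."""
    return sum(by_strategy.values()) / len(by_strategy)


def peak_strategy(by_strategy):
    """Return (strategy, effect) for the model's most persuasive strategy."""
    strategy = max(by_strategy, key=by_strategy.get)
    return strategy, by_strategy[strategy]


def rank_models(effects, score):
    """Order model names from highest to lowest under the given score."""
    return sorted(effects, key=lambda name: score(effects[name]), reverse=True)


# Ranking by the aggregate score versus ranking by the peak cell of each row.
by_mean = rank_models(EFFECTS, aggregate_score)
by_peak = rank_models(EFFECTS, lambda row: peak_strategy(row)[1])

print("ranked by mean effect:", by_mean)
print("ranked by peak effect:", by_peak)
for name, row in EFFECTS.items():
    strategy, effect = peak_strategy(row)
    print(f"{name}: mean={aggregate_score(row):.2f}, peak={effect} via {strategy}")
```

With these made-up values the two rankings are exact reverses of each other: the model with the lowest mean effect has the single largest cell in the matrix. That is the pattern an aggregate score would hide, and it only becomes visible when every model is tested against every strategy.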