By Giles Strachan and Ilan Manor
In 1952, General Sir Gerard Templer arrived in Malaya. On a visit to a village suspected of supporting communists, he delivered a tough speech. He began ‘You are all bastards’, which his translator converted to ‘His Excellency says that none of your parents were married.’ He continued ‘I can be a bastard too.’ This was dutifully translated as ‘His Excellency says that his parents were also unmarried.’
The problem Sir Gerard faced was that his speech relied on layers of meaning. Had his audience had access to past speeches by Sir Gerard and other British leaders, perhaps they could have understood him.
ChatGPT and large language models (LLMs) have arrived with fanfare similar to Sir Gerard's, and face similar risks of being lost in translation. In diplomacy, much has been made of the ability of LLMs to draft speeches, prepare position papers and otherwise automate the humdrum day-to-day business of countries talking. LLMs, however, face two challenges: their inability to understand diplomacy and their inherent bias.
While diplomacy has become increasingly public, with diplomats live-tweeting images from summits and posting policy papers, the most important part of diplomacy remains discreet. Conversations on the sidelines of coronations, encounters in the corridors at the UN and exchanges at summits still produce results. Public posturing may require that, for example, Russian diplomats deny the presence of Russian troops in Crimea, but in the negotiating room they are far more forthcoming. After all, troops can only be withdrawn if all parties agree that they are really there. Like an iceberg, much of diplomacy is unseen.
A field with a significant hidden portion is a great challenge to LLMs that are, essentially, sophisticated predictive models that can formulate responses based on how ‘likely’ a given combination of words is to occur. LLMs like ChatGPT perform this analysis after training on the massive corpus of English-language documents available online. Let ChatGPT rip on Project Gutenberg and it will come back able to produce a reasonable simulacrum of Shakespeare. This, however, is where the problems emerge.
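The kind of prediction at the heart of an LLM can be illustrated with a toy model: a bigram counter that guesses the next word purely from how often word pairs appeared in its training text. This is a deliberately minimal sketch with an invented mini-corpus; real LLMs use neural networks over subword tokens rather than word counts, but the principle of "most likely continuation" is the same.

```python
from collections import Counter, defaultdict

# A tiny, made-up training corpus in the style of diplomatic statements.
corpus = (
    "we are deeply concerned about the situation "
    "we are very concerned about the talks "
    "we are deeply concerned about the border"
).split()

# Count, for each word, which words follow it and how often.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(word):
    """Return the most frequent word seen after `word` in the corpus."""
    followers = bigrams.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(predict("deeply"))  # -> concerned
print(predict("are"))     # -> deeply ("deeply" follows "are" twice, "very" once)
```

The model can only echo patterns present in its training data: a word it has never seen, or a meaning hidden behind closed doors, is simply invisible to it.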
First, diplomacy is a secretive art. We know that the public statements of ministries of foreign affairs (MFAs) differ deeply from their own internal analyses. After Wikileaks, for example, the State Department launched a "charm offensive" to smooth the feathers ruffled by the frank – even brutal – assessments of US diplomats. An LLM in 2010 would have described America's deep esteem for the rulers of Saudi Arabia, rather than the blunter assessments found in Wikileaks. The delicate language of diplomacy would also require an LLM to understand that when Israel is 'very concerned' about activity in the Gaza Strip, it is business as usual, but when it is 'deeply concerned', the airstrikes are en route. An LLM trained on public data will never be able to offer significant insight into the words of diplomats.
Secondly, LLMs encode biases. The corpus of data used for training ChatGPT is composed of online data. The internet, however, is biased towards developed economies – the English-language Wikipedia is by some distance the largest, while Swedish (10m speakers) has twice as many articles as Mandarin (1.35bn speakers). This means that ChatGPT is training on data which over-represents the USA and the UK. Regardless of the admirably liberal bent of these two nations, ChatGPT will see far more combinations of words praising these countries and representing their views than those opposing them. For transnational languages, this also means a bias towards the linguistic centres of gravity. A Portuguese diplomat will find that public LLMs represent the views of Brazil far better than those of their own nation; a Timorese diplomat will fare worse still. In a field reliant on subtlety, and where national identity is so significant, this poses a considerable barrier to using public LLMs.
To overcome these limitations, we propose that MFAs construct private LLMs. The public training of LLMs could be supplemented by access to decades of diplomatic cables and secret analyses to provide insight into negotiation tactics, decision-making hierarchies and crisis response. For example, in trade negotiations a private LLM could trace the relationship between public statements and movements in the negotiating room, revealing what kinds of public language hint at areas for progress or dangerous red lines. Covert data on, say, the Iranian nuclear programme could be matched against analyses of Iranian state media to identify when the government is bluffing and when it is close to a breakthrough. The first MFA to develop its own in-house LLM will no doubt have a significant advantage in its ability to analyse its counterparties and deploy AI to improve all three key functions of diplomacy: representation, negotiation and crisis management.
Had Nikita Khrushchev consulted an LLM before his trip to the UN, he would have anticipated a Philippine attack on the Soviet Union, automatically drafted a witty response and left his shoes at home.
Giles Strachan is a former civil servant and analyst.