Can language models develop a creole?
31 Oct 2025
When two populations that speak mutually unintelligible languages come into sustained contact (often through trade, migration, or colonisation), pressures for communication can give rise to a pidgin language. A pidgin is a contact language that develops with a simplified grammar and a lexicon drawn largely from the languages in contact, typically from the socially dominant one. Crucially, a pidgin has no native speakers: it functions as an auxiliary language, used by members of each group as a second language to facilitate mutual understanding.1
Eventually, a pidgin may develop into a fully fledged language with a full vocabulary and grammar, and become the first language of a community. When this happens, a creole language is formed.2 Examples include Haitian Creole (French-based), Jamaican Patois (English-based), Krio (English-based), and Nubi (Arabic-based).3 Creoles can emerge with remarkable speed, as intense contact situations create strong communicative pressures that favour rapid grammatical expansion and stabilisation.4
Creoles emerge when speakers of different languages, faced with the necessity of communication, creatively assemble a new linguistic system. Language models, too, are adaptive systems that generate and internalise patterns from linguistic input.5 One may then ask: can two language models, trained on different languages, develop a creole language when they come into contact?
When two language models trained on different languages are made to interact by completing each other's texts, they simulate the contact between communities of speakers of mutually unintelligible languages. If we then allow the models to learn from these exchanges over successive generations, we may observe processes analogous to pidginisation and creolisation: lexical borrowing, syntactic simplification, and eventual convergence towards a shared, rule-governed code. Such an experiment offers a way to explore whether the pressures that drive linguistic unification in human communities can yield similar structural outcomes in artificial agents.
To explore this, I propose the following experiment.
The basic experiment
Take two comparable corpora, one in LanguageA and the other in LanguageB. Train two language models of identical architecture and size, ModelA on LanguageA and ModelB on LanguageB. Each model internalises the grammar of its own language but is ignorant of the other's. The two models are then brought into contact through a process of inter-generational text exchange.
In each generation, the models engage in a fixed number of episodes of interaction. In half of these, ModelA generates an initial text, which ModelB then completes. In the other half, the roles are reversed: ModelB begins, and ModelA continues. The resulting set of mixed-language texts forms a small corpus of interactions between LanguageA and LanguageB: the linguistic output of that generation.
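As a minimal sketch of one generation's episodes, assume both models are Hugging Face causal language models sharing a tokeniser (a simplification; the tokeniser choice is revisited later), and that `seeds` is a list of short seed prompts, e.g. sampled from the two corpora. All names and sampling parameters here are illustrative, not prescriptive:

```python
def run_episode(starter, completer, tokenizer, seed, device="cpu"):
    """One contact episode: `starter` opens a text, `completer` continues it."""
    ids = tokenizer(seed, return_tensors="pt").input_ids.to(device)
    # The starter produces an opening passage from the seed prompt.
    opening = starter.generate(ids, max_new_tokens=40, do_sample=True, top_p=0.9)
    # The completer continues the opening, yielding a mixed-language text.
    full = completer.generate(opening, max_new_tokens=40, do_sample=True, top_p=0.9)
    return tokenizer.decode(full[0], skip_special_tokens=True)

def interaction_corpus(model_a, model_b, tokenizer, seeds, device="cpu"):
    """Half the episodes are opened by ModelA, half by ModelB."""
    texts = []
    for i, seed in enumerate(seeds):
        starter, completer = (model_a, model_b) if i % 2 == 0 else (model_b, model_a)
        texts.append(run_episode(starter, completer, tokenizer, seed, device))
    return texts
```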
Once the mixed-language corpus is obtained, we fine-tune each model on it, simulating the process by which speakers of each language first become aware of the other language and begin to learn from one another's utterances. When fine-tuning is complete, we have the next generation of models.
We repeat the process over successive generations: each generation produces a mixed-language corpus of texts begun by one model and completed by the other, and both models are then fine-tuned on the resulting shared corpus.
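The outer loop can then be sketched as follows. The fine-tuning step is deliberately bare (single-text updates, no batching, padding, or learning-rate scheduling), since the point is the cycle, not the optimisation details:

```python
from torch.optim import AdamW

def finetune(model, texts, tokenizer, lr=5e-5, device="cpu"):
    """Adapt a model to the mixed corpus by next-token prediction."""
    model.train()
    opt = AdamW(model.parameters(), lr=lr)
    for text in texts:
        ids = tokenizer(text, return_tensors="pt", truncation=True).input_ids.to(device)
        loss = model(ids, labels=ids).loss  # causal LM cross-entropy
        loss.backward()
        opt.step()
        opt.zero_grad()

def run_experiment(model_a, model_b, tokenizer, seeds, n_generations=10):
    corpora = []
    for generation in range(n_generations):
        # Contact: produce this generation's mixed-language corpus.
        corpus = interaction_corpus(model_a, model_b, tokenizer, seeds)
        corpora.append(corpus)
        # Adaptation: both models learn from the shared corpus.
        # (Variant, discussed below: reload the generation-0 weights first,
        # so only the data, not the adapted weights, crosses generations.)
        finetune(model_a, corpus, tokenizer)
        finetune(model_b, corpus, tokenizer)
    return corpora
```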
Over time, each model learns from the other's language, and we can test whether the two languages converge on a stable, mutually intelligible code: a kind of artificial creole.
Evaluating the new contact language
After a sufficient number of generations, or throughout the evolution of the models, we might evaluate whether their outputs exhibit signs of convergence such as increased mutual intelligibility, lexical borrowing, grammatical regularisation, or the emergence of consistent structural patterns distinct from either original language.
We can do this through two complementary lenses: computational convergence metrics and linguistic–typological diagnostics.
Computational convergence metrics
- Cross-perplexity: Does ModelA assign low perplexity to ModelB's outputs, and vice versa? Does cross-perplexity improve over generations?
- Representation alignment: Do the models map the same meanings to similar embeddings? One measure is the cosine similarity between sentence embeddings for paired meanings across the two models.
- Lexicon overlap: Do the models converge on a shared vocabulary? (A minimal sketch of these three metrics follows this list.)
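A minimal sketch of the three metrics above, under the same assumptions as before. Mean-pooled last-layer hidden states stand in for sentence embeddings and whitespace tokens for the surface lexicon, both crude simplifications; note also that comparing embeddings across two different models strictly calls for an alignment step (e.g. Procrustes), omitted here:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def cross_perplexity(model, texts, tokenizer, device="cpu"):
    """Perplexity `model` assigns to texts generated by the other model."""
    model.eval()
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        ids = tokenizer(text, return_tensors="pt", truncation=True).input_ids.to(device)
        n = ids.shape[1] - 1  # tokens actually predicted after the label shift
        total_nll += model(ids, labels=ids).loss.item() * n
        total_tokens += n
    return math.exp(total_nll / total_tokens)

@torch.no_grad()
def embedding_similarity(model_a, model_b, text, tokenizer, device="cpu"):
    """Cosine similarity between the two models' representations of one text."""
    def embed(model):
        ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
        hidden = model(ids, output_hidden_states=True).hidden_states[-1]
        return hidden.mean(dim=1)  # mean-pool over token positions
    return F.cosine_similarity(embed(model_a), embed(model_b)).item()

def lexicon_overlap(texts_a, texts_b):
    """Jaccard overlap between the models' surface vocabularies."""
    vocab_a = {w for t in texts_a for w in t.split()}
    vocab_b = {w for t in texts_b for w in t.split()}
    return len(vocab_a & vocab_b) / len(vocab_a | vocab_b)
```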
Linguistic–typological diagnostics
- Word-order stabilisation: Do both models settle on a dominant word order (e.g. SVO or SOV)?
- Tense/mood/aspect marking: Does a common, fixed ordering of tense/mood/aspect emerge?
- Regularisation: Since creoles typically exhibit fewer irregular forms,6 test whether the emergent code likewise regularises: track drops in irregular/exceptional types and in paradigm entropy across generations (a paradigm-entropy sketch follows this list).
- Comparatives: Creoles typically prefer periphrastic comparative marking with invariant degree particles (e.g. “more X”) over synthetic adjective inflection (“X-er”);7 track rising use and positional stability of these markers across generations.
- etc.
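Of these diagnostics, the paradigm-entropy component of the regularisation metric is the easiest to make concrete. A minimal sketch, assuming we can already count the competing surface forms for each paradigm cell; the lemmas, forms, and counts below are hypothetical:

```python
import math
from collections import Counter

def paradigm_entropy(form_counts):
    """Shannon entropy (bits) over competing surface forms for one paradigm cell.
    Entropy near 0 means one form dominates, i.e. the paradigm has regularised."""
    total = sum(form_counts.values())
    return -sum((c / total) * math.log2(c / total) for c in form_counts.values())

# Hypothetical counts of past-tense forms observed in one generation's corpus:
past_tense = {"go": Counter(went=17, goed=3), "walk": Counter(walked=20)}
mean_entropy = sum(paradigm_entropy(c) for c in past_tense.values()) / len(past_tense)
```

A falling mean entropy across generations would indicate that competing variants are being eliminated in favour of a single regular form.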
Further improvements to the experiment
There are a few improvements or tweaks we can make to the basic experiment:
- In the basic experiment, two things pass to the next generation: the mixed texts and the adapted weights. Instead, after each generation we can re-initialise ModelA and ModelB from the original pre-trained versions (which constituted generation 0), and fine-tune them only on the previous generation's generated corpus. This way, the only thing that passes from one generation to the next is the data, not the adapted weights.
- One reason why two populations might develop a common creole language is that they have a functional pressure to succeed in communicating with each other. In our basic experiment, this pressure is implemented via gradient descent for next-token prediction. Instead, we might wrap each generation in a cooperative game so that ModelA and ModelB must communicate with each other to achieve goals, rather than simply completing strings.8
- Rather than having one-to-one interactions, we might have two populations of agents, one trained on LanguageA and the other on LanguageB. Larger, mixed populations might reduce idiosyncratic drifts and better approximate communal norms.
- It might be interesting to play with the tokeniser to see how different forms of segmentation affect linguistic convergence. A shared tokeniser could make it easier for the two models to borrow words and share subunits, encouraging rapid lexical blending, while separate tokenisers might preserve linguistic boundaries for longer, delaying or reducing convergence. We could also experiment with finer-grained tokenisation, such as character-level units, to test whether the models can develop shared structures even when no common word forms exist. Comparing these settings, as sketched below, would help reveal how much of any observed "creolisation" depends on surface overlap versus deeper grammatical adaptation.
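A sketch of the shared-versus-separate tokeniser setting, using the Hugging Face tokenizers library; the corpora and vocabulary size below are placeholders:

```python
from tokenizers import ByteLevelBPETokenizer

def train_bpe(texts, vocab_size=8000):
    """Train a byte-level BPE tokeniser on an iterable of raw texts."""
    tok = ByteLevelBPETokenizer()
    tok.train_from_iterator(texts, vocab_size=vocab_size)
    return tok

corpus_a = ["placeholder sentence in LanguageA"]  # stand-ins for the real corpora
corpus_b = ["placeholder sentence in LanguageB"]

# Shared setting: one tokeniser over both corpora, so subword units overlap freely.
shared_tok = train_bpe(corpus_a + corpus_b)
# Separate setting: each model keeps its own segmentation for longer.
tok_a, tok_b = train_bpe(corpus_a), train_bpe(corpus_b)
```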
Possible outcomes
I anticipate a few potential outcomes of this experiment.
The simplest possibility is collapse: the models fail to develop a consistent shared code, exchanges degenerate into incoherent sequences, and neither model can make itself understood to the other. This is akin to two communities attempting to communicate but failing to understand each other.
Another possibility is that one of the languages may come to dominate, with the other gradually adapting to it. This would be reflected in the evaluation metrics proposed above: convergence would be asymmetric and cross-perplexity, lexical borrowing, and other metrics would be skewed towards one of the languages.
Alternatively, the models might produce mixed-language sequences, alternating or blending lexical and grammatical material from both sources without regularisation. This resembles code-switching or the formation of a pidgin that remains a flexible contact variety rather than a fully stabilised system.
However, it is also possible that under suitable conditions the interactions between the models could yield a new, stable linguistic system distinct from either parent language but drawing from both. The resulting code might exhibit typical features of creoles: reduced irregular morphology, analytic tense–mood–aspect marking, fixed word order, and lexical borrowing from both sources. This would constitute the strongest parallel to natural-language creolisation.
I am not suggesting that this process faithfully reproduces how creoles emerge in human societies, nor that it captures the full sociolinguistic and cognitive realities of creolisation. Real-world creoles arise through complex sociohistorical processes of migration, power, and identity, which the proposed experiment does not capture. Nevertheless, the experiment remains linguistically interesting because it isolates, in a controlled and observable way, the structural dynamics that accompany language contact: borrowing, simplification, convergence, and the creation of new grammatical regularities. By observing how such processes unfold in artificial agents exposed to comparable pressures, we can gain insight into the general principles that govern how linguistic systems adapt and reorganise under contact, and thus illuminate, in abstract form, the mechanisms that make human language so remarkably self-organising.
Notes
1 J. Holm, An introduction to pidgins and creoles (Cambridge University Press, 2000).
2 Ibid.
3 S.M. Michaelis et al., The Atlas of Pidgin and Creole Language Structures (Oxford University Press, 2013).
4 S.G. Thomason and T. Kaufman, Language Contact, Creolization, and Genetic Linguistics (University of California Press, 1992).
5 See for example S. Gururangan et al., “Don't Stop Pretraining: Adapt Language Models to Domains and Tasks” (arXiv:2004.10964, 2020).
6 J. Holm, An introduction to pidgins and creoles (Cambridge University Press, 2000).
7 Ibid.
8 D. Lewis, Convention: A Philosophical Study (John Wiley & Sons, 2002).