How the world’s largest language family spread — and why others go extinct

Three books that take on the history of languages have something for everyone
Proto: How One Ancient Language Went Global. Laura Spinney. William Collins (2025)
The Indo-Europeans Rediscovered: How a Scientific Revolution is Rewriting their Story. J. P. Mallory. Thames & Hudson (2025)
Rare Tongues: The Secret Stories of Hidden Languages. Lorna Gibb. Atlantic (2025)
A key human characteristic is our ability to communicate through complex languages — about 7,000 of which are spoken around the world today. Understanding the origin and development of past and present languages can help researchers to understand human evolution.
Although today’s languages group into about 140 families, only 5 of these families are widely used: Indo-European, Sino-Tibetan, Niger–Congo, Afro-Asiatic and Austronesian. Indo-European languages form the largest family, if those who speak them as a second language are included — with 12 main branches ranging historically from northwestern China to western Europe. “Almost every second person on Earth speaks Indo-European”, notes science writer Laura Spinney in Proto, one of a trio of intriguing books exploring the history of languages, common and rare.
Both Spinney’s lively book for non-specialists, and The Indo-Europeans Rediscovered — an academic study with broad appeal by archaeologist James Mallory — focus on the origins of this vast language family. By contrast, extinct and endangered languages are the preoccupation of Rare Tongues, a quirky study by linguist Lorna Gibb, aimed at all audiences.
The origin of the Indo-European language family has been the “Holy Grail for many intellectuals and many not-so intellectuals” over the past few centuries, writes Spinney. “The arguments have run from ingenious to ingenuous to outright weird,” comments Mallory, with one even proposing a source “outside our galaxy”. His book’s eye-catching appendix lists 176 individuals who, between 1686 and 2024, each proposed birthplaces, or homelands, for Indo-European — “as far north as the polar regions and as far south as Antarctica, from the Atlantic to the Pacific”.
William ‘Oriental’ Jones was one such thinker who was, and still is, widely cited, not least by Spinney and Mallory. A British philologist and pioneering Indologist who worked as a judge in colonial India, Jones drew attention in 1786 to the tantalizing resemblance between the ancient languages Sanskrit, Latin and Greek. For example, the Sanskrit word for ‘mother’ is mata and the Latin is mater; the verb ‘to fly’ is pátami in Sanskrit, pétomai in Greek and petō in Latin. Jones found these similarities so strong that he wrote: “No philologer could examine them all three without believing them to have sprung from some common source, which, perhaps, no longer exists.” Here was the “semi-official discovery of the Indo-European language family”, observes Mallory.
Tracing Indo-European’s origins
Jones speculated that the birthplace of Proto-Indo-European was probably in what is now Iran, with speakers migrating east towards India and west towards Europe. However, he did not coin the term Indo-European. That name was put forward in 1813 by physicist Thomas Young, a polymath now known for his contribution to deciphering ancient Egyptian scripts, including the two on the Rosetta Stone. While reviewing a compendium of the world’s languages, Young postulated that the Indo-European homeland lay in Central Asia — specifically, in Kashmir, in the northwestern part of the Indian subcontinent.
Today, however, few scholars support this ‘out-of-India’ theory. The chief evidence cited by those who still do, as Spinney and Mallory describe, is the existence of the mysterious Indus Valley civilization, whose archaeological sites were found in the 1920s in what was then northwestern India (now Pakistan) and later dated to as early as 3300 bc. Its people used an exquisite script, which adorns the frontispiece of Mallory’s book. However, the Indus script remains undeciphered and offers little convincing evidence that its authors spoke either Sanskrit (as some linguists sympathetic to Hindu nationalism think) or another Indo-European language.

The Indus script, from around 2600 bc, remains undeciphered despite a century of effort. Credit: Alamy
Between around 1870 and 1945, Indo-European homeland theories shifted towards various parts of Europe, as is explored at length by Mallory. Scandinavia was favoured by some on the basis of racial arguments, and was subsequently advocated for in the form of Aryanism by Nazi Germany’s regime. Peculiarities of the Lithuanian language were taken to point to an origin in the region east of the Baltic Sea, and the Linear Ware pottery culture (5500–4500 bc) to one around the Danube River.
Genetics provides clues
The Pontic–Caspian Steppe theory, proposing a homeland spanning areas north of the Black Sea and the Caspian Sea, became prominent after about 1960. Spinney accepts it, and so does Mallory, if reluctantly, having spent his PhD pursuing it. “After a half-century of study I am pretty much where I started,” Mallory admits. The theory has its weaknesses, but, of all the potential homelands, it seems the “least bad”, he adds.
What convinced Spinney, Mallory and others was genetics. In 2015, Nature published papers by two sets of authors who used distinct methods to analyse ancient-human DNA (W. Haak et al. Nature 522, 207–211 (2015); M. E. Allentoft et al. Nature 522, 167–172 (2015)). The samples came from excavated graves of people of the Yamnaya culture, who lived in the Pontic–Caspian Steppe between 8,000 and 3,000 years ago.
The papers concluded that hunter-gatherers, farmers and nomads migrated east towards Asia and west towards Europe around 5,000 years ago and that, in Europe, their genetic ancestry replaced 90% or more of the existing gene pool. “Most European men alive today, and millions of their counterparts in Central and South Asia, carry Y chromosomes that came from the steppe,” writes Spinney. No other mass migrations, including the displacements caused by the fall of the Roman Empire, the Black Death, the 1918 influenza pandemic or the twentieth-century world wars, had similar “genetic, cultural or linguistic legacies”.
Such genetic support explains, in Mallory’s words, “the astonishing fact that people in Iceland, Ireland, England, Spain, Norway, Germany, Lithuania, Italy, Greece, Ukraine, Iran and India converse in languages that, if we rolled them back over about five thousand years, would merge into a common language”. The findings also suggest that language played a more influential part in the evolution of human societies than did nationalism, empires and wars. But we can never know for certain — because the Yamnaya’s language has disappeared forever.
Why some languages vanish
The reasons why some languages become extinct, and others thrive, are the focus of Rare Tongues. Gibb has lived in six European countries and draws examples from all over the world, across many periods of history.
Nature 641, 31-33 (2025)
doi: https://doi.org/10.1038/d41586-025-01296-5
This story originally appeared in Nature. Author: Andrew Robinson