Endangered languages are often clustered in small geographic areas. These areas are Language Hotspots, places with:
High genetic diversity — Most accounts of language diversity only look at the raw number of languages in an area. Our calculation of genetic diversity also considers how many genetic units are represented. A genetic unit is a grouping like the Romance languages (French, Spanish, Italian, Portuguese, etc., all descended from Latin).
By looking at genetic diversity, we find areas in crucial need of scientific study. When an entire genetic unit dies, we lose much more information than when we lose one branch of a family whose close relatives survive. For example, if all speakers of Portuguese died, we would lose a lot of information, but we could still learn a lot about the language by studying Spanish. If all the Romance languages vanished, though, we would not be able to learn much about them, even by studying more distantly related languages like German.
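The difference between counting languages and counting genetic units can be illustrated with a minimal sketch. The language-to-unit mapping and the two sample areas below are hypothetical, chosen only for illustration, and the function is not the calculation used in our model; it simply shows that two areas with the same number of languages can differ sharply in how many genetic units they contain.

```python
# Hypothetical mapping from language to genetic unit (illustration only).
GENETIC_UNIT = {
    "French": "Romance",
    "Spanish": "Romance",
    "Italian": "Romance",
    "Portuguese": "Romance",
    "German": "Germanic",
    "Basque": "Basque (isolate)",
}


def diversity_counts(languages):
    """Return (number of languages, number of distinct genetic units)."""
    units = {GENETIC_UNIT[name] for name in languages}
    return len(languages), len(units)


# Two made-up areas with four languages each.
area_a = ["French", "Spanish", "Italian", "Portuguese"]  # one genetic unit
area_b = ["French", "Spanish", "German", "Basque"]       # three genetic units

print(diversity_counts(area_a))
print(diversity_counts(area_b))
```

A raw language count treats both areas as equally diverse, while the genetic-unit count separates them.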
High levels of endangerment — Language endangerment cannot be measured precisely. The number of speakers does not necessarily determine how endangered a language is. If those speakers include young children and the language is used in all parts of daily life, then a few speakers can maintain a language. A language with only elderly speakers that has not been passed on to younger generations may be endangered. We use a five-star scale to rank endangerment. The levels in this ranking are:
| Rating | Level | Description |
| --- | --- | --- |
| No stars | Extinct | No speakers |
| 1 star | Moribund | Youngest speakers over age 60 |
| 2 stars | Highly endangered | Youngest speakers over age 40 |
| 3 stars | Endangered | No children speak the language |
| 4 stars | Threatened | Small community undergoing shift. Not currently endangered but a small change in circumstances could lead to endangerment. |
| 5 stars | Thriving | Stable or growing community with speakers of all ages who use the language in all or most spheres of life |
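As a rough sketch, the scale can be written down as a lookup table. The `rate` function below is a hypothetical classifier that represents only one possible reading of the table, not the procedure we actually follow. In particular, it assumes that the 4-star and 5-star levels both imply child speakers and differ only in community stability, and it takes that stability as a yes/no input even though it is a judgment call in practice.

```python
from typing import NamedTuple, Optional


class Level(NamedTuple):
    stars: int
    label: str
    criterion: str


# The six levels, transcribed from the table above (criteria abbreviated).
SCALE = (
    Level(0, "Extinct", "No speakers"),
    Level(1, "Moribund", "Youngest speakers over age 60"),
    Level(2, "Highly endangered", "Youngest speakers over age 40"),
    Level(3, "Endangered", "No children speak the language"),
    Level(4, "Threatened", "Small community undergoing shift"),
    Level(5, "Thriving", "Stable or growing community, speakers of all ages"),
)


def rate(youngest_speaker_age: Optional[int],
         children_speak: bool,
         community_stable: bool) -> Level:
    """Hypothetical classifier; pass None as the age when no speakers remain."""
    if youngest_speaker_age is None:
        return SCALE[0]
    if children_speak:
        # Assumption: 4 vs. 5 stars depends only on community stability.
        return SCALE[5] if community_stable else SCALE[4]
    if youngest_speaker_age > 60:
        return SCALE[1]
    if youngest_speaker_age > 40:
        return SCALE[2]
    return SCALE[3]


print(rate(72, children_speak=False, community_stable=False).label)
```

Note that the number of speakers is deliberately not an input, since speaker counts alone do not determine how endangered a language is.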
Low levels of documentation — We rank how much accessible information exists about a language. Examples of documentation are: writing systems, grammars, dictionaries, texts, and audio and video materials. We only count materials that are accessible, meaning resources that have been published and are in print, and have been translated into a widely-known language. These are ranked on a five-point scale, with a language receiving a point for:
- Texts with translation
- Short scholarly articles
- Descriptive grammar
- Lexicon (word list) or dictionary
- Audio/Video materials with annotation
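The scoring described above amounts to counting how many of the five kinds of accessible material exist for a language. A minimal sketch follows; the category strings and the function name are hypothetical labels introduced for illustration, and the caller is assumed to have already checked that each resource is accessible in the sense defined above.

```python
# The five kinds of documentation, one point each (labels are illustrative).
DOCUMENTATION_TYPES = (
    "texts with translation",
    "short scholarly articles",
    "descriptive grammar",
    "lexicon or dictionary",
    "annotated audio/video",
)


def documentation_score(available):
    """Return 0-5: one point per kind of accessible documentation."""
    found = set(available)
    unknown = found - set(DOCUMENTATION_TYPES)
    if unknown:
        raise ValueError(f"unrecognized documentation types: {sorted(unknown)}")
    return len(found)


# A made-up language with a grammar and a dictionary but nothing else.
print(documentation_score(["descriptive grammar", "lexicon or dictionary"]))
```

Using a set means that several resources of the same kind, such as two dictionaries, still earn a single point.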
Our model ranks hotspots as follows:
Rating by Genetic Index
- Taiwan, Northern Philippines (.700)
- Southern South America (.417)
- Northern & Central Australia (.405)
- Central South America (.398)
- Eastern Siberia (.391)
- Oklahoma-Southwest (.372)
- Caucasus (.250)
- Central Siberia (.24)
- Northern South America (.228)
- Northwest Pacific Plateau (.226)
- Southeast Asia (.138)
- Southern Africa (.092)
- Eastern Africa (.080)
- Meso-America (in progress)
- Western Africa (in progress)
- Western Melanesia (in progress)
- Eastern Melanesia (in progress)
- East India, Malaysia (in progress)
Rating by Threat Level
- Northern & Central Australia
- Central South America
- Eastern Siberia
- Northwest Pacific Plateau
- Southern South America
- Central Siberia
- Oklahoma-Southwest
- Northern South America
- Western Melanesia
- Caucasus
- Taiwan, Northern Philippines
- Southeast Asia
- Eastern Africa
- Meso-America
- Southern Africa
- Western Africa
- Eastern Melanesia
- East India, Malaysia
Threat level key: severe, high, medium, low
How many languages are there in the world?
Our calculations rely on knowing the total number of languages in the world, and how many people speak each language. There is no source that reliably lists every language in the world. We use a number of sources to obtain our figures, which are updated as we find newer or more accurate counts of languages. These sources are:
Our own expeditions — This is the only reliable source of data. Undertaking our own research is the only way to ensure that we have an accurate, up-to-date picture of language communities.
National censuses — These often over- or undercount ethnic groups. They may lump small groups into a generic "other" category if they do not make up a certain percentage of the population. Some censuses count ethnic groups rather than language speakers. This UN chart is a collection of census data: World Population by Ethnic Group.
Ethnologue — The Ethnologue is the most extensive list of the world's languages. It is compiled by the Summer Institute of Linguistics, a missionary group dedicated to translating the Christian Bible into as many languages as possible. The Ethnologue relies on tipsters to send in numbers of speakers, rather than performing active collection. Their numbers are often inflated by counting the number of people who identify with an ethnic group, rather than the number of speakers. If there is language shift among a community, the number of people in the ethnic group will always be larger than the number of speakers.
For more information on counting languages, the Linguistic Society of America has a pamphlet on the topic: How Many Languages Are There in the World?
By our calculations, there are approximately 470 genetic units within hotspots, compared with approximately 500-550 genetic units in the entire world. That means that roughly 85 to 94 percent of the world's genetic units are represented in hotspots, even though hotspots cover only small geographic regions. A list of all genetic units used in our calculations is available for download.