Family trees: languages and genetics

Petroni, F.;
2009-01-01

Abstract

We consider a large population that evolves according to neutral haploid reproduction. The genealogical tree is very complex, and genealogical distances are distributed according to a probability density that remains random in the limit of a large population. This density, which varies between populations and varies for the same population at different times, has a distribution that we determine. The evolution of languages closely resembles the evolution of haploid organisms or mtDNA, and this similarity allows for the construction of language trees. The key point is the definition of a distance between pairs of languages. Here we use a renormalized Levenshtein distance between words with the same meaning, averaged over all the words contained in a list. Assuming a constant rate of mutation, these lexical distances are, on average, logarithmically proportional to genealogical distances. The relation between lexical and genealogical distances is then investigated further in order to take into account the intrinsic randomness associated with lexical evolution. We test our method by constructing the trees of the Indo-European and Austronesian groups.
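The lexical distance described in the abstract can be illustrated with a minimal sketch. It assumes that "renormalized" means dividing the Levenshtein distance by the length of the longer word, which the abstract does not spell out. The function names and the two-word lists are hypothetical and serve only as a toy illustration, not as the authors' implementation or data.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Levenshtein distance divided by the longer word's length.
    ASSUMPTION: this is one plausible reading of 'renormalized'."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def lexical_distance(words_a: list[str], words_b: list[str]) -> float:
    """Average normalized distance over two aligned word lists,
    where position k in both lists carries the same meaning."""
    if len(words_a) != len(words_b) or not words_a:
        raise ValueError("word lists must be non-empty and of equal length")
    pairs = zip(words_a, words_b)
    return sum(normalized_distance(x, y) for x, y in pairs) / len(words_a)


# Hypothetical toy lists: same meanings, same order.
english = ["water", "night"]
german = ["wasser", "nacht"]
distance = lexical_distance(english, german)
```

Dividing by the longer word's length keeps every pairwise value between 0 and 1, so long and short words contribute comparably to the average over the list.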
Random processes; fluctuation phenomena; dynamics of social systems; dynamics of evolution; networks and genealogical trees
Use this identifier to cite or link to this document: https://hdl.handle.net/11584/14719