Linguistic compositions highly volatile in Portuguese

Authors

  • Jesús Enrique García, Universidade Estadual de Campinas
  • Ramin Gholizadeh, Universidade Estadual de Campinas
  • Verónica Andrea González-López, Universidade Estadual de Campinas

DOI:

https://doi.org/10.20396/cel.v59i3.8651002

Keywords:

Bayesian information criterion. Partition Markov models. Proximity between N-grams.

Abstract

In this paper we use a distance d between sequences of N-grams to identify the N-grams that behave differently when two such sequences are compared. With this tool, we inspect written texts of European Portuguese dated between the 16th and the 19th centuries. We identify the most volatile N-grams throughout the period, and we also identify N-grams that should be considered when studying the linguistic changes from Classical Portuguese to Modern Portuguese. We find that 2-grams composed of an unstressed monosyllable followed by a paroxytone word (and vice versa) change markedly from one text to the next during the whole period. Stressed monosyllabic words (SMW) reveal discrepancies between written texts of the 16th century and texts from the beginning of the 17th century. Among the 2-grams involved are (i) a SMW followed by a paroxytone or oxytone word and (ii) a paroxytone disyllabic word or an oxytone word followed by a SMW.
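The idea of ranking N-grams by how differently they behave in two texts can be illustrated with a minimal sketch. The abstract does not define the distance d, so the code below swaps in a much simpler stand-in: the absolute difference between the relative frequencies of each 2-gram in two symbol sequences. The one-letter coding of word types ("u", "s", "p", "o"), the function names, and the two toy sequences are assumptions made for this illustration only; they are not the authors' method or data.

```python
from collections import Counter


def ngram_freqs(symbols, n=2):
    """Relative frequency of each n-gram in a sequence of symbols."""
    grams = [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]
    total = len(grams)
    counts = Counter(grams)
    return {gram: count / total for gram, count in counts.items()}


def rank_by_discrepancy(seq_a, seq_b, n=2):
    """Rank n-grams by |freq in seq_a - freq in seq_b|, largest first.

    This is a stand-in score, NOT the distance d of the paper.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    freq_a = ngram_freqs(seq_a, n)
    freq_b = ngram_freqs(seq_b, n)
    scores = {
        gram: abs(freq_a.get(gram, 0.0) - freq_b.get(gram, 0.0))
        for gram in set(freq_a) | set(freq_b)
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical coding of word types (an assumption for this sketch):
# "u" unstressed monosyllable, "s" stressed monosyllable,
# "p" paroxytone word, "o" oxytone word.
text_a = list("upupspup")  # toy sequence, not real corpus data
text_b = list("upsosoup")  # toy sequence, not real corpus data

ranking = rank_by_discrepancy(text_a, text_b)
for gram, score in ranking:
    print("".join(gram), round(score, 3))
```

The 2-grams at the top of the ranking are the ones whose usage differs most between the two toy sequences. A model-based distance, such as one built on the Bayesian information criterion named in the keywords, would replace the frequency difference with a statistically grounded score.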

Author Biographies

Jesús Enrique García, Universidade Estadual de Campinas

Prof. Dr., Department of Statistics, Universidade Estadual de Campinas - IMECC

Ramin Gholizadeh, Universidade Estadual de Campinas

Prof. Dr., Department of Statistics, Universidade Estadual de Campinas - IMECC

Verónica Andrea González-López, Universidade Estadual de Campinas

Prof. Dr., Department of Statistics, Universidade Estadual de Campinas - IMECC

References

S. Frota, C. Galves, M. Vigario, V. A. González-López and B. Abaurre, The phonology of rhythm from Classical to Modern Portuguese, Journal of Historical Linguistics (2012) 2.2 173-207.

C. Galves and P. Faria, Tycho Brahe Parsed Corpus of Historical Portuguese. http://www.tycho.iel.unicamp.br/tycho/corpus/en/index.html (2010).

A. Galves, C. Galves, J. García, N. L. Garcia and F. Leonardi, Context tree selection and linguistic rhythm retrieval from written texts, The Annals of Applied Statistics (2012) 6(1) 186-209.

Jesus E. García and V. A. González-López, Detecting regime changes in Markov models, New Trends in Stochastic Modeling and Data Analysis (2015) (in chapter 2, p. 103).

Jesus E. García and V. A. González-López, Optimal Partition of Markov Models and Automatic Classification of Languages, Stochastic and Data Analysis Methods and Applications in Statistics and Demography (2016) (in chapter 5, p. 207).

Jesus E. García and V. A. González-López, Consistent Estimation of Partition Markov Models, Entropy (2017) 19 160.

Jesus E. García, V. A. González-López and F. H. Kubo de Andrade, Dissimilarity between Markovian Processes Applied to Industrial Processes, AIP Conference Proceedings (2017) 1863 220002.

C.D. Manning and H. Schütze, Foundations of statistical natural language processing, Vol. 999. Cambridge: MIT press, (1999).

J. Mehler and M. Nespor, Linguistic rhythm and the acquisition of language, Vol. 3, pp. 213-222. Oxford: Oxford University Press, (2004).

G. Schwarz, Estimating the dimension of a model, The Annals of Statistics (1978) 6(2) 461-464.

Jesus E. García: Department of Statistics, University of Campinas, Campinas, SP, CEP 13083-859, Brazil - E-mail address: jg@ime.unicamp.br

R. Gholizadeh: University of Campinas, Campinas, SP, CEP: 13083-859, Brazil - E-mail address: 1ramin.gholizadh@gmail.com

V. A. González-López: Department of Statistics, University of Campinas, Campinas, SP, CEP: 13083-859, Brazil - E-mail address: veronica@ime.unicamp.br

Published

2017-12-04

How to Cite

GARCÍA, J. E.; GHOLIZADEH, R.; GONZÁLEZ-LÓPEZ, V. A. Linguistic compositions highly volatile in Portuguese. Cadernos de Estudos Linguísticos, Campinas, SP, v. 59, n. 3, p. 617–630, 2017. DOI: 10.20396/cel.v59i3.8651002. Available at: https://periodicos.sbu.unicamp.br/ojs/index.php/cel/article/view/8651002. Accessed: 3 Jul. 2022.