Genetic and machine learning algorithms in optimization in classification problems
Keywords:
Elitist Genetic Algorithm, K-Nearest Neighbors (KNN), Random Forest
Abstract
Many optimization algorithms and many classification algorithms exist. In this work, the Elitist Genetic Algorithm represents the optimization algorithms, while K-Nearest Neighbors (KNN), Decision Tree, and Random Forest represent the classification algorithms. The objective is to demonstrate, through an application, how these two classes of algorithms can be used together not only to maximize the number of correct classifications but also to reduce the dimension of the problem. The scenario used is the classification of Brazilian credit cooperatives using the text of their bylaws. The word bank initially consisted of 8,293 words and was reduced to 1,037 words over the course of the process. The classification accuracy was higher than 81% using KNN and 82% using Random Forest with 1,936 words.
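The combination described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. Each individual is a binary mask over the features (words), its fitness is the KNN accuracy on a held-out set, and the best individual is copied unchanged into the next generation (elitism). Everything specific here is an assumption made for illustration: the synthetic data, the tournament selection, the one-point crossover, the mutation rate, the small penalty on the number of selected features, and all function names.

```python
import random

def make_data(n, n_features=10, rng=None):
    """Synthetic stand-in for a word-count matrix: only features 0 and 1 carry signal."""
    rng = rng or random.Random(1)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(0, 1) for _ in range(n_features)]
        x[0] += 3 * y
        x[1] -= 3 * y
        rows.append((x, y))
    return rows

def knn_accuracy(train, test, mask, k=3):
    """Accuracy of k-NN on `test`, using only the features where mask is 1."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    correct = 0
    for x, y in test:
        # Squared Euclidean distance restricted to the selected features.
        dists = sorted((sum((x[j] - t[j]) ** 2 for j in idx), label)
                       for t, label in train)
        votes = [label for _, label in dists[:k]]
        correct += max(set(votes), key=votes.count) == y
    return correct / len(test)

def fitness(mask, train, test, penalty=0.01):
    """Accuracy minus a small (assumed) penalty that favors fewer features."""
    return knn_accuracy(train, test, mask) - penalty * sum(mask) / len(mask)

def elitist_ga(train, test, n_features, pop_size=20, generations=30,
               p_mut=0.05, rng=None):
    rng = rng or random.Random(0)
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:  # each distinct mask is evaluated only once
            cache[key] = fitness(mask, train, test)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=score)
    history = [score(best)]
    for _ in range(generations):
        def pick():
            # Binary tournament selection (an assumed choice).
            a, b = rng.choice(pop), rng.choice(pop)
            return a if score(a) >= score(b) else b

        children = []
        while len(children) < pop_size - 1:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children + [best]                           # elitism
        best = max(pop, key=score)
        history.append(score(best))
    return best, history

rng = random.Random(42)
train, test = make_data(60, rng=rng), make_data(30, rng=rng)
best_mask, history = elitist_ga(train, test, n_features=10, rng=rng)
print("selected features:", [j for j, b in enumerate(best_mask) if b])
print("best fitness per generation:", [round(h, 3) for h in history])
```

Because the best mask always survives, the best fitness never decreases from one generation to the next. In a real application the fitness would more likely be computed by cross-validation, and a Random Forest or Decision Tree could replace the KNN scorer without changing the genetic algorithm itself.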
License
Proposed Policy for Open Access Journals
Authors who publish in this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution License, which permits sharing of the work with acknowledgment of authorship and of initial publication in this journal.
- Authors are authorized to enter into separate, additional contracts for the non-exclusive distribution of the version of the work published in this journal (e.g., posting it in an institutional repository or publishing it as a book chapter), with acknowledgment of authorship and of initial publication in this journal.
- Authors are permitted and encouraged to publish and distribute their work online (e.g., in institutional repositories or on their personal page) at any point before or during the editorial process, as this can lead to productive changes as well as increase the impact and citation of the published work (see The Effect of Open Access).