Predicting the price of Brazilian Natural coffee using statistical machine learning models

Authors

  • Lucas Pereira Lopes, USP - Universidade de São Paulo

Keywords:

Modelling, Coffee Price, Commodities, Brazil, Machine Learning

Abstract

Knowledge of price behavior is extremely useful for decision-making by coffee producers. However, the literature reaches ambiguous conclusions about the determinants of agricultural prices; problems in the methodologies adopted, errors in variable selection, and ignored statistical assumptions are some of the reasons for the divergent results. Given the importance of coffee to the Brazilian economy, the main aim of this work is to study Statistical Machine Learning models for predicting the price of Brazilian coffee. In descending order of predictive power, the models ranked as follows: Support Vector Machine (SVM) with a linear kernel, LASSO, SVM with a Gaussian kernel, Boosting, Regression Tree, K-NN, and Random Forest. In addition, the results of most models were highly correlated with one another, and the models corroborated the choice of the variables that most affect the price. We further believe that this approach helps to produce more robust conclusions about the determinants of coffee price variability and is therefore a potential tool for risk management and control by administrators.
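
To make the idea of ranking models by predictive power concrete, the following is a minimal sketch rather than the procedure used in the paper. The abstract does not state the error metric or the validation scheme, so the sketch assumes mean squared error under 5-fold cross-validation. The data are synthetic, and the three stand-in models (a one-variable linear fit, K-NN, and a constant-mean baseline) and all function names are hypothetical choices made for illustration only.

```python
import random
import statistics


def make_data(n=120, seed=0):
    """Synthetic (x, y) pairs: a linear signal plus Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 5.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    """Ordinary least squares with a single predictor; returns a prediction function."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """K-nearest-neighbours regression: average the targets of the k closest points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean([y for _, y in nearest])

    return predict


def fit_mean(xs, ys):
    """Baseline that ignores the predictor and always returns the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def cv_mse(fit, xs, ys, folds=5):
    """Mean squared error over all held-out points of a k-fold cross-validation."""
    n = len(xs)
    squared_errors = []
    for f in range(folds):
        test_idx = set(range(f, n, folds))  # every `folds`-th point is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        squared_errors.extend((ys[i] - model(xs[i])) ** 2 for i in test_idx)
    return statistics.fmean(squared_errors)


def rank_models(models, xs, ys):
    """Return (name, cross-validated MSE) pairs, best (lowest error) first."""
    scores = {name: cv_mse(fit, xs, ys) for name, fit in models.items()}
    return sorted(scores.items(), key=lambda item: item[1])


xs, ys = make_data()
MODELS = {"linear": fit_linear, "knn": fit_knn, "mean": fit_mean}
ranking = rank_models(MODELS, xs, ys)
for name, mse in ranking:
    print(f"{name}: {mse:.3f}")
```

Holding out each fold in turn means every observation is predicted by a model that never saw it, so the resulting ordering reflects out-of-sample error rather than goodness of fit on the training data.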

Author Biography

Lucas Pereira Lopes, USP - Universidade de São Paulo

Instituto de Ciências Matemáticas e de Computação - ICMC USP

Published

14-09-2018

How to Cite

Lopes, L. P. (2018). Predicting the price of Brazilian Natural coffee using statistical machine learning models. Sigmae, 7(1), 1–16. Retrieved from https://publicacoes.unifal-mg.edu.br/revistas/index.php/sigmae/article/view/699

Issue

Vol. 7 No. 1 (2018)

Section

Data Science & Machine Learning