Outliers labeling: Faleschini method for univariate quantitative data

Authors

  • Luis Fernando Maia Lima (UNIR)
  • João Marcelo Brazão Protázio (Universidade Federal do Pará)

Keywords:

Outliers, Outliers Labeling, Faleschini

Abstract

The objective of this work is to present Faleschini's method for labeling outliers in univariate quantitative data. The method uses the mean, the standard deviation, and the moment coefficients of skewness and kurtosis, which lead to a fourth-degree equation whose smallest and largest roots can be adopted as outlier labelers. Faleschini's method is compared with Tukey's method and with the recent method of Adil and Zaman. For continuous theoretical distributions that lack a mean, standard deviation, skewness or kurtosis, “pseudo” parameters based on percentiles and quartiles are proposed; these “pseudo” parameters are also used for discrete distributions and for sample data, and a procedure for calculating the percentiles and quartiles is proposed as well. The article concludes that Faleschini's method has a conceptual advantage: it takes location, dispersion, skewness and kurtosis into account, and it does not distinguish between light-tailed and heavy-tailed distributions. It also points to the “pseudo” parameters as a possible extension to other fields, such as multivariate analysis and time series.
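
As a rough illustration of the quantities named in the abstract, the sketch below computes the four moment-based inputs (mean, standard deviation, moment coefficients of skewness and kurtosis) and, for comparison, the fences of Tukey's rule. The fourth-degree equation of Faleschini's method is not stated in the abstract, so it is deliberately not reproduced here. The function names `moment_summary`, `tukey_fences` and `label_outliers` are hypothetical, and the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption; it is not necessarily the percentile calculation the article proposes.

```python
import statistics


def moment_summary(data):
    """Return the mean, the population standard deviation, and the moment
    coefficients of skewness (m3 / m2**1.5) and kurtosis (m4 / m2**2)."""
    n = len(data)
    mean = statistics.fmean(data)
    # Central moments of order 2, 3 and 4 (divisor n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    if m2 == 0:
        raise ValueError("all values are identical; skewness and kurtosis are undefined")
    return {
        "mean": mean,
        "sd": m2 ** 0.5,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


def tukey_fences(data, k=1.5):
    """Tukey's rule: the labeling interval is [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: 'inclusive' quartiles; other conventions give other fences.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def label_outliers(data, lower, upper):
    """Return the values lying strictly outside the interval [lower, upper]."""
    return [x for x in data if x < lower or x > upper]


# Made-up sample used only for illustration.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
summary = moment_summary(sample)
lower, upper = tukey_fences(sample)
flagged = label_outliers(sample, lower, upper)
print(summary)
print(lower, upper, flagged)
```

Any pair of labelers, including the smallest and largest roots that Faleschini's method would supply, could be passed to `label_outliers` in place of Tukey's fences; that is the only sense in which the sketch relates to the method itself.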

References

ADIL, Iftikhar Hussain; IRSHAD, Ateeq ur Rehman. A modified approach for detection of outliers. Pakistan Journal of Statistics and Operation Research, Lahore, v. 11, n. 1, p. 91-102, Apr. 2015.

ADIL, Iftikhar Hussain; ZAMAN, Asad. Outliers detection in skewed distributions: split sample skewness based boxplot. Economic Computation and Economic Cybernetics Studies and Research, Bucharest, v. , n. 3, p. 279-296, 2020.

ANDRADE, Larissa Ribeiro de; CIRILLO, Marcelo Angelo; BEIJO, Luiz Alberto. Proposal of a bootstrap procedure using measures of influence in non-linear regression models with outliers; doi: 10.4025/actascitechnol.v36i1.17564. Acta Scientiarum. Technology, v. 36, n. 1, p. 93-99, 7 Jan. 2014.

BABURA, Babangida Ibrahim; ADAM, Mohd Bakri; FITRIANTO, Anwar; SAMAD, Abdul Rahim Abdul. Modified boxplot for extreme data. AIP Conference Proceedings, New York, v. 1842, issue 1, May 2017.

BARBOSA, Josino José; PEREIRA, Tiago Martins; OLIVEIRA, Fernando Luiz Pereira de. Uma proposta para identificação de outliers multivariados. Ciência e Natura, [S. l.], v. 40, p. e40, 2018. DOI: 10.5902/2179460X29535. Available at: https://periodicos.ufsm.br/cienciaenatura/article/view/29535 Accessed: 21 Aug. 2023.

BARBOSA, Josino José; DUARTE, Anderson Ribeiro; MARTINS, Helgem Souza Ribeiro. A performance evaluation in multivariate outliers identification methods. Ciência e Natura, [S. l.], v. 42, p. e16, 2020. DOI: 10.5902/2179460X41662. Available at: https://periodicos.ufsm.br/cienciaenatura/article/view/41662 Accessed: 21 Aug. 2023.

BARNETT, Vic; LEWIS, Toby. Outliers in statistical data. 3. ed. Chichester: John Wiley & Sons, 1994.

BRYS, Guy; HUBERT, Mia; STRUYF, Anja. A robust measure of skewness. Journal of Computational and Graphical Statistics, v. 13, n. 4, p. 996-1017, December 2004.

BRUFFAERTS, Christopher; VERARDI, Vincenzo; VERMANDELE, Catherine. A generalized boxplot for skewed and heavy-tailed distributions. Statistics and Probability Letters, Amsterdam, v. 95, p. 110-117, Dec. 2014.

CARLING, Kenneth. Resistant outlier rules and the non-Gaussian case. Computational Statistics & Data Analysis, v. 33, n. 3, p. 249-258, 2000.

CHAMBERS, J. M. et al. Graphical methods for data analysis. USA: Wadsworth, 1983.

FALESCHINI, Luigi. Su alcune proprietà dei momenti impiegati nello studio della variabilità, asimmetria e curtosi. Statistica, anno VIII, n. 4, p. 503-513, Ottobre-Dicembre 1948.

FIORI, Anna Maria; ZENGA, Michele. The meaning of kurtosis, the influence function and an early intuition by L. Faleschini. Statistica, anno LXV, n. 2, p. 135-144, 2005.

HUBERT, M.; VANDERVIEREN, E. An adjusted boxplot for skewed distributions. Computational Statistics & Data Analysis, Amsterdam, v. 52, n. 12, p. 5186-5201, Aug. 2008.

JONES, M. C.; ROSCO, J. F.; PEWSEY, Arthur. Skewness-Invariant Measures of Kurtosis. The American Statistician, v. 65, n. 2, p. 89-95, May 2011.

KIMBER, A. C. Exploratory data analysis for possibly censored data from skewed distributions. Applied Statistics, v. 39, n. 1, p. 21-30, 1990.

LIMA, Luís Fernando Maia; MAROLDI, Alexandre Masson; SILVA, Dávilla Vieira Odízio da; HAYASHI, Carlos Roberto Massao; HAYASHI, Maria Cristina Piumbato Innocentini. Métricas científicas em estudos bibliométricos: detecção de outliers para dados univariados. Em Questão, Porto Alegre, v. 23, Edição Especial 5 EBBC, p. 254-273, Jan. 2017.

LIMA, Luís Fernando Maia; MAROLDI, Alexandre Masson; SILVA, Dávilla Vieira Odízio da; HAYASHI, Carlos Roberto Massao; HAYASHI, Maria Cristina Piumbato Innocentini. A influência de outliers nos estudos métricos da informação: uma análise de dados univariados. Em Questão, Porto Alegre, v. 24, Edição Especial 6 EBBC, p. 216-235, 2018.

PEREIRA, Tiago Martins; CIRILLO, Marcelo Ângelo; OLIVEIRA, Fernando Luiz Pereira de. Chisquaremax rotation criterion in factor analysis: a Monte Carlo assessment of the effect of outliers. Acta Scientiarum. Technology, v. 36, n. 4, p. 643-649, 12 Sep. 2014.

RODRIGUES, Paulo Jorge Canas; ALMEIDA, Rafael; MUSTAFA, Kézia. The usefulness of robust multivariate methods: A case study with the menu items of a fast food restaurant chain. Ciência e Natura, [S. l.], v. 42, p. e17, 2020. DOI: 10.5902/2179460X39892. Available at: https://periodicos.ufsm.br/cienciaenatura/article/view/e18%27 Accessed: 21 Aug. 2023.

ROSADO, Fernando. Outliers em dados estatísticos. Lisboa: Sociedade Portuguesa de Estatística, 2006.

SILVA, Kelly C. Ramos da; OLIVEIRA, Helder L. Costa de; CARVALHO, André C.P.L.F. de. Performance evaluation of outlier rules for labelling outliers in multidimensional dataset. International Journal of Business and Data Mining, v. 19, n. 2, p. 135-152, 21 July 2021.

SILVA, Kelly Cristina Ramos da. Regras robustas para rotular outliers em dados de caudas leves e caudas pesadas. Doctoral thesis. 2019. Available at: https://teses.usp.br/teses/disponiveis/55/55134/tde- -145141/pt-br.php Accessed: 12 Aug. 2023.

TAMBAY, Jean-Louis. An integrated approach for the treatment of outliers in sub-annual economic surveys. American Statistical Association Proceedings of the Survey Research Methods. Alexandria, VA: American Statistical Association, p. 229-234, 1988.

TRIOLA, Mario F. Introdução à Estatística. 10. ed. Rio de Janeiro: LTC, 2012.

TUKEY, John Wilder. Exploratory data analysis. Reading, Massachusetts: Addison-Wesley, 1977.

VELLEMAN, P. F.; HOAGLIN, D. C. Applications, basics, and computing of exploratory data analysis. Boston: Duxbury, 1981.

VELOSO, Manoel Vitor de; CIRILLO, Marcelo Angelo. Principal components in the discrimination of outliers: A study in simulation sample data corrected by Pearson’s and Yates’s chi-square distance. Acta Scientiarum. Technology, v. 38, n. 2, p. 193-200, 1 Apr. 2016.

VISSOTTO JUNIOR, Dornelles; DIAS, Nelson Luís. Método Empírico para Determinação de outliers em Séries de Fluxos de dados Micrometeorológicos Pós-processados. Ciência e Natura, [S. l.], v. 35, p. 169–171, DOI: 10.5902/2179460X11585. Available at: https://periodicos.ufsm.br/cienciaenatura/article/view/11585 Accessed: 21 Aug. 2023.

WALKER, M. L.; DOVOEDO, Y. H.; CHAKRABORTI, S.; HILTON, C. W. An improved boxplot for univariate data. The American Statistician, v. 72, n. 4, p. 348-353, November 2018.

Published

21-08-2024

How to Cite

Lima, L. F. M., & Protázio, J. M. B. (2024). Outliers labeling: Faleschini method for univariate quantitative data. Sigmae, 13(2), 128–139. Retrieved from https://publicacoes.unifal-mg.edu.br/revistas/index.php/sigmae/article/view/2283

Section

Probability and Statistics