A Statistical Comparison of Java and Python Software Metric Properties
Ortu, M.; Marchesi, M.
2016-01-01
Abstract
This paper presents a statistical analysis of 20 open-source object-oriented systems, aimed at detecting differences in metric distributions between Java and Python projects. We selected ten Java projects from the Java Qualitas Corpus and ten projects written in Python. For each system, we considered 10 class-level software metrics. We fitted the empirical distributions with the log-normal and the double Pareto distributions to identify differences between the two languages. Although the distributions for Java and Python projects may appear the same for lower values of a metric, the double Pareto fit for the Number of Local Methods metric reveals major differences along the tail of the distributions. In contrast, the same analysis on the Number of Statements metric shows that only the initial portion of the double Pareto distribution differs between the two languages. In addition, the dispersion parameter of the log-normal fit for the total Number of Methods can be used to distinguish Java projects from Python projects.
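The log-normal fit mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a plain maximum-likelihood fit, where the location and dispersion parameters are the mean and standard deviation of the logarithms of the metric values. The function name `fit_lognormal` and the synthetic sample are hypothetical, and dropping zero-valued observations is an assumption made only because the logarithm is undefined there.

```python
import math
import random
import statistics


def fit_lognormal(values):
    """Return (mu, sigma), the maximum-likelihood log-normal parameters.

    mu is the mean of log(x) and sigma is the population standard
    deviation of log(x); sigma plays the role of the dispersion parameter.
    Non-positive values are dropped (an assumption of this sketch).
    """
    logs = [math.log(v) for v in values if v > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive observations")
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs, mu)
    return mu, sigma


# Synthetic stand-in for a per-class metric; not data from the paper.
rng = random.Random(0)
synthetic_metric = [rng.lognormvariate(1.5, 0.8) for _ in range(5000)]

mu_hat, sigma_hat = fit_lognormal(synthetic_metric)
print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
```

Applied to each project in turn, such a fit would yield one dispersion value per project, which is the kind of quantity the abstract describes as distinguishing the two languages. The double Pareto fit is not sketched here.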