The spread of free/libre/open source software (FLOSS) and the openness of its development model offer researchers a valuable source of software data. Large portals that host a vast number of FLOSS projects make it easy to build large datasets with valuable information about the FLOSS development process. In addition, initiatives such as FLOSSMole provide researchers with a single point of continuing access to those data. Until now, the majority of FLOSSMole datasets have offered data about the development process rather than the code itself. Since February 2007, FLOSSMole has offered data donated
by SourceKibitzer, which contain source code metrics for FLOSS projects written in Java. In this paper we provide
a preliminary analysis of those data using machine learning techniques such as classification rules and decision trees. Using the first available data, from February 2007, we tried to build rules that can be used to estimate the future values of the metrics offered for March. The preliminary results presented here are encouraging and merit further analysis on future releases of the SourceKibitzer datasets.