"We employ a unique data set consisting of 71 open source projects hosted at the SourceForge web site. The 71 projects in the sample were chosen (in January 2000)"
"This sample was observed over an 18-month period from January 2002 through the middle of 2003, with data collected at 2-month intervals."
"We are grateful to NERA for providing us with the data."
"Although we have data on only a relatively small sample of the projects hosted at SourceForge, the sample is unique because it includes data on lines of code as well as on the different versions of each program. The latter is a potentially important control variable, since a change in version may necessitate additional lines of code.
Our data set contains information on the size of the open source projects in the form of source lines of code (SLOC). Using SLOC as a performance measure is not always ideal; nevertheless, this performance measure is employed in the profession and the literature.15 For our purposes, SLOC is in fact an ideal measure, because we want to measure the effort that is put into the project, rather than whether a project succeeds."