"First, we selected the projects to initially target, using several criteria to get a broad picture of the open source landscape. Second, we collected the actual data, using a framework of parsers and some manual inspection. Third, we standardized and inserted the data into a database for later use."
"but we plan to eventually cross reference our list of projects with existing open source project information (such as FLOSSmole) to take advantage of the work already done by other researchers."
"For each release, we collected the following data: the project it belonged to, the date the release was published, the type of release, the release label (version number) and the source of the data"
discussion of their difficulties
"We conclude that programmatically creating a release history database from existing open source data is not trivial,"
"We have currently collected 1579 distinct releases from 22 different open source projects"