Abstract:
Several types of clones exist in software systems due to copy-paste activity, developer limitations, language
restrictions, and the software development lifecycle. This work studies the issues of cloning in server-side
technologies for web applications. We studied 11 reasonably sized web development projects (averaging over
22K LOC) written in C#, Java, Ruby-on-Rails (ROR), and PHP, all based on the same set of
requirements. We identified and analyzed simple and structural clones present in these systems in order to
compare the different technologies in terms of number of clones, clone size, clone coverage, the reasons behind
clone creation, and the ratio of refactorable to non-refactorable clones. Our study focused only on the
base languages of these server-side technologies. Our analyses show that C# has the highest number of clones
and ROR has the lowest. C# also has the highest percentage of refactorable clones and ROR the lowest.
PHP has the highest clone coverage and ROR has the lowest. The average clone size across all projects ranges from
49.8 to 77.2 tokens. In terms of clone size, there are no significant differences among projects built with the same
technology. The project size, project architecture, and developer approach dictate the percentage of clones
present in a software project. The use of frameworks and design patterns helps control the generation of clones.