A Technique of Code Clone Detection based on Defined Mechanism for Threshold Calculation


  • R. Mehboob, UET, Taxila
  • S. Shabbir, UET, Taxila
  • A. Javed, UET, Taxila


Over the past few years, advances in technology and the widespread use of programming languages for product development have made code reuse a common practice. Consequently, code cloning is also increasing, leading to redundancy and higher maintenance costs. The motivation of this work is to identify code clones in pairs of code fragments that are to be used in a project under consideration. Existing approaches based on control flow graphs (CFGs) and abstract syntax trees (ASTs) operate at a high level of abstraction that masks the inner details of the code. A strategy is therefore needed that defines how the threshold value for identifying clones is calculated while taking those inner details into account. This paper presents a defined mechanism for computing the threshold for code clone detection. The inner details of the code are examined through a comparative analysis of tokens and conditional clauses. The proposed technique avoids the high level of abstraction introduced by CFGs. The method is tested on a custom dataset of different sorting algorithms, and the experimental results indicate that it is effective for code clone detection.
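The abstract does not state the threshold formula, so the following is only an illustrative sketch of the general idea: compare two code fragments by their tokens and by their conditional clauses, then flag a clone when the combined score reaches a threshold. The Jaccard measure, the equal weighting, the default threshold of 0.7, and all function names are assumptions made for this example and are not taken from the paper.

```python
import re


def tokenize(source):
    """Split source text into identifiers, numbers, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)


def conditional_clauses(source):
    """Extract conditions of if/while/for headers (no nested parentheses).

    Whitespace is removed so that formatting differences are ignored.
    """
    found = re.findall(r"\b(?:if|while|for)\s*\(([^()]*)\)", source)
    return ["".join(clause.split()) for clause in found]


def jaccard(items_a, items_b):
    """Jaccard similarity of two collections, treated as sets."""
    set_a, set_b = set(items_a), set(items_b)
    if not set_a and not set_b:
        return 1.0  # two empty collections are considered identical
    return len(set_a & set_b) / len(set_a | set_b)


def similarity(src_a, src_b, token_weight=0.5):
    """Weighted mix of token similarity and conditional-clause similarity.

    token_weight is a hypothetical parameter, not a value from the paper.
    """
    token_score = jaccard(tokenize(src_a), tokenize(src_b))
    clause_score = jaccard(conditional_clauses(src_a), conditional_clauses(src_b))
    return token_weight * token_score + (1 - token_weight) * clause_score


def is_clone(src_a, src_b, threshold=0.7):
    """Flag a pair as a clone when the score reaches the assumed threshold."""
    return similarity(src_a, src_b) >= threshold


bubble_pass = "for (i = 0; i < n; i++) { if (a[i] > a[i+1]) swap(a, i); }"
unrelated = "print('hello')"

print(is_clone(bubble_pass, bubble_pass))
print(is_clone(bubble_pass, unrelated))
```

In this sketch the threshold is a fixed constant chosen by hand; the paper's contribution is a defined mechanism for computing that value, which the abstract does not detail.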

Author Biographies

R. Mehboob, UET, Taxila

Software Engineering Department

S. Shabbir, UET, Taxila

Software Engineering Department

A. Javed, UET, Taxila

Software Engineering Department






How to Cite

R. Mehboob, S. Shabbir, and A. Javed, “A Technique of Code Clone Detection based on Defined Mechanism for Threshold Calculation”, The Nucleus, vol. 54, no. 4, pp. 197–204, Jan. 2018.