zhangxunhui/CPGE

RCD: Risky Clone Detection

This is a project for finding factors related to bad clones.

How to use this

First, clone this repository onto your own server.

All dependent services run in Docker, so install Docker on your server first. Here is a guide for Ubuntu 20.04. Then install docker-compose by running sudo apt install docker-compose.

Add the current user to the docker group so that Docker commands can be run without sudo.

sudo groupadd docker
sudo gpasswd -a $USER docker
newgrp docker
docker ps # test whether docker command can be used

Install docker-compose following this guide.

Install the dependencies using the following steps:

  1. Run docker network create cpge to create the network.
  2. Run cd dependencies to enter the dependencies directory.
  3. Run docker-compose up -d to start all dependent services: kafka (which depends on zookeeper), mysql, and kibana (which depends on elasticsearch).
  • If errors occur when starting elasticsearch and kibana, run:
    • sudo chmod 777 -R elasticsearch
    • sudo chmod 777 -R kibana
    • docker-compose up -d
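
The dependency steps above can be wrapped in a small script so that they are safe to rerun. The following is only a sketch: the names run, start_dependencies, and DRY_RUN are hypothetical and not part of this repository. It assumes docker and docker-compose are already installed and that the current user can run them.

```shell
#!/usr/bin/env bash
# Sketch only: run, start_dependencies and DRY_RUN are hypothetical names,
# not part of this repository.

# Print the command when DRY_RUN is set; otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1: path to the dependencies directory (defaults to ./dependencies).
start_dependencies() {
  local dir="${1:-dependencies}"
  (
    cd "$dir" || exit 1
    # Create the cpge network only if it does not exist yet, so reruns are safe.
    if [ -n "${DRY_RUN:-}" ] || ! docker network inspect cpge >/dev/null 2>&1; then
      run docker network create cpge || exit 1
    fi
    run docker-compose up -d || exit 1
    # List container states so that exited services are easy to spot.
    run docker-compose ps
  )
}
```

Calling start_dependencies from the repository root starts the services. Calling it with DRY_RUN=1 only prints the commands. If elasticsearch or kibana appear as exited in the listing, apply the chmod commands above and call the function again.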

Install environments:

  • git:
    • install git, version 2.34 or later
  • python:
    • create a Python virtual environment with Anaconda using the command conda create -n bad_clone python=3.7.11
    • activate the environment using the command conda activate bad_clone
    • install the required Python packages using the command pip install -r requirements.txt
  • MySQL:
    • this project uses MySQL 8.0.30
    • copy the configuration template and rename it using the command cp ./config.template.yml ./config.yml
    • fill in each section of config.yml following the hints in the template
  • Java:
    • JDK 11 or later is required to run the clone detector NIL.
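
Before running the project, it can help to confirm that the installed tools meet the minimum versions listed above. The following is a minimal sketch, not part of this repository: every function name is hypothetical, and version_ge assumes a sort that supports -V (GNU coreutils, as shipped on Ubuntu).

```shell
#!/usr/bin/env bash
# Sketch only: all function names below are hypothetical helpers.

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extracts "2.34.1" from a line such as "git version 2.34.1".
git_version_of() {
  printf '%s\n' "$1" | awk '{print $3}'
}

# Extracts the major Java version from the first line of `java -version`,
# handling both the legacy "1.8.0_292" and the modern "11.0.16" schemes.
java_major_of() {
  local v
  v=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$v" in
    1.*) printf '%s\n' "$v" | cut -d. -f2 ;;
    *)   printf '%s\n' "$v" | cut -d. -f1 ;;
  esac
}

# Checks git (>= 2.34) and the JDK (>= 11) and reports anything too old.
check_environment() {
  local status=0 git_v java_m
  git_v=$(git_version_of "$(git --version)")
  version_ge "$git_v" "2.34" || { echo "git ${git_v:-none} is older than 2.34"; status=1; }
  # `java -version` writes to stderr, hence the redirection.
  java_m=$(java_major_of "$(java -version 2>&1 | head -n 1)")
  [ "${java_m:-0}" -ge 11 ] || { echo "JDK 11+ is required, found: ${java_m:-none}"; status=1; }
  return "$status"
}
```

Running check_environment after activating the bad_clone environment returns a non-zero status and prints a message for each tool that is missing or too old.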

Run the project

  1. Start collecting data for repositories by running the following commands:
git clone https://gitlink.org.cn/MillerEvan/bad_clone_prediction.git
cp repos.example repos # You need to add your own repositories in the repos file
conda activate bad_clone
python RCDMain.py
  2. Delete the data for specific repositories:
cp delete_repos.example delete_repos # You need to add your own repositories in the delete_repos file
python deleteProject.py
