This is a project for finding factors related to bad clones.
First, clone this repository onto your own server.
All the dependent services run in Docker, so install Docker on your server first. Here is a guide for Ubuntu 20.04.
Then install docker-compose by running:
sudo apt install docker-compose
Grant the current user permission to run Docker without sudo:
sudo groupadd docker
sudo gpasswd -a $USER docker
newgrp docker
docker ps # test whether docker command can be used
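If `docker ps` still reports a permission error, one thing worth checking is whether the `docker` group is active in the current shell. The following is a minimal sketch, assuming a POSIX shell; `has_group` is a hypothetical helper name and not part of this project.

```shell
# has_group GROUP_LIST NAME: succeed if NAME is one of the
# whitespace-separated entries in GROUP_LIST.
# (has_group is a hypothetical helper, not part of this project.)
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# `id -nG` prints the group names that apply to the current shell.
if has_group "$(id -nG)" docker; then
    echo "docker group is active in this shell"
else
    echo "docker group not active yet; log out and back in, or run: newgrp docker"
fi
```

Group changes made with `gpasswd` only apply to new login sessions, which is why `newgrp docker` (or logging in again) is needed before the check passes.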
Install docker-compose following this guide.
Install the dependencies using the following steps:
docker network create cpge
to create the network.
cd dependencies
docker-compose up -d
to install all the dependent services, including Kafka (with its ZooKeeper dependency), MySQL, and Kibana (with its Elasticsearch dependency).
- An error may occur when installing Elasticsearch and Kibana. If it does, loosen the permissions on their directories and start the services again:
sudo chmod 777 -R elasticsearch
sudo chmod 777 -R kibana
docker-compose up -d
- git:
- install Git, version 2.34 or later
- python:
- create a Python virtual environment with Anaconda using the command
conda create -n bad_clone python=3.7.11
- activate the environment using the command
conda activate bad_clone
- install the required Python packages using the command
pip install -r requirements.txt
- MySQL:
- this project uses MySQL 8.0.30
- copy the configuration template and rename it using the command
cp ./config.template.yml ./config.yml
- fill in each section of config.yml following the hints in the template
- Java:
- JDK 11 or later is needed to run the clone detector NIL.
- Start collecting data for repositories by running the following commands:
git clone https://gitlink.org.cn/MillerEvan/bad_clone_prediction.git
cp repos.example repos # You need to add your own repositories in the repos file
conda activate bad_clone
python RCDMain.py
- Delete the data for specific repositories:
cp delete_repos.example delete_repos # You need to add your own repositories in the delete_repos file
python deleteProject.py
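The Git (2.34 or later) and JDK (11 or later) requirements listed above can be checked before starting a run. The following is a minimal sketch, assuming a POSIX shell and a `sort` that supports `-V` (as in GNU coreutils); `version_ge` and `java_major` are hypothetical helper names and not part of this project.

```shell
# version_ge A B: succeed if dotted version A is >= B.
# Relies on `sort -V` (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# java_major VERSION: print the major version of a Java version string.
# Handles the legacy "1.8.0_292" scheme and the modern "11.0.2" scheme.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

if command -v git >/dev/null 2>&1; then
    # `git --version` prints e.g. "git version 2.34.1"
    git_ver=$(git --version | awk '{print $3}')
    if version_ge "$git_ver" 2.34; then
        echo "git $git_ver: OK"
    else
        echo "git $git_ver: need >= 2.34"
    fi
else
    echo "git not found"
fi

if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr; the version is the quoted string.
    java_ver=$(java -version 2>&1 | awk -F'"' '/version "/ {print $2; exit}')
    if [ "$(java_major "$java_ver")" -ge 11 ] 2>/dev/null; then
        echo "java $java_ver: OK"
    else
        echo "java $java_ver: need 11+"
    fi
else
    echo "java not found"
fi
```

The comparison is split into small functions so the version logic can be reused or tested without Git or Java installed.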