
Public code for AI-based DNA primer design using auto-encoder, convolutional neural network and Cox regression model.


soochem/ai-dna-primer-design


AI System for Primer Design

Autoencoder

  1. Train the encoder and decoder.
train.py [-h] [-k K_FOLDS]
        [-e NUM_EPOCHS]
        [-b BATCH_SIZE]
        [-l LEARNING_RATE]
        [--embedding_dim EMBEDDING_DIM]
        [--hidden_dim HIDDEN_DIM]
        [--plot_every PLOT_EVERY]
        [--data_path DATA_PATH]
        [--word_dict WORD_DICT]
        [--debug DEBUG]
  2. Regress CT values.
regress.py [-h] [-k K_FOLDS]
          [-e NUM_EPOCHS]
          [-b BATCH_SIZE]
          [-l LEARNING_RATE]
          [--embedding_dim EMBEDDING_DIM]
          [--hidden_dim HIDDEN_DIM]
          [--plot_every PLOT_EVERY]
          [--data_path DATA_PATH]
          [--word_dict WORD_DICT]
          [--debug DEBUG]
  3. Infer CT values.
inference.py [-h] [-k K_FOLDS]
            [-b BATCH_SIZE]
            [--embedding_dim EMBEDDING_DIM]
            [--hidden_dim HIDDEN_DIM]
            [--data_path DATA_PATH]
            [--word_dict WORD_DICT]
            [--debug DEBUG]
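
The usage strings above list the options of `train.py` but not their types or defaults. As a rough illustration only, the sketch below shows an `argparse` parser that would print a similar usage line. The long names for the short flags (`--k_folds`, `--num_epochs`, `--batch_size`, `--learning_rate`), every type, and every default value are assumptions and are not taken from the repository.

```python
import argparse


def build_parser():
    """Hypothetical reconstruction of the train.py options.

    Flag names follow the usage string; types and defaults are placeholders.
    """
    p = argparse.ArgumentParser(prog="train.py")
    p.add_argument("-k", "--k_folds", type=int, default=5)        # assumed default
    p.add_argument("-e", "--num_epochs", type=int, default=10)    # assumed default
    p.add_argument("-b", "--batch_size", type=int, default=32)    # assumed default
    p.add_argument("-l", "--learning_rate", type=float, default=1e-3)
    p.add_argument("--embedding_dim", type=int, default=64)
    p.add_argument("--hidden_dim", type=int, default=128)
    p.add_argument("--plot_every", type=int, default=100)
    p.add_argument("--data_path", default="./data/train.csv")     # placeholder path
    p.add_argument("--word_dict", default="./data/word_dict.json")  # placeholder path
    p.add_argument("--debug", default="False")                    # takes a value, per the usage
    return p


# Parse an example command line instead of sys.argv so the sketch runs standalone.
args = build_parser().parse_args(["-k", "5", "-e", "20", "--hidden_dim", "256"])
print(args.k_folds, args.num_epochs, args.hidden_dim)
```

`regress.py` appears to take the same options, and `inference.py` a subset without `-e`, `-l` and `--plot_every`, according to the usage strings above.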

Multi-input CNN

Two-step classifier-regressor

  1. Generate a binary 'label' column that flags rows with NaN values
  2. Train classifier
    python train_cnn_multi.py \
        --data_path='./data/train_df_with_label.csv' \
        --loss_function='bce_loss' \
        -e 1000 \
        --target_name='label' \
        --patience 10
    
  3. Run inference with the classifier (you may use '--model_path') to produce 'train/test_df_with_label_no_nan.csv'
    python inference_cnn_classifier.py \
        --data_path='./data/train_df_with_label.csv' \
        --target_name='label'    
    python inference_cnn_classifier.py \
        --data_path='./data/test_df_with_label.csv' \
        --target_name='label'    
    
  4. Train the regressor on data that excludes rows predicted as NaN
    python train_cnn_multi.py --data_path='./data/train_df_with_label_no_nan.csv' \
        -e 1000 \
        --patience 20
    
  5. Run inference with the regressor on the train set for qualitative analysis (you may use '--model_path')
    python inference_cnn_multi.py --data_path='./data/train_df_with_label_no_nan.csv'
    
  6. Run inference with the regressor on the test set to predict CT values (you may use '--model_path')
    python inference_cnn_multi.py --data_path='./data/test_df_with_label_no_nan.csv'
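
Step 1 above has no command attached, so here is a minimal sketch of one way to produce the binary 'label' column, using only the standard library. It assumes the target column is named `ct` and that label 1 means "CT is missing or NaN". The column name, the label polarity, and the helper names `is_missing` and `add_nan_label` are all hypothetical, so check them against the actual data before relying on them.

```python
import math


def is_missing(value):
    """Return True when a CSV cell holds no usable number (empty, NaN, or non-numeric)."""
    if value is None or value.strip() == "":
        return True
    try:
        return math.isnan(float(value))
    except ValueError:
        return True  # non-numeric text is treated as missing in this sketch


def add_nan_label(rows, target_col="ct", label_col="label"):
    """Return copies of the rows with a binary label: 1 if the target is missing, else 0.

    `target_col="ct"` and the 1-means-NaN polarity are assumptions.
    """
    return [
        {**row, label_col: 1 if is_missing(row.get(target_col)) else 0}
        for row in rows
    ]


# Tiny in-memory example; rows from csv.DictReader have the same shape.
rows = [
    {"seq": "ACGT", "ct": "23.5"},
    {"seq": "TTGA", "ct": ""},
    {"seq": "GGCC", "ct": "nan"},
]
labeled = add_nan_label(rows)
print([r["label"] for r in labeled])
```

To apply this to a file, the rows can be read with `csv.DictReader` and written back with `csv.DictWriter`, adding 'label' to the field names.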
    
