ConveRT (Conversational Representations from Transformers) TF model based on
Below is the code to train a new model with the same default settings as described in the paper.
from model import convert
model = convert.get_compiled_model(vocab_path, max_steps)
- vocab_path (str): path to any WordPiece-compatible vocab file that uses "##" as the suffix indicator. Alternatively, build one by running tools/ on the training corpus.
- max_steps (int): number of training steps after which learning rate decay stops.
- inputs: a dictionary mapping
keys to the corresponding input string array/tensors or
of the same format (check
Input specs:
"context": tf.TensorSpec(shape=(None,), dtype=tf.string),
"response": tf.TensorSpec(shape=(None,), dtype=tf.string)
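The input specs above describe two aligned one-dimensional string inputs, one entry per (context, response) pair. As a minimal sketch of that layout, the snippet below assembles such a dictionary from plain Python strings. `build_inputs` is a hypothetical helper used only for illustration and is not part of this repository; converting the lists to `tf.string` tensors (for example with `tf.constant`) is left out to keep the sketch dependency-free.

```python
from typing import Dict, List, Sequence, Tuple


def build_inputs(pairs: Sequence[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Split (context, response) pairs into two aligned 1-D string lists.

    Hypothetical helper: it only mirrors the "context"/"response" keys
    listed in the input specs.
    """
    contexts: List[str] = []
    responses: List[str] = []
    for context, response in pairs:
        if not isinstance(context, str) or not isinstance(response, str):
            raise TypeError("context and response must both be strings")
        contexts.append(context)
        responses.append(response)
    return {"context": contexts, "response": responses}


pairs = [
    ("how are you?", "fine, thanks"),
    ("where is the station?", "two blocks north"),
]
inputs = build_inputs(pairs)
```

Each position in `inputs["context"]` lines up with the same position in `inputs["response"]`, so the two lists always have the same length.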
Sample training code can be found in tools/
An ExportSavedModel callback is provided in model/ to export the model for inference.
The saved model can then be loaded in the following ways for inference:
import tensorflow as tf
import tensorflow_text as tf_text
from model import convert
inputs = {"context": ..., "response": ...}
# Option 1
model = tf.saved_model.load(saved_model)
serve_fn = model.signatures["serve"]
output = serve_fn(context=inputs["context"], response=inputs["response"])
# Option 2
model = tf.keras.models.load_model(
    saved_model,  # same saved model path as in Option 1
    custom_objects={"ConveRT": convert.ConveRT},
)
output = model(inputs)
- The model currently supports a single context only.
- Only one OOV (out-of-vocabulary) bucket is used.
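To illustrate what a single OOV bucket implies, here is a rough sketch of a vocabulary lookup. It assumes the common TensorFlow convention that OOV ids are placed directly after the in-vocabulary ids; the repository's actual lookup may differ. `lookup_ids` and the toy vocabulary are hypothetical and exist only for this example.

```python
import zlib
from typing import Dict, List


def lookup_ids(tokens: List[str], vocab: Dict[str, int],
               num_oov_buckets: int = 1) -> List[int]:
    """Map tokens to ids, hashing unknown tokens into OOV buckets.

    Assumption: OOV ids occupy the range
    [len(vocab), len(vocab) + num_oov_buckets).
    """
    ids = []
    for token in tokens:
        if token in vocab:
            ids.append(vocab[token])
        else:
            # A deterministic hash picks the bucket; with one bucket the
            # result is always 0, so every unknown token shares one id.
            bucket = zlib.crc32(token.encode("utf-8")) % num_oov_buckets
            ids.append(len(vocab) + bucket)
    return ids


# Toy WordPiece-style vocabulary; "##s" is a continuation piece.
vocab = {tok: i for i, tok in enumerate(["[PAD]", "hello", "##s", "world"])}
ids = lookup_ids(["hello", "##s", "zebra", "quokka"], vocab)
```

With one bucket, "zebra" and "quokka" both receive id 4 (the vocabulary size), so the model cannot tell different unknown tokens apart.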