V0.4 Release Note

@zhjwy9343 released this 15 Jan 18:27

The GraphStorm V0.4 release contains several major feature enhancements. This version introduces experimental support for using edge features in GNN message passing computation. Users can enable edge features by setting two new command line interface (CLI) arguments, --edge-feat-name and --edge-feat-mp-op, and the GraphStorm APIs were also updated to support edge features in message passing. This version also adds support for DGL’s GraphBolt, a new data loading module for DGL that enables faster and more efficient graph sampling. For link prediction on Paper100M, we achieved a 1.4X speedup in training and a 3.6X speedup in inference with GraphBolt enabled in GraphStorm.

We also enhanced distributed graph processing (GSProcessing) to support hard negative sampling, multitask mask generation, and saving and loading numeric feature transformations, and we added RotatE and TransE score functions for link prediction. New examples in this version show how to predict complex and dynamic network traffic, how to use the super-node method to perform graph-level prediction tasks, and how to use SageMaker Pipelines with GraphStorm and run GraphBolt-enabled jobs.

Major features

  • Support using edge features in GNN message passing computation. Users only need to set two new CLI arguments to use this new feature with an RGCN encoder. #1057, #1070, #1074, #1084, #1088, #1096, #1098, #1104.
  • GraphBolt integration. Users can use GraphBolt by setting one argument, --use-graphbolt, in graph processing and model training and inference. #1001, #1011, #1024, #1025, #1029, #1083, #1116.
  • GSProcessing enhancements: supporting hard negative sampling, multitask mask generation, and saving numeric feature transformations. #994, #1050, #1073, #1085, #1091, #1076, #1117.
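
As a rough illustration of how the new CLI arguments listed above might be combined, the following Python sketch assembles a training command as an argument list. Only the three flags (--edge-feat-name, --edge-feat-mp-op, and --use-graphbolt) come from this release note. The launcher module, the file paths, the edge feature name, and the value formats passed to each flag are assumptions made for illustration, and other launcher arguments a real job needs are omitted, so check the GraphStorm documentation before reusing it.

```python
import shlex


def build_train_command(part_config, yaml_config, edge_feat_name=None,
                        edge_feat_mp_op=None, use_graphbolt=False):
    """Assemble a GraphStorm-style training command as a list of arguments.

    The flag names come from the release note; the launcher module and
    the value formats are assumptions for illustration only.
    """
    cmd = [
        "python", "-m", "graphstorm.run.gs_node_classification",  # assumed launcher
        "--part-config", part_config,
        "--cf", yaml_config,
    ]
    if edge_feat_name is not None:
        cmd += ["--edge-feat-name", edge_feat_name]
        # The message passing operator only matters when edge features are used.
        if edge_feat_mp_op is not None:
            cmd += ["--edge-feat-mp-op", edge_feat_mp_op]
    if use_graphbolt:
        # Assumption: the flag takes an explicit boolean value.
        cmd += ["--use-graphbolt", "true"]
    return cmd


# Hypothetical paths and feature name, for illustration only.
cmd = build_train_command(
    part_config="/data/my_graph/my_graph.json",
    yaml_config="/data/my_nc_config.yaml",
    edge_feat_name="user,rates,movie:rating",
    edge_feat_mp_op="concat",
    use_graphbolt=True,
)
print(shlex.join(cmd))
```

Building the command as a list keeps each flag and its value together and makes the optional flags easy to toggle from a script.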

New Examples

  • Network time series traffic prediction. This example demonstrates how to use GraphStorm to make time series predictions on synthetic air transportation traffic data. #1109.
  • Graph-level prediction. This example demonstrates how to use the super-node method to perform graph-level prediction tasks using GraphStorm CLIs and APIs. #1021, #1026.
  • A new notebook example of using customized models with CLIs. #1049, #1087.
  • A new notebook example of running a distributed training pipeline on SageMaker. #1108, #1126.

Minor features

Breaking changes

  • API changes: RelGraphConvLayer adds two new arguments, edge_feat_name and edge_feat_mp_op, to support using edge features. Its forward function replaces the input argument inputs with two arguments, n_h and e_h, for node embeddings and edge embeddings, respectively. RelationalGCNEncoder also adds the two new arguments, edge_feat_name and edge_feat_mp_op, and its forward function replaces the input argument h with n_h and e_hs. #1074.
  • Decoders, including EntityClassifier, EntityRegression, DenseBiDecoder, EdgeRegression, MLPEdgeDecoder, and MLPEFeatEdgeDecoder, have a new argument, use_bias, to allow users to set bias in these decoders. #1111, #1125.
  • Modified the GSProcessing configuration parser to be equivalent to GConstruct's. #1117.
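
To make the n_h / e_h split in the API changes above more concrete, here is a minimal pure-Python sketch of message passing that combines node embeddings with edge embeddings. It is not GraphStorm's implementation: the function names, the set of operators ("concat", "add", "mul"), and the sum aggregation are all assumptions chosen for illustration. Only the argument names n_h and e_h mirror the release note.

```python
def combine(node_vec, edge_vec, op):
    """Combine a source node embedding with an edge embedding into one message."""
    if op == "concat":
        return node_vec + edge_vec  # list concatenation widens the message
    if op == "add":
        return [n + e for n, e in zip(node_vec, edge_vec)]
    if op == "mul":
        return [n * e for n, e in zip(node_vec, edge_vec)]
    raise ValueError(f"unsupported operator: {op}")


def message_passing(num_nodes, edges, n_h, e_h, op="concat"):
    """Sum the combined (source node, edge) messages at each destination node.

    edges: list of (src, dst) pairs; edge i uses the embedding e_h[i].
    n_h:   list of node embeddings, one per node.
    e_h:   list of edge embeddings, one per edge.
    """
    node_dim, edge_dim = len(n_h[0]), len(e_h[0])
    msg_dim = node_dim + edge_dim if op == "concat" else node_dim
    # Nodes without incoming edges keep an all-zero vector.
    out = [[0.0] * msg_dim for _ in range(num_nodes)]
    for eid, (src, dst) in enumerate(edges):
        msg = combine(n_h[src], e_h[eid], op)
        out[dst] = [acc + m for acc, m in zip(out[dst], msg)]
    return out


# Toy graph: nodes 0 and 1 each send one message to node 2.
n_h = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
e_h = [[10.0, 20.0], [30.0, 40.0]]
edges = [(0, 2), (1, 2)]
print(message_passing(3, edges, n_h, e_h, op="add"))
print(message_passing(3, edges, n_h, e_h, op="concat"))
```

Note that in this sketch "concat" changes the message width, while the element-wise operators require node and edge embeddings of equal size.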

Contributors