[dicp][ascend] Optimization for dynamic shape code logic. #791

Merged

Conversation

@pdx1989 pdx1989 (Collaborator) commented Apr 24, 2024

Optimize dynamic shape handling:

  1. Adjust the relationship between SymInt and InputArgs.
  2. Refine variable replacement in the in/out shape structures.
  3. Enable full expression evaluation during SymInt replacement (see the sketch below).
  4. Refactor code to merge duplicated execution paths.
  5. Support the lightllm llama 7B dynamic shape version.
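
As a rough illustration of items 1 and 3, here is a minimal sketch of the SymInt-to-input-args idea. The helper names (`build_symint_to_args`, `resolve_shape`) are hypothetical and are not the actual dicp API; the point is only that each free SymInt symbol is traced back to the input tensor and dimension it originates from, and symbolic in/out shapes are then resolved at run time by evaluating the whole sympy expression (e.g. `s0 * 2 + 1`), not by a single-symbol lookup.

```python
# Hypothetical sketch, not the dicp implementation.
import sympy
import torch


def build_symint_to_args(example_inputs):
    """Record where each free SymInt symbol comes from: name -> (input index, dim index)."""
    sym_to_arg = {}
    for arg_idx, tensor in enumerate(example_inputs):
        for dim_idx, size in enumerate(tensor.shape):
            if isinstance(size, torch.SymInt):
                for sym in size.node.expr.free_symbols:
                    sym_to_arg.setdefault(str(sym), (arg_idx, dim_idx))
    return sym_to_arg


def resolve_shape(symbolic_shape, sym_to_arg, runtime_inputs):
    """Substitute concrete dim values and evaluate the full expression
    (e.g. 's0 * 2 + 1') carried in the shape metadata."""
    values = {sympy.Symbol(name): int(runtime_inputs[idx].shape[dim])
              for name, (idx, dim) in sym_to_arg.items()}
    resolved = []
    for dim in symbolic_shape:
        if isinstance(dim, int):
            resolved.append(dim)
        else:  # a sympy expression, or its string form, from the traced graph
            resolved.append(int(sympy.sympify(str(dim)).subs(values)))
    return resolved
```

In this sketch the mapping is built once at compile time from the traced example inputs, and `resolve_shape` runs per call against the real tensors, so every in/out shape reduces to a list of concrete integers just before execution.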

@pdx1989 pdx1989 requested a review from jinminxi104 as a code owner April 24, 2024 07:08
@pdx1989 pdx1989 changed the title from "[WIP][dicp][ascend] Optimization for dynamic shape code logic." to "[dicp][ascend] Optimization for dynamic shape code logic." May 14, 2024
@jinminxi104 jinminxi104 merged commit 2947a6a into DeepLink-org:main May 29, 2024
7 of 8 checks passed
caikun-pjlab pushed a commit to DeepLink-org/deeplink.framework.dev that referenced this pull request Jun 11, 2024
…rg#791)

* Refine code structure of dynamic shape handling.

* Adjust symint_to_args relationship code logic.

* Remove redundant code.

* Enable 70B get_qkv stage dynamic shape.

* Fix complex size append.

* Change load_and_run in/out shape assignment.

* Refine variable replacement in in/out shape structure.

* Fix merge bugs.

* Change one comment and variable name.

* Fix an array assignment change.

* Code refinement including:

  1. Remove redundant Cast operator.

  2. Change logic of Expand shape proxy.

  3. Merge output stride execution path.

* Get a clear idea of the expand Cast situation.

* Apply some idea from Gpt AI.

* Revert "Apply some idea from Gpt AI."

This reverts commit 9025019.

* Remove dead use, replace const proxy.

* Support 7B dynamic shape version.

* Pass 1st dynamic graph model.

* Pass both graph models for the 7B dynamic shape version.

* Fix ci case incre_flash_attention.

* Change split execution path for both shape modes.

* Add execution path for copy_with_offset.

* Merge copy_with_offset shape path mode.

* Add const proxy for int split_size.

* Move some common functions into util.

* Add path for flash_attention to pass head, kvhead and dim in.

* Cancel path split for slice start proxy form.

* Add sequenceAt & triu unit test cases.

* Return several pieces of code logic back to the original design, and fix the flash_incre_attention unit test.

* Modify the logic of split implementation.

* Add split dynamic case.

* Remove identity additional logic, wrap convert into promote_dtype.

* Pass ge unit test.

* Modify logic of lt dtype, and prompt_attention fp16 conversion.

* Add promote_dtype priority logic.

* Fix promote_dtype bug.

* Cast fa back to float32 if dtypes are not consistent.

* Change to return q_dtype tensor.

* Improve promote_dtype logic (see the dtype-promotion sketch after this commit list).

* Add const proxy logic for promote_dtype.

* Fix flash_attention declaration.

* Remove Symint & Proxy from 7B static path.

* Change the const check method.

---------

Co-authored-by: chenchiyu <chenchiyu@pjlab.org.cn>
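
Several of the commits above revolve around dtype promotion (the promote_dtype priority logic, and casting the flash-attention output back to q_dtype when dtypes are inconsistent). The sketch below is only an assumed illustration of that pattern, not the dicp implementation; the `promote_dtype` signature, the priority table, and `attention_like` are hypothetical.

```python
# Hypothetical sketch of a promote_dtype-style helper, not the dicp code.
import torch

# Assumed priority order: the higher value wins when operand dtypes disagree.
_DTYPE_PRIORITY = {torch.bool: 0, torch.int32: 1, torch.int64: 2,
                   torch.float16: 3, torch.float32: 4}


def promote_dtype(*tensors, target=None):
    """Cast every tensor to `target` if given, otherwise to the highest-priority
    dtype among the operands; return the casted tensors and the dtype used."""
    if target is None:
        target = max((t.dtype for t in tensors), key=lambda d: _DTYPE_PRIORITY[d])
    return [t if t.dtype == target else t.to(target) for t in tensors], target


def attention_like(q, k, v, attn_fp16):
    # Run the attention kernel in fp16, then cast the result back so the caller
    # still sees q's original dtype (the "return q_dtype tensor" idea above).
    (q16, k16, v16), _ = promote_dtype(q, k, v, target=torch.float16)
    out = attn_fp16(q16, k16, v16)
    return out if out.dtype == q.dtype else out.to(q.dtype)
```
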
Wrench-Git pushed a commit to DeepLink-org/deeplink.framework.dev that referenced this pull request Jul 16, 2024