FP16 inference #78

Don-Chad · 2025-01-19T12:28:42Z

Awesome job so far!!

The original Kokoro allows FP16 loading, which speeds up inference by roughly 25%. It would be great to support this here. (I tried, but the current architecture does not allow it.)

Converting to FP16 is very straightforward, and can probably also be done instantly with model.half().

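from hashlib import sha256
from pathlib import Path
import torch

# Locate the original FP32 checkpoint in the parent of this script's directory.
path = Path(__file__).parent.parent / 'kokoro-v0_19.pth'
assert path.exists(), f'No model pth found at {path}'

# Load the weights on CPU and cast every tensor in each sub-dict to FP16.
net = torch.load(path, map_location='cpu', weights_only=True)['net']
for a in net:
    for b in net[a]:
        net[a][b] = net[a][b].half()

# Save the FP16 checkpoint, then compute its SHA-256 digest.
torch.save(dict(net=net), 'kokoro-v0_19-half.pth')
with open('kokoro-v0_19-half.pth', 'rb') as rb:
    h = sha256(rb.read()).hexdigest()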
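
For comparison, here is a minimal sketch of the in-memory route mentioned above (model.half()), next to a variant of the state-dict cast that skips non-floating-point tensors. The small nn.Sequential model and the helper name half_state_dict are hypothetical stand-ins, not part of Kokoro or this project. The sketch assumes a flat state dict; a nested checkpoint like the one above would need the helper applied to each sub-dict.

```python
import torch
from torch import nn


def half_state_dict(state):
    """Return a copy of a flat state dict with floating-point tensors cast to FP16.

    Integer tensors (for example BatchNorm's num_batches_tracked counter)
    are left untouched.
    """
    return {
        name: t.half() if t.is_floating_point() else t
        for name, t in state.items()
    }


# Hypothetical stand-in model; the real Kokoro modules are not shown here.
model = nn.Sequential(nn.Linear(8, 4), nn.BatchNorm1d(4))

# Route 1: cast a checkpoint-style state dict before saving it.
state = half_state_dict(model.state_dict())
dtypes = {name: t.dtype for name, t in state.items()}

# Route 2: cast the loaded module in place. nn.Module.half() only casts
# floating-point parameters and buffers, so integer buffers keep their dtype.
model_fp16 = model.half()
```

Whether FP16 actually runs faster depends on the hardware and on which operators the model uses, so it is worth benchmarking both precisions on the target device.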

remsky · 2025-01-26T20:20:47Z

Maintainer

Support for the half-precision model is implemented in the image for the v0.1.2-pre branch and later, and the option to use either precision should be available there as well.

1 reply

Don-Chad Jan 27, 2025
Author

fantastic! Thank you
