tn: add 億 through 千億 (hundred million through hundred billion) #223

Open
bensonbs opened this issue Jun 4, 2024 · 3 comments

bensonbs commented Jun 4, 2024

tn/chinese/rules/cardinal.py

        # 10001111, 1001111, 101111, 11111, 10111, 10011, 10001, 10000
        ten_thousand = ((thousand | hundred | ten | digit) + insert('萬') +
                        (thousand
                         | (zero + rmpunct + hundred)
                         | (rmzero + rmpunct + zero + tens)
                         | (rmzero + rmpunct + rmzero + zero + digit)
                         | rmzero**4))

        # add hundred_million
        hundred_million = 

FlynnFlag commented Jan 16, 2025

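# Low eight digits under 亿: four digits before the inserted 万, then four after.
_ten_thousand = ((thousand
                  | (zero + hundred)
                  | (rmzero + zero + rmpunct + tens)
                  | (rmzero + rmzero + rmpunct + zero + digit)) + insert('万') +
                 (thousand
                  | (zero + rmpunct + hundred)
                  | (rmzero + rmpunct + zero + tens)
                  | (rmzero + rmpunct + rmzero + zero + digit)
                  | rmzero**4))

# In Python, `+` binds tighter than `|`, so the alternative rmzero**8 is
# parenthesized together with _ten_thousand; otherwise it would become a
# standalone branch of billion instead of an option for the low eight digits.
billion = ((ten_thousand | thousand | hundred | ten | digit) + insert('亿') +
           (_ten_thousand | rmzero**8))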

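This took me a long time to write, and in my tests it appears to work.
The placement of rmzero looks questionable, but I did not want to rework it, so I strip the commas out of numbers in a preprocessing step instead.
If anything else turns out to be wrong, please let me know.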
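The comma-stripping preprocessing mentioned above could look something like the sketch below. This is only an assumption about how such a step might be written, not the code actually used here: `strip_thousands_separators` is a hypothetical helper, not part of the library, and the regex treats a comma as a separator only when it sits between digits and is followed by exactly three digits.

```python
import re

# Hypothetical helper (not part of the library): a comma counts as a thousands
# separator only if a digit precedes it and exactly three digits follow it.
_THOUSANDS_SEP = re.compile(r"(?<=\d),(?=\d{3}(?!\d))")


def strip_thousands_separators(text: str) -> str:
    """Remove Western thousands separators so the tagger sees plain digits."""
    return _THOUSANDS_SEP.sub("", text)


if __name__ == "__main__":
    print(strip_thousands_separators("999,999,999"))
```

Commas that do not look like separators, such as the one in "3,14" or in ordinary prose, are left alone. A list like "1,200" meaning "1 and 200" would still be merged, so this may need tightening for some inputs.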


Dobby22 commented Jan 20, 2025


A great solution, but when it processes 999,999,999 the output is 九百九十九,九十九万九千九百九十九.
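For comparison, the expected reading of 999,999,999 is 九亿九千九百九十九万九千九百九十九. A plain-Python reference converter can help check rule output for cases like this. The sketch below is independent of the FST rules and is not the library's implementation. `to_chinese` and its helpers are hypothetical names, and it assumes simplified characters, values below 10^12, and the spelling 一十 for ten.

```python
DIGITS = "零一二三四五六七八九"
SMALL_UNITS = ["", "十", "百", "千"]   # units inside one four-digit group
BIG_UNITS = ["", "万", "亿"]           # one unit per four-digit group


def _group_to_chinese(group: int) -> str:
    """Read a value in 1..9999, writing one 零 for each interior run of zeros."""
    out = []
    pending_zero = False
    for power in (3, 2, 1, 0):
        d = (group // 10 ** power) % 10
        if d == 0:
            # A zero only matters once a nonzero digit has been written.
            pending_zero = bool(out)
            continue
        if pending_zero:
            out.append("零")
            pending_zero = False
        out.append(DIGITS[d] + SMALL_UNITS[power])
    return "".join(out)


def to_chinese(n: int) -> str:
    """Reference reading for 0 <= n < 10**12 (up to the 千亿 range)."""
    if not 0 <= n < 10 ** 12:
        raise ValueError("supported range is 0 <= n < 10**12")
    if n == 0:
        return "零"
    groups = []                      # four-digit groups, lowest first
    while n:
        groups.append(n % 10000)
        n //= 10000
    out = []
    skipped_group = False
    for idx in range(len(groups) - 1, -1, -1):
        g = groups[idx]
        if g == 0:
            skipped_group = bool(out)
            continue
        # 零 is needed after a skipped group or when this group has a leading zero.
        if out and (skipped_group or g < 1000):
            out.append("零")
        skipped_group = False
        out.append(_group_to_chinese(g) + BIG_UNITS[idx])
    return "".join(out)


if __name__ == "__main__":
    print(to_chinese(999999999))
```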

@FlynnFlag

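That happens because, in the logic I wrote, reusing the original comma-position checks no longer works once the number reaches the 亿 level. My use case never needs commas anyway, so for convenience I strip them in preprocessing. You can also adjust where rmpunct sits in the combination yourself.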
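One way to see why the comma positions stop lining up: Western separators split a number into groups of three digits, while the 万 and 亿 boundaries split it into groups of four. The small sketch below illustrates this; `group_from_right` is a hypothetical helper written for this illustration only.

```python
from __future__ import annotations


def group_from_right(digits: str, size: int) -> list[str]:
    """Split a digit string into groups of `size`, counting from the right."""
    groups = []
    while digits:
        groups.append(digits[-size:])
        digits = digits[:-size]
    return groups[::-1]


number = "999999999"
print(group_from_right(number, 3))  # comma grouping: ['999', '999', '999']
print(group_from_right(number, 4))  # 亿 / 万 grouping: ['9', '9999', '9999']
```

For this nine-digit number the commas fall 3 and 6 digits from the right, while the 万 and 亿 boundaries fall 4 and 8 digits from the right. Any rule that accepts commas therefore has to allow them at different offsets inside each four-digit group.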
