Memory usage optimization via reuse of `SchemaValidator` and `SchemaSerializer` #1616

sydney-runkle · 2025-01-31T14:55:02Z

The main goal of this PR is to reduce memory usage associated with models by minimizing how much space SchemaValidators and SchemaSerializers consume.

This is done by reusing references to existing validators and serializers when there are nested structures present.

Reuse is enabled for model and (pydantic) dataclass core schemas. Notably, we don't apply this reuse pattern to generic dataclasses (see code commentary for more info).
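As a rough illustration of why reuse helps with nested structures, here is a minimal pure-Python sketch. It is not the pydantic-core implementation (which lives in Rust); `ToyValidator`, `build_validator`, `count_builds`, the `fields` class attribute, and the `__toy_validator__` attribute are all hypothetical names invented for this example. It only counts how many validator objects get built with and without reusing the one already attached to a nested class.

```python
class ToyValidator:
    """Hypothetical stand-in for a compiled validator; counts constructions."""

    built = 0

    def __init__(self, fields):
        ToyValidator.built += 1
        self.fields = fields  # field name -> plain type or nested ToyValidator

    def validate(self, data):
        out = {}
        for name, kind in self.fields.items():
            value = data[name]
            if isinstance(kind, ToyValidator):
                out[name] = kind.validate(value)  # delegate to nested validator
            else:
                out[name] = kind(value)  # simple coercion, e.g. int("1")
        return out


def build_validator(cls, reuse):
    """Build a validator for `cls`, optionally reusing nested prebuilt ones."""
    fields = {}
    for name, kind in cls.fields.items():
        if isinstance(kind, type) and hasattr(kind, "fields"):
            # Nested toy model: reuse the validator stored on the class, if any.
            prebuilt = vars(kind).get("__toy_validator__") if reuse else None
            fields[name] = prebuilt if prebuilt is not None else build_validator(kind, reuse)
        else:
            fields[name] = kind
    return ToyValidator(fields)


def count_builds(reuse):
    """Define three nested toy models and return how many validators were built."""
    ToyValidator.built = 0

    class Inner:
        fields = {"x": int}

    class Middle:
        fields = {"inner": Inner}

    class Outer:
        fields = {"a": Middle, "b": Middle, "c": Inner}

    for cls in (Inner, Middle, Outer):
        cls.__toy_validator__ = build_validator(cls, reuse)

    # Both strategies must validate identically; only the object count differs.
    data = {"a": {"inner": {"x": "1"}}, "b": {"inner": {"x": 2}}, "c": {"x": "3"}}
    expected = {"a": {"inner": {"x": 1}}, "b": {"inner": {"x": 2}}, "c": {"x": 3}}
    assert Outer.__toy_validator__.validate(data) == expected
    return ToyValidator.built


print("without reuse:", count_builds(reuse=False))
print("with reuse:", count_builds(reuse=True))
```

In this toy, every additional level of nesting multiplies the number of rebuilt validators when reuse is off, which is the same shape of growth the benchmarks in this PR are measuring.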

This has been benchmarked against many examples. The refactor has the most impact on examples with lots of large/nested models.

Two examples where improvements were particularly noticeable were for schema builds with:

- aiotdlib
- k8s_v2.py, a kubernetes model file

Some highlight stats:

- Extreme reduction in the total number of allocations: we measured up to almost 7x, and this could be even greater depending on model structure
- Significant reduction in resident memory size, which is probably the most important metric for users: our experiments showed reductions between 2x and 4x
- Reduction in total memory allocated (1.5-2x)
- Schema build times also have the potential to improve: we saw a ~15% build time reduction for aiotdlib, and a small 3-5% improvement for the kubernetes example

For `aiotdlib`

| Metric | Before | After | Change | % Change | Reduction Factor |
|---|---|---|---|---|---|
| Resident Memory Size | 884MB | 212MB | -672MB | -76.0% | 4.17× |
| Total Allocations | 5,069,626 | 746,466 | -4,323,160 | -85.3% | 6.79× |
| Total Memory Allocated | 1.317GB | 671MB | -646MB | -49.1% | 1.96× |

`aiotdlib.py` (consolidated models)

- Total memory allocation is ~50% of what it was previously
- Total number of allocations has dropped 6.8x
- Resident memory has reduced by 4x, which is probably the most valuable stat here!
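For readers who want to check how the "% Change" and "Reduction Factor" columns relate to the raw before/after figures, here is a small sketch of the arithmetic. The `summarize` helper is a hypothetical name introduced for this example; the inputs are the aiotdlib figures quoted above, with 1.317GB written as 1317MB.

```python
def summarize(before, after):
    """Return (percent change, reduction factor) for a before/after pair."""
    pct_change = (after - before) / before * 100  # negative means a reduction
    factor = before / after  # how many times smaller the "after" value is
    return round(pct_change, 1), round(factor, 2)


# aiotdlib figures from the table above
resident = summarize(884, 212)  # resident memory size, MB
allocations = summarize(5_069_626, 746_466)  # total allocation count
allocated = summarize(1317, 671)  # total memory allocated, MB

print(resident, allocations, allocated)
```

The reduction factor is simply `before / after`, so a 76% drop in resident memory corresponds to roughly a 4x reduction.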

on main:

📏 Total allocations:
        5069626

📦 Total memory allocated:
        1.317GB

📊 Histogram of allocation size:
        min: 1.000B
        ----------------------------------------------
        < 4.000B   :  221354 ▇▇▇▇
        < 18.000B  : 1783931 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 76.000B  : 1600031 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 326.000B :  806918 ▇▇▇▇▇▇▇▇▇▇▇▇
        < 1.354KB  :  531049 ▇▇▇▇▇▇▇▇
        < 5.754KB  :   90030 ▇▇
        < 24.456KB :   34687 ▇
        < 103.938KB:    1495 ▇
        < 441.735KB:      54 ▇
        <=1.833MB  :      77 ▇
        ----------------------------------------------
        max: 1.833MB

📂 Allocator type distribution:
         MALLOC: 4102467
         REALLOC: 913128
         CALLOC: 45971
         MMAP: 8060

🥇 Top 5 largest allocating locations (by size):
        - create_schema_validator:/Users/sydney-runkle/Work/oss/pydantic/pydantic/plugin/_schema_validator.py:51 -> 510.371MB
        - complete_model_class:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_model_construction.py:611 -> 278.361MB
        - __init__:/Users/sydney-runkle/.local/share/uv/python/cpython-3.13.0-macos-aarch64-none/lib/python3.13/typing.py:1035 -> 239.962MB
        - _get_code_from_file:<frozen runpy>:259 -> 66.375MB
        - data:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_namespace_utils.py:91 -> 21.933MB

🥇 Top 5 largest allocating locations (by number of allocations):
        - create_schema_validator:/Users/sydney-runkle/Work/oss/pydantic/pydantic/plugin/_schema_validator.py:51 -> 3347638
        - complete_model_class:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_model_construction.py:611 -> 1559311
        - __init__:/Users/sydney-runkle/.local/share/uv/python/cpython-3.13.0-macos-aarch64-none/lib/python3.13/typing.py:1035 -> 56044
        - _extract_json_schema_info_from_field_info:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_generate_schema.py:258 -> 21452
        - _get_code_from_file:<frozen runpy>:259 -> 17837

with this branch:

📏 Total allocations:
        746466

📦 Total memory allocated:
        670.775MB

📊 Histogram of allocation size:
        min: 1.000B
        ---------------------------------------------
        < 4.000B   :  53493 ▇▇▇▇▇▇
        < 18.000B  : 181024 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 76.000B  : 264162 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 326.000B :  79409 ▇▇▇▇▇▇▇▇
        < 1.354KB  : 105607 ▇▇▇▇▇▇▇▇▇▇
        < 5.754KB  :  33087 ▇▇▇▇
        < 24.456KB :  28484 ▇▇▇
        < 103.938KB:   1111 ▇
        < 441.735KB:     36 ▇
        <=1.833MB  :     53 ▇
        ---------------------------------------------
        max: 1.833MB

📂 Allocator type distribution:
         MALLOC: 601531
         REALLOC: 91757
         CALLOC: 45038
         MMAP: 8140

🥇 Top 5 largest allocating locations (by size):
        - __init__:/Users/sydney-runkle/.local/share/uv/python/cpython-3.13.0-macos-aarch64-none/lib/python3.13/typing.py:1035 -> 240.962MB
        - create_schema_validator:/Users/sydney-runkle/Work/oss/pydantic/pydantic/plugin/_schema_validator.py:51 -> 80.636MB
        - _get_code_from_file:<frozen runpy>:259 -> 66.375MB
        - complete_model_class:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_model_construction.py:611 -> 44.481MB
        - data:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_namespace_utils.py:91 -> 21.933MB

🥇 Top 5 largest allocating locations (by number of allocations):
        - create_schema_validator:/Users/sydney-runkle/Work/oss/pydantic/pydantic/plugin/_schema_validator.py:51 -> 381038
        - complete_model_class:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_model_construction.py:611 -> 204972
        - __init__:/Users/sydney-runkle/.local/share/uv/python/cpython-3.13.0-macos-aarch64-none/lib/python3.13/typing.py:1035 -> 56045
        - _extract_json_schema_info_from_field_info:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_generate_schema.py:258 -> 21452
        - _get_code_from_file:<frozen runpy>:259 -> 17837

Old flamegraph:

New flamegraph:

For `k8s_v2.py`

| Metric | Before | After | Change | % Change | Reduction Factor |
|---|---|---|---|---|---|
| Resident Memory Size | 563MB | 290MB | -273MB | -48.5% | 1.94× |
| Total Allocations | 1,969,609 | 586,011 | -1,383,598 | -70.2% | 3.36× |
| Total Memory Allocated | 787MB | 519MB | -268MB | -34.1% | 1.52× |

on main:

📏 Total allocations:
        1969609

📦 Total memory allocated:
        786.890MB

📊 Histogram of allocation size:
        min: 1.000B
        ---------------------------------------------
        < 4.000B   : 340813 ▇▇▇▇▇▇▇▇▇▇▇▇
        < 21.000B  : 747070 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 100.000B : 339933 ▇▇▇▇▇▇▇▇▇▇▇▇
        < 466.000B : 120403 ▇▇▇▇▇
        < 2.114KB  : 363496 ▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 9.824KB  :  56606 ▇▇
        < 45.646KB :    991 ▇
        < 212.084KB:    124 ▇
        < 985.395KB:     50 ▇
        <=4.471MB  :    123 ▇
        ---------------------------------------------
        max: 4.471MB

📂 Allocator type distribution:
         MALLOC: 1826508
         REALLOC: 100096
         CALLOC: 42912
         MMAP: 93

🥇 Top 5 largest allocating locations (by size):
        - _get_code_from_file:<frozen runpy>:259 -> 282.632MB
        - create_schema_validator:/Users/sydney-runkle/Work/oss/pydantic/pydantic/plugin/_schema_validator.py:51 -> 187.160MB
        - complete_model_class:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_model_construction.py:611 -> 133.740MB
        - __init__:/Users/sydney-runkle/.local/share/uv/python/cpython-3.13.0-macos-aarch64-none/lib/python3.13/typing.py:1035 -> 21.739MB
        - from_field:/Users/sydney-runkle/Work/oss/pydantic/pydantic/fields.py:279 -> 13.876MB

🥇 Top 5 largest allocating locations (by number of allocations):
        - create_schema_validator:/Users/sydney-runkle/Work/oss/pydantic/pydantic/plugin/_schema_validator.py:51 -> 1241381
        - complete_model_class:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_model_construction.py:611 -> 507059
        - _get_code_from_file:<frozen runpy>:259 -> 74830
        - _apply_annotations:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_generate_schema.py:2098 -> 30023
        - from_field:/Users/sydney-runkle/Work/oss/pydantic/pydantic/fields.py:279 -> 18946

On this branch:

📏 Total allocations:
        586011

📦 Total memory allocated:
        518.610MB

📊 Histogram of allocation size:
        min: 1.000B
        ---------------------------------------------
        < 4.000B   :  90275 ▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 21.000B  : 155729 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 100.000B : 101222 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 466.000B :  21643 ▇▇▇▇
        < 2.114KB  : 178561 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 9.824KB  :  37417 ▇▇▇▇▇▇
        < 45.646KB :    911 ▇
        < 212.084KB:     96 ▇
        < 985.395KB:     42 ▇
        <=4.471MB  :    115 ▇
        ---------------------------------------------
        max: 4.471MB

📂 Allocator type distribution:
         MALLOC: 510537
         CALLOC: 41977
         REALLOC: 33412
         MMAP: 85

🥇 Top 5 largest allocating locations (by size):
        - _get_code_from_file:<frozen runpy>:259 -> 282.632MB
        - create_schema_validator:/Users/sydney-runkle/Work/oss/pydantic/pydantic/plugin/_schema_validator.py:51 -> 36.204MB
        - complete_model_class:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_model_construction.py:611 -> 23.107MB
        - __init__:/Users/sydney-runkle/.local/share/uv/python/cpython-3.13.0-macos-aarch64-none/lib/python3.13/typing.py:1035 -> 21.739MB
        - from_field:/Users/sydney-runkle/Work/oss/pydantic/pydantic/fields.py:279 -> 14.876MB

🥇 Top 5 largest allocating locations (by number of allocations):
        - create_schema_validator:/Users/sydney-runkle/Work/oss/pydantic/pydantic/plugin/_schema_validator.py:51 -> 240515
        - complete_model_class:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_model_construction.py:611 -> 96597
        - _get_code_from_file:<frozen runpy>:259 -> 74830
        - _apply_annotations:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_generate_schema.py:2098 -> 30024
        - _extract_json_schema_info_from_field_info:/Users/sydney-runkle/Work/oss/pydantic/pydantic/_internal/_generate_schema.py:258 -> 30022

Old flamegraph:

New flamegraph:

Thanks 🚀

@Viicos for the help finding some examples that were appropriate for benchmarking, and for the idea to skip the core schema modifications for simplicity 👍
@davidhewitt for iterating with me on the appropriate pyo3 tools to use for this :)
@BoxyUwU for your work on #1414 which got us started down this path

codspeed-hq · 2025-01-31T15:01:52Z

CodSpeed Performance Report

Merging #1616 will not alter performance

Comparing prebuilt-variant (99136a5) with main (fdccecd)

Summary

✅ 157 untouched benchmarks

fruitoiz · 2025-02-01T12:06:04Z

Impressive! I hope you will not stop here.

davidhewitt

Seems fine enough! I wonder, is there a way this can be tested? Maybe do something evil like modify the __pydantic_validator__ on a type and confirm that validator is picked up? 🙈
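The testing idea in this comment can be sketched with stand-ins, without touching pydantic-core itself. Everything below is hypothetical: `SentinelValidator`, `FreshValidator`, and `resolve_validator` are toy names, and the rule that only the class's own namespace is consulted is an assumption of this sketch rather than a statement about the real lookup. Only the `__pydantic_validator__` attribute name comes from the comment above.

```python
class SentinelValidator:
    """Hypothetical stand-in that records every value it is asked to validate."""

    def __init__(self):
        self.seen = []

    def validate(self, value):
        self.seen.append(value)
        return value


class FreshValidator:
    """Hypothetical stand-in for a validator built from scratch."""

    def __init__(self, cls):
        self.cls = cls

    def validate(self, value):
        return value


def resolve_validator(cls):
    """Toy lookup: prefer a validator stored in the class's own namespace."""
    # vars(cls) is a mappingproxy, so inherited attributes are not visible here.
    prebuilt = vars(cls).get("__pydantic_validator__")
    return prebuilt if prebuilt is not None else FreshValidator(cls)


def test_prebuilt_validator_is_picked_up():
    class Inner:
        pass

    # Nothing attached yet, so a fresh validator is built.
    assert isinstance(resolve_validator(Inner), FreshValidator)

    # The "evil" step: swap a sentinel onto the type and confirm it is used.
    sentinel = SentinelValidator()
    Inner.__pydantic_validator__ = sentinel
    resolved = resolve_validator(Inner)
    assert resolved is sentinel
    resolved.validate({"x": 1})
    assert sentinel.seen == [{"x": 1}]

    # In this toy, a subclass does not inherit the prebuilt validator.
    class Child(Inner):
        pass

    assert isinstance(resolve_validator(Child), FreshValidator)


test_prebuilt_validator_is_picked_up()
```

Using a recording sentinel keeps the check independent of validation results: the test passes only if the swapped-in object is the one actually called.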

sydney-runkle · 2025-02-05T15:40:45Z

Wondering if I should consolidate the shared "extract prebuilt" logic between the validator and serializers...

davidhewitt · 2025-02-05T15:50:42Z

I think that would be smart, suggest file to go at src/common/prebuilt.rs for the shared logic.

davidhewitt

LGTM, feel free to consolidate logic and then merge 👍

using prebuilt validators

75bde54

sydney-runkle mentioned this pull request Jan 31, 2025

Draft: Reusing schema validators and serializers #1614

Closed

linting

5835c59

sydney-runkle added 2 commits January 31, 2025 10:30

linting and inheritance fix

e5245b5

serializer reuse logic as well

980fa20

davidhewitt reviewed Jan 31, 2025

handling for mappingproxy rather than dict on classes

8bd3ef2

sydney-runkle added 2 commits February 3, 2025 15:41

using more simple py approach

8e754c4

edge case for generic dataclasses

57671e8

sydney-runkle marked this pull request as ready for review February 4, 2025 16:07

sydney-runkle mentioned this pull request Feb 4, 2025

Introduce a schema variant to reuse Validators, Serializers and CoreSchema #1414

Closed

sydney-runkle changed the title ~~Memory usage optimization - use prebuilt validators and serializers~~ Memory usage optimization via reuse of SchemaValidator and SchemaSerializer Feb 4, 2025

sydney-runkle requested a review from davidhewitt February 4, 2025 17:26

davidhewitt reviewed Feb 4, 2025

davidhewitt reviewed Feb 5, 2025

sydney-runkle and others added 3 commits February 5, 2025 10:22

fix name function for validator

a8a7c5e

restructuring recs from david

78f18b3

Merge branch 'main' into prebuilt-variant

8a9b535

sydney-runkle added 2 commits February 5, 2025 10:52

confirming prebuilt usage via a test

bbaef68

Merge branch 'prebuilt-variant' of https://github.com/pydantic/pydant…

3267736

…ic-core into prebuilt-variant

davidhewitt approved these changes Feb 5, 2025

refactor common extraction logic

99136a5

sydney-runkle merged commit 164b9ff into main Feb 5, 2025
28 checks passed

sydney-runkle deleted the prebuilt-variant branch February 5, 2025 21:24

Viicos mentioned this pull request Feb 6, 2025

Fix condition before using prebuilt validator/serializer #1625

Merged

4 tasks

fruitoiz mentioned this pull request Feb 6, 2025

pydantic v2 memory usage with hundreds of complicated models pydantic/pydantic#9982

Open

1 task

sydney-runkle mentioned this pull request Feb 7, 2025

Bump pydantic-core to v2.29.0 pydantic/pydantic#11402

Merged

5 tasks
