Model Reuse and Deduplication
When generating models from schemas, you may encounter duplicate model definitions. datamodel-code-generator provides options to deduplicate models and share them across multiple files, improving output structure, reducing diff sizes, and enhancing performance.
Quick Overview
| Option | Description |
|---|---|
| --reuse-model | Deduplicate identical model/enum definitions |
| --reuse-scope | Control scope of deduplication (root or tree) |
| --shared-module-name | Name for shared module in multi-file output |
| --collapse-root-models | Inline root models instead of creating wrappers |
| --use-type-alias | Create TypeAlias for reusable field types (see Reducing Duplicate Field Types) |
--reuse-model
The --reuse-model flag detects identical enum or model definitions and generates a single shared definition instead of duplicates.
Without --reuse-model
# Duplicate enums for animal and pet fields
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class Animal(Enum):
    dog = 'dog'
    cat = 'cat'


class Pet(Enum):  # Duplicate!
    dog = 'dog'
    cat = 'cat'


class User(BaseModel):
    animal: Optional[Animal] = None
    pet: Optional[Pet] = None
With --reuse-model
# Single shared enum
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class Animal(Enum):
    dog = 'dog'
    cat = 'cat'


class User(BaseModel):
    animal: Optional[Animal] = None
    pet: Optional[Animal] = None  # Reuses Animal
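Conceptually, detecting identical definitions comes down to comparing definition bodies while ignoring their names. The following is a minimal sketch of that idea and not the tool's actual implementation; the helper name dedupe_definitions and the sample definitions are hypothetical.

```python
import json


def dedupe_definitions(definitions):
    """Map every definition name to the first name with an identical body."""
    first_seen = {}  # fingerprint of a body -> first name that used it
    reuse_map = {}   # definition name -> name of the definition to reuse
    for name, body in definitions.items():
        # Sorting keys makes the fingerprint independent of key order.
        fingerprint = json.dumps(body, sort_keys=True)
        reuse_map[name] = first_seen.setdefault(fingerprint, name)
    return reuse_map


definitions = {
    "Animal": {"type": "string", "enum": ["dog", "cat"]},
    "Pet": {"enum": ["dog", "cat"], "type": "string"},  # same body, other key order
    "Status": {"type": "string", "enum": ["active", "inactive"]},
}

reuse_map = dedupe_definitions(definitions)
print(reuse_map)
# {'Animal': 'Animal', 'Pet': 'Animal', 'Status': 'Status'}
```

Here Pet maps to Animal, which mirrors how the pet field ends up typed as Animal in the generated code.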
Benefits
- Smaller output: less generated code
- Cleaner diffs: changes to shared types only appear once
- Better performance: faster generation for large schemas
- Type consistency: same types are truly the same
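The type-consistency point can be checked with the standard library alone. This small sketch reuses the Animal and Pet enums from the example above and shows that members of two structurally identical Enum classes do not compare equal, which is the mismatch a single shared enum avoids.

```python
from enum import Enum


class Animal(Enum):
    dog = 'dog'
    cat = 'cat'


class Pet(Enum):  # structurally identical, but a separate type
    dog = 'dog'
    cat = 'cat'


# Members of different Enum classes are never equal, even with equal values.
duplicates_match = Animal.dog == Pet.dog
# Looking a member up by value in the same Enum returns the same object.
shared_match = Animal('dog') is Animal.dog

print(duplicates_match, shared_match)
# False True
```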
--reuse-scope
Controls the scope for model reuse detection when processing multiple input files.
| Value | Description |
|---|---|
| root | Detect duplicates only within each input file (default) |
| tree | Detect duplicates across all input files |
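The difference between the two values can be pictured as the lifetime of the "already seen" set. The sketch below is an illustration under that assumption, not the generator's real algorithm; find_duplicates and the file names are hypothetical.

```python
import json


def find_duplicates(files, scope):
    """Return {(file, name): (file, name)} for definitions repeating an earlier one.

    scope="root" compares definitions only within the same file;
    scope="tree" compares definitions across every file.
    """
    seen = {}
    duplicates = {}
    for file_name, definitions in files.items():
        if scope == "root":
            seen = {}  # start fresh for every input file
        for name, body in definitions.items():
            fingerprint = json.dumps(body, sort_keys=True)
            if fingerprint in seen:
                duplicates[(file_name, name)] = seen[fingerprint]
            else:
                seen[fingerprint] = (file_name, name)
    return duplicates


status = {"type": "string", "enum": ["active", "inactive"]}
files = {
    "user.json": {"UserStatus": status},
    "order.json": {"OrderStatus": status},
}

root_duplicates = find_duplicates(files, "root")
tree_duplicates = find_duplicates(files, "tree")
print(root_duplicates)  # {}
print(tree_duplicates)  # {('order.json', 'OrderStatus'): ('user.json', 'UserStatus')}
```

With root scope the two enums live in different files and are never compared; with tree scope the second one is recognized as a repeat of the first.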
Single-file input
For single-file input, --reuse-scope has no effect. Use --reuse-model alone.
Multi-file input with tree scope
When generating from multiple schema files to a directory:
Input files:
Output with --reuse-scope tree:
models/
├── __init__.py
├── user.py # imports from shared
├── order.py # imports from shared
└── shared.py # SharedModel defined once
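# models/user.py
from typing import Optional

from pydantic import BaseModel

from .shared import SharedModel


class User(BaseModel):
    data: Optional[SharedModel] = None


# models/shared.py
from typing import Optional

from pydantic import BaseModel


class SharedModel(BaseModel):
    id: Optional[int] = None
    name: Optional[str] = None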
--shared-module-name
Customize the name of the shared module when using --reuse-scope tree.
datamodel-codegen --input schemas/ --output models/ \
    --reuse-model --reuse-scope tree --shared-module-name common
Output: shared definitions are written to common.py instead of shared.py.
--collapse-root-models
Inline root model definitions instead of creating separate wrapper classes.
Without --collapse-root-models
With --collapse-root-models
When to use
- Simpler output when wrapper classes aren't needed
- Reducing the number of generated classes
- When root models are just type aliases
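One way to picture collapsing is at the schema level: a reference to a non-object definition is replaced by the definition itself, so no wrapper is needed. The sketch below illustrates that idea only and is not how the generator is implemented; collapse_root_refs, UserId, and Address are hypothetical names, and the sketch assumes local #/$defs/ references with no self-referencing root definitions.

```python
import copy


def collapse_root_refs(schema):
    """Inline every $ref that points at a non-object ("root") definition."""
    defs = schema.get("$defs", {})

    def resolve(node):
        if isinstance(node, list):
            return [resolve(item) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str):
            target = defs.get(ref.split("/")[-1])
            if target is not None and target.get("type") != "object":
                # Root type: substitute the definition for the reference.
                return resolve(copy.deepcopy(target))
        return {key: resolve(value) for key, value in node.items()}

    # Keep only object definitions; root definitions are inlined at use sites.
    kept = {name: body for name, body in defs.items() if body.get("type") == "object"}
    collapsed = resolve({key: value for key, value in schema.items() if key != "$defs"})
    if kept:
        collapsed["$defs"] = resolve(kept)
    return collapsed


schema = {
    "type": "object",
    "properties": {
        "id": {"$ref": "#/$defs/UserId"},
        "address": {"$ref": "#/$defs/Address"},
    },
    "$defs": {
        "UserId": {"type": "string", "minLength": 1},
        "Address": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}

collapsed = collapse_root_refs(schema)
print(collapsed["properties"]["id"])
# {'type': 'string', 'minLength': 1}
```

The object definition Address is still referenced, while the simple UserId wrapper disappears into the field that used it.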
Combining Options
Recommended for large multi-file projects
datamodel-codegen \
    --input schemas/ \
    --output models/ \
    --reuse-model \
    --reuse-scope tree \
    --shared-module-name common \
    --collapse-root-models
This produces:
- Deduplicated models across all files
- Shared types in a common.py module
- Inlined simple root models
- Minimal, clean output
Recommended for single-file projects
datamodel-codegen \
    --input schema.json \
    --output model.py \
    --reuse-model \
    --collapse-root-models
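When generation is driven from a build script, both recommendations can be assembled by one small helper. This is a sketch that assumes datamodel-codegen is available on PATH; build_codegen_command is a hypothetical name, and it uses only the flags shown above.

```python
import shlex


def build_codegen_command(input_path, output_path, shared_module_name=None):
    """Assemble a datamodel-codegen command line from the options above."""
    command = [
        "datamodel-codegen",
        "--input", input_path,
        "--output", output_path,
        "--reuse-model",
        "--collapse-root-models",
    ]
    if shared_module_name is not None:
        # Multi-file output: deduplicate across files into a named shared module.
        command += ["--reuse-scope", "tree", "--shared-module-name", shared_module_name]
    return command


multi_file = build_codegen_command("schemas/", "models/", shared_module_name="common")
single_file = build_codegen_command("schema.json", "model.py")
print(shlex.join(multi_file))
```

The returned list can be passed to subprocess.run(command, check=True) to run the generator.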
Performance Impact
For large schemas with many models:
| Scenario | Without reuse | With reuse |
|---|---|---|
| 100 schemas, 50% duplicates | 100 models | ~50 models |
| Generation time | Baseline | Faster (less to generate) |
| Output size | Large | Smaller |
| Git diff on type change | Multiple files | Single location |
Performance tip: For very large schemas, combine --reuse-model with --disable-warnings to speed up generation.
Output Structure Comparison
Without deduplication
models/
├── user.py # UserStatus enum
├── order.py # OrderStatus enum (duplicate of UserStatus!)
└── product.py # ProductStatus enum (duplicate!)
With --reuse-model --reuse-scope tree
models/
├── __init__.py
├── user.py # imports Status from shared
├── order.py # imports Status from shared
├── product.py # imports Status from shared
└── shared.py # Status enum defined once
Reducing Duplicate Field Types
When multiple classes share the same field type with identical constraints or metadata, you can reduce duplication by defining the type once in $defs and referencing it with $ref. Combined with --use-type-alias, this creates a single TypeAlias that's reused across all classes.
Problem: Duplicate Annotated Fields
Without using $ref, each class gets its own inline field definition:
from typing import Annotated

from pydantic import BaseModel, Field


class ClassA(BaseModel):
    place_name: Annotated[str, Field(alias='placeName')]  # Duplicate!


class ClassB(BaseModel):
    place_name: Annotated[str, Field(alias='placeName')]  # Duplicate!
Solution: Use $defs with --use-type-alias
Step 1: Define the shared type in $defs
{
  "$defs": {
    "PlaceName": {
      "type": "string",
      "title": "PlaceName",
      "description": "A place name"
    },
    "ClassA": {
      "type": "object",
      "properties": {
        "place_name": { "$ref": "#/$defs/PlaceName" }
      }
    },
    "ClassB": {
      "type": "object",
      "properties": {
        "place_name": { "$ref": "#/$defs/PlaceName" }
      }
    }
  }
}
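For an existing schema that still repeats the field inline, the restructuring in Step 1 could be scripted. The following is a rough sketch with a hypothetical helper, hoist_property, that moves a property into $defs only when every inline copy is identical; it assumes all classes live under $defs as in the schema above.

```python
import copy


def hoist_property(schema, property_name, def_name):
    """Move identical inline definitions of one property into $defs."""
    result = copy.deepcopy(schema)
    defs = result.setdefault("$defs", {})
    owners = [
        body for body in defs.values()
        if property_name in body.get("properties", {})
    ]
    bodies = [owner["properties"][property_name] for owner in owners]
    # Only hoist when at least two classes share an identical inline definition.
    if len(bodies) < 2 or any(body != bodies[0] for body in bodies):
        return result
    defs[def_name] = bodies[0]
    for owner in owners:
        owner["properties"][property_name] = {"$ref": f"#/$defs/{def_name}"}
    return result


inline = {"type": "string", "title": "PlaceName", "description": "A place name"}
schema = {
    "$defs": {
        "ClassA": {"type": "object", "properties": {"place_name": dict(inline)}},
        "ClassB": {"type": "object", "properties": {"place_name": dict(inline)}},
    }
}

hoisted = hoist_property(schema, "place_name", "PlaceName")
print(hoisted["$defs"]["ClassA"]["properties"]["place_name"])
# {'$ref': '#/$defs/PlaceName'}
```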
Step 2: Generate with --use-type-alias
Result: Single TypeAlias reused across classes
from typing import Annotated

from pydantic import BaseModel, Field
from typing_extensions import TypeAliasType

PlaceName = TypeAliasType(
    "PlaceName",
    Annotated[str, Field(..., description='A place name', title='PlaceName')],
)


class ClassA(BaseModel):
    place_name: PlaceName  # Reuses the TypeAlias


class ClassB(BaseModel):
    place_name: PlaceName  # Reuses the TypeAlias
Benefits
- Single source of truth: field type is defined once
- Easier maintenance: change the type in one place
- Cleaner generated code: no redundant annotations
- Type safety: all fields share the exact same type
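The last point can be observed with plain typing tools. This stdlib-only sketch uses a simple Annotated alias as a stand-in for the generated TypeAliasType (an assumption made to avoid third-party imports; the string is placeholder metadata) and checks that both classes resolve the field to the same type.

```python
from typing import Annotated, get_type_hints

# Stand-in for the generated alias; the string is placeholder metadata.
PlaceName = Annotated[str, "A place name"]


class ClassA:
    place_name: PlaceName


class ClassB:
    place_name: PlaceName


# include_extras=True keeps the Annotated metadata in the resolved hints.
hints_a = get_type_hints(ClassA, include_extras=True)
hints_b = get_type_hints(ClassB, include_extras=True)
same_type = hints_a["place_name"] == hints_b["place_name"] == PlaceName

print(same_type)
# True
```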
When to Use This Pattern
This pattern is ideal when:
- Multiple classes share fields with the same constraints (e.g., minLength, pattern)
- Fields have identical metadata (e.g., description, examples)
- You want to ensure type consistency across your schema