architxt.simplification.simple_rewrite
Functions
simple_rewrite – Rewrite a forest into a valid schema, treating each tree as a distinct group.
- architxt.simplification.simple_rewrite.simple_rewrite(forest, *, commit=BATCH_SIZE)
Rewrite a forest into a valid schema, treating each tree as a distinct group.
This function processes each tree in the forest, collapsing its entities into a single group node if the tree contains unlabelled nodes. Each unique combination of entity labels is assigned a consistent group ID. Duplicate entities are removed.
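The grouping rule described above can be illustrated with a minimal sketch. It does not use the architxt API: trees are modelled as flat lists of `(label, value)` pairs, the helper `assign_group_ids` is hypothetical, and the integer IDs and the choice to treat two entities as duplicates only when both label and value match are assumptions made for illustration.

```python
from itertools import count


def assign_group_ids(trees):
    """Give every tree a group ID determined by its set of entity labels.

    `trees` is a list of lists of (label, value) pairs. This flat
    representation is an assumption, not the library's Tree type.
    """
    ids = {}            # frozenset of labels -> group ID
    counter = count(1)  # hypothetical ID scheme: 1, 2, 3, ...
    result = []
    for entities in trees:
        # Drop duplicate entities while preserving first-seen order.
        unique = list(dict.fromkeys(entities))
        # The key ignores order, so the same label combination
        # always maps to the same group ID.
        key = frozenset(label for label, _ in unique)
        if key not in ids:
            ids[key] = next(counter)
        result.append((ids[key], unique))
    return result


forest = [
    [("person", "Ada"), ("city", "London"), ("person", "Ada")],
    [("city", "Paris"), ("person", "Marie")],
    [("drug", "aspirin")],
]
groups = assign_group_ids(forest)
```

In this sketch the first two trees share a group ID because both contain exactly the labels `person` and `city`, regardless of order, while the third tree gets a new ID.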
- Parameters:
forest (Iterable[Tree]) – A forest to be rewritten in place.
commit (Union[bool, int]) – When working with a TreeBucket, changes can be committed automatically.
  - If False, no commits are made. Use this for small forests where you want to commit manually later.
  - If True, commits once after processing the entire forest, in a single transaction.
  - If an integer N, commits after processing every N trees.
  To avoid memory issues, we recommend batch commits for large forests.
- Return type: