architxt.simplification.simple_rewrite

Functions

simple_rewrite(forest, *[, commit])

Rewrite a forest into a valid schema, treating each tree as a distinct group.

architxt.simplification.simple_rewrite.simple_rewrite(forest, *, commit=BATCH_SIZE)

Rewrite a forest into a valid schema, treating each tree as a distinct group.

This function processes each tree in the forest, collapsing its entities into a single group node if the tree contains unlabelled nodes. Each unique combination of entity labels is assigned a consistent group ID. Duplicate entities are removed.
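The group ID behaviour described above can be illustrated with a minimal, library-independent sketch. It does not use architxt: each tree is reduced to a plain list of entity labels, `assign_group_ids` is a hypothetical helper, and it assumes both that label order does not matter when comparing combinations and that "duplicate" means a repeated label within one tree. The real implementation may differ on either point.

```python
from itertools import count


def assign_group_ids(trees):
    """Give every distinct combination of entity labels one stable group ID.

    `trees` is a list of label lists, a stand-in for real tree objects.
    """
    ids = {}            # frozenset of labels -> group ID
    next_id = count(1)  # IDs are handed out in order of first appearance
    result = []
    for labels in trees:
        # Assumption: a duplicate is a repeated label; keep first occurrences.
        unique = tuple(dict.fromkeys(labels))
        # Assumption: label order is irrelevant when comparing combinations.
        key = frozenset(unique)
        if key not in ids:
            ids[key] = next(next_id)
        result.append((ids[key], unique))
    return result


forest = [["person", "city"], ["city", "person", "person"], ["product"]]
groups = assign_group_ids(forest)
```

Here the first two trees carry the same set of labels and so share a group ID, while the third receives a new one.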

Parameters:
  • forest (Iterable[Tree]) – A forest to be rewritten in place.

  • commit (Union[bool, int]) – Controls automatic commits when working with a TreeBucket. Defaults to BATCH_SIZE. Batch commits are recommended for large forests to avoid memory issues.
      - If False, no commits are made. Use this for small forests where you want to commit manually later.
      - If True, a single commit is made after the entire forest has been processed, in one transaction.
      - If an integer N, a commit is made after every N trees.
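The three commit modes can be sketched as a plain loop. This is an illustration of the documented semantics only, not the library's code: `rewrite_with_commits` and `do_commit` are hypothetical names, the per-tree rewrite is left as a placeholder, and committing a trailing partial batch in integer mode is an assumption.

```python
def rewrite_with_commits(trees, commit, do_commit):
    """Process trees, calling do_commit() according to the commit mode."""
    # bool is a subclass of int in Python, so rule out True/False first.
    batch_size = None if isinstance(commit, bool) else commit
    pending = 0  # trees processed since the last commit
    for tree in trees:
        # ... the per-tree rewrite would happen here ...
        pending += 1
        if batch_size is not None and pending >= batch_size:
            do_commit()
            pending = 0
    # True: one commit for the whole forest.
    # Integer: flush the last partial batch (assumed behaviour).
    # False: never commit; the caller does it manually.
    if commit is not False and pending:
        do_commit()


def count_commits(n_trees, commit):
    """Return how many commits a forest of n_trees triggers in a given mode."""
    calls = []
    rewrite_with_commits(range(n_trees), commit, lambda: calls.append(1))
    return len(calls)
```

The explicit `isinstance(commit, bool)` check matters because `True` would otherwise be treated as a batch size of 1.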

Return type:

None