architxt.simplification.tree_rewriting
Functions
apply_operations | Apply a sequence of edit operations to a forest, potentially simplifying its structure. |
create_group | Create a group node from a subtree and insert it into its parent node. |
find_groups | Find and create groups based on the given set of equivalent subtrees. |
rewrite | Rewrite a forest by applying edit operations iteratively. |
- architxt.simplification.tree_rewriting.apply_operations(edit_ops, forest, *, equiv_subtrees, early_exit=True, executor)
Apply a sequence of edit operations to a forest, potentially simplifying its structure.
Each operation in edit_ops is applied to the forest in the provided order. If early_exit is enabled, the function stops as soon as an operation successfully simplifies at least one tree. Otherwise, all operations are applied.
- Parameters:
edit_ops (Sequence[Union[Operation, tuple[str, Operation]]]) – A sequence of operations to apply to the forest. Each operation can be either a callable or a tuple (name, callable), where name is a string identifier for the operation.
forest (Collection[Tree]) – The input forest (a collection of trees) on which operations are applied.
equiv_subtrees (set[tuple[Tree, …]]) – The set of equivalent subtrees.
early_exit (bool) – A boolean flag indicating whether to stop after the first successful operation. If False, all operations are applied.
executor (ProcessPoolExecutor) – A pool executor to parallelize the processing of the forest.
- Return type:
tuple[Collection[Tree], Optional[int]]
- Returns:
A tuple composed of:
- The updated forest after applying the operations.
- The index of the operation that successfully simplified a tree, or None if no operation succeeded.
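The early-exit behaviour described above can be illustrated with a small self-contained sketch. It does not use architxt itself: `apply_operations_sketch`, the tuple-based "trees", and the two toy operations are hypothetical stand-ins, and the sketch ignores `equiv_subtrees` and `executor`. When `early_exit` is False, it assumes the returned index is that of the first operation that succeeded, which the documentation above does not specify.

```python
def apply_operations_sketch(edit_ops, forest, *, early_exit=True):
    """Toy model of the documented behaviour (not the architxt implementation)."""
    forest = list(forest)
    first_success = None
    for index, op in enumerate(edit_ops):
        # An entry may be a bare callable or a (name, callable) pair.
        func = op[1] if isinstance(op, tuple) else op
        # Each toy operation returns (new_tree, was_simplified).
        results = [func(tree) for tree in forest]
        forest = [tree for tree, _ in results]
        if any(changed for _, changed in results):
            if first_success is None:
                first_success = index
            if early_exit:
                # Stop as soon as one operation simplified at least one tree.
                return forest, index
    return forest, first_success


def drop_duplicates(tree):
    # Hypothetical operation: remove repeated leaves, keeping order.
    new_tree = tuple(dict.fromkeys(tree))
    return new_tree, new_tree != tree


def drop_empty(tree):
    # Hypothetical operation: remove empty leaves.
    new_tree = tuple(leaf for leaf in tree if leaf)
    return new_tree, new_tree != tree


forest = [("a", "a", "b"), ("c", "")]
ops = [("dedupe", drop_duplicates), drop_empty]

stopped_early = apply_operations_sketch(ops, forest, early_exit=True)
ran_all = apply_operations_sketch(ops, forest, early_exit=False)
```

With `early_exit=True` only the first operation runs, because it already simplifies one tree; with `early_exit=False` both operations are applied in order.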
- architxt.simplification.tree_rewriting.create_group(subtree, group_index)
Create a group node from a subtree and insert it into its parent node.
- architxt.simplification.tree_rewriting.find_groups(equiv_subtrees, min_support)
Find and create groups based on the given set of equivalent subtrees.
- architxt.simplification.tree_rewriting.rewrite(forest, *, tau=0.7, epoch=100, min_support=None, metric=DEFAULT_METRIC, edit_ops=DEFAULT_OPERATIONS, debug=False, max_workers=None)
Rewrite a forest by applying edit operations iteratively.
- Parameters:
forest (Collection[Tree]) – The forest to rewrite.
tau (float) – Threshold for subtree similarity when clustering.
epoch (int) – Maximum number of rewriting steps.
min_support (Optional[int]) – Minimum support of groups.
metric (Callable[[Collection[str], Collection[str]], float]) – The metric function used to compute similarity between subtrees.
edit_ops (Sequence[type[Operation]]) – The list of operations to perform on the forest.
debug (bool) – Whether to enable debug logging.
max_workers (Optional[int]) – Number of parallel worker processes to use.
- Return type:
- Returns:
The rewritten forest.
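To make the `metric` and `tau` parameters more concrete, here is a minimal sketch of a function with the documented metric signature (two collections of strings in, a float out). Jaccard similarity is used purely as an illustration; this page does not say what `DEFAULT_METRIC` is, and comparing the score to `tau` with `>=` is an assumption about how the threshold is applied.

```python
from typing import Collection


def jaccard(x: Collection[str], y: Collection[str]) -> float:
    """Illustrative metric: size of the intersection over size of the union."""
    a, b = set(x), set(y)
    if not a and not b:
        # Two empty collections are treated as identical here (a choice of this sketch).
        return 1.0
    return len(a & b) / len(a | b)


tau = 0.7  # same value as the documented default threshold

labels_a = ["name", "age", "city"]
labels_b = ["name", "age", "country"]

score = jaccard(labels_a, labels_b)  # 2 shared labels out of 4 distinct ones
similar = score >= tau               # assumed comparison, not confirmed by the docs
```

Under these assumptions the two label sets score 0.5, which falls below the threshold of 0.7, so they would not be treated as similar.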
Modules