architxt.tree

Functions

has_type(t[, types])

Check if the given tree object has the specified type(s).

reduce(tree, pos[, types])

Reduces a subtree within a tree at the specified position pos.

reduce_all(tree[, skip_types])

Recursively attempts to reduce all eligible subtrees in a tree.

Classes

NodeLabel(label_type[, label, data])

NodeType(*values)

Tree(node[, children])

class architxt.tree.NodeLabel(label_type, label='', data=None)

Bases: str

data

Type:    Optional[dict[str, Any]]

name

Type:    str

type

Type:    NodeType

class architxt.tree.NodeType(*values)

Bases: str, Enum

COLL = 'COLL'

Type:    str

ENT = 'ENT'

Type:    str

GROUP = 'GROUP'

Type:    str

REL = 'REL'

Type:    str
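Because NodeType derives from both str and Enum, its members behave like ordinary strings in comparisons, which is presumably why several functions in this module accept either a NodeType or a plain string. The sketch below re-declares the enum locally as a stand-in (it is not imported from architxt) and only illustrates standard Python behaviour for str-based enums.

```python
from enum import Enum


class NodeType(str, Enum):
    """Local stand-in mirroring the documented members; not the library class."""

    ENT = 'ENT'
    GROUP = 'GROUP'
    REL = 'REL'
    COLL = 'COLL'


# Members are real strings, so they compare equal to their values.
is_equal = NodeType.ENT == 'ENT'

# Looking a member up by its value returns the existing member object.
member = NodeType('GROUP')

# Membership tests against sets of plain strings work too, because
# equality and hashing come from str.
in_set = NodeType.REL in {'ENT', 'REL'}

print(is_equal, member is NodeType.GROUP, in_set)
```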

class architxt.tree.Tree(node, children=None)

Bases: ParentedTree

append(child)

Append object to the end of the list.

Return type:

None

depth()

Get the depth of the tree.

>>> t = Tree.fromstring('(S (X (ENT::person Alice) (ENT::fruit apple)) (Y (ENT::person Bob) (ENT::animal rabbit)))')
>>> t.depth()
1
>>> t[0].depth()
2
>>> t[0, 0].depth()
3

Return type:

int

entities()

Get a tuple of subtrees that are entities.

>>> t = Tree.fromstring('(S (X (ENT::person Alice) (ENT::fruit apple)) (Y (ENT::person Bob) (ENT::animal rabbit)))')
>>> list(t.entities()) == [t[0, 0], t[0, 1], t[1, 0], t[1, 1]]
True
>>> del t[0]
>>> list(t.entities()) == [t[0, 0], t[0, 1]]
True
>>> list(t[0, 0].entities()) == [t[0, 0]]
True

Return type:

tuple[Tree, …]

entity_label_count()

Return a Counter object that counts the labels of entity subtrees.

>>> t = Tree.fromstring('(S (X (ENT::person Alice) (ENT::fruit apple)) (Y (ENT::person Bob) (ENT::animal rabbit)))')
>>> t.entity_label_count()
Counter({'person': 2, 'fruit': 1, 'animal': 1})

Return type:

Counter[NodeLabel]

entity_labels()

Get the set of entity labels present in the tree.

>>> t = Tree.fromstring('(S (X (ENT::person Alice) (ENT::fruit apple)) (Y (ENT::person Bob) (ENT::animal rabbit)))')
>>> sorted(t.entity_labels())
['animal', 'fruit', 'person']
>>> sorted(t[0].entity_labels())
['fruit', 'person']
>>> del t[0]
>>> sorted(t.entity_labels())
['animal', 'person']

Return type:

set[str]
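The entity helpers documented here (entities(), entity_label_count(), entity_labels()) can all be read as different views of one depth-first walk. The sketch below shows that idea on a nested-list stand-in for the example tree. The list layout, the 'ENT::' prefix test, and the helper name entity_names are assumptions made for illustration, not the library's implementation.

```python
from collections import Counter

ENT_PREFIX = 'ENT::'  # assumed label convention, taken from the examples

# Stand-in tree: [label, child, child, ...]; leaves are plain strings.
tree = ['S',
        ['X', ['ENT::person', 'Alice'], ['ENT::fruit', 'apple']],
        ['Y', ['ENT::person', 'Bob'], ['ENT::animal', 'rabbit']]]


def entity_names(node):
    """Yield the name of every entity subtree, in depth-first order."""
    if isinstance(node, str):
        return
    label, *children = node
    if label.startswith(ENT_PREFIX):
        yield label[len(ENT_PREFIX):]
    for child in children:
        yield from entity_names(child)


counts = Counter(entity_names(tree))        # analogous to entity_label_count()
labels = set(counts)                        # analogous to entity_labels()
has_duplicates = any(n > 1 for n in counts.values())

print(counts)
print(sorted(labels))
print(has_duplicates)
```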

extend(children)

Extend list by appending elements from the iterable.

Return type:

None

group_instances(group_name)

Get a DataFrame containing all instances of a specified group within the tree.

Each row in the DataFrame represents an instance of the group, and each column represents an entity in that group, with the value being a concatenated string of that entity’s leaves.

Parameters:

group_name (str) – The name of the group to search for.

Return type:

DataFrame

Returns:

A pandas DataFrame containing instances of the specified group.

>>> t = Tree.fromstring('(S (GROUP::A (ENT::person Alice) (ENT::fruit apple)) '
...                     '(GROUP::A (ENT::person Bob) (ENT::fruit banana)) '
...                     '(GROUP::B (ENT::person Charlie) (ENT::animal dog)))')
>>> t.group_instances("A")
  person   fruit
0  Alice   apple
1    Bob  banana
>>> t.group_instances("B")
    person animal
0  Charlie    dog
>>> t.group_instances("C")
Empty DataFrame
Columns: []
Index: []
>>> t[0].group_instances("A")
  person  fruit
0  Alice  apple
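To make the row and column layout concrete, the following sketch collects the same information as plain dictionaries, one per group instance, without depending on pandas. The nested-list tree, the helper names leaves and group_rows, and the use of a space to join leaves are all assumptions for illustration; the library may build its DataFrame differently.

```python
GROUP_PREFIX = 'GROUP::'  # assumed label conventions, taken from the examples
ENT_PREFIX = 'ENT::'

# Stand-in tree: [label, child, child, ...]; leaves are plain strings.
tree = ['S',
        ['GROUP::A', ['ENT::person', 'Alice'], ['ENT::fruit', 'apple']],
        ['GROUP::A', ['ENT::person', 'Bob'], ['ENT::fruit', 'banana']],
        ['GROUP::B', ['ENT::person', 'Charlie'], ['ENT::animal', 'dog']]]


def leaves(node):
    """Return the leaf strings under `node`, left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node[1:] for leaf in leaves(child)]


def group_rows(node, group_name):
    """Collect one dict per instance of `group_name` found under `node`."""
    if isinstance(node, str):
        return []
    label, *children = node
    rows = []
    if label == GROUP_PREFIX + group_name:
        # One column per entity child; joining leaves with a space is an assumption.
        rows.append({
            child[0][len(ENT_PREFIX):]: ' '.join(leaves(child))
            for child in children
            if not isinstance(child, str) and child[0].startswith(ENT_PREFIX)
        })
    for child in children:
        rows.extend(group_rows(child, group_name))
    return rows


rows_a = group_rows(tree, 'A')
rows_c = group_rows(tree, 'C')
print(rows_a)
print(rows_c)
```

A list of dictionaries like rows_a can be passed to the pandas DataFrame constructor, which would yield one row per instance and one column per entity name.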

groups()

Get the set of group names present within the tree.

Return type:

set[str]

Returns:

A set of unique group names within the tree.

>>> t = Tree.fromstring('(S (GROUP::A x) (GROUP::B y) (X (GROUP::C z)))')
>>> sorted(t.groups())
['A', 'B', 'C']
>>> sorted(t[0].groups())
['A']

has_duplicate_entity()

Check if there are duplicate entity labels.

>>> from architxt.tree import Tree
>>> t = Tree.fromstring('(S (X (ENT::person Alice) (ENT::fruit apple)) (Y (ENT::person Bob) (ENT::animal rabbit)))')
>>> t.has_duplicate_entity()
True
>>> t[0].has_duplicate_entity()
False

Return type:

bool

has_entity_child()

Check if at least one direct child of the tree is an entity.

>>> from architxt.tree import Tree
>>> t = Tree.fromstring('(S (X (ENT::person Alice) (ENT::fruit apple)) (Y (ENT::person Bob) (ENT::animal rabbit)))')
>>> t.has_entity_child()
False
>>> t[0].has_entity_child()
True

Return type:

bool

has_unlabelled_nodes()

Return type:

bool

height()

Get the height of the tree.

>>> t = Tree.fromstring('(S (X (ENT::person Alice) (ENT::fruit apple)) (Y (ENT::person Bob) (ENT::animal rabbit)))')
>>> t.height()
4
>>> t[0].height()
3
>>> t[0, 0].height()
2

Return type:

int
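The examples for depth() and height() count in opposite directions: depth grows from the root downwards, height from the leaves upwards. The sketch below reproduces the documented numbers on a nested-list stand-in. The helper names height, depth_of, and subtree are hypothetical, and depth is computed from a position tuple here because the stand-in has no parent pointers.

```python
# Stand-in tree: [label, child, child, ...]; leaves are plain strings.
tree = ['S',
        ['X', ['ENT::person', 'Alice'], ['ENT::fruit', 'apple']],
        ['Y', ['ENT::person', 'Bob'], ['ENT::animal', 'rabbit']]]


def height(node):
    """Number of levels from `node` down to its deepest leaf; a leaf counts as 1."""
    if isinstance(node, str):
        return 1
    return 1 + max((height(child) for child in node[1:]), default=0)


def depth_of(position):
    """Depth of the node at `position` (a tuple of child indices); the root is 1."""
    return len(position) + 1


def subtree(node, position):
    """Follow child indices from `node`, skipping the label stored at index 0."""
    for index in position:
        node = node[index + 1]
    return node


for position in [(), (0,), (0, 0)]:
    print(position, depth_of(position), height(subtree(tree, position)))
```

For the three positions shown, depth plus height is always 5, since each node lies on a longest root-to-leaf path of the example tree.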

insert(pos, child)

Insert object before index.

Return type:

None

merge(tree)

Merge two trees into one.

The roots of the two trees are merged into a single root, and each subtree keeps its original level.

Return type:

Tree

pop(pos=-1, *, recursive=True)

Delete the element at position pos from the tree and return it.

If recursive is true and the tree becomes empty after the deletion, the tree is in turn removed from its parent, and so on up the ancestors.

Parameters:
  • pos (int) – The position (index) of the element to delete in the tree.

  • recursive (bool) – Whether a tree left empty by the deletion should itself be removed from its parent.

Return type:

Union[Tree, str]

Returns:

The element removed from position pos. The tree is modified in place.

>>> t = Tree.fromstring("(S (NP Alice) (VP (VB like) (NP (NNS apples))))")
>>> print(t[(1, 1)].pformat(margin=255))
(NP (NNS apples))
>>> subtree = t[1, 1].pop(0)
>>> print(t.pformat(margin=255))
(S (NP Alice) (VP (VB like)))
>>> subtree = t.pop(0)
>>> print(t.pformat(margin=255))
(S (VP (VB like)))
>>> subtree = t[0].pop(0, recursive=False)
>>> print(t.pformat(margin=255))
(S (VP ))
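The effect of the recursive flag can be sketched with a small parent-aware node class. This is a stand-in written for illustration, assuming only the behaviour visible in the example above; the class Node and its attributes are hypothetical and do not reflect how the library stores parents.

```python
class Node:
    """Tiny parent-aware tree used only for this illustration."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = []
        for child in children:
            if isinstance(child, Node):
                child.parent = self
            self.children.append(child)

    def pop(self, pos=-1, *, recursive=True):
        """Remove and return the child at `pos`, pruning emptied ancestors if asked."""
        child = self.children.pop(pos)
        if isinstance(child, Node):
            child.parent = None
        if recursive and not self.children and self.parent is not None:
            # This node is now empty: detach it from its own parent, which may
            # in turn become empty and be pruned by the same rule.
            self.parent.pop(self.parent.children.index(self), recursive=True)
        return child

    def __str__(self):
        return '(' + ' '.join([self.label, *map(str, self.children)]) + ')'


t = Node('S', [
    Node('NP', ['Alice']),
    Node('VP', [Node('VB', ['like']), Node('NP', [Node('NNS', ['apples'])])]),
])

t.children[1].children[1].pop(0)          # the emptied NP is pruned from VP
after_first = str(t)
t.pop(0)                                  # S still has a child, nothing else is pruned
after_second = str(t)
t.children[0].pop(0, recursive=False)     # VP is left empty but stays in place
after_third = str(t)

print(after_first)
print(after_second)
print(after_third)
```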

remove(child, *, recursive=True)

Remove first occurrence of value.

Raises ValueError if the value is not present.

Return type:

None

set_label(label)

Set the node label of the tree.

>>> t = Tree.fromstring("(S (NP (D the) (N dog)) (VP (V chased) (NP (D the) (N cat))))")
>>> t.set_label("T")
>>> print(t)
(T (NP (D the) (N dog)) (VP (V chased) (NP (D the) (N cat))))

Parameters:

label (any) – the node label (typically a string)

Return type:

None

__slots__ = ('_parent', '_label')

Type:    tuple

architxt.tree.has_type(t, types=None)

Check if the given tree object has the specified type(s).

Parameters:
  • t (Any) – The object whose type is checked (can be a Tree, Production, or NodeLabel).

  • types (Union[set[Union[NodeType, str]], NodeType, str, None]) – The types to check for (can be a set of NodeType values or strings, a single NodeType or string, or None).

Return type:

bool

Returns:

True if the object has the specified type(s), False otherwise.

>>> from architxt.tree import NodeType, Tree, has_type
>>> tree = Tree.fromstring('(S (ENT Alice) (REL Bob))')
>>> has_type(tree, NodeType.ENT)  # The root is labelled 'S', which is not an entity
False
>>> has_type(tree[0], NodeType.ENT)
True
>>> has_type(tree[0], 'ENT')
True
>>> has_type(tree[1], NodeType.ENT)
False
>>> has_type(tree[1], {NodeType.ENT, NodeType.REL})
True
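The three accepted forms of types (a set, a single value, or None) suggest a normalisation step before the comparison. The sketch below shows one way such a check could work on plain label strings. The '::' separator is taken from labels such as 'ENT::person' used elsewhere on this page, while the helper names and the treatment of None as "any known type" are assumptions, not documented behaviour.

```python
NODE_TYPES = {'ENT', 'GROUP', 'REL', 'COLL'}  # values of the documented NodeType enum


def label_type(label):
    """Return the type prefix of a label such as 'ENT::person', or None."""
    prefix = label.split('::', 1)[0]
    return prefix if prefix in NODE_TYPES else None


def matches(label, types=None):
    """Check a label against one type, a set of types, or (assumed) any type."""
    if types is None:
        wanted = NODE_TYPES  # assumption: None means "any known type"
    elif isinstance(types, (set, frozenset)):
        # Accept enum members as well as strings by unwrapping `.value` if present.
        wanted = {getattr(t, 'value', t) for t in types}
    else:
        wanted = {getattr(types, 'value', types)}
    return label_type(label) in wanted


print(matches('S', 'ENT'))
print(matches('ENT', 'ENT'))
print(matches('REL', {'ENT', 'REL'}))
print(matches('ENT::person', 'ENT'))
```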

architxt.tree.reduce(tree, pos, types=None)

Reduces a subtree within a tree at the specified position pos.

The reduction occurs only if the subtree at pos has exactly one child or, when types is given, if it does not match any of those node types. If the subtree can be reduced, its children are lifted into the parent node, where they replace the subtree at index pos.

Parameters:
  • tree (Tree) – The tree in which the reduction will take place.

  • pos (int) – The index of the subtree to attempt to reduce.

  • types (Optional[set[Union[str, NodeType]]]) – A set of NodeType values or string labels identifying subtrees that should be kept, or None to reduce based on length (the number of children).

Return type:

bool

Returns:

True if the subtree was reduced, False otherwise.

>>> from architxt.tree import Tree, reduce
>>> t = Tree.fromstring("(S (NP Alice) (VP (VB like) (NP (NNS apples))))")
>>> reduce(t[1], 1)
True
>>> print(t.pformat(margin=255))
(S (NP Alice) (VP (VB like) (NNS apples)))
>>> reduce(t, 0)
True
>>> print(t.pformat(margin=255))
(S Alice (VP (VB like) (NNS apples)))
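The lifting step can be sketched on nested lists, where replacing a subtree by its children is a single slice assignment. The helper names reduce_at and fmt are hypothetical, and only the types=None case (exactly one child) is modelled; this is an illustration of the idea rather than the library's code.

```python
def reduce_at(tree, pos):
    """Lift the children of child `pos` into `tree` if that child has exactly one child."""
    subtree = tree[pos + 1]               # index 0 holds the label
    if isinstance(subtree, str) or len(subtree) != 2:
        return False                      # a leaf, or not exactly one child
    tree[pos + 1:pos + 2] = subtree[1:]   # splice the child in place of the subtree
    return True


def fmt(node):
    """Render the nested-list tree in bracketed form."""
    if isinstance(node, str):
        return node
    return '(' + ' '.join(fmt(part) for part in node) + ')'


# Stand-in tree: [label, child, child, ...]; leaves are plain strings.
t = ['S', ['NP', 'Alice'], ['VP', ['VB', 'like'], ['NP', ['NNS', 'apples']]]]

first = reduce_at(t[2], 1)    # NP under VP has one child, so NNS is lifted
after_first = fmt(t)
second = reduce_at(t, 0)      # NP under S has one child, so 'Alice' is lifted
after_second = fmt(t)
third = reduce_at(t, 1)       # VP has two children, so nothing happens

print(first, after_first)
print(second, after_second)
print(third)
```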

architxt.tree.reduce_all(tree, skip_types=None)

Recursively attempts to reduce all eligible subtrees in a tree.

The reduction process continues until no further reductions are possible. Subtrees can be skipped if their types are listed in skip_types.

Parameters:
  • tree (Tree) – The tree in which reductions will be attempted.

  • skip_types (Optional[set[Union[str, NodeType]]]) – A set of NodeType values or string labels identifying subtrees that should be skipped (kept as they are), or None to reduce based on length (the number of children).

Return type:

None

Returns:

None. The tree is modified in place.

>>> from architxt.tree import Tree, reduce_all
>>> t = Tree.fromstring("(S (X (Y (Z (NP Alice)))) (VP (VB likes) (NP (NNS apples))))")
>>> reduce_all(t)
>>> print(t.pformat(margin=255))
(S Alice (VP likes apples))
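Repeating the single-child lift until nothing changes can be sketched as a walk that stays on the same index after each lift, so that a chain of single-child nodes collapses completely. The nested-list layout and the names reduce_all_sketch and fmt are assumptions for illustration, and skip_types is not modelled.

```python
def reduce_all_sketch(node):
    """Lift every single-child subtree below `node`, modifying it in place."""
    index = 1                                   # index 0 holds the label
    while index < len(node):
        child = node[index]
        if isinstance(child, str):
            index += 1                          # leaves cannot be reduced
        elif len(child) == 2:
            # Exactly one child: lift it, then stay on the same index so the
            # lifted element is examined again.
            node[index:index + 1] = child[1:]
        else:
            reduce_all_sketch(child)            # keep the node, reduce inside it
            index += 1


def fmt(node):
    """Render the nested-list tree in bracketed form."""
    if isinstance(node, str):
        return node
    return '(' + ' '.join(fmt(part) for part in node) + ')'


# Stand-in tree: [label, child, child, ...]; leaves are plain strings.
t = ['S',
     ['X', ['Y', ['Z', ['NP', 'Alice']]]],
     ['VP', ['VB', 'likes'], ['NP', ['NNS', 'apples']]]]

reduce_all_sketch(t)
result = fmt(t)
print(result)
```

The root itself is never lifted in this sketch because it has no parent to receive its children, and VP survives because it has two children.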