architxt.nlp.brat
Dataset loader for BRAT (BRAT Rapid Annotation Tool) format.
Functions
- convert_brat_entities: Convert a list of BratEntity objects into Entity objects, while filtering out certain types of tags.
- convert_brat_example: Convert a Brat example into annotated sentences, filtering and mapping entities and relations as specified.
- convert_brat_relations: Convert a list of BratRelation objects into Relation objects while filtering out certain types of relations.
- architxt.nlp.brat.convert_brat_entities(entities, *, allow_list=None, mapping=None)
Convert a list of BratEntity objects into Entity objects, while filtering out certain types of tags.
- Parameters:
entities (Iterable[Entity]) – An iterable of BratEntity objects to convert.
allow_list (Optional[set[str]]) – A set of entity types to exclude from the output. Despite the parameter's name, the listed types are dropped, as the example for this function shows. If None, no filtering is applied.
mapping (Optional[dict[str, str]]) – A dictionary mapping entity names to new values. If None, no mapping is applied.
- Return type:
- Returns:
A generator yielding Entity objects.
>>> from pybrat.parser import Entity, Relation, Span
>>> from architxt.nlp.brat import convert_brat_entities
>>> ents = [
...     Entity(spans=[Span(start=0, end=5)], type="person", mention="E1"),
...     Entity(spans=[Span(start=10, end=15)], type="FREQ", mention="E2"),
...     Entity(spans=[Span(start=20, end=25)], type="MOMENT", mention="E3")
... ]
>>> ents = list(convert_brat_entities(ents, allow_list={"MOMENT"}, mapping={"FREQ": "FREQUENCE"}))
>>> len(ents)
2
>>> print(ents[0].name)
PERSON
>>> print(ents[1].name)
FREQUENCE
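The filter-then-rename behaviour in the example above can be mimicked in plain Python without installing architxt or pybrat. The sketch below is not architxt's implementation: `SimpleEntity` and `filter_and_map` are hypothetical names, and the order of operations (upper-case the type, drop excluded types, then apply the mapping) is an assumption inferred only from the doctest output.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleEntity:
    """Hypothetical stand-in for a converted entity (not an architxt class)."""

    name: str
    start: int
    end: int


def filter_and_map(
    raw: Iterable[tuple[str, int, int]],
    *,
    excluded: set[str] | None = None,
    mapping: dict[str, str] | None = None,
) -> Iterator[SimpleEntity]:
    """Lazily yield entities, skipping excluded types and renaming mapped ones."""
    excluded = excluded or set()
    mapping = mapping or {}
    for raw_type, start, end in raw:
        name = raw_type.upper()  # assumption: type names are normalised to upper case
        if name in excluded:
            continue  # excluded types never reach the output
        yield SimpleEntity(mapping.get(name, name), start, end)


# (type, start offset, end offset) triples mirroring the doctest data
raw_entities = [("person", 0, 5), ("FREQ", 10, 15), ("MOMENT", 20, 25)]
converted = list(
    filter_and_map(raw_entities, excluded={"MOMENT"}, mapping={"FREQ": "FREQUENCE"})
)
print([e.name for e in converted])  # prints ['PERSON', 'FREQUENCE']
```

Because the function is a generator, nothing is converted until the result is iterated, which is why both the doctest and this sketch wrap the call in `list(...)`.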
- architxt.nlp.brat.convert_brat_example(example, *, entities_filter=None, relations_filter=None, entities_mapping=None, relations_mapping=None)
Convert a Brat example into annotated sentences, filtering and mapping entities and relations as specified.
- Parameters:
example (Example) – An Example object containing the .txt and .ann file data.
entities_filter (Optional[set[str]]) – A set of entity types to exclude from the output. If None, no filtering is applied.
relations_filter (Optional[set[str]]) – A set of relation types to exclude from the output. If None, no filtering is applied.
entities_mapping (Optional[dict[str, str]]) – A dictionary mapping entity names to new values. If None, no mapping is applied.
relations_mapping (Optional[dict[str, str]]) – A dictionary mapping relation names to new values. If None, no mapping is applied.
- Return type:
- Returns:
A generator yielding AnnotatedSentence objects for each sentence in the text.
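This page does not describe how sentences are segmented or how entity offsets are handled, so the following is only an illustration of the general idea of yielding one annotated record per sentence. Everything in it is an assumption: `SentenceSketch` and `split_with_entities` are hypothetical names, the punctuation-based splitter is deliberately naive, and rebasing offsets to the sentence start is a design choice of this sketch, not documented architxt behaviour.

```python
from __future__ import annotations

import re
from collections.abc import Iterable, Iterator
from dataclasses import dataclass, field


@dataclass
class SentenceSketch:
    """Hypothetical stand-in for an annotated sentence (not architxt's AnnotatedSentence)."""

    text: str
    entities: list[tuple[str, int, int]] = field(default_factory=list)


def split_with_entities(
    text: str, entities: Iterable[tuple[str, int, int]]
) -> Iterator[SentenceSketch]:
    """Yield one SentenceSketch per sentence, with entity offsets relative to that sentence."""
    entities = list(entities)
    # Naive splitter: a sentence is a run of characters optionally ending in '.', '!' or '?'.
    for match in re.finditer(r"[^.!?]+[.!?]?", text):
        raw = match.group()
        stripped = raw.strip()
        if not stripped:
            continue
        # Character offset of the first non-space character of the sentence
        sent_start = match.start() + (len(raw) - len(raw.lstrip()))
        sent_end = sent_start + len(stripped)
        # Keep only entities fully contained in this sentence and rebase their offsets
        local = [
            (name, start - sent_start, end - sent_start)
            for name, start, end in entities
            if sent_start <= start and end <= sent_end
        ]
        yield SentenceSketch(stripped, local)


text = "Alice visits Paris. Bob stays home."
entities = [("PERSON", 0, 5), ("CITY", 13, 18), ("PERSON", 20, 23)]
sentences = list(split_with_entities(text, entities))
for sentence in sentences:
    print(sentence.text, sentence.entities)
```

In this sketch an entity that crosses a sentence boundary is silently dropped; a real loader would need an explicit policy for that case.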
- architxt.nlp.brat.convert_brat_relations(relations, *, allow_list=None, mapping=None)
Convert a list of BratRelation objects into Relation objects while filtering out certain types of relations.
- Parameters:
relations (Iterable[Relation]) – An iterable of BratRelation objects to convert.
allow_list (Optional[set[str]]) – A set of relation types to exclude from the output. Despite the parameter's name, the listed types are dropped, as the example for this function shows. If None, no filtering is applied.
mapping (Optional[dict[str, str]]) – A dictionary mapping relation names to new values. If None, no mapping is applied.
- Return type:
- Returns:
A generator yielding Relation objects.
>>> from pybrat.parser import Entity, Relation, Span
>>> from architxt.nlp.brat import convert_brat_relations
>>> rels = [
...     Relation(
...         arg1=Entity(spans=[Span(start=0, end=5)], type='X', mention='E1'),
...         arg2=Entity(spans=[Span(start=10, end=15)], type='Y', mention='E2'),
...         type="part-of",
...     ),
...     Relation(
...         arg1=Entity(spans=[Span(start=20, end=25)], type='X', mention='E3'),
...         arg2=Entity(spans=[Span(start=30, end=35)], type='Z', mention='E3'),
...         type="TEMPORALITY",
...     )
... ]
>>> rels = list(convert_brat_relations(rels, allow_list={"TEMPORALITY"}))
>>> len(rels)
1
>>> print(rels[0].name)
PART-OF