Fundamentals

ArchiTXT represents a database instance as a hierarchical tree structure called the meta-model. This meta-model acts as an intermediate, database-agnostic representation that allows flexible transformation to and from various database models, such as:

  • Relational databases

  • Document-oriented databases

  • Graph databases (RDF, Property Graph)

  • Tabular formats (CSV, TSV, etc.)

  • Nested data structures (XML, JSON, YAML, etc.)

Instance Representation

A database instance is a set of trees organised into a forest. Each node in these trees represents a fundamental database concept. The key components are:

Entity

A fundamental data unit representing a name-value pair, such as an attribute. It is represented by ENT nodes.

Group

A collection of Entities that share a semantic relationship. It is represented by GROUP nodes.

Relation

A connection between two Groups. It is represented by REL nodes.

Collection

A set of equivalent Groups or Relations. It is represented by COLL nodes.

Below is a visual representation of an instance in ArchiTXT:

---
config:
  theme: neutral
---
graph TD;
    root["ROOT"]
    root --> r1[REL1]
    r1 --> g1[GROUP1]
    r1 --> g2[GROUP2]
    root --> g3[GROUP1]
    g1 --> e1[ENT1]
    g1 --> e2[ENT2]
    g2 --> e3[ENT3]
    g3 --> e4[ENT1]
    g3 --> e5[ENT2]
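
The instance in the diagram above can also be written down as a plain in-memory tree. The following is a minimal Python sketch for illustration only: the `Node` class, the `ent`, `preorder`, and `render` helpers, and the placeholder values are all hypothetical and are not part of ArchiTXT's API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Illustrative tree node (hypothetical, not ArchiTXT's own class)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    value: Optional[str] = None  # only ENT leaves carry a value


def ent(label: str, value: str) -> Node:
    """Build an ENT leaf holding a single value."""
    return Node(label, value=value)


# Same shape as the diagram; the values are made-up placeholders.
instance = Node("ROOT", [
    Node("REL1", [
        Node("GROUP1", [ent("ENT1", "a"), ent("ENT2", "b")]),
        Node("GROUP2", [ent("ENT3", "c")]),
    ]),
    Node("GROUP1", [ent("ENT1", "d"), ent("ENT2", "e")]),
])


def preorder(node: Node) -> Iterator[Node]:
    """Yield the node, then each of its subtrees, left to right."""
    yield node
    for child in node.children:
        yield from preorder(child)


def render(node: Node, depth: int = 0) -> str:
    """Return an indented, one-node-per-line view of the tree."""
    suffix = f" = {node.value!r}" if node.value is not None else ""
    lines = ["  " * depth + node.label + suffix]
    lines.extend(render(child, depth + 1) for child in node.children)
    return "\n".join(lines)


print(render(instance))
```

Note that the two GROUP1 subtrees share a label but are distinct nodes, which is how the same group type can appear both inside a relation and directly under the root.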
    

Schema Definition

Given an instance structured as a forest, a Schema is a grammar that recognises the trees making up a valid database instance.

The following is the minimal schema that recognises the above instance:

ROOT   ::= REL1 | GROUP1 | GROUP2
REL1   ::= GROUP1 GROUP2
GROUP1 ::= ENT1 ENT2
GROUP2 ::= ENT3
ENT1   ::= <data>
ENT2   ::= <data>
ENT3   ::= <data>
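
To make the idea of a schema recognising an instance concrete, here is a small Python sketch that checks a tree against the grammar above. It rests on assumptions that the text does not state: the ROOT rule is read as the list of tree types allowed at the top level of the forest, every other rule is read as the exact sequence of child labels, and trees are encoded as `(label, body)` tuples. The names `SCHEMA`, `INSTANCE`, and `is_valid` are hypothetical and not ArchiTXT functions.

```python
# Each rule maps a label to its alternatives; "<data>" marks a value leaf.
SCHEMA = {
    "ROOT": [["REL1"], ["GROUP1"], ["GROUP2"]],
    "REL1": [["GROUP1", "GROUP2"]],
    "GROUP1": [["ENT1", "ENT2"]],
    "GROUP2": [["ENT3"]],
    "ENT1": [["<data>"]],
    "ENT2": [["<data>"]],
    "ENT3": [["<data>"]],
}

# A tree is (label, body): body is a list of subtrees, or a string for data.
# The string values are made-up placeholders.
INSTANCE = ("ROOT", [
    ("REL1", [
        ("GROUP1", [("ENT1", "a"), ("ENT2", "b")]),
        ("GROUP2", [("ENT3", "c")]),
    ]),
    ("GROUP1", [("ENT1", "d"), ("ENT2", "e")]),
])


def is_valid(tree, schema=SCHEMA):
    """Return True if every node of the tree matches a rule of the schema."""
    label, body = tree
    alternatives = schema.get(label)
    if alternatives is None:
        return False  # unknown label: no rule can produce it
    if isinstance(body, str):
        return ["<data>"] in alternatives  # leaf must be a data rule
    child_labels = [child[0] for child in body]
    if label == "ROOT":
        # Assumption: each top-level tree matches one ROOT alternative.
        shape_ok = all([name] in alternatives for name in child_labels)
    else:
        # Assumption: children must match one alternative exactly, in order.
        shape_ok = child_labels in alternatives
    return shape_ok and all(is_valid(child, schema) for child in body)


print(is_valid(INSTANCE))
```

Under these assumptions, a tree such as `("ROOT", [("GROUP2", [("ENT1", "x")])])` is rejected, because GROUP2 may only contain ENT3.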