Abstract
Graph-based semantic representations are popular in natural language processing, where it is often convenient to model linguistic concepts as nodes and relations as edges between them. Several attempts have been made to find a generative device that is sufficiently powerful to describe languages of semantic graphs, while at the same time allowing efficient parsing. We contribute to this line of work by introducing graph extension grammar, a variant of the contextual hyperedge replacement grammars proposed by Hoffmann et al. Contextual hyperedge replacement can generate graphs with non-structural reentrancies, a type of node-sharing that is very common in formalisms such as abstract meaning representation, but that context-free types of graph grammars cannot model. To provide our formalism with a way to place reentrancies in a linguistically meaningful way, we endow rules with logical formulas in counting monadic second-order logic. We then present a parsing algorithm and show as our main result that this algorithm runs in polynomial time on graph languages generated by a subclass of our grammars, the so-called local graph extension grammars.
1 Introduction
Formal graph languages are commonly used to represent the semantics of natural and artificial languages. They are exceptionally versatile, lend themselves to human interpretation (in contrast to, for example, vector-based semantic representations), and have a comprehensive mathematical theory. Recent applications are the abstract meaning representations (AMRs) (Langkilde and Knight 1998; Banarescu et al. 2013) that capture the semantics of natural language sentences, and the work of Allamanis, Brockschmidt, and Khademi (2018), who use graphs to represent both the syntactic and semantic structure of source code. In these cases, it is common to represent objects as nodes, and relations as directed edges. Probabilities and other weights are sometimes added to reflect quantitative aspects such as likelihoods and uncertainties.
In this work, we propose the graph extension grammar for modeling languages of graph-based semantic representations. We formalize these grammars in a tree-based fashion in the sense of Drewes (2006). Thus, a grammar consists of a graph algebra 𝒜 and a tree grammar g. The trees generated by g are well-formed expressions over the operations of 𝒜. Each generated tree thus evaluates into a graph, meaning that the tree language generated by g evaluates to a graph language. If we are careful about how we construct and combine g and 𝒜, we can make parsing efficient. In other words, given a graph G, we can find a tree in the language of g that evaluates to G under 𝒜—or decide that no such tree exists, meaning that G is not in the graph language specified by g and 𝒜—in polynomial time. The main contribution of this work is the design of the algebra 𝒜.
As a guiding example, we take the aforementioned AMR. (Note that since this article focuses on parsing in terms of membership problem solving, we do not go into the extensive string-to-AMR parsing literature.) AMR is characterized by its graphs being directed, acyclic, and having unbounded node degree. The concept was first introduced by Langkilde and Knight (1998) based on a semantic abstraction language by Kasper (1989). The notion was refined and popularized by Banarescu et al. (2013) and instantiated for a limited domain by Braune, Bauer, and Knight (2014). To ground AMR in formal language theory, Chiang et al. (2018) analyze the AMR corpus of Banarescu et al. (2013). They note that even though the node degree is generally low in practice, this is not always the case, which speaks in favor of models that allow an unbounded node degree. Regarding the treewidth of the graphs in the corpus, they find that it never exceeds 4 and conclude that an algorithm can depend exponentially on this parameter and still be feasible in practice.
In the context of semantic graphs, it is common to talk about reentrancies. Figure 1 illustrates this concept with a pair of AMR graphs, both of which require node sharing, or reentrant edges, to express the correct semantics. We propose to distinguish between two types of reentrancies: In the first graph, the reentrancy is structural, meaning that the boy must be the agent of whatever he is trying, so the “arg0” edge can only point to him. In the second graph, the reentrancy is non-structural in the sense that although in this particular case the girl believes herself, there is nothing to prevent her from believing someone else, or that someone else believes her. In general, we speak of (1) structural reentrancies when they are syntactically governed, for example, by control or coordination, and of (2) non-structural reentrancies when they can in principle refer to any antecedent, in some sense disregarding the structure of the graph in favor of the roles of the concepts. Structural reentrancy can, for example, also take the form of object control as in “They persuaded him to talk to her”, where the person who talks and the person who was persuaded must be one and the same. In contrast, “[…], but she liked them” is an example of (2) since the antecedent of “them” may be picked from anywhere in “[…]” for the semantic representation to be valid.
A second important characteristic of AMR and other notions of semantic graphs in natural language processing is that certain types of edges, like those in Figure 1, point to arguments of which there should be at most one. For example, the concept “try” in Figure 1 must only have one outgoing “arg0” edge. Other edges, of types not present in Figure 1, can for example represent modifiers, such as believing strongly and with a passion. Outgoing edges of these types are usually not limited in number.
It is known that contextual hyperedge replacement grammars (CHRGs), an extension of the well-known context-free hyperedge replacement grammars (HRGs), can model both structural and non-structural reentrancies (Drewes and Jonsson 2017). To achieve this, contextual hyperedge replacement rules may contain so-called context nodes which, during the application of a rule, are identified with any appropriately labeled node of the graph to which the rule is applied. In particular, this allows for the generation of non-structural reentrancies, using rules that contain edges pointing to context nodes. Despite this added generative power compared to HRGs, previous research on CHRGs has resulted in a pumping lemma (Berglund 2019) for the generated graph languages and a parser generator (Drewes, Hoffmann, and Minas 2017, 2021, 2022) that, for a certain subclass of CHRGs, yields a parser that runs in quadratic (and in the common case, linear) time in the size of its input graph.1 However, similarly to LL- and LR-parsers for ordinary context-free languages, and in contrast to our algorithm, the parser generator may discover parsing conflicts. In this case it is unable to construct a parser. Compared to the string case, the possible reasons for conflicts are much subtler, which makes grammar construction a complex and error-prone manual endeavor.
For this reason, we are now proposing graph extension grammars, a type of graph grammar that makes use of the idea of context nodes in a different way. It enables polynomial parsing based on an entirely syntactic condition on the rules (our main result Theorem 5) while retaining the ability to specify graph languages with reentrancies of both type (1) and (2). One key property of these grammars that allows for polynomial parsing is that, intuitively, nodes are provided with all of their outgoing edges the moment they are created. Using ordinary contextual hyperedge replacement, this would thus result in graph languages of bounded out-degree, the bound being given by the maximal number of outgoing edges of nodes in the right-hand sides of productions. To generate semantic graphs in which a concept can have an arbitrary number of dependents, we use a technique from Drewes et al. (2010) known as cloning. Here, tree-based graph generation is helpful because it allows us to incorporate both contextuality and cloning in a natural way without sacrificing efficient parsability. While tree-based hyperedge replacement grammars are well-known to be equivalent to ordinary ones (see below), the tree-based formulation does make a difference in the contextual case as it avoids the problem of cyclic dependencies that has not yet been fully characterized and can make parsing intractable (Drewes, Hoffmann, and Minas 2019).
The creation of reentrant edges via context nodes requires a control mechanism that ensures that edges are placed in a meaningful way. One must, for example, make certain that co-references refer to entities of the correct type and plurality. For such conditions, we use monadic second-order (mso) logic since it has many well-known ties to the theory of hyperedge replacement. In particular, the theorem commonly known as Courcelle’s theorem (see, e.g., Courcelle and Engelfriet 2012) states that mso formulas can be evaluated in linear time on graphs of bounded treewidth. The theorem is thus applicable to graphs generated by HRGs, because they generate graph languages of bounded treewidth. Unfortunately, the addition of non-structural reentrancies destroys this property and can cause a generated graph to have a treewidth proportional to the size of the graph. Despite this, we show in our main result that, if the mso formulas involved are local in a sense that restricts their ability to make general statements about the structure of the graph, then Courcelle’s theorem can be exploited to solve the membership problem in polynomial time. We also show that the membership problem is NP-complete if no restriction is imposed on the formulas.
We first prove our main result for graph extension grammars that are edge agnostic, meaning that their mso formulas are not allowed to make use of the edge predicate. Since this is a rather severe limitation that does not allow the placement of reentrant edges to be controlled by structural conditions at all, we show afterwards how it can be relaxed. We do so by allowing the logic to make use of predicates which, for a given node, state that this node belongs to a part of the graph having a certain form. The resulting local graph extension grammars are much more general than the edge-agnostic ones, but they still allow for polynomial parsing. We note here that, in fact, our algorithm works correctly even without any such restriction, the only downside being a lack of efficiency. With this in mind, it may be worth noting that neither edge agnosticism nor locality is needed to be able to apply Courcelle’s theorem if we are only interested in parsing graphs of bounded treewidth. For AMR, Chiang et al. (2018) observed by inspecting the AMR bank that, although there is no theoretical upper limit on the treewidth of AMR graphs, in practice AMR graphs of treewidth larger than 5 are extremely rare. Hence, if one is willing to place such a bound on the treewidth of acceptable AMR graphs, then Courcelle’s theorem and thus our result can be used even without the locality assumption.
Regardless, it is our belief that locality, from a practical point of view, is not a particularly severe restriction. In fact, even edge agnosticism may be tolerable in many practical cases. This is because, as long as we are interested in the generation of AMR-like structures, the primary use of context nodes is to be able to describe reentrancies caused by language elements such as pronouns. Simplifying the situation only a little, one may say that a pronoun may refer to any suitable antecedent in the sentence, wherever and in whichever role it occurs. This is essentially saying that their placement is edge agnostic even from a linguistic point of view. For example, if the pronoun “he” occurs in a sentence (or its AMR graph) that also has an occurrence of “Bob” then the former may refer to the latter, regardless of which other edges point to Bob. The only thing that (usually) matters is whether the label “Bob” refers to a person that uses the personal pronoun “he”.
1.1 Related Work
Tree-based generation dates back to the seminal article by Mezei and Wright (1967), which generalizes context-free languages to languages over arbitrary domains, by evaluating the trees of a regular tree language with respect to an algebra.2 Operations on graphs were used in this way for the first time by Bauderon and Courcelle (1987). See the textbook by Courcelle and Engelfriet (2012) for the eventual culmination of this line of work. Graph operations similar to those used here (though without the contextual extension) appeared first in Courcelle’s work on the mso logic of graphs (Courcelle 1991, Definition 1.7). Essentially, if the right-hand side of a production contains k nonterminal hyperedges, it is viewed as an operation that takes k hypergraphs as arguments (corresponding to the hypergraphs generated by the k subderivations) and returns the hypergraph obtained by replacing the nonterminals in the right-hand side by those k argument hypergraphs. By context-freeness, evaluating the tree language that corresponds to the set of derivation trees of a grammar of such productions (which is a regular tree language) yields the same set of hypergraphs as is generated by the grammar itself.
This two-step approach is somewhat similar to that of lexical-functional grammar (LFG) by Kaplan and Bresnan (1982): In LFG, a c-structure containing syntactical information about a sentence and an f-structure that provides its semantic information are combined to create a representation for a language user’s syntactic knowledge. The c-structure is comparable to a derivation tree, while the semantic f-structure that applies semantic knowledge on top of that can be compared to the algebraic operations on graphs. Using tree formalisms for capturing linguistic aspects such as co-occurrence is not new: See, for example, the work by Joshi and Levy (1982) for the usage of trees to impose local constraints on sentences. Similarly, Carroll et al. (1999) investigate the practical implications of the extended domain of locality of tree-adjoining grammars.
Several formalisms have been put forth in the literature to describe graph-based semantic representations in general and AMR in particular. Most of these can be seen as variations of HRGs (see Habel and Kreowski 1987; Bauderon and Courcelle 1987; Drewes, Kreowski, and Habel 1997). It was established early that unrestricted HRGs can generate NP-complete graph languages (Aalbersberg, Rozenberg, and Ehrenfeucht 1986; Lange and Welzl 1987), so restrictions are needed to ensure efficient parsing. To this end, Lautemann (1990) proposes a CYK-like membership algorithm and proves that it runs in polynomial time provided that the language satisfies the following condition: For every graph G in the language, the number of connected components obtained by removing s nodes from G is in O(log n), where n is the number of nodes of G and the constant s depends on the grammar. Lautemann’s algorithm is refined by Chiang et al. (2013) to make it more suitable for natural language processing (NLP) tasks, but the algorithm is exponential in the node degree of the input graph.
In a parallel line of work, Quernheim and Knight (2012) propose automata on directed acyclic graphs (DAGs) for processing feature structures in machine translation. Chiang et al. (2018) invent an extended model of these DAG automata, focusing on semantic representations such as AMR. For this, the left- and right-hand sides in their DAG automata may take the form of restricted regular expressions. Blum and Drewes (2019) complement this work by studying language-theoretic properties of the DAG automata, establishing, among other things, that equivalence and emptiness are decidable in polynomial time.
Various types of graph algebras for AMR parsing are described in the work by Groschwitz et al. (see, e.g., Groschwitz, Koller, and Teichmann 2015; Groschwitz et al. 2017, 2018; Lindemann, Groschwitz, and Koller 2019, 2020). A central objective is to find linguistically motivated restrictions that can efficiently be trained from data. An algorithm based on such an algebra for translating strings into semantic graphs is presented by Groschwitz et al. (2018): operations of arity zero denote graph fragments, and operations of arity two denote binary combinations of graph fragments into larger graphs. The trees over the operations of this algebra mirror the compositional structure of semantic graphs. The approach differs from ours in that their graph operations are entirely deterministic, and that neither context nodes nor cloning are used. Moreover, as is common in computational linguistics, evaluation is primarily empirical.
Through another set of syntactic restrictions on HRGs, Björklund et al. (2021) and Björklund, Drewes, and Ericson (2019) arrive at order-preserving DAG grammars (OPDGs)—with or without weights—which can be parsed in linear time. Intuitively, the restrictions ensure that each generated graph can be uniquely represented by a tree interpreted by a particular graph algebra. Despite their restrictions, OPDGs can describe central structural properties of AMR, but their limitation lies in the modeling of reentrancies. Of the previously discussed types of reentrancies, type (1) can to a large extent be modeled using OPDGs. Modeling type (2) cannot be done (except in very limited cases) since it requires attaching edges to the non-local context in a stochastic way, which cannot be achieved using hyperedge replacement alone.
The CHRGs by Drewes, Hoffmann, and Minas (2012) extend the ordinary HRGs with so-called contextual rules, which allow for isolated nodes in their left-hand sides. Contextual rules can reference previously generated nodes that are not attached to the replaced nonterminal hyperedge and add structure to them. Even though this formalism is strictly more powerful than HRG, it inherits several of the nice properties of HRG. In particular, there are useful normal forms and the membership problem is in NP (Drewes and Hoffmann 2015) for certain subclasses. However, as indicated above, the conditions defining these subclasses are semantic ones and thus difficult to handle. Here, we set out to use the idea of context nodes, enriched by a benign form of cloning, in a different way to develop a type of graph grammar that is sufficiently descriptive to model structural and non-structural phenomena in semantic graphs such as AMR, while allowing for easily verifiable syntactic conditions that result in polynomial time membership tests.
A very preliminary version of this work was published as Björklund, Drewes, and Jonsson (2021), the main result being a parsing algorithm that runs in time O(n^(2τ+1)) where τ is the maximal type of any extension operation in the grammar (see Sections 2 and 3 for definitions). The graph extension grammars in that version restrict the matching of context nodes to nodes in the argument graph solely via node labels (that is, their node labels have to be identical). The use of mso formulas for that purpose is more general, simplifies the formalism, and allows us to take advantage of Courcelle’s theorem in the main proof of the article while preserving the upper bound on the running time of O(n^(2τ+1)). Furthermore, it enables us to extend polynomial parsing to so-called local graph extension grammars. Their extension operations use a logical vocabulary enriched by so-called local node predicates, thus relaxing the assumption of edge agnosticism.
The remainder of this article is structured as follows. In the next section, we gather the basic definitions regarding graphs and logic on graphs. In Section 3 we introduce graph extension grammars and discuss an example. Section 4 shows that graph extension grammars can generate NP-complete graph languages. Readers who are less interested in NP-completeness results may safely skip that section, or consider it as another illustration of the concept of graph extension grammars. The major technical section is Section 5, in which we present the parsing algorithm and show that it can be implemented to run in polynomial time for edge-agnostic graph extension grammars. Subsequently, in Section 6, we show how the edge agnosticism requirement can be relaxed without sacrificing polynomial parsing, by turning to local graph extension grammars. Finally, Section 7 concludes.
2 Preliminaries
We sometimes use boldface letters x to denote (finite) indexed sequences, whose individual elements are then denoted as xi, that is, x = x1⋯xk where k = |x|. Alternatively, we may sometimes denote xi by x(i).
Trees and Algebras
A ranked alphabet is a pair A = (Σ, rk) such that Σ is a finite set of symbols and rk: Σ →ℕ is a function that assigns to every f ∈ Σ a rank. We usually keep rk implicit, thus notationally identifying A with Σ, and write f(k) to indicate that rk(f) = k.
The set TΣ of all well-formed trees over a ranked alphabet Σ is defined inductively: It is the smallest set of expressions such that, for every f(k) ∈ Σ and all trees t1,…, tk ∈ TΣ, we have f[t1,…, tk] ∈ TΣ. In particular, f[] (the case k = 0), which we henceforth abbreviate by f, is in TΣ.
Let Σ be a ranked alphabet. A Σ-algebra is a pair 𝒜 = (A, (f𝒜)f∈Σ) where A is a set called the domain of 𝒜, and f𝒜: Ak → A is a function for every f(k) ∈ Σ. Given a tree t ∈ TΣ, the result of evaluating t with respect to 𝒜 is denoted by val𝒜(t): for t = f[t1,…, tk] we let val𝒜(t) = f𝒜(val𝒜(t1),…, val𝒜(tk)).
To generate trees over the operations of an algebra, we use regular tree grammars.
A regular tree grammar is a tuple g = (N,Σ, P, S) consisting of the following components:
N is a ranked alphabet of symbols, all of rank 0, called nonterminals .
Σ is a ranked alphabet of terminals which is disjoint from N.
P is a finite set of productions of the form A → f[A1,…, Ak] where k ∈ ℕ, f(k) ∈ Σ, and A, A1,…, Ak ∈ N.
S ∈ N is the initial nonterminal .
We also say that g is a regular tree grammar over Σ .
While we are going to work with algebras over graph domains, let us illustrate the general idea by considering an algebra over the rational numbers ℚ. In this setting, the operations could be the standard arithmetic ones, such as +,−,/,×, all binary, together with constant-valued operations in ℚ, all of arity zero. It is hopefully easy to see how, for example, the tree /[10,−[10,4]] evaluates to 5/3, as it is a mere syntactic variant of the common arithmetic expression 10/(10 − 4).
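The evaluation of such arithmetic trees can be sketched in a few lines of Python. This is an illustrative encoding of ours, not part of the formalism: a tree f[t1,…, tk] becomes a nested tuple, and the algebra assigns a function to each symbol.

```python
from fractions import Fraction

# A tree f[t1, ..., tk] is encoded as the nested tuple (f, t1, ..., tk);
# rank-0 symbols (the constants from Q) are written directly as numbers.
ALGEBRA = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def evaluate(tree):
    """Evaluate a tree bottom-up with respect to the algebra."""
    if isinstance(tree, tuple):
        f, *subtrees = tree
        return ALGEBRA[f](*(evaluate(t) for t in subtrees))
    return Fraction(tree)  # a constant of arity zero

# /[10, -[10, 4]] is a syntactic variant of 10 / (10 - 4) = 5/3.
print(evaluate(("/", 10, ("-", 10, 4))))  # prints 5/3
```

The bottom-up recursion mirrors the inductive definition of val𝒜 exactly: a tree is evaluated by first evaluating its subtrees and then applying the operation associated with its root symbol.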
Let g = (N,Σ, P, S) be a regular tree grammar. Then (LA(g))A∈N is the smallest family of sets of trees such that, for every A ∈ N, a tree f[t1,…, tk] is in LA(g) if there is a production A → f[A1,…, Ak] in P, for some k ∈ ℕ and A1,…, Ak ∈ N, such that ti ∈ LAi(g) for all i ∈ [k]. The regular tree language generated by g is L(g) = LS(g).
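The inductive definition above can be turned into a short enumeration procedure. The following Python sketch (our own illustrative encoding; the triple representation of productions is an assumption) computes all trees in LA(g) whose derivation has height at most a given bound.

```python
from itertools import product

# A production A -> f[A1, ..., Ak] is encoded as the triple (A, f, (A1, ..., Ak)).
def generate(productions, nonterminal, height):
    """All trees in L_A(g) derivable with derivations of height <= height."""
    if height == 0:
        return set()
    trees = set()
    for lhs, f, rhs in productions:
        if lhs != nonterminal:
            continue
        # Recursively enumerate the subtrees for each nonterminal in the
        # right-hand side, then combine them in all possible ways.
        subtree_sets = [generate(productions, B, height - 1) for B in rhs]
        for subtrees in product(*subtree_sets):
            trees.add((f, *subtrees))
    return trees

# S -> c[] and S -> f[S, S]: all binary trees over f with c-labeled leaves.
P = [("S", "c", ()), ("S", "f", ("S", "S"))]
print(generate(P, "S", 2))
```

With height 2, the grammar above yields the trees c and f[c, c]; increasing the bound lets successively larger binary trees appear, in line with LS(g) being the least fixed point of the inductive definition.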
Graphs and Counting Monadic Second-Order Logic
We now define the type of graphs considered in this article, and the operations used to construct them. In short, we work with node- and edge-labeled directed graphs, each equipped with a sequence of so-called ports. From a graph operation point of view, the sequence of ports is the “interface” of the graph; its nodes are the only ones that can individually be accessed. The number of ports determines the type of the graph.
A labeling alphabet (or simply alphabet) is a pair L = (LV, LE) of finite sets LV of node labels and LE of edge labels. A graph (over L) is a system G = (V, E, lab, port) where
V is a finite set of nodes ,
E ⊆ V × LE × V is a (necessarily finite) set of edges,
lab: V → LV assigns a node label to every node, and
port ∈ V* is a sequence of nodes called ports.
The type of G is type(G) = |port|. The set of all graphs (over an implicitly understood labeling alphabet) of type k is denoted by Gk. For an edge e = (u, z, v) ∈ E, we let src(e) = u (short for source) and tar(e) = v (short for target).
In the following, if the components of a graph G are not explicitly named, they will be denoted by VG, EG, labG, and portG, respectively.
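As a concrete (and entirely ad hoc) rendering of this definition, a graph can be encoded in Python as follows; graph_type, src, and tar mirror type(G), src(e), and tar(e) as defined above.

```python
from collections import namedtuple

# An ad hoc rendering of G = (V, E, lab, port): edges are triples
# (source, edge label, target), and port is a sequence of nodes.
Graph = namedtuple("Graph", ["V", "E", "lab", "port"])

def graph_type(G):
    """type(G) = |port|, the number of ports of G."""
    return len(G.port)

def src(e):  # source of an edge e = (u, z, v)
    return e[0]

def tar(e):  # target of an edge e = (u, z, v)
    return e[2]

# A two-node graph with one "arg0" edge and a single port.
G = Graph(V={1, 2}, E={(1, "arg0", 2)},
          lab={1: "try", 2: "boy"}, port=(1,))
print(graph_type(G))  # prints 1
```

The port sequence is what later makes graphs composable: it is the only part of the graph that operations can access individually.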
The restriction of a graph G to V ⊆ VG and E ⊆ EG is defined if [portG] ⊆ V and E ⊆ V × LE × V, and is in this case given by (V, E, labG|V, portG).
A morphism from a graph G = (V, E, lab, port) to a graph G′ = (V′, E′, lab′, port′) is a structure and label preserving function μ: V → V′. More precisely, we require that lab′(μ(v)) = lab(v) for all v ∈ V, (μ(u), z, μ(v)) ∈ E′ for all (u, z, v) ∈ E, and μ(port) is a prefix of port′ (that is, every port of G is mapped to the respective port of G′, but G′ may contain additional ports). The fact that μ is such a morphism is also denoted by writing μ: G → G′. G is included in G′, denoted by G ⊆ G′, if the identity on V is a morphism from G to G′.
A morphing of a graph G = (V, E, lab, port) into a graph G′ = (V′, E′, lab′, port′) is a surjective morphism μ: G → G′ which is also surjective on edges and ports, that is, E′ = {(μ(u), z, μ(v))∣(u, z, v) ∈ E} and port′ = μ(port). Given such a morphing, we denote G′ by μ(G), and call it a morph of G. A morphing is an isomorphism if it is bijective; if an isomorphism between G and G′ exists, then these graphs are isomorphic.
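The three morphism conditions can be checked mechanically. Here is a minimal Python sketch, assuming the same ad hoc namedtuple encoding of graphs as above; the function and field names are ours.

```python
from collections import namedtuple

Graph = namedtuple("Graph", ["V", "E", "lab", "port"])

def is_morphism(mu, G, H):
    """Does mu: V_G -> V_H preserve labels, edges, and ports?"""
    return (
        # labels are preserved: lab'(mu(v)) = lab(v)
        all(H.lab[mu[v]] == G.lab[v] for v in G.V)
        # edges are preserved: (mu(u), z, mu(v)) is an edge of H
        and all((mu[u], z, mu[v]) in H.E for (u, z, v) in G.E)
        # mu(port_G) is a prefix of port_H
        and tuple(mu[p] for p in G.port) == H.port[: len(G.port)]
    )

G = Graph({1, 2}, {(1, "e", 2)}, {1: "a", 2: "b"}, (1,))
H = Graph({3, 4, 5}, {(3, "e", 4)}, {3: "a", 4: "b", 5: "c"}, (3, 5))
print(is_morphism({1: 3, 2: 4}, G, H))  # prints True
```

Note that H may have more ports than G (here, the extra port 5), in line with the prefix condition in the definition.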
As described in Courcelle and Engelfriet (2012), graphs are examples of (finite) relational structures. Such a structure consists of a finite set of objects (in the case of graphs these are the nodes of the graph4) and a finite set of relations on these objects. Selecting the set of relations to be considered and their arities determines which type of structures we talk about. To be able to view (our type of) graphs as relational structures we need unary relations laba for every node label a ∈ LV, so that laba(v) is true if the node v carries the label a. We also need unary relations porti for all port numbers i to express that a given node is the i-th port. Finally, we need binary relations edgz for all z ∈ LE in order to express that there is an edge labeled z between two nodes, that is, edgz(u, v) is true if there is an edge labeled z from u to v.
The formal definition thus reads as follows:
Let G be a graph. Then we identify G with the finite relational structure (VG, (laba)a∈LV, (edgz)z∈LE, (porti)i∈[type(G)]) such that5
VG is the domain (or universe) of the structure,
for every a ∈ LV, laba is the unary predicate such that laba(u) holds if and only if labG(u) = a,
for every z ∈ LE, edgz is the binary predicate such that edgz(u, v) holds if and only if (u, z, v) ∈ EG, and
for every i ∈ [type(G)], porti is the unary predicate such that porti(u) holds if and only if u = portG(i).
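Under this view, the predicates are directly computable from the graph. The following Python sketch (our own dict-based encoding; 1-based port indices as in the definition) makes the correspondence explicit.

```python
# Hypothetical Python counterparts of the predicates lab_a, edg_z, and
# port_i from the relational-structure view of a graph (a dict here).
def lab_pred(G, a, u):        # lab_a(u): node u carries label a
    return G["lab"][u] == a

def edg_pred(G, z, u, v):     # edg_z(u, v): a z-labeled edge from u to v
    return (u, z, v) in G["E"]

def port_pred(G, i, u):       # port_i(u): u is the i-th port (1-based)
    return i <= len(G["port"]) and G["port"][i - 1] == u

G = {"V": {1, 2}, "E": {(1, "arg0", 2)},
     "lab": {1: "try", 2: "boy"}, "port": (1,)}
print(edg_pred(G, "arg0", 1, 2), port_pred(G, 1, 1))  # prints True True
```

Nothing beyond these relations is available to a formula: this is precisely why logical conditions on graphs can be evaluated compositionally.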
We use predicate logic to express properties of (nodes in) graphs. While, in principle, any logic may be used, we focus on counting monadic second-order (cmso) logic. Thus, formulas can make use of individual and set variables, denoted by (possibly indexed) lowercase letters x, y,… and uppercase letters X, Y, …. In the following, we let Vind and Vset be disjoint countably infinite sets of individual and set variables, respectively. The set CMSO of all cmso formulas expressing graph properties is inductively defined to be the smallest set of formal expressions satisfying the following conditions:
The formulas true and false are in CMSO.
For all x, y ∈ Vind, a ∈ LV, z ∈ LE, and i ∈ ℕ, the formulas x = y, laba(x), edgz(x, y), and porti(x) belong to CMSO.
For all x ∈ Vind and X ∈ Vset, the formula x ∈ X is in CMSO.
For all X ∈ Vset and all r, s ∈ ℕ with r < s, the formula cardr, s(X) is in CMSO.
For all ξ ∈ Vind ∪ Vset, Q ∈ {∀,∃}, and formulas φ, φ′ ∈ CMSO, the formulas (Q ξ.φ), (φ ∧ φ′), and (¬φ) belong to CMSO.
As usual, we can omit parentheses when writing down formulas if there is no danger of confusion.
A cmso formula φ may be denoted by φ(X, x), where X ∈ Vset* and x ∈ Vind*, to express the fact that the free variables occurring in φ are in [X] ∪ [x].6 Given a graph G, an assignment appropriate for a formula φ(X, x) is a mapping asg that assigns a subset of VG to every X ∈ [X] and an element of VG to every x ∈ [x]. Given such an assignment, φ can be evaluated in G in the usual way, where cardr, s(X) (with X ∈ [X]) is satisfied if and only if |asg(X)| ≡ r (mod s). If asg(X) = V and asg(x) = v, we let G ⊧ φ(V, v) denote the statement that φ is satisfied (i.e., evaluates to true) in G under asg. As usual, we can make use of other Boolean connectives such as ∨ and → in formulas since they can be expressed in terms of ∧ and ¬. For example, for cmso formulas φ and φ′, the formula φ → φ′ is equivalent to ¬(φ ∧ ¬φ′), in the sense that the two formulas are satisfied by the same set of assignments, and φ ∨ φ′ is equivalent to ¬(¬φ ∧ ¬φ′).
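Both the Boolean equivalences and the counting atoms are easy to check mechanically. The following Python snippet (ours, purely illustrative) verifies the equivalences on all truth assignments and sketches the semantics of cardr, s.

```python
from itertools import product

# Sanity check of the stated equivalences on all truth assignments:
# phi -> phi' is equivalent to not(phi and not phi'), and
# phi or phi' is equivalent to not(not phi and not phi').
for phi, psi in product([False, True], repeat=2):
    assert ((not phi) or psi) == (not (phi and not psi))
    assert (phi or psi) == (not (not phi and not psi))

# The counting atom card_{r,s}(X) holds iff |asg(X)| = r (mod s).
def card(r, s, X):
    return len(X) % s == r

print(card(0, 2, {1, 2, 3, 4}))  # prints True: an even-cardinality set
```

The counting atoms are what distinguishes cmso from plain mso; for instance, card0, 2(X) expresses that X has even cardinality, which is not mso-expressible in general.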
In Section 5, we will make use of the so-called Backwards Translation theorem for quantifier-free operations. These quantifier-free operations map relational structures to other relational structures, in our case graphs to graphs. The formal definition of these operations takes some getting used to (as so much in the area of formal logic). In the next paragraphs, we thus try to convey the intuition first. However, we would also like to point out that quantifier-free operations and the Backwards Translation theorem are “merely” technical tools we use to formulate the proof of our main result. They are not used in the graph extension grammars themselves. Hence, readers who are just interested in the formalism and the overall parsing strategy may safely skip the remainder of this section.
The idea behind quantifier-free operations, which is a special case of the more general cmso operations, is to use a collection of logical formulas to describe a mapping from one type of relational structure to another. Hence, the input is a relational structure consisting of a set of objects (like the nodes of a graph) and a number of relations on these objects. The output is supposed to be a similar relational structure, though in general it can be a structure involving other relations. Now, we can use logical formulas to define both the domain of the resulting structure and each of its relations in terms of the input structure. For this, we need two things:
a domain formula δ with one free individual variable (the “argument” of the formula) which, when applied to an object in the input structure, determines whether this object is to be an object in the output structure (value true) or not (value false);7
for every relation Ri of the output structure, a formula θRi with βi free variables, where βi is the arity of Ri. If this formula, applied to objects o1,…, oβi of the input structure, yields true, then (o1,…, oβi) is a tuple in the relation Ri of the output structure (provided that all of o1,…, oβi are in its domain, as dictated by the domain formula).
Naturally, both the domain formula and the formulas determining the output relations can make use of the relations of the input structure. The mappings definable in this way are called quantifier-free operations because we shall forbid the formulas to make use of quantifiers.
We are actually only interested in the case where both input and output structures are graphs. Hence, the purpose of δ is to pick the nodes to be included in the output graph, while the formulas θRi define its node labels, edges, and ports.
The following formal definition provides one more element, which for simplicity was left out in the explanation above: A quantifier-free operation can have additional parameters. These parameters are represented by free set variables X1,…, Xℓ in the formulas and are thus to be instantiated by sets of nodes of the input graph when the operation is “called”. Hence, the operation can yield different output graphs for one and the same input graph depending on those parameters. In later parts of the article, these parameters will be used to be able to select certain nodes that are meant to play a distinguished role in the construction formalized by the operation.
Let R1,…, Rk be a list of the relations in Definition 4, where Ri has the arity βi for all i ∈ [k]. Let X1,…, Xℓ ∈ Vset and x, x1, x2,… ∈ Vind be (pairwise distinct) variables. A quantifier-free operation ξ in X1,…, Xℓ is specified by quantifier-free formulas δ, θR1,…, θRk such that
δ is a formula with the free variables X1,…, Xℓ, x, and
every θRi, for i ∈ [k], is a formula in the free variables X1,…, Xℓ, x1,…, xβi,
and we write ξ = (δ, θ1,…, θk). Given a graph G and node sets U1,…,Uℓ ⊆ VG, its image H = ξ(G,U1,…,Uℓ) under ξ is defined as follows:
The nodes of H are all v ∈ VG such that G⊧δ(U1,…,Uℓ, v).
Given nodes v1,…, vβi ∈ VH, the tuple (v1,…, vβi) belongs to the relation Ri of H if and only if G⊧θi(U1,…,Uℓ, v1,…, vβi).
A quantifier-free operation as defined above does not necessarily map graphs to graphs, but to slightly more general relational structures. More precisely, the resulting structures do not necessarily have exactly one port of each kind, as we may have G⊧θporti(U1,…,Uℓ, v1) for any number of nodes v1 ∈ VH (including zero). Hence, the output of a quantifier-free operation may strictly speaking not be a graph. We may disregard this formal inconsistency for two reasons. Firstly, the quantifier-free operations constructed later in this article yield true graphs by construction, meaning that the problem does not occur. Secondly, the results we are going to use (the Backwards Translation theorem and Courcelle’s theorem) both apply not only to graphs but to general relational structures.
Let k ∈ ℕ and let G = (V, E, lab, port) be a graph with edge labels “—” and “- - -”, viewed as a relational structure in the way described in Definition 3. We discuss a quantifier-free operation ξ in the set variables X1, X2, X3, that is, ℓ = 3. We choose the constituent formulas of ξ so that the application of ξ to G with set arguments U1,U2,U3 ⊆ VG has the following effect: The set of nodes is restricted to (U1 ∪U2) ∖U3. All edges that involve a node not in U2 are kept, but switch their label (from “—” to “- - -” or vice versa), and the nodes in U2 form a clique of edges labeled “—”. The output graph has no ports. To achieve this, we define the formulas as follows:
δ(X1, X2, X3, x) = ((x ∈ X1) ∨ (x ∈ X2)) ∧¬(x ∈ X3), expressing that a node (represented by the variable x) belongs to the output graph if and only if it is in (X1 ∪ X2) ∖ X3,
θlaba(X1, X2, X3, x1) = laba(x1) for every node label a, expressing that nodes in the output graph inherit their label from the input graph,
θedg—(X1, X2, X3, x1, x2) = edg- - -(x1, x2) ∨ (x1 ∈ X2 ∧ x2 ∈ X2), expressing that there is an edge labeled “—” between two nodes of the output graph if these two nodes are connected by a “- - -” edge in the input graph or both belong to X2,
θedg- - -(X1, X2, X3, x1, x2) = edg—(x1, x2) ∧¬(x1 ∈ X2 ∧ x2 ∈ X2), expressing that there is an edge labeled “- - -” between two nodes of the output graph if these two nodes are connected by a “—” edge in the input graph and at least one of them does not belong to X2,
θporti(X1, X2, X3, x) = false for every i ∈ [k], expressing that the output graph has no ports.
Figures 2 and 3 show an example pair of input and output graphs, where the parameters corresponding to X1,…, X3 are U1,…,U3, indicated by the green, blue, and orange areas of Figure 2. In these figures, edges labeled “—” and “- - -” are drawn as solid and broken lines, respectively, and we do not indicate edge directions, for simplicity.
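To make the example concrete, here is a small Python sketch (ours, not part of the article) of the operation ξ, with a graph given as a node set, a label map, and a set of labeled edge triples; the strings "solid" and "dashed" stand for the edge labels “—” and “- - -”:

```python
def apply_xi(nodes, labels, edges, U1, U2, U3):
    """Apply ξ(G, U1, U2, U3). Edges are triples (label, v1, v2)."""
    # domain formula δ: keep exactly the nodes in (U1 ∪ U2) \ U3
    V = {v for v in nodes if (v in U1 or v in U2) and v not in U3}
    # θ_lab: node labels are inherited from the input graph
    new_labels = {v: labels[v] for v in V}
    new_edges = set()
    for v1 in V:
        for v2 in V:
            both_in_U2 = v1 in U2 and v2 in U2
            # θ for "solid": a "dashed" input edge, or both endpoints in U2
            if ("dashed", v1, v2) in edges or both_in_U2:
                new_edges.add(("solid", v1, v2))
            # θ for "dashed": a "solid" input edge, not both endpoints in U2
            if ("solid", v1, v2) in edges and not both_in_U2:
                new_edges.add(("dashed", v1, v2))
    # θ_port ≡ false: the output graph has no ports
    return V, new_labels, new_edges
```

Note that, just like the formulas themselves, the sketch places “—” loops on the nodes of U2, since θedg— does not exclude the case x1 = x2.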
We note here that the quantifier-free operations defined in Courcelle and Engelfriet (2012) satisfy ℓ = 0, that is, they do not depend on arguments U1,…,Uℓ. However, the extension to ℓ ≥ 0 will turn out to be technically convenient and is mathematically insignificant because one can alternatively consider relational structures with additional unary predicates U1,…,Uℓ to achieve the same effect in the setting of Courcelle and Engelfriet (2012). A similar remark applies to Theorem 1 below.
Now we can state the Backwards Translation theorem (which is a weak version of a similar theorem for the much more general CMSO-transductions [Courcelle and Engelfriet 2012, Theorem 7.10]). This theorem states that, if we have a quantifier-free operation ξ, any property of ξ(G) that can be expressed by some cmso formula φ can also be expressed as a property of the original graph G by means of a cmso formula φ′ (obtained by, intuitively, “backwards translating” φ along ξ).
For every quantifier-free operation ξ in set variables X1,…, Xℓ and every formula φ(Y1,…, Yk, y1,…, yk′) ∈ CMSO, there is a formula φ′(X1,…, Xℓ, Y1,…, Yk, y1,…, yk′) ∈ CMSO such that the following holds:
Let G be a graph, U1,…,Uℓ ⊆ VG, and H = ξ(G,U1,…,Uℓ). Then we have G⊧φ′(U1,…,Uℓ, V1,…, Vk, v1,…, vk′) if and only if H⊧φ(V1,…, Vk, v1,…, vk′), for all V1,…, Vk ⊆ VH and v1,…, vk′ ∈ VH.
Let us return to the quantifier-free operation ξ of Example 1, and the pair of input and output graphs in Figures 2 and 3. Consider now a formula φ which, say, states that a pair of nodes v1 and v2 given as parameters to this formula (i.e., v1 and v2 correspond to free variables y1 and y2 in the formula) are such that every node that is not in a given set V1 (corresponding to a free set variable Y1 in the formula) is reachable from at least one of v1 and v2 via an undirected path. Reachability is known to be definable in mso logic (and hence in its extension cmso). The property can, for example, be expressed through the predicate Reachable(x, y), which is true if and only if x = y or there exists a set X of nodes such that:
x, y ∈ X,
every z ∈{x, y} has an edge to exactly one element in X ∖{z},
every z ∈ X ∖{x, y} has edges to exactly two elements in X ∖{z}.
It should be straightforward to verify that each of the properties 1–3 is cmso definable.
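As a sanity check, the characterization above can be tested by brute force on small graphs: the following Python sketch (names are ours) enumerates candidate sets X for conditions 1–3 and compares the outcome with ordinary breadth-first reachability:

```python
from itertools import combinations

def neighbours(edges, z):
    """Undirected neighbourhood of z; edges are unordered as pairs (a, b)."""
    return {b for a, b in edges if a == z} | {a for a, b in edges if b == z}

def reachable_mso(nodes, edges, x, y):
    """True iff x = y or some X ⊆ nodes satisfies conditions 1-3."""
    if x == y:
        return True
    for r in range(len(nodes) + 1):
        for X in map(set, combinations(nodes, r)):
            if x not in X or y not in X:          # condition 1
                continue
            ok = all(len(neighbours(edges, z) & (X - {z})) == 1
                     for z in (x, y))             # condition 2
            ok = ok and all(len(neighbours(edges, z) & (X - {z})) == 2
                            for z in X - {x, y})  # condition 3
            if ok:
                return True
    return False

def reachable_bfs(edges, x, y):
    """Plain graph search, for comparison."""
    seen, todo = {x}, [x]
    while todo:
        z = todo.pop()
        for w in neighbours(edges, z) - seen:
            seen.add(w)
            todo.append(w)
    return y in seen
```

The two functions agree because a shortest path between x and y is always an induced path, and its node set satisfies conditions 1–3.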
If we have a graph such as H in Figure 3, a set V1 of nodes in H, and two further nodes v1 and v2 in that graph, φ(V1, v1, v2) expresses the property mentioned above, that is, all nodes except possibly those in V1 are reachable from at least one of v1 and v2. In Figure 3, this is obviously true.
What the Backwards Translation theorem tells us is that we can construct another formula φ′ which, when applied to the input graph G of ξ, checks whether φ would be satisfied in the output graph H. In other words, in order to find out whether ξ applied to G will yield a graph in which φ is satisfied, we do not need H but can instead evaluate the new formula φ′ in G.
Recall, however, that H depends not only on G but also on the additional set parameters; in this particular example, H = ξ(G,U1,U2,U3). Clearly, this means that the formula φ′ will have to take U1,U2,U3 into account. Hence, φ′ must be provided with these as auxiliary parameters, in addition to the original parameters of φ. This is why the backwards translation turns φ(Y1, y1, y2) into φ′(X1, X2, X3, Y1, y1, y2) instead of the simpler form φ′(Y1, y1, y2).
3 Graph Extension Grammars
Graph extension grammars generate graphs by repeated application of two types of graph operations. One type of operation takes the disjoint union of a pair of smaller graphs, in doing so concatenating their port sequences. The other type extends an existing graph with additional structure placed “on top” of that graph. The extension operation uses a template graph with designated nodes, so-called docks. The docks are to be attached to the ports of the argument graph, and the ports of the template become the ports of the combined graph. The template also contains a number of context nodes that can be duplicated (or cloned) and identified with arbitrarily chosen nodes in the argument graph. This is provided that the choice satisfies a given formula whose free variables are the targets of the cloned context nodes.
To formally define the simpler one of these graph operations, disjoint union, let k, k′ ∈ ℕ. Then ⊎kk′ is defined as follows: for graphs G of type k and G′ of type k′, ⊎kk′(G, G′) yields the graph of type k + k′ obtained by making the two graphs disjoint by a suitable renaming of nodes and taking their union. This is defined in the obvious way for the first three components of the argument graphs and concatenates the port sequences of both graphs. Thus, ⊎kk′ is not commutative and only defined up to isomorphism. We usually write G ⊎kk′G′ instead of ⊎kk′(G, G′). To avoid unnecessary technicalities, we shall generally assume that G and G′ are disjoint from the start, and that no renaming of nodes takes place. We extend ⊎kk′ to an operation on sets of graphs in the usual way: for sets 𝒢, 𝒢′ of graphs, 𝒢 ⊎kk′ 𝒢′ = {G ⊎kk′ G′ ∣ G ∈ 𝒢, G′ ∈ 𝒢′}.
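In a set-based representation, the operation can be sketched as follows; this is a minimal Python illustration (ours), assuming, as in the text, that the node sets of the argument graphs are already disjoint:

```python
def union(G, Gp):
    """Disjoint union of two graphs with concatenated port sequences."""
    assert not (G["nodes"] & Gp["nodes"]), "argument graphs must be disjoint"
    return {
        "nodes": G["nodes"] | Gp["nodes"],
        "edges": G["edges"] | Gp["edges"],
        "lab":   {**G["lab"], **Gp["lab"]},
        "port":  G["port"] + Gp["port"],  # concatenation: not commutative
    }
```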
To introduce the second type of graph operation, we first define cloning. The purpose of cloning is to make the expansion operations (to be defined afterwards) more powerful by allowing them to attach edges to an arbitrary number of nodes in the argument graph. Cloning of nodes was originally introduced to formalize the structure of object-oriented programs (Drewes et al. 2010), and later adopted in computational linguistics (Björklund, Drewes, and Ericson 2019).
To define cloning, consider a graph G = (V, E, lab, port) and let C ⊆ V ∖ [port]. Then cloneC(G) is the set of all graphs obtained from G by replacing each of the nodes in C and their incident edges by an arbitrary number of copies. Figure 4 illustrates this construction. Formally, G′ = (V′, E′, lab′, port′) ∈cloneC(G) if there is a family (CLv)v∈V of pairwise disjoint sets CLv of nodes, such that the following hold:
CLv = {v} for all v ∈ V ∖ C,
V′ = ⋃v∈V CLv and port′ = port,
E′ is the set of all edges obtained from edges e ∈ E by replacing src(e) and tar(e) with nodes in CLsrc(e) and CLtar(e), respectively, in all possible combinations, and
for all v ∈ V and v′ ∈CLv, lab′(v′) = lab(v).
Note that cloning does not rename nodes in V ∖ C. In the following, we shall continue to denote by CLv (v ∈ V) the set of clones of v in a graph belonging to cloneC(G).
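The construction can be illustrated by the following Python sketch (ours), which computes the member of cloneC(G) corresponding to one fixed choice of multiplicities; the representation of clones as pairs (v, i) is an assumption made here for readability:

```python
def clone(G, C, mult):
    """Return the member of clone_C(G) with mult[v] copies of each v in C."""
    # CL_v is a singleton {v} for v outside C, and mult[v] fresh copies otherwise
    CL = {v: [(v, i) for i in range(mult[v])] if v in C else [v]
          for v in G["nodes"]}
    nodes = {u for v in G["nodes"] for u in CL[v]}
    # each edge is replaced by one copy per pair of clones of its endpoints
    edges = {(s2, lbl, t2)
             for (s, lbl, t) in G["edges"]
             for s2 in CL[s] for t2 in CL[t]}
    lab = {u: G["lab"][v] for v in G["nodes"] for u in CL[v]}
    # C contains no ports, so the port sequence is unaffected
    return {"nodes": nodes, "edges": edges, "lab": lab, "port": G["port"]}
```

Choosing mult[v] = 0 deletes v together with its incident edges, and mult[v] = 1 leaves it (up to renaming) untouched.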
An expansion operation Φ consists of components VΦ, EΦ, labΦ, portΦ, dockΦ, and φΦ, where (VΦ, EΦ, labΦ, portΦ) and (VΦ, EΦ, labΦ, dockΦ) are graphs, the former being referred to as the underlying graph of Φ, and
φΦ ∈CMSO is a formula in which every free variable is a set variable Xv for some v ∈ VΦ ∖ ([portΦ] ∪ [dockΦ]).
For notational convenience, we define two abbreviations:
CΦ = VΦ ∖ ([portΦ] ∪ [dockΦ]) is the set of context nodes and
NEWΦ = [portΦ] ∖ [dockΦ] is the set of new nodes.
We assume in the following that the set CΦ is implicitly ordered, so that we can view it as a sequence whenever convenient. Thus, if CΦ = {v1,…, vk}, the last component of Φ has the form φΦ(Xv1,…, Xvk), with one set variable Xvi for every context node vi.
Throughout the rest of this article, we shall continue to denote the components of an expansion operation Φ by VΦ, EΦ, labΦ, portΦ, dockΦ, CΦ, and φΦ.
When applied to a graph, Φ adds (copies of the) nodes in NEWΦ to the graph while those in [dockΦ] and CΦ are references to ports and nondeterministically chosen nodes in the input graph, respectively.
Formally, the application of Φ to an argument graph G = (V, E, lab, port) of type ℓ is possible if |dockΦ| = ℓ. It yields a graph of type |portΦ| by fusing the nodes in dockΦ with those in port. Moreover, clones of the context nodes are fused with arbitrary nodes in V, inheriting the labels from G. Thus, the application of the operation to G clones the nodes in CΦ, fuses those in dockΦ with those in port, and fuses all nodes in CLv, for each v ∈ CΦ, injectively with nodes in V, provided that φΦ is satisfied under the assignment given by the mapping of cloned context nodes to nodes in V. The port sequence of the resulting graph is portΦ.
Formally, let |portΦ| = k and |dockΦ| = ℓ. Then Φ is interpreted as the nondeterministic operation defined as follows. For a graph G of type ℓ, a graph H of type k is in Φ(G) if it can be obtained by the following steps:
Choose a graph G′ ∈ cloneCΦ(GΦ), where GΦ denotes the underlying graph of Φ, and a morphing μ of G′ to a graph (V′, E′, lab′, port′) such that the following hold:
- (a)
μ(NEWΦ) ∩ V = ∅,
- (b)
μ(dockΦ) = port,
- (c)
for all nodes v ∈ CΦ, it holds that μ(CLv) ⊆ V, and
- (d)
G⊧φΦ(μ(CLv1),…, μ(CLvk)), where CΦ = {v1,…, vk} (viewed as a sequence).
Define H = (V ∪ V′, E ∪ E′, lab ⊔lab′, port′).
We observe that by the definition of the priority union ⊔, the labels of nodes not belonging to NEWΦ are disregarded. Hence, the labels of nodes that are fused with context nodes or are ports in G are determined by lab.8 This means that the labels of nodes not in NEWΦ can be dropped when specifying Φ, essentially regarding these nodes as unlabeled ones.
We now specialize expansion operations to extension operations by placing conditions on their structure. The intuition is to make sure that graphs are built bottom–up, that is, that Φ always extends the input graph by placing nodes and edges “on top”, with edges being directed downwards, and in such a way that all nodes of the argument graph are reachable from the ports. For this purpose, edges must point from new to “old” nodes, and all nodes in [dockΦ] which are not in [portΦ] (i.e., intuitively, ports in the argument graph that are “forgotten”) must have an incoming edge. Formally,
- (R1)
src(EΦ) ⊆ NEWΦ, and
- (R2)
[dockΦ] ∖ [portΦ] ⊆ tar(EΦ).
Since edges introduced by an extension operation, owing to (R1), can only be directed from new nodes (which by definition of NEWΦ must be ports) to nodes in the argument graph, it follows in particular that all graphs constructed from the empty graph with the help of union and extension operations are directed acyclic graphs (DAGs). By a straightforward induction, (R2) ensures that every node in a graph constructed in this way is reachable from a port.
A graph extension algebra is a Σ-algebra where every symbol in Σ is interpreted as an extension operation, a union operation, or the set {ϕ}, where ϕ is the empty graph (∅, ∅, ∅, ε). Note that the operations of the algebra act on sets of graphs rather than on single graphs. This is necessary because of the nondeterministic nature of extension operations. It also takes care of the fact that operations are only defined on graphs of the right type: By convention, the application of an operation to a graph of an inappropriate type yields the empty set of results. This relieves us from having to deal with typed algebras.
For notational simplicity, we shall assume that, in a graph extension grammar as above, every f ∈ Σ is interpreted as itself, that is, we use the operations themselves as symbols in Σ.
Before entering on the topic of parsing, let us pause to consider a concrete example in the setting of natural language processing.
In the context of this example, the concepts girl and boy represent entities that can act as agents (which verbs cannot), try and persuade represent verbs that require structural control, and want and believe represent verbs where this is not needed.
First, we take a top–down perspective to understand the tree generation. The initial nonterminal is S. The base case for S is the generation of an extension operation that creates a single node representing a girl or a boy concept. In all other cases, S generates an extension operation that adds a verb and its outgoing edges, and in which the verb is the single port. The nonterminal C has the same function as S with the following two differences: It has no corresponding base case, and the ports of the extension operations mark both the agent of the verb (designated by an arg0 edge) and the verb itself. The nonterminal S′ can only generate extensions that create girl or boy nodes. As we shall see, this makes sure that arg0 edges always point to persons. Finally, there is a pair of nonterminals U and U′ that both create union operations, the former with two resulting ports, and the latter with three.
Now, we take a bottom–up perspective to see how the extension operations of a tree are evaluated. Unless the tree consists of a single node, we will have several extensions generated by S and S′ that create girl and boy nodes as the leaves of the tree. In this case, we can apply one or more union operations to concatenate their port sequences and make them visible to further operations. The construction ensures that an arg0 edge can only have a person, that is, a valid agent, as its target. After the application of a union operation, it is possible to apply any extension that is applicable to U (or U′), meaning that none of the non-port nodes are contextual nodes. The resulting graph can have one or two ports—if the graph has one port, the applied extension operation was generated by S, and if the graph has two ports, the applied operation was generated by C. In other words: C signals that the graph is ready for the addition of a control verb, and S that the graph is a valid semantic graph with one port. When sufficiently many nodes have been generated, it is no longer necessary (but still possible) to use the union operations. Instead, we can add incoming edges to already generated nodes contextually (unless control is explicitly needed, as is the case for control verbs). The generated graphs can therefore contain both structural and non-structural dependencies. See Figure 8 for an example of a tree generated by the grammar in Figure 7, and its evaluation into a semantic graph.
While cloning is explicitly disabled in the extension operations above (by requiring that context nodes are cloned exactly once), in general cloning may play a central role for NLP applications, because it enables concepts to refer to an unbounded number of “arguments”. (Note that in the edge-agnostic case, i.e., if the formulas φΦ of the extension operations are not allowed to make use of the predicates edga, the mapping of cloned nodes to nodes in the actual graph is made solely based on labels and ports. Similarly, the decision about how many clones to create cannot depend on the presence or absence of edges. We may, for instance, say “create at least (or at most) k clones if there is both an a- and a b-labeled node”, but not “create a clone for every node which is the target of a z-labeled edge and map it to that node”.)
Cloning can be used if we, for example, want to express the situation that an entire group of people is persuaded by another group to believe something or someone, thus allowing several agents as targets of any of the argument edges. For this, we can use operations such as those in Figure 9 (where an extension operation Φ is depicted as a pair consisting of the graph with docks indicated as before, and the formula φΦ). Here, the formula φ2 plays a vital role. The formula φ1 only requires that the believers (targets of arg0-edges) are persons (and that there is at least one). While φ2 incorporates the same requirement for both persuaders and persuaded persons, it additionally requires that the persuaded ones (represented by nodes in Xv) are exactly those who are also reachable via arg0-edges from the port of the argument graph, that is, the believers. It may be instructive to note that the corresponding structural requirement was expressed by “remembering” nodes as ports in the rules of Figure 7. This is no longer possible here (recall that port1(x) picks the first port), because the set of nodes to be remembered is not a priori bounded in size. Hence, we have to express the desired coordination by means of the edgarg0-predicate. With this addition, the example is no longer edge-agnostic in the sense to be defined in Definition 7, and thus the result we are going to prove in Section 5 does not apply to it anymore. However, the property expressed by φ2 is a local one in the sense to be defined in Section 6, and hence our main result (Theorem 5) does indeed cover it.
Readers familiar with HRGs or Courcelle’s hyperedge replacement algebras (Courcelle 1991) may have noticed that productions using union and extension operations (disregarding context nodes and the effect of the component φΦ) correspond to hyperedge replacement productions of two types:
- (a)
A → B ⊎kℓC corresponds to a production with two hyperedges in the right-hand side, labeled B and C, where the first one is attached to k of the nodes to which the nonterminal in the left-hand side (which is labeled A) is attached, and the second is attached to the remaining ℓ nodes of the left-hand side. In particular, there are no further nodes in the right-hand side.
- (b)
A → Φ[B] corresponds to a hyperedge replacement production with a single nonterminal hyperedge labeled B in its right-hand side. Such productions are the ones responsible for actually generating new terminal items (nodes and edges).
Generalized extension operations of arity greater than one can be constructed by defining so-called derived operations through composition of suitable extension and union operations. Given a set of such derived operations, the tree-to-tree mapping that replaces each occurrence of a derived operation by its definition is a tree homomorphism. As regular tree languages are closed under tree homomorphisms, this shows that the restriction to binary unions and unary extensions is no limitation (in contrast to requirements (R1) and (R2), which help reduce the complexity of parsing).
We also remark here that it is not a restriction that all new nodes introduced in an extension operation are ports. This is because we can add a non-port to a graph G by evaluating Φ(Φ′(G)) where Φ′ introduces the desired node as a port, say port i, and Φ “forgets” port i, that is, dockΦ(i)∉[portΦ]. (Note, however, that (R2) then requires dockΦ(i) to have an incoming edge from one of the ports of Φ, making sure that the node introduced in this way is reachable from a port in Φ(Φ′(G)).)
4 NP-Completeness
Before turning to graph extension grammars for which parsing can be implemented to run in polynomial time, we confirm in this section that restrictions are required to accomplish this (unless P = NP), as the problem is NP-complete in general. Readers who are not interested in the proof may either skip this section or read it for the sake of seeing another example of a graph extension grammar.
For all graph extension grammars Γ, it holds that L(Γ) is in NP. Furthermore, there exist graph extension grammars Γ such that L(Γ) is NP-complete.
As an immediate consequence of the parsing algorithm to be presented in Section 5, it holds that L(Γ) ∈NP for all graph extension grammars Γ. This is because nondeterminism can be used to simply “guess” an appropriate matching in line 21 of Algorithm 1. The verification that such a guessed matching is indeed one can easily be implemented to run in polynomial time.
It remains to find a graph extension grammar that generates an NP-complete graph language. We do this by presenting a graph extension grammar Γ whose generated graphs represent satisfiable formulas in propositional logic. The grammar consists of three parts. Taking a bottom–up view, these three parts serve the following purposes:
The first part, corresponding to the rules applied furthest down in the derivation tree, generates trees with node labels in {¬,∨,∧, var}, where every node labeled ¬ has one child, those labeled ∨ and ∧ have two children, and nodes labeled var are leaves. These trees thus represent formulas in the usual way, where every node labeled var stands for an occurrence of an unnamed variable.
The second part introduces a chain of =-nodes on top of the root of the formula. From each of these nodes, any number of edges will point to some of the (nodes representing) variable occurrences. Two occurrences of variables will be considered to be occurrences of the same variable if and only if they are both targets of edges originating from the same =-labeled node.
The third part of the grammar, corresponding to the topmost section of the derivation tree, consists of only one rule that “guesses” a satisfying truth assignment and uses its cmso formula to check that this truth assignment is indeed satisfying.
To simplify the grammar and make the rules more readable, we occasionally use rules whose right-hand sides are arbitrary trees over the given operations and nonterminals, rather than sticking to rules of the form A → f[A1,…, Ak] as in Definition 1. In a straightforward way, these rules can be decomposed into rules of the form A → f[A1,…, Ak] by introducing additional nonterminals.
The first part of the grammar is shown in Figure 10.
The rules of the second part of the grammar are shown in Figure 11. The cmso formula makes sure that edges from the generated nodes all point to variable occurrences.
Finally, the rule that ensures that the generated formula is satisfiable is shown in Figure 12. Its left-hand side S is the initial nonterminal of the grammar. The conjuncts of the formula express that every node representing a subformula has precisely one eq-edge pointing to it (line 1, where we make use of the customary abbreviation ∃! for ‘there exists exactly one’), every node representing a variable or logical operator is assigned either true or false (line 2), different occurrences of the same variable are assigned the same truth value (line 3), the truth assignment is compatible with the definition of the logical operators (lines 4–6), and the truth value assigned to the root node of the formula is true (line 7). It should be clear that the formula represented by a graph G that can be generated from E is satisfiable if and only if Φsat(G)≠∅, which is the case if and only if L(Γ) contains the graph G′ obtained from G by adding a node labeled root with a chain-labeled edge pointing to the port of G and making that new node the (unique) port of G′. (Note that this is the effect of applying Φsat to G if the formula is satisfied for some choice of XT and XF.) Moreover, given a propositional formula it is easy to construct the corresponding graph G′, that is, there is a logspace reduction of the NP-complete satisfiability problem for propositional formulas to L(Γ).
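The reduction direction mentioned at the end of the proof can be sketched in Python as follows. The fragment is ours, and the edge labels "arg", "eq", and "chain" are assumptions (the actual labels appear only in the figures); it turns a propositional formula into a graph of the shape described above, with a tree over {¬,∨,∧, var}, one =-node per variable, and a root node on top:

```python
from itertools import count

def formula_to_graph(ast):
    """ast: ("var", name) | ("not", t) | ("or", t1, t2) | ("and", t1, t2)."""
    counter = count()
    nodes, edges, occ = {}, set(), {}

    def build(t):
        v = next(counter)
        if t[0] == "var":
            nodes[v] = "var"                       # unnamed variable occurrence
            occ.setdefault(t[1], []).append(v)
        else:
            nodes[v] = {"not": "¬", "or": "∨", "and": "∧"}[t[0]]
            for child in t[1:]:
                edges.add((v, "arg", build(child)))
        return v

    top = build(ast)
    for name in occ:                               # one "="-node per variable
        eq = next(counter)
        nodes[eq] = "="
        edges.add((eq, "chain", top))              # chain of "="-nodes on top
        for o in occ[name]:                        # link equal occurrences
            edges.add((eq, "eq", o))
        top = eq
    r = next(counter)                              # the root node of G'
    nodes[r] = "root"
    edges.add((r, "chain", top))
    return nodes, edges, r
```

Since the construction is a single pass over the formula, it is easily seen to be computable in logarithmic space.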
We note that the validity of the preceding proof does not depend on the use of isolated context nodes in Φsat: The proof remains valid if we add (equally labeled) edges from the port to each of these context nodes. Then every node corresponding to a subformula would receive exactly one incoming edge from the port, regardless of the truth assignment hidden in the choice of XT and XF, and adding those edges to the output graph G′ of the reduction would again result in a logspace reduction to L(Γ).
5 Parsing for Edge-Agnostic Graph Extension Grammars
We now provide a blueprint of a parsing algorithm for graph extension grammars, which we afterwards instantiate to a polynomial time parsing algorithm for a special case of graph extension grammars, the so-called edge-agnostic ones. In the next section, we will discuss how this restriction can partially be lifted.
Throughout this section, let Γ = (g, 𝒜) be a graph extension grammar, where g = (N,Σ, P, S) and 𝒜 is a graph extension Σ-algebra.
The goal is to decide the membership problem for L(Γ) and, in the positive case, to produce a corresponding derivation tree t ∈ L(g) such that the given graph G belongs to the set of graphs t evaluates to. In fact, by the recursive structure of the proposed algorithm, it will be obvious how to obtain t. Hence, we focus on the membership problem, which is formally defined as follows:
- Input:
A graph G
- Question:
Does it hold that G ∈ L(Γ)?
By definition, extension operations keep the identities of nodes in the input graph unchanged, whereas new nodes added to the graph may be given arbitrary identities (as long as clashes with nodes in the input graph are avoided). For the union operation, it holds that renaming of nodes is necessary only if the node sets of the two argument graphs intersect. As a consequence, when evaluating a tree t, we can without loss of generality assume that operations never change node identities of argument graphs. In other words, if t evaluates to a set of graphs containing G, then for every α ∈ addr(t), there is a (concrete) graph Gα such that
if t/α = Φ[t/α1], then Gα ∈ Φ(Gα1) and
if t/α = t/α1 ⊎kℓt/α2, we have Gα = (V1 ∪ V2, E1 ∪ E2, lab1 ⊔lab2, port1port2) where Gαi = (Vi, Ei, labi, porti) for i = 1,2,
and Gε = G.
Thus, for every α ∈ addr(t), the graph Gα is a subgraph of G. More precisely, if G = (V, E, lab, port) and Gα = (V′, E′, lab′, port′) then V′ ⊆ V, E′ ⊆ E, and lab′ = lab|V′. Note that, while the graphs Gα are usually not uniquely determined by G and t, the important fact is that they do exist if and only if t evaluates to a set of graphs containing G. We say in the following that the family (Gα)α∈addr(t) is a concrete evaluation of t into G. Hence, the membership problem amounts to deciding, for the input graph G, whether there is a tree t ∈ L(g) that permits a concrete evaluation into G.
The following lemma, which forms the basis of our parsing algorithm, shows that Gα = (V′, E′, lab′, port′) is determined by port′ alone, as it is the subgraph of G induced by the nodes that are reachable from port′. More precisely, consider a sequence p of nodes of G and let Vp be the set of nodes reachable in G by directed paths from (any of the nodes in) p.9 Now, we define the subgraph of G induced by p to be the graph (Vp, Ep, lab|Vp, p), where Ep = {e ∈ E ∣ src(e) ∈ Vp}.
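The construction of this induced subgraph can be sketched in Python as follows (an illustration of ours, with a graph given as a dictionary of nodes, labeled edge triples (src, label, tar), a label map, and a port sequence):

```python
def reachable_subgraph(G, p):
    """Subgraph of G induced by the nodes reachable from the port sequence p."""
    seen, todo = set(p), list(p)
    while todo:                           # forward search along directed edges
        v = todo.pop()
        for (s, lbl, t) in G["edges"]:
            if s == v and t not in seen:
                seen.add(t)
                todo.append(t)
    return {
        "nodes": seen,
        # all edges with a reachable source; their targets are reachable too
        "edges": {(s, l, t) for (s, l, t) in G["edges"] if s in seen},
        "lab":   {v: G["lab"][v] for v in seen},
        "port":  p,
    }
```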
If (Gα)α∈addr(t) is a concrete evaluation of a tree t ∈TΣ into a graph G, then, for all α ∈ addr(t), Gα is the subgraph of G induced by the nodes reachable from portα, with port sequence portα.
Let Gα = (Vα, Eα, labα, portα) for every α ∈ addr(t). We first show the following claim:
We have {e ∈ E∣src(e) ∈ Vα}⊆ Eα, and every node of G that is reachable from [portα] belongs to Vα.
We prove Claim 1 by induction on the length of α. (Thus, the induction proceeds top-down rather than bottom-up.) For α = ε, the statement holds trivially. Now, assume that it holds for some α ∈ addr(t). Since the empty graph ϕ has neither ports nor edges, two relevant cases remain.
Gα is of the form Gα1 ⊎kℓGα2.
Let {i, j} = [2]. We have to show that the statement holds for Gαi. Thus, assume first that src(e) ∈ Vαi. Then src(e) ∈ Vα and thus, by the induction hypothesis, e ∈ Eα. By the definition of ⊎kℓ this means that e ∈ Eαi. Furthermore, if a node v is reachable from [portαi], then it is reachable from [portα] and thus, again by the induction hypothesis, v ∈ Vα. Hence, it remains to argue that no node in Gαj can be reached from portαi. This follows readily from the just established fact that there is no edge e′ ∈ E ∖ Eαi for which src(e′) ∈ Vαi (together with the fact that Vαi ∩ Vαj = ∅).
Gα is of the form Φ(Gα1) for an extension operation Φ.
To show that the statement holds for Gα1, let Gα = (Vα1 ∪ V′, Eα1 ∪ E′, labα1 ⊔lab′, port′), where (V′, E′, lab′, port′) is obtained as in the definition of Φ(Gα1). By the induction hypothesis, src(e) ∈ Vα implies e ∈ Eα. Thus, src(e) ∈ Vα1 implies e ∈ Eα1 unless e ∈ E′. However, by requirement (R1), all e ∈ E′ satisfy src(e) ∈ [port′] ∖ [port], which implies src(e)∉Vα1 because Vα ∖ Vα1 = [port′] ∖ [port]. From this it follows that e ∈ Eα1.
Now, let v be a node reachable from [portα1]. If v∉Vα1, consider the first edge e on a path from a node in [portα1] to v such that tar(e)∉Vα1. Then src(e) ∈ Vα1 but e∉Eα1, contradicting the previously established fact that src(e) ∈ Vα1 implies e ∈ Eα1.
This finishes the proof of Claim 1. Next, we prove the converse inclusion of the second part of Claim 1:
Every node in Vα is reachable from [portα].
This time, we proceed bottom–up, by induction on the size of t/α. The statement is trivially true for Gα = ϕ. Thus, as before, there are two cases to distinguish.
Gα is of the form Gα1 ⊎kℓGα2.
By the induction hypothesis, Claim 2 holds for Gα1 and Gα2. Consequently, every node in Vα = Vα1 ∪ Vα2 is reachable from [portα] = [portα1] ∪ [portα2].
Gα is of the form Φ(Gα1) for an extension operation Φ.
Again, let Gα = (Vα1 ∪ V′, Eα1 ∪ E′, labα1 ⊔lab′, port′), where (V′, E′, lab′, port′) is obtained as in the definition of Φ(Gα1). (In particular, port′ = portα.) By the induction hypothesis, every node in Vα1 is reachable from [portα1]. Moreover, by requirement (R2), for every node v ∈ [portα1] it either holds that v ∈ [portα], or there are u ∈ [portα] and e ∈ E′ with src(e) = u and tar(e) = v. Hence, every node in [portα1] is reachable from [portα], and thus so is every node in Vα1. Since, furthermore, V′∖ Vα1 ⊆ [portα], this shows that every node in Vα is reachable from [portα].
Together, Claims 1 and 2 state that Vα contains exactly the nodes reachable from portα (by Claim 2 and the second part of Claim 1), and also all edges originating at those nodes (by the first part of Claim 1). This finishes the proof of the lemma.
We note the following immediate consequence of Lemma 1:
Let (Gα)α∈addr(t) be a concrete evaluation of a tree t ∈TΣ into a graph G. If portα = ε for some α ∈ addr(t), then Gα = ϕ.
In Lemma 1, we find the beginnings of a recursive parsing algorithm: In the following, consider an input graph G = (V, E, lab, port), a nonterminal A ∈ N, and a sequence p of nodes of G. To decide whether the subgraph of G induced by p can be derived from A (and thus whether G ∈ L(Γ) if A = S and p = portG), we need to consider three cases, the first two of which are straightforward.
If p = ε, then the subgraph in question is the empty graph ϕ, and the answer is yes if and only if the production A → ϕ is in P.
Otherwise, we need to check each production with the left-hand side A according to Cases 2 and 3, as follows:
If the production is of the form A → B ⊎k1k2 C, let p = p1p2 where |pi| = ki for i = 1,2. If the sets of nodes reachable from p1 and from p2 intersect, then this production cannot yield the subgraph induced by p, because ⊎k1k2 forms a disjoint union. If the two sets are disjoint, we recursively need to determine whether the subgraph induced by p1 can be derived from B and the subgraph induced by p2 can be derived from C, because in that case the subgraph induced by p is the disjoint union of these two subgraphs.
If the production is of the form A → Φ(B), assume first for simplicity that no cloning takes place (i.e., every node in CΦ is cloned precisely once). The basic intuition is that we need to check all structure-preserving mappings of the underlying graph of Φ to the subgraph of G induced by p that map portΦ to p. Such a mapping determines in particular the image of [dockΦ] in G, say p1, and similarly it defines the images v1,…, vm ∈ V of the context nodes u1,…, um of Φ. Hence, we now have to check whether φΦ({v1},…,{vm}) is satisfied and, recursively, whether the subgraph induced by p1 can be derived from B.
However, this disregards the fact that nodes in CΦ can be cloned. To deal with this additional difficulty, we define the notion of a matching of Φ to . Then, based on Lemma 1, the parsing strategy in this third case is to check whether there exists a matching m of Φ to such that, recursively, . Thus, for every such sequence m(dockΦ), the algorithm recursively invokes a procedure Parse_rec(B, m(dockΦ)) (which is outlined in the upcoming Algorithm 1) and returns yes if one of the recursive calls does, and no otherwise.
To define the notion of matchings, consider a graph extension operation Φ with |portΦ| = k, as well as a directed acyclic graph H with type(H) = k (i.e., here H takes the rôle of above). We need to determine all possible such that . To see how this can be done, let VNEW = {portH(i)∣i ∈ [k]and portΦ(i)∉[dockΦ]}. In other words, VNEW is the set of nodes of H which are images of nodes in NEWΦ. Recall that, by (R1), the nodes in NEWΦ are the only nodes in Φ that may have edges to other nodes in Φ. Thus, the edge set ENEW = {e ∈ EH∣src(e) ∈ VNEW} contains precisely the edges of H which are images of edges of Φ. Hence, we have to find a suitable mapping of the nodes in VΦ to subsets of . We say that a function m: VΦ → ℘(V1) is a matching of Φ to H if the following conditions are satisfied:
- (M1)
The restriction of H to V1 and ENEW is a morph of a graph in by some morphing μ which satisfies the conditions |μ(CLu)| = 1 for all u ∈dockΦ and μ(CLv) = m(v) for all v ∈ VΦ.
(In particular, |m(v)| = 1 for all v ∈ [portΦ] ∪ [dockΦ]. Therefore, we shall in the following view the restriction of m to [portΦ] ∪ [dockΦ] as a function from nodes to nodes rather than to sets of nodes, see, for example, (M2) and the left-hand side of (M3) below.)
- (M2)
In H, every node in VH ∖ [portH] is reachable from (a node in) m(dockΦ).
- (M3)
, where {u1,…, un} = CΦ.
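Of the three conditions, (M2) is a plain reachability requirement and can be checked with a breadth-first search. The sketch below, over a hypothetical adjacency-list encoding of H (all names are illustrative and not taken from the article), tests whether every non-port node is reachable from the image of dockΦ:

```python
from collections import deque

def satisfies_m2(edges, nodes, ports, dock_image):
    """Check (M2): every node of H outside [port_H] is reachable
    along directed edges from some node in m(dock_Phi).
    Illustrative encoding: nodes/ports are sets, edges are pairs."""
    adj = {v: [] for v in nodes}
    for src, tar in edges:
        adj[src].append(tar)
    # standard BFS from all nodes in m(dock_Phi)
    seen = set(dock_image)
    queue = deque(dock_image)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return all(v in seen for v in nodes if v not in ports)
```

Since H is a directed acyclic graph of bounded size relative to the input, this check is linear in the number of edges of H.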
The pseudocode of the algorithm is shown in Algorithm 1. In the code, we use the statement “return result[A, p] ← true” to denote memoisation: the variable result[A, p] gets the Boolean value true assigned to it and then that same value is returned by the procedure. Obviously, the condition in line 21 of the algorithm needs to be made more concrete, that is, we have to find (efficient) ways to check whether m exists. However, let us first postpone this question and show that the algorithm is correct.
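The control structure just described can be sketched as a memoised recursive procedure. The following Python skeleton is only illustrative: the grammar encoding and the helper `find_matchings` (which stands in for the matching enumeration of line 21) are assumed interfaces, not part of the article.

```python
def make_parser(productions, find_matchings):
    """Schematic skeleton of the memoised parser (not the authors' code).

    `productions[A]` lists right-hand sides, each tagged as
    ('empty',), ('union', B1, k1, B2, k2), or ('ext', Phi, B);
    `find_matchings(Phi, p)` enumerates candidate sequences m(dock_Phi).
    """
    result = {}  # memoisation table over pairs (A, p)

    def parse_rec(A, p):
        if (A, p) in result:
            return result[(A, p)]
        # mark the pair as in progress; mirrors the termination guard
        # (finitely many pairs (A, p)) discussed in the proof
        result[(A, p)] = False
        for rhs in productions.get(A, []):
            if rhs[0] == 'empty' and p == ():
                result[(A, p)] = True          # case 1: rule A -> empty graph
            elif rhs[0] == 'union':
                _, B1, k1, B2, k2 = rhs        # case 2: split p = p1 p2
                if len(p) == k1 + k2 and \
                        parse_rec(B1, p[:k1]) and parse_rec(B2, p[k1:]):
                    result[(A, p)] = True
            elif rhs[0] == 'ext':
                _, Phi, B = rhs                # case 3: try all matchings
                for dock_image in find_matchings(Phi, p):
                    if parse_rec(B, tuple(dock_image)):
                        result[(A, p)] = True
                        break
            if result[(A, p)]:
                break
        return result[(A, p)]

    return parse_rec
```

As in the article, the expensive part is hidden in `find_matchings`; everything else is bookkeeping over the memoisation table.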
Algorithm 1 decides whether the input graph G is an element of L(Γ).
By Lemma 1, it suffices to show that Parse_rec(A, p), for all , returns true if for a tree t such that , and false otherwise. By line 10, termination is guaranteed since there are only finitely many pairs (A, p). Hence, it remains to be shown that Parse_rec(A, p) returns true if and only if for a tree t such that .
We first show that Parse_rec(A, p) returns true if there exists a tree t ∈ LA(g) such that . We prove this by induction on the size of a smallest tree t ∈ LA(g) such that . The derivation of t by g can have one of three forms, depending on whether the root of t is ϕ, a union operation, or an extension operation.
If t = ϕ, then P contains the rule A → ϕ, so the algorithm returns true in line 12.
If with for i = 1,2, then where for i = 1,2. By the definition of , if we set p = p1p2 with |pi| = ki for i = 1,2, then and . Since both t1 and t2 are smaller than t, the induction hypothesis yields that Parse_rec(Bi, pi) returns true for i = 1,2, which means that Parse_rec(A, p) returns true in line 17.
We have thus finished the first direction of the proof. It remains to be shown that Parse_rec(A, p) returns true only if there exists a tree t ∈ LA(g) such that . We proceed by induction on the number of recursive calls of Parse_rec.
Since if p = ε, the assertion is true for the return statement in line 12.
If the return statement in line 17 is reached, then it follows from the condition that . We also know from the induction hypothesis that, for i = 1,2, there are trees such that . Hence, is a tree in such that .
Finally, assume that the return statement in the third case, the extension case, is reached. Let A → Φ(B) be the rule considered and m the matching whose existence is guaranteed by the condition in line 23. Again, the induction hypothesis applies, this time stating that there is a tree t1 ∈ LB(g) such that , where p1 = m(dockΦ). Let . By (M1), m determines a clone of and a morphing μ of G′ to a graph (V′, E′, lab′, port′) such that . Using (M3), this implies that and thus, by the definition of , that because . This shows that and finishes the proof of the theorem.
The remainder of the article will be devoted to the question of how line 21 can be concretized. In this section, this will first result in a polynomial-time algorithm for the special case of edge-agnostic graph extension grammars (to be defined formally in Definition 7). With this as a basis, the next section will generalize the result to local graph extension grammars.
For a fixed grammar, thanks to memoisation Parse_rec will be called at most O(n^{wd(Γ)}) times, where n is the number of nodes of the input graph. This is because there are only a constant number of possible choices for the first parameter and n^{wd(Γ)} for the second. Hence, the total running time is O(n^{wd(Γ)}) times the maximal number of computation steps it takes to execute the body of the procedure (not counting recursive calls). This, in turn, is dominated by the time it takes to enumerate the matchings m of Φ to (line 21). For this, note that there is no need for line 21 to explicitly enumerate all matchings m of Φ to because the loop body only depends on (the nonterminal B and) m(dockΦ). In other words, the loop can be replaced with
Hence, the question that remains is how to implement line 22’, that is, to decide for given whether there exists a matching m of Φ to with m(dockΦ) = d. For this, we will now show that one can construct a cmso formula that checks whether the mapping of dockΦ to d can be extended to a matching of Φ to . (This makes it possible to use Courcelle’s theorem to come up with an efficient algorithm for the edge-agnostic case of graph extension grammars later on.)
To simplify our reasoning, we make use of the Backwards Translation theorem for quantifier-free operations (Courcelle and Engelfriet 2012, Theorem 5.47), by means of the following lemma.
For every extension operation Φ, there is a quantifier-free operation ξ in ℓ = |dockΦ| variables such that the following holds. Let H be a graph, and let m: VΦ → ℘(V1) be a mapping that satisfies (M1) and (M2) (where , as in the definition of matchings). Then .
Let m be as in the lemma and d = m(dockΦ). We first show the following claim:
is the graph H′ obtained from H by removing all nodes in [portH] ∖ [d] from it (together with their incident edges) and defining portH′ = d.
To prove Claim 3, note first that every node in VH ∖ [portH] is reachable from d by (M2). Thus, it remains to be shown that no node v ∈ [portH] ∖ [d] is reachable from d. However, this is clear by (M1) in combination with (R1). Furthermore, by definition .
Specifying the quantifier-free operation ξ is now straightforward. By Claim 3, it can be defined as follows:
- the domain formula δ expresses that a node belongs to the resulting graph if it is in [d] or not a port:
- edges and node labels are copied: for all and
- the nodes in [d] become the ports:
This finishes the proof of the lemma.
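To illustrate Claim 3, the effect of ξ can be sketched on a hypothetical set-based graph encoding: drop the ports of H that are not in [d] (together with their incident edges) and install d as the new port sequence. All names below are illustrative:

```python
def apply_xi(nodes, edges, ports, d):
    """Sketch of the quantifier-free operation xi from Claim 3:
    remove the ports of H outside [d] along with their incident
    edges, and make d the new port sequence."""
    # domain formula: keep a node if it is in [d] or not a port
    keep = set(d) | {v for v in nodes if v not in ports}
    # edges and node labels are copied; edges touching removed
    # nodes disappear with them
    new_edges = [(s, t) for (s, t) in edges if s in keep and t in keep]
    return keep, new_edges, tuple(d)
```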
As a consequence, we can now prove the main lemma that—with the additional assumption of edge agnosticism—will lead to an efficient implementation of Algorithm 1 using Courcelle’s theorem.
For every extension operation Φ with ℓ = |dockΦ|, there is a cmso formula ψΦ with the free individual variables x1,…, xℓ such that, for every graph and every with |d| = ℓ, we have H⊧ψΦ(d) if and only if there is a matching m of Φ to H with m(dockΦ) = d.
Defining the conjuncts ψ(M1) and ψ(M2) is rather straightforward:
- ψ(M1) expresses that ports are bijectively mapped to ports, and for all pairs of nodes in the image of m, if one of them is a new node (i.e., an image of a node in [portΦ] ∖ [dockΦ]), then the edges between those nodes are exactly the images of edges between their pre-images. To be precise, let ij ∈ [n], for j ∈ [k], be the index such that . Then ψ(M1) is defined accordingly; it should be clear that H⊧ψ(M1)(V1,…, Vn, d) if and only if (M1) holds for the mapping m that maps vi to Vi for every i ∈ [n].
- ψ(M2) expresses that for every node v that is not a port, there is a port from which there is a path to v. This makes use of the fact that the predicate path can be expressed in monadic second-order logic; see Courcelle and Engelfriet (2012, Proposition 5.11).
Finally, assume without loss of generality that CΦ = {v1,…, vc} for some c ∈ [n]. Then ψ(M3) must be constructed in such a way that H⊧ψ(M3)(V1,…, Vc) if and only if . In fact, since ψ(M1) and ψ(M2) already ensure that (M1) and (M2) hold, Lemma 2 applies, providing us with a quantifier-free operation ξ such that where j1,…, jℓ ∈ [n] are the indices such that for all p ∈ [ℓ]. Hence, Theorem 1 applied to ξ and φΦ yields the formula ψ(M3) needed. (Thus, are to be substituted for X1,…, Xℓ in Theorem 1, X1,…, Xk play the role of Y1,…, Yk, and k′ = 0.)
Thus, line 22’ can be implemented using any algorithm which, for a fixed cmso formula ψ, an input graph G, and node sequence , checks whether G⊧ψ(d). Unfortunately, this is an intractable problem in general. Therefore, for efficient parsing, we aim to restrict G to graphs of bounded treewidth, as in this case we can apply Courcelle’s theorem (Courcelle and Engelfriet 2012, Theorem 6.4) which states that the problem can be solved in linear time on graphs of bounded treewidth. Since it is an important feature of graph extension grammars that they can generate graph languages of unbounded treewidth, we now define a special case that retains that ability while allowing us to consider graphs of bounded treewidth in the application of Lemma 3.
A graph extension operation Φ is edge agnostic if φΦ does not contain any predicate of the form edga(x, y). An edge-agnostic graph extension grammar is a graph extension grammar in which all graph extension operations are edge agnostic.
Note that the graph extension grammar discussed in Example 4 is edge agnostic (with the exception of the discussion in its last paragraph). Recall that the purpose of developing the graph extension grammar formalism is to capture not only structural but also non-structural reentrancies. The example grammar handles both, which, despite the limitations of this example, may indicate that even edge agnosticism is not as severe a restriction on the linguistic usefulness of the formalism as one might think. The reason, as discussed in the Introduction, is that these non-structural reentrancies are often caused by language elements such as pronouns which, intuitively, are edge agnostic in themselves (as opposed to structural reentrancies like those caused by control).
We can now show that Algorithm 1 can be implemented to run in polynomial time for edge-agnostic graph extension grammars Γ.
Let Γ be an edge-agnostic graph extension grammar, and let r be the maximal type occurring in its operations. Then Algorithm 1 can be made to run in time O(n^{2r+1}).
Let G be the input graph, where n = |VG|. Since there are O(n^r) recursive invocations of the algorithm, and the loop in line 21’ is also executed O(n^r) times, it suffices to argue that line 22’ can be implemented to run in linear time. We do this by using Lemma 3, but replacing the graph by a graph H of bounded treewidth such that H⊧ψ(d) if and only if .
6 Parsing for Local Graph Extension Grammars
The restriction to the edge-agnostic case used in the preceding section in order to ensure a polynomial running time of the parsing algorithm is rather severe. In this section, we show that this restriction can partially be lifted. The NP-completeness proof in Section 4 may provide a hint: The mso formula in the rule depicted in Figure 12 inspects the entire graph. We shall now see that a certain amount of local structural conditions can be allowed without sacrificing efficient parsing. The idea is that we weaken the restriction imposed by edge agnosticism by adding primitive predicates to our logic that make it possible to express that a node belongs (or does not belong) to a part of the graph having a certain form. This type of predicate is inspired by the notion of (nested) graph constraint in the theory of graph transformation (see, for example Habel, Heckel, and Taentzer 1996; Arendt et al. 2014).
A local condition is a sequence χ = G0Q1G1Q2⋯Gk, where G0 ⊆ G1 ⊆⋯ ⊆ Gk are graphs for some k ∈ℕ, and Q1,…, Qk ∈{∃,¬∃,∀,¬∀}.
Let G be a graph. We inductively define what it means for an injective morphism μ0: G0 → G to satisfy χ. Every such morphism satisfies χ if k = 0. Assume now that k > 0 and let χ′ = G1Q2⋯Gk. Then μ0 satisfies χ if one of the following cases holds:
Q1 = ∃ and there exists an extension of μ0 to a morphism μ1: G1 → G that satisfies χ′.
Q1 = ¬∃ and there does not exist any extension of μ0 to a morphism μ1: G1 → G that satisfies χ′.
Q1 = ∀ and all extensions of μ0 to a morphism μ1: G1 → G satisfy χ′.
Q1 = ¬∀ and there exists an extension of μ0 to a morphism μ1: G1 → G that does not satisfy χ′.
A local node predicate is a unary predicate specified by a local condition χ as above, such that G0 consists of a single node u which is not a port. A node v in a graph G with the same label as u satisfies χ, that is, χ(v) is true in G, if the morphism that maps u to v satisfies χ.
An example of a local node predicate (with k = 2) is shown in Figure 13.
Let χ be a (fixed) local node predicate, G a graph, and v ∈ VG. Then whether v satisfies χ can be checked in time polynomial in the number of nodes of G.
If χ = G0Q1G1Q2⋯Gk, it is straightforward to test whether v satisfies χ by means of a recursive algorithm of fixed recursion depth k mimicking the definition of satisfaction for local conditions. The body of the procedure consists of a loop that enumerates all possible extensions of the morphism mapping Gi−1 to G (determined by the enclosing call) and tests whether they, recursively, fulfill GiQi +1⋯Gk. Since the graphs Gi are fixed, each loop runs in polynomial time, and hence the entire algorithm runs in polynomial time.
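The recursive test described in this proof can be sketched in Python. The encoding of graphs and conditions below is purely illustrative (labelled nodes plus a set of directed edges); extensions of a morphism are enumerated by brute force, which suffices because the graphs Gi are fixed:

```python
from itertools import permutations

def satisfies(chi, host_labels, host_edges, mu0):
    """Recursive check of a local condition, mimicking the proof.

    `chi` is a list [(G0, None), (G1, Q1), ..., (Gk, Qk)], where each
    graph is a pair (node -> label dict, set of directed edges) and
    Qi is one of 'E', '!E', 'A', '!A' (for the four quantifiers);
    `mu0` injectively maps the nodes of G0 into the host graph.
    This encoding is illustrative, not from the article."""
    if len(chi) == 1:            # k = 0: every morphism satisfies chi
        return True
    (g0, _), (g1, q1) = chi[0], chi[1]
    lab1, edges1 = g1
    new = [u for u in lab1 if u not in g0[0]]   # nodes added by G1
    cand = [v for v in host_labels if v not in mu0.values()]
    # enumerate all injective, label- and edge-preserving extensions
    exts = []
    for image in permutations(cand, len(new)):
        mu1 = dict(mu0, **dict(zip(new, image)))
        if all(host_labels[mu1[u]] == lab1[u] for u in new) and \
           all((mu1[s], mu1[t]) in host_edges for (s, t) in edges1):
            exts.append(mu1)
    rest = chi[1:]
    if q1 == 'E':
        return any(satisfies(rest, host_labels, host_edges, m) for m in exts)
    if q1 == '!E':
        return not any(satisfies(rest, host_labels, host_edges, m) for m in exts)
    if q1 == 'A':
        return all(satisfies(rest, host_labels, host_edges, m) for m in exts)
    return not all(satisfies(rest, host_labels, host_edges, m) for m in exts)
```

Each level of the recursion enumerates at most n^{|G_i| - |G_{i-1}|} candidate extensions, which for fixed χ yields the polynomial bound of the lemma.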
We note here that, while in general the exponent of the polynomial bounding the running time of the test whether v satisfies χ can be arbitrarily large, typical properties that need to be tested in practice are unlikely to require large, complex graphs Gi or a large k, and may therefore be expected to be of reasonable complexity.
Let CMSOloc denote the logic obtained from CMSO by removing all predicates of the form edga, where , and adding all local node predicates. A graph extension grammar such that every extension operation Φ appearing in it satisfies φΦ ∈CMSOloc is a local graph extension grammar.
For every local graph extension grammar Γ, Algorithm 1 can be implemented to run in polynomial time.
Since the only difference between an edge-agnostic and a local graph extension grammar is that the extension operations of the latter can make use of the local node predicates, it suffices to show how to handle those. For this, we have to extend Lemma 3 accordingly, which we do by precomputing the required predicates for every relevant node and adding them to the relational structure. Note first that any given extension operation Φ makes use of a finite number of local node predicates. Let χ1,…, χh be those predicates.
By Lemma 4, it takes polynomial time to compute, for every node v of (where H and d are as in Lemma 3) and every i ∈ [h], the truth value χi(v) with respect to . Once this has been done, we can follow the construction in the proof of Lemma 3, but only after first turning H into the relational structure obtained by adding these precomputed unary predicates χ1,…, χh to it. Since the Backwards Translation theorem and Courcelle’s theorem apply just as well to the resulting relational structures (which are essentially graphs with h types of additional node labels), and these unary predicates do not increase the treewidth of the structures, the rest of the proofs of Lemma 3 and Theorem 4 continues to be valid.
We note here that the polynomial bounding the running time of the algorithm, in contrast to the edge-agnostic case, can have an arbitrarily large exponent. This is unavoidable (as long as subgraph isomorphism is not in P, that is, as long as P≠NP) because the exponent of the polynomial in Lemma 4 depends on the local node predicates occurring in the grammar. However, as mentioned above, practically relevant local node predicates are likely to be rather benign (at least for applications in NLP). For example, assuming that suitable data structures are used, the local node predicate required to implement the coordination discussed in connection with Figure 9 can be computed for all nodes v of the host graph in accumulated linear time. In those cases, the running time will be essentially the same as in the edge-agnostic case.
7 Conclusion
We have introduced graph extension grammars, a simultaneous restriction and extension of hyperedge replacement graph grammars. Rules construct graphs using operations of two kinds: The first is disjoint union, and the second is a family of unary operations that add new nodes and edges to an existing graph G. The augmentation is done in such a way that all new edges lead from a new node to either
- (1)
a port in G, or
- (2)
any number of (arbitrarily situated) nodes in G, chosen by a cmso formula.
While graph extension grammars are inspired by the notion of contextual hyperedge replacement grammars (Drewes, Hoffmann, and Minas 2012), they differ from them in a rather fundamental way: Intuitively, the context nodes in a rule r of a graph extension grammar refer to nodes in the subgraph generated by the very subderivation rooted in r. In contextual hyperedge replacement grammars, the situation is the opposite: context nodes refer to nodes generated by rule applications outside that subderivation. The latter creates a problem that does not occur in graph extension grammars, namely, that there may be cyclic dependencies which, intuitively, create deadlocks (see Drewes, Hoffmann, and Minas 2022).
By way of example, we have shown that graph extension grammars can model both the structural and non-structural reentrancies that are common in semantic graphs of formalisms such as AMR. We have presented a parsing algorithm for our formalism and proved it to be correct. If the formulas used to direct the placement of non-structural reentrancies make only “local” structural assertions, then the running time is polynomial. In general, the graph languages generated by graph extension grammars can be NP-complete (shown in Theorem 2), and hence there is little hope of being able to find an efficient membership algorithm for unrestricted graph extension grammars.
It is worthwhile repeating that the dynamic programming parsing algorithm presented here makes at most a polynomial number of recursive calls, and that the body of the algorithm runs in polynomial time provided that it can be decided in polynomial time whether a given mapping of the ports of an extension operation Φ to nodes in the input graph can be extended to a matching that satisfies φΦ. To obtain the main result of this article, we exploited the fact that this is the case if φΦ is local. However, any other restriction enabling us to decide the existence of matchings in polynomial time would work just as well. In particular, there may be other ways to make sure that it suffices to test φΦ on some subgraph of bounded treewidth, such as a spanning forest. We currently do not know of a meaningful restriction that would allow us to do this, but there is another restriction that is even simpler: if no extension operation contains a context node, then graph extension grammars are special hyperedge replacement grammars, and thus generate only graphs of bounded treewidth. This shows that Theorem 4 holds for such graph extension grammars, a result which is not entirely trivial because hyperedge replacement can, in general, generate NP-complete graph languages (see Section 1.1):
Let Γ be a graph extension grammar such that none of its extension operations contains a context node, and let τ be the maximal type occurring in its operations. Then Algorithm 1 can be implemented to run in time O(n^τ).
There are several promising directions for future work. On the theoretical side, it has to be said that the bounds on the running time proved in this article are rough upper bounds for worst-case scenarios. Since the exponents are rather high, it would be useful to have a closer look at the parameters that influence them, that is, to perform a fixed-parameter analysis. A parameter that readily comes to mind is the number of reentrancies of a graph. Because the extreme case of a (directed acyclic) graph without reentrancies is a forest, we expect such an analysis to offer plenty of room for reducing the running time of our algorithms to much lower levels.
We would furthermore like to apply the new ideas presented here to the formalisms by Groschwitz et al. (2017) and Björklund, Drewes, and Ericson (2016), to see if also these can be made to accommodate contextual rules without sacrificing parsing efficiency. It is also natural to generalize graph extension grammars to string-to-graph or tree-to-graph transducers to facilitate translation from natural-language sentences or dependency trees to AMR graphs.
On the empirical side, we are interested in a number of directions. A first step could be to check whether graph extension grammars can express the AMR languages of some existing AMR corpora, and to see how the running times of parsing vary in practice depending on variables such as the number of reentrancies in the input graph or the grammar size. In addition, we would like to develop algorithms for inferring extension and union operations from AMR corpora, and to train neural networks to translate between sentences and AMR graphs using trees over graph extension algebras as an intermediate representation. Such efforts would make the new formalism available to current data-driven approaches in NLP, with the aim of adding structure and interpretability to machine-learning workflows.
Acknowledgments
We are immensely grateful to the reviewers for all the effort they put into their reports, which made a tremendous difference for the final manuscript. We also want to thank Yannick Stade for providing us with helpful comments on a draft of this article, and Valentin Gledel who prompted us to prove that the general membership problem for graph extension grammars is NP-complete. This work has been supported by the Swedish Research Council under grant no. 2020-03852, and by the Wallenberg AI, Autonomous Systems and Software Program through the NEST project STING—Synthesis and analysis with Transducers and Invertible Neural Generators.
Notes
The graph parsing tool Grappa is available at: https://www.unibw.de/inf2/grappa.
Mezei and Wright formulate this in terms of systems of equations, but the essential ideas are the same.
The similarity to the notation [n], n ∈ℕ, is intentional, since both operations generate sets.
We get an alternative definition of graphs as relational structures if we additionally consider edges as objects, rather than as relations as we will do here; see Courcelle and Engelfriet (2012) for a discussion of the difference.
Courcelle and Engelfriet (2012) denote this structure by ⌊G⌋.
Note that this notation does not imply that each of the variables actually occurs in φ.
Thus the output structure cannot be bigger than the input structure in terms of the number of objects in it. One can remedy this by considering domain formulas with several free variables, which then determine whether a tuple of objects of the original formula is an object of the output formula, but we will not need this in the current article.
This idea has its origins in personal discussions between Frank Drewes and Berthold Hoffmann in 2019.
We will only use the notation when the graph G being referred to is clear from the context.
The reader may wish to recall the formal definition of the inclusion relation “⊆” from Section 2.
Thus, the requirement on μ1 is that its restriction to the subgraph G0 of G1 is μ0.
Note that we need to evaluate χi(v) in the graph rather than in H because it is the former that is the potential argument of Φ.
Strictly speaking, this is not entirely true because, for an extension operation Φ, the formula φΦ still restricts the applicability of extension operations to graphs having the property expressed by φΦ, but this is well known not to increase the power of hyperedge replacement grammars, and even if it did, it would obviously not increase the treewidth of graphs.
References
Author notes
Action Editor: Carlos Gómez-Rodríguez