Projects: Client-side Transitive Closure Computation
From Accessible Integrated Formal Reasoning Environments
This article describes the "Client-side Transitive Closure Computation" aartifact project for the Spring 2011 iteration of the Boston University course CS 511. A high-level outline for what your project report should contain can be found here.
NOTE: This article will be updated and extended over time. However, once a requirement is listed in this article, it will never change and will not be removed.
Contents |
[edit] Requirement #1: Hypergraph Data Structure with Transitive Closure Operation
[edit] Basic Definition
Define a JavaScript data structure for representing directed hypergraphs with labelled edges and labelled nodes. No two nodes can have the same label and the order of the nodes in a hyperedge matters (i.e. an edge (x,y) is distinct from (y,x)). Your data structure should allow you to:
- insert a new labelled node (preferably in O(1) time)
- insert a new labelled hyperedge between zero or more nodes (preferably in O(log n) time or better)
- given an edge label and collection of node labels, determine whether a hyperedge with the specified label exists between the nodes with the specified labels.
[edit] Example of a Hypergraph
For purposes of illustration, we represent hyperedges in a hypergraph using the following notation:
"edge label"("node label", ..., "node label")
The following might represent a hypergraph with some hyperedges:
nodes: X Y Z
hyperedges: "true"() "is a set"(X) "is a set"(Y) "is a set"(Z) "is a subset of"(X,Y) "is a subset of"(Y,Z) "is the union of _ and _"(Z,X,Y)
Notice that the number of nodes a hyperedge can have can be zero, one, two, three, etc.
[edit] Hypergraphs and Logical Formulas
Logical predicates represent relationships between terms, and hyperedges represent relationships between nodes. Thus, we can interpret logical formulas as properties of a hypergraph and vice versa. For example, suppose we are given an edge label "<". The formula below can be viewed as a requirement that a hypergraph should satisfy:
for all x, y, z. "<"(x,y) and "<"(y,z) implies "<"(x,z)
[edit] The Closure Operation
You must write an algorithm that takes an instance of your hypergraph data structure and a list of formulas of the following form:
for all x1,...,xN. "P1"(x1,...xN) and ... and "PK"(x1,...,xN)
implies "Q"(x1,...,xN)
The algorithm must then keep adding hyperedges to the graph until all the formulas in the list are satisfied.
[edit] Details
The number of variables after "for all" can be zero or more. Also, the number of variables that appear between parentheses after each hyperedge label "P" can be zero or more. For example, the following are all valid formula inputs:
for all x,y. "<"(x,y) and ">"(x,y) implies "contradiction"()
for all x,y,z. "int"(x) and "int"(y) and "is the sum of _ and _"(z,x,y)
implies "int"(z)
You do not need to write a complete parser for logical formulas. It is sufficient to write a separate function that can handle the specific kind of formula illustrated above; this can be done using calls to the JavaScript split function. You may still want to use some internal representation for formulas.
[edit] The Transit Operation
One way to approach writing this algorithm is to first write a transit algorithm. This algorithm should have a signature similar to the following:
hypergraph transit(hypergraph instance, list of formulas)
In other words, it should take as input an instance of a hypergraph and a list of formulas, and should return a new (possibly larger) hypergraph.
This algorithm should go through the entire list of formulas one by one. For each formula, it should find all collections of hyperedges which satisfy the collection of patterns on the left-hand side of the implies keyword in the formula. For every collection that is satisfied, it should create the new edge specified on the right-hand side of the implies keyword (with the variables x1,...,xN appropriately substituted). It should not add edges that already exist (or closure, defined below, may end up running in an infinite loop). Thus, it may be better to collect a separate list of new edges based on the formulas and then add them in bulk to the graph once the transit algorithm is finished.
A useful data structure for computing the transit operation is a substitution. A substitution is just a mapping from variable names to nodes. For example, the following might be a substitution (where A and B are nodes):
x |--> A y |--> B
Substitution can be combined as long as they are consistent. For example, {x |--> A, y |--> B} and {z |--> C} can be combined into {x |--> A, y |--> B, z |--> C}. However, {x |--> A} and {x |--> B} cannot be combined because they assign different nodes to x. Notice also that we can create an empty substitution that does not assign a node to any variables. Finally, notice that for a given conclusion in a formula of the form P(x,y,z), we can apply a substitution such as {x |--> A, y |--> B, z |--> C} to obtain an edge in the graph, P(A,B,C).
Notice that given a premise of the form P(x,y,z) in a formula, it is possible to ask the following question: what are the possible assignments to x, y, and z so that P(x,y,z) matches an existing edge with label P in the graph? The answer to this question is a substitution. In fact, we can ask for the list of all such substitutions for a given premise (each corresponding to one edge).
The overall structure of the transit algorithm could then be described using the following pseudocode:
algorithm transit(hypergraph G, list of formulas):
for each formula in the list
create an empty collection of substitutions S
add one empty substitution to S
for each each premise in the formula
get a list T of substitutions that make the premise match an edge in G
create a new empty collection of substitutions S'
for each substitution s in S
for each substitution t in T
r := combination of s and t (if this is valid)
add r to S'
S := S'
for each substitution in S
apply the substitution to the conclusion of the formula
if this forms a valid edge, add it to the graph
For example, if the formula
for all x,y. "<"(x,y) and ">"(x,y) implies "contradiction"()
is being considered, the algorithm looks through the hypergraph for two edges labelled "<" and ">" that both link the nodes (x,y). If two such edges are found, an edge labelled "contradiction" is added to the hypergraph data structure (unless it already exists).
Once the transit operation is working, the closure algorithm is easy to define using the following pseudocode:
algorithm closure(hypergraph, list of formulas):
do
hypergraph := transit(hypergraph, list of formulas)
while (some edges were added during the last transit operation)
[edit] Requirement #2: Propositional Logic
It is possible to use a working closure operation algorithm as a sound (but not necessarily complete) validation algorithm for propositional logic. This is done by specifying a fixed list of formulas that correspond to propositional logic, converting an input formula into a hypergraph instance, running the closure operation to obtain a new hypergraph, and then checking whether in this hypergraph that formula is indicated to be true. You may assume the formula will have an implication operator at its root (i.e. be of the form "F implies F'"). This can be written as pseudocode:
algorithm validate(formula "F implies F'"):
L := list of formulas for propositional logic (always the same)
G := hypergraph representing formula F
G' := closure(G, L)
if the node in G' corresponding to the formula F' has an edge "true"
return true
else
return false
For example, assume we have already created a list of formulas corresponding to propositional logic. Suppose we want to check that the formula "(P and (P implies Q)) implies Q" is true. We would first need to build a hypergraph that represents "P and (P implies Q)". It might look something like the below:
"is_true"(a) "is_conj"(P,C,a) "is_implies"(P,Q,c)
Then, assuming the list for propositional calculus has the following facts in it:
for all x,y. "is_true"(x) and "is_conj"(y,z,x) implies "is_true"(z) for all x,y. "is_true"(x) and "is_conj"(y,z,x) implies "is_true"(y)
Any closure computation would add to the graph the edge "is_true"(P), and maybe others.
[edit] List of Formulas Corresponding to Propositional Logic
If nodes in the hypergraph represent formulas in propositional logic, the axioms of propositional logic themselves can be provided as input to the closure algorithm. For example, consider the following axiom:
This axiom can be represented in the following way:
for all x,y. "true"(x) and "implies"(x,y) implies "true"(y)
It may be necessary to encode more complex relationships such as the following in order to represent complex formulas in the graph:
for all a,b,c. "_ is the conjunction of _ and _"(c,a,b) and "true"(a) and "true"(b) implies "true"(c)
Encode the axioms of propositional logic in this way. Once this is done, it should be possible to encode any formula as a hypergraph in the following way. Notice that in the tree representation of a formula, each node corresponds to an atom or subformula. For every atom or subformula, add a corresponding node to the graph. Next, add hyperedges that represent the relationships between all the nodes as determined by the formula. Finally, you can run the closure algorithm and check whether the "true" edge applies to the node corresponding to the overall formula.
You are not required to write an automated process to encode formulas as hypergraphs, but you may find it helpful for testing.
[edit] Test Cases
If you have assembled an appropriate list of formulas successfully, you should be able to confirm as true the following formulas in propositional logic.
More generally, you should be able to confirm as true any formula
that satisfies one of the following restrictions:
-
and
contain only predicates and conjunctions;
-
contains only predicates and conjunctions, and
contains only predicates, conjunctions, and disjunctions.