depgraph — Dependency graphs

The DepGraph class encapsulates a number of useful methods to deal with interdependent items. It represents the workhorse for the scheduling algorithms in the scheduler module.

A dependency graph is a directed acyclic graph which may be represented as follows:

digraph depgraph { "bacon" -> "spam"; "sausage" -> "eggs", "spam"; }

The convention here is that if an edge goes from A to B, then A depends on B.

Building

You can create a DepGraph in of two ways. Either you pass a dictionary representing the dependencies between items to the from_dependency_dictionary class method:

>>> from valjean.cosette.depgraph import DepGraph
>>> from pprint import pprint
>>> deps = {'bacon': ['spam'], 'sausage': ['eggs', 'spam']}
>>> pprint(deps)
{'bacon': ['spam'], 'sausage': ['eggs', 'spam']}
>>> g = DepGraph.from_dependency_dictionary(deps)

or you create an empty graph and add nodes and dependencies by hand:

>>> h = DepGraph().add_dependency('bacon', on='spam') \
...               .add_dependency('sausage', on='eggs') \
...               .add_dependency('sausage', on='spam')
>>> g == h
True

In these examples, sausage depends on eggs and spam, and bacon depends on spam; spam and eggs have no dependencies. Note that the dependency dictionary may be seen as a sparse representation of the graph adjacency matrix.

You can recover the dependency dictionary by passing the graph to dict():

>>> pprint(dict(g))  
{'bacon': {'spam'}, 'eggs': set(), 'sausage': {'eggs', 'spam'}, 'spam': set()}

If you print the dictionary, you will notice that spam and eggs have been added as nodes with no dependencies:

>>> print(g)
[('bacon', ['spam']), ('eggs', []), ('sausage', ['eggs', 'spam']), ('spam', [])]

What if a node has no dependencies and no other node depends on it? Just add it to the dictionary with the empty list as a value:

>>> free = DepGraph.from_dependency_dictionary({'kazantzakis': []})

You can also add it after creating the graph:

>>> also_free = DepGraph().add_node('kazantzakis')
>>> free == also_free
True

Querying

You can inspect the nodes of the graph:

>>> sorted(g.nodes())
['bacon', 'eggs', 'sausage', 'spam']

or ask for the dependencies of a node:

>>> sorted(g.dependencies('sausage'))
['eggs', 'spam']
>>> sorted(g['sausage'])  # equivalent, shorter syntax
['eggs', 'spam']

or ask for the nodes that depend on another node (careful though, this operation has O(N) time complexity, N being the number of nodes in the graph):

>>> sorted(g.dependees('spam'))
['bacon', 'sausage']

You can also iterate over graphs:

>>> for k, vs in sorted(g):
...     for v in sorted(vs):
...         print(f"You can't have {k} without {v}!")
You can't have bacon without spam!
You can't have sausage without eggs!
You can't have sausage without spam!

Finally, you can check if a graph is a subgraph of another one:

>>> sub_g = DepGraph().add_dependency('bacon', on='spam') \
...                   .add_dependency('sausage', on='eggs')
>>> sub_g <= g
True

Merging, sorting and other operations

Given two graphs, possibly sharing some nodes and edges, you can construct the union g12 as follows:

>>> g1 = sub_g
>>> g2 = DepGraph().add_dependency('sausage', on='spam')
>>> g12 = g1 + g2
>>> g12 == g
True
>>> _ = g2.merge(g1)  # in-place merge
>>> g2 == g
True
>>> g2 += g1          # equivalent syntax
>>> g2 == g
True

It is also possible to compute the transitive reduction of the graph. Let g be an acyclic graph. The transitive reduction tr(g) is the minimal (in the number of edges), provably unique subgraph of g over the same vertices with the following property: for each pair of nodes A and B, A is reachable from B within g iff it is reachable within tr(g).

>>> g_redundant = DepGraph() \
...     .add_dependency('eggs', on='bacon') \
...     .add_dependency('bacon', on='spam') \
...     .add_dependency('eggs', on='spam')  # this edge is redundant
>>> g_tr = g_redundant.copy()
>>> print(g_tr.transitive_reduction())
[('bacon', ['spam']), ('eggs', ['bacon']), ('spam', [])]
>>> 'spam' in g_tr.dependencies('eggs')
False

You can also do a topological sort of the graph. The result is a list of the graph nodes, with the property that each node is guaranteed to appear after all the nodes it depends on. Note that in general there are several possible topological sorts for a given graph.

>>> g.topological_sort()  
['eggs', 'spam', 'bacon', 'sausage']

Grafting sub-graph nodes into the graph itself

A nice feature of DepGraph is that you can have graph nodes that are themselves instances of DepGraph! Consider the following graph:

digraph depgraph { compound=true; "Spanish Inquisition" -> "surprise" [lhead=cluster1]; subgraph cluster1 { label="chief weapons" "surprise" -> "fear"; "surprise" -> "efficiency"; "fear" -> "devotion"; "efficiency" -> "devotion"; } "devotion" -> "nice red uniforms" [ltail=cluster1]; }

Here is the code that generates it:

>>> spanish_inq = 'Spanish Inquisition'
>>> surprise, fear, efficiency = 'surprise', 'fear', 'efficiency'
>>> devotion, uniforms = 'devotion', 'nice red uniforms'
>>> chief_weapons = DepGraph.from_dependency_dictionary({
...    surprise: [fear, efficiency],
...    fear: [devotion],
...    efficiency: [devotion]
...    })
>>> spanish = DepGraph() \
...     .add_dependency(spanish_inq, on=chief_weapons) \
...     .add_dependency(chief_weapons, on=uniforms)

Here chief_weapons is a DepGraph itself, but it is also a node of spanish. You can graft chief_weapons inside spanish like so:

>>> spanish.graft(chief_weapons)
DepGraph(...)

This yields the following graph

digraph depgraph { compound=true; "Spanish Inquisition" -> "surprise"; "surprise" -> "fear"; "surprise" -> "efficiency"; "fear" -> "devotion"; "efficiency" -> "devotion"; "devotion" -> "nice red uniforms"; }

as you can verify yourself:

>>> full_graph = DepGraph.from_dependency_dictionary({
...    spanish_inq: [surprise],
...    surprise: [fear, efficiency],
...    fear: [devotion],
...    efficiency: [devotion],
...    devotion: [uniforms]
...    })
>>> full_graph == spanish
True

The nice thing about this feature is that it makes it easier (more modular) to build complex graphs. Just build smaller subgraphs and assemble them together as if they were nodes! The flatten method will recursively graft all graph nodes into the main graph.

(Fun fact: the DepGraph type is a monad, with flatten playing the role of join. If what I just wrote makes no sense to you, don’t worry.)

Caveats

Some things you should be aware of when using DepGraph:

  • The type of the items in the graph is irrelevant, but if you want to use the from_dependency_dictionary constructor they need to be stored in a dictionary, and therefore they must be hashable;

  • You need to use a single-element list if you want to express a single dependency, as in the case of bacon. So this is wrong:

    >>> bad_deps = {0: 1, 7: [0, 42]}  # error, should be 0: [1]
    >>> bad = DepGraph.from_dependency_dictionary(bad_deps)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
        [...]
    TypeError: 'int' object is not iterable
    

    Here 1 is not iterable, and the test crashed. However, if your graph nodes happen to be iterable, the DepGraph constructor will not crash, but you will not get what you expect.

exception valjean.cosette.depgraph.DepGraphError[source]

An exception raised by DepGraph.

class valjean.cosette.depgraph.DepGraph(nodes=None, edges=None)[source]

A dependency graph.

There are two preferred ways to instantiate this class:

Alternatively, you may also use the full form of the constructor; in this case, the graph edges are represented as a dictionary between integers, with the convention that each node is represented by its index in the nodes list. So nodes can be seen as a mapping from integers to nodes. The inverse mapping is called index and may be passed to the constructor if it is available to the caller as a dictionary; if not, it will be constructed internally.

classmethod from_dependency_dictionary(dependencies)[source]

Generate a DepGraph from a dependency dictionary.

__init__(nodes=None, edges=None)[source]

Initialize the object from a list of nodes and an edge dictionary.

Parameters:
  • nodes (iterable) – An iterable over the nodes of the graph, or None for an empty graph.

  • edges (mapping) – A mapping between integers representing the nodes, or None for an empty graph.

__str__()[source]

Return str(self).

__repr__()[source]

Return repr(self).

__add__(other)[source]

Merge two graphs, return the result as a new graph.

__radd__(other)

Merge two graphs, return the result as a new graph.

__len__()[source]

Return the number of vertices in the graph.

__contains__(node)[source]

Returns True if x is one of the nodes.

__le__(other)[source]

g <= h if g is a subgraph of h

isomorphic_to(other)[source]

Returns True if this graph is isomorphic to other.

__eq__(other)

Returns True if this graph is isomorphic to other.

add_node(node)[source]

Add a new node to the graph.

Parameters:

node – The new node.

remove_node(node)[source]

Remove a node from the graph.

Any edges going in and out of the node will be removed, too.

Parameters:

node – The node to be removed.

add_dependency(node, on)[source]

Add a new dependency to the graph.

Parameters:
  • node – The node for which the dependency is specified.

  • on – The node node depends on.

remove_dependency(node, on)[source]

Remove a dependency from the graph.

Parameters:
  • node – The node for which the dependency is specified.

  • on – The node node depends on.

invert()[source]

Invert the graph.

Returns:

A new graph having the same nodes but all edges inverted.

topological_sort()[source]

Perform a topological sort of the graph.

Returns:

The nodes of the graph, as a list, with the invariant that each node appears in the list after all the nodes it depends on.

Raises:

DepGraphError – if the graph is cyclic.

copy()[source]

Return a copy of this graph.

merge(other)[source]

Merge this graph in place with another one.

Parameters:

other (DepGraph) – The graph to merge.

__iadd__(other)

Merge this graph in place with another one.

Parameters:

other (DepGraph) – The graph to merge.

nodes()[source]

Returns the graph nodes.

dependencies(node, recurse=False)[source]

Query the graph about the dependencies of node. If recurse is True, also return indirect dependencies (the transitive closure of the specified node). With recurse=False, this operation is O(1).

Parameters:

recurse (bool) – If true, return indirect dependencies as well.

Returns:

A list containing the dependencies of node.

__getitem__(node, recurse=False)

Query the graph about the dependencies of node. If recurse is True, also return indirect dependencies (the transitive closure of the specified node). With recurse=False, this operation is O(1).

Parameters:

recurse (bool) – If true, return indirect dependencies as well.

Returns:

A list containing the dependencies of node.

dependees(node)[source]

Collect the nodes that depend on the given node. This operation is O(N), where N is the number of nodes in the graph.

Returns:

A set containing the dependees of node.

to_graphviz()[source]

Convert the graph to graphviz format.

Returns:

A string describing the file in graphviz format.

transitive_reduction()[source]

Perform a transitive reduction of the graph in place.

transitive_closure()[source]

Perform a transitive closure of the graph in place.

initial()[source]

Return the initial nodes of the graph.

The initial nodes are the nodes that have no ingoing edge; i.e., no other node depends on them.

Returns:

The list of initial nodes.

terminal()[source]

Return the terminal nodes of the graph.

The terminal nodes are the nodes that have no outgoing edge; i.e., they do not depend on any other node.

Returns:

The list of terminal nodes.

graft(node)[source]

Graft the given node into the graph.

Parameters:

node (DepGraph) – A DepGraph embedded as a graph node.

__hash__ = None
flatten(recurse=True)[source]

Graft all DepGraph nodes into this graph.

Parameters:

recurse (bool) – If true, recursively graft DepGraph nodes until all nodes are flat.

depends(node1, node2, recurse=False)[source]

Return True if the node node1 depends on node2.

Parameters:
  • node1 – The first node.

  • node2 – The second node.

  • recurse (bool) – If true, look at indirect dependencies, too.

Returns:

True if node1 directly (recurse == False) or indirectly (recurse == True) depends on node2.