Traversing Relationships and Nodes

Traversing nodes and relationships are one of the key benefits of using Neo4j. In neo4j.rb there are two different ways of traversing: either use the Neo4j traversals which uses the Ruby Enumerable API, or use the Neo4j cypher query language.

1 Cypher

See the Neo4j docs. Use the Neo4j.query method to run a cypher query.

Example:

result = Neo4j.query("START n=node(0) RETURN n")

The result return value implements the Ruby Enumerable returning hash values:

result.first['n'].neo_id  # => 0 (reference node)

You can also find which columns are available:

result.columns.first #=> 'n'

You can also set parameters:

Neo4j.query("START n=relationship({rel}) RETURN n", 'rel' => @some_relationship.neo_id)

For cypher queries with lucene starting points you need the name of the index file.

class Person
  include Neo4j::NodeMixin
  property :name
  index :name
end

Person.index_names[:exact] => "Person-exact"

You may get much better performance when using cypher compared to using the Neo4j.rb traversals, since it is executed in Java. Notice that cypher support is available from neo4j.rb version 1.3.0 and up.

2 Neo4j.rb Traversal

2.1 Performance

Neo4j is very efficient and can easily traverse 100.000 nodes/relationships of any depth. Ruby is much slower then Java, so you will always get better performance implementing traverser filters in Java. However, Neo4j.rb is also several magnitudes faster than SQL queries (e.g. >1000 ggr times faster for large data set, see here ).

For example, on my laptop I can traverse 500.000 nodes in 0.5 secondes with Java but in JRuby it takes about 4 seconds. However, traversing fewer nodes minimizes the difference between Java and Ruby. For more information check the neo4j-perf github project

2.2 Tweaking Performance

It could be very expensive to create Ruby wrappers around every native Neo4j java node. You can avoid this by:

  1. Not using the Neo4j::NodeMixin
  2. Loading the raw java objects (e.g. use the _rels and _node, _load methods) instead of Ruby instances.
  3. Using Identity Map (see Neo4j::IdentityMap )
  4. Traversing without loading wrapper objects (using the raw method, see below)
  5. Use the pacer gem, see below.
  6. Use the Cypher Query language, see above.

Examples

person.outgoing(:friends).depth(:all).raw.to_a

Or by using the java iterator directly:

folder.incoming(:parent_folder).
       incoming(:folder).
       depth(:all).
       iterator.each do |node|
         puts node
       end

Using Java Neo4j::Relationships

The Neo4j::Node#:_rels and Neo4j::Node#_rel returns Neo4j::Relationship instead of your own Ruby wrapper class. For more info, check the RDoc.

Neo4j::Relationship._end_node and Neo4j::Relationship._start_node return Neo4j::Node objects. Notice that Neo4j::Node and Neo4j::Relationships really are the Java objects.

2.3 Traversals

A neo4j traversal is created using methods like #incoming and #outgoing on the Neo4j::Node. These methods are also available on the Neo4j::NodeMixin and Neo4j::Model. All the traversal methods, such as #outgoing, #filter, and #depth can be combined and chained. By default, the start node is not included in the result. If you want to include the start node: node.outgoing(:foo).include_start_node Check the Java Documentation for more detailed info – http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-traversal.html or the RDoc API Neo4j::Node#outgoing

In all the code examples here, I have skipped creating Transactions. If you want to try these examples, wrap the write operations (such as creating nodes and relationships) in a Neo4j::Transaction.run{} block or use the (Rails) Neo4j::Model instead, which will create transactions automatically for you.

2.4 Ruby Enumerable and Traversals

The result of a traversal returns an object which includes the Ruby Enumerable mixin. This means that you can use any of the Enumerable method on traversals.

Example:

# find all nodes with property name == 'foo' from node a
# with outgoing relationship 'friends'
a.outgoing(:friends).find{|node| node[:name} == 'foo'}

# return an array names of all nodes from node a with 
# outgoing relationship 'friends'
a.outgoing(:friends).collect{|node| node[:name}}

As shown in the example above, the traversal returns Neo4j nodes. If you want to get the path objects instead, you can do something like:

a.outgoing(:frineds).paths.to_a

2.5 outgoing, incoming, filters

2.5.1 #outgoing and depth 1

The following code:

a = Neo4j::Node.new
b = Neo4j::Node.new
c = Neo4j::Node.new
a.outgoing(:friends) << b << c

Creates the following nodes and relationships:

To find b and c:

a.outgoing(:friends)
2.5.2 #outgoing and depth N

Let say we have the following node space:

which is created with

a = Neo4j::Node.new :name => 'A'
b = Neo4j::Node.new :name => 'B'
c = Neo4j::Node.new :name => 'C'
d = Neo4j::Node.new :name => 'D'
e = Neo4j::Node.new :name => 'E'
a.outgoing(:friends) << b << c
b.outgoing(:friends) << d << e
c.outgoing(:friends) << b

To find A’s friends friends and his friends

a.outgoing(:friends).depth(2).each {|node| puts node[:name]}

The above example prints: B, C, D, E

2.5.3 #filter

Say we only want to include the friends friends nodes (D and E) from the example above, and not nodes at depth one:

a.outgoing(:friends).depth(2).
  filter{|path| path.length == 2}.
    each {|node| puts node[:name]}

The above example prints: D, E

2.6 Advanced Traversals

The following examples use a more complex graph, which is created by the following code snippet:

a = Neo4j::Node.new :name => 'A'
b = Neo4j::Node.new :name => 'B'
c = Neo4j::Node.new :name => 'C'
d = Neo4j::Node.new :name => 'D'
e = Neo4j::Node.new :name => 'E'
f = Neo4j::Node.new :name => 'F'
g = Neo4j::Node.new :name => 'G'
Neo4j::Relationship.new(:friends, a, b)[:since] = 2008
Neo4j::Relationship.new(:friends, a, c)[:since] = 2005
Neo4j::Relationship.new(:friends, a, d)[:since] = 2003
Neo4j::Relationship.new(:friends, c, b)[:since] = 2004
Neo4j::Relationship.new(:friends, b, d)[:since] = 2001
Neo4j::Relationship.new(:friends, b, e)[:since] = 2010
Neo4j::Relationship.new(:friends, e, f)[:since] = 1998
Neo4j::Relationship.new(:friends, e, g)[:since] = 2010

Neo4j::Relationship.new(:recommend, a, d)
Neo4j::Relationship.new(:recommend, a, c)
Neo4j::Relationship.new(:recommend, c, g)

Creates this graph:

2.6.1 The path parameter

Several traversal methods uses a path parmeter to guide the traversal. The following methods are defined on the path object:

  • #end_node – Returns the end node of this path.
  • #last_relationship – Returns the last Relationship in this path.
  • #length – Returns the length of this path.
  • #nodes – Returns all the nodes in this path.
  • #relationships – Returns all the relationships in between the nodes which this path consists of.
  • #start_node – Returns the start node of this path.

2.7 Using several #incoming and #outgoing

You can traverse several relationship types at the same time:

a.outgoing(:friends).outgoing(:recommend).
  each{|node| puts node[:name]}

The example above prints B, C and D.

You can traverse both incoming and outgoing relationships:

a.outgoing(:recommend).incoming(:friends).depth(:all).
   each{|node| puts node[:name]}

The above example prints: D, C, B, G and E (not F).

2.7.1 #unique

This value specifies which paths will be traversed and if the same path should be traversed more then once.

The following values are possible:

  • :node_global :: A node cannot be traversed more than once (default)
  • :node_path :: For each returned node there ’s a unique path from the start node to it.
  • :node_recent :: This is like :node_global, but only guarantees uniqueness among the most recent visited nodes, with a configurable count.
  • :none :: No restriction (the user will have to manage it).
  • :rel_global :: A relationship cannot be traversed more than once, whereas nodes can.
  • :rel_path :: No restriction (the user will have to manage it).
  • :rel_recent :: Same as for :node_recent, but for relationships.

See this document or below for more examples of uniqueness. See also the example of generating a HTML view of a node space below.

2.8 #eval_paths

Instead of specifying which relationships should be traversed and which nodes should be returned, you can supply the traversal with a code block. The code block gets an Path argument (see above) and should return one of the following values:

  • :exclude_and_continue
  • :exclude_and_prune
  • :include_and_continue
  • :include_and_prune

Example:

b.eval_paths{|path| puts path; :exclude_and_continue}.depth(2).to_a

will print:


(2)
(2)--[friends,4]-->(4)
(2)--[friends,5]-->(5)
(2)<--[friends,0]--(1)
(2)<--[friends,3]--(3)
(2)--[friends,5]-->(5)--[friends,6]-->(6)
(2)--[friends,5]-->(5)--[friends,7]-->(7)

and return zero nodes since we always return :exclude_and_continue.

If we change the uniqueness to :node_path (:node_global is default) instead:

b.eval_paths{|path| puts path; :exclude_and_continue}.depth(2).unique(:node_path).to_a.size

It will print the following paths:

(2)
(2)—[friends,4]—>(4)
(2)—[friends,5]—>(5)
(2)<—[friends,0]—(1)
(2)<—[friends,3]—(3)
(2)—[friends,4]—>(4)<—[friends,2]—(1)
(2)—[friends,4]—>(4)<—[recommend,8]—(1)
(2)—[friends,5]—>(5)—[friends,6]—>(6)
(2)—[friends,5]—>(5)—[friends,7]—>(7)
(2)<—[friends,0]—(1)—[friends,1]—>(3)
(2)<—[friends,0]—(1)—[friends,2]—>(4)
(2)<—[friends,0]—(1)—[recommend,8]—>(4)
(2)<—[friends,0]—(1)—[recommend,9]—>(3)
(2)<—[friends,3]—(3)<—[friends,1]—(1)
(2)<—[friends,3]—(3)—[recommend,10]—>(7)
(2)<—[friends,3]—(3)<—[recommend,9]—(1)
Notice that the same thing can be accomplished using the #filter and #prune methods. See below.

2.8.1 #filter

Instead of using the #eval_paths method you can just specify which nodes should be included in the traversal result. The path end_node method returns the node it has traversed to. Example: Find all your friends friends friends that are recommended by someone (uses the node space from the example above).

a.outgoing(:friends).outgoing(:recommend).depth(3).
   filter{|path| path.end_node.rel?(:recommend, :incoming)}.
     each{|node| puts node[:name]}

This prints C, D and G. There is also a start_node method on the path paramenter.

To only include nodes who have been friends since before 2005, or have been recommended by someone in my network (any depth), you can do something like:

a.outgoing(:friends).outgoing(:recommend).depth(:all).filter do |path|
    path.last_relationship.rel_type == 'recommend' ||                     
    path.last_relationship[:since] < 2005                                 
  end.each {|node| puts node[:name]}

The following prints D, G and F.

2.8.2 #prune

You can ‘cut off’ parts of the traversals. Let say you don’t want to traverse past node B:

a.outgoing(:friends).outgoing(:recommend).depth(:all).
  prune{|path| path.end_node[:name] == 'B'}.
    each{|node| puts node[:name]}

The example above prints: B, C, D and G. You can also implement this using :exclude_and_prune or :include_and_prune in the :expand_paths block.

2.8.3 #expand

Instead of specifying which relationship should be traversed with outgoing and incoming you can use the expand method to specify which relationship should be traversed.

Example, traverse all relationship with property age above 5:

some_node.expand { |n| n._rels.find_all { |r| r[:age] > 5 } }.
  depth(:all).to_a
# use _rels since it does not wrap the Java Relationships and performs better
# with your own Ruby classes (if you have a Ruby class for that relationship).
2.8.4 #depth_first and #breadth_first

You can set traversal order:

  • traverse depth first, visiting each node before visiting its child nodes, example: node.outgoing(:foo).depth_first(:pre)
  • traverse depth first, visiting each node after visiting its child nodes, example node.outgoing(:foo).depth_first(:post)
  • traverse breadth first, visiting each node before visiting its child nodes, example node.outgoing(:foo).breadth_first(:pre)
  • traverse breadth first, visiting each node after visiting its child nodes, example node.outgoing(:foo).breadth_first(:post)

Available in Neo4j.rb >= 1.7

2.9 Example

Here is an example of producing a HTML tree from the node space above. It traverses the graph depth first.

def show_tree(parent)
  s = ""

  prev_path = nil
  prev_level = -1
  parent._java_node.outgoing(:friends).outgoing(:recommend).unique(:node_path).include_start_node.raw.paths.depth_first(:pre).depth(:all).to_a.each do |path|
    n = path.end_node
    level = path.length
    # same level then close the previous HTML element
    curr = prev_level
    while curr >= level
      curr -= 1
      s << "#{space_indent(curr)}</ul>\n"
    end
    s << "#{space_indent(level)}<ul><li>name: #{n[:name]}</li>\n"
    prev_level = level
  end
  prev_level.downto(0) {|level| s << "#{space_indent(level)}</ul>\n"}
  s
end

def space_indent(level)
  level.times.inject(""){|s, _| s + "  "}
end

puts show_tree(a)

Notice that we get all the paths instead of the nodes when traversing in the example above.

3 Pacer

It might be faster and easier to express search traversals using the pacer gem.

require 'neo4j'
require 'pacer-neo4j'
Neo4j.start
g = Pacer.neo4j('/home/andreas/myrails_project/db/neo4j-development')
g.v.count #=> 1175734 
g.e.count #=> 8269064 

g.search_manual_indices = true
g.v('email'=>'arikan@gmail.com')  # using lucene

friends.out_e(:friend).in_v(:type => 'person').except(friends).except(person).most_frequent(0...10)