Batch Insert

Neo4j has a batch insert mode that drops support for transactions and concurrency in favor of insertion speed. This is useful when you have a big dataset that needs to be loaded once. In our experience, the batch inserter will typically inject data around five times faster than running in normal transactional mode.

Be aware that the BatchInserter is intended for the initial import of data. It is:

  • not thread safe

  • not transactional

  • liable to leave corrupt database files if shutdown is not invoked successfully

Attributes
[R] batch_indexer
[R] batch_inserter
Class Public methods
new(storage_path=Neo4j.config.storage_path, config={})

Creates a new batch inserter. Raises an exception if Neo4j is already running at the same storage_path.

# File lib/neo4j/batch/inserter.rb, line 21
def initialize(storage_path=Neo4j.config.storage_path, config={})
  # check if neo4j is running and using the same storage path
  raise "Not allowed to start batch inserter while Neo4j is already running at storage location #{storage_path}" if Neo4j.storage_path == storage_path
  @batch_inserter  = org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.new(storage_path, config)
  Indexer.index_provider  = org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider.new(@batch_inserter)
  @rule_inserter = RuleInserter.new(self)
end
Instance Public methods
create_node(props=nil, clazz = Neo4j::Node)

Creates a node and returns the Fixnum id of the created node. The node is also added to the Lucene index if an index is declared on any of its properties.

# File lib/neo4j/batch/inserter.rb, line 47
def create_node(props=nil, clazz = Neo4j::Node)
  props = {} if clazz != Neo4j::Node && props.nil?
  props['_classname'] = clazz.to_s if clazz != Neo4j::Node

  node = @batch_inserter.create_node(props)
  props && _index(node, props, clazz)
  @rule_inserter.node_added(node, props)
  node
end
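One detail of the source above is easy to miss: when a class other than Neo4j::Node is given, the `_classname` key is written into the very hash the caller passed in. The following is a minimal plain-Ruby sketch of just those first two lines. `BaseNode`, `Person` and `prepare_props` are hypothetical names introduced for illustration (they are not part of neo4j.rb), so the sketch runs without JRuby or Neo4j.

```ruby
# Hypothetical stand-ins: BaseNode plays the role of Neo4j::Node and
# Person the role of a user-defined node class.
class BaseNode; end
class Person < BaseNode; end

# Mirrors the first two lines of create_node shown above.
def prepare_props(props, clazz, default_class = BaseNode)
  # A non-default class always needs a hash to carry '_classname'.
  props = {} if clazz != default_class && props.nil?
  props['_classname'] = clazz.to_s if clazz != default_class
  props
end

caller_hash = { 'name' => 'andreas' }
prepare_props(caller_hash, Person)

# The caller's hash was modified in place.
puts caller_hash.key?('_classname')

# With the default class and no properties, nil is passed through unchanged.
puts prepare_props(nil, BaseNode).inspect
```

If the same hash is reused for several nodes of different classes, it may be safer to pass a copy (for example `props.dup`).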
create_rel(rel_type, from_node, to_node, props=nil, clazz=Neo4j::Relationship)

Creates a relationship of the given type between the given nodes. Returns the Fixnum id of the created relationship.

# File lib/neo4j/batch/inserter.rb, line 68
def create_rel(rel_type, from_node, to_node, props=nil, clazz=Neo4j::Relationship)
  props = {} if clazz != Neo4j::Relationship && props.nil?
  props['_classname'] = clazz.to_s if clazz != Neo4j::Relationship
  rel = @batch_inserter.create_relationship(from_node, to_node, type_to_java(rel_type), props)

  props && _index(rel, props, clazz)

  from_props = node_props(from_node)

  if from_props['_classname']
    indexer = Indexer.instance_for(from_props['_classname'])
    indexer.index_node_via_rel(rel_type, to_node, from_props)
  end

  to_props   = node_props(to_node)
  if to_props['_classname']
    indexer = Indexer.instance_for(to_props['_classname'])
    indexer.index_node_via_rel(rel_type, from_node, to_props)
  end

  rel # return the id of the created relationship, as documented
end
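The calling pattern implied by the signatures above is that create_node hands back plain ids and create_rel takes those ids as its from_node and to_node arguments. The sketch below illustrates that flow with a hypothetical in-memory stand-in, `SketchInserter`, so that it runs without JRuby or Neo4j. Only the argument order and the id-based return values follow the documentation. Everything else, including how ids are numbered, is an assumption.

```ruby
# Hypothetical in-memory stand-in for the batch inserter.
class SketchInserter
  def initialize
    @nodes = {}
    @rels = {}
  end

  # Stores the properties and returns an integer id for the new node.
  def create_node(props = nil)
    id = @nodes.size + 1
    @nodes[id] = props || {}
    id
  end

  # Connects two existing node ids and returns an integer id for the relationship.
  def create_rel(rel_type, from_node, to_node, props = nil)
    unless @nodes.key?(from_node) && @nodes.key?(to_node)
      raise ArgumentError, 'unknown node id'
    end
    id = @rels.size + 1
    @rels[id] = { type: rel_type, from: from_node, to: to_node, props: props || {} }
    id
  end

  # Returns the properties stored for the given relationship id.
  def rel_props(rel)
    @rels.fetch(rel)[:props]
  end
end

inserter = SketchInserter.new
alice = inserter.create_node('name' => 'alice')
bob   = inserter.create_node('name' => 'bob')
knows = inserter.create_rel(:knows, alice, bob, 'since' => 2010)

puts inserter.rel_props(knows).inspect
```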
index_flush(clazz = Neo4j::Node)
Makes sure additions/updates can be seen by #index_get and #index_query, so that they are guaranteed to return correct results.

# File lib/neo4j/batch/inserter.rb, line 118
def index_flush(clazz = Neo4j::Node)
  indexer = Indexer.instance_for(clazz)
  indexer.index_flush
end
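The description above suggests a write, flush, read ordering: additions are only guaranteed to be visible to lookups after a flush. The toy model below illustrates that ordering in plain Ruby. `ToyBatchIndex` is a hypothetical class written for this sketch. It is not how the Lucene batch index is implemented, and whether unflushed entries are actually invisible in the real index is an assumption of the model.

```ruby
# Toy model of an index whose additions become visible only after flush.
class ToyBatchIndex
  def initialize
    @visible = Hash.new { |hash, key| hash[key] = [] }
    @pending = []
  end

  # Buffers an addition; it is not yet visible to get.
  def add(id, key, value)
    @pending << [id, key, value]
  end

  # Moves all buffered additions into the visible part of the index.
  def flush
    @pending.each { |id, key, value| @visible[[key, value]] << id }
    @pending.clear
  end

  # Returns the ids visible for the given key/value pair.
  def get(key, value)
    @visible.fetch([key, value], [])
  end
end

index = ToyBatchIndex.new
index.add(1, 'name', 'andreas')
puts index.get('name', 'andreas').inspect # looked up before the flush

index.flush
puts index.get('name', 'andreas').inspect # looked up after the flush
```

In an import script this usually means calling index_flush once after a batch of inserts and before the first lookup, rather than after every single node.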
index_get(key, value, index_type = :exact, clazz = Neo4j::Node)
Returns matches for the given key and value from the index specified by index_type and clazz.

Parameters

  • key

    the lucene key

  • value

    the lucene value we look for given the key

  • index_type

    :exact or :fulltext

  • clazz

    on which clazz we want to perform the query

# File lib/neo4j/batch/inserter.rb, line 131
def index_get(key, value, index_type = :exact, clazz = Neo4j::Node)
  indexer = Indexer.instance_for(clazz)
  indexer.index_get(key, value, index_type)
end
index_query(query, index_type = :exact, clazz = Neo4j::Node)
Returns matches for the given Lucene query from the index specified by index_type and clazz.

Parameters

  • query

    lucene query

  • index_type

    :exact or :fulltext

  • clazz

    on which clazz we want to perform the query

# File lib/neo4j/batch/inserter.rb, line 143
def index_query(query, index_type = :exact, clazz = Neo4j::Node)
  indexer = Indexer.instance_for(clazz)
  indexer.index_query(query, index_type)
end
node_exist?(id)

Returns true if a node with the given id exists.

# File lib/neo4j/batch/inserter.rb, line 58
def node_exist?(id)
  @batch_inserter.node_exists(id)
end
node_props(node)

Returns a hash of all properties of the given node.

# File lib/neo4j/batch/inserter.rb, line 91
def node_props(node)
  @batch_inserter.get_node_properties(node)
end
ref_node()

Returns the id of the reference node.

# File lib/neo4j/batch/inserter.rb, line 62
def ref_node
  @batch_inserter.get_reference_node
end
rel_props(rel)

Returns a hash of all properties of the given relationship.

# File lib/neo4j/batch/inserter.rb, line 107
def rel_props(rel)
  @batch_inserter.get_relationship_properties(rel)
end
rels(node)

Returns all the relationships of the given node.

# File lib/neo4j/batch/inserter.rb, line 112
def rels(node)
  @batch_inserter.getRelationships(node)
end
running?()

Returns true if the batch inserter has been created and not yet shut down.

# File lib/neo4j/batch/inserter.rb, line 29
def running?
  @batch_inserter != nil
end
set_node_props(node, hash, clazz = Neo4j::Node)

Sets the properties of the given node, overwriting any old properties.

# File lib/neo4j/batch/inserter.rb, line 96
def set_node_props(node, hash, clazz = Neo4j::Node)
  @batch_inserter.set_node_properties(node, hash)
  _index(node, hash, clazz)
end
set_rel_props(rel, hash)

Sets the properties of the given relationship, overwriting any old properties.

# File lib/neo4j/batch/inserter.rb, line 102
def set_rel_props(rel, hash)
  @batch_inserter.set_relationship_properties(rel, hash)
end
shutdown()

This method MUST be called after inserting is completed; otherwise the database files can be left corrupt.

# File lib/neo4j/batch/inserter.rb, line 34
def shutdown
  @batch_inserter && @batch_inserter.shutdown
  @batch_inserter = nil
  @rule_inserter = nil
  
  Indexer.index_provider && Indexer.index_provider.shutdown
  Indexer.index_provider = nil
  Indexer.clear_all_instances
end
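Because a missed shutdown can leave the database files corrupt, one common way to guarantee the call is to wrap the import in a helper that uses `ensure`. The sketch below shows that pattern. `with_inserter` and `FakeInserter` are hypothetical names introduced here so the example runs without JRuby or Neo4j. With the real class, the object passed in would be the batch inserter created by new.

```ruby
# Hypothetical stand-in that mimics only create_node, running? and shutdown.
class FakeInserter
  attr_reader :nodes

  def initialize
    @nodes = []
    @running = true
  end

  def running?
    @running
  end

  def create_node(props = nil)
    raise 'inserter already shut down' unless @running
    @nodes << (props || {})
    @nodes.size - 1 # id of the new node
  end

  def shutdown
    @running = false
  end
end

# Yields the inserter and always shuts it down, even if the block raises.
def with_inserter(inserter)
  yield inserter
ensure
  inserter.shutdown
end

inserter = FakeInserter.new
begin
  with_inserter(inserter) do |ins|
    ins.create_node('name' => 'andreas')
    raise 'simulated import failure'
  end
rescue RuntimeError => e
  puts "import failed: #{e.message}"
end

# shutdown ran despite the exception
puts inserter.running?
```

The exception still propagates to the caller, so a failed import is not silently swallowed; the helper only guarantees that shutdown runs first.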
to_java_map(hash)

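Converts a Ruby hash into a java.util.HashMap with String keys. Symbol values are converted to Strings; all other values are passed through unchanged. Returns nil if the hash is nil. (Author's note: it might be faster not to wrap the hash.)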

# File lib/neo4j/batch/inserter.rb, line 156
def to_java_map(hash)
  return nil if hash.nil?
  map = java.util.HashMap.new
  hash.each_pair do |k, v|
    case v
      when Symbol
        map[k.to_s] = v.to_s
      else
        map[k.to_s] = v
    end
  end
  map
end
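The conversion rules of to_java_map can be tried out without JRuby by using a Ruby Hash in place of java.util.HashMap. The following is a minimal sketch under that assumption. `to_string_keyed_map` is a hypothetical name and is not part of neo4j.rb.

```ruby
# Plain-Ruby analogue of to_java_map: keys become Strings, Symbol values
# become Strings, all other values are kept as they are.
def to_string_keyed_map(hash)
  return nil if hash.nil?

  hash.each_with_object({}) do |(key, value), map|
    map[key.to_s] = value.is_a?(Symbol) ? value.to_s : value
  end
end

converted = to_string_keyed_map({ name: :andreas, age: 42 })
puts converted.keys.inspect
puts converted['name'].class
```

Note that only top-level Symbol values are converted. A Symbol nested inside an Array value, for instance, is left untouched, which matches the case statement in the source above.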