Batch Insert
Neo4j has a batch insert mode that drops support for transactions and concurrency in favor of insertion speed. This is useful when you have a big dataset that needs to be loaded once. In our experience, the batch inserter will typically inject data around five times faster than running in normal transactional mode.
Be aware that the BatchInserter is intended for the initial import of data only:
- it is not thread safe
- it is not transactional
- failure to invoke shutdown properly results in corrupt database files
| [R] | batch_indexer |
| [R] | batch_inserter |
Creates a new batch inserter. Raises an exception if Neo4j is already running at the same storage_path.
# File lib/neo4j/batch/inserter.rb, line 21
def initialize(storage_path=Neo4j.config.storage_path, config={})
  # check if neo4j is running and using the same storage path
  raise "Not allowed to start batch inserter while Neo4j is already running at storage location #{storage_path}" if Neo4j.storage_path == storage_path
  @batch_inserter = org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.new(storage_path, config)
  Indexer.index_provider = org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider.new(@batch_inserter)
  @rule_inserter = RuleInserter.new(self)
end
Creates a node and returns the Fixnum id of the created node. Also adds the node to the Lucene index if a Lucene index is declared on any of its properties.
# File lib/neo4j/batch/inserter.rb, line 47
def create_node(props=nil, clazz = Neo4j::Node)
  props = {} if clazz != Neo4j::Node && props.nil?
  props['_classname'] = clazz.to_s if clazz != Neo4j::Node
  node = @batch_inserter.create_node(props)
  props && _index(node, props, clazz)
  @rule_inserter.node_added(node, props)
  node
end
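The property handling at the top of create_node is easy to miss: when a class other than Neo4j::Node is given, a '_classname' entry is written into the properties hash, and a nil hash is replaced by an empty one first. The following is a minimal sketch of just that step, restated outside the library so that it runs on its own. BaseNode, Person and classname_props are hypothetical stand-ins for illustration, not Neo4j names.

```ruby
# Stand-ins for illustration only: BaseNode plays the role of Neo4j::Node,
# Person the role of a user-defined node class.
class BaseNode; end
class Person < BaseNode; end

# Hypothetical helper mirroring the first two lines of create_node.
def classname_props(props = nil, clazz = BaseNode)
  # A custom class always needs a hash to carry its class name.
  props = {} if clazz != BaseNode && props.nil?
  # Record the class name so the node can later be mapped back to its class.
  props['_classname'] = clazz.to_s if clazz != BaseNode
  props
end

plain  = classname_props(nil, BaseNode)
tagged = classname_props({ 'name' => 'Ada' }, Person)
empty  = classname_props(nil, Person)
```

Here plain stays nil, while tagged and empty both carry a '_classname' of "Person". As in the original, a hash passed in by the caller is modified in place rather than copied.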
Creates a relationship of the given type between the given nodes. Returns the Fixnum id of the created relationship.
# File lib/neo4j/batch/inserter.rb, line 68
def create_rel(rel_type, from_node, to_node, props=nil, clazz=Neo4j::Relationship)
  props = {} if clazz != Neo4j::Relationship && props.nil?
  props['_classname'] = clazz.to_s if clazz != Neo4j::Relationship
  rel = @batch_inserter.create_relationship(from_node, to_node, type_to_java(rel_type), props)
  props && _index(rel, props, clazz)

  from_props = node_props(from_node)
  if from_props['_classname']
    indexer = Indexer.instance_for(from_props['_classname'])
    indexer.index_node_via_rel(rel_type, to_node, from_props)
  end

  to_props = node_props(to_node)
  if to_props['_classname']
    indexer = Indexer.instance_for(to_props['_classname'])
    indexer.index_node_via_rel(rel_type, from_node, to_props)
  end
end
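Because create_node hands back plain ids, one plausible import pattern is to keep a lookup from your own keys to those ids and consult it when creating relationships. The sketch below assumes only the create_node and create_rel argument order shown above. The names import_people, RecordingInserter and the :friends relationship type are hypothetical, and RecordingInserter is an in-memory stand-in that merely records calls so the example runs without Neo4j.

```ruby
# Hypothetical import routine. It works with any object that responds to
# create_node(props) and create_rel(rel_type, from_id, to_id).
def import_people(inserter, people, friendships)
  ids = {}
  # First pass: create every node and remember the id returned for it.
  people.each do |person|
    ids[person['name']] = inserter.create_node(person)
  end
  # Second pass: connect nodes by looking up the ids recorded above.
  friendships.each do |from_name, to_name|
    inserter.create_rel(:friends, ids.fetch(from_name), ids.fetch(to_name))
  end
  ids
end

# In-memory stand-in (NOT the Neo4j class): it only records what was asked.
class RecordingInserter
  attr_reader :nodes, :rels

  def initialize
    @nodes = {}
    @rels = []
  end

  def create_node(props = nil)
    id = @nodes.size + 1
    @nodes[id] = props
    id
  end

  def create_rel(rel_type, from_node, to_node, props = nil)
    @rels << [rel_type, from_node, to_node, props]
    nil
  end
end

inserter = RecordingInserter.new
people = [{ 'name' => 'ann' }, { 'name' => 'bob' }]
ids = import_people(inserter, people, [%w[ann bob]])
```

Using fetch rather than [] makes a relationship that refers to an unknown key fail loudly instead of passing nil as a node id.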
Makes sure additions/updates can be seen by #index_get and #index_query so that they are guaranteed to return correct results.
Returns matches for the given key and value from the index specified by index_type and class.
Parameters
- key: the Lucene key
- value: the Lucene value to look for under the given key
- index_type: :exact or :fulltext
- clazz: the class on which to perform the query
Returns matches for the given Lucene query from the index specified by index_type and class.
Parameters
- query: the Lucene query
- index_type: :exact or :fulltext
- clazz: the class on which to perform the query
Returns true if the node exists.
Return a hash of all properties of given node
Returns the properties of the given relationship
Returns all the relationships of the given node
Sets the properties of the given node, overwriting any old properties.
Sets the properties of the given relationship, overwriting any old properties.
This method MUST be called after inserting is completed.
# File lib/neo4j/batch/inserter.rb, line 34
def shutdown
  @batch_inserter && @batch_inserter.shutdown
  @batch_inserter = nil
  @rule_inserter = nil
  Indexer.index_provider
  Indexer.index_provider && Indexer.index_provider.shutdown
  Indexer.index_provider = nil
  Indexer.clear_all_instances
end
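Since a missed shutdown can corrupt the database files, it may help to tie the call to an ensure clause so that it runs even when the import raises. This is a minimal sketch that assumes only that the inserter responds to shutdown. The names with_batch_inserter and FakeInserter are hypothetical, and FakeInserter exists only to make the sketch runnable without Neo4j.

```ruby
# Hypothetical helper, not part of the library: yields the inserter to the
# block and always calls shutdown afterwards, whether or not the block raised.
def with_batch_inserter(inserter)
  yield inserter
ensure
  inserter.shutdown
end

# Stand-in used only for this sketch; it records whether shutdown was called.
class FakeInserter
  attr_reader :shut_down

  def initialize
    @shut_down = false
  end

  def shutdown
    @shut_down = true
  end
end

ok = FakeInserter.new
with_batch_inserter(ok) { |ins| :imported }

failed = FakeInserter.new
begin
  with_batch_inserter(failed) { |ins| raise "import failed" }
rescue RuntimeError
  # The error still propagates to the caller, but shutdown has already run.
end
```

With the real library, the argument would be the batch inserter instance created by the initializer documented above.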
hmm, maybe faster not wrapping this ?