In SQL land, virtually all databases support batch inserts. Batch inserts are an efficient mechanism for inserting a lot of similar data. That is, instead of issuing x insert statements, you execute 1 insert with x records. This is much more efficient because the insert statement doesn’t need to be re-parsed x times, there is only 1 network trip as opposed to x, and in the case of transactions, there is only 1 transaction instead of x. When compared to x inserts, a batch insert is almost always faster.
For example, the Mongo Ruby driver’s insert method takes a collection; thus, you can insert an array of hashes quite efficiently. Even if you are using an ODM like Mongoid, you can still perform batch inserts: all you need to do is get a reference to the model object’s underlying collection and then issue an insert with an array of hashes matching the collection’s intended document structure.
For instance, to insert a collection of Tag models (each having 3 fields, one of which is account_id) in one fell swoop, I can do the following:
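What follows is only a minimal sketch of that idea, not the original listing. It assumes a stand-in collection class (FakeCollection, hypothetical) so that it runs without a MongoDB server, and the name and kind fields are made up for illustration; only account_id comes from the text. With Mongoid, the collection reference would presumably come from the model itself (for example Tag.collection), and newer versions of the Ruby driver expose the batch call as insert_many rather than insert.

```ruby
# Hypothetical stand-in for a driver collection, so the sketch runs
# without a MongoDB server. A real collection would come from the driver
# or from the ODM model's underlying collection.
class FakeCollection
  attr_reader :documents, :round_trips

  def initialize
    @documents = []
    @round_trips = 0
  end

  # Mimics a batch-capable insert: accepts one hash or an array of hashes
  # and counts every call as a single round trip to the server.
  def insert(doc_or_docs)
    batch = doc_or_docs.is_a?(Hash) ? [doc_or_docs] : doc_or_docs
    @round_trips += 1
    @documents.concat(batch)
    batch.length
  end
end

# Builds plain hashes shaped like the intended Tag documents.
# :name and :kind are assumed field names; only :account_id is from the text.
def build_tag_documents(account_id, names)
  names.map { |name| { account_id: account_id, name: name, kind: "label" } }
end

tags = FakeCollection.new
docs = build_tag_documents(42, %w[ruby mongodb batching])

# One call carries all three documents in a single round trip.
tags.insert(docs)
```

Inserting the same documents one at a time would call insert once per hash, so the stand-in would record three round trips instead of one.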
In the code above, insert takes a collection of hashes; what’s more, the insert is tied to the tags collection via the reference to the model’s underlying collection.
Batch inserts are almost always faster when you have a lot of similar documents; in our case, we saw a tremendous performance increase when employing batching.