Migrating Data with FasterCSV and Ar-Extensions in Rails
By proph3t May 18, 2008 · 0 Comments
Today I polished up our system for transporting the data over to the new site. It runs very smoothly, and for the developers out there I have detailed it a bit below. Basically, it now takes under a minute to move all the content from the live GameGum you see now to the new, improved, and totally redone GameGum we have been slaving over for a while. While this may seem basic, it is very refreshing to be able to see the new GameGum in its full glory.
How We're Doing It
For any Rails developers out there, the process is not as hard as you may think. We are moving the database by dumping it into CSV files, and then creating a set of migrations used specifically for the imports. With the help of the gems fastercsv and ar-extensions, we can move the entire users table in about 12 seconds, and the games table in about 15, with data validated as we move. Because the tables are so big, I used a few tricks to keep the load down while migrating. Here's a small snippet:
require 'fastercsv'
require 'ar-extensions'

# matched to the new database's columns
new_columns = [:id, :title, ...]
values = []
n = 1
FasterCSV.foreach("#{RAILS_ROOT}/path/to/table_data.csv") do |row|
  # match these to the old data's columns; Ruby's destructuring assignment
  # splits the row array into one variable per column
  id, title, ... = row
  values << [id, title, ...] # add the values to our array
  # here's the part that keeps the load down: import in batches of 500 rows
  if n % 500 == 0
    Game.import new_columns, values
    values = []
    GC.start # force a garbage collection to free the imported batch
  end
  n += 1
end
# import any remaining rows that did not fill a complete batch
Game.import(new_columns, values) unless values.empty?
Update: Looks like the beginning of the code is not being formatted, some Markdown issues. Oh well, should be readable. Will of course be fixed when we migrate the data :).
That's it! You can see that for the most part it's straightforward. ar-extensions and FasterCSV brought the time down a ton. Our first attempt at importing the users table took almost an hour, and with the refactoring done to get what is shown above we can comfortably run the entire migration set multiple times in a matter of minutes, which is great for testing and tweaking during the building process.
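If you want to play with the batching idea outside of Rails, here is a minimal sketch of it. It assumes Ruby 1.9 or later, where FasterCSV became the standard `csv` library. The helper name `each_csv_batch` and the sample rows are made up for illustration, and the block stands in for a bulk insert such as `Game.import`; this is not the code from our migrations.

```ruby
require 'csv' # on Ruby 1.9+, FasterCSV is the standard CSV library

# Hypothetical helper: parses CSV text and yields the rows in batches
# of at most batch_size, so only one batch is held in memory at a time.
def each_csv_batch(csv_text, batch_size = 500)
  batch = []
  CSV.parse(csv_text) do |row|
    batch << row
    if batch.size == batch_size
      yield batch
      batch = [] # start a fresh array so the yielded batch is left intact
    end
  end
  # hand over any leftover rows that did not fill a complete batch
  yield batch unless batch.empty?
end

# Made-up sample data; in a real import the block would call
# something like Game.import(new_columns, batch) instead.
csv_text = "1,Pong\n2,Tetris\n3,Snake\n"
batches = []
each_csv_batch(csv_text, 2) { |batch| batches << batch }
# batches is now [[["1", "Pong"], ["2", "Tetris"]], [["3", "Snake"]]]
```

Counting the rows in the batch itself, instead of keeping a separate counter, means the flush condition and the leftover check both look at the same array.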

