Now that Google has announced availability of Cloud SQL storage for app engine, what will be the best way to migrate existing data from BigTable/GAE Datastore to MySQL?
To respond to the excellent questions brought up by Peter:
- In my particular scenario, I kept my data model very relational.
- I am fine with taking my site down for a few hours to make the transition, or at least warning people that any changes they make for the next few hours will be lost due to database maintenance, etc.
- My data set is not very large - the main dashboard for my app says .67gb, and the datastore statistics page says it's more like 200mb.
- I am using python.
- I am not using the blobstore (although I think that is a separate question from a pure datastore migration - one could migrate datastore usage to MySql while maintaining the blobstore).
- I would be fine with paying a reasonable amount (say, less than $100).
- I believe my application is Master/Slave - it was created during the preview period of App Engine. I can't seem to find an easy way to verify that though.
It seems like the bulk uplo开发者_StackOverflowader should be able to be used to download the data into a text format that could then be loaded with mysqlimport, but I don't have any experience with either technology. Also, it appears that Cloud SQL only supports importing mysqldumps, so I would have to install MqSQL locally, mysqlimport the data, then dump it, then import the dump?
An example of my current model code, in case it's required:
class OilPatternCategory(db.Model):
version = db.IntegerProperty(default=1)
user = db.UserProperty()
name = db.StringProperty(required=True)
default = db.BooleanProperty(default=False)
class OilPattern(db.Model):
version = db.IntegerProperty(default=2)
user = db.UserProperty()
name = db.StringProperty(required=True)
length = db.IntegerProperty()
description = db.TextProperty()
sport = db.BooleanProperty(default=False)
default = db.BooleanProperty(default=False)
retired = db.BooleanProperty(default=False)
category = db.CategoryProperty()
class League(db.Model):
version = db.IntegerProperty(default=1)
user = db.UserProperty(required=True)
name = db.StringProperty(required=True)
center = db.ReferenceProperty(Center)
pattern = db.ReferenceProperty(OilPattern)
public = db.BooleanProperty(default=True)
notes = db.TextProperty()
class Tournament(db.Model):
version = db.IntegerProperty(default=1)
user = db.UserProperty(required=True)
name = db.StringProperty(required=True)
center = db.ReferenceProperty(Center)
pattern = db.ReferenceProperty(OilPattern)
public = db.BooleanProperty(default=True)
notes = db.TextProperty()
class Series(db.Model):
version = db.IntegerProperty(default=3)
created = db.DateTimeProperty(auto_now_add=True)
user = db.UserProperty(required=True)
date = db.DateProperty()
name = db.StringProperty()
center = db.ReferenceProperty(Center)
pattern = db.ReferenceProperty(OilPattern)
league = db.ReferenceProperty(League)
tournament = db.ReferenceProperty(Tournament)
public = db.BooleanProperty(default=True)
notes = db.TextProperty()
allow_comments = db.BooleanProperty(default=True)
complete = db.BooleanProperty(default=False)
score = db.IntegerProperty(default=0)
class Game(db.Model):
version = db.IntegerProperty(default=5)
user = db.UserProperty(required=True)
series = db.ReferenceProperty(Series)
score = db.IntegerProperty()
game_number = db.IntegerProperty()
pair = db.StringProperty()
notes = db.TextProperty()
entry_mode = db.StringProperty(choices=entry_modes, default=default_entry_mode)
Have you considered using the Map Reduce framework? You could write mappers that store the datastore entities in CloudSQL. Do not forget to add a column for the datastore key, this might help you avoiding duplicate rows or identifying missing rows.
You might have a look at https://github.com/hudora/gaetk_replication for an inspiration on the mapper functions.
精彩评论