I've got a CSV with some data that looks like this:
A0A0A0,48.5674500000,-54.8432250000,Gander,NL
A0A1A0,47.0073470000,-52.9589210000,Aquaforte,NL
A0A1B0,47.3622800000,-53.2939930000,Avondale,NL
But my database is normalized such that Cities and Provinces are in separate tables, each with their own ID column.
So what's the easiest way to import this file into 3 separate tables and link the foreign k开发者_JS百科eys properly?
To be more clear, the tables are
cities (id, name, province_id)
provinces (id, code, name, country_id)
postal_codes (id, code, city_id)
countries (id, code, name)
Use COPY
to import the csv into a temp table. Than use some INSERT INTO ... SELECT ... FROM ...
to dump the data in the correct tables.
... my database is normalized
Doesn't appear to be. There are many issues, but the one that will trip you up in this question is, there does not seem to be correct PKs, no Unique Keys at all; so you will end up with duplicated data. Id
"keys" do not prevent duplicate names
, you need an unique index on name
. It is not clear how you support two towns with the same name in the same province.
You know you have to load three tables from the one imported table. Due to FKs, which are a Good Thing, you need to load Provinces first, then Cities, then PostalCodes. But from the look of your import file, it is cities (or towns or localities or suburbs) ... that resolution needs to be clearly identified first. There are 360 kilometres and dozens of localities between Gander and Aquaforte. What exactly constitutes a record in the file ?
It may help to understand the structure on the excellent Canadian postal code system.
Then you need to check what level of granularity you are storing in the Db. Apparently Cities or towns, but not suburbs, not localities. What about Counties or Parishes ? Eg
_0A ___
means it is a rural area; since you are storing cities, not counties, not municipalities, you can ignore them.
Once you are clear about the granularity or resolution of the source data, and the level of resolution you want in the target tables, you can then load the import file, most probably is several waves per table. The SQL is easy.
精彩评论