开发者

sql string extraction

开发者 https://www.devze.com 2023-03-24 08:24 出处:网络
I have a column in sql server 2008 which is a varchar(255). The column is full of \'measurements\' and there are around 15,000 records.

I have a column in sql server 2008 which is a varchar(255). The column is full of 'measurements' and there are around 15,000 records. As the column is of type string and has been manually inputted by human it has become开发者_C百科 messy e.g.

  • 100x200cm
  • 200x400mm
  • 3" by 9"
  • 30 x 20
  • 12cm long

Basically I need to extract all measurements (that I can) into usable dimensions and store each unit into a new column (height/width/depth).

Can anyone suggest the best way to go about doing this?

I was looking at regular expression but not sure on the best way forward


I recommend a very strict parser which throws an exception whenever it encounters something it cannot handle explicitly. This way you are save from surprises.

First of, you should use a scripting language for that.

Fetch the records and iterate over them and for each record you try to parse it:

  • ^([1-9][0-9]*)x([1-9][0-9]*)([a-z]+)$ - if it matches, you match $3 against a list of valid units, for example "cm", "mm", "m", etc.

  • ^([1-9][0-9]*)([^ ]+) by ([1-9][0-9]*)(.+)$ - if it matches, compare the units at match $2 and $4. Match them against a list of valid units again, and if that's successful, but they are not the same, convert one into the other.

You keep adding strict checks for each item which throws an exception. This way, your list of expressions expands with the cases you handle.

0

精彩评论

暂无评论...
验证码 换一张
取 消