How do I filter out non-numeric values in a text field in Teradata?

Developer https://www.devze.com 2023-01-12 18:35 Source: web

I have a Teradata table with about 10 million records in it that stores a numeric id field as a varchar. I need to transfer the values in this field to a bigint column in another table, but I can't simply say cast(id_field as bigint) because I get an invalid character error. Looking through the values, I find that there could be a non-numeric character at any position in the string, so, let's say the string is varchar(18), I could filter out invalid rows like so:

     where substr(id_field,1,1) not in (/*big,ugly array of non-numeric chars*/)
     and substr(id_field,2,1) not in (/*big,ugly array of non-numeric chars*/)

etc, etc... 

Then the cast would work, but this is not feasible in the long run. It's slow, and if the string has 18 possible characters, it makes the query unreadable. How can I filter out rows whose value in this field will not cast as a bigint, without checking each character individually against an array of non-numeric characters?

example values would be

   123abc464
   a2.3v65
   a_356087
   ........
   000000000
   BOB KNIGHT
   1235468099

The values follow no specific pattern; I simply need to filter out the ones that contain ANY non-numeric data. 123456789 is okay but 123.abc_c3865 is not...


Starting with TD14, Teradata added several functions, so there are now multiple ways, e.g.:

WHERE RTRIM(col, '0123456789') = ''

But the easiest way is TO_NUMBER, which returns NULL for bad data:

TO_NUMBER(col)
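
Put together with the transfer described in the question, the digit-only test might look like the sketch below. The table and column names (src_table, tgt_table, id) are placeholders I made up for illustration, and it assumes TD14 or later so that the two-argument RTRIM is available.

```sql
-- Hypothetical names: src_table(id_field VARCHAR(18)) and tgt_table(id BIGINT).
INSERT INTO tgt_table (id)
SELECT CAST(id_field AS BIGINT)
FROM src_table
WHERE id_field <> ''                         -- skip empty strings (NULLs drop out too)
  AND RTRIM(id_field, '0123456789') = '';    -- nothing is left once the digits are trimmed
```

A VARCHAR(18) of pure digits always fits in a BIGINT, so the cast should not overflow. The TO_NUMBER variant would filter on TO_NUMBER(id_field) IS NOT NULL instead; as far as I know TO_NUMBER also accepts things like a sign or a decimal point, so the RTRIM test is likely the stricter of the two.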


The best that I've ever managed is this:

where char2hexint(upper(id_field)) = char2hexint(lower(id_field))

Since upper-case characters have a different hex value from lower-case ones, this ensures that you have no alphabetical characters, but it will still leave you with underscores, colons and so forth. If this doesn't meet your requirements, you may need to write a UDF.


Could we also try dividing the values in the field by some integer? If the division succeeds, the value must be a number; if it throws an error, the value must contain some non-numeric character. I guess this would be a lot faster, as only arithmetic is involved...


I've faced the same issue when trying to strip alpha characters from street-address house numbers. The following will work if you don't mind concatenating all the remaining characters together. For each position it checks whether the upper-cased character equals the lower-cased one; if so the character is kept (digits, but also punctuation), and if not it is replaced by an empty string. The else branch returns '' rather than null because concatenating a NULL makes the whole result NULL in Teradata. In a Teradata-mode session string comparisons are not case specific by default, so the comparison may also need a (CASESPECIFIC) qualifier.

select cast(case when upper(substring('12E' from 1 for 1)) = lower(substring('12E' from 1 for 1)) then substring('12E' from 1 for 1) else '' end ||
             case when upper(substring('12E' from 2 for 1)) = lower(substring('12E' from 2 for 1)) then substring('12E' from 2 for 1) else '' end ||
             case when upper(substring('12E' from 3 for 1)) = lower(substring('12E' from 3 for 1)) then substring('12E' from 3 for 1) else '' end ||
             case when upper(substring('12E' from 4 for 1)) = lower(substring('12E' from 4 for 1)) then substring('12E' from 4 for 1) else '' end ||
             case when upper(substring('12E' from 5 for 1)) = lower(substring('12E' from 5 for 1)) then substring('12E' from 5 for 1) else '' end ||
             case when upper(substring('12E' from 6 for 1)) = lower(substring('12E' from 6 for 1)) then substring('12E' from 6 for 1) else '' end
             as integer) 


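Try using this code segment (the bracketed character class is SQL Server's LIKE syntax; Teradata's LIKE only understands the % and _ wildcards, so on Teradata this pattern is matched literally):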

WHERE id_Field NOT LIKE '%[^0-9]%'
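
The same "no character outside 0-9" idea can be sketched in Teradata with REGEXP_SIMILAR, which swaps the LIKE pattern for a regular expression. This assumes TD14 or later, and src_table is a placeholder table name that is not from the original post.

```sql
-- Keep only rows whose whole value consists of one or more digits.
SELECT id_field
FROM src_table                                   -- hypothetical table name
WHERE REGEXP_SIMILAR(id_field, '^[0-9]+$') = 1;  -- returns 1 on a match, 0 otherwise
```

Because the pattern requires at least one digit, empty strings are excluded as well, and NULL values drop out since the comparison is unknown for them.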


I found lins314159's answer to be very helpful with a similar issue. It may be an old thread, but for what it's worth, I used:

char2hexint(upper(id_field)) = char2hexint(lower(id_field)) AND substr(id_field,1,1) IN ('1' to '9')

to successfully cast the remaining VARCHAR results to INT


SELECT customer_id
FROM t
WHERE UPPER(customer_id)(CASESPECIFIC) <>
      LOWER(customer_id)(CASESPECIFIC);

This works fine for finding values in a supposedly numeric field that contain letters; values whose only non-numeric characters are punctuation are not caught.


SELECT id_field
FROM t  -- t stands in for your table
WHERE OTRANSLATE(id_field, '0123456789', '') <> '';

This works well for me! It reveals any id_field containing a non-numeric value
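
Before running the transfer it can help to see how many rows are affected. A minimal sketch of such a count, assuming TD14 or later for OTRANSLATE and a placeholder table name src_table that is not from the original post:

```sql
-- src_table is a hypothetical name; substitute the real source table.
SELECT
    SUM(CASE WHEN OTRANSLATE(id_field, '0123456789', '') =  '' THEN 1 ELSE 0 END) AS digits_only_rows,
    SUM(CASE WHEN OTRANSLATE(id_field, '0123456789', '') <> '' THEN 1 ELSE 0 END) AS non_numeric_rows
FROM src_table;
```

Rows where id_field is NULL are counted in neither column, and an empty string lands in digits_only_rows, so those two cases may need their own handling.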
