oracle regex replace (keep only a-z)_问答_开发者

开发者 https://www.devze.com 2023-01-31 11:29 出处：网络

I have data in 开发者_如何学JAVAlast_name column. This data comes from online and at times users are copy pasting the last name from a word document.This is a problem when a last name has a single qu

相关专题：oracle regex

I have data in 开发者_如何学JAVAlast_name column.

This data comes from online and at times users are copy pasting the last name from a word document. This is a problem when a last name has a single quote. Somehow the single quote from a word document is weird.

I want to write an oracle regex replace in my select query such a way that it replaces everything in the last_name column but just keeps (a-z or A-Z).

Is this doable?

finally I went with this:

REGEXP_REPLACE(mbr_last_name,'[^a-zA-Z'']','') replaced_last_name

I'm keeping a to z A to Z and a single quote

By 'weird' do you mean that it is not the normal single quote? Word has some really stupid characters that it uses by default (called 'Smart quotes (google)'. which look a bit like the standard quotes but have different behaviour when converting to and from ASCII and for example UTF-8

To identify them in SQL do a select using the function ASCIISTR and look for a sub-string in the form <backslash><4 digits> (This works with a DB with UTF-8 and simialr characters, I'm unsure about its return in a database with ASCII NLS parameters)

select asciistr(COLUMN) from table

I have used the following code to remove them in some recent work I have been doing

update jiraissue
set summary = replace(asciistr(summary), '\2013','-')
where asciistr(summary) like '%\2013%';`)

HTH

You can use the regexp_replace operator for instance:

select regexp_replace('foobar1000!!!!','[[:cntrl:]]|[[[:digit:]]|[[:punct:]]') from dual;

REGEXP_REPLACE('FOOBAR1000!!!!','[[:CNTRL:]]|[[[:DIGIT:]]|[[:PUNCT:]]') 
----------------------------------------------------------------------- 
foobar

Worthwhile looking at the CONVERT function, which will do a conversion between character sets. You can convert to 'plain' 7-bit ASCII characters to remove anomalies.

For example