开发者

SQL function that extracts strings

开发者 https://www.devze.com 2023-02-24 21:08 出处:网络
From a prior question, this SQL function extracts strings following a set string. 开发者_StackOverflow中文版so,

From a prior question, this SQL function extracts strings following a set string.

开发者_StackOverflow中文版

so,

extractAfter("hello this is a Test:12356.jpg reset of line","Test:")

returns: 12356.jpg

It works, but I need to tweak it to consider two other things:

  • in addition to whitespaces the string should end when "<" is encounter
  • that if no match, I'd like to return null

so,

extractAfter("hello this is a Test:12356.jpg<br> reset of line","Test:")

would also return:12356.jpg

create function dbo.extractAfter(@full nvarchar(max), @part nvarchar(max)) 
returns nvarchar(max) with returns null on null input as 
begin 
    return ltrim(stuff(left(@full,charindex(' ', @full + ' ', charindex(@part,@full)+1)),     1, charindex(@part,@full)+datalength(@part)/2 -1, '')) 
end 
go 


CREATE FUNCTION dbo.extractAfter (@full nvarchar(max), @part nvarchar(max))
RETURNS nvarchar(max)
WITH RETURNS NULL ON NULL INPUT
AS BEGIN
  RETURN (
    SELECT LEFT(p, PATINDEX('%[ <]%', p + ' ') - 1)
    FROM (SELECT
      p = STUFF(@full, 1, NULLIF(CHARINDEX(@part, @full), 0) + LEN(@part) - 1, '')
    ) s
  )
END


Rather than try to keep tweaking a string parsing function when T-SQL really does not specialize in text-parsing, I would recommend using Regular Expressions if you are on SQL Server 2005 or newer. You can find many examples on the internet of the exact SQLCLR code for these, or you can download the Free version of the SQL# library (which I wrote) and start using them right away. Here are some examples using your specific situation:

SELECT SQL#.RegEx_CaptureGroup('hello this is a Test:12356.jpg<br> reset of line',  
           'Test:([^ ]+)<br>', 1, NULL, 1, -1, '')
-- 12356.jpg

SELECT SQL#.RegEx_CaptureGroup('hello this is a Test:<br> reset of line',  
           'Test:([^ ]+)<br>', 1, NULL, 1, -1, '')
-- NULL

SELECT SQL#.RegEx_CaptureGroup('hello this is a T:12356.jpg<br> reset of line',  
           'Test:([^ ]+)<br>', 1, NULL, 1, -1, '')
-- NULL

In each case, the pattern match starts with the word "Test:" then grabs any non-space characters until it reaches a "<br>" (or a space since it is looking for non-space). If not found due to nothing being between the "Test:" and the "<br>" or if the "Test:" isn't there to begin with, it returns NULL.


Declare @TestString varchar(max)
Declare @TestSearch varchar(max)

Set @TestString = 'hello this is a Test:12356.jpg<br> reset of line'
Set @TestSearch = 'Test:'

Select Case
        When CharIndex( @TestSearch, @TestSTring ) = 0 Then ''
        When Left(Reverse( Z.Value ), Len(Z.Value)) = Reverse( Z.Value ) Then Z.Value
        Else Substring( Value, 1, PatIndex( '%[<> ]%', Value ) - 1 )
        End
From ( Select Substring( @TestString
                , CharIndex( @TestSearch, @TestSTring ) + Len(@TestSearch)
                , Len(@TestString )
    ) As Value ) As Z

I have slightly modified my solution to account for the scenario where the search string happens to be at the end of the input string.

0

精彩评论

暂无评论...
验证码 换一张
取 消