I'm currently building a SQL Server full-text index based search feature for our site, and need to split up the user's input into a set of search terms. Once the terms are split, I then need to re-build them into a string containing the query that will be executed against CONTAINS() or a CONTAINSTABLE() full-text query.
For example, if the user enters
Jon Sidnell
into our search box,开发者_运维技巧 I want to be able to transform that string into the following:
'("jon*" OR FORMSOF(THESAURUS, jon) OR FORMSOF(INFLECTIONAL, jon)) OR
("sidnell*" OR FORMSOF(THESAURUS, sidnell) OR FORMSOF(INFLECTIONAL, sidnell))'
Obviously, if there were three words in the user input, there would be three sets of "wildcard OR thesaurus OR inflectional" terms.
Being a T-SQL novice (not newbie, but certainly not a guru!) I'm not sure of the best way to go about this. I've googled about, and while I've come across things that would help with the initial string splitting, I've not really got a clue how to best use that split-up representation to construct the resulting string.
Can anyone help please?
I don't know if this would be the most efficient approach to your problem, but one thought that comes to mind is to encapsulate the logic to split the input string into a Table-valued function.
Invoke the function and store the results to a table variable.
Iterate over the table variable and concatenate the split strings into the final string you'll use for your search.
I haven't included the code for the function here (for brevity), but in my case my function accepts the string to split and the delimiter to split the string on and returns a table with the following structure: Position INT, Value VARCHAR(8000)
Once you have the function in place you can incorporate it similar to the following:
SET NOCOUNT ON
DECLARE @sampleString VARCHAR(500)
SET @sampleString = 'Jon Sidnell Rocks'
DECLARE @delimiter VARCHAR(20);
SET @delimiter = ' '
DECLARE @SplitResults TABLE (
POSITION INT,
VALUE VARCHAR(8000),
fUsed BIT DEFAULT 0)
INSERT INTO @SplitResults ( POSITION, VALUE )
SELECT * FROM dbo.ufn_SplitString(@sampleString, @delimiter)
--Set up a simple loop instead of having to open up a cursor
DECLARE @Value VARCHAR(8000);
DECLARE @Position INT;
SELECT @Value = q.VALUE, @Position = q.Position
FROM (SELECT TOP 1 VALUE, Position FROM @SplitResults WHERE fUsed = 0)q
DECLARE @SearchString VARCHAR(8000)
WHILE @@ROWCOUNT <> 0 AND @Value IS NOT NULL
BEGIN
IF @Position = 1
BEGIN
SET @SearchString = '("' + @Value + '*" OR FORMSOF(THESAURUS, '+ @Value +') OR FORMSOF(INFLECTIONAL, ' + @Value + '))'
END
ELSE
BEGIN
SET @SearchString = @SearchString + ' OR ("' + @Value + '*" OR FORMSOF(THESAURUS, '+ @Value +') OR FORMSOF(INFLECTIONAL, ' + @Value + '))'
END
--Update record so we know we used it
UPDATE @SplitResults SET fUsed = 1
WHERE Position = @Position AND VALUE = @Value
--Get Next Value to Work With
SELECT @Value = q.VALUE, @Position = q.Position
FROM (SELECT TOP 1 VALUE, Position FROM @SplitResults WHERE fUsed = 0)q
END
PRINT @SearchString;
SET NOCOUNT OFF;
The output should look something like this:
("Jon*" OR FORMSOF(THESAURUS, Jon) OR FORMSOF(INFLECTIONAL, Jon)) OR ("Sidnell*" OR FORMSOF(THESAURUS, Sidnell) OR FORMSOF(INFLECTIONAL, Sidnell)) OR ("Rocks*" OR FORMSOF(THESAURUS, Rocks) OR FORMSOF(INFLECTIONAL, Rocks))
There are fairly nice string manipulation methods if you utilize a SQLCLR user-defined function to split the string up. You can use the following string and apply the String.Format method for each search term. It should be adequately fast unless there is an extremely high volume, and maybe even then.
"(\"{0}*\" OR FORMSOF(THESAURUS, {0}) OR FORMSOF(INFLECTIONAL, {0}))"
public static SqlString convertStringToFTS(SqlString input)
{
string[] strings = input.ToString().Split(new string[] {" "}, StringSplitOptions.RemoveEmptyEntries);
StringBuilder sb = new StringBuilder();
foreach (string s in strings)
{
if (sb.Length > 0)
{
sb.Append(" OR ");
}
else
{
sb.Append ("(");
}
sb.Append(string.Format("(\"{0}*\" OR FORMSOF(THESAURUS, {0}) OR FORMSOF(INFLECTIONAL, {0}))", s));
}
sb.Append(")");
return sb.ToString();
}
I will say that there may be more efficient means to achieve this.
Good luck. I hope this helps.
精彩评论