开发者

SQL: Splitting a column into multiple words to search user input

开发者 https://www.devze.com 2023-01-05 00:36 出处:网络
I want to compare the individual words from the user input to individual words from a column in my table.

I want to compare the individual words from the user input to individual words from a column in my table.

For example, consider these rows in my table:

ID Name
1  Jack Nicholson
2  Henry Jack Blueberry
3  Pontiac Riddleson Jack

Consider that the user's input is 'Pontiac Jack'. I want to assign weights/ranks for each match, so I can't use a blanket LIKE (WHERE Name LIKE @SearchString).

If Pontiac is present in any ro开发者_StackOverflow中文版w, I want to award it 10 points. Each match for Jack gets another 10 points, etc. So row 3 would get 20 points, and rows 1 and 2 get 10.

I have split the user input into individual words, and stored them into a temporary table @SearchWords(Word).

But I can't figure out a way to have a SELECT statement that allows me to combine this. Maybe I'm going about this the wrong way?

Cheers, WT


For SQL Server, try this:

SELECT Word, COUNT(Word) * 10 AS WordCount
FROM SourceTable
INNER JOIN SearchWords ON CHARINDEX(SearchWords.Word, SourceTable.Name) > 0
GROUP BY Word


What about this? (this is MySQL syntax, I think you only have to replace the CONCAT and do it with +)

SELECT names.id, count(searchwords.word) FROM names, searchwords WHERE names.name LIKE CONCAT('%', searchwords.word, '%') GROUP BY names.id

Then you would have a SQL result with the ID of the names-table and count of the words that match to that id.


You could do it via a common table expression that works out the weighting. For example:

--** Set up the example tables and data
DECLARE @Name TABLE (id INT IDENTITY, name VARCHAR(50));
DECLARE @SearchWords TABLE (word VARCHAR(50));

INSERT INTO @Name
        (name)
VALUES  ('Jack Nicholson')
       ,('Henry Jack Blueberry')
       ,('Pontiac Riddleson Jack')
       ,('Fred Bloggs');

INSERT INTO @SearchWords
        (word)
VALUES  ('Jack')
       ,('Pontiac');

--** Example SELECT with @Name selected and ordered by words in @SearchWords
WITH Order_CTE (weighting, id)
AS (
    SELECT COUNT(*) AS weighting
         , id 
      FROM @Name AS n
      JOIN @SearchWords AS sw
        ON n.name LIKE '%' + sw.word + '%' 
     GROUP BY id
)
SELECT n.name
     , cte.weighting
  FROM @Name AS n
  JOIN Order_CTE AS cte
    ON n.id = cte.id
 ORDER BY cte.weighting DESC;

Using this technique, you can also apply a value to each search word if you wanted to. So you could make Jack more valueable than Pontiac. This would look something like this:

--** Set up the example tables and data
DECLARE @Name TABLE (id INT IDENTITY, name VARCHAR(50));
DECLARE @SearchWords TABLE (word VARCHAR(50), value INT);

INSERT INTO @Name
        (name)
VALUES  ('Jack Nicholson')
       ,('Henry Jack Blueberry')
       ,('Pontiac Riddleson Jack')
       ,('Fred Bloggs');

--** Set up search words with associated value
INSERT INTO @SearchWords
        (word, value)
VALUES  ('Jack',10)
       ,('Pontiac',20)
       ,('Bloggs',40);


--** Example SELECT with @Name selected and ordered by words and values in @SearchWords
WITH Order_CTE (weighting, id)
AS (
    SELECT SUM(sw.value) AS weighting
         , id 
      FROM @Name AS n
      JOIN @SearchWords AS sw
        ON n.name LIKE '%' + sw.word + '%' 
     GROUP BY id
)
SELECT n.name
     , cte.weighting
  FROM @Name AS n
  JOIN Order_CTE AS cte
    ON n.id = cte.id
 ORDER BY cte.weighting DESC;      


Seems to me that the best thing to do would be to maintain a separate table with all the individual words. Eg:

ID     Word       FK_ID
1      Jack       1
2      Nicholson  1
3      Henry      2
(etc)

This table would be kept up to date with triggers, and you'd have a non-clustered index on 'Word', 'FK_ID'. Then the SQL to produce your weightings would be simple and efficient.


How about something like this....

Select id, MAX(names.name), count(id)*10 from names
inner join @SearchWords as sw on 
    names.name like '%'+sw.word+'%'
group by id 

assuming that table with names called "names".

0

精彩评论

暂无评论...
验证码 换一张
取 消