
How to match columns in MySQL

Developer https://www.devze.com 2022-12-08 20:36 Source: Internet
Everyone knows the "=" sign.

SELECT * FROM mytable WHERE column1 = column2;

However, what if column1 and column2 have different contents that are still VERY similar (maybe off by a space, or with one word that's different)?

Is it possible to:

SELECT * FROM mytable WHERE ....column matches column2 with .4523423 "Score"...

I believe the technical term for this is fuzzy matching? Or pattern matching?

EDIT: I know about Soundex and Levenshtein distance. Is that what you recommend?


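What you are looking for is called Levenshtein distance. It gives you a number that describes the difference between two strings: the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other.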

MySQL has no built-in function for this, so you have to write a stored procedure for it. Here is the article that may help.
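To make the distance concrete, here is a minimal sketch of it in Python rather than as a stored procedure. The function names levenshtein and similarity are my own, and dividing by the longer string's length is just one plausible way to get the kind of 0-to-1 "score" the question asks for, not a standard MySQL feature.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string in the outer loop
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical score: 1.0 for identical strings, lower as they diverge."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(a, b) / longest
```

A stored procedure would implement the same table-filling loop in SQL; the logic does not change, only where it runs.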


Lukasz Lysik posted a reference to a stored procedure that can do the fuzzy match from inside the database. If you want to do this as an ongoing task, that is your best bet.

But if you want to do this as a one-off task, and if you might want to do complicated checks, or if you want to do something complicated to clean up the fuzzy matches, you might want to do the fuzzy matching from within Python. (One of your tags is "python" so I assume you are open to a Python solution...)

Using a Python ORM, you can get a Python list with one object per database row, and then use the full power of Python to analyze your data. You could use regular expressions, Python Levenshtein functions, or anything else.
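As a rough sketch of that workflow, the snippet below pulls rows into Python and scores each pair of columns. To keep it runnable without any setup it makes several assumptions that are not from the thread: it uses the standard library's sqlite3 with an in-memory table as a stand-in for a MySQL connection or an ORM query, the sample rows are invented, and the score comes from difflib.SequenceMatcher (a similarity ratio, not Levenshtein distance). The 0.8 threshold is arbitrary.

```python
import sqlite3
from difflib import SequenceMatcher

THRESHOLD = 0.8  # hypothetical cut-off; tune it for your own data

# Stand-in for the real database: an in-memory table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT)"
)
conn.executemany(
    "INSERT INTO mytable (column1, column2) VALUES (?, ?)",
    [
        ("John Smith", "John  Smith"),    # off by a space
        ("Acme Corporation", "Acme Corp"),
        ("apple", "zebra"),
    ],
)


def fuzzy_rows(connection, threshold):
    """Return (id, score) for rows whose two columns are similar enough."""
    matches = []
    query = "SELECT id, column1, column2 FROM mytable ORDER BY id"
    for row_id, c1, c2 in connection.execute(query):
        # ratio() is 1.0 for identical strings and approaches 0.0 as they diverge
        score = SequenceMatcher(None, c1, c2).ratio()
        if score >= threshold:
            matches.append((row_id, score))
    return matches


matches = fuzzy_rows(conn, THRESHOLD)
for row_id, score in matches:
    print(f"row {row_id}: score {score:.3f}")
```

With these sample rows only the first pair clears the 0.8 threshold. Swapping in an ORM query or a Levenshtein-based score only changes the loop body, which is the appeal of doing a one-off cleanup in Python.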

The all-around best ORM for Python is probably SQLAlchemy. I actually like the ORM from Django a little better; it's a little simpler, and I value simplicity. If your ORM needs are not complicated, the Django ORM may be a good choice. If in doubt, just go to SQLAlchemy.

Good luck!
