UDFs

UDFs

Nikola Smolenski
Would it be possible to add some user-defined functions to the MySQL servers?
I have Levenshtein distance in mind:
http://empyrean.lib.ndsu.nodak.edu/~nem/mysql/udf/dludf.cgi?ckey=28

_______________________________________________
Toolserver-l mailing list
[hidden email]
http://lists.wikimedia.org/mailman/listinfo/toolserver-l

Re: UDFs

Nikola Smolenski
On Sunday 28 October 2007 01:28, Nikola Smolenski wrote:
> Would it be possible to add some user defined functions to MySQL servers?
> I'm having in mind Levenshtein distance:
> http://empyrean.lib.ndsu.nodak.edu/~nem/mysql/udf/dludf.cgi?ckey=28

Given that there was no response to this, perhaps I should be a bit less
terse.

A user-defined function (UDF) is a function added to MySQL from an external
library. Once installed, it is used like any other MySQL function. You can
read more at
http://dev.mysql.com/doc/refman/5.0/en/adding-functions.html and
http://dev.mysql.com/doc/refman/5.0/en/create-function.html . I can
(hopefully) compile the library myself, but root access is needed to add the
function to MySQL.

The function I would like to have is Levenshtein distance. It measures how
similar two strings are by counting the minimum number of single-character
insertions, deletions and substitutions needed to turn one into the other.
This is comparable to the SOUNDEX() function, but SOUNDEX() is limited to
English, and even for English, Levenshtein distance could perform better. You
can read more at
http://en.wikipedia.org/wiki/Levenshtein_distance . I see two very
interesting applications for it: finding articles with similar titles, and
measuring the size of a contribution. The first is obviously useful for
locating missing redirects, duplicate articles and similar problems, and
could later even be included in MediaWiki to assist searching. The second
could be used to highlight significant edits (an edit that changes a lot of
text in an article may still make only a small difference in text size),
although this probably wouldn't be useful on the Toolserver.
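
To illustrate what such a function computes, here is a minimal Python sketch
of Levenshtein distance using the usual two-row dynamic-programming approach.
It is not the code of the linked UDF (which would be a C library loaded into
MySQL). The function name levenshtein and the sample strings are assumptions
chosen purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop (less memory)

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b; initially the prefix of a is empty.
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current

    return previous[-1]


# Hypothetical sample data: a misspelled title is one edit away.
print(levenshtein("Levenshtein distance", "Levenstein distance"))  # 1

# An edit can replace every character while leaving the size unchanged.
old_text, new_text = "abc", "xyz"
print(len(new_text) - len(old_text))     # 0
print(levenshtein(old_text, new_text))   # 3
```

The last two lines show the second application: the size difference is zero,
yet the distance reports that the whole text changed. Note that this approach
takes time proportional to the product of the two string lengths, so it suits
titles much better than full article texts.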

To my knowledge, there should be no performance issues (MySQL shouldn't run
slower with UDFs installed), and I have not found any reported security
issues.

If someone thinks that another UDF might be useful, see a list of them at
http://empyrean.lib.ndsu.nodak.edu/~nem/mysql/udf/
