Discovery Weekly Update for the week starting 2018-06-04

Chris Koerner-2
Here's the weekly update from the Search Platform team.

As always, feedback and questions welcome.

== Discussions ==

=== Search ===
* After lots of talk about stemmers getting committed and plugins
getting deployed, the Slovak-language wikis have finally been
*reindexed*, and stemming [0] is now happening on the Slovak wikis!

=== Search—Time Machine Edition  ===
A few things from May that got missed:

* Trey wrote up some potential applications of natural language
processing (NLP) to on-wiki search [2]. We're still going through them
to pick out a couple that we'll turn into projects, probably next
quarter. Right now, spelling correction and entity extraction are high
on the list, but more questions, comments, and suggestions are welcome.
* Erik pulled 90 days' worth of regular expression (regex) searches
across all wikis, and Trey did a quick survey of the most common
patterns [3]. There are a lot more regex searches than we thought (5.6
million in 90 days!), and three apparently automated processes (bots,
apps, or tools of some kind) are responsible for more than 90% of the
regex searches.


Subscribe to receive on-wiki (or opt-in email) notifications of the
Discovery weekly update.

The archive of all past updates can be found on

Interested in getting involved? See tasks marked as "Easy" or
"Volunteer needed" in Phabricator.


Chris Koerner
Community Liaison
Wikimedia Foundation

Wikitech-l mailing list