search server migration

Robert Stojnic-2
Hi all,

As some of you have noticed, there are some new features on the enwiki
search page. We are in the process of migrating all of our internal search
to lucene-search 2.1, running on our brand-new, shiny search cluster. Since
not all of the new servers could be racked at once, we are unracking old
ones and putting new ones in. As a result, enwiki has moved to the new
cluster, along with (partially) dewiki, frwiki and jawiki. The others are
still on the old cluster.

Two new features to notice on enwiki are:
1) Text snippets are now handled internally by lucene-search, which means
more intelligent snippet extraction, and also detection of matches to
redirects and sections.
2) Interwiki matches. When a query matches a title from a sister project in
the same language, a box appears on the right holding a link to the sister
project page. So, for enwiki, a search matching an enwiktionary page will
also appear on the search page. The captions for the different projects can
be made more intelligent by tuning MediaWiki:search-interwiki-custom. For
instance, for enwiki it could be (the format is one interwiki:caption pair
per line):
wikt:Wiktionary word definitions
n:Wikinews news results
..
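
As a rough illustration of the interwiki:caption format above, here is a
minimal Python sketch of how such a message could be read into a lookup
table. The function name parse_interwiki_captions is hypothetical, and the
rules for skipping blank or malformed lines are assumptions; this is not
the code MediaWiki or lucene-search actually uses.

```python
def parse_interwiki_captions(text):
    """Parse 'interwiki:caption' lines into a {prefix: caption} dict.

    Hypothetical helper: the line format comes from the message above,
    but skipping malformed lines is an assumption of this sketch.
    """
    captions = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and lines without a separator (assumption).
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so captions may contain colons.
        prefix, caption = line.split(":", 1)
        prefix = prefix.strip()
        caption = caption.strip()
        if prefix and caption:
            captions[prefix] = caption
    return captions


example = """wikt:Wiktionary word definitions
n:Wikinews news results
"""
captions = parse_interwiki_captions(example)
```

Splitting on the first colon only keeps a caption such as "News: latest"
intact, which seems the safer reading of a one-pair-per-line format.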

The search should also, hopefully, give better-ranked results than before.
More new features will arrive when we update the rest of the software,
including the MWSearch plugin used to fetch results from the search
servers. I'll try to keep the community updated as new things come up, and
hopefully we will finish the whole migration in a few weeks' time.

Cheers, Robert
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: search server migration

Chad
On Sat, Oct 18, 2008 at 7:27 PM, Robert Stojnic <[hidden email]> wrote:

> Hi all,
>
> As some of you have noticed, there are some new features on the enwiki
> search page. We are in the process of migrating all of our internal search
> to lucene-search 2.1, running on our brand-new, shiny search cluster. Since
> not all of the new servers could be racked at once, we are unracking old
> ones and putting new ones in. As a result, enwiki has moved to the new
> cluster, along with (partially) dewiki, frwiki and jawiki. The others are
> still on the old cluster.
>
> Two new features to notice on enwiki are:
> 1) Text snippets are now handled internally by lucene-search, which means
> more intelligent snippet extraction, and also detection of matches to
> redirects and sections.
> 2) Interwiki matches. When a query matches a title from a sister project in
> the same language, a box appears on the right holding a link to the sister
> project page. So, for enwiki, a search matching an enwiktionary page will
> also appear on the search page. The captions for the different projects can
> be made more intelligent by tuning MediaWiki:search-interwiki-custom. For
> instance, for enwiki it could be (the format is one interwiki:caption pair
> per line):
> wikt:Wiktionary word definitions
> n:Wikinews news results
> ..
>
> The search should also, hopefully, give better-ranked results than before.
> More new features will arrive when we update the rest of the software,
> including the MWSearch plugin used to fetch results from the search
> servers. I'll try to keep the community updated as new things come up, and
> hopefully we will finish the whole migration in a few weeks' time.
>
> Cheers, Robert

Awesome!

-Chad

Re: search server migration

Platonides
In reply to this post by Robert Stojnic-2
Great :)
Thanks, Robert



Re: search server migration

Brion Vibber-3
In reply to this post by Robert Stojnic-2
Robert Stojnic wrote:
> 2) Interwiki matches. When a query matches a title from a sister project in
> the same language, a box appears on the right holding a link to the sister
> project page. So, for enwiki, a search matching an enwiktionary page will
> also appear on the search page.

This is totally awesome!

> The search should also, hopefully, give better-ranked results than before.
> More new features will arrive when we update the rest of the software,
> including the MWSearch plugin used to fetch results from the search
> servers. I'll try to keep the community updated as new things come up, and
> hopefully we will finish the whole migration in a few weeks' time.

Thanks for all your work, Robert!

-- brion


Re: search server migration

John Mark Vandenberg
On Tue, Oct 21, 2008 at 2:37 AM, Brion Vibber <[hidden email]> wrote:

> Robert Stojnic wrote:
>> 2) Interwiki matches. When a query matches a title from a sister project in
>> the same language, a box appears on the right holding a link to the sister
>> project page. So, for enwiki, a search matching an enwiktionary page will
>> also appear on the search page.
>
> This is totally awesome!

That is an understatement! :-)

When I saw a Wikisource page being offered for a Wikipedia search for
the title of a marginally notable item, it was one of those "omg"
moments. It is so sensible, and very cool.

>> The search should also hopefully give better ranked results than before.

I look forward to seeing how the new search algorithm works for
Wikisource; it sounds like it will be a radical improvement for our
purposes.

--
John Vandenberg


Re: search server migration

Nikola Smolenski
In reply to this post by Robert Stojnic-2
On Sunday 19 October 2008 01:27:05 Robert Stojnic wrote:
> 2) Interwiki matches. When a query matches a title from a sister project in
> the same language, a box appears on the right holding a link to the sister
> project page. So, for enwiki, a search matching an enwiktionary page will
> also appear on the search page. The captions for the different projects can
> be made more intelligent by tuning MediaWiki:search-interwiki-custom. For
> instance, for enwiki it could be (the format is one interwiki:caption pair
> per line):
> wikt:Wiktionary word definitions
> n:Wikinews news results

Great thing, and it works great! :D I have several questions, though.

1) How do we turn it on? It works on enwiki, but doesn't seem to work on
srwiki (e.g. try searching for итацизам, which exists in srwiktionary). It
doesn't seem to work even on enwiktionary...

2) Could it also display sister project results if only a few matches exist
in the current project?

3) Could it include sister projects in other languages, provided of course
that all the language projects agree? Obvious applications of this are the
b/h/sh/s wikis, but also the recently announced Egyptian and classical
Arabic, and similar cases.


Re: search server migration

Brion Vibber-3
Nikola Smolenski wrote:

> On Sunday 19 October 2008 01:27:05 Robert Stojnic wrote:
>> 2) Interwiki matches. When a query matches a title from a sister project in
>> the same language, a box appears on the right holding a link to the sister
>> project page. So, for enwiki, a search matching an enwiktionary page will
>> also appear on the search page. The captions for the different projects can
>> be made more intelligent by tuning MediaWiki:search-interwiki-custom. For
>> instance, for enwiki it could be (the format is one interwiki:caption pair
>> per line):
>> wikt:Wiktionary word definitions
>> n:Wikinews news results
>
> Great thing, and it works great! :D I have several questions, though.
>
> 1) How do we turn it on?

It will be enabled on the backend bit by bit as things are migrated over.

-- brion


Re: search server migration

Robert Stojnic-2
In reply to this post by Nikola Smolenski
> 1) How do we turn it on? It works on enwiki, but doesn't seem to work on
> srwiki (e.g. try searching for итацизам, which exists in srwiktionary). It
> doesn't seem to work even on enwiktionary...


If everything goes well, it'll be enabled on all projects this weekend.


> 2) Could it also display sister project results if only a few matches exist
> in the current project?


Yes, even when there is no match in the local wiki, one can get matches from
sister projects.


> 3) Could it include sister projects in other languages, provided of course
> that all the language projects agree? Obvious applications of this are the
> b/h/sh/s wikis, but also the recently announced Egyptian and classical
> Arabic, and similar cases.

Yes, one can group wikis together in a pretty much arbitrary way. The only
limitation is that once wikis are grouped together, all of them get results
from the same group. I grouped them into sister projects because that seemed
most relevant. The only exception currently is the meta-mediawiki.org group.
As for the ex-YU languages, I'm not sure there would be consensus for such a
feature, but if there were, it could be enabled.
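
To make the grouping rule above a little more concrete, here is a small
Python sketch. It assumes each wiki belongs to at most one group, which is
one reading of the limitation described, and it lets every member of a
group draw interwiki results from all the other members. The group names,
the wiki lists and both function names are made up for illustration; this
is not how lucene-search stores its configuration.

```python
# Hypothetical group table: names and member lists are illustrative only.
SEARCH_GROUPS = {
    "en-sisters": ["enwiki", "enwiktionary", "enwikinews", "enwikisource"],
    "meta-mediawiki": ["metawiki", "mediawikiwiki"],
}


def build_group_index(groups):
    """Map each wiki to its group, rejecting wikis listed in two groups."""
    index = {}
    for group_name, members in groups.items():
        for wiki in members:
            if wiki in index:
                raise ValueError(
                    f"{wiki} is already in group {index[wiki]}"
                )
            index[wiki] = group_name
    return index


def interwiki_sources(wiki, groups, index):
    """Return the other wikis in the same group as `wiki` (or none)."""
    group_name = index.get(wiki)
    if group_name is None:
        return []
    return [member for member in groups[group_name] if member != wiki]


index = build_group_index(SEARCH_GROUPS)
sources = interwiki_sources("enwiki", SEARCH_GROUPS, index)
```

Under this assumption, adding a wiki to a group changes the interwiki
results of every other member as well, which is why a cross-language group
would need agreement from all the projects involved.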

r.

Re: search server migration

Nikola Smolenski
Robert Stojnic wrote:
>> 2) Could it also display sister project results if only a few matches exist
>> in the current project?
>
> Yes, even when there is no match in the local wiki, one can get matches from
> sister projects.

My question is the opposite: could we get matches from sister projects
when there are matches in the local wiki (but only a few of them)?


Re: search server migration

Brianna Laugher
In reply to this post by Robert Stojnic-2
2008/10/24 Robert Stojnic <[hidden email]>:

>> 3) Could it include sister projects in other languages, provided of course
>> that all the language projects agree? Obvious applications of this are the
>> b/h/sh/s wikis, but also the recently announced Egyptian and classical
>> Arabic, and similar cases.
>
> Yes, one can group wikis together in a pretty much arbitrary way. The only
> limitation is that once wikis are grouped together, all of them get results
> from the same group. I grouped them into sister projects because that seemed
> most relevant. The only exception currently is the meta-mediawiki.org group.
> As for the ex-YU languages, I'm not sure there would be consensus for such a
> feature, but if there were, it could be enabled.

What about Wikimedia Commons? How is it grouped? Hopefully it would be
in all groups.

I understand that showing thumbnails would probably overload the search
page, so restricting results to the main and category namespaces could
be a good compromise.

cheers
Brianna

--
They've just been waiting in a mountain for the right moment:
http://modernthings.org/
