Full-text search - list=search VS generator=search

Full-text search - list=search VS generator=search

Luigi Assom
Hello,

I would like to better understand the difference in using list=search VS generator=search for full-text search.

I've read that list=search relies on Elasticsearch. What are the differences in indexing and in the returned results between list=search and generator=search?

I also need to query the page IDs of the returned articles.
I can do this using generator=search: the page IDs correspond to the returned pages (example in sandbox: https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=extracts|redirects&format=json&rdprop=pageid%7Ctitle&indexpageids=&generator=search&gsrsearch=dj%20tiesto)

But I cannot do it with list=search.
I tried: list=search + generator=allpages + the indexpageids parameter.

The page IDs in query['pageids'] are not related to the articles in the query['search'] list. It looks like the generator is querying new pages by itself instead of taking the list as input.

Could you please help me write a query using list=search that also fetches the page IDs of the returned pages?

My sandbox attempt is:

_______________________________________________
Mediawiki-api mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
Re: Full-text search - list=search VS generator=search

Brad Jorsch (Anomie)
On Thu, Jan 28, 2016 at 5:21 AM, Luigi Assom <[hidden email]> wrote:
Hello,

I would like to better understand the difference in using list=search VS generator=search for full-text search.

I've read that list=search relies on Elasticsearch. What are the differences in indexing and in the returned results between list=search and generator=search?

Are you actually using generator=search? Below you state that you're using generator=allpages, which is obviously going to give you different results.

Try an example like https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=search&format=json&srsearch=dj%20tiesto&srprop=snippet|titlesnippet&indexpageids=&generator=search&gsrsearch=dj%20tiesto instead.
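
A minimal Python sketch of the combined request suggested above, assuming the endpoint https://en.wikipedia.org/w/api.php (the thread only gives ApiSandbox links) and assuming the usual format=json layout, where generator pages appear under query.pages and list results under query.search. The helper names and the sample reply are hypothetical; the IDs in the sample are placeholders, not real data.

```python
from urllib.parse import urlencode

# Assumed endpoint; the thread itself only links to Special:ApiSandbox.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_search_url(keywords):
    """Build one request that runs list=search and generator=search on the same keywords."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": keywords,
        "srprop": "snippet|titlesnippet",
        "generator": "search",
        "gsrsearch": keywords,
        "indexpageids": "",
    }
    return API_URL + "?" + urlencode(params)


def attach_page_ids(response):
    """Copy page IDs from the generator's pages onto the list=search hits, matched by title."""
    query = response.get("query", {})
    title_to_id = {
        page["title"]: page["pageid"]
        for page in query.get("pages", {}).values()
        if "pageid" in page
    }
    # Keep an existing 'pageid' on the hit if the generator has no page with that title.
    return [
        dict(hit, pageid=title_to_id.get(hit["title"], hit.get("pageid")))
        for hit in query.get("search", [])
    ]


# Hand-written sample shaped like a JSON reply (placeholder titles and IDs).
sample = {
    "query": {
        "pageids": ["111", "222"],
        "pages": {
            "111": {"pageid": 111, "ns": 0, "title": "Example A"},
            "222": {"pageid": 222, "ns": 0, "title": "Example B"},
        },
        "search": [
            {"ns": 0, "title": "Example B", "snippet": "..."},
            {"ns": 0, "title": "Example A", "snippet": "..."},
        ],
    }
}

hits_with_ids = attach_page_ids(sample)
```

The match is done by title because both modules receive the same keywords. If the two modules are given different limits or namespaces, some hits may have no matching page, in which case their pageid stays unset (None).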
 

--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation

Re: Full-text search - list=search VS generator=search

XDiscovery Team
Hi Brad,

I tried a query with:
params = {'action': 'query', 'generator': 'search', 'gsrnamespace': 0, 'gsrsearch': keywords, 'gsrlimit': 20, 'prop': 'pageimages|extracts', 'pilimit': 'max', 'exintro': '', 'explaintext': '', 'exsentences': 3, 'exlimit': 'max', 'redirects': ''}


I want to try a different query with list=search, but also fetch the page_IDs for the results.

I tried:
action=query&list=search&format=json&srsearch=gene%20editing&srprop=snippet&indexpageids=&generator=allpages

but the page IDs returned by the generator do not match the pages in the search list.

How can I fetch page IDs for the results of list=search?


I would also be happy to fetch decorations (images) in the same query, that is, to use a single generator to complete the list with page IDs and images.
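
One way to get page IDs and images in a single request is to stay with generator=search and read everything from query.pages. The Python sketch below reuses the parameters quoted earlier in this message. It assumes that each generated page carries an "index" field with its search rank and that prop=pageimages adds a "thumbnail" object with a "source" URL; both are assumptions about the reply layout, and the function names and sample data are hypothetical.

```python
def build_params(keywords, limit=20):
    """Parameters from this thread, plus format=json so the reply can be parsed as JSON."""
    return {
        "action": "query",
        "format": "json",
        "generator": "search",
        "gsrnamespace": 0,
        "gsrsearch": keywords,
        "gsrlimit": limit,
        "prop": "pageimages|extracts",
        "pilimit": "max",
        "exintro": "",
        "explaintext": "",
        "exsentences": 3,
        "exlimit": "max",
        "redirects": "",
    }


def pages_in_search_order(response):
    """Return the generated pages sorted by their 'index' field (assumed to be the search rank)."""
    pages = response.get("query", {}).get("pages", {}).values()
    return sorted(pages, key=lambda page: page.get("index", 0))


def summarize(page):
    """Keep only the fields of interest: page ID, title, extract, and thumbnail URL if present."""
    return {
        "pageid": page.get("pageid"),
        "title": page.get("title"),
        "extract": page.get("extract"),
        "thumbnail": page.get("thumbnail", {}).get("source"),
    }


# Hand-written sample shaped like a JSON reply (placeholder values, not real data).
sample = {
    "query": {
        "pages": {
            "222": {"pageid": 222, "title": "Example B", "index": 2, "extract": "Text B."},
            "111": {
                "pageid": 111,
                "title": "Example A",
                "index": 1,
                "extract": "Text A.",
                "thumbnail": {"source": "https://example.org/a.jpg", "width": 50, "height": 50},
            },
        }
    }
}

results = [summarize(page) for page in pages_in_search_order(sample)]
```

The params dictionary can be passed to any HTTP client. Sorting by "index" matters because query.pages is keyed by page ID and does not preserve the search ranking.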

Finally, I'd like to understand the difference between list=search and generator=search: do they reflect different indexing or architecture (e.g. response time, or indexing done in Elasticsearch vs. Lucene)?




On Thu, Jan 28, 2016 at 5:00 PM, Brad Jorsch (Anomie) <[hidden email]> wrote:
Are you actually using generator=search? Below you state that you're using generator=allpages, which is obviously going to give you different results.

 





--
Luigi Assom