Re: [Services] REST API - Assistance Required - Content Classification and Filtering

Re: [Services] REST API - Assistance Required - Content Classification and Filtering

Michael Holloway
(+mediawiki-api)

Hi Chris,

I think many of us may be having trouble answering this because it's not quite clear what you're trying to do.  Can you be more concrete about what categories (or what category scheme) you have in mind?

Wikipedia doesn't have a single, overarching hierarchy of categories.  A page may be associated with any number of categories (including zero).  Some of these categories may be subcategories of other categories.  Editors may freely create and remove categories, and add and remove page associations with these categories. 

The REST API currently doesn't expose any category information, but you can obtain it through follow-up requests to the Action API (https://en.wikipedia.org/w/api.php).  For example:

To get all categories associated with the page "Marfa, Texas" on English Wikipedia:

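A request of this kind can be sketched in Python as follows, assuming the Action API's `prop=categories` query module. The helper name `categories_url` and the `cllimit=max` setting are illustrative choices, not taken from the message above. The sketch only builds the URL; fetching it and following the `continue` value in the response for further batches is left to the caller.

```python
from urllib.parse import urlencode

# Action API endpoint for English Wikipedia, as named in the message above.
API = "https://en.wikipedia.org/w/api.php"


def categories_url(title):
    """Build an Action API URL that lists the categories of one page."""
    params = {
        "action": "query",
        "prop": "categories",
        "titles": title,
        "cllimit": "max",  # assumption: request the largest batch allowed
        "format": "json",
    }
    return API + "?" + urlencode(params)


print(categories_url("Marfa, Texas"))
```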

To get all pages associated with the category "Category:Cities in Presidio County, Texas":

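The reverse lookup can be sketched the same way, assuming the Action API's `list=categorymembers` module. The helper name `category_members_url` and the `cmlimit=max` setting are hypothetical choices made for illustration.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def category_members_url(category):
    """Build an Action API URL that lists the pages in one category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,  # full title, including the "Category:" prefix
        "cmlimit": "max",     # assumption: request the largest batch allowed
        "format": "json",
    }
    return API + "?" + urlencode(params)


print(category_members_url("Category:Cities in Presidio County, Texas"))
```

Since subcategories can themselves be members of a category, a client that wants a tree has to issue this request once per level; there is no single call that returns a fixed top-level scheme.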

Best,
Michael



On Wed, Oct 25, 2017 at 5:13 PM, Christopher Smyth <[hidden email]> wrote:

Hello,

We’re a small app development company that has integrated Wikipedia content into a geo-locating iOS app. The app is working well and the Wiki content is displaying correctly. However, we’d like to categorise the Wikipedia content into three categories rather than just one.

Is there a way to filter and categorise Wikipedia content that is accessed through the REST API? We only use content that is geo-coded (i.e. each article has latitude and longitude information associated with it).

How should we go about configuring our API integration so that we can split Wikipedia content according to its top-level categories? Is there a way to do this?

 

Many thanks for your assistance with this request.

Regards,
Chris Smyth

Christopher Smyth
Director
Inflighto
[hidden email]
+61 (0)417 298 598

_______________________________________________
Services mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/services



_______________________________________________
Mediawiki-api mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api

Re: [Services] REST API - Assistance Required - Content Classification and Filtering

Michael Holloway
A couple more tips which may be helpful, depending on which API endpoint(s) you're using:

If you're using the REST API's page HTML endpoint (/api/rest_v1/page/html/<title>), category links are marked up with rel="mw:PageProp/Category", and you can use this fact to find all linked categories with querySelectorAll.

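Outside a browser, the same idea can be sketched with Python's standard `html.parser`, as a rough stand-in for `querySelectorAll('[rel="mw:PageProp/Category"]')`. The class and function names and the sample fragment are hypothetical; real page HTML would come from the endpoint mentioned above. Unlike the exact-match selector, this sketch also accepts a `rel` attribute that lists several space-separated values.

```python
from html.parser import HTMLParser


class CategoryLinkParser(HTMLParser):
    """Collect href values of elements marked rel="mw:PageProp/Category"."""

    def __init__(self):
        super().__init__()
        self.categories = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        attr = dict(attrs)
        rels = (attr.get("rel") or "").split()
        if "mw:PageProp/Category" in rels and attr.get("href"):
            self.categories.append(attr["href"])


def category_hrefs(html):
    """Return the href of every category link found in an HTML string."""
    parser = CategoryLinkParser()
    parser.feed(html)
    parser.close()
    return parser.categories


# Hypothetical fragment, for illustration only.
sample = (
    '<link rel="mw:PageProp/Category" href="./Category:Example"/>'
    '<a rel="mw:WikiLink" href="./Texas">Texas</a>'
)
print(category_hrefs(sample))  # ['./Category:Example']
```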

If you want to query all categories for a page, but omit hidden categories (those not shown on a typical article view, and which may have less taxonomic value to the average reader), you can use the clshow parameter:

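As a sketch, assuming the `prop=categories` module accepts `clshow=!hidden` to exclude hidden categories, the request could be built like this. The helper name `visible_categories_url` and the example title are illustrative.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def visible_categories_url(title):
    """Build a categories query that leaves out hidden categories."""
    params = {
        "action": "query",
        "prop": "categories",
        "titles": title,
        "clshow": "!hidden",  # drop hidden (maintenance-style) categories
        "format": "json",
    }
    return API + "?" + urlencode(params)


print(visible_categories_url("Marfa, Texas"))
```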

mdh



On Mon, Oct 30, 2017 at 11:49 AM, Michael Holloway <[hidden email]> wrote:
[...]