How to get, via a geosearch, the list of Wikipedia articles missing an illustration, i.e. without a photo

Oleksiy Muzalyev
Greetings, Wikimedia API colleagues,

I am working on an application that finds the geo-locations of articles
in different language versions of Wikipedia, including articles which
need an illustration: http://ausleuchtung.ch/geo_wiki/ . It locates
articles either by the coordinates in the articles themselves, or by the
wikipedia tag on the OSM map.

When searching by the coordinates contained in Wikipedia articles, the
application basically uses this query:

https://fr.wikipedia.org/w/api.php?action=query&list=geosearch&gsradius=7000&gscoord=46.56|6.30&gslimit=20
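
For reference, the query above could be issued from a script roughly as follows. This is a minimal Python sketch, not the application's actual code: the function names, the `format=json` parameter, and the User-Agent string are assumptions added for illustration, while the remaining parameters are the ones from the URL above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://fr.wikipedia.org/w/api.php"


def geosearch_params(lat, lon, radius_m=7000, limit=20):
    """Build the parameters of the geosearch query quoted in the message."""
    return {
        "action": "query",
        "list": "geosearch",
        "gscoord": f"{lat}|{lon}",  # latitude|longitude
        "gsradius": radius_m,       # search radius in metres
        "gslimit": limit,
        "format": "json",           # assumption: ask for JSON output
    }


def parse_geosearch(response):
    """Extract (pageid, title, lat, lon) tuples from a decoded JSON reply."""
    hits = response.get("query", {}).get("geosearch", [])
    return [(h["pageid"], h["title"], h["lat"], h["lon"]) for h in hits]


def geosearch(lat, lon, radius_m=7000, limit=20):
    """Send the request and return the parsed hits (needs network access)."""
    url = API_URL + "?" + urlencode(geosearch_params(lat, lon, radius_m, limit))
    # Placeholder User-Agent; a real client should identify itself properly.
    request = Request(url, headers={"User-Agent": "geo-wiki-sketch/0.1 (example)"})
    with urlopen(request, timeout=10) as reply:
        return parse_geosearch(json.load(reply))
```

Calling `geosearch(46.56, 6.30)` would then return the nearby pages, each of which still has to be checked for illustrations separately.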

Is it possible to modify this query so that it returns only the
Wikipedia articles which have no image illustration, without looping
over all "pageid"s?

But even with looping, checking each "pageid" in turn, I run into a
problem. For example, these articles have no illustration:

https://fr.wikipedia.org/?curid=9754525
or
https://fr.wikipedia.org/?curid=8578447
(this article even carries an "image missing" banner).

Still, the following queries return numerous auxiliary JPG, SVG, and
PNG images:

https://fr.wikipedia.org/w/api.php?action=parse&pageid=9754525&prop=images
https://fr.wikipedia.org/w/api.php?action=parse&pageid=8578447&prop=images

The application displays the geo-locations of articles around the point
clicked on the map, but I then have to open each article by clicking on
the link in the corresponding geo-marker in order to find an article
without an illustration. This is quite time-consuming and error-prone,
as there are often dozens of geo-markers, and so dozens of articles to
open.

With best regards,
Oleksiy

_______________________________________________
Mediawiki-api mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api

Re: How to get, via a geosearch, the list of Wikipedia articles missing an illustration, i.e. without a photo

Magnus Manske-2
I have run into this issue several times over the years. My current project at [1] uses Wikidata instead of Wikipedia, as it is easy to get the "image or not" information from Wikidata, and any Wikipedia article associated with a Wikidata item could use the image there, and thus *potentially* has an image.

You can currently get Wikidata items with coordinates but no image from [2] and [3], but only [2] does "around these coordinates" queries. Also, feel free to use the API at [1], if you can figure it out.

A remaining problem is that, while Wikidata has more items with images than any Wikipedia (save en, for now), there are still plenty of Wikipedia pages with images where the Wikidata item has none. One can use [4] to quickly add existing images from Wikipedia to Wikidata.
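
One way to try the "coordinates but no image" idea directly is a SPARQL query against the Wikidata Query Service. The Python sketch below only builds such a query and its request URL. It is an illustration under stated assumptions, not necessarily what the tools at [2] and [3] do: the use of query.wikidata.org, the function names, and the "fr,en" label languages are all assumptions. P625 is Wikidata's coordinate-location property and P18 its image property.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service endpoint is used.
WDQS_URL = "https://query.wikidata.org/sparql"


def items_without_image_query(lat, lon, radius_km, limit=50):
    """Build a SPARQL query for items near a point that have no image (P18)."""
    # WKT points are written as "Point(longitude latitude)".
    return f"""
SELECT ?item ?itemLabel ?location WHERE {{
  SERVICE wikibase:around {{
    ?item wdt:P625 ?location .
    bd:serviceParam wikibase:center "Point({lon} {lat})"^^geo:wktLiteral .
    bd:serviceParam wikibase:radius "{radius_km}" .
  }}
  FILTER NOT EXISTS {{ ?item wdt:P18 ?image . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "fr,en" . }}
}}
LIMIT {limit}
""".strip()


def wdqs_request_url(query):
    """Return a GET URL that asks the query service for JSON results."""
    return WDQS_URL + "?" + urlencode({"query": query, "format": "json"})
```

As noted above, an item without an image on Wikidata does not guarantee that the linked Wikipedia article lacks one, so the results are candidates rather than a definitive list.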

Cheers,
Magnus

On Mon, Apr 25, 2016 at 8:40 AM Oleksiy Muzalyev <[hidden email]> wrote:
[...]


Re: How to get, via a geosearch, the list of Wikipedia articles missing an illustration, i.e. without a photo

Oleksiy Muzalyev
Hello Magnus,

Thank you for the information.

I wish that in the future the API would offer, in addition to "prop=images", an option such as "prop=illustrations" to retrieve only those images in a Wikipedia article which are actual illustrations, such as:

[[File:Grandson-Castle.JPG|thumb|Grandson Castle, aerial photo]]

or

{{Infobox
| image = Some_Image.JPG

At present, even though there is not a single JPG, PNG, or SVG file in the wiki-code of the article https://fr.wikipedia.org/?curid=8578447 , the query for this article:

https://fr.wikipedia.org/w/api.php?action=parse&pageid=8578447&prop=images

still returns eleven image files (JPG and SVG), which are present in the HTML code of the article.
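
Until such an option exists, the distinction described above can be approximated on the client side by scanning the wiki-code itself. The Python sketch below is a rough heuristic under stated assumptions: it assumes the wikitext has already been fetched (for example with `action=parse&prop=wikitext`), and the list of namespace aliases, the file extensions, and the function name are illustrative choices. Images inserted through templates with other parameter names would be missed.

```python
import re

# File extensions treated as images (assumption for this sketch).
IMAGE_EXT = r"(?:jpe?g|png|svg|gif|tiff?|webp)"

# [[File:Name.jpg|...]] plus a few namespace aliases such as "Fichier" or "Image".
FILE_LINK = re.compile(
    r"\[\[\s*(?:File|Image|Fichier)\s*:\s*([^|\]]+?)\s*[|\]]",
    re.IGNORECASE,
)

# Template parameters such as "| image = Name.jpg".
IMAGE_PARAM = re.compile(
    r"\|\s*image\w*\s*=\s*([^|\n}]+?\." + IMAGE_EXT + r")\s*(?=[|\n}]|$)",
    re.IGNORECASE,
)


def illustrations_in_wikitext(wikitext):
    """Return file names that appear directly in the wiki-code, without duplicates."""
    names = FILE_LINK.findall(wikitext) + IMAGE_PARAM.findall(wikitext)
    unique = []
    for name in names:
        name = name.strip()
        if name not in unique:
            unique.append(name)
    return unique
```

An article for which this function returns an empty list has no directly embedded file, even if `prop=images` still reports icons pulled in by templates.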

Meanwhile, I will explore the Wikidata approach as you advised.

With best regards,
Oleksiy

On 25/04/16 10:34, Magnus Manske wrote:
[...]


Re: How to get, via a geosearch, the list of Wikipedia articles missing an illustration, i.e. without a photo

Bartosz Dziewoński
On 2016-04-26 11:06, Oleksiy Muzalyev wrote:
> I wish in future there would be in addition to "prop=images" an option
> in the API "prop=illustrations" to retrieve images in a Wikipedia
> article which are actually illustrations such as:
>
> [[File:Grandson-Castle.JPG|thumb|Grandson Castle, aerial photo]]

There kind of is: "prop=pageimages" (provided by an extension:
https://www.mediawiki.org/wiki/Extension:PageImages). It seems that it
returns only one image, though (the one it decides is the most prominent
on the page).

https://en.wikipedia.org/w/api.php?action=query&format=jsonfm&prop=images&titles=The+Fighting+Temeraire

https://en.wikipedia.org/w/api.php?action=query&format=jsonfm&prop=pageimages&titles=The+Fighting+Temeraire
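
Combining this with the original question, geosearch can be used as a generator so that a single request returns the nearby pages together with their page image, if any. The Python sketch below is one possible arrangement under stated assumptions, not a confirmed recipe from this thread: the function names are made up, and it treats a page with no "pageimage" key in the JSON reply (default format, where "pages" is keyed by page id) as lacking an illustration.

```python
from urllib.parse import urlencode

API_URL = "https://fr.wikipedia.org/w/api.php"


def nearby_pageimages_params(lat, lon, radius_m=7000, limit=20):
    """Parameters for geosearch used as a generator, combined with prop=pageimages."""
    return {
        "action": "query",
        "generator": "geosearch",   # generator parameters take the "g" prefix
        "ggscoord": f"{lat}|{lon}",
        "ggsradius": radius_m,
        "ggslimit": limit,
        "prop": "pageimages",
        "piprop": "name",           # only the file name of the page image
        "pilimit": limit,
        "format": "json",
    }


def request_url(lat, lon, radius_m=7000, limit=20):
    """Full URL for the combined query."""
    return API_URL + "?" + urlencode(nearby_pageimages_params(lat, lon, radius_m, limit))


def pages_without_pageimage(response):
    """Return sorted (pageid, title) pairs for pages with no "pageimage" key."""
    pages = response.get("query", {}).get("pages", {})
    return sorted(
        (page["pageid"], page["title"])
        for page in pages.values()
        if "pageimage" not in page
    )
```

Since PageImages picks its image by its own rules, a page it reports as having no image may still contain one, so the result is best treated as a shortlist to review.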

--
Bartosz Dziewoński
