Finding what is said about a topic in other articles

Kerry Raymond
Is there already a tool (or "how hard would it be?") that would show the
user what is said about article X in other articles? It seems to me that
there are a lot of easy content additions that might be found that way and
used to flesh out stubs and other shorter articles. What motivates this
is that "what links here" often points to some surprising articles which
can reveal new insights into a topic. I often write about places. Often I
think "oh, this one's nothing special" and suddenly "what links here"
reveals some interesting events that occurred there: the discovery of a
famous fossil, a big role in World War II, or the birthplace of someone
quite famous. So I am wondering if there is a way to automate this process
a bit by quickly drilling down to the relevant chunk of the article content
rather than having to read/search the whole thing.

That is, if I were writing the article [[Bang Bang Jump Up]], I would want a
list along the lines of:

* From article [[Winston Churchill]] within section "After the Second
  World War": On 23 July 1944 at [[Bang Bang Jump Up]], he met [[Harry
  Truman]] to discuss the establishment of the [[United Nations]].

(False news alert: These world leaders did not meet at Bang Bang Jump Up,
but let's pretend they did.)

That is, a list of the articles with the sentence/para containing the link,
or +/- N chars before or after the link, whatever's feasible to create an
intelligible snippet without having to read the whole article.
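
A minimal sketch of the snippet idea described above, assuming the raw wikitext of the linking article is already in hand. The function name findLinkSnippets, the radius parameter and the "(lead)" label are illustrative assumptions and do not come from any existing tool. Because it reads raw wikitext, it does not see links that only appear once templates such as navboxes are expanded, but it will still pick up links inside citations and template parameters.

```javascript
// Hypothetical helper: find links to `title` in raw wikitext and return,
// for each one, the enclosing section heading and +/- `radius` characters
// of surrounding text.
function findLinkSnippets(wikitext, title, radius = 80) {
  // Escape regex metacharacters in the title.
  const escaped = title.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Match [[Title]], [[Title|label]] and [[Title#Section]] only, so that
  // [[Title Something Else]] is not counted.
  const linkRe = new RegExp('\\[\\[' + escaped + '(?=[\\]|#])', 'g');
  // Section headings look like "== Heading ==" on a line of their own.
  const headingRe = /^(={2,})\s*(.*?)\s*\1\s*$/gm;

  const headings = [];
  let h;
  while ((h = headingRe.exec(wikitext)) !== null) {
    headings.push({ index: h.index, text: h[2] });
  }

  const results = [];
  let m;
  while ((m = linkRe.exec(wikitext)) !== null) {
    // The last heading that starts before the link names its section.
    let section = '(lead)';
    for (const heading of headings) {
      if (heading.index < m.index) section = heading.text;
      else break;
    }
    const start = Math.max(0, m.index - radius);
    const end = Math.min(wikitext.length, m.index + m[0].length + radius);
    results.push({
      section,
      // Collapse line breaks so the snippet reads as one line.
      snippet: wikitext.slice(start, end).replace(/\s+/g, ' ').trim(),
    });
  }
  return results;
}
```

Cutting at a fixed number of characters is the crudest of the options mentioned; cutting at sentence or paragraph boundaries would read better but needs more careful parsing of the wikitext.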

I am assuming here that article X is linked from Y (I'm not considering text
mentions). Of course, the success of the tool depends on its ability to pick
what might be most relevant. Nobody wants to wade through a list of
irrelevant mentions. So I would want to stick to links occurring in the prose
of the article body rather than navbox transclusions, links in citations,
templates and so forth. I also think that ordering the list by some "likely
to be most useful" metric would be beneficial (or ideally the user could
fiddle with those choices at run-time). Until one has such a tool to get
experience with, it's hard to know what might constitute more "relevant".
But some metrics might be:

* The relative importance of the topics. I suspect if a more important
  topic is mentioning a less important topic, it might be more relevant.
  Winston Churchill is more important than Bang Bang Jump Up.

* The relative quality of the articles. I suspect if a high quality
  article is mentioning a low quality article, it might be more relevant.
  Winston Churchill is a higher quality article than Bang Bang Jump Up.

* Being tagged by the same WikiProject (or not within the same
  WikiProject). Not sure which would likely be more relevant but it might be
  interesting to explore. It's unlikely Winston Churchill and Bang Bang Jump
  Up are in the same WikiProject.

* The other article is not already linked in this article. That is,
  if Bang Bang Jump Up already links to Winston Churchill, then probably this
  is less likely to be "new information" for the Bang Bang Jump Up article.
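
The four candidate metrics could be combined into a simple score along these lines. This is only a sketch: the field names (importance, quality, projects, linksTo), the numeric scales and the weights are all made-up assumptions, and since it is unclear whether sharing a WikiProject should count for or against a mention, that weight is left adjustable (a negative value tests the opposite hypothesis).

```javascript
// Illustrative weights only; none of these values come from real data.
const DEFAULT_WEIGHTS = {
  importance: 1,        // source topic is more important than the target
  quality: 1,           // source article is of higher quality than the target
  differentProject: 0.5, // source and target share no WikiProject
  notAlreadyLinked: 2,  // target does not already link to the source
};

// `source` is the article containing the link; `target` is the article
// being written. Both are plain objects in a hypothetical shape:
// { title, importance, quality, projects: [...], linksTo: [...] }
function scoreMention(source, target, weights = DEFAULT_WEIGHTS) {
  let score = 0;
  if (source.importance > target.importance) score += weights.importance;
  if (source.quality > target.quality) score += weights.quality;
  const sharesProject = source.projects.some((p) => target.projects.includes(p));
  if (!sharesProject) score += weights.differentProject;
  if (!target.linksTo.includes(source.title)) score += weights.notAlreadyLinked;
  return score;
}

// Return a new array of sources, most promising first.
function rankMentions(sources, target, weights = DEFAULT_WEIGHTS) {
  return [...sources].sort(
    (a, b) => scoreMention(b, target, weights) - scoreMention(a, target, weights)
  );
}
```

Passing the weights as a parameter is one way to let the user fiddle with the ordering at run-time.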

Anyhow, do we have a tool that does something along these lines? If not, is
there a student project here? :)

Kerry

_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Finding what is said about a topic in other articles

Nicholas Moreau
Hi,

I would *love* this tool as well. Being a frequent editor of "List of
people from" articles, it would save me oodles of time to be able to pass
by false positives in What links here. (A third of the links to a community
are gold, the rest are template transclusions and people competing in a
tourney at X community.)

Nick

Re: Finding what is said about a topic in other articles

Kerry Raymond
Indeed, the “Notable residents” section is one that would definitely benefit from this tool. Is it just me, or is there something actually broken with “What links here”? I try to suppress the transclusions (usually coming from navboxes), but they are still displayed whether I choose to show or hide transclusions, even though a search of the article reveals there is no other link present.

Kerry

Re: Finding what is said about a topic in other articles

Nick Wilson (Quiddity)
On Tue, Jun 13, 2017 at 6:08 PM, Kerry Raymond <[hidden email]> wrote:
> Indeed, the “Notable residents” section is one that would definitely benefit from this tool. Is it just me or is there something actually broken with “What links here?”. I try to suppress the transclusions (usually coming from navboxes) but they are still displayed no matter whether I say to “Show/Hide Transclusions” but a search of the article reveals there is no other link present.
>
>

That existing feature works by hiding/showing where *the page itself*
is transcluded *into*. E.g.
https://en.wikipedia.org/wiki/Special:WhatLinksHere/Template:WikiFauna
vs https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/Template:WikiFauna&hidetrans=1


Making it work differently for incoming links that come from a
template is a long-standing (and complicated to implement)
feature request: https://phabricator.wikimedia.org/T14396

However, I see this comment by Izno suggests a partial (manual)
workaround, using an "insource:/\[\[FOO/" search.
https://phabricator.wikimedia.org/T14396#3246134
e.g. https://en.wikipedia.org/w/index.php?title=Special:Search&profile=all&search=insource%3A%2F\[\[Wikipedia%3AWikiGremlin%2F&fulltext=1
versus https://en.wikipedia.org/wiki/Special:WhatLinksHere/Wikipedia:WikiGremlin
(I'm not sure why Izno's example also includes the "linksto:FOO"
string, but it appears to be redundant)
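
For anyone who wants to generate such search links rather than type them, here is a small sketch. The function name buildInsourceLinkSearchUrl and its host parameter are illustrative assumptions; the query shape follows the "insource:/\[\[FOO/" workaround above. The escaping covers common regex metacharacters and the "/" delimiter, but the search engine's regex syntax may treat further characters specially, so titles with unusual punctuation may need extra care.

```javascript
// Hypothetical helper: build a Special:Search URL for the regex
// workaround insource:/\[\[TITLE/ on a given wiki.
function buildInsourceLinkSearchUrl(title, host = 'en.wikipedia.org') {
  // insource:/.../ takes a regular expression, so escape regex
  // metacharacters and the "/" that delimits the expression.
  const escaped = title.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
  const query = 'insource:/\\[\\[' + escaped + '/';
  // URLSearchParams percent-encodes the query for use in a URL.
  const params = new URLSearchParams({
    title: 'Special:Search',
    profile: 'all',
    search: query,
    fulltext: '1',
  });
  return 'https://' + host + '/w/index.php?' + params.toString();
}
```

For example, buildInsourceLinkSearchUrl('Wikipedia:WikiGremlin') produces a fully percent-encoded form of the search link shown above.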

--
Quiddity

Re: Finding what is said about a topic in other articles

Federico Leva (Nemo)
In reply to this post by Kerry Raymond
Kerry Raymond, 14/06/2017 00:45:
 > What is motivating this
 > is because I often find that  "what links here" often points to some
 > surprising articles which can reveal new insights into a topic.

Indeed. I always teach the "what links here" feature at all my wiki courses.

Kerry Raymond, 14/06/2017 03:08:
> I try to suppress the transclusions (usually coming from navboxes) but they are still displayed no matter whether I say to “Show/Hide Transclusions” but a search of the article reveals there is no other link present.

Hiding transclusions means hiding the "links" of the form {{:Bang Bang
Jump Up}}, not the links within templates. There is currently no
distinction in the database for "templated" links (they all go into
https://www.mediawiki.org/wiki/Manual:Pagelinks_table ).

I think ElasticSearch/CirrusSearch currently can tell the difference,
for ranking purposes, and could maybe expose it somewhere. Which makes
sense, because overlinking is a problem specific to some wikis (other
wikis have even forbidden the navigational templates that are so
prevalent on the English Wikipedia), and while several HTML classes have
been defined across the years (some now listed at
https://www.mediawiki.org/wiki/Manual:Interface/IDs_and_classes#Content 
), only "noprint" is standard/stable.

Nemo

Re: Finding what is said about a topic in other articles

Kerry Raymond
In reply to this post by Nick Wilson (Quiddity)
Ahh ... it's a whole new meaning of transclusion ...

I am tempted to say "well, just work with the original wikitext and don't resolve the templates", but I guess the problem here is that not all templates are equal. Links in an infobox are much more likely to be relevant to *this* article than links in a navbox, and who knows about arbitrary templates more generally. I saw a stat in passing the other day that said around 50% of Wikipedia articles have navboxes, and I confess to having added a few navboxes even in the past few days. As a reader I like them, but they are a pain for anyone using "What links here".

Indeed, just using

insource:"Chapel Hill, Queensland"

*without* the square brackets does a jolly fine job of identifying the articles that mention the article [[Chapel Hill, Queensland]] or just the topic, and provides a snippet (not a great one, but it does give some context to the link).

It works because it sees the links that are used as parameters in the infobox (whether or not they are wrapped in square brackets) but can't see the ones embedded in the definition of the navboxes. Plus you get mentions as well as links. Sweet! If one could have a filter that eliminated the "mutually linking" articles (X links to Y and Y links to X), it would be close to nailing it! Of course it works better for longer article titles unlikely to occur in other circumstances. I wouldn't bother to try it for [[Food]], but then I am looking for a tool to populate stubs, which probably eliminates "common name" articles.
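
The "mutually linking" filter could be sketched as below, assuming two lists are already available: the titles returned by the search, and the titles that article X itself links to. How those lists are obtained is left open, and the function names are illustrative only. Title normalisation here is deliberately simple (underscores to spaces, first letter upper-cased) and ignores redirects.

```javascript
// Normalise a page title for comparison: underscores become spaces and
// the first character is upper-cased. Redirects are not resolved.
function normalizeTitle(title) {
  const spaced = title.replace(/_/g, ' ').trim();
  return spaced.charAt(0).toUpperCase() + spaced.slice(1);
}

// Hypothetical filter: from the articles that mention X, drop those that
// X already links to, since they are less likely to hold new information.
function dropMutualLinks(mentioningTitles, outgoingLinksOfX) {
  const alreadyLinked = new Set(outgoingLinksOfX.map(normalizeTitle));
  return mentioningTitles.filter((t) => !alreadyLinked.has(normalizeTitle(t)));
}
```

What remains after filtering is the set of articles that mention X but that X does not yet point back to, which is the list most likely to hold content worth adding to a stub.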

Kerry

Re: Finding what is said about a topic in other articles

Chris Koerner-2
In reply to this post by Kerry Raymond
Kerry,
What an interesting idea. I created a task in Phabricator, Wikimedia's tool
for tracking bugs and feature requests. I'll bug some of the search folks to
see if they have any suggestions. In the task I shared a very clunky way of
doing this that is not 100% what you're looking for, but something! :)

https://phabricator.wikimedia.org/T167899

Yours,
Chris Koerner
Community Liaison - Discovery
Wikimedia Foundation

Re: Finding what is said about a topic in other articles

Eran Rosenthal
I wrote a related script with a very similar purpose:
Sometimes users write new articles and forget to add links to the new
article from related articles (in the worst case this results in orphan
articles).
The script aims to find related articles (e.g. using a search for the current
title), and then suggests the specific context where the article is
mentioned, so the user can select whether to add a link.

In [[Special:MyPage/common.js]] you can add the following snippet:

mw.loader.load('//he.wikipedia.org/w/index.php?title=User:ערן/quickLinker.js&action=raw&ctype=text/javascript&smaxage=21600&maxage=86400');
// [[User:ערן/quickLinker.js]]

Then go to an article for which you would like to find related articles,
and press "Add links" in the sidebar (under tools).
This opens an OOUI dialog showing an article where the current page is
mentioned, along with the specific context and a suggested link.
If you find a suggested link suitable, press "save", or press "skip" to
continue to next relevant page.

Re: Finding what is said about a topic in other articles

Kerry Raymond
Thanks, Eran!

It seems to work very well indeed. The only thing standing between me and total happiness is the SAVE button. The tool keeps telling me I have successfully saved things, but I can't work out where they were saved to :-) Any clues you'd like to offer?

Kerry
