[Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Pete Forsyth-2
(Note: I'm creating a new thread which references several old ones; in the
most recent, "Profile of Magnus Manske," the conversation has drifted back
to Wikidata, so that subject line is no longer applicable.)

Andreas Kolbe has argued in multiple threads that Wikidata is fundamentally
problematic, on the basis that it does not require citations. (Please
correct me if I am mistaken about this core premise.) I've found these
threads illuminating, and appreciate much of what has been said by all
parties.

However, that core premise is problematic. If the possibility of people
publishing uncited information were fundamentally problematic, here are
several platforms that we would have to consider ethically problematic at
the core:
* Wikipedia (which for many years had very loose standards around citations)
* Wikipediocracy (of which Andreas is a founding member) and all Internet
forums
* All blogs
* YouTube
* Facebook
* The Internet itself
* The printing press

Every one of the platforms listed above created opportunities for people --
even anonymously -- to publish information without a citation. If we are to
fault Wikidata on this basis, it would be wrong not to apply the same
standard to other platforms.

I'm addressing this now, because I think it is becoming problematic to
paint Wikidata as a flawed project with a broad brush. Wikidata is an
experiment, and it will surely lead to flawed information in some
instances. But I think it would be a big problem to draw the conclusion
that Wikidata is problematic overall.

That said, it is becoming ever more clear that the Wikimedia Foundation has
developed big plans that involve Wikidata; and those big plans are not open
to scrutiny.

THAT, I believe, is a problem.

Wikidata is not a problem; but it is something that could be leveraged in
problematic ways (and/or highly beneficial ways).

I feel it is very important that we start looking at these issues from that
perspective.

-Pete
[[User:Peteforsyth]]
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
New messages to: [hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:[hidden email]?subject=unsubscribe>

Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Gerard Meijssen-3
Hoi,
Thanks for the FUD. You mention that the Wikimedia Foundation has plans.
Really? There are plans that have been published, and there has been time for
you to consider them. They are the ones that the WMF has published; they are
the only ones that exist as far as I know, and I follow Wikidata closely.
So where are your sources Pete?

When other plans exist, the WMF is not the party developing them. For
instance: I am arguing for the use of Wikidata in links and redlinks. I
have published about it and I welcome comments. I asked you personally and
you were not even interested.

Why should anyone be interested now?
Thanks,
      GerardM


Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Pete Forsyth-2
On Jan 26, 2016 3:22 AM, "Gerard Meijssen" <[hidden email]>
wrote:
> Thanks for the FUD.

"Fear, Uncertainty, and Doubt" are not the precise words I would choose,
but they fairly adequately describe how I feel about the WMF these days.

Of course, as a bit of jargon, FUD typically implies that somebody is
trying to use those emotions in a manipulative way.

All I can say to that is....nope, not my intention.

> So where are your sources Pete?
First, the main point of my email was to challenge what I consider a poor
argument against Wikidata. That point is, IMO, the important one.

However, you're right: I did talk about my beliefs. I do believe there is a
problem to be considered; and I don't think I need to offer proof for what
my own beliefs are.

But, I agree, some substantiation is worthwhile. I consider the following
to be the most interesting published documents relating to these issues:
https://commons.wikimedia.org/wiki/File:Discovery_Year_0-1-2.pdf
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2016-01-13/Op-ed

It is very clear that the WMF has big plans, and that we have only seen
parts of those plans. What those plans are, and whether they are good ones,
remains to be explored; but the opaqueness of the plans is itself a
problem. That is my point.

> When other plans exist, the WMF is not the party developing them. For
> instance: I am arguing for the use of Wikidata in links and redlinks. I
> have published about it and I welcome comments. I asked you personally and
> you were not even interested.

OK, this part is getting silly. You presented an idea to me in private that
is obviously a good idea. But, as I explained to you, your single-minded
interest in me expressing an opinion on it gave me pause. I explained to
you that you seemed more interested in setting me up to be a part of your
political point, than in actually having a discussion. So I declined to
discuss your idea.

This message seems to prove that my instincts were correct.

Pete


Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Magnus Manske-2
In reply to this post by Pete Forsyth-2
On Tue, Jan 26, 2016 at 7:33 AM Pete Forsyth <[hidden email]> wrote:

> (Note: I'm creating a new thread which references several old ones; in the
> most recent, "Profile of Magnus Manske," the conversation has drifted back
> to Wikidata, so that subject line is no longer applicable.)
>
> Andreas Kolbe has argued in multiple threads that Wikidata is fundamentally
> problematic, on the basis that it does not require citations. (Please
> correct me if I am mistaken about this core premise.)


Every statement on Wikidata /should/ be referenced, unless the statement
itself points to a reference (e.g. VIAF, images). However, at the moment,
this is not a requirement, as Wikidata is still in a steep growth phase.
Over the last few years, many statements were added by bots, which can
process e.g. Wikipedia, but would be hard pressed to find the original
reference for a statement.

Humans, bots, and tools increasingly add references to Wikidata statements;
I wouldn't be surprised if Wikidata starts requiring references within the
next few years on all (new) statements.


> I've found these
> threads illuminating, and appreciate much of what has been said by all
> parties.
>
> However, that core premise is problematic. If the possibility of people
> publishing uncited information were fundamentally problematic, here are
> several platforms that we would have to consider ethically problematic at
> the core:
> * Wikipedia (which for many years had very loose standards around
> citations)
> * Wikipediocracy (of which Andreas is a founding member) and all Internet
> forums
> * All blogs
> * YouTube
> * Facebook
> * The Internet itself
> * The printing press
>
> Every one of the platforms listed above created opportunities for people --
> even anonymously -- to publish information without a citation. If we are to
> fault Wikidata on this basis, it would be wrong not to apply the same
> standard to other platforms.
>
> I'm addressing this now, because I think it is becoming problematic to
> paint Wikidata as a flawed project with a broad brush. Wikidata is an
> experiment, and it will surely lead to flawed information in some
> instances. But I think it would be a big problem to draw the conclusion
> that Wikidata is problematic overall.
>
> That said, it is becoming ever more clear that the Wikimedia Foundation has
> developed big plans that involve Wikidata; and those big plans are not open
> to scrutiny.
>
> THAT, I believe, is a problem.
>

Well, I sure hope WMF has big plans for Wikidata! But do you know of any
such plans that don't revolve around the usual suspects, such as
importing/linking to existing datasets, or re-using Wikidata in
third-party sites and products?
For example, a "secret" plan along the lines of "company X wants to use
Wikidata, but they don't want to announce this publicly yet" would be
perfectly fine by me. Wikidata is CC-0; technically, no one needs to even
ask permission or link back.
I simply do not see any sinister, nefarious plan the WMF /could/ have for
Wikidata, given their long established policy of staying away from editing
contents.

If you have even minimum indications of "evil" WMF plans for Wikidata,
please share them! Saying "I know nothing about their plans, therefore they
must be evil" doesn't really cut it.

Cheers,
Magnus




Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Pete Forsyth-2
On Jan 26, 2016 5:24 AM, "Magnus Manske" <[hidden email]>
wrote:
>
> On Tue, Jan 26, 2016 at 7:33 AM Pete Forsyth <[hidden email]> wrote:
>

<snipping most of Magnus' message, which I appreciate and agree with>

> If you have even minimum indications of "evil" WMF plans for Wikidata,
> please share them! Saying "I know nothing about their plans, therefore they
> must be evil" doesn't really cut it.

Indeed, if that were what I was saying...that would be nuts!

I do not have an opinion on the quality (or moral value, for that matter!)
of whatever plans the senior leadership of WMF has around structured data,
search, discovery, knowledge engines, etc.

But I do find the secretive approach to planning problematic.

The plans may very well turn out to be good ones (as I said in my original
message). But that will not justify the level of secrecy we are seeing
lately.

Pete
[[User:Peteforsyth]]


Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Gerard Meijssen-3
Hoi,
You write about your fear, uncertainty and doubt .. Why have us waste time
on it? Do something useful.
Thanks,
      GerardM


Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Andrea Zanni-2
On Tue, Jan 26, 2016 at 12:56 PM, Gerard Meijssen <[hidden email]> wrote:

> You write about your fear, uncertainty and doubt .. Why have us waste time
> on it? Do something useful.
> Thanks,
>


I, for one, think that the mail Pete sent (both in content and tone) is
perfectly fine and helpful.
I don't know if I share his concerns about WMF plans for Wikidata, but I
fully agree with his position regarding Andreas' criticism of Wikidata. A
distinction was needed.
All in all, I think this thread is useful. M2c.

Aubrey

Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Andreas Kolbe-2
In reply to this post by Pete Forsyth-2
Pete,


On Tue, Jan 26, 2016 at 7:33 AM, Pete Forsyth <[hidden email]> wrote:

> Andreas Kolbe has argued in multiple threads that Wikidata is fundamentally
> problematic, on the basis that it does not require citations. (Please
> correct me if I am mistaken about this core premise.) I've found these
> threads illuminating, and appreciate much of what has been said by all
> parties.
>
> However, that core premise is problematic. If the possibility of people
> publishing uncited information were fundamentally problematic, here are
> several platforms that we would have to consider ethically problematic at
> the core:
> * Wikipedia (which for many years had very loose standards around
> citations)
> * Wikipediocracy (of which Andreas is a founding member) and all Internet
> forums
> * All blogs
> * YouTube
> * Facebook
> * The Internet itself
> * The printing press
>
> Every one of the platforms listed above created opportunities for people --
> even anonymously -- to publish information without a citation. If we are to
> fault Wikidata on this basis, it would be wrong not to apply the same
> standard to other platforms.
>


In many countries, people have a right to free speech: to voice opinions,
engage in speculation, and so on. I feel quite certain that we agree that
the right to free speech is a good thing to have.

But Wikipedia and Wikidata are not experiments in free speech. They are
designed to be reference works.

Wikipedia, in its early days, was faulted by Wikipedians – rightly so – for
publishing material that could not be traced to professionally published
sources, including much material that was plain wrong (crank theories
etc.). That was considered unacceptable for a reference work; hence the
requirement for references, the no-original-research rule, and all the rest
of it.



> I'm addressing this now, because I think it is becoming problematic to
> paint Wikidata as a flawed project with a broad brush. Wikidata is an
> experiment, and it will surely lead to flawed information in some
> instances. But I think it would be a big problem to draw the conclusion
> that Wikidata is problematic overall.
>



Perhaps we can agree that reliable sources are a useful part of a
crowdsourced reference project. The more citations Wikidata contains, the
more useful it will be. Citations make data provenance transparent to the
end user. They enable end users to verify, judge and correct the
information they're given, if they so desire.

Data provenance is all the more important if Wikidata content comes to be
spread far and wide, as seems possible, given major search engines'
involvement.

In my opinion, Wikidata's CC-0 licence undermines that, because it allows
re-users to cut the chain between the end user and the data's original
source.



> That said, it is becoming ever more clear that the Wikimedia Foundation has
> developed big plans that involve Wikidata; and those big plans are not open
> to scrutiny.
>
> THAT, I believe, is a problem.
>



I agree with you that there appears to be an undue amount of secrecy.

Jimmy Wales said[1] over two weeks ago, in response to questions about the
Knight Foundation's Knowledge Engine grant, in the context of ousted board
member James Heilman's complaints about a lack of transparency,

--------------------

"What sort of details do you want? I'll have to talk to others to make sure
there are no contractural reasons not to do so, but in my opinion the grant
letter should be published on meta. The Knight Grant is a red herring here,
so it would be best to clear the air around that completely as soon as
possible."

--------------------

That sounded reassuring. But to date neither the Knight Foundation grant
letter nor the Foundation's grant application has been published on Meta.

The fact that nothing has happened following Jimmy Wales' statement has
been discussed in the Wikipedia Weekly Facebook group. As you probably
know, Jimmy Wales said there yesterday,

--------------------

"Assurances"? Please don't make things up out of thin air. I've expressed
my opinion, but contrary to some people's fantasies, me expressing an
opinion doesn't have the force of law.

--------------------

In the same discussion, a WMF staffer said last week that WMF staff would
be delighted to publish that documentation, but haven't been given leave to
do so.

That sounds to me like there is a continued intent to withhold the
documentation of this restricted grant from public view. I believe that is
a mistake.

If there is nothing objectionable in it, publication now will stop the
rumour mill. If there is something objectionable in it, then it is better
for that to come to light now, rather than six months or a year down the
line.




> Wikidata is not a problem; but it is something that could be leveraged in
> problematic ways (and/or highly beneficial ways).
>
> I feel it is very important that we start looking at these issues from that
> perspective.
>


I agree. Thank you for raising the issue.

Andreas

[1]
https://en.wikipedia.org/w/index.php?title=User_talk%3AJimbo_Wales&diff=698861097&oldid=698860874

Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Liam Wyatt
In reply to this post by Magnus Manske-2
On 26 January 2016 at 11:24, Magnus Manske <[hidden email]>
wrote:

> On Tue, Jan 26, 2016 at 7:33 AM Pete Forsyth <[hidden email]>
> wrote:
>
> > (Note: I'm creating a new thread which references several old ones; in the
> > most recent, "Profile of Magnus Manske," the conversation has drifted back
> > to Wikidata, so that subject line is no longer applicable.)
> >
> > Andreas Kolbe has argued in multiple threads that Wikidata is fundamentally
> > problematic, on the basis that it does not require citations. (Please
> > correct me if I am mistaken about this core premise.)
>
>
> Every statement on Wikidata /should/ be referenced, unless the statement
> itself points to a reference (e.g. VIAF, images). However, at the moment,
> this is not a requirement, as Wikidata is still in a steep growth phase.
> Over the last few years, many statements were added by bots, which can
> process e.g. Wikipedia, but would be hard pressed to find the original
> reference for a statement.


To extend Magnus' point...
This is also the case on Wikipedia. Every Wikipedia sentence /should/ be
verified to a reliable source, and those without footnotes can be removed.
But, it is not a /requirement/ that every statement be verified. In short -
'verifiable not verified' is the minimum standard for inclusion of a
sentence in Wikipedia. The ratio of footnotes-to-sentences in Wikipedia
articles is on average probably much lower than the ratio of
references-to-statements in Wikidata. It's just that we have more easily
available /quantitative/ statistics for Wikidata than we do for Wikipedia,
which makes it easy for Wikidata-critics to point to the number of
un-referenced statements in Wikidata as a simple measure of quality, even
though many of them DO meet the "verifiable, even if not yet verified"
minimum standard that we accept for "stubs" on Wikipedia.

For example: even in a Featured Article Wikipedia biography, I've never seen
a footnote /specifically/ for the fact that the subject is "a human". That
reference is implied by other footnotes - citing the birthdate or
occupation, for example. By comparison, in Wikidata, some people seem to
feel that statements like "instance of -> human" or "gender -> male" need
to be given a specific reference before they can be considered reliable.
This is even when there are other statements in the same Wikidata item that
reference biography-authority control numbers (e.g. VIAF).

Yes, ideally, every statement could be given a reference in Wikidata, but
ideally so should every sentence in Wikipedia. In reality we do accept
"stub" Wikipedia articles that have 5 sentences and 1 Reliable Source
footnote. Furthermore, we also have Wikidata properties that are,
in effect, "self verifying": like the "VIAF identifier" property - which
links to that authority control database, or the "image" property - which
links directly to a file on Commons. So, simply counting the number of
statements vs. the number of references in those statements on Wikidata and
concluding that Wikidata is therefore inherently unreliable is both
simplistic and quite misleading.

-Liam

wittylama.com
Peace, love & metadata

Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

David Goodman-2
People keep mentioning VIAF in this context. VIAF is a federated service,
using the content of its various repositories--and is therefore no more
accurate than they are. For example, a major component in VIAF is the
Library of Congress Authority File. That file has always used author or
publisher statements as the evidence for birth dates without further
verification; in recent years, it has also been using information from WP
articles. (I suppose that's an improvement--we at least occasionally look
beyond what the person says about himself.)




--
David Goodman

DGG at the enWP
http://en.wikipedia.org/wiki/User:DGG
http://en.wikipedia.org/wiki/User_talk:DGG

Re: [Wikimedia-l] Ethics of launching Wikidata, vs. ethics of WMF plans for Wikidata

Cristian Consonni
In reply to this post by Andreas Kolbe-2
Hi Andreas,

2016-01-26 13:17 GMT+01:00 Andreas Kolbe <[hidden email]>:
> In my opinion, Wikidata's CC-0 licence undermines that, because it allows
> re-users to cut the chain between the end user and the data's original
> source.

If I understand correctly, you are concerned about verifiability of information
in Wikidata. What is completely unclear to me is why you are mixing
verifiability and copyright or, in other words, why you think that you
can solve the problem of verifiability with copyright.

TL;DR
Licenses are for copyright, not verifiability. Using a different
license will not solve your verifiability problems.

# Is CC-BY for Wikidata a good idea?

CC-0 or CC-BY (or any license) are based on copyright law. Broadly
speaking (but IANAL), "facts" are not copyrightable because they lack
originality which is one of the conditions required by copyright law.
In this sense, no single statement that you find on Wikidata (e.g.
Barack Obama was born on 4 August 1961) is copyrightable.

For collections of facts (i.e. datasets) the situation is much less
clear, and it is not easy to decide whether collections of data/facts
are copyrightable at all. Under the doctrine of the "Sweat of the
Brow" [1a][1b] the originality requirement is relaxed, and the fact
that "skill and labour" was put into creating a collection of data is
sufficient to give rise to copyright. This view has recently been
rejected in some court cases by the European Court of Justice (see
Football Dataco & others v. Yahoo! UK [2a][2b]), which ruled that it
is not sufficient to say that putting together a collection of facts
required some sort of effort (even quantifiable in monetary terms) to
give rise to copyright. In Football Dataco v. Yahoo the dataset
consisted of football fixture lists, but the same applies also to
other contexts such as the digitization of (public domain)
photographs or OCR of (public domain) texts.

As a Wikimedian, I am more than eager to support the idea that scanned
versions of PD photos and texts should remain in the public domain. I
do not want to invoke this kind of principle in order to claim
copyright on the Wikidata dataset just so that the CC-BY license can
be applied. This is also the position of other projects like Project
Gutenberg [3].

On the other hand, in many jurisdictions the moral rights [4]
associated with any work, e.g., among others, the right to be
attributed as the author of a work, are perpetual and cannot be
transferred or waived. In fact the CC-0 legal code says: "A Work made
available under CC0 may be protected by copyright and related or
neighboring rights includ[ing]: moral rights retained by the original
author(s) and/or performer(s); database rights; [...]".

So the question of what would justify releasing Wikidata under CC-BY
remains.

# Licenses and verifiability

Besides the problem above, even if we could use CC-BY and make use of
"Sui Generis Database Rights" (see section 4 of CC-BY legal code [5])
I am not sure your verifiability problem would be solved. CC-BY
requires the reuser to provide "[...] attribution, in any reasonable
manner requested by the Licensor".

This means that I could build a page replicating (part of) the
Wikidata data, maybe mix it with other sources, and then add a line at
the bottom of the page saying "Data from Wikidata (c) Wikidata
contributors CC-BY (+link to the item and item history for author
names); source A; source B; ...".

This would completely satisfy the attribution requirement but do
little to solve the verifiability problem because, basically, you
cannot use copyright to force anybody to use a particular design for
their website and/or database and maintain the "verifiability chain"
for each statement.

To conclude, the verifiability problem is very important for all the
projects, but I am very skeptical of the idea that copyright licenses
are the means to solve it.

C

[1a] https://en.wikipedia.org/wiki/Sweat_of_the_brow
[1b] https://meta.wikimedia.org/wiki/Wikilegal/Sweat_of_the_Brow
[2a] http://curia.europa.eu/juris/liste.jsf?&num=C-604/10
[2b] http://kluwercopyrightblog.com/2012/03/01/football-dataco-skill-and-labour-is-dead/
[3] https://www.gutenberg.org/wiki/Gutenberg:No_Sweat_of_the_Brow_Copyright
[4] https://en.wikipedia.org/wiki/Moral_rights
[5] https://creativecommons.org/licenses/by/4.0/legalcode
