Sampling new editors in English Wikipedia

Re: Sampling new editors in English Wikipedia

Pine W
Hi Haifeng,

Some users state on their user pages that an account is an alternate
account. However, not everyone follows this practice, and those who
do aren't required to do so in a uniform way.

Alternate accounts which are not labeled as such, and which are used for
illegitimate purposes such as double voting, are an ongoing problem. You
might be interested in the English Wikipedia page
https://en.wikipedia.org/wiki/Wikipedia:Sock_puppetry.

Alternate accounts can also be used for legitimate purposes, such as people
who have one account for their professional or academic activities and
another account for their personal use.

Good luck with your project.

Pine
( https://meta.wikimedia.org/wiki/User:Pine )


On Thu, Mar 14, 2019 at 1:30 PM Haifeng Zhang <[hidden email]>
wrote:

> Stuart,
>
> I'm building an agent-based simulation of Wikipedia collaboration.
>
> I would like my model to be empirically grounded, so I need to collect
> data for new editors.
>
> Alternative accounts can be an issue. Is there a way to
> identify editors who have multiple accounts?
>
>
> Thanks,
>
> Haifeng Zhang
>
_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Sampling new editors in English Wikipedia

Kerry Raymond
Apart from the legitimate alternate accounts and the illegitimate sockpuppet accounts, there are other ways that alternate accounts exist.

Occasional contributors often forget their username and/or password. Password recovery isn't possible unless you have provided an email address (it's optional at sign-up, but you can add it later). So what such people then do is just create a new user account (I'm not sure there is anything else they can do). I see this sort of behaviour a lot at events. The other variation of the problem is that they did provide an email address, but it is one not easily accessible to them at the event (e.g. a librarian who signed up with a work email address that cannot be accessed outside of the organisation).

The other group of people with multiple accounts are those who edit anonymously as serial IPs. The same person can use a number of IP addresses over time. Often you don't realise it is the same person unless you see a lot of their work and can see a pattern in it. For example, at the moment I notice on my watchlist a person using a series of IP addresses who is changing a common section of a Queensland place article to be a subsection of another. This person appears to acquire a new IP address every week or so, but the pattern of editing makes it obvious it's the same person behind it. Whether or not an IP address can be considered "an account" depends on your purposes. A single IP address can also be used by multiple people (e.g. coming through a shared organisational network in a library or school). Some people claim that many new users make their first edits anonymously, so if you are serious about studying "new contributors", then maybe you have to look at anonymous editing. Even regular contributors may sometimes choose to edit anonymously, e.g. when they are in an insecure IT environment and reluctant to use their username/password in that situation (particularly people with administrator or other significant access rights).
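As a rough illustration of treating IP editors separately from registered accounts, here is a minimal Python sketch. It assumes anonymous edits are recorded under the editor's IP address as the username, which was the case on English Wikipedia at the time of this thread; the function names are hypothetical and not part of any Wikipedia tool.

```python
import ipaddress


def is_ip_editor(username: str) -> bool:
    """Return True if the username parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(username.strip())
        return True
    except ValueError:
        return False


def split_editors(usernames):
    """Split usernames into (registered, anonymous) lists, preserving order."""
    registered, anonymous = [], []
    for name in usernames:
        (anonymous if is_ip_editor(name) else registered).append(name)
    return registered, anonymous
```

Note that this only separates the two groups. It cannot tell whether several IP addresses belong to one person, or whether one address is shared by many people.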

Because I do outreach, I look for new accounts that turn up on my watchlist and send them welcome messages etc. Because I also do training, I see a lot of genuinely new people in action where I can observe their edits. So when I see new accounts or IPs doing far more "sophisticated" edits than I see new users do, I tend to suspect they are not genuinely new contributors.

I think the best you can do is look for new accounts and be prepared to omit any that show signs of sophisticated editing (either in terms of what they are doing technically or what they say on Talk pages or in edit summaries). For example, no genuine new user will mention a policy (they don't know policies exist). Also, genuine new users don't tend to edit that quickly, so any rapid-fire series of successful edits is unlikely to come from a genuine new user. I think this inability to know if a new account represents a genuinely new user is an inherent limitation for your research and should be documented as such, explaining the many circumstances in which new accounts might belong to non-new users.
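The two signals described above (policy mentions and rapid-fire edits) could be sketched in Python roughly as follows. Everything here is an assumption for illustration: the regular expression only catches shortcut-style mentions such as "WP:NPOV", and the 30-second gap and three-fast-edits thresholds are made-up values that would need tuning against hand-labelled accounts.

```python
import re
from datetime import datetime, timedelta

# Hypothetical settings, not established cut-offs.
POLICY_SHORTCUT = re.compile(r"\b(?:WP|MOS):[A-Z]")
MIN_GAP = timedelta(seconds=30)
MAX_FAST_EDITS = 3


def looks_experienced(edits):
    """Flag an account whose early edits look too sophisticated for a newcomer.

    edits: list of (timestamp, edit_summary) tuples for one account.
    """
    ordered = sorted(edits, key=lambda e: e[0])
    # Signal 1: an edit summary cites a policy shortcut.
    if any(POLICY_SHORTCUT.search(summary) for _, summary in ordered):
        return True
    # Signal 2: several consecutive edits made less than MIN_GAP apart.
    fast = sum(
        1 for (t1, _), (t2, _) in zip(ordered, ordered[1:]) if t2 - t1 < MIN_GAP
    )
    return fast >= MAX_FAST_EDITS
```

A flag from this function is only a reason to inspect or exclude an account. As noted above, it cannot establish whether the person behind it is genuinely new.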

Kerry


Re: Sampling new editors in English Wikipedia

Stuart A. Yeates
In addition to Kerry's excellent examples, there are users editing
Wikipedia through Tor, the anonymity and censorship circumvention
network. These users face extra scrutiny.

cheers
stuart


--
...let us be heard from red core to black sky


Re: Sampling new editors in English Wikipedia

Adam Jenkins
In reply to this post by Haifeng Zhang
A quick and dirty solution might be to use the HostBot list from the
Teahouse at
https://en.wikipedia.org/wiki/Wikipedia:Teahouse/Hosts/Database_reports
The list is regularly refreshed, so you could pull the account names from
there over the course of a month and then randomly select your sample,
noting that it is biased towards new editors who have made more than 10
edits.

Otherwise perhaps using recent changes, but filtering for logged actions by
new users?
https://en.wikipedia.org/wiki/Special:RecentChanges?userExpLevel=newcomer&hidebots=1&hidepageedits=1&hidenewpages=1&hidecategorization=1&hideWikibase=1&limit=50&days=7&urlversion=2
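For the monthly sampling step itself, a minimal Python sketch might look like the following. It swaps in the MediaWiki API's account-creation log (`list=logevents` with `letype=newusers`) as the source of names, which is an assumption on my part rather than either of the sources suggested above; `newusers_query` and `sample_editors` are hypothetical helper names. The first function only builds the request URL, so fetching and paging through results (via the API's continuation parameters) is left out.

```python
import random
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def newusers_query(start, end, limit=500):
    """Build a logevents URL for accounts created between two ISO timestamps."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "newusers",
        "lestart": start,   # earlier bound, because ledir is "newer"
        "leend": end,
        "ledir": "newer",
        "lelimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def sample_editors(names, k=100, seed=None):
    """Draw up to k distinct account names uniformly at random."""
    unique = sorted(set(names))  # de-duplicate; sorting keeps seeded runs stable
    rng = random.Random(seed)
    return rng.sample(unique, min(k, len(unique)))
```

Usage would be along the lines of collecting every name for a month and then calling `sample_editors(names, k=100, seed=2019)`. Keep in mind that the creation log includes accounts that never edit, so a further filter on edit count is probably needed before treating the sample as "new editors".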


On Wed, 13 Mar 2019 at 04:49, Haifeng Zhang <[hidden email]> wrote:

> Hi folks,
>
> My work needs to randomly sample new editors in each month, e.g., 100
> editors per month.
>
> Do any of you have good suggestions for how to do this efficiently?
>
> I could use the dump files, but I wonder whether there are other options.
>
>
> Thanks,
>
> Haifeng Zhang

Re: Sampling new editors in English Wikipedia

Giovanni Luca Ciampaglia-4
In reply to this post by Stuart A. Yeates
Does anybody know how prevalent sockpuppets are? Has anybody tried
estimating the percentage of editors who have created at least one
additional account (legitimate or otherwise)?

Giovanni



Re: Sampling new editors in English Wikipedia

Kerry Raymond
The thing about sockpuppets is that we only know about the ones that have been detected (and some of them have been large groups of hundreds of accounts). The problem is that we don’t know about the undetected ones. I am sure many of us have had suspicions about the behaviour of certain accounts, but requesting a sockpuppet investigation requires a level of evidence above suspicious behaviour (specifically, identifying another account). New users with sophisticated editing skills who write positively about topics associated with living individuals, businesses or products often seem to me to be the kind of account likely to be doing undisclosed paid editing, and therefore almost certainly a sockpuppet of a paid PR person. But if each account writes about a different topic, it is difficult to work out what the other accounts might be in order to look for evidence of sockpuppeting.

How far underwater does the iceberg go?

Kerry


Re: Sampling new editors in English Wikipedia

Giovanni Luca Ciampaglia-4
Thanks Kerry.

You raise a valid point: it makes sense that focusing only on detected
cases may not be representative or indicative of how widespread the
behavior is.

Has anybody ever thought about running a survey with a representative sample
of registered editors? Given the nature of the behavior, I can imagine
responses would still be affected by social desirability bias, but it
could at least be a starting point for a slightly less biased estimate.

Giovanni Luca Ciampaglia ∙ glciampaglia.com
Assistant Professor
Computer Science and Engineering <https://www.usf.edu/engineering/cse/> ∙ University of South Florida <https://www.usf.edu/>
New email address: [hidden email]
Hoaxy Botometer: Check out our new tool: https://hoaxy.iuni.iu.edu/

