How can I get data to map our linguistic interconnectedness?

How can I get data to map our linguistic interconnectedness?

Alec Conroy-2
The recent elections showed us that language issues and translation
are something we have to take very seriously from now on.  As a first
step towards improving communication, it seems like we should get an
idea of which users speak which languages.

We could directly ask them to tell us, but upon reflection, the
information is already hidden in our database.  A multilingual user is
one that actively edits two projects of different languages.

In devising a comprehensive translation strategy, we need to know how
interconnected any two given projects are.   We also need to know how
connected any given project is to English, since it's our working
language.

We need to pay special attention to languages that are very 'distant'
from English-- distant in the sense of having few members who are
fluent in both English and the language in question.

Could someone aid me in getting this data, or explaining why I don't
need it or why we already have it, etc?

Specifically, I'm looking for:
#   For each non-english-language project, how many of their active
users are ALSO active on an english-language project? (the answer
should be a single whole number for each project)
#   For any two projects, how many users are there who are active on
both? (answer is a square matrix, roughly 750x750 )
#   For any two languages, how many users appear to speak both
languages? (answer is a square matrix, roughly 750x750)
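
A minimal sketch of how the first two counts above could be derived once a
per-user list of active projects exists. The `active_on` mapping, the
usernames, and the use of `enwiki` as the only English-language project are
all assumptions made for illustration; nothing here queries the real database.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: user -> set of projects on which that user is "active".
active_on = {
    "UserA": {"enwiki", "dewiki"},
    "UserB": {"dewiki", "frwiki"},
    "UserC": {"enwiki", "dewiki", "frwiki"},
    "UserD": {"jawiki"},
}

ENGLISH = "enwiki"  # assumed stand-in for "an English-language project"

# Count 1: per project, how many active users are also active on English.
also_english = Counter()
# Count 2: per unordered pair of projects, how many users are active on both.
pair_counts = Counter()

for user, projects in active_on.items():
    if ENGLISH in projects:
        for project in projects - {ENGLISH}:
            also_english[project] += 1
    # Sorting gives each unordered pair a single canonical key.
    for a, b in combinations(sorted(projects), 2):
        pair_counts[(a, b)] += 1

print(dict(also_english))
print(dict(pair_counts))
```

The third count (by language rather than by project) would need an extra
project-to-language lookup, which is left out of this sketch.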

Does anyone know how to pull this out of the database?    It's an
important question for us to recruit translators and really just
assess "where we are" in terms of inter-project language capabilities.

Alec

_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: How can I get data to map our linguistic interconnectedness?

Aryeh Gregor
On Wed, Jun 15, 2011 at 8:46 AM, Alec Conroy <[hidden email]> wrote:
> We could directly ask them to tell us, but upon reflection, the
> information is already hidden in our database.  A multilingual user is
> one that actively edits two projects of different languages.

That doesn't follow.  Perhaps someone speaks a language, but doesn't
edit the corresponding wiki.  For instance, I know a decent amount of
Hebrew, although I wouldn't call myself fluent in Modern Hebrew.  But
I'm a native English speaker, and English Wikipedia articles are
almost always better than the corresponding Hebrew ones (often even on
Judaism-related topics).  So I have no reason to read the Hebrew
Wikipedia, when it takes more effort for me and the content isn't
usually as good.  Likewise, some people edit exclusively or almost
exclusively on multilingual projects like Commons.

On the other hand, people might edit on projects in languages they
don't understand.  For instance, they might be running scripts that
automatically fix interwikis or such.  This is less likely, though,
once you exclude bot accounts.

If you want this info, toolserver queries are the right way to do it.
It should be pretty easy to pull this kind of info out of the revision
or recentchanges tables, although it would require reading a lot of
data.  The simplest way would be to get a list of usernames for each
wiki that have edited in the last X days, then use a script to reverse
the lists so that you get a list of languages for each user.  You'd
probably want to only include unified accounts here.  (How many
accounts still aren't unified?)
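
The "reverse the lists" step described above could look roughly like this.
The wiki names, usernames, and the `known_bots` set are hypothetical; real
input would come from the toolserver queries.

```python
from collections import defaultdict

# Hypothetical per-wiki lists of usernames that edited in the last X days.
recent_editors = {
    "enwiki": ["Alice", "Bob", "ExampleBot"],
    "hewiki": ["Alice", "ExampleBot"],
    "commonswiki": ["Carol"],
}
known_bots = {"ExampleBot"}  # assumed to be derived from each wiki's bot flag


def wikis_per_user(editors_by_wiki, exclude=frozenset()):
    """Reverse wiki -> [users] into user -> sorted list of wikis."""
    result = defaultdict(set)
    for wiki, users in editors_by_wiki.items():
        for user in users:
            if user not in exclude:
                result[user].add(wiki)
    return {user: sorted(wikis) for user, wikis in result.items()}


by_user = wikis_per_user(recent_editors, exclude=known_bots)
# Keep only users who show up on more than one wiki.
multi = {user: wikis for user, wikis in by_user.items() if len(wikis) > 1}
print(multi)
```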

Re: How can I get data to map our linguistic interconnectedness?

Alec Conroy-2
Hi Aryeh, thanks for the fast reply.
Yes, this will definitely underestimate linguistic capabilities of
some users, and overestimate the linguistic capabilities of others---
it's a rough measure at best.

But is there another way to try to gauge how "easily" two languages
should be able to communicate with each other?
The best way I can think of is looking for editing patterns that
suggest multilingual skills.    Even if this isn't a direct measure of
language, it's at least a measure of "inter-wiki interaction", which
is a good measure to have.

The important point of doing this would be:
1) to identify those users with unique language skills and recruit them
2) to identify projects and languages that are 'most disconnected'
from the English hub, so we can make them less disconnected.

Is there an easy way to run this:

For each of the 86,000 'active users':
    Store a list for their edit counts on each project they've edited

That's actually a fairly small dataset, and it would get us all the
data we want.   I've been a developer before, but never here.   Any
idea how I go about getting that info?

(global accounts only is fine, usernames not needed at this point if
we have privacy concerns)

Alec




Re: How can I get data to map our linguistic interconnectedness?

Platonides
In reply to this post by Alec Conroy-2
Alec Conroy wrote:
 > The recent elections showed us that language issues and translation
 > are something we have to take very seriously from now on.  As a first
 > step towards improving communication, it seems like we should get an
 > idea of which users speak which languages?
 >
 > We could directly ask them to tell us, but upon reflection, the
 > information is already hidden in our database.  A multilingual user is
 > one that actively edits two projects of different languages.

Many users have already told us, by using babel templates. Those also
indicate how much confidence they have in each language (native level,
basic skills...).


 > In devising a comprehensive translation strategy, we need to know how
 > interconnected any two given projects are.   We also need to know how
 > connected any given project is to English, since it's our working
 > language.

There's also the motivation factor. I am not much of a translator,
although I have fixed translation errors that I ran into as an ordinary
user and that had been sitting there for days.
From what I have seen in the past, many translations aren't done by
skilled people but by whoever was motivated enough to translate, and
some of them are at an autotranslation-like level.
However, as the people running the event obviously don't know every
language, they have to rely on the few translating users, and bad texts
pass as 'translated'.


 > We need to pay special attention to languages that are very 'distant'
 > from English-- distant in the sense of having few members who are
 > fluent in both English and the language in question.
 >
 > Could someone aid me in getting this data, or explaining why I don't
 > need it or why we already have it, etc?
 >
 > Specifically, I'm looking for:
 > #   For each non-english-language project, how many of their active
 > users are ALSO active on an english-language project? (the answer
 > should be a single whole number for each project)

First point: define being active. That should be something like 'more
than X non-minor edits in the last Y weeks.'
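
One possible reading of that definition as a predicate, with X and Y left as
parameters. The `(timestamp, is_minor)` tuple layout is an assumption for the
sketch, not the actual revision table schema.

```python
from datetime import datetime, timedelta


def is_active(edits, now, min_edits, weeks):
    """Return True if the user made more than `min_edits` non-minor edits
    in the last `weeks` weeks.

    edits: list of (timestamp, is_minor) tuples for one user on one wiki.
    """
    cutoff = now - timedelta(weeks=weeks)
    recent_major = sum(
        1 for ts, is_minor in edits if ts >= cutoff and not is_minor
    )
    return recent_major > min_edits


now = datetime(2011, 6, 15)
edits = [
    (datetime(2011, 6, 10), False),
    (datetime(2011, 6, 1), False),
    (datetime(2011, 6, 2), True),   # minor edit, not counted
    (datetime(2011, 3, 1), False),  # older than the window, not counted
]

print(is_active(edits, now, min_edits=1, weeks=4))
print(is_active(edits, now, min_edits=2, weeks=4))
```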

I see a problem in that you are treating it as a symmetric relationship,
while I don't think it should be. I could be very skilled at translating
something into my mother tongue, but inept at translating in the
opposite direction, especially when translating between similar
languages, where a non-speaker can easily grasp the meaning.

Also, someone who routinely translates articles from enwiki to xzwiki
would have the exact profile you want to discover, but could be skipped
for not having enough edits on enwiki.

 > #   For any two projects, how many users are there who are active on
 > both? (answer is a square matrix, roughly 750x750 )
 > #   For any two languages, how many users appear to speak both
 > languages? (answer is a square matrix, roughly 750x750)

I think the answer would actually be three-dimensional, since for each
cell you would have a list of people, the number being just a summary.
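
A rough sketch combining the two points above: cells are keyed by an ordered
pair so the relationship need not be symmetric, and each cell holds the list
of users, with the count derived from it. Treating a user's most-edited wiki
as their home wiki is purely an assumption here, as are the sample names and
counts.

```python
from collections import defaultdict

# Hypothetical input: user -> {wiki: edit count}.
edit_counts = {
    "UserA": {"eswiki": 4000, "enwiki": 30},
    "UserB": {"enwiki": 9000, "eswiki": 12},
    "UserC": {"eswiki": 700, "enwiki": 45, "cawiki": 60},
}

# cells[(home, other)] -> users whose home wiki is `home` and who also edit `other`.
cells = defaultdict(list)
for user, counts in edit_counts.items():
    home = max(counts, key=counts.get)  # assumption: most-edited wiki is home
    for wiki in counts:
        if wiki != home:
            cells[(home, wiki)].append(user)

# The number in each matrix cell is just a summary of the list behind it.
summary = {pair: len(users) for pair, users in cells.items()}
print(summary)
```

With this layout `("eswiki", "enwiki")` and `("enwiki", "eswiki")` are
separate cells, so the two directions can differ.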


 > Does anyone know how to pull this out of the database?    It's an
 > important question for us to recruit translators and really just
 > assess "where we are" in terms of inter-project language capabilities.
 >
 > Alec

I think I can build you something if you give me appropriate values for
the above definition.

Cheers



Re: How can I get data to map our linguistic interconnectedness?

Platonides
In reply to this post by Alec Conroy-2
Alec Conroy wrote:

> Is there an easy way to run this:
>
> For each of the 86,000 'active users':
>      Store a list for their edit counts on each project they've edited
>
> That's actually a fairly small dataset, and it would get us all the
> data we want.   I've been a developer before, but never here.   Any
> idea how I go about getting that info?
>
> (global accounts only is fine, usernames not needed at this point if
> we have privacy concerns)
>
> Alec

Yes, there is. It's not efficient, but it should be no problem to generate.


Re: How can I get data to map our linguistic interconnectedness?

Niklas Laxström
In reply to this post by Alec Conroy-2
On 15 June 2011 17:34, Alec Conroy <[hidden email]> wrote:
> The important point of doing this would be:
> 1) to identify those users with unique language skills and recruit them
Recruit them to do what?
> 2) to identify projects and languages that are 'most disconnected'
> from the English hub, so we can make them less disconnected.
Can we make them less disconnected? How?

  -Niklas
--
Niklas Laxström

Re: How can I get data to map our linguistic interconnectedness?

Thomas Morton
There is a lot of cross-wiki collaboration that can be done (whilst
supporting the idea of wiki independence) and should be
encouraged: Foundation work, cross-wiki translations of material, etc. Alec
is largely talking about the board elections though, which were Anglo-centric
and could have benefited from extra translation work and contacts on many
more language wikis to promote the election in line with "local customs".

I think the idea at the root of this thread is a good one; it's not a
perfect metric by any stretch of the imagination - but it could highlight
wikis that have little cross-wiki collaboration.

I'd be interested to see activity intersection between the various wikis
and Meta (and other organisational wikis); to see what portion of people
are also active in the foundation.

This could highlight areas where wikis suffer from under-representation in
areas of the foundation and give us something to target with "outreach" work
etc.

Tom

Re: How can I get data to map our linguistic interconnectedness?

Alec Conroy-2
In reply to this post by Platonides
> I think I can build you something if you give me appropriate values for
> the above definition.
>
> Cheers

Excellent-- so striking while the iron is hot-- I see that
[[Special:Statistics]] defines active as "edited within the last 30
days".    I'm open to however many users we can realistically get info
on-- the more the merrier, at least until I run out of RAM. :)

My initial query might go something like
"Select users where lasttouched was within the last month and total
edit counts are greater than 500".

And then, adding in the requirement of second project will narrow that pool.
And then adding the constraint of a second project with a second
language will narrow the pool even more.
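
A sketch of that selection against a toy table, using Python's built-in
sqlite3 rather than the real replicas. The table `toy_user` and its rows are
invented; the column names are modelled on MediaWiki's `user_editcount` and
`user_touched`, and reading "lasttouched" as `user_touched` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-in for a wiki's user table; only the columns used below.
conn.execute(
    "CREATE TABLE toy_user ("
    "user_name TEXT, user_editcount INTEGER, user_touched TEXT)"
)
conn.executemany(
    "INSERT INTO toy_user VALUES (?, ?, ?)",
    [
        ("Alice", 1200, "20110610120000"),
        ("Bob", 80, "20110612090000"),      # recent, but too few edits
        ("Carol", 5000, "20100101000000"),  # many edits, but not recent
        ("Dave", 501, "20110516081338"),    # just past both thresholds
    ],
)

# MediaWiki-style YYYYMMDDHHMMSS timestamps are fixed width, so plain
# string comparison orders them chronologically.
CUTOFF = "20110516081337"
rows = conn.execute(
    "SELECT user_name FROM toy_user "
    "WHERE user_touched > ? AND user_editcount > 500 "
    "ORDER BY user_name",
    (CUTOFF,),
).fetchall()
active = [name for (name,) in rows]
print(active)
```

Running the same selection on every wiki and then matching names across the
results would give the narrower "second project" pools described above.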

We're looking for the orphan community who have a lot of editors but
little connection to English and Meta.

Re: How can I get data to map our linguistic interconnectedness?

Alec Conroy-2
In reply to this post by Platonides
On Wed, Jun 15, 2011 at 7:42 AM, Platonides <[hidden email]> wrote:
> Alec Conroy wrote:
>  > We could directly ask them to tell us, but upon reflection, the
>  > information is already hidden in our database.  A multilingual user is
>  > one that actively edits two projects of different languages.
>
> Many users already told us, by using babel templates. That also explains
> how much confidence do they have in those languages (native level, basic
> skills...).

Babel templates are great-- if every user had them, we'd be good.
Unfortunately, if you know enough to use a babel template, you
probably are already 'tied in' to the global community and thus not in
need of outreach.   (this assumption may be false).

> There's also the motivation factor.
That's saying a mouthful.  Just knowing people can translate is not at
all the same as being able to expect they'll actually do it.  We just
found that out, and that's why we need to start building a translator
network now, rather than wait till next year.


> First point: define being active. That should be something like 'more
> than X non-minor edits in the last Y weeks.'

I'm flexible.   The point of activity is just to weed the data down to
a manageable size.  If we want to call anyone active at this stage,
that'd work.     I suggest lasttouched in 30 days, but that's totally
arbitrary.


> I see a problem in that you are exposing it as a symmetric relationship,
> while I don't think it should be.

Again, another very brilliant caveat.
I should say that my initial attempt at getting these kinds of
estimates was to look at worldwide language-overlap statistics and just
assume that wikimedians are "average humans", which they clearly
aren't.  This would get us a very very rough picture.

Analysis of actual edit patterns will get us a better view, but it'll
still be less precise than babel boxes or actual self-identification
as a translator.   Perhaps at some point we can explicitly ask users
to tell us directly their language skills.

Alecmconroy

Re: How can I get data to map our linguistic interconnectedness?

Alec Conroy-2
In reply to this post by Niklas Laxström
On Wed, Jun 15, 2011 at 8:08 AM, Niklas Laxström
<[hidden email]> wrote:
> On 15 June 2011 17:34, Alec Conroy <[hidden email]> wrote:
>> The important point of doing this would be:
>> 1) to identify those users with unique language skills and recruit them
> Recruit them to do what?

Recruit them to help the global community communicate with itself.
There are currently-unidentified individuals with a special gift that
will enable them to unite the global community in a way beyond that of
monolingual members.    Most recently, we needed a translator army to
help us run the elections, but the need for translators isn't going
away.  Every language we have needs to have a clear and direct
translation path so it can participate in the movement.

>> 2) to identify projects and languages that are 'most disconnected'
>> from the English hub, so we can make them less disconnected.
> Can we make them less disconnected? How?

First and foremost by pointing out to us that a certain community is
isolated.   This will hopefully  cause members of the global community
to reach out to the isolated community.  At the same time, it will
hopefully inspire members of the isolated community to reach out to
the global community.

In extreme cases, it's not inconceivable that the foundation has a
direct role to play in helping underrepresented projects communicate
with the rest of us.

Alec

Re: How can I get data to map our linguistic interconnectedness?

Aryeh Gregor
In reply to this post by Alec Conroy-2
On Wed, Jun 15, 2011 at 10:34 AM, Alec Conroy <[hidden email]> wrote:
> Is there an easy way to run this:
>
> For each of the 86,000 'active users':
>    Store a list for their edit counts on each project they've edited
>
> That's actually a fairly small dataset, and it would get us all the
> data we want.   I've been a developer before, but never here.   Any
> idea how I go about getting that info?

Get any toolserver user to run the necessary SQL queries.  This page
might be helpful, if no one on this list wants to run it for you:

https://wiki.toolserver.org/view/Query_service

Re: How can I get data to map our linguistic interconnectedness?

Platonides
In reply to this post by Alec Conroy-2
Alec Conroy wrote:

>> I think I can build you something if you give me appropriate values for
>> the above definition.
>>
>> Cheers
>
> Excellent-- so striking while the iron is hot-- I see that
> [[Special:Statistics]] defines active as "edited within the last 30
> days".    I'm open to however many users we can realistically get info
> on-- the more the merrier, at least until I run out of RAM. :)
>
> My initial query might go something like
> "Select users where lasttouched was within the last month and total
> edit counts are greater than 500".
>
> And then, adding in the requirement of second project will narrow that pool.
> And then adding the constraint of a second project with a second
> language will narrow the pool even more.
>
> We're looking for the orphan community who have a lot of editors but
> little connection to English and Meta.

I have added a small script at
http://www.toolserver.org/~platonides/activeusers/activeusers.php to
show active users per project and language.
Requisites for appearing there are more than 500 edits (total) and at
least one action (usually an edit) in the last month (since May 16, data
is cached).
Bots appear in the list.
I'm still populating the data, but it should be completed by the time
you read this.


Re: How can I get data to map our linguistic interconnectedness?

Diederik van Liere
Dear Alec,

Maybe the Community Department can help you out with your question. We
are doing a number of research sprints this summer to map out
different aspects of the Wikipedia communities and this sounds like a
great question and we have some researchers available to help write
the queries.
So please contact me and I'll hook you up with the right people.
Best,
Diederik



Re: How can I get data to map our linguistic interconnectedness?

Platonides
In reply to this post by Platonides
> I have added a small script at
> http://www.toolserver.org/~platonides/activeusers/activeusers.php to
> show active users per project and language.
> Requisites for appearing there are more than 500 edits (total) and at
> least one action (usually an edit) in the last month (since May 16, data
> is cached).
> Bots appear in the list.
> I'm still populating the data, but it should be completed by the time
> you read this.

I have done the intersection part
http://toolserver.org/~platonides/activeusers/intersection.php

I find the results to be quite useless for the original goal. Almost all
entries are bots.
Even intersecting big wikis like de-en [1] or en-es [2], where many
people are able to speak both languages, only shows one user.
So my conclusion is that people stay on their home wiki, and it is very
unusual for someone to pass 500 edits *both* on their home wiki and on a
foreign one.

For the record, going through the whole list to get the active users
took 30m26.207s. Not bad for 797 wikis. Actually doing the intersection
took 3m47.916s.
The app doesn't check SUL accounts; instead it naively treats equal
usernames as being the same person.
All wikis were compared for actions after 20110516081337.
The drift between that epoch and the point where each query was run was
not compensated for.

[1] http://wolfsbane.toolserver.org/~platonides/activeusers/intersection.php?project=wikipedia&pairs=de|en
[2] http://wolfsbane.toolserver.org/~platonides/activeusers/intersection.php?project=wikipedia&pairs=en|es
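
The naive username-based intersection described above amounts to a set
intersection per pair of wikis. In this sketch the per-wiki sets, the
usernames, and the `bots` list are hypothetical, and excluding bots is an
extra step that the original app did not perform.

```python
# Hypothetical per-wiki sets of active usernames from the first pass.
active_users = {
    "dewiki": {"ExampleBot", "Anna", "Bernd"},
    "enwiki": {"ExampleBot", "Anna", "Chris"},
    "eswiki": {"ExampleBot", "Diego"},
}
bots = {"ExampleBot"}  # assumed list of bot account names


def intersect(wiki_a, wiki_b, exclude=frozenset()):
    """Users active on both wikis, matched naively by identical username."""
    common = active_users[wiki_a] & active_users[wiki_b]
    return sorted(common - exclude)


print(intersect("dewiki", "enwiki"))
print(intersect("dewiki", "enwiki", exclude=bots))
print(intersect("enwiki", "eswiki", exclude=bots))
```

Matching on unified (SUL) accounts instead of raw names would avoid counting
two different people who happen to share a username.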


Re: How can I get data to map our linguistic interconnectedness?

Steven Walling
On Thu, Jun 16, 2011 at 9:44 AM, Platonides <[hidden email]> wrote:

> So my conclusion is that people stay on their home wiki, and it is very
> unusual for someone to pass 500 edits *both* on their home wiki and on a
> foreign one.
>

Agreed, I don't think this is a surprising result.

If we can filter out bots and contribs that have been imported/exported
(maybe via the logs?) then I think it would be more useful to lower the bar.
Perhaps at least 100 edits on their home wiki, and 10 edits on another?
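
A sketch of that lowered bar, applied to one user's per-wiki edit counts. It
assumes bots and imported edits were already filtered out, and it assumes the
home wiki is simply the one with the most edits; the function name and
defaults are illustrative.

```python
def cross_wiki_profile(counts, home_min=100, other_min=10):
    """Return (home_wiki, [other wikis]) for a qualifying user, else None.

    counts: {wiki: edit count} for one non-bot user.
    A user qualifies with at least `home_min` edits on the home wiki and
    at least `other_min` edits on one or more other wikis.
    """
    if not counts:
        return None
    home = max(counts, key=counts.get)  # assumption: most-edited wiki is home
    if counts[home] < home_min:
        return None
    others = sorted(
        wiki for wiki, n in counts.items() if wiki != home and n >= other_min
    )
    return (home, others) if others else None


print(cross_wiki_profile({"nlwiki": 150, "enwiki": 12, "dewiki": 3}))
print(cross_wiki_profile({"nlwiki": 99, "enwiki": 50}))
```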

Steven

Re: How can I get data to map our linguistic interconnectedness?

Jelle Zijlstra
You might also get better results when you don't limit yourself to recent
contributions. For example, I contributed heavily to the Dutch Wikipedia a
few years ago, and now contribute heavily to the English. I don't appear in
Platonides's list, because I hardly edit nl: at all any more. There may be
more people like that.

Jelle Zijlstra


Re: How can I get data to map our linguistic interconnectedness?

Platonides
Jelle Zijlstra wrote:
> You might also get better results when you don't limit yourself to recent
> contributions. For example, I contributed heavily to the Dutch Wikipedia a
> few years ago, and now contribute heavily to the English. I don't appear in
> Platonides's list, because I hardly edit nl: at all any more. There may be
> more people like that.
>
> Jelle Zijlstra

Maybe we should broaden the span for being active.


Re: How can I get data to map our linguistic interconnectedness?

M. Williamson
I would say broaden the span and lower the number of contribs required
just a little (maybe 300?).



Re: How can I get data to map our linguistic interconnectedness?

Thomas Morton
Or look for actives on one wiki.. and then cross check those names with all
the other wikis for the same names with over, say, 300 edits (at any time).
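
That cross-check might be sketched as follows. The `total_edits` layout, the
sample usernames, and the function name are assumptions; names are matched
naively, as in the existing tool.

```python
def cross_check(active_names, total_edits, base_wiki, min_edits=300):
    """Map each recently active user on `base_wiki` to the other wikis
    where the same username has more than `min_edits` edits at any time.

    total_edits: {wiki: {username: all-time edit count}}.
    """
    found = {}
    for name in sorted(active_names):
        wikis = sorted(
            wiki
            for wiki, counts in total_edits.items()
            if wiki != base_wiki and counts.get(name, 0) > min_edits
        )
        if wikis:
            found[name] = wikis
    return found


# Hypothetical all-time edit counts per wiki.
total_edits = {
    "enwiki": {"UserJ": 20000, "UserS": 900},
    "nlwiki": {"UserJ": 8000, "UserS": 5},
    "dewiki": {"UserS": 301},
}
result = cross_check({"UserJ", "UserS"}, total_edits, "enwiki")
print(result)
```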

Tom


Re: How can I get data to map our linguistic interconnectedness?

Platonides
Thomas Morton wrote:
> Or look for actives on one wiki.. and then cross check those names with all
> the other wikis for the same names with over, say, 300 edits (at any time).
>
> Tom

The edit count check already looks at the full (all-time) count; in the
last month only one action is needed.

