Dumps are currently halted...


Dumps are currently halted...

Jochen Magnus-2
Hello admins and hostmasters,

download.wikimedia.org/backup-index.html says: "Dumps are currently halted pending
resolution of disk space issues. Hopefully will be resolved shortly."

Meanwhile several weeks have passed, and the German dump is now six weeks old.
May we still remain hopeful?

Thank you!

jo


_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Dumps are currently halted...

Robert Ullmann
And now another 3 weeks.

The en.wikt has not seen a dump since 13 June.

What does it take?

Robert

On Wed, Sep 3, 2008 at 4:25 PM, Jochen Magnus <[hidden email]> wrote:

> Hello admins and hostmasters,
>
> download.wikimedia.org/backup-index.html says: "Dumps are currently halted
> pending
> resolution of disk space issues. Hopefully will be resolved shortly."
>
> Meanwhile some weeks have passed, the german dump is six weeks old.  May we
> still stay
> hopefully?
>
> Thank you!
>
> jo

Re: Dumps are currently halted...

Brion Vibber-3

Robert Ullmann wrote:
> And now another 3 weeks.
>
> The en.wikt has not seen a dump since 13 June.
>
> What does it take?

We ended up with an incompatible disk array for the new dumps server;
replacement delivery ETA is September 29.

-- brion

Re: Dumps are currently halted...

Mathias Schindler-2
On Mon, Sep 22, 2008 at 6:59 PM, Brion Vibber <[hidden email]> wrote:

> We ended up with an incompatible disk array for the new dumps server;
> replacement delivery ETA is September 29.

Thanks for the info. In the meantime, would it be possible just to
produce pages-articles.xml.bz2 files without the history part, saving
enough disk space for this task to be run?

There is a huge number of projects that rely on at least a sporadic dump process.

Mathias


Re: Dumps are currently halted...

Brion Vibber-3

Mathias Schindler wrote:
> On Mon, Sep 22, 2008 at 6:59 PM, Brion Vibber <[hidden email]> wrote:
>
>> We ended up with an incompatible disk array for the new dumps server;
>> replacement delivery ETA is September 29.
>
> Thanks for the info. In the meantime, would it be possible just to
> produce pages-articles.xml.bz2 files without the history part, saving
> enough disk space for this task to be run?

Well, there's no *meantime* left -- I'll just start them all today.

-- brion

Re: Dumps are currently halted...

Mathias Schindler-2
On Mon, Oct 6, 2008 at 6:39 PM, Brion Vibber <[hidden email]> wrote:

> Well, there's no *meantime* left -- I'll just start them all today.

Even better, thanks a million.

Mathias


Re: Dumps are currently halted...

Robert Ullmann
In reply to this post by Brion Vibber-3
Hi,

That is excellent.

However, it does not solve the longstanding problem of having current pages
dumps and all-history dumps in the same queue. The current pages dump for a
small project that takes a few minutes is thus queued behind history dumps
for large projects that take weeks.

It is essential that the history dumps be in a separate queue, or that
threads are reserved for smaller projects.

best,
Robert

FYI: for anyone interested (although I suspect anyone on the en.wikt already
knows this): there are daily XML dumps for the en.wikt available at
http://devtionary.info/w/dump/xmlu/  ... these are done by incremental
revisions to the previous dump (i.e. not by magic ;-)

On Mon, Oct 6, 2008 at 7:39 PM, Brion Vibber <[hidden email]> wrote:

> Mathias Schindler wrote:
> > On Mon, Sep 22, 2008 at 6:59 PM, Brion Vibber <[hidden email]>
> wrote:
> >
> >> We ended up with an incompatible disk array for the new dumps server;
> >> replacement delivery ETA is September 29.
> >
> > Thanks for the info. In the meantime, would it be possible just to
> > produce pages-articles.xml.bz2 files without the history part, saving
> > enough disk space for this task to be run?
>
> Well, there's no *meantime* left -- I'll just start them all today.
>
> -- brion

Re: Dumps are currently halted...

Brion Vibber-3

Robert Ullmann wrote:

> Hi,
>
> That is excellent.
>
> However, it does not solve the longstanding problem of having current pages
> dumps and all-history dumps in the same queue. The current pages dump for a
> small project that takes a few minutes is thus queued behind history dumps
> for large projects that take weeks.
>
> It is essential that the history dumps be in a separate queue, or that
> threads are reserved for smaller projects.

Should be pretty easy to set up a small-projects-only thread.

Bigger changes to the dumps generation should come in the next couple
months to make it quicker and more reliable...

-- brion

Re: Dumps are currently halted...

Robert Ullmann
On Thu, Oct 9, 2008 at 9:25 PM, Brion Vibber <[hidden email]> wrote:
>
>
> Should be pretty easy to set up a small-projects-only thread.
>
> Bigger changes to the dumps generation should come in the next couple
> months to make it quicker and more reliable...
>

 If we look at the process right now, there are two threads: one is doing
enwiki, the other hewiki. The enwiki thread isn't even to pages-articles
yet, and will run for weeks. The hewiki dump will complete in a day or so,
but then next on deck is dewiki, which takes at least a week.

So with things running now, it will be a week or two before any other
projects get anything.

As you say, it would be easy to make threads limited to smaller projects;
I'd suggest adding an option (-small or something) that just has the code
skip [en, de, zh, he ...]wiki when looking for the least-recently completed
task. The code list should be 10-15 of the biggest 'pedias, and possibly
commons. Then start two small threads, and everything should go well?
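
Roughly something like this, as a sketch only -- the function, the field
names and the wiki list below are made up for illustration, not taken from
the actual worker script:

    # Hypothetical sketch: pick the least-recently-completed wiki,
    # skipping the big ones when a "-small" style option is given.
    BIG_WIKIS = {"enwiki", "dewiki", "frwiki", "jawiki", "plwiki",
                 "itwiki", "nlwiki", "ptwiki", "eswiki", "ruwiki",
                 "zhwiki", "hewiki", "commonswiki"}

    def next_wiki_to_dump(statuses, small_only=False):
        """statuses: list of (dbname, last_completed_timestamp) pairs."""
        candidates = [(db, done) for db, done in statuses
                      if not (small_only and db in BIG_WIKIS)]
        candidates.sort(key=lambda pair: pair[1])   # least recently done first
        return candidates[0][0] if candidates else None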

best, Robert

Re: Dumps are currently halted...

Thomas Dalton
>  If we look at the process right now, there are two threads: one is doing
> enwiki, the other hewiki. The enwiki thread isn't even to pages-articles
> yet, and will run for weeks. The hewiki dump will complete in a day or so,
> but then next on deck is dewiki, which takes at least a week.
>
> So with things running now, it will be a week or two before any other
> projects get anything.
>
> As you say, it would be easy to make threads limited to smaller projects;
> I'd suggest adding an option (-small or something) that just has the code
> skip [en, de, zh, he ...]wiki when looking for the least-recently completed
> task. The code list should be 10-15 of the biggest 'pedias, and possibly
> commons. Then start two small threads, and everything should go well?

Is it necessary to skip the 10 biggest? I think skipping just the top
3 would be a massive help.


Re: Dumps are currently halted...

Robert Ullmann
There are more large wikis than you might think. If you "skip" only the
three largest, there will still be a serious queuing problem. Better to have
2-3 threads unrestricted, with the understanding that 95% of the time one of
those will be on enwiki, and 1-2 doing all the other projects.

In any case, as it is right now, don't expect anything for a week or two
...

Robert

On Fri, Oct 10, 2008 at 8:03 PM, Thomas Dalton <[hidden email]> wrote:

> >  If we look at the process right now, there are two threads: one is doing
> > enwiki, the other hewiki. The enwiki thread isn't even to pages-articles
> > yet, and will run for weeks. The hewiki dump will complete in a day or
> so,
> > but then next on deck is dewiki, which takes at least a week.
> >
> > So with things running now, it will be a week or two before any other
> > projects get anything.
> >
> > As you say, it would be easy to make threads limited to smaller projects;
> > I'd suggest adding an option (-small or something) that just has the code
> > skip [en, de, zh, he ...]wiki when looking for the least-recently
> completed
> > task. The code list should be 10-15 of the biggest 'pedias, and possibly
> > commons. Then start two small threads, and everything should go well?
>
> Is it necessary to skip the 10 biggest? I think skipping just the top
> 3 would be a massive help.

Re: Dumps are currently halted...

Thomas Dalton
I'm trying to work out if it is actually desirable to separate the
larger projects onto one thread. The only way you can have a smaller
project dumped more often is to have the larger ones dumped less
often, but do we really want less frequent enwiki dumps? By
separating them and sharing them fairly between the threads you can
get more regular dumps, but the significant number is surely the
amount of time between one dump of your favourite project and the
next, which will only change if you share the projects unfairly. Why
do we want small projects to be dumped more frequently than large
projects?

I guess the answer, really, is to get more servers doing dumps - I'm
sure that will come in time.


Re: Dumps are currently halted...

Robert Ullmann
Look at it this way: you can't get enwiki dumps more than once every six weeks.
Each one TAKES SIX WEEKS. (modulo lots of stuff, I'm simplifying a bit ;-)

The example I have used before is going into my bank: in the main Queensway
office, there will be 50-100 people on the queue. When there are 8-10
tellers, it will go well; except that some transactions (depositing some
cash) take a minute or so, and some take many, many minutes. If there are 8
tellers, and 8 people in front of you with 20-30 minute transactions, you
are toast. (They handle this by having fast lines for deposits and such ;-)

In general, one queue feeding multiple servers/threads works very nicely if
the tasks are about the same size.

But what we have here is projects that take less than a minute, in the same
queue with projects that take weeks. That is 5 orders of magnitude: in the
time it takes to do the enwiki dump, the same thread could do ONE HUNDRED
THOUSAND small projects.
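
(As a quick Python sanity check on that figure:)

    # Six weeks, expressed in minutes:
    print(6 * 7 * 24 * 60)   # 60480 -- on the order of 10**5 one-minute dumps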

Imagine walking into your bank with a 30 second transaction, and being told
it couldn't be completed for 6 weeks because there were 3 officers
available, and 5 people who needed complicated loan approvals on the queue
in front of you.

That's the way the dumps are set up right now.
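
(A toy queueing illustration in Python -- the job mix is invented and this is
not the real scheduler -- of what that means for the small projects:)

    # Two worker threads; two six-week dumps arrive first, then 100 small
    # one-minute dumps. Compare one shared FIFO queue with reserving one
    # thread for the small jobs.
    BIG = 6 * 7 * 24 * 60              # six weeks in minutes
    jobs = [BIG, BIG] + [1] * 100

    def avg_small_finish(jobs, reserve_one_thread):
        free = [0, 0]                  # minute at which each thread is next free
        finishes = []
        for d in jobs:
            small = d < 60
            if reserve_one_thread:
                w = 1 if small else 0                 # thread 1 only takes small jobs
            else:
                w = 0 if free[0] <= free[1] else 1    # shared queue: earliest-free thread
            free[w] += d
            if small:
                finishes.append(free[w])
        return sum(finishes) / len(finishes)

    print(avg_small_finish(jobs, False))   # ~60505 minutes, i.e. about six weeks
    print(avg_small_finish(jobs, True))    # ~50 minutes (the second big dump
                                           # now finishes at ~12 weeks instead of ~6)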

On Sat, Oct 11, 2008 at 2:49 AM, Thomas Dalton <[hidden email]> wrote:

> I'm trying to work out if it is actually desirable to separate the
> larger projects onto one thread. The only way you can have a smaller
> project dumped more often is the have the larger ones dumped less
> often, but do we really want less frequent enwiki dumps? By
> separateing them and sharing them fairly between the threads you can
> get more regular dumps, but the significant number is surely the
> amount of time between one dump of your favourite project and the
> next, which will only change if you share the projects unfairly. Why
> do we want small projects to be dumped more frequently than large
> projects?
>
> I guess the answer, really, is to get more servers doing dumps - I'm
> sure that will come in time.

Re: Dumps are currently halted...

Thomas Dalton
2008/10/11 Robert Ullmann <[hidden email]>:
> Look at this way: you can't get enwiki dumps more than once every six weeks.
> Each one TAKES SIX WEEKS. (modulo lots of stuff, I'm simplifying a bit ;-)
>
> The example I have used before is going into my bank: in the main Queensway
> office, there will be 50-100 people on the queue. When there are 8-10
> tellers, it will go well; except that some transactions (depositing some
> cash) take a minute or so, and some take many, many minutes. If there are 8
> tellers, and 8 people in front of you with 20-30 minute transactions, you
> are toast. (They handle this by having fast lines for deposits and such ;-)

Your analogy is flawed. In that analogy the goal is to minimise the
time between walking in the door and completing your transaction, but
in our case we want to minimise the time between a person completing
one transaction and completing their next one, in an ever-repeating
loop. The circumstances are not the same.

And you can have enwiki dumps less than 6 weeks apart; it will just
involve having more than one running at a time.


Re: Dumps are currently halted...

Anthony-73
In reply to this post by Thomas Dalton
On Fri, Oct 10, 2008 at 7:49 PM, Thomas Dalton <[hidden email]> wrote:

> I guess the answer, really, is to get more servers doing dumps - I'm
> sure that will come in time.
>

No, the answer, really, is to do the dumps more efficiently.  Brion says
this should come in the next couple months.

Anthony

Re: Dumps are currently halted...

Nicolas Dumazet
Hey!

May I mention that the scripts generating the dumps and handling the
scheduling are written in Python and are available on the Wikimedia SVN? [1]

If you have some improvements to suggest on the task scheduling, I
guess that patches are welcome :)

In May, following another wikitech-l discussion [2], some small
improvements were made to the dump scheduling, to prioritize the wikis
that haven't been successfully dumped in a long time. Previously,
failed dump attempts were not taken into account: dumps were ordered
only by "last dump try start time", which led to some inconsistencies.
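
(Very roughly, the change was along these lines -- an illustrative sketch
with made-up field names, not the actual backup script:)

    # Prefer the wiki whose last *successful* dump is oldest, instead of the
    # one whose last *attempt* started longest ago (which ignored failures).
    def pick_next(wikis):
        """wikis: list of dicts with 'db', 'last_attempt', 'last_success'."""
        return min(wikis, key=lambda w: w["last_success"])["db"]

    def pick_next_old(wikis):          # the previous behaviour, for comparison
        return min(wikis, key=lambda w: w["last_attempt"])["db"]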

If I'm right, you should also consider that the XML dumping process
relies on the previous dumps to run faster: if a recent XML dump
exists, it is quicker to work from it, because text records can be
fetched from the old dump instead of from external storage, which also
requires normalizing and decompressing.
Here, the latest dump available for enwiki is from July, meaning a lot
of new text to fetch from external storage: this first dump *will*
take a long time, but you should expect the next dumps to go faster.
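
(The prefetch idea, sketched with toy dictionaries standing in for the old
dump and for external storage -- an illustration only, not the real dump
code:)

    # Reuse revision text from the previous XML dump when it is there; go to
    # external storage only for revisions that are new since that dump.
    def get_revision_text(rev_id, old_dump_texts, external_storage):
        if rev_id in old_dump_texts:
            return old_dump_texts[rev_id]    # cheap: reused from the last dump
        return external_storage[rev_id]      # expensive in reality: fetch,
                                             # decompress and normalize

    old_dump_texts = {1: "text of revision 1"}
    external_storage = {1: "text of revision 1", 2: "text of revision 2"}
    print(get_revision_text(2, old_dump_texts, external_storage))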

[1] http://svn.wikimedia.org/viewvc/mediawiki/trunk/backup/
[2] http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/38401/focus=38438

2008/10/11 Anthony <[hidden email]>:

> On Fri, Oct 10, 2008 at 7:49 PM, Thomas Dalton <[hidden email]> wrote:
>
>> I guess the answer, really, is to get more servers doing dumps - I'm
>> sure that will come in time.
>>
>
> No, the answer, really, is to do the dumps more efficiently.  Brion says
> this should come in the next couple months.
>
> Anthony



--
Nicolas Dumazet — NicDumZ [ nɪk.d̪ymz ]
pywikipedia & mediawiki

Re: Dumps are currently halted...

Anthony-73
On Fri, Oct 10, 2008 at 10:09 PM, Nicolas Dumazet <[hidden email]> wrote:

> Hey !
>
> May I mention that the scripts generating the dumps and handling the
> scheduling are written in Python and are available on wikimedia svn ?
> [1]
>

Well, you can, but I already knew that.

> If you have some improvements to suggest on the task scheduling, I
> guess that patches are welcome :)
>

Well, I don't know Python, and I'd advocate rewriting the dump system from
scratch anyway, but 1) I'd really need access to the SQL server in order to
do that; and 2) If I put that much work into something I need some sort of
financial reward.  Hiring me and/or paying for my family's health care is
welcome as well.

I'm actually working on redoing the full history bz2 dump as a bunch of
smaller bz2 files (of 900K or less uncompressed text each) so they can be
accessed randomly without losing the compression.  But it's going to take a
while for me to complete it, since I don't have a very fast machine or hard
drives, and I don't have a lot of time to spend on it since working on it
has little potential to feed, clothe, or shelter my family.  And when I
finish it, I'm probably not going to give it away for free, on the off
chance that maybe I can sell it to buy my daughter diapers or buy my son
milk or something.
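
(The general idea, sketched with Python's standard bz2 module -- a toy
illustration, not the actual tool being built:)

    import bz2

    CHUNK = 900 * 1024   # ~900K of uncompressed text per block

    def write_chunked_bz2(text_bytes, path):
        """Write many small bz2 streams back to back; return an offset index."""
        index = []
        with open(path, "wb") as out:
            for start in range(0, len(text_bytes), CHUNK):
                block = bz2.compress(text_bytes[start:start + CHUNK])
                index.append((out.tell(), len(block)))
                out.write(block)
        return index

    def read_chunk(path, index, n):
        """Decompress only the n-th block, leaving the rest of the file alone."""
        offset, length = index[n]
        with open(path, "rb") as f:
            f.seek(offset)
            return bz2.decompress(f.read(length))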

I'm a terrible person, aren't I?

Re: Dumps are currently halted...

Ilmari Karonen
In reply to this post by Thomas Dalton
Thomas Dalton wrote:

> 2008/10/11 Robert Ullmann <[hidden email]>:
>> Look at this way: you can't get enwiki dumps more than once every six weeks.
>> Each one TAKES SIX WEEKS. (modulo lots of stuff, I'm simplifying a bit ;-)
>>
>> The example I have used before is going into my bank: in the main Queensway
>> office, there will be 50-100 people on the queue. When there are 8-10
>> tellers, it will go well; except that some transactions (depositing some
>> cash) take a minute or so, and some take many, many minutes. If there are 8
>> tellers, and 8 people in front of you with 20-30 minute transactions, you
>> are toast. (They handle this by having fast lines for deposits and such ;-)
>
> Your analogy is flawed. In that analogy the desire is to minimise the
> amount of time between walking in the door and completing your
> transaction, but in our case we desire to minimise the amount of time
> between a person completing one transaction and that person completing
> their next transaction in an ever repeating loop. The circumstances
> are not the same.
>
> And you can have enwiki dumps less than 6 weeks apart, it will just
> involve having more than one running at a time.

AIUI (but please correct me if I'm wrong), you can't.  At least not
without throwing more hardware at it: if you try to run two enwiki
dumps concurrently on the same hardware, you'll find that they both
finish in _twelve_ weeks instead of six.

If not, let's just run _all_ the dumps in parallel simultaneously, and
the problem is solved!  ...right?

--
Ilmari Karonen


Re: Dumps are currently halted...

Alex Zaddach
In reply to this post by Robert Ullmann
You also have to take demand into consideration; how many people are
waiting for dumps of enwiki, dewiki, etc. vs. how many are waiting for
the smaller wikis? (Not a rhetorical question, I'd be interested in the
answer.) To use the bank analogy, if everyone is waiting for a loan, you
don't move your loan officers to the teller windows just because they
can process small transactions faster. Note also that several dozen of
the smallest wikis have fewer than 5000 articles. If someone has a bot
or sysop account, they can get the current revision of every article
with a single API query. While a dump would be more efficient and
probably slightly faster, getting the current revision for every article
on a large wiki basically requires a dump.
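
(For example, something along these lines against the MediaWiki API -- a
sketch only; the wiki URL and User-Agent are placeholders, and depending on
rights and limits it may take a few continuation requests rather than
literally one:)

    import json
    import urllib.parse
    import urllib.request

    API = "https://simple.wiktionary.org/w/api.php"   # placeholder small wiki
    params = {
        "action": "query",
        "generator": "allpages",
        "gapnamespace": "0",
        "gaplimit": "max",
        "prop": "revisions",
        "rvprop": "content",
        "format": "json",
    }
    req = urllib.request.Request(
        API + "?" + urllib.parse.urlencode(params),
        headers={"User-Agent": "dump-discussion-example/0.1"})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    pages = data.get("query", {}).get("pages", {})
    print(len(pages), "pages returned in this batch")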

Robert Ullmann wrote:

> Look at this way: you can't get enwiki dumps more than once every six weeks.
> Each one TAKES SIX WEEKS. (modulo lots of stuff, I'm simplifying a bit ;-)
>
> The example I have used before is going into my bank: in the main Queensway
> office, there will be 50-100 people on the queue. When there are 8-10
> tellers, it will go well; except that some transactions (depositing some
> cash) take a minute or so, and some take many, many minutes. If there are 8
> tellers, and 8 people in front of you with 20-30 minute transactions, you
> are toast. (They handle this by having fast lines for deposits and such ;-)
>
> In general, one queue feeding multiple servers/threads works very nicely if
> the tasks are about the same size.
>
> But what we have here is projects that take less than a minute, in the same
> queue with projects that take weeks. That is 5 orders of magnitude: in the
> time in takes to do the enwiki dump, the same thread could do ONE HUNDRED
> THOUSAND small projects.
>
> Imagine walking into your bank with a 30 second transaction, and being told
> it couldn't be completed for 6 weeks because there were 3 officers
> available, and 5 people who needed complicated loan approvals on the queue
> in front of you.
>
> That's the way the dumps are set up right now.
>
> On Sat, Oct 11, 2008 at 2:49 AM, Thomas Dalton <[hidden email]> wrote:
>
>> I'm trying to work out if it is actually desirable to separate the
>> larger projects onto one thread. The only way you can have a smaller
>> project dumped more often is the have the larger ones dumped less
>> often, but do we really want less frequent enwiki dumps? By
>> separateing them and sharing them fairly between the threads you can
>> get more regular dumps, but the significant number is surely the
>> amount of time between one dump of your favourite project and the
>> next, which will only change if you share the projects unfairly. Why
>> do we want small projects to be dumped more frequently than large
>> projects?
>>
>> I guess the answer, really, is to get more servers doing dumps - I'm
>> sure that will come in time.
>>


--
Alex (wikipedia:en:User:Mr.Z-man)


Re: Dumps are currently halted...

Robert Ullmann
In reply to this post by Thomas Dalton
On Sat, Oct 11, 2008 at 3:50 AM, Thomas Dalton <[hidden email]> wrote:

> 2008/10/11 Robert Ullmann <[hidden email]>:
> > Look at this way: you can't get enwiki dumps more than once every six
> weeks.
> > Each one TAKES SIX WEEKS. (modulo lots of stuff, I'm simplifying a bit
> ;-)
> >
> > The example I have used before is going into my bank: in the main
> Queensway
> > office, there will be 50-100 people on the queue. When there are 8-10
> > tellers, it will go well; except that some transactions (depositing some
> > cash) take a minute or so, and some take many, many minutes. If there are
> 8
> > tellers, and 8 people in front of you with 20-30 minute transactions, you
> > are toast. (They handle this by having fast lines for deposits and such
> ;-)
>
> Your analogy is flawed. In that analogy the desire is to minimise the
> amount of time between walking in the door and completing your
> transaction, but in our case we desire to minimise the amount of time
> between a person completing one transaction and that person completing
> their next transaction in an ever repeating loop. The circumstances
> are not the same.
>
>
No, the analogy is exactly correct; your statement of the problem is not.
There is no reason whatever that a hundred other projects should have to
wait six weeks to be "fair", just because enwiki takes that long. Just as
there is no reason for the person with the 30-second daily transaction to
wait behind someone spending 30 minutes settling their monthly KRA (tax
authority) accounts.

We aren't going to get enwiki dumps more often than every six weeks
(unless/until whatever rearrangement Brion is planning). But at the same
time, there is no reason whatever that smaller projects can't get dumps
every week, consistently; they just need a thread that only serves them.
Just like that "deposits only in 500s and 1000s bills" teller at the bank.