Important news about the November dumps run!

Ariel Glenn WMF
As was previously announced on the xmldatadumps-l list, the sql/xml dumps
generated twice a month will be written to an internal server, starting
with the November run.  This is in part to reduce load on the web/rsync/nfs
server, which until now has also been doing this work.  We want to separate
these roles for other reasons as well.

Because I want to get this right, and there are a lot of moving parts, and
I don't want to rsync all the prefetch data over to these boxes again next
month after cancelling the move:

********
If needed, the November full run will be delayed for a few days.
If the November full run takes too long, the partial run, usually starting
on the 20th of the month, will not take place.
*********

Additionally, as described in an earlier email on the xmldatadumps-l list:

*********
Files will show up on the web server/rsync server with a substantial
delay.  Initially this may be a day or more.  This includes index.html and
other status files.
*********

You can keep track of developments here:
https://phabricator.wikimedia.org/T178893

If you know folks not on the lists in the recipients field for this email,
please forward it to them and suggest that they subscribe to this list.

Thanks,

Ariel
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Important news about the November dumps run!

Ariel Glenn WMF
The first set of dumps is running there and looks like it's working OK.
I've done a manual rsync of the files produced up to this point, so those
are now available on the web server.

As before, you can follow work on this at
https://phabricator.wikimedia.org/T178893

Note that some index.html files may contain links to files that were not
picked up by the rsync.  Those files will be there sometime tomorrow, after
the next rsync.
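
If you want to check which linked files have not arrived yet, something
along these lines may help.  This is only a rough sketch, not part of the
dumps tooling: the helper names (LinkCollector, missing_files) and the file
names in the example are made up for illustration, and it assumes you
already have the index.html text and a list of the file names present on
the mirror.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def missing_files(index_html, available):
    """Return linked file names that are not in `available`, in page order.

    Only the last path component of each link is compared, so links to
    directories (ending in "/") are skipped.
    """
    parser = LinkCollector()
    parser.feed(index_html)
    missing = []
    for href in parser.links:
        name = href.rsplit("/", 1)[-1]
        if name and name not in available and name not in missing:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical page content and file names, for illustration only.
    sample = (
        '<a href="/somewiki/20171103/a.sql.gz">a</a> '
        '<a href="/somewiki/20171103/b.xml.bz2">b</a>'
    )
    print(missing_files(sample, {"a.sql.gz"}))
```

Anything it reports is simply a file the status page links to that has not
been synced yet, not necessarily a failed job.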

Ariel

On Mon, Oct 30, 2017 at 5:39 PM, Ariel Glenn WMF <[hidden email]>
wrote:

> [...]

Re: Important news about the November dumps run!

Ariel Glenn WMF
Rsync of xml/sql dumps to the web server is now running on a rolling basis
via a script, so you should see updates regularly rather than "every
$random hours".  There's more to be done on that front; see
https://phabricator.wikimedia.org/T179857 for what's next.

Ariel

Re: Important news about the November dumps run!

Nicolas Vervelle-4
Hi,

Are there problems with some dumps, such as frwiki, under the new system?
On the your.org mirror, important files like pages-articles are still
missing from the 20171103 dump directory, when usually it only takes a day...

Nico

On Mon, Nov 6, 2017 at 8:01 PM, Ariel Glenn WMF <[hidden email]> wrote:

> [...]

Re: Important news about the November dumps run!

Ariel Glenn WMF
There are no problems that I see.  We did get started a couple of days late
for this run due to the move to an internal server, but I see all jobs
running fine.  The frwiki pages-articles dumps have not yet run; enwiki and
wikidatawiki are in progress; eswiki, itwiki, jawiki, and zhwiki are busy
writing pages-articles right now, etc.  Just give it another couple of days
:-)

Ariel

On Tue, Nov 7, 2017 at 7:28 PM, Nicolas Vervelle <[hidden email]>
wrote:

> [...]