Fwd: Ongoing Labs Outage Update

Fwd: Ongoing Labs Outage Update

Yuvi Panda
---------- Forwarded message ----------
From: Yuvi Panda <[hidden email]>
Date: Thu, Jun 18, 2015 at 4:56 PM
Subject: Ongoing Labs Outage Update
To: [hidden email]


Yesterday, the filesystem used by many Labs tools suffered a
catastrophic failure, causing most tools to break. This was noticed
quickly but recovery is taking a long time because of the size of the
filesystem.

There has been file system corruption on the filesystem backing the
NFS setup that all of Labs uses, causing a prolonged outage. The
Operations team is currently attempting to restore a backup made on
June 9 at 16:00 UTC. Recovery of modifications made after that date is
potentially possible, but our first priority is getting the backup
restored. We will update the incident report page
https://wikitech.wikimedia.org/wiki/Incident_documentation/20150617-LabsNFSOutage
with notes on our progress. E-mails will also be sent to the
labs-announce (https://lists.wikimedia.org/mailman/listinfo/labs-announce)
and labs-l (https://lists.wikimedia.org/mailman/listinfo/labs-l) lists
when there are significant changes. We are not yet able to estimate
when things will be fully back up.

This also means that tools hosted on tools.wmflabs.org will not be
accessible until this is finished, and even then they might need some
more fiddling to work properly. We will also update
https://wikitech.wikimedia.org/wiki/Incident_documentation/20150617-LabsNFSOutage
as soon as we have more information.

If you have a non-tools project on Labs that does not depend on NFS
and is currently down, you can recover it by removing NFS from it. (We
can help you with that.) For instructions, see
https://wikitech.wikimedia.org/wiki/Recover_instance_from_NFS . Join
us on #wikimedia-labs and we will assist you.


--
Yuvi Panda T
http://yuvi.in/blog


_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: Fwd: Ongoing Labs Outage Update

Edward Galvez
Thank you for working on this, Yuvi and Labs team!

--
Edward Galvez
Program Evaluation Associate
Wikimedia Foundation