Wikimedia production excellence (February 2019)

Krinkle
📘 Read on Phabricator at
https://phabricator.wikimedia.org/phame/live/1/post/141/
-------

How’d we do in our striving for operational excellence? Read on to find out!

- Month in numbers.
- Current problems.
- Highlighted stories.

## 📊 *Month in numbers*

* 7 documented incidents. [1]
* 30 new Wikimedia-prod-error tasks created. [2] (17 new in Jan, and 18 in
Dec.)
* 27 Wikimedia-prod-error tasks closed. [3] (16 closed in Jan, and 20 in
Dec.)

In total, there are 177 open Wikimedia-prod-error tasks today. (188 in Feb,
172 in Jan, and 165 in Dec.)

## 📉  *Current problems*

The number of application errors reported each week has increased. We’ve
mostly managed to keep up with them each week, so that’s great!

But it does appear that most weeks we accumulate one or two unresolved
errors, which is starting to add up. I believe this is mainly because they
were reported a day after the branch went out. That is, if the same issues
had been reported 24 hours earlier in a given week, they might’ve blocked
the train as a regression.

→  https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Below is a breakdown of unresolved prod errors since last quarter. (I’ve
omitted the last three weeks.)

By month:

  - February: 5 reports (1.33-wmf.16, 1.33-wmf.17, 1.33-wmf.18).
  - January: 3 reports (1.33-wmf.13, 1.33-wmf.14).
  - December 2018: 5 reports (1.33-wmf.9).
  - November 2018: 3 reports (1.33-wmf.2).
  - October 2018: 1 report (1.32-wmf.26).
  - September 2018: 2 reports (1.32-wmf.20).

By steward and software component:

Core Platform:
* Parser: https://phabricator.wikimedia.org/T216664.
* Revision backend: https://phabricator.wikimedia.org/T214035,
https://phabricator.wikimedia.org/T212428.

Growth:
* Echo: https://phabricator.wikimedia.org/T217079.
* Flow: https://phabricator.wikimedia.org/T212742,
https://phabricator.wikimedia.org/T204793.
* Page deletion: https://phabricator.wikimedia.org/T203913.

Wikidata:
* Wikibase: https://phabricator.wikimedia.org/T217329,
https://phabricator.wikimedia.org/T215380,
https://phabricator.wikimedia.org/T213483.
* WikibaseLexeme: https://phabricator.wikimedia.org/T207479,
https://phabricator.wikimedia.org/T200906.
* WikibaseQualityConstraints: https://phabricator.wikimedia.org/T212282.

Performance:
* Lib-rdbms: https://phabricator.wikimedia.org/T212284.

Multimedia:
* MediaWiki uploading: https://phabricator.wikimedia.org/T208539.

Fundraising-Tech:
* CentralNotice: https://phabricator.wikimedia.org/T209741.

(Nobody - pending code ownership process):
* ImageMap extension: https://phabricator.wikimedia.org/T217087.
* Nuke extension: https://phabricator.wikimedia.org/T212690.

## *️⃣ *Fixed exposed fatal error on Special:Contributions*

Previously, a link to Special:Contributions could pass invalid options to a
part of MediaWiki that doesn’t allow invalid options. Why would anything
allow invalid options? Let’s find out.

Think about software as an onion. Software tends to have an outer layer
where everything is allowed. If this layer finds illegal user input, it has
to respond somehow. For example, by informing the user. In this outer
layer, illegal input is not a problem in the software. It is a normal thing
to see as we interact with the user. This outer layer responds directly to
a user, is translated, and can do things like “view recent changes”, “view
user contributions” or “rename a page”.

Internally, such an action is divided into many smaller tasks (or functions).
For example, a function might be “get talk namespace for given subject
namespace”. This would answer “Talk:” to “(Article)”, and “Wikipedia_talk:”
to “Wikipedia:”. When searching for edits on My Contributions with
“Associated namespaces” ticked, this function is used. It is also used by
Move Page if renaming a page together with its talk page. And it’s used on
Recent Changes and View History, for all those little “talk” links next to
each page title and username.

If one of your edits is for a page that has no discussion namespace, what
should MediaWiki do? Show no edits? Skip that edit and tell the user “1
edit was hidden”? Show normally, but without a talk link? That decision is
made by the outer layer for a feature, when it catches the internal
exception. Alternatively, it can sometimes avoid an exception by asking a
different question first – a question that cannot fail. Such as “Does
namespace X have a talk space?”, instead of “What is the talk space for X?”.
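
To make the two approaches concrete, here is a minimal sketch in Python. It is not MediaWiki’s actual code: every name in it (`TALK_NAMESPACES`, `NoTalkNamespaceError`, `has_talk_namespace`, `get_talk_namespace`, and the two `talk_link_*` helpers) is hypothetical, and the choice to show the edit without a talk link is just one of the options listed above.

```python
from typing import Optional

# Hypothetical illustration only; not MediaWiki's real implementation.
# Maps a subject namespace to its talk namespace.
TALK_NAMESPACES = {
    "(Article)": "Talk:",
    "Wikipedia:": "Wikipedia_talk:",
}


class NoTalkNamespaceError(Exception):
    """Raised by the inner layer when a namespace has no talk space."""


def has_talk_namespace(subject: str) -> bool:
    # The question that cannot fail: it always answers True or False.
    return subject in TALK_NAMESPACES


def get_talk_namespace(subject: str) -> str:
    # The question that can fail: the inner layer rejects invalid input.
    if subject not in TALK_NAMESPACES:
        raise NoTalkNamespaceError(subject)
    return TALK_NAMESPACES[subject]


def talk_link_by_catching(subject: str) -> Optional[str]:
    # Outer layer, option 1: catch the internal exception and decide
    # what the feature should do (here: show no talk link).
    try:
        return get_talk_namespace(subject)
    except NoTalkNamespaceError:
        return None


def talk_link_by_asking_first(subject: str) -> Optional[str]:
    # Outer layer, option 2: ask the safe question first, so the
    # failing question is only asked when it is known to succeed.
    if has_talk_namespace(subject):
        return get_talk_namespace(subject)
    return None
```

Both outer-layer helpers return the same result for a namespace without a talk space. Calling `get_talk_namespace` directly on such a namespace, with nothing to catch the exception, is the analogue of the fatal error described here.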

When a program doesn’t catch or avoid an exception, a fatal error occurs.
Thanks to D3r1ck01 for fixing this fatal error. –
https://phabricator.wikimedia.org/T150324


💡*ProTip*: If the Jenkins build is failing for a project and you suspect
it’s unrelated to the project itself, be sure to report it to Phabricator
under “Shared Build Failure”, or use
https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?project=shared-build-failure


## 🎉 *Thanks!*

Thank you to everyone who has helped by reporting, investigating, or
resolving problems in Wikimedia production. Including: Aaron Schulz,
Addshore, Alaa Sarhan, Amorymeltzer, Anomie, D3r1ck01, Daimona, Daniel
Kinzler, Hashar, Hoo man, Jcrespo, KaMan, Mainframe98, Marostegui, Matej
Suchanek, Ottomata, Pchelolo, Reedy, Revi, Smalyshev, Tarrow, Tgr,
Thcipriani, Umherirrender, and Volker E.

Thanks!

Until next time,

– Timo Tijhof

🍏 *He got me invested in some kind of.. fruit company.*

-------

Footnotes:

[1] Incidents. –
https://wikitech.wikimedia.org/wiki/Special:AllPages?from=Incident+documentation%2F20190200&to=Incident+documentation%2F20190300&namespace=0

[2] Tasks created. –
https://phabricator.wikimedia.org/maniphest/query/a0yuo6bqDOrh/#R

[3] Tasks closed. –
https://phabricator.wikimedia.org/maniphest/query/7pmQcTvTWw_4/#R
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: Wikimedia production excellence (February 2019)

Derk-Jan Hartman
Just a note that I have really appreciated these overviews and I hope they
keep going strong!

DJ

On Thu, Mar 21, 2019 at 8:15 PM Krinkle <[hidden email]> wrote:

> 📘 Read on Phabricator at
> https://phabricator.wikimedia.org/phame/live/1/post/141/