Wikimedia production excellence (June 2019)


📘 Read on Phabricator at

How’d we do in our striving for operational excellence last month? Read on to
find out!

##  📊 Month in numbers

* ⚠️ 11 documented incidents. [1]
* 39 new Wikimedia-prod-error reports. [2]
* 25 Wikimedia-prod-error reports closed. [3]

The number of incidents in June was high compared to previous years. At 11
incidents, it is higher than this year’s median (5), the 2018 median (4),
and the 2017 median (5). It is also higher than any June in the
last 4 years. – More data at <>.

To read more about these incidents, their investigations, and pending
actionables, check <>.

There are currently 204 open Wikimedia-prod-error reports (up from 186 in
April, and 201 in May). [4]


##  *️⃣ [Op-ed] Integrated maintenance cost

A shoutout to the Wikidata and Core Platform teams, at WMDE and WMF
respectively. They both recently established a rotating subteam that
focuses on incidental work, such as maintenance and other work that might
otherwise hinder feature development.

I expect this to improve efficiency by avoiding context switches between
feature and incidental work. The rotational aspect should distribute the
work more evenly among team members (avoiding burnout). It may also
increase exposure to other teams and to lesser-known areas of our code, which
provides opportunities for personal growth and helps retain institutional
knowledge.


##  📉  Current problems

Take a look at the workboard and look for tasks that might need your help.
The workboard lists known issues, grouped by the month in which they were
first observed.


Breakdown of recent months (past two weeks not included):

* November: 1 issue got fixed! (1 issue left).
* December: 3 issues left (unchanged). ⚠️
* January: 1 issue left (unchanged). ⚠️
* February: 2 issues left (unchanged). ⚠️
* March: 4 issues left (unchanged). ⚠️
* April: 2 issues got fixed! (10 of the 14 issues that survived April remain).
* May: 4 issues got fixed! (6 of the 10 issues that survived May are left).
* June: 11 new issues from last month remain unresolved.

By steward and software component, the unresolved issues that survived June:

* CPT / MW Auth (PHP fatal):
* CPT / MW Actor (DB contention):
* CPT or Multimedia / Thumb handler (MultiCurl error):
* Multimedia / File metadata (PHP error):
* Wikidata / Commons page view (PHP fatal):
* Wikidata / Jobrunner (PHP memory fatal):
* Wikidata / Jobrunner (Trx error):
* Product-Infra / ReadingList API (PHP fatal):
* (Unknown?) / Special:ConfirmEmail (PHP fatal):
* (Unknown?) / Page renaming (DB timeout):
* (Unknown?) / Page renaming (Bad revision fatal):


##  🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or
resolving problems in Wikimedia production, including: Brad Jorsch, Brion
Vibber, Roan Kattouw, CScott, Daniel Kinzler, David Causse, DerFussi,
Ebe123, Filippo Giunchedi, James Forrester, Kosta Harlan, Legoktm, Lucas
Werkmeister, Bartosz Dziewoński, Matthias Mullie, Michael Große, Niklas
Laxström, Stephane Bisson, Stas Malyshev, Tchanders, Gergő Tisza, Tpt,
Umherirrender, and Urbanecm.


Until next time,

– Timo Tijhof

🔮 “These are his marbles...” “Ha! He really did lose his marbles, didn't
he?” “Yeah, he lost them good.”



[1] Incidents. –

[2] Tasks created. –

[3] Tasks closed. –

[4] Open tasks. –