Fwd: Followup (Re: Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018)


Erica Litrenta

Hey all, if you are not watching https://www.mediawiki.org/wiki/Parsing/Get_involved (and you should!),
here's an update that Subbu wanted to share with all of us about the progress on the current initiative to fix wikitext patterns that behave differently with RemexHTML.
Some wikis have really taken action, so that's good! But some big communities (notably enwp) are really behind...
As a reminder, the numbers you'll see in some dashboards do not mean you need to fix millions of pages: most of the time you'll fix one template and most, if not all, of the reported errors will be gone.
Also: if you want to support the communities in figuring out what's to be done, note that https://www.mediawiki.org/wiki/Parsing/Replacing_Tidy/FAQ#What_will_editors_need_to_do.3F is marked for translation, but you could probably focus just on section 5 (the one I linked).
Finally, the team is always, always available for questions and clarifications. Just leave a message on the talk page of the FAQ ;)

Best,
Elitre

-------- Forwarded Message --------
Subject: Followup (Re: Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018)
Date: Thu, 10 Aug 2017 14:42:31 -0400
From: Subramanya Sastry [hidden email]
To: Wikimedia developers [hidden email]


On 07/06/2017 08:02 AM, Subramanya Sastry wrote:
>
> TL;DR
> -----
> The Parsing team wants to replace Tidy with a RemexHTML-based solution on the
> Wikimedia cluster by June 2018. This will require editors to fix pages and
> templates to address wikitext patterns that behave differently with
> RemexHTML.  Please see 'What editors will need to do' section on the Tidy
> replacement FAQ [1].
>
......
>
> 9. Monitoring progress
> ----------------------
> In order to monitor progress, we plan to do a weekly (or some such periodic
> frequency) test run that compares the rendering of pages with Tidy and with
> RemexHTML on a large sample of pages (in the 50K range) from a large subset
> of Wikimedia wikis (~50 or so).  This will give us a pulse of how fixups are
> going, and when we might be able to flip the switch on different wikis.

I wanted to post some followups on this.

1. We have a revived dashboard that tracks linter error counts on wikis
   for all linter categories.

   See https://tools.wmflabs.org/wikitext-deprecation/

2. We track the error counts as they change and publish weekly snapshots
   comparing counts to a July 24th baseline (which is when I first
   started collecting stats).

   See https://www.mediawiki.org/wiki/Parsing/Replacing_Tidy/Linter/Stats

3. We also have a pixel-diffs test run (previously called visual diffs)
   that compares page rendering with Tidy and with RemexHTML. The test
   set has 73K pages sampled from 60 wikis. These diffs more accurately
   reflect what kind of rendering differences we can expect to see if
   pages are not fixed.

   See http://mw-expt-tests.wmflabs.org/

4. Based on the runs above, I identified one more high-priority linter
   category, which is a Tidy whitespace bug that needs to be fixed (expect
   it to affect mostly templates, especially navboxes, based on what I've
   seen in the test run above). Once the code is reviewed and deployed to
   the cluster, we'll start populating this category.

   See https://gerrit.wikimedia.org/r/#/c/371068/ and
   https://gerrit.wikimedia.org/r/#/c/371071/

Thanks,
Subbu.


_______________________________________________
Translators-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/translators-l