[Wikimedia Technical Talks] Retargeting extensions to work with Parsoid

Sarah Rodlund
Hi Everyone,

Mark your calendars! Wikimedia Tech Talks 2020 Episode 6 will take
place on Wednesday, 12 August 2020 at 17:00 UTC.

Title: Retargeting extensions to work with Parsoid

Speaker: Subramanya Sastry


The Parsing team is aiming to replace the core wikitext parser with Parsoid
for Wikimedia wikis sometime late next year. Parsoid models and processes
wikitext quite differently from the core parser: all that Parsoid
guarantees is that the rendering is largely identical, not that the
rendering is generated by the same process. This means that extensions
that extend the behavior of the parser will need to adapt to work with
Parsoid in order to provide similar functionality [1]. With that in mind, we
have been working to specify more clearly how extensions need to adapt to
the Parsoid regime.

At a high level, here are the questions we needed to answer:
1) How do extensions "hook" into Parsoid?
2) When the registered hook listeners are invoked by Parsoid, how do they
process any wikitext they need to process?
3) How is the extension's output assimilated into the page output?

Broadly, the (highly simplified) answers are as follows:
1) Extensions now need to think in terms of transformations (convert this
to that) instead of events (at this point in the pipeline, call this
listener). So, more transformation hooks and fewer parsing-event hooks.
2) Parsoid provides all registered listeners with a ParsoidExtensionAPI
object, which extensions can use to interact with Parsoid and to process
wikitext.
3) The output is treated as a "fully-processed" page/DOM fragment. It is
appropriately decorated with additional markup and slotted into place in
the page. Extensions need not make any special efforts (such as strip
state) to protect it from the parsing pipeline.
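
To make the three answers above a little more concrete, here is a toy
model in Python. It is only a sketch of the idea and not Parsoid's actual
API: every name in it (ToyParser, ToyExtensionAPI, Fragment, register_tag,
wikitext_to_fragment, the "shout" and "raw" tags, the data-ext attribute)
is made up for illustration, and the "wikitext" it understands is limited
to ''italic'' markup. It shows a handler registered as a transformation,
a handler using the API object it is given to process wikitext, and the
handler's output being slotted into the page without being parsed again.

```python
# Toy model only: all names here are hypothetical, not Parsoid's real API.
import re
from dataclasses import dataclass, field
from html import escape
from typing import Callable, Dict, List


@dataclass
class Fragment:
    """Stands in for a fully processed DOM fragment."""
    html: str


class ToyExtensionAPI:
    """Hypothetical stand-in for the API object handed to extensions."""

    def wikitext_to_fragment(self, wikitext: str) -> Fragment:
        # Pretend wikitext processing: only ''italic'' is supported.
        text = escape(wikitext, quote=False)
        return Fragment(re.sub(r"''(.+?)''", r"<i>\1</i>", text))


TagHandler = Callable[[ToyExtensionAPI, str], Fragment]


@dataclass
class ToyParser:
    handlers: Dict[str, TagHandler] = field(default_factory=dict)

    def register_tag(self, name: str, handler: TagHandler) -> None:
        # (1) A transformation hook: "convert <name>...</name> to a fragment".
        self.handlers[name] = handler

    def render(self, source: str) -> str:
        api = ToyExtensionAPI()
        out: List[str] = []
        pos = 0
        for m in re.finditer(r"<(\w+)>(.*?)</\1>", source, re.S):
            name, body = m.group(1), m.group(2)
            if name not in self.handlers:
                continue  # unknown tags are left to be treated as plain text
            out.append(api.wikitext_to_fragment(source[pos:m.start()]).html)
            # (2) The handler receives the API object and the tag body.
            frag = self.handlers[name](api, body)
            # (3) The fragment is decorated and slotted in as-is; it is
            # never fed back through wikitext processing.
            out.append(f'<div data-ext="{name}">{frag.html}</div>')
            pos = m.end()
        out.append(api.wikitext_to_fragment(source[pos:]).html)
        return "".join(out)


def shout(api: ToyExtensionAPI, body: str) -> Fragment:
    # Asks the API to process the wikitext inside the tag.
    inner = api.wikitext_to_fragment(body.upper())
    return Fragment(f"<b>{inner.html}</b>")


def raw(api: ToyExtensionAPI, body: str) -> Fragment:
    # Returns the body untouched (apart from HTML escaping).
    return Fragment(escape(body, quote=False))


parser = ToyParser()
parser.register_tag("shout", shout)
parser.register_tag("raw", raw)

print(parser.render("a ''b'' <shout>hi ''x''</shout> c"))
print(parser.render("<raw>''x''</raw>"))
```

In this sketch, the ''x'' inside the "raw" tag comes out unchanged, with
no extra protection needed from the handler, because the toy parser treats
whatever a handler returns as finished output.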

In this talk, we will go over the draft Parsoid API for extensions [2] and
the kinds of changes that extensions would need to make. In this initial
stage, we are primarily targeting extensions that are deployed on the
Wikimedia wikis, but eventually all MediaWiki extensions that use parser
hooks or use the "parser API" to process wikitext will need to change. We hope to
use this talk to reach out to MediaWiki extension developers and get
feedback about the draft API so we can refine it appropriately.

[1] https://phabricator.wikimedia.org/T258838

[2] https://www.mediawiki.org/wiki/Parsoid/Extension_API

The link to the YouTube Livestream can be found here:


During the live talk, you are invited to join the discussion on IRC at

You can browse past Tech Talks here:

If you are interested in giving your own tech talk, you can learn more here:


Sarah R. Rodlund
Senior Technical Writer, Developer Advocacy
[hidden email]