Fwd: find previous section title

Fwd: find previous section title

Moritz Schubotz-2
Dear all,

is there an API for extracting the previous section title from a wikipage?
My situation is the following. I have a wikipage that looks like this:
<page>
intro
==section1==
text
<math id=1>...</math>
text
<math id=2>...</math>
===section 2===
text
<math id=3>...</math>
</page>
And I want to know the previous section title for each math object in that page:
1->section1
2->section1
3->section 2

It's certainly doable to write a program that extracts that
information from the wikipage... but I guess rare special cases
would cause a lot of long-tail trouble.

So is there an API that could be used for that? Either Parsoid or the old
regular parser works for me.

Best
Physikerwelt

_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: Fwd: find previous section title

Gabriel Wicke-3
Moritz,

you can certainly do this in HTML, either using the PHP parser output or
Parsoid. Parsoid output makes it easier to identify math extension output.
If you need the wikitext for the heading, then Parsoid can also give you
the source offsets of the heading in data-parsoid (see the dsr property in
there; it encodes startOffset, endOffset, startTagWidth, endTagWidth).

Gabriel
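
For illustration, the HTML approach described above could be sketched roughly as follows. This is a minimal sketch, not a tested recipe against real parser output. It assumes that math extension output carries typeof="mw:Extension/math" and that each math element has an id attribute, as in the example page. The names SectionTracker, map_math_to_sections and heading_wikitext are hypothetical helpers, and the sample HTML is hand-written. The dsr helper assumes the offsets index directly into the wikitext string.

```python
import json
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SectionTracker(HTMLParser):
    """Remember the most recent heading and record it for each math element."""

    def __init__(self, math_typeof="mw:Extension/math"):
        super().__init__()
        # Assumption: math output is marked with this typeof value.
        self.math_typeof = math_typeof
        self.current_heading = None  # None while still in the intro
        self._in_heading = False
        self._buf = []
        self.results = []  # list of (math id, previous heading text)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in HEADING_TAGS:
            self._in_heading = True
            self._buf = []
        elif self.math_typeof in (attrs.get("typeof") or "").split():
            self.results.append((attrs.get("id"), self.current_heading))

    def handle_data(self, data):
        if self._in_heading:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._in_heading:
            self.current_heading = "".join(self._buf).strip()
            self._in_heading = False


def map_math_to_sections(html):
    """Return [(math id, heading text)] in document order."""
    tracker = SectionTracker()
    tracker.feed(html)
    tracker.close()
    return tracker.results


def heading_wikitext(wikitext, data_parsoid):
    """Slice the heading's inner wikitext using the dsr property.

    dsr encodes startOffset, endOffset, startTagWidth, endTagWidth.
    """
    start, end, open_width, close_width = json.loads(data_parsoid)["dsr"]
    return wikitext[start + open_width:end - close_width]


# Hand-written sample mirroring the page from the question (hypothetical markup).
SAMPLE_HTML = """
<p>intro</p>
<h2>section1</h2>
<p>text <span typeof="mw:Extension/math" id="1">...</span> text
<span typeof="mw:Extension/math" id="2">...</span></p>
<h3>section 2</h3>
<p>text <span typeof="mw:Extension/math" id="3">...</span></p>
"""

if __name__ == "__main__":
    for math_id, heading in map_math_to_sections(SAMPLE_HTML):
        print(f"{math_id}->{heading}")
```

On the sample above this prints 1->section1, 2->section1 and 3->section 2. Math elements that appear before the first heading are reported with None as their section.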