war on Cite/{{cite}}


war on Cite/{{cite}}

Domas Mituzas
Hello,

I understand the need for Cite, that's why it is still there :) But...

- We format the Cite references list on only every 100th request to the backend,
yet it takes 8.15% of backend response time (thanks to the parser cache;
without it, Cite formatting would take 815% of cluster time, though
developers should understand I'm not exactly right with this hyperbole ;-) )

- When parsing articles like one of the most popular today,
[[en:Rod_Blagojevich_corruption_charges]], it takes 20s to produce the
page, and 17s of that is spent on the Cite block, mostly executing {{cite}}.
That makes every editor wait for ages to get a page displayed, and due to
the cache stampede after invalidation it causes considerable stress on the
site (look at the numbers mentioned above).

- This 8% is in real time, which includes waiting for search,
databases, and simply CPU contention, which we end up having today.
CPU-time-wise it is way higher, so it can actually have a 20% CPU time
impact on our application farm. That's at least $100k worth of hardware
(and rising), even if it is new/modern, just for citation formatting.
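
The arithmetic behind the first two bullets can be spelled out. This is only a sketch in Python using the figures quoted above; the variable names are made up for illustration, and it assumes the per-request formatting cost would stay the same if every request had to format the references list.

```python
# Figures quoted in the message above.
cite_request_fraction = 1 / 100  # references list is formatted on every 100th backend request
cite_time_share = 0.0815         # share of backend response time this takes today

# Assumption: without the parser cache every request pays the same formatting
# cost, so the share scales by the inverse of the request fraction.
uncached_share = cite_time_share / cite_request_fraction

# Per-page figures for the Rod Blagojevich article.
total_parse_seconds = 20.0
cite_parse_seconds = 17.0
cite_share_of_parse = cite_parse_seconds / total_parse_seconds

print(f"without parser cache: {uncached_share:.0%} of today's backend time")
print(f"Cite share of one page parse: {cite_share_of_parse:.0%}")
```

The first figure reproduces the 815% in the message, which is why it is flagged there as hyperbole: it is a scaling of today's numbers, not a measurement.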

So, a checklist of what can be done (simple to complex):

[  ] - Simplification of {{cite}}
[  ] - Separate cache for Cite, to avoid reparsing on minor edits
that don't involve citations. I have no idea how much this would win,
but there is a theoretical chance of stripping 1% or so. ;)
[  ] - Offload some templates like {{cite}} to actual PHP extensions
(can of worms, but, oh well, it can be a standardized process too)
[  ] - Implement a proper scripting engine like Lua for metatemplates
(http://pecl.php.net/package/lua - another can of worms, though yet again,
it can be managed via a trusted set of people, on the top 20 wikis or so).
[  ] - Frustrated operations guy adding something like ( return ""; )
in some random extension, and syncing the live hack. Obviously there
would be some "HAHA YOU THOUGHT I COULDN'T DO THIS" comments in there.

I for one can directly participate in at least two of these options. ;-)

Unfortunately, {{cite}} is the only template I can profile/account for
now; we don't have proper per-template profiling, but I wish to get
it some day. Then we'd have more "war on ..." topics ;-D

Generally, templates are a major part of our parsing, and that's over 50%
of our current cluster CPU load.
As we actually managed to hit 100% last week, something that hasn't
happened for a while, some work has to be done here.

Of course, new hardware will help for a while, but I for one get huge
personal satisfaction from saving donation money. ;-)

CHEERS!
--
Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]



_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: war on Cite/{{cite}}

Marco Schuster-2
On Sat, Jan 31, 2009 at 2:03 PM, Domas Mituzas wrote:
> Hello,
>
> I understand the need for cite, thats why it is still there :) But...
> (...)
What about converting these to ref tags?

> Unfortunately, {{cite}} is the only template I can profile/account for
> now, we don't have proper per-template profiling, but I wish to get
> one some day. Then we'd have more "war on ..." topics ;-D
Stub templates, for example :D

> Generally, templates are major part of our parsing, and thats over 50%
> of our current cluster CPU load.
Wow. Can you compare that load on the systems with the load caused by
solely using <ref> tags?

Marco

Re: war on Cite/{{cite}}

KALAN-4
In reply to this post by Domas Mituzas
Domas, have you performed any further analysis to figure out _how_ the
template can be optimized? Would, say, reducing its size help, or would the
complexity that causes outweigh the advantage?

— Kalan


Re: war on Cite/{{cite}}

Robert Rohde
In reply to this post by Domas Mituzas
A long while ago I remember looking at the parser and realizing that
the recursive template expansion and argument handling led the parser
to run all branches of #if and #switch statements before deciding
which one to include.

In other words, given {{#if: something | statements_A | statements_B
}}, the parser was fully expanding both statements_A and statements_B
before checking #if to decide which one to keep.  Obviously that is
inefficient and in the case of very complicated conditional templates
potentially very expensive.

The parser has changed so much since I last worked with it that I am
having difficulty figuring out if this is still true.  Hopefully,
someone already went through and improved the branch handling logic,
but if not, I would suggest that this would also be a good generalized
target for improving template operation.
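
To make the difference concrete, here is a small Python sketch (not MediaWiki's parser code; `make_expander`, `if_eager` and `if_lazy` are hypothetical names). `expand` stands in for an expensive recursive template expansion, and the condition is treated as true when it is non-empty after trimming.

```python
from typing import Callable, List


def make_expander(log: List[str]) -> Callable[[str], str]:
    """Return a stand-in for expensive expansion that records every call."""
    def expand(text: str) -> str:
        log.append(text)
        return text.strip()
    return expand


def if_eager(cond: str, then_text: str, else_text: str,
             expand: Callable[[str], str]) -> str:
    # Both branches are expanded before the condition is even looked at.
    then_out = expand(then_text)
    else_out = expand(else_text)
    return then_out if cond.strip() else else_out


def if_lazy(cond: str, then_text: str, else_text: str,
            expand: Callable[[str], str]) -> str:
    # The condition is checked first, so only the chosen branch is expanded.
    chosen = then_text if cond.strip() else else_text
    return expand(chosen)


eager_log: List[str] = []
lazy_log: List[str] = []
eager_result = if_eager("something", "statements_A", "statements_B",
                        make_expander(eager_log))
lazy_result = if_lazy("something", "statements_A", "statements_B",
                      make_expander(lazy_log))
print(eager_result, len(eager_log))  # same result, two expansions
print(lazy_result, len(lazy_log))    # same result, one expansion
```

Both versions return the same text; the eager one simply pays for a branch that is thrown away, and that cost grows with nesting depth.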

-Robert Rohde




Re: war on Cite/{{cite}}

Alex Zaddach
In reply to this post by Domas Mituzas
Domas Mituzas wrote:
>
> So, a checklist what can be done ( simple to complex )
>
> [  ] - Simplification of {{cite}}

Short of significant improvements to the parser or requiring people to
ask Domas before editing the template, I can

> [  ] - Separate cache for Cite, to avoid reparsing on minor edits,  
> that don't involve citations. I have no idea how much this would win,  
> but there is theoretical chance of stripping 1% or so. ;)
> [  ] - Offload some templates like {{cite}} to actual PHP extensions  
> (can of worms, but, oh well, can be standardized process too)

I've actually considered something like this in the past, basically
creating a Cite 2.0 extension, where all the main cite options would be
in the <ref> tags themselves with pre-defined "templates" written in PHP
for web citations, book citations, etc.; this would greatly reduce the
amount of  stuff that needs to be done using the Cite wiki-templates and
run through the parser.

You would have something like:

<ref author="Foo" title="Bar" type="book">Pages 1-10</ref>

Any parameters in the ref tag would be converted to HTML output using
the "book" template in the extension rather than a thousand parser
functions in some meta-template, and only the content of the tag (the
page numbers in this case) would have to be run through the parser, so
it would also be backwards-compatible with the current templates until
they can all be migrated.

The main downside to this is that it requires someone to file a Bugzilla
request every time a template needs changing.
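
A rough sketch of the idea, written in Python purely for illustration (the real extension would be PHP, and `format_book`, `FORMATTERS` and `render_ref` are hypothetical names, as is the exact output format):

```python
from html import escape
from typing import Callable, Dict


def format_book(attrs: Dict[str, str], content: str) -> str:
    """Hypothetical built-in "book" formatter: attributes become HTML directly."""
    parts = []
    if attrs.get("author"):
        parts.append(escape(attrs["author"]) + ".")
    if attrs.get("title"):
        parts.append("<i>" + escape(attrs["title"]) + "</i>.")
    if content:
        # In the proposal, only this tag content would go through the wiki parser.
        parts.append(content)
    return " ".join(parts)


FORMATTERS: Dict[str, Callable[[Dict[str, str], str], str]] = {
    "book": format_book,
}


def render_ref(attrs: Dict[str, str], content: str) -> str:
    formatter = FORMATTERS.get(attrs.get("type", ""))
    if formatter is None:
        # Unknown or missing type: behave like a plain <ref>, which keeps
        # existing template-based references working.
        return content
    return formatter(attrs, content)


example = render_ref({"author": "Foo", "title": "Bar", "type": "book"},
                     "Pages 1-10")
print(example)
```

The fallback branch is what would give the backwards compatibility mentioned above: a ref without a known type is passed through unchanged.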



--
Alex (wikipedia:en:User:Mr.Z-man)


Re: war on Cite/{{cite}}

Platonides
In reply to this post by Robert Rohde
Would storing an intermediate template improve things?
I mean, keep a template but where the inner templates are substed,
depending on the original parameters.



Robert Rohde wrote:

> A long while ago I remember looking at the parser and realizing that
> the recursive template expansion and argument handling led the parser
> to run all branches of #if and #switch statements before deciding
> which one to include.
>
> In other words, given {{#if: something | statements_A | statements_B
> }}, the parser was fully expanding both statements_A and statements_B
> before checking #if to decide which one to keep.  Obviously that is
> inefficient and in the case of very complicated conditional templates
> potentially very expensive.

The new preprocessor doesn't follow unused branches (or so we were told ;).

http://en.wikipedia.org/wiki/Template:Citation/core screams for having loops



Re: war on Cite/{{cite}}

Chad
In reply to this post by Alex Zaddach
On Sat, Jan 31, 2009 at 1:28 PM, Alex <[hidden email]> wrote:

> The main downside to this is that it requires someone to file a Bugzilla
> request every time a template needs changing.
>

What about throwing them in MediaWiki: space, similar to editnotices?
At least then they could be cached to hell and back in the message
cache.

-Chad

Re: war on Cite/{{cite}}

Alex Zaddach
Chad wrote:

> What about throwing them in MediaWiki: space, similar to editnotices?
> At least then they could be cached to hell and back in the message
> cache.
>
> -Chad

I considered that as well, but I'm not sure how much that will actually
help. Looking at
http://en.wikipedia.org/wiki/Joe%20the%20Plumber?action=purge&forceprofile=true

it took 21.796 seconds to load, most of which seems to be from
Parser::recursiveTagParse; about 90% of that is from
Cite::referencesFormat-parse. Even if the templates themselves are
heavily cached, it still has to run all the conditionals and formatting
through the parser. Heavy caching might help if there's lots of refs
with the same content on multiple pages, but I don't think that's very
common.

--
Alex (wikipedia:en:User:Mr.Z-man)


Re: war on Cite/{{cite}}

K. Peachey
In reply to this post by Marco Schuster-2
>> I understand the need for cite, thats why it is still there :) But...
>> (...)
> What about converting these to ref tags?
Unfortunately most of those are designed to format the refs to a
"proper" standard that we use (Harvard/MLA standard iirc) and are
designed to be easily updated when we change our standards (e.g. recently
the "pages" value changed in one of the cite templates and a bot went
through and fixed them all).


Re: war on Cite/{{cite}}

Chad
In reply to this post by Alex Zaddach
On Sat, Jan 31, 2009 at 5:37 PM, Alex <[hidden email]> wrote:

> I considered that as well, but I'm not sure how much that will actually
> help. Looking at
>
> http://en.wikipedia.org/wiki/Joe%20the%20Plumber?action=purge&forceprofile=true
>
> it took 21.796 seconds to load, most of which seems be from
> Parser::recursiveTagParse, about 90% of that that is from
> Cite::referencesFormat-parse. Even if the templates themselves are
> heavily cached, it still has to run all the conditionals and formatting
> through the parser. Heavy caching might help if there's lots of refs
> with the same content on multiple pages, but I don't think that's very
> common.

Throw a caching layer on top of it. Do a final expansion until final
substitution at the {{cite book}} etc level. Then you've got less to
recursively parse.

-Chad

Re: war on Cite/{{cite}}

Aryeh Gregor
In reply to this post by Domas Mituzas
On Sat, Jan 31, 2009 at 8:03 AM, Domas Mituzas <[hidden email]> wrote:
> [  ] - Implement proper scripting engine like Lua for metatemplates (http://pecl.php.net/package/lua
>  - another can of worms, though yet again, can be managed via trusted
> set of people, on top20 wikis or so).

This seems like it's the only solution from your list that would be
generally applicable to similar future scenarios.  I don't think the
users would have to be particularly trusted -- just make sure that the
runtime of the programs is limited, and that it's properly sandboxed
(is the Lua PECL extension sandboxed?).

Another thought that occurs to me is to cache the output of templates
as a function of their parameters and any appropriate variables they
use (like {{PAGENAME}} or {{CURRENTDAY}}).  Then a reparse of a
template-heavy page will generally only have to reparse templates if
the parameters to the template have changed.  This will save a lot on
Cite, infoboxes, etc.
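
As a rough sketch of what such a cache could look like (Python for illustration only; `TemplateCache` and `slow_render` are hypothetical, and invalidation when the template itself is edited is deliberately left out):

```python
from typing import Callable, Dict, Tuple

Render = Callable[[str, Dict[str, str], Dict[str, str]], str]


class TemplateCache:
    """Cache template output keyed on name, parameters and used variables."""

    def __init__(self, render: Render) -> None:
        self._render = render
        self._store: Dict[Tuple, str] = {}
        self.misses = 0

    def expand(self, name: str, params: Dict[str, str],
               variables: Dict[str, str]) -> str:
        # Sorting makes the key independent of argument order.
        key = (name,
               tuple(sorted(params.items())),
               tuple(sorted(variables.items())))
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._render(name, params, variables)
        return self._store[key]


def slow_render(name: str, params: Dict[str, str],
                variables: Dict[str, str]) -> str:
    # Stand-in for the real, expensive template expansion.
    args = ", ".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"[{variables.get('PAGENAME', '')}] {name}({args})"


cache = TemplateCache(slow_render)
first = cache.expand("cite book", {"title": "Bar", "author": "Foo"},
                     {"PAGENAME": "Joe the Plumber"})
again = cache.expand("cite book", {"author": "Foo", "title": "Bar"},
                     {"PAGENAME": "Joe the Plumber"})
other = cache.expand("cite book", {"author": "Foo", "title": "Bar"},
                     {"PAGENAME": "Another page"})
print(cache.misses)
```

Only variables the template actually uses should go into the key; including something like {{CURRENTDAY}} for every template would needlessly expire the whole cache daily.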


Re: war on Cite/{{cite}}

Platonides
Aryeh Gregor wrote:

> On Sat, Jan 31, 2009 at 8:03 AM, Domas Mituzas <[hidden email]> wrote:
>> [  ] - Implement proper scripting engine like Lua for metatemplates (http://pecl.php.net/package/lua
>>  - another can of worms, though yet again, can be managed via trusted
>> set of people, on top20 wikis or so).
>
> This seems like it's the only solution from your list that would be
> generally applicable to similar future scenarios.  I don't think the
> users would have to be particularly trusted -- just make sure that the
> runtime of the programs is limited, and that it's properly sandboxed
> (is the Lua PECL extension sandboxed?).

That would be like adding a dependency on the Lua extension for reusers, as
the core templates would be implemented in Lua.
And I don't think it's worth reimplementing a Lua interpreter in PHP...



Re: war on Cite/{{cite}}

Aryeh Gregor
On Sat, Jan 31, 2009 at 8:19 PM, Platonides <[hidden email]> wrote:
> That would be like adding a dependancy on Lua extension for reusers, as
> the core templates will be implemented in Lua.

Yes, that would be the major disadvantage I can see.  In practice,
nobody can reuse large chunks of Wikipedia content on shared hosting
anyway, since it's way too big, but it would be a serious obstacle for
people who want to reuse only parts of Wikipedia.


Re: war on Cite/{{cite}}

Daniel Friesen
^_^ Wikipedia is already a horrible place to copy templates from. Unlike
Wikipedia, most other MW installations don't bother turning on Tidy, and
Wikipedia abuses that /feature/ way too much.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://nadir-seen-fire.com]
-Nadir-Point (http://nadir-point.com)
-Wiki-Tools (http://wiki-tools.com)
-MonkeyScript (http://monkeyscript.nadir-point.com)
-Animepedia (http://anime.wikia.com)
-Narutopedia (http://naruto.wikia.com)
-Soul Eater Wiki (http://souleater.wikia.com)






Re: war on Cite/{{cite}}

Gerard Meijssen-3
Hoi,
Let us please appreciate what is being said here: "Wikipedia is a horrible
place to copy templates from". We pride ourselves on being open source, yet
the current templates make us as bad as the worst proprietary vendor. We
have what is effectively an API and it is not documented at all.
Thanks,
      GerardM


Re: war on Cite/{{cite}}

Chad
How is the API not documented? Between the docs on
MediaWiki.org and the fact that every parameter is
documented (with examples), I'd say it's highly
documented.

-Chad


Re: war on Cite/{{cite}}

Tim Starling-2
In reply to this post by Robert Rohde
Robert Rohde wrote:

> A long while ago I remember looking at the parser and realizing that
> the recursive template expansion and argument handling led the parser
> to run all branches of #if and #switch statements before deciding
> which one to include.
>
> In other words, given {{#if: something | statements_A | statements_B
> }}, the parser was fully expanding both statements_A and statements_B
> before checking #if to decide which one to keep.  Obviously that is
> inefficient and in the case of very complicated conditional templates
> potentially very expensive.
>
> The parser has changed so much since I last worked with it that I am
> having difficulty figuring out if this is still true.  Hopefully,
> someone already went through and improved the branch handling logic,
> but if not, I would suggest that this would also be a good generalized
> target for improving template operation.

No it's not still true, yes dead branches are now eliminated. This was
done at a significant cost to code complexity and there was quite a lot of
overhead. The elimination of dead branches is the only reason the new
parser has comparable performance to the old parser, otherwise it would
have been slower.

-- Tim Starling



Re: war on Cite/{{cite}}

Robert Rohde
In reply to this post by Gerard Meijssen-3
On Sat, Jan 31, 2009 at 9:16 PM, Gerard Meijssen
<[hidden email]> wrote:
> Hoi,
> Let us please appreciate what is being said here: "Wikipedia is a horrible
> place to copy templates from". We pride ourselves of being open source and
> the current templates make us as bad as the worst proprietary vendor. We
> have what is effectively an API and it is not documented at all.
> Thanks,
>      GerardM

Actually, I think Daniel had a somewhat different point.

Wikimedia uses Tidy which does a good job at closing dangling format
tags.  A very substantial fraction of our templates actually have
dangling divs, and tables, and other bad syntax that Tidy is covering
up for us.  Anyone who has ever tried to copy Wikimedia templates into
a wiki with Tidy turned off (the default setting) knows that many of
our templates will actually return a lot of junk.

Strictly speaking it should be the editors' job to properly close
tables and divs, etc., but because Tidy is so good at it they don't
have to, which makes our wikicode less portable.
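
For anyone wanting to check template output before copying it elsewhere, a minimal sketch along these lines could flag dangling tags. It is Python with the standard `html.parser` module, not anything Tidy or MediaWiki provides, and the set of tracked tags is an arbitrary assumption.

```python
from html.parser import HTMLParser
from typing import List


class UnclosedTagFinder(HTMLParser):
    """Track a few block-level tags and remember which are never closed."""

    TRACKED = {"div", "table", "span"}  # assumption: extend as needed

    def __init__(self) -> None:
        super().__init__()
        self.open_tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the most recently opened tag with this name.
            index = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[index]


def unclosed_tags(fragment: str) -> List[str]:
    finder = UnclosedTagFinder()
    finder.feed(fragment)
    finder.close()
    return finder.open_tags


dangling = unclosed_tags('<div class="box"><table><tr><td>x</td></tr>')
clean = unclosed_tags("<div><span>ok</span></div>")
print(dangling)
print(clean)
```

Note that templates are often written as deliberate open/close pairs (one template opens a table, another closes it), so a check like this only makes sense on the combined output, not on each template alone.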

-Robert Rohde


Re: war on Cite/{{cite}}

K. Peachey
In reply to this post by Chad
> How is the api not documented? Between the docs on
> Mediawiki.org and the fact that every parameter is
> documented (with examples), I'd say its highly
> documented.
I think he means on wiki, most people probably won't know to look for
information on how to use it at the main/official mediawiki wiki and
just go by the scraps they can find on whatever local wiki they are on
(in this case en.wiki).


Re: war on Cite/{{cite}}

Tim Starling-2
In reply to this post by Domas Mituzas
Domas Mituzas wrote:
> - When parsing articles like one of most popular today,  
> [[en:Rod_Blagojevich_corruption_charges]], it takes 20s to produce the  
> page, 17s is spent on Cite block, executing {{cite}} mostly. That  
> makes every editor wait for ages to get a page displayed, and due to  
> cache stampede after invalidation it causes considerable stress on  
> site (look at numbers mentioned above).

Can you say how you measured this? What function you patched, what the
code was, etc.?

-- Tim Starling

