Expensive parser function count


Expensive parser function count

Alex Brollo
Browsing the HTML code of source pages, I found this statement in an HTML
comment:

*Expensive parser function count: 0/500*

I'd like to use this value to evaluate the "lightness" of a page, mainly to
test how expensive its templates are. In your opinion, given that the best
value would be 0/500, what are reasonable limits for a simple, a moderately
complex, and a complex page, just as a starting point? What is a really
alarming value that needs fixing fast?

And wouldn't it be a good idea to display this figure on the page itself,
as a very small mark or string in a corner, to allow fast feedback?

Alex
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Expensive parser function count

Alex Zaddach
On 1/5/2011 8:07 PM, Alex Brollo wrote:

> Browsing the HTML code of source pages, I found this statement in an HTML
> comment:
>
> *Expensive parser function count: 0/500*
>
> I'd like to use this value to evaluate the "lightness" of a page, mainly to
> test how expensive its templates are. In your opinion, given that the best
> value would be 0/500, what are reasonable limits for a simple, a moderately
> complex, and a complex page, just as a starting point? What is a really
> alarming value that needs fixing fast?
>
> And wouldn't it be a good idea to display this figure on the page itself,
> as a very small mark or string in a corner, to allow fast feedback?
>

The expensive parser function count only counts the use of a few
functions that perform a DB query: PAGESINCATEGORY, PAGESIZE, and
#ifexist are the only ones I know of. While a page that uses a lot of
these would likely be slow, these aren't heavily used functions, and a
page might be slow even if it uses zero.

The other three limits (preprocessor node count, post-expand include size,
and template argument size) are probably better measures of complexity,
though I don't know what a "typical" value for these might be.
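These figures are machine-readable: they all appear in one HTML comment (the "NewPP limit report") in the rendered page. A minimal Python sketch for pulling them out, assuming the "name: used/limit" line format quoted in this thread (the comment's exact wording can vary between MediaWiki versions):

```python
import re

# Sample of the limit-report comment MediaWiki embeds in page HTML.
# The field names follow this thread; treat the exact format as an
# assumption to verify against your wiki's output.
html = """<!--
NewPP limit report
Preprocessor node count: 1234/1000000
Post-expand include size: 56789/2097152 bytes
Template argument size: 910/2097152 bytes
Expensive parser function count: 2/500
-->"""

def parse_limit_report(page_html):
    """Return {metric name: (used, limit)} for each 'name: used/limit' line."""
    pattern = re.compile(r"^([A-Za-z -]+): (\d+)/(\d+)", re.MULTILINE)
    return {name.strip(): (int(used), int(limit))
            for name, used, limit in pattern.findall(page_html)}

report = parse_limit_report(html)
print(report["Expensive parser function count"])  # → (2, 500)
```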

--
Alex (wikipedia:en:User:Mr.Z-man)


Re: Expensive parser function count

Alex Brollo
2011/1/6 Alex <[hidden email]>

> On 1/5/2011 8:07 PM, Alex Brollo wrote:
>
> The expensive parser function count only counts the use of a few
> functions that perform a DB query: PAGESINCATEGORY, PAGESIZE, and
> #ifexist are the only ones I know of. While a page that uses a lot of
> these would likely be slow, these aren't heavily used functions, and a
> page might be slow even if it uses zero.
>
> The other three limits (preprocessor node count, post-expand include size,
> and template argument size) are probably better measures of complexity,
> though I don't know what a "typical" value for these might be.
>

Thanks. I would appreciate an algorithm that evaluates those parameters
together, weighting them to produce a single, meaningful "heaviness index"
for the page. Then, adding or removing code would make it easy to identify
the critical parts.
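One naive way to sketch such an index: normalise each metric by its limit and take a weighted sum. The limit values and the equal weights below are illustrative assumptions, not anything established in this thread; choosing good weights is exactly the open question:

```python
# Hypothetical "heaviness index": weighted sum of metric/limit ratios.
# Limits and weights here are illustrative assumptions, not from the thread.
LIMITS = {
    "expensive_parser_functions": 500,
    "preprocessor_nodes": 1_000_000,
    "post_expand_include_size": 2_097_152,  # bytes
    "template_argument_size": 2_097_152,    # bytes
}
WEIGHTS = {name: 0.25 for name in LIMITS}   # equal weights, pure guesswork

def heaviness_index(metrics):
    """0.0 means trivial; 1.0 means every metric sits at its limit."""
    return sum(WEIGHTS[k] * metrics[k] / LIMITS[k] for k in LIMITS)

page = {
    "expensive_parser_functions": 5,
    "preprocessor_nodes": 200_000,
    "post_expand_include_size": 400_000,
    "template_argument_size": 100_000,
}
print(round(heaviness_index(page), 3))  # → 0.112
```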

Alex ("the other one") :-)

Re: Expensive parser function count

Aryeh Gregor
In reply to this post by Alex Brollo
On Wed, Jan 5, 2011 at 8:07 PM, Alex Brollo <[hidden email]> wrote:

> Browsing the HTML code of source pages, I found this statement in an HTML
> comment:
>
> *Expensive parser function count: 0/500*
>
> I'd like to use this value to evaluate the "lightness" of a page, mainly to
> test how expensive its templates are. In your opinion, given that the best
> value would be 0/500, what are reasonable limits for a simple, a moderately
> complex, and a complex page, just as a starting point? What is a really
> alarming value that needs fixing fast?

A really alarming value that needs fast fixing would be, approximately
speaking, 501 or higher.  That's why the maximum is there.  We don't
leave fixing this kind of thing to users.

> And wouldn't it be a good idea to display this figure on the page itself,
> as a very small mark or string in a corner, to allow fast feedback?

No.  It's only meant for debugging when you run over the limit and the
page stops working.  It can help you track down why the page isn't
working, and isolate the templates that are causing the problem.  The
same goes for the other limits.

If you want to detect whether a page is rendering too slowly, just try
action=purge and see how long it takes.  If it takes more than a few
seconds, you probably want to improve it, because that's how long it
will take to render for a lot of logged-in users (the parser cache hides
this if you have default preferences, which includes all anons).  We're
forced to use artificial metrics when imposing automatic limits on
page rendering only because the time it takes to parse a page isn't
reliable, and using it as an automatic limit would make parsing
non-deterministic.  For manual inspection, you should just use time to
parse, not any artificial metrics.
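A quick sketch of that measurement; the wiki base URL and page title are placeholders, and note that later MediaWiki versions answer a GET action=purge with a confirmation form rather than purging outright, so treat the timing as indicative only:

```python
import time
import urllib.parse
import urllib.request

def purge_url(base_url, title):
    """Build the action=purge URL for a page (classic index.php URL
    layout; adjust for your wiki's configuration)."""
    query = urllib.parse.urlencode({"title": title, "action": "purge"})
    return f"{base_url}/index.php?{query}"

def time_purge(base_url, title):
    """Seconds taken for one purge-and-render round trip."""
    start = time.monotonic()
    urllib.request.urlopen(purge_url(base_url, title)).read()
    return time.monotonic() - start

print(purge_url("https://en.wikipedia.org/w", "Main Page"))
# → https://en.wikipedia.org/w/index.php?title=Main+Page&action=purge
```

If the measured time is much over a few seconds, the page is a candidate for optimisation.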


Re: Expensive parser function count

Tim Starling-2
On 07/01/11 07:50, Aryeh Gregor wrote:

> On Wed, Jan 5, 2011 at 8:07 PM, Alex Brollo <[hidden email]> wrote:
>> Browsing the HTML code of source pages, I found this statement in an HTML
>> comment:
>>
>> *Expensive parser function count: 0/500*
>>
>> I'd like to use this value to evaluate the "lightness" of a page, mainly to
>> test how expensive its templates are. In your opinion, given that the best
>> value would be 0/500, what are reasonable limits for a simple, a moderately
>> complex, and a complex page, just as a starting point? What is a really
>> alarming value that needs fixing fast?
>
> A really alarming value that needs fast fixing would be, approximately
> speaking, 501 or higher.  That's why the maximum is there.  We don't
> leave fixing this kind of thing to users.

I think the maximum was set to 100 initially, and raised to 500 due to
user complaints. I'd be completely happy if users fixed all the
templates that caused pages to use more than 100, then we could put
the limit back down.

-- Tim



Re: Expensive parser function count

MZMcBride-2
Tim Starling wrote:

> On 07/01/11 07:50, Aryeh Gregor wrote:
>> On Wed, Jan 5, 2011 at 8:07 PM, Alex Brollo <[hidden email]> wrote:
>>> Browsing the HTML code of source pages, I found this statement in an HTML
>>> comment:
>>>
>>> *Expensive parser function count: 0/500*
>>>
>>> I'd like to use this value to evaluate the "lightness" of a page, mainly to
>>> test how expensive its templates are. In your opinion, given that the best
>>> value would be 0/500, what are reasonable limits for a simple, a moderately
>>> complex, and a complex page, just as a starting point? What is a really
>>> alarming value that needs fixing fast?
>>
>> A really alarming value that needs fast fixing would be, approximately
>> speaking, 501 or higher.  That's why the maximum is there.  We don't
>> leave fixing this kind of thing to users.
>
> I think the maximum was set to 100 initially, and raised to 500 due to
> user complaints. I'd be completely happy if users fixed all the
> templates that caused pages to use more than 100, then we could put
> the limit back down.

Doesn't it make much more sense to fix the underlying problem instead? Users
shouldn't have to be concerned with the number of #ifexists on a page.

MZMcBride




Re: Expensive parser function count

Alex Brollo
In reply to this post by Tim Starling-2
2011/1/11 Tim Starling <[hidden email]>

> On 07/01/11 07:50, Aryeh Gregor wrote:
> > On Wed, Jan 5, 2011 at 8:07 PM, Alex Brollo <[hidden email]> wrote:
> >> Browsing the HTML code of source pages, I found this statement in an
> >> HTML comment:
> >>
> >> *Expensive parser function count: 0/500*
>
> I think the maximum was set to 100 initially, and raised to 500 due to
> user complaints. I'd be completely happy if users fixed all the
> templates that caused pages to use more than 100, then we could put
> the limit back down.
>

Thanks Tim. So, implementing a simple JS script to show that value (and the
other three figures too) in small characters at the edge of the page
display is not a completely fuzzy idea. As I said, I hate to waste
resources of any kind. It's a pity that those data are not saved in the XML
dump, but I don't want to overload the servers just to get data about
server overloading. :-)

Just another question about resources. I can get the same result with an
AJAX call or with a #lst (labeled section transclusion) call. Which one is
lighter on the servers, in your opinion? Or are they more or less similar?

Alex

Re: Expensive parser function count

Aryeh Gregor
On Tue, Jan 11, 2011 at 3:04 AM, Alex Brollo <[hidden email]> wrote:
> Just another question about resources. I can get the same result with an
> AJAX call or with a #lst (labeled section transclusion) call. Which one is
> lighter on the servers, in your opinion? Or are they more or less similar?

Fewer HTTP requests is better, all else being equal.  I don't know how
LST works, but I imagine it's more efficient than doing a whole API
call.  (Although maybe not, for instance if the API is caching things
and LST isn't.)

Overall, I'd advise you to do whatever minimizes user-visible latency.
That directly improves things for your users, and is a decent proxy
for server resource use.  So use whichever method takes less time to
fully render.  This is rather more practical than trying to consult
MediaWiki developers about every detail of your program's
implementation, which is unlikely to be used widely enough to greatly
affect server load anyway, and even if it were we couldn't necessarily
give intelligent answers without knowing exactly what the program is
doing and why.
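In that spirit, the comparison itself needs nothing MediaWiki-specific: time each candidate end to end and keep the faster one. `fetch` below stands for any zero-argument callable (an API round trip, a rendered-page download, and so on); the stand-in workloads are only there to make the sketch runnable:

```python
import time

def median_latency(fetch, runs=5):
    """Median wall-clock time of `fetch()` over several runs; the median
    damps one-off network or cache outliers better than the mean."""
    samples = []
    for _ in range(runs):
        start = time.monotonic()
        fetch()
        samples.append(time.monotonic() - start)
    return sorted(samples)[len(samples) // 2]

# Stand-in workloads in place of the real AJAX vs. #lst fetches:
light = lambda: sum(range(10_000))
heavy = lambda: sum(range(1_000_000))
print(median_latency(light) < median_latency(heavy))  # → True
```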


Re: Expensive parser function count

Alex Brollo
2011/1/11 Aryeh Gregor <[hidden email]>

>
> Overall, I'd advise you to do whatever minimizes user-visible latency.
>  That directly improves things for your users, and is a decent proxy
> for server resource use.  So use whichever method takes less time to
> fully render.  This is rather more practical than trying to consult
> MediaWiki developers about every detail of your program's
> implementation, which is unlikely to be used widely enough to greatly
> affect server load anyway, and even if it were we couldn't necessarily
> give intelligent answers without knowing exactly what the program is
> doing and why.
>

I'm already following your suggestion: today I removed a complex test
template from our village pump (replacing it with a link to a subpage,
visited by interested users only) and, really, the difference in rendering
the village pump page was obvious.

Probably the best approach is to use all these tricks together, paying much
more attention to widely used templates and frequently parsed pages than to
exotic, rarely used ones. Unluckily for the servers, the heavier pages are
often the ones parsed most frequently, and re-parsed again and again, like
the "village pump" pages; nevertheless they are the most useful to the
community, so they deserve some server load.

Nevertheless... sometimes people tell me "don't use this hack, it overloads
the servers"... sometimes it doesn't, or it's simply an undocumented,
unproven personal opinion.
Alex

Re: Expensive parser function count

Casey Brown-5
On Tue, Jan 11, 2011 at 10:55 AM, Alex Brollo <[hidden email]> wrote:
> I'm already following your suggestion: today I removed a complex test
> template from our village pump (replacing it with a link to a subpage,
> visited by interested users only) and, really, the difference in rendering
> the village pump page was obvious.

That's good, but also keep in mind that, generally, you shouldn't
worry too much about performance:
<http://en.wikipedia.org/wiki/WP:PERF>.  (Had to throw in the little
disclaimer here. ;-))

--
Casey Brown
Cbrown1023


Re: Expensive parser function count

Alex Brollo
2011/1/11 Casey Brown <[hidden email]>

> That's good, but also keep in mind that, generally, you shouldn't
> worry too much about performance:
> <http://en.wikipedia.org/wiki/WP:PERF>.  (Had to throw in the little
> disclaimer here. ;-))
>

Yes, I got that suggestion... but when I try new tricks and new ideas,
someone often tells me "Please stop! This overloads the servers! It's
terribly heavy!"; and when I try to go deeper into server-load details,
others tell me "Don't worry so much about performance".

This is a little confusing for a poor DIY contributor ^__^

Nevertheless, some of my ideas will spread over the *whole* it.source
project by imitation and by bot activity (so that any mistake of mine could
really have a significant effect; a small one... it.source is nothing
compared with the whole set of wiki projects...), and there's the risk that
a few of my ideas could spread to other projects too, so I try to be
careful.

Alex

Re: Expensive parser function count

Aryeh Gregor
In reply to this post by Alex Brollo
On Tue, Jan 11, 2011 at 10:55 AM, Alex Brollo <[hidden email]> wrote:
> Nevertheless... sometimes people tell me "don't use this hack, it
> overloads the servers"... sometimes it doesn't, or it's simply an
> undocumented, unproven personal opinion.

Ignore them.  Server overload is not a problem that users are in a
position to evaluate, and a lot of users get completely insane ideas
about performance.  There have been cases of wikis making up entire
policies that were completely groundless.  The performance issues you
should be paying attention to are the ones that are visible to the
front-end, i.e., ones that produce slowness or error messages that you
can personally see.  If anyone tries to tell you that you should or
should not do something because of server load, point them to
<http://en.wikipedia.org/wiki/WP:PERF> and ignore them.

(Except if they're a sysadmin.  But if a performance issue becomes
important enough that a sysadmin intervenes, they're not going to give
you the option of ignoring them.)


Re: Expensive parser function count

Tim Starling-2
In reply to this post by Casey Brown-5
On 12/01/11 05:34, Casey Brown wrote:

> On Tue, Jan 11, 2011 at 10:55 AM, Alex Brollo <[hidden email]> wrote:
>> I'm already following your suggestion: today I removed a complex test
>> template from our village pump (replacing it with a link to a subpage,
>> visited by interested users only) and, really, the difference in rendering
>> the village pump page was obvious.
>
> That's good, but also keep in mind that, generally, you shouldn't
> worry too much about performance:
> <http://en.wikipedia.org/wiki/WP:PERF>.  (Had to throw in the little
> disclaimer here. ;-))

I've always been opposed to that policy.

-- Tim Starling



Re: Expensive parser function count

Aryeh Gregor
On Tue, Jan 11, 2011 at 5:21 PM, Tim Starling <[hidden email]> wrote:
> I've always been opposed to that policy.

Are you aware of the completely insane things users have sometimes
established as conventions or even policies based on nonsensical
server-load grounds?  Like on the Dutch Wikipedia, apparently new
users were routinely being told to make as few edits as possible so
that the servers wouldn't run out of disk space:

http://en.wikipedia.org/wiki/Wikipedia_talk:Don%27t_worry_about_performance#Too_many_edits

An English Wikipedia user tried to argue for a couple of years that
Wikipedia was becoming slow because of too many links between pages,
and that something terrible would happen if templates didn't have
fewer links in them (fortunately no one listened to him that I know
of):

http://en.wikipedia.org/wiki/Wikipedia_talk:Don%27t_worry_about_performance#Reality_update_June.2F2009

There are probably even stupider things that I don't know about.
Ideally users would understand the issues involved and make
intelligent decisions about server load, but in practice the policy
seems to prevent a lot more harm than it causes.  Users are just not
going to be able to figure out what causes server load without
specific instruction by sysadmins.


Re: Expensive parser function count

OQ
On Tue, Jan 11, 2011 at 4:31 PM, Aryeh Gregor
<[hidden email]> wrote:
> On Tue, Jan 11, 2011 at 5:21 PM, Tim Starling <[hidden email]> wrote:
>> I've always been opposed to that policy.
>
> Are you aware of the completely insane things users have sometimes
> established as conventions or even policies based on nonsensical
> server-load grounds?

*cough*

http://en.wikipedia.org/wiki/Template:Toolserver

24k revisions of pretty useless historical information.

http://en.wikipedia.org/wiki/Wikipedia:Open_proxy_detection
80k revisions

and so forth and so on.


Re: Expensive parser function count

MZMcBride-2
In reply to this post by Tim Starling-2
Tim Starling wrote:

> On 12/01/11 05:34, Casey Brown wrote:
>> On Tue, Jan 11, 2011 at 10:55 AM, Alex Brollo <[hidden email]> wrote:
>>> I'm already following your suggestion: today I removed a complex test
>>> template from our village pump (replacing it with a link to a subpage,
>>> visited by interested users only) and, really, the difference in rendering
>>> the village pump page was obvious.
>>
>> That's good, but also keep in mind that, generally, you shouldn't
>> worry too much about performance:
>> <http://en.wikipedia.org/wiki/WP:PERF>.  (Had to throw in the little
>> disclaimer here. ;-))
>
> I've always been opposed to that policy.

As the person who implemented the expensive parser function count, I don't
imagine anyone on this list finds your opposition surprising. I do find the
view that users ought to be concerned about accidentally using too many
{{#ifexist:}}s or {{PAGESIZE:}}s on a page (for example) to be a horrible
approach to user experience, though.

MZMcBride




Re: Expensive parser function count

Platonides
In reply to this post by MZMcBride-2
MZMcBride wrote:
> Doesn't it make much more sense to fix the underlying problem instead? Users
> shouldn't have to be concerned with the number of #ifexists on a page.
>
> MZMcBride

Well, if someone wants to change #ifexist, they should change the parser
(braceSubstitution) so that the checks can be done in parallel. So that if
you have, for instance:

{{#ifexist: File:Flag of {{{1}}}.svg|<td>[[File:Flag of {{{1}}}.svg]]</td>}}
{{#ifexist: File:Shield of {{{1}}}.svg|<td>[[File:Shield of {{{1}}}.svg]]</td>}}

they are performed in parallel, using one LinkBatch, instead of as two
separate queries. Nested #ifexist calls and other cases would still need to
be checked separately, but it would substantially reduce the "#ifexist
load". I think most of them are even at the same "child level".
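The gain from such batching can be illustrated outside the parser. In this sketch a plain Python set stands in for the page table (with hypothetical titles); the point is simply that one batched lookup answers all the titles that N separate #ifexist checks would each query on their own:

```python
# A set standing in for the wiki's page table (hypothetical titles).
EXISTING_PAGES = {"File:Flag of Italy.svg", "File:Flag of France.svg"}

def exists_one_by_one(titles):
    """Naive approach: one 'query' (set probe) per title, N round trips."""
    return {t: t in EXISTING_PAGES for t in titles}

def exists_batched(titles):
    """LinkBatch-style approach: one 'query' answering every title at once."""
    found = EXISTING_PAGES.intersection(titles)
    return {t: t in found for t in titles}

titles = ["File:Flag of Italy.svg", "File:Shield of Italy.svg"]
assert exists_one_by_one(titles) == exists_batched(titles)
print(exists_batched(titles))
# → {'File:Flag of Italy.svg': True, 'File:Shield of Italy.svg': False}
```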




Re: Expensive parser function count

Alex Brollo
2011/1/12 Platonides <[hidden email]>

> MZMcBride wrote:
> > Doesn't it make much more sense to fix the underlying problem instead?
> > Users shouldn't have to be concerned with the number of #ifexists on a page.
> >
> > MZMcBride
Ok, now I feel much more comfortable. These are my conclusions:

# I can feel free to test anything, however exotic.
# I will pay attention to HTML rendering time when trying something exotic.
# In the remote case that I really build something server-expensive, and
such an exotic thing "infects" the wiki projects widely (a very remote
case!), some sysadmin will see the bad results of a bad idea and:
## will fix the parser code, if the idea is good but the software handles
it inefficiently;
## will kill the idea, if it is server-expensive and simply useless or
wrong.

Alex

Re: Expensive parser function count

Tim Starling-2
In reply to this post by Aryeh Gregor
On 12/01/11 09:31, Aryeh Gregor wrote:
> On Tue, Jan 11, 2011 at 5:21 PM, Tim Starling <[hidden email]> wrote:
>> I've always been opposed to that policy.
>
> Are you aware of the completely insane things users have sometimes
> established as conventions or even policies based on nonsensical
> server-load grounds?

Yes. I know that the main reason for the existence of the "don't worry
about performance" page is to make such policy debates easier by
elevating elitism to the status of a pseudo-policy. It means that
sysadmins don't have to explain anything, they just have to say "what
I say goes, see [[WP:PERF]]."

My issue with it is that it tends to discourage smart, capable users
who are interested in improving server performance. Particularly in
the area of template design, optimising server performance is
important, and it's frequently done by users with a great amount of
impact. It's not very hard. I've done it myself from time to time, but
it's best done by people with a knowledge of the templates in question
and the articles they serve.

Taking a few simple measures, like reducing the number of arguments in
loop-style templates down to the minimum necessary, can have a huge
impact on the parse time of very popular pages. I've given general
tips in the past.

> Users are just not
> going to be able to figure out what causes server load without
> specific instruction by sysadmins.

I think this is an exaggeration.

When I optimise the parse time of particular pages, I don't even use
my sysadmin access. The best way to do it is to download the page with
all its templates using Special:Export, and then to load it into a
local wiki. Parsing large pages is typically CPU-dominated, so you can
get a very good approximation without simulating the whole network.
Once the page is in your local wiki, you can use whatever profiling
tools you like: the MW profiler with extra sections, xdebug, gprof,
etc. And you can modify the test cases very easily.
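The download step of that workflow can be scripted. A sketch building the Special:Export URL, assuming the standard `pages`, `templates`, and `curonly` form parameters (worth checking against your wiki's export form before relying on them):

```python
import urllib.parse

def export_url(base_url, title):
    """URL that exports a page together with the templates it transcludes,
    ready for Special:Import into a local test wiki."""
    query = urllib.parse.urlencode({
        "pages": title,
        "templates": "1",  # include transcluded templates
        "curonly": "1",    # current revision only
    })
    return f"{base_url}/index.php?title=Special:Export&{query}"

print(export_url("https://en.wikipedia.org/w", "Rome"))
```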

-- Tim Starling



Re: Expensive parser function count

Aryeh Gregor
On Wed, Jan 12, 2011 at 6:51 PM, Tim Starling <[hidden email]> wrote:
> I think this is an exaggeration.
>
> When I optimise the parse time of particular pages, I don't even use
> my sysadmin access. The best way to do it is to download the page with
> all its templates using Special:Export, and then to load it into a
> local wiki.

But how do you determine which templates are causing server load
problems?  If we could expose enough profiling info to users that they
could figure out what's causing load so that they know their
optimization work is having an effect, I'd be all for encouraging them
to optimize.  The problem is that left to their own devices, people
who have no idea what they're talking about make up nonsensical server
load problems, and there's no way for even fairly technical users to
figure out that these people indeed have no idea what they're talking
about.  If we can expose clear metrics to users, like amount of CPU
time used per template, then encouraging them to optimize those
specific metrics is certainly a good idea.

> Parsing large pages is typically CPU-dominated, so you can
> get a very good approximation without simulating the whole network.

Templates that use a lot of CPU time cause user-visible latency in
addition to server load, and WP:PERF already says it's okay to try
optimizing clear user-visible problems (although it could be less
equivocal about it).
