Template Dump Problems on enwiki-20071018/More Debug Info

Template Dump Problems on enwiki-20071018/More Debug Info

jmerkey-3


The debug logs report the timeout occurs in strtr() throughout the code.
This last dump is very poor.  I will be dumping it out of the database and
reverting to the 20070908 dump until a working dump can be provided.


error log:

[Fri Oct 26 02:10:14 2007] [error] [client 24.10.175.9] File does not exist: /wikidump/en/images/thumb/e/e8/Banksia_man.png/180px-Banksia_man.png, referer: http://www.wikigadugi.org/wiki/Banksia
[Fri Oct 26 03:55:20 2007] [error] [client 66.112.55.174] PHP Fatal error:  Maximum execution time of 30 seconds exceeded in /wikidump/en/includes/Parser.php on line 2649
[Fri Oct 26 04:18:42 2007] [error] [client 88.112.108.126] PHP Fatal error:  Maximum execution time of 30 seconds exceeded in /wikidump/en/includes/StringUtils.php on line 275, referer: http://mail.google.com/mail/?ui=1&view=page&name=gp&ver=sh3fib53pgpk
[Fri Oct 26 05:21:46 2007] [error] [client 24.10.175.9] PHP Fatal error:  Maximum execution time of 30 seconds exceeded in /wikidump/en/includes/StringUtils.php on line 294
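
For anyone who wants to see where the timeouts cluster, here is a minimal sketch in Python that tallies the "Maximum execution time" entries by file and line. It assumes each log entry sits on a single line, as Apache writes it (the wrapping in mail is an artifact), and the names TIMEOUT_RE and tally_timeouts are made up for this illustration.

```python
import re
from collections import Counter

# Matches PHP timeout entries in the error-log format quoted above.
# Assumption: one log entry per line, not wrapped by a mail client.
TIMEOUT_RE = re.compile(
    r"PHP Fatal error:\s+Maximum execution time of (\d+) seconds exceeded "
    r"in (\S+) on line (\d+)"
)


def tally_timeouts(log_text):
    """Count timeout errors per (file, line) location in the log text."""
    counts = Counter()
    for entry in log_text.splitlines():
        match = TIMEOUT_RE.search(entry)
        if match:
            _limit, path, line_no = match.groups()
            counts[(path, int(line_no))] += 1
    return counts
```

Calling tally_timeouts() on the contents of the error log and printing counts.most_common() lists the locations hit most often. Entries that are not timeouts, such as the "File does not exist" line, are skipped.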


Jeff




_______________________________________________
Wikitech-l mailing list
[hidden email]
http://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: Template Dump Problems on enwiki-20071018/More Debug Info

jmerkey-3


Increasing the timeout in PHP to 120 seconds and setting the memory usage
upper limit to 100MB for PHP gets around the problem.

I think needing 100MB of memory to run nested templates and chewed up
stack space with PHP is somewhat excessive.  I will use the 20071018 dumps
with these fixes, but look forward to a new enwiki dump with these issues
at least understood and correctable.

Jeff.




/etc/php.ini


Setting your php.ini file to these limits will allow the 20071018 dumps to
load and render properly.

;;;;;;;;;;;;;;;;;;;
; Resource Limits ;
;;;;;;;;;;;;;;;;;;;

max_execution_time = 120     ; Maximum execution time of each script, in seconds
max_input_time = 60     ; Maximum amount of time each script may spend parsing request data
memory_limit = 100M      ; Maximum amount of memory a script may consume (8MB)








Re: Template Dump Problems on enwiki-20071018/More Debug Info

Rob Church
On 26/10/2007, [hidden email]
<[hidden email]> wrote:
> I think needing 100MB of memory to run nested templates and chewed up
> stack space with PHP is somewhat excessive.  I will use the 20071018 dumps
> with these fixes, but look forward to a new enwiki dump with these issues
> at least understood and correctable.

You continue to assert that the dumps are at fault, when in fact the
problem (I agree it's a problem) lies at a higher level than the dumps
- it's in the actual content being used on Wikipedia et al. which is a
matter for the user base to discuss, come to terms with, and then
correct.

The "technical team" is not responsible for checking that content is
correct, nor is it responsible for checking page load times for each
article and pruning them in the dumps. If a page contains obvious
abuses of markup which cause significant problems for large numbers of
users, then we'll kill it off, but of course, we haven't had a large
number of reports of that in recent months, although as other threads
on the list imply, the problem is resurfacing, and will likely be
looked into.

We can't really help it if our users are silly enough to insist upon
abusing a markup language as if it were pure code, nor if they insist
upon continuing to use fragile-looking template constructs which will
end in tears. What we can do is to impose limits justified by
realistic limitations on processing capabilities, for example.

Please *stop* using language that asserts the fault lies at a
technical level, or that it's our fault. If you wish to complain about
the actions of the user base, then complain *to* the user base - you
might, for instance, wish to engage in some of the discussions
springing up over the problems, or start a dialogue on the talk pages
of some of the affected articles.


Rob Church


Re: Template Dump Problems on enwiki-20071018/More Debug Info

jmerkey-3
> On 26/10/2007, [hidden email]
> <[hidden email]> wrote:
>> I think needing 100MB of memory to run nested templates and chewed up
>> stack space with PHP is somewhat excessive.  I will use the 20071018
>> dumps
>> with these fixes, but look forward to a new enwiki dump with these
>> issues
>> at least understood and correctable.
>
> You continue to assert that the dumps are at fault, when in fact the
> problem (I agree it's a problem) lies at a higher level than the dumps
> - it's in the actual content being used on Wikipedia et al. which is a
> matter for the user base to discuss, come to terms with, and then
> correct.
>
> The "technical team" is not responsible for checking that content is
> correct, nor is it responsible for checking page load times for each
> article and pruning them in the dumps. If a page contains obvious
> abuses of markup which cause significant problems for large numbers of
> users, then we'll kill it off, but of course, we haven't had a large
> number of reports of that in recent months, although as other threads
> on the list imply, the problem is resurfacing, and will likely be
> looked into.
>
> We can't really help it if our users are silly enough to insist upon
> abusing a markup language as if it were pure code, nor if they insist
> upon continuing to use fragile-looking template constructs which will
> end in tears. What we can do is to impose limits justified by
> realistic limitations on processing capabilities, for example.
>
> Please *stop* using language that asserts the fault lies at a
> technical level, or that it's our fault. If you wish to complain about
> the actions of the user base, then complain *to* the user base - you
> might, for instance, wish to engage in some of the discussions
> springing up over the problems, or start a dialogue on the talk pages
> of some of the affected articles.
>
>
> Rob Church



Rob,

I agree with all of what you said above.  That being said, it is within
your realm of influence to code MediaWiki to check templates for garbage
logic when pages are saved and to prevent template code from nesting
to unlimited depth.  You can set a limit of 3 levels deep or make
templates flat.  It's gotten out of hand, and this is due to the design of
MediaWiki allowing people to use templates as a programming language.

You guys can fix this by preventing folks from creating templates which
nest to infinity, either by not allowing them to run or by rejecting such
changes.
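
To make the save-time check concrete, here is a rough sketch in Python, not MediaWiki code. It only counts how deeply {{ ... }} pairs are nested in the literal text of one page; real nesting also comes from templates transcluding other templates, which this does not expand. The names max_template_depth, allow_save and MAX_DEPTH are hypothetical, and triple-brace parameters like {{{1}}} are simply counted as one level.

```python
MAX_DEPTH = 3  # the "3 levels deep" limit suggested above


def max_template_depth(wikitext):
    """Return the deepest nesting level of {{ ... }} pairs in the text."""
    depth = 0
    deepest = 0
    i = 0
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            deepest = max(deepest, depth)
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
        else:
            i += 1
    return deepest


def allow_save(wikitext, limit=MAX_DEPTH):
    """Reject an edit whose literal template nesting exceeds the limit."""
    return max_template_depth(wikitext) <= limit
```

With this, allow_save("{{a|{{b|{{c}}}}}}") accepts the edit (three levels), while one more level of nesting is refused. A real check would have to run on the expanded template tree, so treat this as an illustration of the idea only.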

At any rate, I can get around it, and I will be writing a template
converter to read and cache these garbage templates and output static
tables if necessary so at least the content is usable.  You people own the
dumps, you create them, you wrote MediaWiki and it's YOUR STUFF.  Whether
you wish to pass on the blame to others or not, the quality of all of it
lands squarely in the lap of MediaWiki and those who publish.

Jeff



Re: Template Dump Problems on enwiki-20071018/More Debug Info

Thomas Dalton
In reply to this post by Rob Church
> The "technical team" is not responsible for checking that content is
> correct, nor is it responsible for checking page load times for each
> article and pruning them in the dumps. If a page contains obvious
> abuses of markup which cause significant problems for large numbers of
> users, then we'll kill it off, but of course, we haven't had a large
> number of reports of that in recent months, although as other threads
> on the list imply, the problem is resurfacing, and will likely be
> looked into.

True enough; however, the general user base has always been told "Don't
worry about the servers, we'll tell you if there's a problem." Someone
should probably tell them there's a problem...


Re: Template Dump Problems on enwiki-20071018/More Debug Info

Aryeh Gregor
On 10/26/07, Thomas Dalton <[hidden email]> wrote:
> True enough, however the general userbase has always been told "Don't
> worry about the servers, we'll tell you if there's a problem." Someone
> should probably tell them there's a problem...

Unfortunately, most people don't seem to understand the difference
between "problems for the servers" and "problems for the viewers".
It's no huge problem for the servers to serve these pages, it's a
problem when you have to twiddle your thumbs for thirty seconds for
the page to render.  The latter is a user-side problem that it's
entirely correct and necessary to report.  But most people don't get
that.

(Still, I'm going to have to say the "Don't worry about performance"
essay/guideline does more good than harm, for all that.  Just look at
this, for example:
http://en.wikipedia.org/wiki/Wikipedia_talk:Don%27t_worry_about_performance#Too_many_edits
)
