[MediaWiki-l] Mediawiki MySQL user-related database [dead]locks?

Andrew Smith
Hello

I administer a relatively busy wiki for our school, at
https://wiki.cdot.senecacollege.ca

In August I migrated the wiki from an older server, which as far as I
can tell was running 1.15.4. The new server has the latest MediaWiki
version available, which as far as I can tell is now 1.27.

The database server didn't change, only the web server.

After updating the MediaWiki files (I started from 1.27 and added in
missing stuff as described at
https://www.mediawiki.org/wiki/Manual:Moving_a_wiki ), I ran update.php.

I can't remember if it printed any errors the first time I ran it. There
was a lot of output and I don't understand half of it. I think it ran
without errors. Also I ran it multiple times since then and haven't
noticed any errors.

Everything seemed to work. But now that the new semester started and we
have a lot of new students - we discovered a very serious problem: the
wiki won't allow new user registrations.

You can go and try yourself. Usually the interface just hangs there,
never coming back with a response. Sometimes I get an error like this:

MySQL [cdotwiki_db]> SELECT user_id FROM `mw_user` WHERE user_name =
'Asmith20' LIMIT 1 LOCK IN SHARE MODE
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Just now I finished migrating the database to a brand new MySQL 5.7.15
server, thinking that maybe I would see some change. But nothing
changed. Because it's 1AM I got some debugging going, and almost
certainly this was the first Mediawiki request to the database that
failed (from SHOW ENGINE INNODB STATUS;):

---TRANSACTION 4748, ACTIVE 62 sec
8 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 8
MySQL thread id 1345, OS thread handle 140329361524480, query id 20771
web-cdot1.sparc 10.7.0.45 cdotwiki_usr cleaning up
Trx read view will not see trx with id >= 4742, sees < 4742
TABLE LOCK table `cdotwiki_db`.`mw_user` trx id 4748 lock mode IS
RECORD LOCKS space id 70 page no 157 n bits 624 index user_name of table
`cdotwiki_db`.`mw_user` trx id 4748 lock mode S locks gap before rec
Record lock, heap no 244 PHYSICAL RECORD: n_fields 2; compact format;
info bits 0
  0: len 6; hex 41736f583139; asc AsoX19;;
  1: len 4; hex 000001ac; asc     ;;

Record lock, heap no 558 PHYSICAL RECORD: n_fields 2; compact format;
info bits 0
  0: len 8; hex 41736d6974683230; asc Asmith20;;
  1: len 4; hex 000036c2; asc   6 ;;

TABLE LOCK table `cdotwiki_db`.`mw_user` trx id 4748 lock mode IX
RECORD LOCKS space id 70 page no 436 n bits 112 index PRIMARY of table
`cdotwiki_db`.`mw_user` trx id 4748 lock_mode X locks rec but not gap
Record lock, heap no 42 PHYSICAL RECORD: n_fields 17; compact format;
info bits 0
  0: len 4; hex 000036c2; asc   6 ;;
  1: len 6; hex 00000000128c; asc       ;;
  2: len 7; hex 21000001362118; asc !   6! ;;
  3: len 8; hex 41736d6974683230; asc Asmith20;;
  4: len 0; hex ; asc ;;
  5: len 30; hex
3a70626b6466323a7368613235363a31303030303a3132383a2f66545962; asc
:pbkdf2:sha256:10000:128:/fTYb; (total 222 bytes);
  6: len 0; hex ; asc ;;
  7: len 21; hex 61736d6974683230406c6974746c657376722e6361; asc
[hidden email];;
  8: len 14; hex 3230313630393135303530303137; asc 20160915050017;;
  9: len 30; hex
623061346535323762613365336462656133323035633666343564663163; asc
b0a4e527ba3e3dbea3205c6f45df1c; (total 32 bytes);
  10: SQL NULL;
  11: len 30; hex
396561386335613365663263623666353062303736646165393934393331; asc
9ea8c5a3ef2cb6f50b076dae994931; (total 32 bytes);
  12: len 14; hex 3230313630393232303530303130; asc 20160922050010;;
  13: len 14; hex 3230313630393135303530303130; asc 20160915050010;;
  14: SQL NULL;
  15: len 4; hex 80000000; asc     ;;
  16: SQL NULL;

TABLE LOCK table `cdotwiki_db`.`mw_watchlist` trx id 4748 lock mode IX
RECORD LOCKS space id 70 page no 157 n bits 624 index user_name of table
`cdotwiki_db`.`mw_user` trx id 4748 lock_mode X locks rec but not gap
Record lock, heap no 558 PHYSICAL RECORD: n_fields 2; compact format;
info bits 0
  0: len 8; hex 41736d6974683230; asc Asmith20;;
  1: len 4; hex 000036c2; asc   6 ;;

TABLE LOCK table `cdotwiki_db`.`mw_logging` trx id 4748 lock mode IX
TABLE LOCK table `cdotwiki_db`.`mw_recentchanges` trx id 4748 lock mode IX

If I kill this transaction (thread id 1345) via phpmyadmin - strangely
nothing will appear to happen. My browser will keep spinning, as if it's
waiting for a response.

I am starting to become desperate. With every day that passes the
problem grows. And I don't have a lot of ideas left.

It's not the MySQL server. I disabled nearly all extensions. I can't
really read the output from innodb status (example above). What should I do?
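
One part of the dump above that can be read mechanically is the "hex" value on each record field. The sketch below is only an aid for that, not part of the original report: the function names are made up for illustration, and it assumes that string columns are stored as plain bytes and that user_id is an unsigned INT written big-endian (signed integers are encoded differently and are not handled here).

```python
def decode_text_field(hex_field: str) -> str:
    """Turn the hex part of a dumped string column back into text."""
    return bytes.fromhex(hex_field).decode("utf-8", errors="replace")


def decode_unsigned_int_field(hex_field: str) -> int:
    """Read a dumped unsigned integer column (assumed big-endian)."""
    return int.from_bytes(bytes.fromhex(hex_field), byteorder="big")


if __name__ == "__main__":
    # Values copied from the record locks in the dump above.
    print(decode_text_field("41736d6974683230"))              # Asmith20
    print(decode_unsigned_int_field("000036c2"))              # 14018
    print(decode_text_field("3230313630393135303530303137"))  # 20160915050017
```

Read this way, the locked PRIMARY record is the row for user_name Asmith20 with user_id 14018, i.e. the account whose creation was in progress.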

My best guess now is that the upgrade didn't work 100%. But what do I do
now? Is it an option to downgrade? If yes - how far back? Will that even
help or is the database corrupt? How can I check whether it's corrupt?

If I have to roll back to the backup from before the upgrade (shiver) -
will it be possible to apply the changes made to the new database to the
restored old version? Probably not?

Please help! I thought I got this upgrade to work and I really hope I
didn't screw hundreds of users over :(

Cheers,

Andrew

_______________________________________________
MediaWiki-l mailing list
To unsubscribe, go to:
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l

Re: Mediawiki MySQL user-related database [dead]locks?

Andrew Smith
Also I noticed something weird in Firebug. When I go to the registration
page - looks like there's an Ajax request to api.php after I type in the
username, returning this:

{"batchcomplete":"","query":{"users":[{"name":"Asmith20","missing":""}]}}

Which is presumably a good thing. But the strange thing is - according
to the Net panel in Firebug - there are no more network requests after
that, even after I click the create button.

I don't know what to make of this, but thought I'd mention in case it
rings a bell for anyone else.
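
For anyone reading along, the response above can be interpreted with a few lines of code. This is a minimal sketch, assuming the response has exactly the shape pasted above, where a "missing" key on the user entry means no account with that name exists yet. The function name username_is_free is hypothetical.

```python
import json


def username_is_free(response_text: str, name: str) -> bool:
    """Return True if the users query reports `name` as missing (not registered)."""
    data = json.loads(response_text)
    for user in data.get("query", {}).get("users", []):
        if user.get("name") == name:
            # In the response format shown above, the key is present
            # with an empty string value when the account does not exist.
            return "missing" in user
    return False


if __name__ == "__main__":
    pasted = '{"batchcomplete":"","query":{"users":[{"name":"Asmith20","missing":""}]}}'
    print(username_is_free(pasted, "Asmith20"))  # True
```

So the check itself says the name is available; it is the later account creation request that never completes.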

Andrew

On 15/09/16 01:36, Andrew Smith wrote:

> [...]


Re: Mediawiki MySQL user-related database [dead]locks?

Brian Wolff
On Thu, Sep 15, 2016 at 5:36 AM, Andrew Smith <[hidden email]> wrote:

> My best guess now is that the upgrade didn't work 100%. But what do I do
> now? Is it an option to downgrade? If yes - how far back? Will that even
> help or is the database corrupt? How can I check whether it's corrupt?

A partially completed upgrade would probably not cause this. If a
schema change hadn't been made, you should get an immediate error about
missing columns. (However, if a schema change (ALTER TABLE) were
still in progress, it might be blocking everything going
to the user table. You could use SHOW PROCESSLIST; to verify
that that is not the case. However, if it were the case, I feel like a
whole lot of other stuff would be broken too.)

It's difficult to downgrade MW when going so many versions back. You'd
probably be able to downgrade from 1.27 -> 1.26. But going all the way
back to 1.15 would be extremely difficult.

I don't know much about DB locking, so the following might be stupid and wrong:

I was under the impression that mysql would add a line like "-------
TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED:" if the
transaction is actually waiting for a lock. So the output of SHOW
ENGINE INNODB STATUS you pasted above makes me almost think that it's
not so much that MediaWiki is waiting for a lock as that MW is holding a
bunch of locks, and for some reason never sending a commit (This might be
the totally wrong conclusion, I don't really know anything about mysql
locks). So maybe something is wrong on the php side. Thus I'd also
check the php error log, and maybe try enabling the MediaWiki debug
log and check that.
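
The distinction above (waiting for a lock versus holding locks and never committing) can be checked mechanically on a saved copy of the status output. The following is a rough sketch, not a full parser: it assumes the line formats seen in this thread ("---TRANSACTION <id>, ACTIVE <n> sec", "undo log entries <n>" and the "TRX HAS BEEN WAITING" marker), and the names Trx, parse_transactions and idle_lock_holders are invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

TRX_LINE = re.compile(r"^---TRANSACTION (\d+), (?:ACTIVE (\d+) sec)?")
WAIT_MARKER = re.compile(r"TRX HAS BEEN WAITING \d+ SEC FOR THIS LOCK TO BE GRANTED")
UNDO_ENTRIES = re.compile(r"undo log entries (\d+)")


@dataclass
class Trx:
    trx_id: int
    active_sec: int
    waiting: bool = False   # True if the wait marker appears in its section
    undo_entries: int = 0   # non-zero means it has uncommitted changes


def parse_transactions(status_text: str) -> List[Trx]:
    """Collect the ACTIVE transactions from SHOW ENGINE INNODB STATUS text."""
    found: List[Trx] = []
    current: Optional[Trx] = None
    for line in status_text.splitlines():
        header = TRX_LINE.match(line)
        if header:
            # Transactions that are not ACTIVE (e.g. "not started") are skipped.
            current = None
            if header.group(2) is not None:
                current = Trx(int(header.group(1)), int(header.group(2)))
                found.append(current)
            continue
        if current is None:
            continue
        if WAIT_MARKER.search(line):
            current.waiting = True
        undo = UNDO_ENTRIES.search(line)
        if undo:
            current.undo_entries = int(undo.group(1))
    return found


def idle_lock_holders(status_text: str, min_age_sec: int = 30) -> List[Trx]:
    """Old transactions with uncommitted changes that are not waiting on a lock."""
    return [t for t in parse_transactions(status_text)
            if not t.waiting and t.undo_entries > 0 and t.active_sec >= min_age_sec]


SAMPLE = """---TRANSACTION 4748, ACTIVE 62 sec
8 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 8
"""

if __name__ == "__main__":
    for trx in idle_lock_holders(SAMPLE):
        print(trx.trx_id, trx.active_sec, trx.undo_entries)
```

Run against the first two lines of the pasted transaction, this flags transaction 4748: it has been active for 62 seconds, has 8 undo log entries, and shows no wait marker, which fits the "holding locks, never committing" reading. The 30-second threshold is an arbitrary choice.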

As an aside, maybe double check the mysql isolation level and make
sure it's only at REPEATABLE-READ. (No idea if that would help or not).

--
bawolff


Re: Mediawiki MySQL user-related database [dead]locks?

Andrew Smith
On 15/09/16 07:08, Brian Wolff wrote:

> I don't know much about DB locking, so the following might be stupid and wrong:
>
> I was under the impression that mysql would add a line like "-------
> TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED:" if the
> transaction is actually waiting for a lock. So the output of SHOW
> ENGINE INNODB STATUS you pasted above makes me almost think that it's
> not so much that MediaWiki is waiting for a lock as that MW is holding a
> bunch of locks, and for some reason never sending a commit (This might be
> the totally wrong conclusion, I don't really know anything about mysql
> locks). So maybe something is wrong on the php side. Thus I'd also
> check the php error log, and maybe try enabling the MediaWiki debug
> log and check that.

That's exactly what I was expecting! But the transaction will literally
never die, it will stay there for hours (days if I let it).

I've had the following set for a while:

$wgShowExceptionDetails = true;
$wgShowSQLErrors = true;
$wgDebugDumpSql  = true;
$wgDBerrorLog = true;

But I don't see any extra logs in the server error logs. I do get the
sql error in the web interface when it times out.

I will set $wgDebugLogFile and see if that shows anything.

What else could I do? Surely it's not a bug in mediawiki, I wouldn't be
the first to notice :) It's got to be related with my setup or my data?

Thanks,

Andrew


Re: Mediawiki MySQL user-related database [dead]locks?

Andrew Smith
On 15/09/16 10:14, Andrew Smith wrote:
> On 15/09/16 07:08, Brian Wolff wrote:
>> locks). So maybe something is wrong on the php side. Thus I'd also
>> check the php error log, and maybe try enabling the MediaWiki debug
>> log and check that.
>
Thanks Brian. Your suggestion to turn on the MediaWiki debug log helped
me track down the problem.

It was a bizarre one. $wgSMTP was misconfigured and what happened was:

* Someone would try to register
* Mediawiki would start a transaction to do whatever in SQL, perhaps
record that an account is being created
* Mediawiki would try to send an email, which I could see from a single
log line with PEAR in it (the last log line for that request).
* The email would never get sent and the request to send the email never
timed out (go figure).
* The transaction started way back when was never committed, and was
left open forever.
* From that point on any other requests to modify the user table timed out.

Why would one send an email in the middle of an SQL transaction? That's
a disaster waiting to happen.
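
The sequence in the list above can be reproduced in miniature. The sketch below is not MediaWiki code: it uses Python's built-in sqlite3 module as a stand-in for MySQL (SQLite locks the whole database rather than individual rows, so only the general mechanism carries over), and register, try_second_writer and the user table are hypothetical names. The "email" callback simply probes whether another writer can get in at that moment.

```python
import os
import sqlite3
import tempfile


def try_second_writer(db_path: str) -> str:
    """Try to insert from a second connection, as another registration would."""
    other = sqlite3.connect(db_path, timeout=0.2)  # give up after 0.2 s
    try:
        other.execute("INSERT INTO user (name) VALUES ('second')")
        other.commit()
        return "ok"
    except sqlite3.OperationalError as exc:
        return f"blocked: {exc}"
    finally:
        other.close()


def register(db_path: str, name: str, send_email, commit_first: bool) -> None:
    """Create an account, sending the email before or after the commit."""
    conn = sqlite3.connect(db_path, isolation_level=None)  # we manage BEGIN/COMMIT
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock right away
        conn.execute("INSERT INTO user (name) VALUES (?)", (name,))
        if commit_first:
            conn.execute("COMMIT")  # locks released before the slow step
            send_email()
        else:
            send_email()            # slow step runs while locks are held
            conn.execute("COMMIT")
    finally:
        conn.close()


def demo() -> dict:
    """Return what a second writer sees during the email step, per ordering."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wiki.db")
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
        setup.commit()
        setup.close()
        outcomes = {}
        for commit_first in (False, True):
            seen = []
            register(path, "Asmith20",
                     lambda: seen.append(try_second_writer(path)), commit_first)
            outcomes[commit_first] = seen[0]
        return outcomes


if __name__ == "__main__":
    for commit_first, outcome in demo().items():
        print("commit first:", commit_first, "->", outcome)
```

With the email step inside the transaction, the second writer times out for as long as that step takes; if the step never returns, the lock is never released. With the commit first, the second writer succeeds regardless of how slow the email step is.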

Is there a place where I should report this bug? I would like to help
get this fixed, not just complain on my blog :)
http://littlesvr.ca/grumble/2016/09/17/what-does-a-posix-signal-handler-and-an-sql-transaction-have-in-common/

It really sucks that I wasted so much time and patience dealing with
this, but I'm glad I got it sorted out in the end. Thanks for the help
Brian!

Cheers,

Andrew


Re: Mediawiki MySQL user-related database [dead]locks?

Florian Schmidt
Hi,

thanks for writing your conclusion back to this list. Could you maybe say what the log message was that indicated the underlying SMTP problem (and, by the way, what the misconfiguration was)?

Generally, if you want to report a bug, you should describe the problem in as much detail as possible, with a clear way of reproducing it, in a Phabricator task: https://phabricator.wikimedia.org
You could send a reply to this list containing the link to your task, just for the record.

Thanks for reporting it!

Best,
Florian

-----Original Message-----
From: MediaWiki-l [mailto:[hidden email]] On Behalf Of Andrew Smith
Sent: Saturday, 17 September 2016 20:32
To: [hidden email]
Subject: Re: [MediaWiki-l] Mediawiki MySQL user-related database [dead]locks?

[...]

Re: Mediawiki MySQL user-related database [dead]locks?

Jan Steinman-2
In reply to this post by Brian Wolff
Are you using InnoDB?

This sounds like a MyISAM issue.

MyISAM does table level locking. InnoDB does row locking. So a transaction should not hang in InnoDB unless two processes are attempting to modify the same record, whereas it can happen in MyISAM if two processes are trying to modify the same table.

    Jan Steinman
    EcoReality Co-op, http://www.EcoReality.org
    2152 Fulford-Ganges Road
    Salt Spring Island, BC V8K 1Z7 CANADA
    +1 250.653.2024



