Toolserver db outperformed by labs

Platonides
Today^W Yesterday, I was asked about some file numbers, which involved
traversing subcategories, which is an "inefficient" problem. It seemed a
good problem for comparing Toolserver and Labs. And the Toolserver db sucks:

willow: 31m5.157s (user 0m4.038s)
labs: 0m4.271s (user 0m2.488s)

Toolserver was *436 times slower*.
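
As a quick check of that figure, here is a minimal sketch that converts the two wall-clock times above to seconds and divides them. The helper name `parse_time` is made up for illustration; only the two timings come from the post.

```python
def parse_time(text: str) -> float:
    """Convert a `time`-style duration such as '31m5.157s' to seconds."""
    minutes, _, seconds = text.rstrip("s").partition("m")
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") times quoted in the post
willow = parse_time("31m5.157s")
labs = parse_time("0m4.271s")
ratio = willow / labs

print(f"willow: {willow:.3f}s, labs: {labs:.3f}s, ratio: {ratio:.1f}x")
```

The user (CPU) times on the two hosts are within a factor of two of each other, which suggests the gap is time spent waiting on the database rather than work done in the client script.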

Surely, the Labs server has better hardware than the one in TS. I
don't know how many scripts were hitting the TS db, while the Labs one
would be almost idle. Still, it seems a really big gap. Do we have
something wrongly configured? Did MariaDB somehow massively improve vs
MySQL? Are some parameters too small? Is it just that the
MySQL servers are underprovisioned with RAM?



_______________________________________________
Toolserver-l mailing list ([hidden email])
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette

Re: Toolserver db outperformed by labs

Tim Landscheidt
(anonymous) wrote:

> Today^W Yesterday, I was asked about some file numbers, which involved
> subcategory traversing, which is an "inefficient" problem. It seemed a
> good problem for comparing toolserver and labs. And toolserver db sucks:

> willow: 31m5.157s (user 0m4.038s)
> labs: 0m4.271s (user 2.488)

> Toolserver was *436 times slower*.

> Surely, the labs server is better (in hardware) than the one in TS. I
> don't know how many scripts were hitting the TS db, while the labs one
> would be almost-idle. Still, it seems a really big gap. Do we have
> something wrongly configured? Did mariadb somehow massively improve vs
> mysql? Are some parameters too small? Is it just a problem that the
> mysql servers are underprovisioned of ram?

IIRC, the replicated databases on Labs are hosted on SSDs, so
it's not really fair to compare them :-).  A better benchmark
would probably be user databases on Toolserver versus tools-db
on Labs; the latter (which uses different credentials than the
replicated databases) is on a VM with storage on an (IIRC
spinning) NFS server. But that would of course neglect that
the Toolserver databases have to cope with replication as
well, while tools-db only holds the user databases.  So I
don't think an adequate comparison can be made.

Tim



Re: Toolserver db outperformed by labs

Patricia Pintilie
In reply to this post by Platonides

The problem starts with the server's ability to categorize each discrepancy. If it can't, it dumps without hesitating. Overloading itself is one thing the server won't do, on either the negative or the positive side. When server programs start having issues, their discrepancies affect the server's log, which affects the server's progress in its own job. If you have one server, give it 2 input and output nodes ONLY; otherwise all your programs will back the server db up.

On May 23, 2013 5:40 PM, "Platonides" <[hidden email]> wrote:
> Today^W Yesterday, I was asked about some file numbers, which involved
> subcategory traversing, which is an "inefficient" problem. It seemed a
> good problem for comparing toolserver and labs. And toolserver db sucks:
>
> willow: 31m5.157s (user 0m4.038s)
> labs: 0m4.271s (user 2.488)
>
> Toolserver was *436 times slower*.
>
> Surely, the labs server is better (in hardware) than the one in TS. I
> don't know how many scripts were hitting the TS db, while the labs one
> would be almost-idle. Still, it seems a really big gap. Do we have
> something wrongly configured? Did mariadb somehow massively improve vs
> mysql? Are some parameters too small? Is it just a problem that the
> mysql servers are underprovisioned of ram?


Re: Toolserver db outperformed by labs

Johannes Kroll
In reply to this post by Platonides
On Fri, 24 May 2013 00:42:22 +0200
Platonides <[hidden email]> wrote:

> Today^W Yesterday, I was asked about some file numbers, which involved
> subcategory traversing, which is an "inefficient" problem. It seemed a
> good problem for comparing toolserver and labs. And toolserver db sucks:
>
> willow: 31m5.157s (user 0m4.038s)
> labs: 0m4.271s (user 2.488)
>
> Toolserver was *436 times slower*.
>
> Surely, the labs server is better (in hardware) than the one in TS. I
> don't know how many scripts were hitting the TS db, while the labs one
> would be almost-idle. Still, it seems a really big gap. Do we have
> something wrongly configured? Did mariadb somehow massively improve vs
> mysql? Are some parameters too small? Is it just a problem that the
> mysql servers are underprovisioned of ram?

Almost nobody is using the replicated Labs DB yet so it's not really a
surprise. Wait half a year or so then try again. I expect the
Labs DB to be faster still because of better hardware, but probably not
*that* much faster.

BTW: If you're doing recursive traversal of categories, you may be
interested in CatGraph: http://tools.wmflabs.org/render-tests/catgraph/
Ask me if you want to know more about it. This address or
JohannesK_WMDE on freenode. :)
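
For readers unfamiliar with the problem, here is a minimal sketch of what recursive subcategory traversal involves. It is not how CatGraph or the original script works. The function name `collect_subcategories` and the toy data are assumptions for illustration, and an in-memory dict stands in for the database lookups a real tool would make at each step.

```python
from collections import deque


def collect_subcategories(root, subcats):
    """Return root plus every category reachable from it, breadth-first.

    `subcats` maps a category name to its direct subcategories. A real
    tool would replace the dict lookup with a database query (assumption).
    """
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in subcats.get(current, ()):
            # Category graphs can contain cycles, so skip anything visited
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical toy graph, including a cycle from C back to A
toy = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "A"],
}
result = collect_subcategories("A", toy)
print(sorted(result))
```

Each level of the tree costs at least one lookup, so deep or wide category trees turn into many small queries, which is one reason the problem is sensitive to database latency.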




Re: Toolserver db outperformed by labs

Federico Leva (Nemo)
In reply to this post by Platonides
Yesterday many were playing with big SSD DB queries. :)
https://gist.github.com/brion/5652302
https://twitter.com/mdammers/status/338652420362092544
Thanks, Platonides, for this post; it's the kind of stuff we need,
reasons and examples of why [some] people may *want* to use [also] Labs,
resulting in an orderly increase of willing users without traumas.

Nemo


Re: Toolserver db outperformed by labs

Tim Landscheidt