Doing backups and avoiding downtime

Doing backups and avoiding downtime

Travis Derouin-2
Hi,

We've been using mysqldump to do daily full database backups in case the
hardware on our DB server fails. This causes some problems because
for a short period of 4 minutes or so, the site is inaccessible
because mysqldump has the DB locked.

I'm not too familiar with the maintenance/dumpPages.xml script, but
this script doesn't back up the whole DB, including user accounts,
recent changes, links, etc., does it? And if it does, it probably
doesn't avoid the problem of having to lock the DB for a few minutes,
right?

Is there any reason why Squid would show this error (while the DB is
locked) to anonymous users for pages that should be cached? Squid does
otherwise seem to be caching pages properly.

If mysqldump is still the answer (I'm using the --quick option), are
there any other ways we can avoid this brief downtime while capturing
a backup? How does Wikipedia do this?

Thanks a lot,
Travis

Re: Doing backups and avoiding downtime

Ron Hall
Travis Derouin wrote:
>
> Hi,
>
> We've been using mysqldump to do daily full database backups in case
> our hardware on our DB server fails. This causes some problems because
> for a short period of 4 minutes or so, the site is inaccessible
> because mysqldump has the db locked.
>
We back up every day at 04:00 - here is the "sophisticated" mechanism we use.

#!/bin/sh
#
# Get the date as a timestamp
#
date=`date '+%Y%m%d'`
#
# Make a hot copy of the MySQL DB
#
/usr/bin/mysqlhotcopy -h bushido.mcgill.ca -u root wikidb /tmp
#
# Make a tar of the DB files
#
cd /tmp
tar cf /opt/DB/wikidb$date.tar wikidb
#
# Make a tar of the wiki files
#
cd /opt
tar cf /opt/DB/wikifiles$date.tar mediawiki
#
# Clean up old files
#
rm -rf /tmp/wikidb
cd /opt/DB
gzip *.tar


enjoy - the hot backup is really short.

r

Re: Doing backups and avoiding downtime

George William Herbert
In reply to this post by Travis Derouin-2
On 10/31/06, Travis Derouin <[hidden email]> wrote:

>
> Hi,
>
> We've been using mysqldump to do daily full database backups in case
> our hardware on our DB server fails. This causes some problems because
> for a short period of 4 minutes or so, the site is inaccessible
> because mysqldump has the db locked.
>
> I'm not too familiar with the maintenance/dumpPages.xml script, but
> this script doesn't backup the whole db, including user accounts,
> recent changes, links, etc, does it? And if it does, it probably
> doesn't avoid the problem of having to lock the DB for a few minutes,
> right?
>
> Is there any reason why Squid is reporting this error to anonymous
> users for pages that should be cached? Squid does seem to be caching
> pages properly.
>
> If mysqldump is still the answer,(I'm using the --quick option) are
> there any other ways we can avoid this brief downtime to capture a
> backup? How does Wikipedia do this?



This may be an argument for using PostgreSQL instead of MySQL for your site
- the PostgreSQL "pg_dump" command doesn't cause a read or write lock on the
database, so nobody using it gets blocked.  Things do slow down a bit, but
you can keep right on reading or updating through the dump.
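
As a rough sketch of what that looks like in practice (the database name,
user and output path here are placeholders, not anything from Travis's
setup):

#!/bin/sh
#
# Get the date as a timestamp
#
date=`date '+%Y%m%d'`
#
# pg_dump reads from an MVCC snapshot, so readers and writers
# are not blocked while the dump runs
#
pg_dump -U wikiuser wikidb | gzip > /opt/DB/wikidb$date.sql.gz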

Even mysqlhotcopy has a moderate lock window.


--
-george william herbert
[hidden email]

Re: Doing backups and avoiding downtime

Travis Derouin-2
In reply to this post by Ron Hall
Hi Ron,

Thanks. It says here that mysqlhotcopy can only be used on MyISAM tables.

http://dev.mysql.com/doc/refman/5.0/en/mysqlhotcopy.html

MediaWiki tables are all InnoDB; did you find a workaround for this?

Thanks,
Travis

Re: Doing backups and avoiding downtime

George William Herbert
On 10/31/06, Travis Derouin <[hidden email]> wrote:
>
> Hi Ron,
>
> Thanks. it says here that mysqlhotcopy can only be used on MyISAM tables.
>
> http://dev.mysql.com/doc/refman/5.0/en/mysqlhotcopy.html
>
> Mediawiki tables are all InnoDB, did you find a work around for this?



 With InnoDB, you can do mysqldump --single-transaction --quick.
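
A minimal sketch in the same spirit as Ron's script, assuming the database
is called wikidb and credentials come from ~/.my.cnf (both assumptions on
my part):

#!/bin/sh
#
# Get the date as a timestamp
#
date=`date '+%Y%m%d'`
#
# --single-transaction dumps InnoDB tables from a consistent
# snapshot without locking them; --quick streams rows instead
# of buffering whole tables in memory
#
mysqldump --single-transaction --quick wikidb | gzip > /opt/DB/wikidb$date.sql.gz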




--
-george william herbert
[hidden email]

Re: Doing backups and avoiding downtime

Ron Hall
George Herbert wrote:

>
> On 10/31/06, Travis Derouin <[hidden email]> wrote:
> >
> > Hi Ron,
> >
> > Thanks. it says here that mysqlhotcopy can only be used on MyISAM
> tables.
> >
> > http://dev.mysql.com/doc/refman/5.0/en/mysqlhotcopy.html
> >
> > Mediawiki tables are all InnoDB, did you find a work around for this?
>
>
>
>  With innodb, you can do mysqldump --single-transaction --quick
>
    Oddly enough it works on my system - go figure.

    r


Re: Doing backups and avoiding downtime

Jay Ashworth-2
In reply to this post by George William Herbert
On Tue, Oct 31, 2006 at 12:04:26PM -0800, George Herbert wrote:
> This may be an argument for using PostgreSQL instead of MySQL for your site
> - the PostgreSQL "pg_dump" command doesn't cause a read or write lock on the
> database, so nobody using it gets blocked.  Things do slow down a bit, but
> you can keep right on reading or updating through the dump.

While still preserving coherence between all the tables it's backing
up, right?

Cheers,
-- jr "isn't MVCC great?" a
--
Jay R. Ashworth                                                [hidden email]
Designer                          Baylink                             RFC 2100
Ashworth & Associates        The Things I Think                        '87 e24
St Petersburg FL USA      http://baylink.pitas.com             +1 727 647 1274

        "That's women for you; you divorce them, and 10 years later,
          they stop having sex with you."  -- Jennifer Crusie; _Fast_Women_

Re: Doing backups and avoiding downtime

George William Herbert
On 10/31/06, Jay R. Ashworth <[hidden email]> wrote:

>
> On Tue, Oct 31, 2006 at 12:04:26PM -0800, George Herbert wrote:
> > This may be an argument for using PostgreSQL instead of MySQL for your
> site
> > - the PostgreSQL "pg_dump" command doesn't cause a read or write lock on
> the
> > database, so nobody using it gets blocked.  Things do slow down a bit,
> but
> > you can keep right on reading or updating through the dump.
>
> While still preserving coherence between all the tables it's backing
> up, right?



Yep.




--
-george william herbert
[hidden email]

Re: Doing backups and avoiding downtime

Tim Starling-2
In reply to this post by George William Herbert
George Herbert wrote:
> This may be an argument for using PostgreSQL instead of MySQL for your site
> - the PostgreSQL "pg_dump" command doesn't cause a read or write lock on the
> database, so nobody using it gets blocked.  Things do slow down a bit, but
> you can keep right on reading or updating through the dump.
>
> Even mysqlhotcopy has a moderate lock window.

mysqldump on InnoDB is the same, if you use the right options. No locking,
just multi-versioned copy-on-write tables. That's how we used to do backups
before we had slaves that we could stop.
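
For anyone with a replication slave, that approach is roughly this shape
(the database name and paths below are made up for illustration; this is
not the actual Wikimedia procedure):

#!/bin/sh
#
# Pause replication on the slave, dump from it, then resume.
# The master keeps serving the live wiki the whole time.
#
date=`date '+%Y%m%d'`
mysql -e 'STOP SLAVE'
mysqldump --quick wikidb | gzip > /opt/DB/wikidb$date.sql.gz
mysql -e 'START SLAVE'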

-- Tim Starling


Re: Doing backups and avoiding downtime

MHart
In reply to this post by Travis Derouin-2
> We've been using mysqldump to do daily full database backups in case
> our hardware on our DB server fails. This causes some problems because
> for a short period of 4 minutes or so, the site is inaccessible
> because mysqldump has the db locked.

I do a full mysqldump of 100+ wikis and 25+ blogs. It takes an average of
55 seconds to dump and gzip. It auto-runs at around midnight Eastern time.
Never a problem. Backups are saved to a RAID 1 device daily, and all backups
are copied to an offsite RAID 5 device weekly.

How big is your resulting dump... and how powerful is the dumper? 4 minutes
sounds excessive. Gzipped, my backups are 156 MB; unzipped, 1.2 GB. Around
the time of the backup, I have between 17 and 21 MB of bandwidth hitting
the site: around 2600 hits, just under 1000 page views. If I adjusted my
backup time to 6 a.m. Eastern - that's when I have the least traffic - it
would be only about 2 MB of bandwidth usage.

- MHart


Re: Doing backups and avoiding downtime

Travis Derouin-2
You're right, looking at the logs, it is more like 1.5 minutes. Funny!
I still hear complaints about that small amount of downtime though.
The perceived downtime is likely longer because incoming connections are
getting backed up, and there's probably a bottleneck when the
dump has finished.

Our dump uncompressed is 1.2 GB for one wiki.

Re: Doing backups and avoiding downtime

George William Herbert
On 11/1/06, Travis Derouin <[hidden email]> wrote:

>
> You're right, looking at the logs, it is more like 1.5 minutes. Funny!
> I still hear complaints about that small amount of downtime though.
> The perceived downtime is likely longer because incoming connections are
> getting backed up, and there's probably a bottleneck when the
> dump has finished.
>
> Our dump uncompressed is 1.2 GB for one wiki.

That does sound sort of slow.

As Tim points out, the ultimate solution to this is a slave DB which you can
stop without affecting the primary that is serving the live wiki.

Are you doing the dump-to-/tmp trick others noted earlier in the thread? A
RAM disk is far better than a real disk for dump speed, if you have the RAM
available... and 1.2 GB isn't all that much RAM these days.
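
If /tmp is not already a RAM disk on your system, one way to set that up on
Linux is roughly the following (the size, mount point and database name are
guesses; adjust to your setup):

#!/bin/sh
#
# Mount a 2 GB tmpfs, dump into it, then move the compressed
# result onto real disk and unmount
#
date=`date '+%Y%m%d'`
mkdir -p /mnt/dumpram
mount -t tmpfs -o size=2g tmpfs /mnt/dumpram
mysqldump --single-transaction --quick wikidb | gzip > /mnt/dumpram/wikidb$date.sql.gz
mv /mnt/dumpram/wikidb$date.sql.gz /opt/DB/
umount /mnt/dumpram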


--
-george william herbert
[hidden email]