Re: Delete archive time
On Wed, Apr 18, 2012 at 06:33:04PM +0100, Michael Stevens wrote:
> On Wed, Apr 18, 2012 at 10:09:48AM -0700, Colin Percival wrote:
> > [picking a random email to reply to...]
> >
> > On 04/18/12 06:34, Michael Stevens wrote:
> > > I'm finding deleting archives is very slow and quite bandwidth intensive
> > > as well.
> >
> > The bandwidth used should be about 0.1% of the size of the archive you're
> > deleting; are you seeing more than this?
>
> It looks like I've used about 10-20 GB of bandwidth in the last 48 hours
> deleting archives. I'm deleting ~100 archives with content mostly
> duplicated between them.
>
> > > I've been trying to clear up old archives and deleting about a year's
> > > worth of one a day - it's been going around 36 hours now and using a
> > > fair bit of bandwidth.
> >
> > Can you send me your --print-stats output? It's possible that if Tarsnap's
> > deduplication worked extremely well when it was creating archives then the
> > 0.1% overhead is significant -- I hadn't worried about it because I figured
> > it would always be much less than the time taken to originally create the
> > archives.
>
> On one machine (but not the one I'm referring to above):
>                       Total size  Compressed size
> All archives        603859837025     245148994034
>   (unique data)      24571415132      10465650048
>
> I'll see if I can get some more numbers later.
The machine I was particularly complaining about has taken almost
exactly 48 hours to delete around 300 archives (one failed due to
network problems my end).
The output of the final delete:
                      Total size  Compressed size
All archives       1075430463240     689511951789
  (unique data)     104970009987      75556009572
This archive        154419395356      99378386213
Deleted data           341060679        234719159
Going as far back as I have scrollback, for one of the earlier deletes:
                      Total size  Compressed size
All archives      29119731147593   18272658548233
  (unique data)     136239802444      91778391522
This archive        148559529876      92986492326
Deleted data           107609840         61792498
I had one delete I wanted to do left over, which I've timed:
root@osaka:~# time tarsnap -d -f osaka-home-2011-05-18
                      Total size  Compressed size
All archives        926836591828     596496483543
  (unique data)      96754663741      68973013309
This archive        148593871412      93015468246
Deleted data          8215346246       6582996263
real 14m37.343s
user 0m42.731s
sys 0m4.344s
The network connection is an ADSL line, roughly 10 Mbit/s down and 1 Mbit/s up.
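
As a rough cross-check of the 0.1% figure quoted above against these
numbers, here is a back-of-envelope sketch in Python. It assumes the 0.1%
applies to the uncompressed "This archive" size (the thread does not say
which size is meant) and ignores latency and protocol overhead. The
function names are made up for illustration.

```python
# Back-of-envelope estimate only. The 0.1% figure is the one quoted earlier
# in this thread; applying it to the uncompressed size is an assumption.
OVERHEAD_FRACTION = 0.001


def delete_traffic_bytes(archive_size_bytes, fraction=OVERHEAD_FRACTION):
    """Estimated bytes transferred when deleting one archive."""
    return archive_size_bytes * fraction


def transfer_seconds(num_bytes, link_bits_per_second):
    """Time to move num_bytes over a link, ignoring latency and overhead."""
    return num_bytes * 8 / link_bits_per_second


# "This archive" total size from the timed delete above.
this_archive = 148593871412
traffic = delete_traffic_bytes(this_archive)

if __name__ == "__main__":
    print(f"estimated traffic per delete: {traffic / 1e6:.1f} MB")
    print(f"time at 10 Mbit/s: {transfer_seconds(traffic, 10e6):.0f} s")
    print(f"time at  1 Mbit/s: {transfer_seconds(traffic, 1e6):.0f} s")
    print(f"traffic for 100 such archives: {100 * traffic / 1e9:.1f} GB")
```

On those assumptions, deleting about 100 archives of this size comes to
roughly 15 GB, which is in the range of the 10-20 GB reported above.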
In my case, the reason I'm deleting so much is that I have no automated
way of cleaning up old incremental backups, and they're cheap, so I've
let them build up, and am finding cleaning things up is slower than I
expected.
I'm wondering if a cheap fix for this would be to provide some
higher-level operations (possibly as a wrapper), like "archive daily and
keep the last n days" (<insert attempt to get Colin to do all the work here>).
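
A minimal sketch of what the pruning half of such a wrapper might look
like, in Python. It is not an existing tarsnap feature. It assumes archive
names end in a YYYY-MM-DD date, as in osaka-home-2011-05-18 above, and the
function names, the `keep_days` parameter and the dry-run default are all
hypothetical. The only tarsnap invocations used are `--list-archives` and
`-d -f <name>`.

```python
import re
import subprocess
from datetime import date, timedelta

# Assumed naming scheme: <prefix>-YYYY-MM-DD, e.g. osaka-home-2011-05-18.
DATE_SUFFIX = re.compile(
    r"^(?P<prefix>.+)-(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})$"
)


def archives_to_delete(names, prefix, keep_days, today):
    """Return the archives for `prefix` that fall outside the last
    `keep_days` days (counting `today` as the first of those days)."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in names:
        match = DATE_SUFFIX.match(name)
        if not match or match.group("prefix") != prefix:
            continue  # leave anything that does not fit the scheme alone
        try:
            stamp = date(int(match["y"]), int(match["m"]), int(match["d"]))
        except ValueError:
            continue  # e.g. a month of 13: not a real date, so skip it
        if stamp <= cutoff:
            stale.append(name)
    return sorted(stale)


def prune(prefix, keep_days, dry_run=True):
    """List archives with tarsnap and delete the stale ones, one at a time."""
    listing = subprocess.run(
        ["tarsnap", "--list-archives"],
        check=True, capture_output=True, text=True,
    ).stdout.splitlines()
    for name in archives_to_delete(listing, prefix, keep_days, date.today()):
        if dry_run:
            print("would delete", name)
        else:
            subprocess.run(["tarsnap", "-d", "-f", name], check=True)
```

Run daily after the backup, something like `prune("osaka-home", 30)` would
not make any single delete faster, but it would spread the deletes out to
about one a day instead of leaving a year's worth to clear in one go.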
Michael