Re: Delete archive time
On 4/18/12 3:13 PM, Michael Stevens wrote:
> On Wed, Apr 18, 2012 at 06:33:04PM +0100, Michael Stevens wrote:
>> On Wed, Apr 18, 2012 at 10:09:48AM -0700, Colin Percival wrote:
>>> [picking a random email to reply to...]
>>>
>>> On 04/18/12 06:34, Michael Stevens wrote:
>>>> I'm finding deleting archives is very slow and quite bandwidth intensive
>>>> as well.
>>> The bandwidth used should be about 0.1% of the size of the archive you're
>>> deleting; are you seeing more than this?
>> It looks like I've used about 10-20 GB of bandwidth in the last 48 hours
>> deleting archives. I'm deleting ~100 archives with content mostly
>> duplicated between them.
>>
>>>> I've been trying to clear up old archives and deleting about a year's
>>>> worth of one a day - it's been going around 36 hours now and using a
>>>> fair bit of bandwidth.
>>> Can you send me your --print-stats output? It's possible that if Tarsnap's
>>> deduplication worked extremely well when it was creating archives then the
>>> 0.1% overhead is significant -- I hadn't worried about it because I figured
>>> it would always be much less than the time taken to originally create the
>>> archives.
>> On one machine (but not the one I'm referring to above):
>>                     Total size  Compressed size
>> All archives      603859837025     245148994034
>> (unique data)      24571415132      10465650048
>>
>> I'll see if I can get some more numbers later.
> The machine I was particularly complaining about has taken almost
> exactly 48 hours to delete around 300 archives (one failed due to
> network problems at my end).
>
> The output of the final delete:
>
>                     Total size  Compressed size
> All archives     1075430463240     689511951789
> (unique data)     104970009987      75556009572
> This archive      154419395356      99378386213
> Deleted data         341060679        234719159
>
> Going as far back as I have scrollback, for one of the earlier deletes:
>
>                     Total size  Compressed size
> All archives    29119731147593   18272658548233
> (unique data)     136239802444      91778391522
> This archive      148559529876      92986492326
> Deleted data         107609840         61792498
>
> I had one delete I wanted to do left over, which I've timed:
>
> root@osaka:~# time tarsnap -d -f osaka-home-2011-05-18
>                     Total size  Compressed size
> All archives      926836591828     596496483543
> (unique data)      96754663741      68973013309
> This archive      148593871412      93015468246
> Deleted data        8215346246       6582996263
>
> real 14m37.343s
> user 0m42.731s
> sys 0m4.344s
>
> The network connection is an ADSL line, roughly 10 Mbit/s down and 1 Mbit/s up.
>
> In my case, the reason I'm deleting so much is that I have no automated
> way of cleaning up old incremental backups, and they're cheap, so I've
> let them build up, and am finding cleaning things up is slower than I
> expected.
>
> I'm wondering if a cheap fix for this would be to provide some higher
> level operations (possibly as a wrapper), like "archive daily and keep
> last n days" (<insert attempt to get Colin to do all the work here>).
>
> Michael
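As a rough check on the figures quoted above, here is a minimal sketch of the bandwidth estimate. It assumes that Colin's "about 0.1%" applies to the "This archive" total size, and that all ~100 archives are roughly as large as the one in the timed delete. Neither assumption is confirmed in the thread, and the function name is made up for illustration.

```python
# Figures taken from the --print-stats output of the timed delete above.
THIS_ARCHIVE_BYTES = 148_593_871_412  # "This archive", total size
OVERHEAD_FRACTION = 0.001             # the "about 0.1%" estimate (assumed exact)
ARCHIVE_COUNT = 100                   # "~100 archives", assumed similar in size


def delete_bandwidth_bytes(archive_bytes, fraction=OVERHEAD_FRACTION):
    """Estimate the bytes transferred when deleting one archive."""
    return archive_bytes * fraction


per_archive = delete_bandwidth_bytes(THIS_ARCHIVE_BYTES)
total = per_archive * ARCHIVE_COUNT

print(f"per archive: {per_archive / 1e6:.0f} MB")
print(f"{ARCHIVE_COUNT} archives: {total / 1e9:.1f} GB")
```

Under those assumptions the estimate is roughly 15 GB for 100 archives, which would be consistent with the 10-20 GB reported rather than a sign of excess traffic.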
Hi Michael,
I have been using feather (https://github.com/danrue/feather), and it
has been working well as a replacement for rsnapshot. I also added it
to the FreeBSD ports tree and maintain it there:
http://www.freshports.org/sysutils/feather/
Hope that helps,
Greg Larkin
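
For anyone who wants to see the shape of the "keep last n days" wrapper Michael describes, a minimal sketch follows. It is not how feather works. It assumes archive names end in a -YYYY-MM-DD date, as in osaka-home-2011-05-18, and the function names (expired_archives, prune) are hypothetical. The only tarsnap invocations used are --list-archives and -d -f, and prune defaults to a dry run.

```python
import re
import subprocess
from datetime import date, timedelta

# Assumption: archive names end in -YYYY-MM-DD, e.g. osaka-home-2011-05-18.
DATE_SUFFIX = re.compile(r"-(\d{4})-(\d{2})-(\d{2})$")


def expired_archives(names, today, keep_days):
    """Return the names whose date suffix is more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        match = DATE_SUFFIX.search(name)
        if match is None:
            continue  # no recognisable date: leave the archive alone
        try:
            archived_on = date(*map(int, match.groups()))
        except ValueError:
            continue  # digits that do not form a real date: leave it alone
        if archived_on < cutoff:
            expired.append(name)
    return sorted(expired)


def prune(keep_days, dry_run=True):
    """List archives with tarsnap and delete the expired ones, one at a time."""
    listing = subprocess.run(
        ["tarsnap", "--list-archives"],
        check=True, capture_output=True, text=True,
    ).stdout
    for name in expired_archives(listing.splitlines(), date.today(), keep_days):
        if dry_run:
            print("would delete", name)
        else:
            subprocess.run(["tarsnap", "-d", "-f", name], check=True)
```

Calling prune(30) prints what would be removed; prune(30, dry_run=False) performs the deletes. Run nightly, this removes only a few archives at a time, so the slow bulk cleanup described above should not build up.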