Re: using fsck in a lowmem environment



On 11/18/11 10:06, Mark Smith wrote:
> I was hoping to have two separate environments: production and a
> secure off-site server, and "ne'er the twain shall meet". I put
> limited access keys (-r and -w only) in production, and then full
> access keys (-d and --nuke allowed) on the off-site server.
> [...]
> Am I missing anything here? Has anybody implemented something like
> this with tarsnap?

One option you might want to consider is having a passphrased delete
key on the production server; that way you'd only be exposed if the
production server was compromised while deletes were running.  This
is obviously not an ideal solution; I mention it just in
case your security requirements can tolerate it.
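
For illustration, the key split being discussed could be produced with
tarsnap-keymgmt roughly as follows.  This is a sketch only: the file names
and the helper function are hypothetical, and the commands are built as
argument lists rather than executed.  The -r, -w, -d and --nuke flags are
the ones mentioned above; --outkeyfile and --passphrased are
tarsnap-keymgmt options.

```python
def keymgmt_command(master_key, out_key, permissions, passphrased=False):
    """Build (not run) a tarsnap-keymgmt command deriving out_key from master_key."""
    cmd = ["tarsnap-keymgmt", "--outkeyfile", out_key]
    if passphrased:
        cmd.append("--passphrased")  # tarsnap-keymgmt then prompts for a passphrase
    cmd += list(permissions)
    cmd.append(master_key)
    return cmd


MASTER = "/root/tarsnap-master.key"  # hypothetical path: full key, kept off-site

# Production key: read and write only, usable unattended.
read_write = keymgmt_command(MASTER, "/root/tarsnap-rw.key", ["-r", "-w"])

# Optional passphrased delete key for production.  Including -r is an
# assumption here (that deleting also needs the read keys).
delete = keymgmt_command(
    MASTER, "/root/tarsnap-delete.key", ["-r", "-d"], passphrased=True
)

print(" ".join(read_write))
print(" ".join(delete))
```

The passphrased key is only usable while someone has typed the passphrase,
which is what limits the exposure to the window in which deletes run.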

>> Aside from that (semi-serious) answer, no.  I think it would be possible
>> to change how tarsnap does things in order to reduce the memory usage at
>> the expense of adding considerably more I/O, but I haven't investigated
>> this in detail and it would likely require significant work -- and most
>> people have more than enough RAM based on the amount of data they're
>> storing.
> 
> It looks like we may not be able to use tarsnap for our purposes,
> then. One of the main uses (so far) here at Bump is to back up the log
> files from our logs machine. The machine only has 4GB of RAM because
> it literally just collects log files. But there are ~15TB of data to
> be stored.
> 
> I've only backed up ~3TB at this point and tarsnap is starting to use
> ~2GB of RAM when it runs. Assuming a linear increase in usage, before
> I've even gotten 35% stored, tarsnap will no longer be able to run on
> this machine.
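
The extrapolation quoted above can be checked with a few lines of
arithmetic.  This is only a sketch: it takes the quoted assumption of a
strictly linear relationship between data stored and memory used at face
value, which is not a measured property of tarsnap, and the function names
are made up for illustration.

```python
def ram_needed_gb(stored_tb, observed_tb=3.0, observed_ram_gb=2.0):
    """Linear estimate of RAM use (GB) once stored_tb terabytes are stored."""
    return observed_ram_gb * stored_tb / observed_tb


def fraction_stored_at_limit(total_tb, ram_limit_gb,
                             observed_tb=3.0, observed_ram_gb=2.0):
    """Fraction of total_tb stored when the linear estimate reaches ram_limit_gb."""
    tb_at_limit = ram_limit_gb * observed_tb / observed_ram_gb
    return tb_at_limit / total_tb


print(ram_needed_gb(15.0))                  # estimate for the full 15 TB
print(fraction_stored_at_limit(15.0, 4.0))  # share stored when 4 GB is reached
```

Under that assumption the full 15 TB would need about 10 GB of RAM, and
the whole 4 GB would be consumed at about 40% stored; since the rest of
the system needs some of that RAM, the practical limit comes earlier.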

How well do those logs deduplicate?  In particular, do today's logs share
blocks with yesterday's/last week's/last month's logs?

My semi-serious "store less data" might actually be the answer you need,
subject to the qualification "with each set of keys".  You can register
multiple "machines" and use the keys on the same system as long as you
specify different cache directories for them; the memory requirement will
be determined by the amount of data stored *using the current set of keys*.
The price you pay for this is that each set of keys forms an independent
deduplication domain.
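
A minimal sketch of that arrangement, assuming memory use grows roughly
linearly with the data stored per key set (the 2 GB per 3 TB figure is
taken from the numbers quoted above).  The key file and cache directory
paths, the archive name and the helper names are hypothetical; --keyfile
and --cachedir are the tarsnap options that select them.  The code only
builds command lines, it does not run tarsnap.

```python
import math


def key_sets_needed(total_tb, ram_budget_gb,
                    observed_tb=3.0, observed_ram_gb=2.0):
    """Number of independent key sets needed so that each one stays within
    ram_budget_gb, under the linear assumption."""
    total_ram_gb = observed_ram_gb * total_tb / observed_tb
    return math.ceil(total_ram_gb / ram_budget_gb)


def tarsnap_create_command(index, archive_name, paths):
    """Build (not run) a tarsnap command that uses key set number `index`."""
    return [
        "tarsnap",
        "--keyfile", f"/root/tarsnap-keys/logs-{index}.key",  # hypothetical path
        "--cachedir", f"/var/cache/tarsnap/logs-{index}",     # one cache dir per key set
        "-c", "-f", archive_name,
    ] + list(paths)


print(key_sets_needed(15.0, 2.0))
print(" ".join(tarsnap_create_command(1, "logs-2011-11-18", ["/var/log/app"])))
```

Each key set would come from registering a separate "machine", and data
is only deduplicated against other data stored with the same key set, so
the split is best made along lines where little sharing is expected.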

> Are there any tips for reducing memory usage?

If you're not already doing so, try the --lowmem or --verylowmem options.

> Should we
> abandon using tarsnap for doing large-scale infrastructure backups?

I hope not!  15 TB is a nice amount of data. :-)

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid