Re: using fsck in a lowmem environment
> Hmm, that's probably not a good plan. Tarsnap needs an up to date cache
> directory in order to create or delete archives, so if you delete an
> archive on said small VM the system which is creating archives won't be
> able to create any more backups until its cache directory is brought back
> into sync with the Tarsnap server -- i.e., until either it runs fsck or
> you copy the cache directory across to it.
Okay -- this invalidates my entire plan, then. :-(
I was hoping to have two separate environments: production and a
secure off-site server, and "ne'er the twain shall meet". I put
limited access keys (-r and -w only) in production, and then full
access keys (-d and --nuke allowed) on the off-site server.
The goal was that if the production environment is penetrated and
someone wipes everything out, they can't then go and wipe out our
backups as well, even if they have the production tarsnap keys. It
can't be a total-loss situation (assuming that the off-site server is
secure, of course).
Unfortunately, it seems like this isn't really feasible... but it's so
close! Tantalizingly close!
Am I missing anything here? Has anybody implemented something like
this with tarsnap?
> Yes, fsck needs to generate a list of all the blocks you have stored and
> how many times each of them is referenced. Of course, you'd need the
> same amount of memory in order to create or delete any archives (since
> those will need to update those block reference counts).
Gotcha, thanks.
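
As a toy illustration of why the memory cost scales with the number of
stored blocks, the reference counting described in the quoted reply can
be sketched as below. This is not tarsnap's actual code or data layout;
the function names and the string block IDs are hypothetical, and the
only point is that one counter per unique block has to be held somewhere
while archives are created or deleted.

```python
from collections import Counter

def build_refcounts(archives):
    """Count how many times each block is referenced across all archives.

    `archives` maps an archive name to the list of block IDs it uses
    (a hypothetical, simplified model).
    """
    refcounts = Counter()
    for blocks in archives.values():
        refcounts.update(blocks)
    return refcounts

def delete_archive(archives, refcounts, name):
    """Remove one archive and return the blocks nothing references any more."""
    freed = []
    for block in archives.pop(name):
        refcounts[block] -= 1
        if refcounts[block] == 0:
            del refcounts[block]
            freed.append(block)
    return freed

# Two archives sharing blocks "b" and "c".
archives = {"monday": ["a", "b", "c"], "tuesday": ["b", "c", "d"]}
refcounts = build_refcounts(archives)
freed = delete_archive(archives, refcounts, "monday")
```

In this model, deleting "monday" frees only block "a", because "b" and
"c" are still referenced by "tuesday". The table has one entry per
unique block, so it grows with the amount of deduplicated data stored.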
> Sure, store less data. ;-)
Har har. :-)
> Aside from that (semi-serious) answer, no. I think it would be possible
> to change how tarsnap does things in order to reduce the memory usage at
> the expense of adding considerably more I/O, but I haven't investigated
> this in detail and it would likely require significant work -- and most
> people have more than enough RAM based on the amount of data they're
> storing.
It looks like we may not be able to use tarsnap for our purposes,
then. One of the main uses (so far) here at Bump is to back up the log
files from our logs machine. The machine only has 4GB of RAM because
it literally just collects log files. But there are ~15TB of data to
be stored.
I've only backed up ~3TB at this point and tarsnap is starting to use
~2GB of RAM when it runs. Assuming a linear increase in usage, before
I've even gotten 35% stored, tarsnap will no longer be able to run on
this machine.
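
The linear extrapolation above can be checked with a short calculation.
This is a rough sketch using only the approximate figures from this
message (~3TB stored, ~2GB used, 15TB total, 4GB of RAM); the linear
model is itself an assumption, and the variable and function names are
made up for illustration.

```python
# Approximate figures taken from the message; all are rough estimates.
STORED_TB = 3.0        # data backed up so far
RAM_USED_GB = 2.0      # tarsnap memory use at that point
TOTAL_TB = 15.0        # total data to be stored
MACHINE_RAM_GB = 4.0   # physical RAM on the logs machine

def projected_ram_gb(stored_tb):
    """Memory tarsnap would need at `stored_tb`, assuming linear growth."""
    return stored_tb * RAM_USED_GB / STORED_TB

def max_storable_tb(available_ram_gb):
    """Largest data set that fits in `available_ram_gb` under the same model."""
    return available_ram_gb * STORED_TB / RAM_USED_GB

ram_for_everything = projected_ram_gb(TOTAL_TB)
ceiling_tb = max_storable_tb(MACHINE_RAM_GB)
ceiling_fraction = ceiling_tb / TOTAL_TB
ram_at_35_percent = projected_ram_gb(0.35 * TOTAL_TB)

print(f"RAM needed for all {TOTAL_TB:.0f}TB: {ram_for_everything:.1f}GB")
print(f"Ceiling with {MACHINE_RAM_GB:.0f}GB RAM: {ceiling_tb:.1f}TB "
      f"({ceiling_fraction:.0%} of the total)")
print(f"RAM needed at 35% stored: {ram_at_35_percent:.1f}GB")
```

Under this model the full 15TB would need about 10GB, and the hard
ceiling with all 4GB available is about 6TB, or 40% of the data. The 35%
mark corresponds to about 3.5GB, which is roughly where the machine
would run out once the OS and other processes take their share.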
Are there any tips for reducing cache directory usage? Should we
abandon using tarsnap for doing large-scale infrastructure backups?
--
Mark Smith // Operations Lead
mark@bumptechnologies.com