Re: Rotating back-ups, removed files, etc.



Hi Graham,

Again, apologies for such a tardy reply. My hobby of the month is
necroposting, but this will be the last one.

I am now clear on where my earlier understanding was failing, and have
been for a few months now, largely thanks to (as you pointed out) Jamie.
Your "wordification" analogy is fine, but for me I just needed to
understand that Tarsnap is keeping track of the changed blocks/chunks
and knows how to put them all back together as they were on any given
date.
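
For anyone who finds this thread later, here is roughly the mental model
I ended up with, as a toy sketch. It assumes tiny fixed-size chunks and
an in-memory store; the class and method names are made up for
illustration, and Tarsnap's real chunking and storage format differ.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, purely for illustration


class ToyStore:
    """Hypothetical deduplicating store; NOT Tarsnap's real format."""

    def __init__(self):
        self.chunks = {}    # chunk hash -> chunk bytes (each stored once)
        self.refs = {}      # chunk hash -> number of references from archives
        self.archives = {}  # archive name -> {path: [chunk hashes]}

    def create(self, name, files):
        """List every file in the archive; store only chunks not seen before."""
        uploaded = 0
        listing = {}
        for path, data in files.items():
            hashes = []
            for i in range(0, len(data), CHUNK_SIZE):
                chunk = data[i:i + CHUNK_SIZE]
                h = hashlib.sha256(chunk).hexdigest()
                if h not in self.chunks:
                    self.chunks[h] = chunk
                    uploaded += len(chunk)
                self.refs[h] = self.refs.get(h, 0) + 1
                hashes.append(h)
            listing[path] = hashes
        self.archives[name] = listing
        return uploaded  # bytes of genuinely new data

    def extract(self, name, path):
        """Reassemble one file from the chunks that the archive references."""
        return b"".join(self.chunks[h] for h in self.archives[name][path])

    def delete(self, name):
        """Drop an archive; free only chunks no other archive still uses."""
        for hashes in self.archives.pop(name).values():
            for h in hashes:
                self.refs[h] -= 1
                if self.refs[h] == 0:
                    del self.refs[h]
                    del self.chunks[h]


store = ToyStore()
new1 = store.create("day1", {"FILE.EXT": b"unchanged data",
                             "notes.txt": b"first draft"})
new2 = store.create("day2", {"FILE.EXT": b"unchanged data",
                             "notes.txt": b"first draft, edited"})
new3 = store.create("day3", {"FILE.EXT": b"unchanged data",
                             "notes.txt": b"final text"})

# Deleting older archives does not disturb the newest one.
store.delete("day1")
store.delete("day2")
restored = store.extract("day3", "FILE.EXT")
```

In this sketch the unchanged FILE.EXT is listed in every archive even
though its chunks are only stored once, and it can still be restored from
"day3" after the two older archives are deleted.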

Thanks again.


Craig



On Sat, 2018-11-24 at 22:48 -0800, Graham Percival wrote:
> Hi Craig,
> 
> If you'd like a better understanding of deduplication, I very much recommend:
> http://www.tarsnap.com/deduplication-explanation.html
> (new page from 6 weeks ago, so regular tarsnap users probably haven't seen it)
> 
> I made up an example of "wordification" (where we make multiple backups of a
> short phrase that changes slightly) as an analogy to backing up a hard drive.
> The whole point of that page is to clarify questions like yours, so please let
> me know if anything isn't clear on that page!  :)
> 
> 
> One specific thing to correct:
> > Now, even though (technically speaking) FILE.EXT was not in the newest
> > archive because it has not changed since the initial back-up,
> 
> FILE.EXT is definitely part of that archive!  You can test this for yourself by
> listing all of its filenames with:
> 
>     tarsnap -t -f NEWEST-ARCHIVE
> 
> I very much agree with Jamie's email where he says "it's far more simple than
> you think".  Let's pretend that you make a daily backup of one directory.  Then
> each backup is "what did that directory contain on that day".
> 
> - Want to see it from 3 days ago?  Restore that archive.
> - Want to see it from yesterday?  Restore that archive.
> - Do you care about the contents from 2 days ago?  No?  Ok, delete that
>   archive.
> 
> Your ability to see the version from 1 or 3 days ago is completely unaffected
> by your deleting the 2-day-old version.
> 
> Cheers,
> - Graham
> 
> 
> On Sat, Nov 24, 2018 at 03:57:03AM -0800, Craig Hartnett wrote:
> > Hi Niels,
> > 
> > Thanks for your reply. Yes, one of my questions -- about intentionally
> > deleting a file and then wanting it gone from all back-ups too -- was
> > hypothetical, but the knowledge of how to accomplish that (if necessary)
> > is (of course) useful, both to more fully understand Tarsnap and (in
> > case the need ever arises) to actually do it.
> > 
> > And since this is just a laptop, RAID is not a realistic option. An
> > interesting one, yes, but not practical.
> > 
> > Further to my original email, in another thread I tried to restore a
> > file that has not changed since I did my initial back-up, but I
> > specified the most recent archive:
> > 
> >         tarsnap -x -f NEWEST-ARCHIVE media/USER/PATH/DIRECTORY/FILE.EXT
> > 
> > Now, even though (technically speaking) FILE.EXT was not in the newest
> > archive because it has not changed since the initial back-up, the
> > restore command still worked. I assume Tarsnap is just smart enough to
> > know that I'm stupid and specified the "wrong" archive, and got the file
> > for me from wherever it was residing. But I assume, going back to one of
> > my original questions, that if I had deleted the initial archive, that
> > file would not have been there for Tarsnap to find until after my next
> > scheduled back-up.
> > 
> > 
> > Craig
> > 
> > 
> > 
> > On Sat, 2018-11-24 at 08:05 +0100, Niels Kobschaetzki wrote:
> > > That is one of the ideas behind backups: they protect you from accidentally deleting files. (They also protect you from hardware failure, but redundancy and RAID are a better choice there because the device can keep running during the “outage”.)
> > > Thus, if you truly want a file gone, you also need to delete it from the backups. Most backup systems, in my experience, only know the concept of volumes, which then have to be deleted. A file is therefore only gone once every volume that contains it is gone, so you either have to wait until the file is rotated away or destroy all of those volumes.
> > > Rotation has the added benefit of saving space, which in the end means saving money (this holds for any backup system, because otherwise you need more drives, tapes, or whatever over time).
> > > 
> > > Niels
> > > 
> > > 
> > > > On 24. Nov 2018, at 04:04, Craig Hartnett <craig@1811.spamslip.com> wrote:
> > > > 
> > > > Hi again,
> > > > 
> > > > OK, so I did read that I'm supposed to forget everything I know about
> > > > back-ups, but frankly that wasn't much. :) Not that I know nothing, but
> > > > it hasn't been something I've spent a *lot* of time thinking about.
> > > > 
> > > > But as I think about Tarsnap, deleted files, rotating/deleting archives,
> > > > daily storage charges (increasing, of course, as the amount of data
> > > > stored slowly increases), etc., I start wondering about what happens to
> > > > files I intentionally delete from my hard drive. If I understand Tarsnap
> > > > correctly, a file that I backed up in my initial back-up and that hasn't
> > > > since changed only exists in that initial back-up archive because (a) it
> > > > hasn't changed so there has been no need to re-upload any part of it and
> > > > (b) archives are immutable. If I delete that initial archive I assume (I
> > > > could be wrong, so this is part of my series of questions) that Tarsnap
> > > > will realise that and back up those files again. Am I right?
> > > > 
> > > > So if I delete my initial archive today, Tarsnap will realise that it
> > > > has to upload pretty much everything -- not everything, but almost
> > > > everything -- again, right?
> > > > 
> > > > And what if I delete a file -- any file -- on my hard drive that has
> > > > been backed up in the past? Of course Tarsnap won't upload a null file,
> > > > but does that file continue to exist in the archives unless or until I
> > > > delete the last archive that contains it? In other words, it's *my*
> > > > responsibility to curate my archives, right? (I'm quite happy to curate
> > > > my own stuff. Just want to make sure.)
> > > > 
> > > > And what if I want to delete a file from my hard drive *and* my
> > > > back-ups? Since the archives are immutable, and this file was in my
> > > > initial back-up, am I right that there is no way to delete that single
> > > > file from the back-up archives without deleting the whole archive, and
> > > > consequently re-uploading most of the original archive again?
> > > > 
> > > > Which leads me to the conclusion that I should pick a time frame -- say,
> > > > 90 days -- or come up with some traditional, staggered rotation system,
> > > > and start deleting archives older than that *except* the initial
> > > > archive, right?
> > > > 
> > > > Or am I completely out to lunch here? :)
> > > > 
> > > > Thanks for any light you can shed on this, via links to documentation
> > > > that covers it of course if I have missed it.
> > > > 
> > > > 
> > > > Craig