Re: retry/append/restart restores ? Re: Speeding up slooooowwww extractions
> On 27 May 2021, at 14:53 , Dave Cottlehuber <dch@skunkwerks.at> wrote:
>
> On Thu, 27 May 2021, at 11:04, hvjunk wrote:
>> SO, my next issue that pops up is the ability to restart/append a file
>> busy being extracted when the tarsnap process gets killed/etc. during
>> the restore.
>> I don’t see anything in the manual page so wonder where that is
>> documented if at all?
>
> does --retry-forever help?
the case/issue is that the VM/instance got restarted, i.e. tarsnap needs to restart,
not just recover from a connection error
>
>> (and yes, I’ve started an instance in Canada to be closer to the
>> tarsnap servers in the USofA for the restores; it does seem to be about double the
>> speed, but still <50% done after 24 hours for a 100GB file extraction ;( )
>
> https://www.tarsnap.com/improve-speed.html
>
> The only sensible option for performant tarsnap restores of large files is:
>
> - splitting the archive *before* it goes to tarsnap
yeah, that is the challenge ;( (a rough sketch of what I mean follows after this list)
> - parallelised recovery
as mentioned earlier: it’s currently a single big file
> - into AWS server running in US S3 hopefully in the same network area
$$$$$ cost suddenly ;(
> - then move to the expected location
adding extra $$$ ;(
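As promised above, a minimal sketch of the split-before-upload idea, assuming GNU
coreutils split; the chunk size, file name and archive prefix are illustrative only,
not from any real setup:

  # split the single large file into 1 GiB pieces before backup
  split -b 1G bigfile.img bigfile.img.part.

  # one tarsnap archive per piece, so each piece can later be restored on its own
  for part in bigfile.img.part.*; do
      tarsnap -c -f "backup-${part}" "${part}"
  done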
> I hacked a script here https://git.io/vdrbG "works on my machine" and
> makes a number of assumptions including path length that may bite you.
> It won't help you restore a single large file, but it does help for
> many large-ish files.
Yes, that’s why it won’t work in my (current) case ;(
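For the many-archives case, a parallel restore along the lines Dave describes might
look something like this; the archive prefix and degree of parallelism are my own
assumptions, not taken from his script:

  # pull each per-piece archive in parallel, four at a time
  tarsnap --list-archives \
      | grep '^backup-bigfile.img.part.' \
      | xargs -P 4 -I{} tarsnap -x -f {}

  # stitch the pieces back together (split's default suffixes sort lexicographically)
  cat bigfile.img.part.* > bigfile.img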
> The moment we introduce pipes and splitting in shell scripts, is the
> moment when, years later, we find that the split tool truncates at 64-bit
> size, and data has been irrecoverably lost. tarsnap really should be able
> to handle this scenario natively and sensibly.
;(
Yes, that is exactly what I’m trying to avoid
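One way to hedge against that silent-truncation worry is to checksum the original
before splitting and verify after reassembly; a rough sketch, assuming GNU sha256sum:

  # record a checksum of the original before splitting
  sha256sum bigfile.img > bigfile.img.sha256

  # ... split, back up, restore and reassemble as sketched above ...

  # refuse to trust the reassembled file unless the checksum matches
  sha256sum -c bigfile.img.sha256 || echo "reassembled file does NOT match original" >&2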
> In all other respects, it’s
> my preferred choice for backup & recovery of Stuff That Matters.
Yes, so the current question is whether to spend the CAPEX now on writing a solution
that splits the big files (*reliably*), or to spend more OPEX on a different solution ;(