Re: Speeding up slooooowwww extractions
> On 26 May 2021, at 22:10 , james young <pronoiac@gmail.com> wrote:
>
> The tarsnap page on speed - https://www.tarsnap.com/improve-speed.html - suggests making parallel requests, and indirectly points to redsnapper - https://github.com/directededge/redsnapper - as a tool to do this.
Yeah, the kicker about redsnapper:
<quote>
Simple tool to run parallel tarsnap clients to do faster extractions of archives with lots of files.
</quote>
The emphasis there: *LOTS* of files
I've already done something similar (tarsnap -tf archive | grep sql | xargs -P10 tarsnap -xf archive ), but the holdup is a *SINGLE* *FILE* sized 100GB ;( (Actually I have several of them, created by autopostgresqlbackup: one for each day of the week, one for each of the last 4 weeks, and then the next year’s monthlies.)
It’s a single day’s file that I need to recover, so redsnapper doesn’t help there… as it doesn’t/can’t extract parts of a file in an archive ;(
--fast-read doesn’t address the issue either… but yeah, it looks like latency is the killer for my use case ;(
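As a rough sanity check on the latency point, here is a minimal back-of-envelope sketch in Python. It takes the two complete request/response cycles visible in the strace quoted further down (from one sendto to the next) and extrapolates to 100GB. The timestamps and byte counts are copied from that trace; the function names are made up for illustration. It assumes those two cycles are typical and treats bytes on the wire as if they were file bytes, which they are not exactly, so take the result as an order of magnitude only.

```python
# Back-of-envelope estimate from two request/response cycles in the strace
# quoted below. Assumption: these two cycles are representative, and bytes
# received on the socket roughly correspond to bytes of the restored file.

# (seconds field of one sendto, seconds field of the next sendto,
#  return values of the recvfrom calls in between)
cycles = [
    (22.035727, 22.285378, [69, 24547, 1389]),
    (22.285378, 22.538551, [69, 75227, 37648, 22688]),
]


def throughput(cycles):
    """Average bytes per second over the given cycles."""
    total_bytes = sum(sum(chunks) for _, _, chunks in cycles)
    total_time = sum(end - start for start, end, _ in cycles)
    return total_bytes / total_time


def days_to_fetch(size_bytes, bytes_per_second):
    """Days needed to receive size_bytes at a constant rate."""
    return size_bytes / bytes_per_second / 86400


rate = throughput(cycles)
print(f"observed rate: {rate / 1000:.0f} kB/s")
print(f"100 GB at that rate: {days_to_fetch(100e9, rate):.1f} days")
```

With these numbers the rate works out to a bit over 300 kB/s, which puts a 100GB restore at somewhere between three and four days if nothing else changes. Each cycle is roughly 250ms long regardless of how much data comes back, which is why larger or pipelined requests would matter more than raw bandwidth here.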
> -James
>
> Sent from my iPad
>
>> On May 26, 2021, at 6:54 AM, hvjunk <hvjunk@gmail.com> wrote:
>>
>>
>>
>>> On 26 May 2021, at 15:47 , Mauro Ciancio <mauro.ciancio@acadeu.com> wrote:
>>>
>>> Hi there!
>>> Can you do the recovery in a VPS next to tarsnap location and then copy the file to the final destination?
>>
>> For that, you’re asking me to create an account with another provider in the USA?
>>
>> yeah, but that is really not that beneficial ;(
>>
>>>
>>> On Wed, May 26, 2021 at 10:39 AM hvjunk <hvjunk@gmail.com> wrote:
>>>
>>>
>>>> On 26 May 2021, at 11:32 , hvjunk <hvjunk@gmail.com> wrote:
>>>>
>>>> Good day,
>>>>
>>>> Is there any equivalent of the “aggressive-networking” setting for extractions?
>>>>
>>>> From an OVH.com datacentre in France, it’s excruciatingly slow to extract a 100GB file ;(
>>>>
>>>> Hendrik
>>>
>>> So as I’ve been watching this paint dry, I see the biggest culprit is the latency ;(
>>>
>>> Looking at a snippet of strace output for the tarsnap process (in a Linux container), the time difference between the sendto and the recvfrom is the “expected” ~200ms. That raises the question(s) of pipelining and bigger batch requests, because this test makes me seriously consider dropping tarsnap for my European and South African servers: the recoveries are just hamstrung by the latency ;(
>>>
>>> So before I go for other, more drastic options (and yes, I’m reluctant, as tarsnap does a great job and I’m impressed by it from a backup/etc. perspective): what other options are possible? I couldn’t find anything for restoring a 100GB file from a backup in a way that will finish before Friday next week.
>>>
>>> 13:28:22.030563 recvfrom(3, "PA\30_e\224Y\317\177\205Ml|\343E\210\234\21F\22\253\374\244\206:4L\221\24]B\\"..., 49629, 0, NULL, NULL) = 49629
>>> 13:28:22.033582 select(4, [], [], NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
>>> 13:28:22.035132 setsockopt(3, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>> 13:28:22.035360 setsockopt(3, SOL_TCP, TCP_CORK, [1], 4) = 0
>>> 13:28:22.035517 select(4, [], [3], NULL, {tv_sec=299, tv_usec=999999}) = 1 (out [3], left {tv_sec=299, tv_usec=999997})
>>> 13:28:22.035727 sendto(3, "\222\7U\3520\31\211V\365\275\243\247<\343\262\362J*\234\364iE\222\342\326\351\267\236\207/__"..., 146, MSG_NOSIGNAL, NULL, 0) = 146
>>> 13:28:22.035950 setsockopt(3, SOL_TCP, TCP_CORK, [0], 4) = 0
>>> 13:28:22.036123 setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>>> 13:28:22.036174 select(4, [3], [], NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
>>> 13:28:22.036220 select(4, [3], [], NULL, {tv_sec=59, tv_usec=999952}) = 1 (in [3], left {tv_sec=59, tv_usec=836643})
>>> 13:28:22.199620 recvfrom(3, "\313KW\231:\235\250\214\204\270\0\5\360fA33\f\212\242\210^\332\37\200\270c\264\306\25\330\256"..., 69, 0, NULL, NULL) = 69
>>> 13:28:22.199735 select(4, [3], [], NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
>>> 13:28:22.199791 recvfrom(3, "mF\342\236\356\250\251\7<\312G\277\271\t=\3\267,z\355\264O\230\256u\265\37L^+\244n"..., 25936, 0, NULL, NULL) = 24547
>>> 13:28:22.199839 select(4, [3], [], NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
>>> 13:28:22.199869 select(4, [3], [], NULL, {tv_sec=59, tv_usec=999969}) = 1 (in [3], left {tv_sec=59, tv_usec=916402})
>>> 13:28:22.283654 recvfrom(3, "0\374\17!\347\263U\377\16\376\2\200I5O\245\374\364\341|\270\367\277\"\374\257\270\25\341\205\3179"..., 1389, 0, NULL, NULL) = 1389
>>> 13:28:22.284533 select(4, [], [], NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
>>> 13:28:22.284916 setsockopt(3, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>> 13:28:22.285129 setsockopt(3, SOL_TCP, TCP_CORK, [1], 4) = 0
>>> 13:28:22.285319 select(4, [], [3], NULL, {tv_sec=299, tv_usec=999999}) = 1 (out [3], left {tv_sec=299, tv_usec=999997})
>>> 13:28:22.285378 sendto(3, "\226/\366\247Xk^\222U\375\t\364\27\275<lfI]h\266R%\324\377\375\v\350\220\351\324R"..., 146, MSG_NOSIGNAL, NULL, 0) = 146
>>> 13:28:22.285425 setsockopt(3, SOL_TCP, TCP_CORK, [0], 4) = 0
>>> 13:28:22.285520 setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>>> 13:28:22.285598 select(4, [3], [], NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
>>> 13:28:22.285637 select(4, [3], [], NULL, {tv_sec=59, tv_usec=999960}) = 1 (in [3], left {tv_sec=59, tv_usec=846082})
>>> 13:28:22.450937 recvfrom(3, "\336}d\266P\357\210W\374?\334n\2175?\2522\260\354\33L\363\266\262r\324\323\332#\353T\315"..., 69, 0, NULL, NULL) = 69
>>> 13:28:22.451124 select(4, [3], [], NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
>>> 13:28:22.451313 recvfrom(3, "tcEH\315\351L\346\207}\261\226\2(Z\300\330nr\212'\262\250\316\274\305\243\225\211\232x\25"..., 135563, 0, NULL, NULL) = 75227
>>> 13:28:22.451427 select(4, [3], [], NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
>>> 13:28:22.451596 select(4, [3], [], NULL, {tv_sec=59, tv_usec=999832}) = 1 (in [3], left {tv_sec=59, tv_usec=916654})
>>> 13:28:22.534931 recvfrom(3, "\30\10.\325\0372\23\6\250\325\251\366\272)\245a\317\243Xz;\213\305\26Y\326U\306g\367\\\16"..., 60336, 0, NULL, NULL) = 37648
>>> 13:28:22.535034 select(4, [3], [], NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
>>> 13:28:22.535086 recvfrom(3, "{\300\271\2309 \2634nY\25\323\352^x\323\220l\204\205\242\5i\4\215=\32\324\241YuN"..., 22688, 0, NULL, NULL) = 22688
>>> 13:28:22.537381 select(4, [], [], NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
>>> 13:28:22.538469 setsockopt(3, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>> 13:28:22.538501 setsockopt(3, SOL_TCP, TCP_CORK, [1], 4) = 0
>>> 13:28:22.538524 select(4, [], [3], NULL, {tv_sec=299, tv_usec=999999}) = 1 (out [3], left {tv_sec=299, tv_usec=999998})
>>> 13:28:22.538551 sendto(3, "g\222\202^+\235jJ\t\0075\237\26\254\17\260\347\233\340\3738/K\357\212\347\246\\5\344_\314"..., 146, MSG_NOSIGNAL, NULL, 0) = 146
>>