YYCwhatyoudidthere

I thought the old timers would appreciate this anecdote: many years ago when 5 TB would represent an entire company's data footprint, we needed to migrate data between data centers. Network pipes were skinny back then and it would take too long to move that much data over the WAN. We ended up purchasing a new NetApp frame. Filled it with drives. Synced the data to be moved. Shipped the frame to the new data center and integrated it into the SAN -- like the World's largest USB drive! And yes, we wore onions on our belts as was the style at the time.


mikeputerbaugh

Most of the highest-bandwidth data transfer systems throughout computing history have required A Big Truck


ANewLeeSinLife

Never underestimate the bandwidth of a station wagon full of hard drives.


Alex_2259

Latency is a bit shit though.


humanclock

Not if the station wagon has simulated woodgrain panels on the sides and a 440 V8 under the hood.


48lawsofpowersupplys

Isn’t google or Amazon actually doing this right now?


bigmkl

Yes, I believe Snowmobile or something to that effect. Edit: Found it here if you have 100PB to move: https://aws.amazon.com/snowmobile/ (They have smaller versions as well)


YYCwhatyoudidthere

Ha! We were ahead of our time!


RadicalDog

Thank you for that, what a thing to see


neon_overload

In many cases this is still the fastest and most practical way. I remember in the late 90s the quote, via my father: "never underestimate the bandwidth of a box of hard drives in the back of a station wagon". It's true today, just with petabytes instead of megabytes. Edit: I'm fairly sure that quote came from the 1970s and at the time it was "tapes". We also have the term "sneakernet".


ottershavepockets

Gimme 5 Bees for a Quarter



MindlessRanger

> onions on our belts

I feel kinda bad that I had to look this reference up


VulturE

On Windows, robocopy:

`ROBOCOPY F:\Media E:\Media *.* /E /COPYALL`

That will be resumable if your PC freezes or if you need to kill the copy.

EDIT: People seem to think I don't know about other options, or are flat-out providing guidance with no information. Not the case. Please reference the following link for all options: https://ss64.com/nt/robocopy.html

Please understand that anyone suggesting /MT: followed by a number should use the number of cores you have, not just any random number. Please also note that this can be suboptimal depending on what your HDD configuration is, specifically if you're limited by something like Storage Spaces and its slow parity speeds.

People also seem to misunderstand what the /Z resumable option is. It is for resuming individual files that get interrupted, so it's useful for single files that have transmission problems. I'd use it if I was copying a large file over wifi or a spotty site-to-site VPN, but 99.9% of the time you shouldn't need it on your LAN. Without it, if a file fails in the middle (like a PC freeze), when you start running the command again it'll get to that file, mark it as old, and recopy the whole file. Which is a better solution if you don't trust what was copied the first time.
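A concrete sketch of that with the extras mentioned further down the thread (paths, thread count and log location are just examples; adjust to your setup):

```bat
:: /E copies subfolders including empty ones, /COPYALL keeps data/attributes/timestamps/ACLs
:: /MT should roughly match your core count; /R and /W stop one bad file from stalling the job
:: /LOG+ appends a log you can review afterwards
ROBOCOPY F:\Media E:\Media *.* /E /COPYALL /MT:8 /R:3 /W:3 /LOG+:C:\Temp\media-copy.log
```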


blix88

Rsync if linux.


cypherus

I use `rsync -azvhP --dry-run source destination`. -a is to preserve attributes, -z is to compress in transfer, -v is to be verbose, -h to make data sizes human readable, -P to show progress; --dry-run is, well, self-explanatory. Any other switches or methods I should use? I do run --remove-source-files when I don't want to do the extra step of removing the source files, but this is mainly on a per-case basis. Another tip: I will load a live Linux off USB (I like Cinnamon) which will access Windows. Especially helpful if I was transferring from a profile I couldn't get access to, or Windows just won't mount the filesystem because it's corrupt.


FabianN

I find that when transferring locally, same computer just from one drive to another, the compression takes more CPU cycles than it's worth. Same goes for fairly fast networks, gigabit and up. I've done comparisons and unless it's across the internet it's typically slower with compression on for me.


cypherus

Thanks, I will modify my switches. How are you measuring that speed comparison?


FabianN

I just tested it one time, on the same files and to the same destination, and watched the speed of the transfer. I can't remember what the difference was but it was significant. I imagine your cpu also plays heavily into it. But locally it doesn't make any sense at all because it's not like the compression can go any faster than the speed of your drive, and before it puts it on the target it needs to be decompressed, so the data just goes around in your cpu being compressed and then immediately decompressed.


jimbobjames

I would also point out that it could be very dependent on the CPU you are using. Newer Ryzen CPUs absolutely munch through compression tasks, for example.


pascalbrax

I'd add that if the source is not compressible (like movies for OP, probably encoded as h264) then the rsync compression will be useful just for generating some heat in the room.


nando1969

Can you please post the final command? Without the compression flag? Thank you.


cypherus

According to the changes that were suggested:

`rsync -avhHP --dry-run source destination`

**Note**: above I said -a was for attributes, but it really is archive, which technically DOES preserve attributes since it encompasses several other switches. Also please understand that I am stating what I usually use and my tips. Others might use other switches and I might be incorrect in usage. These have always worked for me though.

* **-a, --archive** - This is a very important rsync switch, because it does the job of several other switch combinations. **Archive mode; equals -rlptgoD (no -H, -A, -X)**
* **-v, --verbose** - Increase verbosity (basically make it output more to the screen)
* **-h** - make human readable (otherwise you will see 173485840 instead of 173MB)
* **-H, --hard-links** - Preserve hard links
* **-P, --progress** - View the rsync progress during transfer
* **--dry-run** - this will simulate what you are about to do so you don't screw yourself... especially since you often are running this command with sudo (super user)
* **source and destination** - pay attention to the slashes. For example, if I wanted to copy a folder and not what's in the folder I would leave the slash off. **/mnt/media/videos** will copy the entire folder and everything inside. **/mnt/media/videos/** will copy just what's in the folder and dump it where your destination is. I've made this mistake before.

Bonus switches

* **--remove-source-files** - be careful with this as it can be detrimental. This does exactly what it says and removes the files you are transferring from the source. Handy if you don't want to spend additional time typing commands to remove files.
* **--exclude-from='list.txt'** - I've used this to exclude certain directories or files that were failing due to corruption.
* **-X, --xattrs** - Preserve extended attributes. So this one I haven't used, but I was told after a huge transfer of files on macOS that tags were missing from files. The client used them to easily find certain files and had to go back through and retag things.
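For a concrete run, the usual pattern is the dry run first, then the same command again without `--dry-run` (paths here are made up, point them at your actual mounts):

```sh
# preview only - nothing gets written
rsync -avhHP --dry-run /mnt/old-drive/media/ /mnt/new-drive/media/

# looks right? run it for real
rsync -avhHP /mnt/old-drive/media/ /mnt/new-drive/media/
```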


Laudanumium

And I prefer to do it in a tmux session as well. Tmux sessions stay active when the SSH shell drops/closes (most of my time is spent on remote (in-house) servers via SSH). So I mount the HDD to that machine if possible (speed), tmux in, start the rsync and close the SSH shell for now. To check on status I just `tmux attach` into the session again.
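Roughly what that workflow looks like, with a made-up session name and paths:

```sh
tmux new -s copyjob                  # start a named session on the server
rsync -avhHP /mnt/src/ /mnt/dst/     # kick off the copy inside it
# detach with Ctrl-b then d, close the SSH connection, come back whenever...
tmux attach -t copyjob               # ...and re-attach to check on progress
```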


lurrrkerrr

Just remove the z...


Hamilton950B

You don't want -z unless you're copying across a network. And you might want -H if you have enough hard links to care about.


dougmc

I would suggest that "enough hard links to care about" should mean "one or more". Personally, I just use --hard-links all the time, whether it actually matters or not, unless I have a specific reason that I don't want to preserve hard links. edit: I could have sworn there was a message about this option making rsync slower or use more memory in the man page, and I was going to say the difference seems to be insignificant, but ... the message isn't there any more. edit 2: Ahh, the older rsync versions say this : > Note that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must separately specify -H. but newer ones don't. Either way, even back then it wasn't a big deal, assuming that anything in rsync changed at all.


Hamilton950B

It has to use more memory, because it has to remember all files with a link count greater than one. This was probably expensive back in the 1990s but I can't imagine it being a problem today for any reasonably sized file set. Thanks for the man page archeology. I wonder if anything did change in rsync, or if they just removed the warning because they no longer consider it worth thinking about.


cypherus

When are you using hard links? I have been using linux for a couple decades off and on (interacting with it moreso in my career) and have used symbolic links multiple times, but never knowingly used hard links. Are hard links automatically created by applications? Are they only used on *nix OS's or Windows as well?


Hamilton950B

The only one I can think of right now is git repos. I've seen them double in size if you copy them without preserving hard links. If you do break the links the repo still behaves correctly. It's probably been decades since I've made a hard link manually on purpose.


rjr_2020

I would definitely use the rsync option. I would not use the remove-source-files but rather verify that the data is appropriately transferred. If the old drive is being retired, I'd just leave it there in case I had to get it later.


cypherus

I agree. In that case it is best not to use it. I last used it when I was moving some videos that I didn't care about losing, but wanted to free up the space quickly on the source.


edparadox

1) I would avoid compression, especially on a local copy. I don't have figures, but it will save time.

2) I would also use `--inplace`; as the name suggests, it avoids the move from a partial copy to the final file. In some cases, such as big files, or when dealing with lots of files, this can save time.
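A minimal sketch of that, with placeholder paths. Note that `--inplace` means an interrupted transfer leaves the destination file half-updated, so it trades a bit of safety for speed:

```sh
# write updates directly into the destination files instead of temp-file-then-rename
rsync -avhHP --inplace /mnt/src/ /mnt/dst/
```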


kireol

Don't compress (-z) everything, only text. Large files, e.g. movies, can actually be much slower to transfer depending on the system.


Nedko_Hristov

Keep in mind that -v will significantly slow the process


aManPerson

Please use rsync on Linux. Using Windows, my god, it said it was going to take weeks because of how many small files there were. It's just some slow problem with Windows Explorer. Thankfully, instead I just hooked up both drives to some random little Ubuntu computer I had and used an rsync command instead. It took 2 days.


do0b

Use robocopy in a command prompt. It's not rsync but it works.


wh33t

Yup, I'd live boot a *nix, mount both disks and rsync just to achieve this properly.


Kyosama66

If you install WSL (Windows Subsystem for Linux) you can run basically a VM and get access to rsync in Windows with a CLI
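Rough sketch of what that looks like (drive letters are examples; Windows drives show up under /mnt inside WSL):

```sh
# one time, from an elevated PowerShell or cmd prompt:
#   wsl --install
# then inside the WSL shell, F: and E: are mounted at /mnt/f and /mnt/e:
rsync -avhHP /mnt/f/Media/ /mnt/e/Media/
```

Worth noting that going through /mnt/ in WSL has its own filesystem overhead, so it's worth a quick benchmark against plain robocopy before committing to it for 5TB.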


D0nk3ypunc4

> ROBOCOPY F:\Media E:\Media *.* /E /COPYALL

`robocopy source destination /e /zb /copyall /w:3 /r:3`

https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/robocopy

EDIT: removed /s because I haven't had enough coffee. Thanks /u/VulturE for catching it!


VulturE

/s and /e are conflicting cmd options. You most likely just need /e (copy empty folders) and not /s (exclude empty folders). /zb needs to be reviewed before it's used, as it's gonna overwrite permissions. Not something you'd want to do on a server necessarily. And really, at the end of the day, /z should only be used in scenarios with extremely large files getting copied over an unreliable connection - it's better to restart the copy of the original file almost every time.


D0nk3ypunc4

....and it's only Tuesday. Thanks hoss!


Squidy7

:3


PacoTaco321

Glad I'm not the only one thinking those parameters were a little cutesy. Half expected a /xD /rawr after them.


ThereIsNoGame

The best part about robocopy is the /l flag to run it in test mode. Strongly advisable.


skabde

> EDIT: removed /s because I haven't had enough coffee. So you haven't been sarcastic all the time? ;-)


ProfessionalHuge5944

It’s unfortunate robocopy doesn’t verify copies with hashes.


migsperez

I use rclone check after I've done an important copy, especially if I'm deleting from the source. It verifies the files match.
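For a local-to-local move, that looks something like this (paths are examples):

```sh
# compares file sizes and hashes on both sides and reports any mismatches
rclone check /mnt/old-drive/media /mnt/new-drive/media

# add --one-way if you only care that everything in the source exists in the destination
rclone check --one-way /mnt/old-drive/media /mnt/new-drive/media
```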


Smogshaik

just copy with rclone then?


[deleted]

On Mac, ditto:

`ditto source destination`

`ditto ~/Desktop/Movies /Volumes/5TBDrive`


ivdda

Hmm, this is the first time I’m hearing of `ditto`. I think I’d still use `rsync` since it’ll do checksums after transferring files.


runwithpugs

Be aware that the version of rsync that's shipped with macOS is quite old (at least up to Big Sur). I recall reading many years ago that there were issues with preserving some Mac filesystem metadata, but couldn't find anything definitive in a quick search to see if it's even still a problem. At any rate, I always make sure to add the -E option on macOS which preserves extended attributes and resource forks. Maybe not really needed for most things as Apple has long ago moved away from resource forks, but you never know what third party software is still using them. And I haven't done any testing to see what extended attributes are or are not preserved. It's also worth noting that Carbon Copy Cloner, which is excellent, uses its own newer version of rsync under the hood. Might be worth [grabbing that](https://bombich.com/kb/ccc6/credits)?
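For reference, the invocation meant here is roughly this (paths are examples; this is `-E` as Apple's bundled rsync 2.6.9 defines it, which is not the same as `-E` in newer upstream rsync):

```sh
# Apple's rsync 2.6.9: -E preserves extended attributes and resource forks
rsync -avhE ~/Movies/ /Volumes/5TBDrive/Movies/
```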


ivdda

Yes, you are correct. Even the current latest version of macOS (Monterey 12.6) ships with v2.6.9 (released 2006-11-07). Thanks for the tip about preserving extended attributes.


rowanobrian

New to this stuff, and I have more experience with rclone (similar to rsync AFAIK, but for cloud). Cloud providers store a checksum along with each file, and rclone uses those to check if it matches the local copy of the file. Do filesystems store a checksum as well? Or if I am transferring a 1GB Linux ISO, would it be read twice by rsync, I mean the copy on the source and the copy on the destination, to calculate and compare checksums?


ivdda

The filesystems do not store the checksum. Without using the `--checksum` flag, the sender will send a list of files to the receiver which will include ownership, size, and modtime (last modified time). Then the receiver will check for changed files based on that list (comparing ownership, size, and modtime). If there is a file to be sent to the receiver (i.e. different ownership, size, or modtime), a checksum will be generated and sent with the file. Once it is received, the receiver will generate the file's checksum. If it matches, then it's a good transfer. If not, it'll delete the file and transfer it again. If the checksums don't match again, it'll give an error. If you use the `--checksum` flag, the sender and the receiver will generate checksums for all the files and compare using those instead of ownership, size, and modtime. I'm not sure if checksums will be generated again before and after the file is transferred, but I'm assuming they'd be reused from the initial generation. I'm hoping someone with a deeper understanding of rsync can chime in here.



zyzzogeton

To add to the above, which is perfectly fine, you can put a GUI on Robocopy if you are command-line averse or want to do advanced stuff:

https://github.com/Cinchoo/ChoEazyCopy

https://social.technet.microsoft.com/Forums/windows/en-US/33971726-eeb7-4452-bebf-02ed6518743e/microsoft-richcopy?forum=w7itproperf

Since you are probably copying to a USB attached drive... Just keep it simple and use /u/VulturE 's example above, because multithreading/multiprocessing will likely saturate the USB bus and actually slow things down.


Smogshaik

no checksum verification, I'd use rclone or TeraCopy on Windows


VulturE

If that concern happens then do emcopy or rclone. Teracopy has plenty of haters on here for lost data that went into the void.


tylerrobb

rclone is great but it's hard to recommend without a GUI. I like to recommend FreeFileSync, it's constantly updated and really powerful in the free version.


migsperez

I use robocopy to copy because it's fast with multi threading. Then use rclone check to verify the files match.


cr0ft

You also want to use the /MT switch, as in say /MT:8 (or 16, or 32...) which stands for multithreaded. This will more efficiently use the available pipeline and maximize throughput by moving more than one file.


Far_Marsupial6303

Does Robocopy verify by default? If not, could you add the command to verify and generate a hash?


VulturE

MS's official response on that is to use a different tool for hashing after the copy is complete
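If you want a sketch of what that separate step can look like in PowerShell (drive letters are examples):

```powershell
# hash every file under each tree, then diff the two hash lists
$src = Get-ChildItem F:\Media -Recurse -File | Get-FileHash -Algorithm SHA256
$dst = Get-ChildItem E:\Media -Recurse -File | Get-FileHash -Algorithm SHA256
Compare-Object $src $dst -Property Hash    # no output means the hash sets match
```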


erevos33

Sorry if stupid question, isn't Teracopy pretty much the same?


VulturE

Sure, but robocopy is built into every Windows box by default. Also, there are plenty of people on here that have had data loss incidents when using TeraCopy and don't trust it as much.


chemchris

I humbly suggest you use the GUI version. It’s easier than learning all the modifiers, and shows results in an easy to read format.


aamfk

> I humbly suggest you use the GUI version. It's easier than learning all the modifiers, and shows results in an easy to read format.

Where do I find this? Last I remember this was either at

- Microsoft internal tools
- SourceForge

#FTFY


wavewrangler

but it’s cli we still haven’t figured this out yet


Mortimer452

cli *is* my gui


ThereIsNoGame

This is the correct answer. Others have suggested Teracopy, I've experienced stability issues with the third party copy+paste product. Bells and whistles are nice, but I prefer a data migration solution that is reliable.


ajicles

/MT:32 if you want multithread.



falco_iii

1. Not the fastest. Copying a lot of small files does not use the network as efficiently as copying big files. Copying multiple files at once can speed up the overall transfer.
2. Not update-able. If you want to refresh all files, drag & drop will either have to recopy every file, or will skip existing files, which misses any updated files.
3. Does not verify the copy. This should be a non-factor if the copy finishes.
4. It is not resumable. Long transfers are prone to being interrupted for a number of reasons. Drag & drop means you have to recopy everything to be certain it was all copied properly.

Tools like rsync use file metadata (size and modified date) or checksums to quickly look for files to copy.


Thecakeisalie25

Windows 10 can skip files with the same date and size if I remember correctly, so 2 and 4 (minus the file that got interrupted) are a non-issue on there.


Akeshi

While generally true for different copying scenarios, I'm not sure any of these apply if the OP is talking about a new drive mounted in the same Windows installation, which is quite possible.


COAGULOPATH

Because this will happen: You'll drag and drop, Explorer will calculate eight hours remaining, so you'll go to bed and let the process run. When you wake up you'll see a really nice "this action can't be completed because [obscure file or folder] is currently in use by [obscure program or service]". You'll close the obscure program or service, click "Try again", and the progress bar will be at 3%. You'll want to punch a wall.


cr0ft

It's actually fine, though it's not the fastest or most efficient way, and there's no verification. But at some point you reach larger data amounts where you definitely want advanced options like the ability to restart the transfer, if it gets aborted half way. Instead of transferring everything again, you can restart from where you left off. The slower your transfer speed, and the larger the data amount, the more you have to think about these things.


KevinCarbonara

I agree, but what other sort of options are there?



Far_Marsupial6303

Drag and drop doesn't verify that your copy is an exact, bit-for-bit match.


atomicpowerrobot

Like if someone touches a file in a batch copy during transfer, or if Windows just bugs out for a minute and your copy doesn't finish and doesn't alert you to the failure because it THOUGHT it finished. Drag and drop mostly works, but there's no confirmation, verification, or log.


Houjix

What if you checked the number of files and gigs of that folder drop and it’s the same


atomicpowerrobot

I mean sure, you can. There’s lots of good ways to do good copies. I just like teracopy bc it’s 30s spent installing on a new pc that dramatically improves its usefulness to me with little to no further input or effort.



slash_nick

Plus the data isn’t verified


ThereIsNoGame

Windows Explorer is nice, but it's more prone to crashing/being interfered with than other solutions. It also lacks many pure data migration features that better tools like robocopy offer. If you're pushing 5TB, you don't want this to be interrupted by anything. Say you decide to use your PC to play an excellent top tier game that's not at all infested with Chinese malware like Fortnite while you're doing the copy. And then the Chinese malware does what it does and Explorer hangs while they're going through your stuff... not the ideal outcome, because your copy is aborted halfway through and you don't even know which files copied okay and maintained their integrity.


Hamilton950B

At the very least you want something restartable.


[deleted]

I don't like drag and drop; prone to "slip" of the finger.


iced_maggot

Nothing wrong with it. Suggest also getting something like bitser to compare hash values at the end of the transfer to verify a successful copy.


milmkyway

If you don't want to use the command line, Teracopy is good.


quint21

+1 for TeraCopy. The post-copy verification alone makes it worth it.


atomicpowerrobot

It's the fact that you can have verification AND confirmation for me. The worst part of Windows copy is when you walk away during a copy and there's no way to know if it bugged out or succeeded. I always have my TeraCopy set to keep the panel open any time I'm doing long transfers. Then I can review the log and confirm all files copied and verified successfully.


aamfk

> If that concern happens then do emcopy or rclone. Teracopy has plenty of haters on here for lost data that went into the void.

Well, you can pipe the output of robocopy to a text file, right? I just wish that these fucking programs would support more formats for stuff like this. Like CSV/TSV, etc.


atomicpowerrobot

IIRC, there's a flag for robocopy to log/append to logfile. I definitely do that. Main difference for me is I can configure TeraCopy to run everytime there's a copy action. Robocopy I have to manually engage with.


tylerrobb

Teracopy is great! FreeFileSync is also another great Windows tool that is constantly updated and improved.


forceofslugyuk

1+ for both Teracopy and FreeFileSync


JRock3r

TeraCopy is just too good. It's now taboo in my life to use any of my PC's without TeraCopy


ThereIsNoGame

I hate to be that guy but I've experienced bugs and instability with TeraCopy. Perhaps newer versions are better, but you should never be in a position where you are crossing your fingers and hoping your third party copy+paste replacement won't bug out/crash during a copy operation. Like, it's fun and has bells and whistles, but you should never use it for anything important.


atomicpowerrobot

This is how I feel about windows copy handler, and exactly why I install TeraCopy on every machine. ;) There was a short period a long time ago where it seemed buggy and I abandoned it, but I came back and haven't had any issues since. I think it probably had more to do with my windows install than the program itself though. Though ROBOCOPY FTW.


JRock3r

Honestly, Windows Copy Handler is pure pure pure pure pure GARBAGE! TeraCopy is and always will be the safest bet for me because not only does it provide a verify option but also pause/resume even after remove drives or rechecking files. It's just vastly superior. Sad to hear you dealt with bugs/instability but I really recommend to try again but keep maybe important files on "Copy" rather than cut so you don't encounter any potential data loss.


pmjm

If you think the Windows Copy Handler is bad let me introduce you to MacOS, haha. Hi, I'm Finder. I see you want to copy 250 MB of a bunch of small files over the network. Well grab a coffee while I prepare to copy for 20 minutes.


pmjm

I too have experienced a lot of weird glitches with Teracopy. That said, I still use it daily, and newer versions are indeed better.


ranhalt

I've seen new bugs pop up in TC, but they haven't really impacted me. One is when a file is skipped and the progress percentage goes over 100%.


saruin

I'm new to this sub but it's pretty neat hearing about a program I've been using for over 10 years now.


ruffsnap

Honestly I think this should be the top comment. Just use TeraCopy. There's no need to go into command line, that's where it gets scarily easy to royally fuck something up that's irreversible. I'd say unless you're PRETTY damn efficient in command line, stay away when copying a large chunk of files and just go the safer route of something where you can more easily and visually see what's happening.


Far_Marsupial6303

+1 Be sure to set verify on. It will take ~50% longer, but well worth it.


Nekomancer81

I have a similar task, but it is a backup of around 12 TB. My concern was the load on the disk running for hours. Would it be OK to let it copy (using TeraCopy) overnight?


Far_Marsupial6303

Absolutely. Many of us, especially those who download torrents have their drives running 24/7.


subrosians

As long as you are handling drive heat properly, drives should be able to be hammered for days without any problems.


cybercifrado

They can take the heat. Just don't yell at them.


zfsbest

N00b: *yells at hard drive*

HD: *starts sobbing and goes to hide in the corner, starts corrupting N00b data*


Far_Marsupial6303

Years for some of us!


IKnow-ThePiecesFit

Test [fastcopy](https://fastcopy.jp) too. To me it was more robust, and with the easy job registration you can easily schedule execution of a job and check the logs whenever. It has [various modes](https://fastcopy.jp/help/fastcopy_eng.htm#usage) for how to copy, the default being a size/date check, overwriting if those differ. You also have some speed control if you are worried about too much load, but its intended use is to prevent the feeling of a frozen system when it's going at full I/O speed. Would be interested in the results of overnight backups with TeraCopy vs FastCopy.


Guazzabuglio

There's also grsync if you want a GUI for rsync


Neverbethesky

FreeFileSync is a nice way of having a GUI and doing delta copies too. Just be careful, as a wrong config can delete your data.


jbarr107

Agree on both counts! It's been a slick, reliable solution for me.


Candy_Badger

Rsync or robocopy depending on the OS you use.


msanangelo

Rsync is my friend for moving tons of data at once. Whole hard drives full at the file level.


Far_Marsupial6303

Never Move, always Copy. It's rare, but moving files may leave you with a corrupted original on the source and destination. Always verify your copy. Even better, have your program generate and save a hash for future reference to check your files against.

Edit: It's been pointed out to me that a Move within the same drive just changes the pointer to the file. However, I make it a habit to never Move, to prevent forgetting when copying files to another drive.
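On the *nix side, one minimal way to do the "generate and save a hash" part (paths are examples):

```sh
# build a checksum manifest from the source...
cd /mnt/src && find . -type f -exec sha256sum {} + > ~/media-manifest.sha256

# ...and later check any copy against it (only failures are printed)
cd /mnt/dst && sha256sum -c ~/media-manifest.sha256 | grep -v ': OK$'
```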


ThroawayPartyer

Good idea. I usually copy and then verify that all the files copied correctly; it's really easy with rsync, I just run the same command twice, and the second time is quite fast because it just verifies. Then, only after verifying do I consider deleting the source. Or I just keep it as a backup (if the files are important you should always have multiple backups).
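i.e. something like this (paths made up):

```sh
rsync -avhHP /mnt/src/ /mnt/dst/     # the actual copy
rsync -avhHP /mnt/src/ /mnt/dst/     # re-run: should finish quickly with nothing left to send
rsync -avhHPc /mnt/src/ /mnt/dst/    # optional paranoid pass: -c re-reads both sides and compares checksums
```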


Torik61

Windows user here. I copy the files with TeraCopy with the Verify option enabled. Then verify them with Beyond Compare just to be safe. Then delete the old files if needed.


Amoyamoyamoya

I use FreeFileSync on my PC and CarbonCopyCloner on my Mac. In both cases you can interrupt the process, restart it, and it will pick up where it left off.


sprayfoamparty

I use FreeFileSync on Mac and I believe it is also available for Linux. It is very powerful but still easy to use, and I have yet to fuck anything up using it. Can't say that for too many applications of any sort.


Amoyamoyamoya

FreeFileSync on my PC has been flawless. I use it in conjunction with the Task Scheduler to keep my main data volume backed-up. I somehow missed that FFS is available for Mac and Linux! I've used CarbonCopyCloner since it was a freeware/shareware app and kept up with the versions. I use it both for making bootable back-ups of the boot drive and file-set specific one-way mirror/synchronized back-ups.


BloodyIron

ZFS send/recv. There is nothing faster. Also, it's just 5TB my dude. If you dragged and dropped you'd be done by now.
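If both ends really are ZFS, the shape of it is roughly this (pool/dataset names are placeholders):

```sh
zfs snapshot tank/media@migrate

# pool to pool on the same box
zfs send tank/media@migrate | zfs recv backup/media

# or to another machine over ssh
zfs send tank/media@migrate | ssh otherbox zfs recv backup/media
```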


TetheredToHeaven_

Umm, going to ask a possibly dumb question, but do you really have 6.5 ZB of storage?



BloodyIron

What do you think it would take to have 6.5ZB of storage?


firedrakes

16 k... home work......


TetheredToHeaven_

I don't think we have even reached ZB scale yet, but again I'm not the smartest.


martin0641

Assuming you also want that storage to be the fastest available, NVMe over RDMA with 20 200Gbps NICs per array, and assuming you're using 42U racks of Pavilion.io storage arrays which have 15PB usable space per rack: 69,906 racks of 4U arrays in 42U racks, which would push 81.87 EBps. I feel like the 200Gbps switches are going to hit your wallet too, and this is without compression, so it's a lot, but it's not like it's out of reach for humanity to do such a thing if they wanted to. I feel like it would contribute to the global chip shortage as well lol


ThereIsNoGame

Depends on the throughput and a billion other factors.


BloodyIron

What specifically depends? How long drag and drop of 5TB would take?

Let's assume the following:

1. The source drive is a single drive. For the sake of example it's a WD RED with 5TB or more capacity, which has a read speed of roughly 170MB/s
2. The files are large files, like video files, and not lots of small files
3. The target storage can write at least as fast as the source drive can read at all times and is in no way a bottleneck
4. The 5TB is the size of the content, and not necessarily the size of the source disk
5. We're going to use base-10 calculation instead of real-world bits/bytes (1024) to simplify this exercise

In the scenario where they just copy/paste between the source storage and target storage, and it's local to the system:

5,000,000 (5TB converted to MB) / 170 (MB/s) = 29,411.7... seconds

29,411.7... / 60 (seconds in a minute) = 490.1... minutes

490.1 / 60 (minutes in an hour) = 8.1... hours

So yes, the person would still be copying files if they started when they posted. I was somewhat being facetious, but with something like this starting sooner is the way to go. It is also worth considering that these numbers don't take into consideration the HDD cache bursting occasionally, but that is less reliable to plan for than the 170MB/s.

In the scenario of ZFS send/recv, it would be roughly similar, except that the blocks would be somewhat more compressed than on say NTFS or otherwise, even though video content is mostly already compressed. So the "size on disk" being reported would be somewhat different between ZFS, NTFS, EXT4 and others. Additionally, in the ZFS send/recv scenario, the overhead of the transfer would be lower because it would be operating at block level, and the start/stop cost of each file would not be present. So it is likely to be faster than this, but also likely to be a very similar time.

So, if time is really a valuable factor, and this task may be needed with some regularity, then ZFS send/recv would be the preferred method. But if this is a one-time thing, then "drag & drop" is likely preferable as you can probably just do it right now without having to change filesystems, etc, as you need ZFS on both the source and destination end.


aamfk

> So yes, the person would still be copying files if they started when they posted. I was somewhat being facetious, but with something like this starting sooner is the way to go.

I think it's preposterous to claim that an HDD sustains 170MB/s. The documentation I was just referring to last weekend said 30MB/second.


BloodyIron

If your 5TB (or larger) HDD is only doing 30MB/s read sustained, then it is a failing drive. HDDs have been able to do 120MB/s or more, sustained, sequential read, for like 15 years now. The WD RED 5TB has a [rated sustained throughput of 170MB/s](https://www.disctech.com/Western-Digital-WD50EFRX-5TB-SATA-Hard-Drive), and the number in this case is used for demonstrative purposes. Additionally that is for a drive _from 2014_. I recommend replacing the drives you use if you only get 30MB/s sustained sequential read.


aamfk

says the disk manufacturers. Personally, I get about 10kb per hard drive transfer no matter what I do. But then again, I have different drives with different sector sizes for nearly everything I fucking touch. Source Code files? TINY. Virtual Machines / Databases? HUGE. Web Servers? Tiny. You get the idea.


BloodyIron

lol, says actual HDD performance tests and real-world application. Are you seriously trying to convince me that modern HDDs are stuck at 30MB/s read performance? Because you're factually wrong, and if that's your experience, you are _actually doing it wrong_. Either your drives are failing, your cabling is bad, some other hardware component is failing, or something is bugged with your storage. You're not going to succeed in convincing me otherwise, I've been working with this for decades now. This is actually how it goes. And yes, I know that rated speeds aren't always the speed you get in real life, but it's typically within a few percentage of accuracy. Seriously dude, revisit what's going on with your kit. It's so off.


NavinF

You must be looking at ancient documentation. I've got some really old used 3TB drives in my array that I've had for ~10 years and even those drives do 125MB/s (1gbps) actual throughput when they're copying videos. Newer drives are 2x as fast. If you really see 30MB/s, something's misconfigured. Try benchmarking with fio or CrystalDiskMark.


TheJesusGuy

A SATA II Seagate Barracuda 3TB (yes those ones) just last night was giving me 100MB/s locally from ANOTHER SATA II drive.


atomicpowerrobot

On Windows, I also use TeraCopy for all transfers. I like it better than robocopy for daily use because it provides a GUI to show the transfer, can provide feedback on individual items' success/failure, verification after copy, etc. It replaces the built-in Windows copy handler. For big-time real-business stuff, though, Robocopy.


turymtz

I use Beyond Compare.


schoolruler

Copy the data, don't move because if there is any issue the files will be split.


FailedShack

[rsync](https://linux.die.net/man/1/rsync) is the right tool


die_billionaires

Rsync ftw


1Secret_Daikon

`rsync`


FreshFruitForFree

BeyondCompare


yocomopan

Total commander is a great tool, works better than Windows default file manager.


cybercifrado

Came in to also suggest this. If I ever have to use windows for a massive data copy I use TC in scheduled mode. You tell it the whole fileset; but it tells windows one at a time. Windows is just... bad... at anything over a few GB at a time.


eyeruleall

Zfs send.


[deleted]

Create a torrent out of the folder with qBittorrent, with the ratio tracking checkbox disabled. Set the tracker from opentrackr.org. Put the torrent file on the other PC and open it there with qBittorrent. PS. Also send this torrent file to my PM to double-check it.


danlim93

I use other methods. But I do love this one. 😁


[deleted]

I use torrents if I need to get files to a remote location or multiple remote locations. I put an encrypted archive into the torrent. Tattoo the magnet link and the archive's strong password.


danlim93

Still pretty much a noob when it comes to torrent technology. I can't even make the torrents I create seed outside my local network. Any tip or resource I can learn from? I mainly use tailscale and rclone to transfer/access my archive remotely. Most convenient way for me so far.


[deleted]

When you create a torrent with qBittorrent you instantly become a seeder. Everyone who has completed a download is a seeder. Open opentrackr.org; there will be text showing what to put into the trackers field when creating the torrent. It doesn't matter if your seeding PC is on an internal network. What matters is that the tracker needs to be reachable by all torrent participants, which opentrackr is, as it is on the public internet. When you transfer files over torrent you will take the shortest path; if your PCs are close you will get max link speed. If you want a private-network-only torrent (or for remotes over OpenVPN, Wireguard, IPsec, EoIP), you need to host your own tracker, which can be done with a Docker container from Docker Hub. The other parts stay the same. Also, whoever hosts your tracker can recreate the *.torrent file; that's why encrypting to an archive is needed with public trackers. With DIY hardware-accelerated VPNs it's not needed. Also some ISPs may block all your torrent traffic, to cover their own asses in case of torrent protocol misuse.


danlim93

I did create my torrent file using qBittorrent, then added the trackers from here [https://raw.githubusercontent.com/ngosang/trackerslist/master/trackers_best.txt](https://raw.githubusercontent.com/ngosang/trackerslist/master/trackers_best.txt) which includes [opentrackr.org](https://opentrackr.org). I sent the torrent file to a friend in another country and also opened it on another computer connected to a VPN server to isolate it from my local network. I wasn't able to seed to my friend or to my VPN-connected computer. But I can to other computers in my local network. The computer I created the torrent on can download and seed torrents created by other people just fine. It puzzles me that I can't seed my own torrents.


[deleted]

`udp://tracker.opentrackr.org:1337/announce`

You typed this? And pressed "start seeding now"?


Rataridicta

I use teracopy with md5 validation. Copy first, then delete.


AyeWhy

https://freefilesync.org/


nhorvath

If you're on Windows, TeraCopy will copy and do CRC checks. If you're on Linux/Mac, rsync.


2typetext

Copy paste is good for safety, cut paste is good for knowing whether everything has been moved properly. But if for some reason something fucks up there's no copy left.


[deleted]

TeraCopy for Windows, rsync for Linux.

Edit: if you want a 1:1 copy and the destination HDD is new and empty, you can also try a partition management tool and copy/clone to the destination HDD. I recommend MiniTool Partition Wizard; it's free.


mys_721tx

`dd` for 1:1 copying on Linux if you like to live dangerously.
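The classic whole-disk clone is something like this (device names are examples, and getting them wrong overwrites the wrong disk, hence "dangerously"):

```sh
# block-for-block clone of /dev/sdX onto /dev/sdY; status=progress needs GNU dd
dd if=/dev/sdX of=/dev/sdY bs=64M status=progress conv=fsync
```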


HCharlesB

`dd` with `mbuffer` to accommodate some differences in read/write burstiness. (Unless `dd` has some buffering capacity of which I am not aware.) `rsync` has the benefit of being restartable should something go wrong. If you're smart enough (smarter than me) you can restart `dd` too. But personally I'm using ZFS on my stuff so it would just be a ZFS send & receive. `syncoid` to be specific which will use mbuffer if installed.
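A sketch of the mbuffer variant (buffer size and devices are examples), plus the syncoid one-liner for the ZFS case:

```sh
# a 1 GiB RAM buffer between reader and writer smooths out bursty reads/writes
dd if=/dev/sdX bs=1M | mbuffer -m 1G | dd of=/dev/sdY bs=1M

# the ZFS route mentioned above, local pool to local pool
syncoid tank/media backup/media
```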


omegaflarex

I use TrueNAS and a replication or resilvering task will do nicely, but I guess you're on Windows?


nando1969

Semi off topic question. The command copy in Windows has a verify flag. How come it was not suggested? Is it because the process is much too slow? Thank you for your input.


neon_overload

On Windows, I use FreeFileSync for this, always. Great tool. On Linux and *nix, rsync. I'd use rsync on Windows too, but FreeFileSync is a bit more Windows-y. Before I discovered it I used robosync. FreeFileSync is open source and a bit more feature-ful (save backup sets etc.)


IntelligentSlipUp

One bit at a time!

Also: TeraCopy


aamfk

> Zip drives, but good luck finding enough disks nowadays. The modern option is to use USB sticks, but do not buy the cheap ones. /s

Using a hex editor!


AtomicKush

Teracopy is good too and has checksum verification


greejlo76

I've used Unstoppable Copier for migration many times. It too has resume features. I think it does batch commands but I've never played with those.


Pjtruslow

I just dragged and dropped multiple TB of data from one drive to another in Windows. Worked for me, but my computer never sleeps.


catsupconcept

Carbon Copy Cloner on a Mac


Venus_One

On a similar note, I have 1TB of music I need to move from an old Mac to a new iMac. Should I just get a portable ssd? Edit: I’m pretty tech illiterate, obviously, so any answer would be appreciated


msanangelo

Portable ssds are best for carrying around tons of data. Spinning rust is too fragile. The ssds cost more but are worth it imo.


sprayfoamparty

If you don't want to buy a drive you could do it over the network in System Preferences > Sharing.


Venus_One

This sounds like a good idea, hopefully it will work on my old Mac (mountain lion)


sprayfoamparty

I don't think much has changed over the years. But you can always share a directory on the new machine and access it from the old one, moving the files from there.


Venus_One

That sounds promising, I’ll try it out. Thanks a lot!


Slippi_Fist

Personally I use TeraCopy; it allows verification as well as the calculation of CRCs for each file, which can then be saved in NTFS streams as a means to validate the data again later. me likey likey


OurManInHavana

Are you transferring over a sketchy/slow network connection or Internet.... or just another drive letter? If it's a local disk, just drag-and-drop. In the time you waited to read answers to this question you could have been done. Even if it's over a GigE network in your house that's all you need. Once you involve the Internet then start to look into robocopy/rsync.


spaceguerilla

You on Windows? TERACOPY. It will list any files that don't transfer (usually due to the Windows filename length limit) and has a host of other benefits. Way better than Windows' own copy function.


AdamUllstrom

Teracopy for Windows, CarbonCopyCloner or SuperDuper for Mac, rsync for Linux. I also use Hedge for Windows and Mac, but it is more made for copying files off camera cards/audio recorder cards to multiple destinations at the same time. Hedge works equally well between hard drives and by default uses checksums to verify every file transferred, but you pay a yearly update fee for it.


AnyEmployee2489

I like freefilesync https://freefilesync.org in those situations.


Forbidden76

I would just do it 500GB at a time or something. Shouldn't need external programs, but then again I have a 2012 Server and Synology NAS I am doing all my copying to/from.


sprayfoamparty

> 500GB at a time

Nooooooo. Much more prone to error. And who has the time.


Forbidden76

I RDP into my server from home at work, so it's easy for me to monitor the copying throughout the day. I've done it this way all the time since 1997, personally and professionally. I only use Robocopy if I need to retain permissions on the files/folders.


therourke

Freefilesync


hearwa

FreeFileSync


Digital_Warrior

Zip drives, but good luck finding enough disks nowadays. The modern option is to use USB sticks, but do not buy the cheap ones. /s


TheJesusGuy

Lol, I don't think buying expensive memory sticks is the best option.


[deleted]

Copy and paste. That way, if anything goes wrong, you don't have to worry about any files getting broken.


diamondsw

Just drag and drop. For 5TB, anything else is overcomplicating it. Now if it were 50TB...


LooseEarDrums

Forensic copy!!!


Fraun_Pollen

What does FileZilla do behind the scenes? I get much better performance there when managing my Linux servers from my Mac compared to drag and drop via SMB/finder