INTERACT FORUM

Author Topic: video file copy integrity - advice please  (Read 2545 times)

rjm

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 2699
video file copy integrity - advice please
« on: March 18, 2011, 05:25:17 pm »

I have over 5 TB of media and have a lot of experience moving it around as I upgrade and replace hard drives. I am cautious and always do a binary comparison after copying to ensure no errors were introduced.

I have noticed that it is quite common for video (avi, mp4) files to differ by a small number of bytes after copying (the file sizes and dates never differ). For example, I just copied 2300 video files (800 GB) and found that 170 of them were not identical after copying.

A few observations:
I have used Windows file copy as well as 3rd party copy tools with similar results.
I have observed this on multiple systems and therefore do not think it is a hardware problem.
I have seen this on XP and Vista and Win 7 and therefore do not think it is OS dependent.
I never see differences in other file types like mp3, pdf, jpg, exe, etc.
The video files always seem to play ok so no obvious damage is being done to their functionality.
After copying the differing video files a second time they always compare ok.

Does anyone out there have experience with copying large quantities of video files and then doing a subsequent binary comparison?

Does anyone know if Windows is doing something "funny" with video files like modifying unimportant metadata?
(I ask this because I know of 2 precedents: xls files are modified when they are opened and closed without being saved; and Nero often reports that video files burned ok to dvd-r even though its verification phase detected a few byte differences).

Thanks in advance for your help.

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: video file copy integrity - advice please
« Reply #1 on: March 19, 2011, 05:31:42 pm »

Does anyone out there have experience with copying large quantities of video files and then doing a subsequent binary comparison?

Back in the day I did extensive tests on my network by copying around a variety of different file types, including a large number of different types of video.  After each copy I would run through a full MD5 and SHA1 hash-checking comparison procedure to test for any corruption (I was having issues with data corruption with network copies, which turned out to be a bad network card and a stupid bad cable).  This involved creating both local copies from my "master test set" that I was using, and then copies across the network.

I was able to do it and have the hashes consistently match, for both local and network copies, once I found my problem.  Now, this was back on Windows XP.  I don't know if Windows 7 is doing some new file or metadata related trickery, but I'd be surprised.  Maybe the thumbnailing?

How are you comparing the files?  What type of hash are you going by?  Are you just going by how Windows reports the file size?  I don't think this is entirely accurate, and I think it can shift depending on how the disk subsystem stores the data, and can float (particularly with larger files).  I could be wrong, but I wouldn't trust it.  I just go by hashes whenever I need to check integrity.

NOTE: An MD5 sum can possibly be tricked (there are known flaws in the cryptographic algorithm used by MD5), but almost certainly not unless someone very sophisticated was "attacking" the system.  But, even if you discount MD5, if you use both that and SHA1 that's about the best proof you can get.  While some theoretical cryptographic "weaknesses" in SHA1 have also been discovered, even with intentional attempts to exploit these weaknesses, it takes over 34 billion tries (2^35) to intentionally get the message digest for two different messages to match.  For this to happen to you even once in your life randomly, you'd have to be extremely lucky.  For twice?  I'd buy a lottery ticket or hide somewhere, because the universe has something in store for you.
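For anyone who wants to try the hash approach described above, here is a minimal sketch in Python using the standard hashlib module. It is not the procedure used in the tests mentioned in this post; the function names file_digests and copies_match and the 1 MB chunk size are just illustrative assumptions.

```python
import hashlib


def file_digests(path, chunk_size=1024 * 1024):
    """Return (md5_hex, sha1_hex) for the file at path.

    The file is read in chunks so that multi-GB videos do not have
    to fit in memory. chunk_size is an arbitrary illustrative value.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def copies_match(source_path, copy_path):
    """True only if both the MD5 and the SHA1 digests agree."""
    return file_digests(source_path) == file_digests(copy_path)
```

Both digests are computed in a single pass, so each file is only read once per check.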
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

rjm

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 2699
Re: video file copy integrity - advice please
« Reply #2 on: March 19, 2011, 06:45:09 pm »

thanks glynor

I'm not doing anything fancy. I do not rely on checksums, hashes or file sizes. I compare every byte in the source and destination files using two different applications: Beyond Compare and Super Flexible File Synchronizer. These applications agree on any differences that are found.
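As a rough illustration of the same kind of byte-for-byte check (this is not how Beyond Compare or Super Flexible File Synchronizer work internally), a short Python sketch could walk the source tree and compare every file against its copy. The function name find_mismatches is made up for this example; filecmp.cmp with shallow=False is the standard-library call that compares actual file contents rather than just size and date.

```python
import filecmp
import os


def find_mismatches(source_dir, copy_dir):
    """Return sorted relative paths of files that differ or are missing in copy_dir."""
    mismatches = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(copy_dir, rel)
            # shallow=False forces a comparison of the file contents,
            # not just the os.stat() signature (size, modification time).
            if not os.path.isfile(dst) or not filecmp.cmp(src, dst, shallow=False):
                mismatches.append(rel)
    return sorted(mismatches)
```

Running it a second time on the reported files is a cheap way to tell a bad copy from a bad read during the comparison itself.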

I've seen it via SATA, e-SATA, and USB. I've seen it on many different drives. I've seen it on at least 2 different systems. Given my understanding of error detection and correction in hard drives, the most likely culprit might be the odd bit flipping in bad system memory. But if this were the case you would think I would be seeing all kinds of system problems, which I do not. I also don't get why it only happens on video files. Perhaps because they are big and the probability of hitting a bad memory bit increases? But I copy many more small files than large files, so you would expect to see a lot of problems in small files, which I do not. Maybe I've stumbled on some deep Windows bug that no one else has noticed because most video plays fine with a few bad bits, and most people never check the integrity of files they copy?

It's really got me stumped.

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: video file copy integrity - advice please
« Reply #3 on: March 19, 2011, 07:53:45 pm »

My theory - your video files are very large, and occupy long, contiguous disk blocks, whereas your smaller files are scattered here and there.  If you have disk sectors that are problematic, the chances are greater that the very large files will be the problematic ones, vs. the smaller files, simply because a single large file occupies so many disk blocks.

I very much doubt this is some Windows bug.

Have you examined which bytes are differing?  You'd want to look at byte location, number of bytes, and the exact bit differences of the differing bytes.  This will give you a better idea of what is going wrong.
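A minimal Python sketch of that kind of examination, assuming the two files are the same size (as reported earlier in the thread), might look like this. The name diff_bytes is hypothetical; the XOR of the two byte values shows exactly which bits flipped.

```python
def diff_bytes(path_a, path_b, chunk_size=1024 * 1024):
    """Yield (offset, byte_a, byte_b, xor) for every differing byte.

    Assumes both files have the same length; if they do not, only the
    overlapping part of each chunk is compared.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        offset = 0
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            if a != b:
                # Only scan byte by byte inside chunks that actually differ.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        # x ^ y has a 1 in every bit position that changed.
                        yield offset + i, x, y, x ^ y
            offset += max(len(a), len(b))
```

A single set bit in the XOR value would point toward a flipped bit (memory, cable), while whole runs of differing bytes at sector-aligned offsets would point more toward the disk or controller.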

The software doing the copying (e.g. Windows Explorer) will receive a result code upon completion of the file operation.  If the code comes back as successful, Explorer will happily continue and you'll not be made aware of the problem.  This might happen when a faulty disk controller or driver does not detect an error in writing data to the disk blocks.  The disk and controller should know if the write completes successfully or not.  But when it is faulty, all bets are off.
The opinions I express represent my own folly.

rjm

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 2699
Re: video file copy integrity - advice please
« Reply #4 on: March 19, 2011, 10:05:43 pm »

Thanks MrC.

I have looked at the differences in years past but have forgotten the details, other than that it is usually just one byte that differs. I will do a more thorough analysis next time I see the problem.

Your theory about disk sector errors is interesting. My understanding of how a drive works is different than yours and I may be wrong. I think that a drive adds error detection and error correction data to a block of data and then writes it without checking that it wrote correctly. Later when the data is read back the drive uses the error detection data to check for errors and if found applies the error correction data to fix it. If the error is too serious to fix then it maps that sector out of service and passes an error up to Windows which presents an error message on the screen. I don't see any error messages which leads me to believe the drives are ok.

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: video file copy integrity - advice please
« Reply #5 on: March 19, 2011, 10:20:44 pm »

I'd probably guess memory or CPU, but it could be disk controller too.  This seems less likely since you've reproduced it on a variety of subsystems.  However, in a modern chipset, SATA, USB, PCI, PCIe, and most everything else come out of the same southbridge chip (on Nehalem there is only one chip in the "set").

Still, I don't think that should be happening, as long as the comparison apps you're using are accurate.  I'd still guess memory or CPU.  Maybe even malware (a rootkit could intercept disk traffic).

I do wonder about filesystem metadata though....


rjm

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 2699
Re: video file copy integrity - advice please
« Reply #6 on: March 20, 2011, 01:30:51 am »

Thanks.

Until recently I assumed possible flaky hardware in my old AMD Athlon 64 X2 4800+ Asus A8N32-SLI Deluxe system.

But last week I saw the same problem on an Intel Core 2 Duo E6300 + Asus P5B system.

Odds of the same kind of flaky hardware in both seem low.

I have standardized on the Vantec Nexstar3 external enclosure for all 8 of my external backup drives. But these were bought over many years and with different chip revisions so a problem here seems unlikely. Plus I've seen it without using these external enclosures.

I wonder how many people out there do binary comparisons after copying TB's of video? Maybe this is a common problem?

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: video file copy integrity - advice please
« Reply #7 on: March 20, 2011, 02:22:47 pm »

Odds of the same kind of flaky hardware in both seem low.

Indeed.

Just to be sure you fully understand what is going on, I'd really try doing a reliable cryptographic hash compare.  I've seen other binary comparison tools be wrong before, particularly with compressed files.  If the SHA1 hashes match, the files are identical.

newsposter

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 787
Re: video file copy integrity - advice please
« Reply #8 on: April 04, 2011, 04:49:45 pm »

Just a thought, do you have rdc (remote differential compression) enabled on the source or target machines?

rjm

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 2699
Re: video file copy integrity - advice please
« Reply #9 on: April 05, 2011, 10:42:09 am »

Just a thought, do you have rdc (remote differential compression) enabled on the source or target machines?
No, I don't.

newsposter

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 787
Re: video file copy integrity - advice please
« Reply #10 on: April 05, 2011, 11:44:17 am »

Try installing it/turning it on (assuming that your machines are running an OS newer than Server 2003 R2 or Vista SP1).

RDC is designed/intended to enhance copy reliability on limited-bandwidth networks but it is also effective on other networks that might have their own 'reliability' problems.
