INTERACT FORUM

Author Topic: Backing up metadata when stored in files  (Read 1421 times)

LeoH

  • Regular Member
  • World Citizen
  • ***
  • Posts: 109
Backing up metadata when stored in files
« on: January 24, 2012, 12:25:52 pm »

I have a large, complex classical & jazz music library and frequently work on enhancing its metadata. I have added many custom fields for better access to this library.  All tags are set to 'save in file ... (when possible)'.

I store the library on a 4TB NAS and back up to a second 4TB NAS (drives were cheap before the flood last fall!). I would love to be able to update just the tags (metadata) in the file headers from primary to backup when only tags have changed. Is there a function in MC (or a clever app somewhere) that already does this?

doug.ca

  • Junior Woodchuck
  • **
  • Posts: 50
Re: Backing up metadata when stored in files
« Reply #1 on: January 24, 2012, 12:39:13 pm »

Have a look at Microsoft's SyncToy: http://www.microsoft.com/download/en/details.aspx?id=15155. I've been using this tool for quite some time to synchronize and back up various photo and music file archives.

LeoH

  • Regular Member
  • World Citizen
  • ***
  • Posts: 109
Re: Backing up metadata when stored in files
« Reply #2 on: January 24, 2012, 01:41:04 pm »

I am aware of SyncToy. It works well, but it backs up the entire file. The issue is that I do not want to back up the very large (typically flac) files when only the tags in the header have changed. It would be much less time-consuming to copy only the header portion of the file, and it would also be much less of a load on the network.

Listener

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 1084
Re: Backing up metadata when stored in files
« Reply #3 on: January 24, 2012, 03:21:06 pm »

rsync uses the sort of technology you are talking about.

http://en.wikipedia.org/wiki/Rsync

Bill

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Backing up metadata when stored in files
« Reply #4 on: January 24, 2012, 03:43:50 pm »

Quote from: LeoH on January 24, 2012, 01:41:04 pm
I am aware of SyncToy. It works well, but it backs up the entire file. The issue is that I do not want to back up the very large (typically flac) files when only the tags in the header have changed. It would be much less time-consuming to copy only the header portion of the file, and it would also be much less of a load on the network.

This seems like a risky proposition to me.

Backup is the one area in all of computer science where I think being extremely conservative is almost always the best choice.  Why risk it, simply to lower (ever so slightly) the network load?

After all, a Gigabit Ethernet network is far faster than the data rates you're likely getting off of those NAS drives. I know my local, high-performance, professional PCIe RAID card with four 2TB WD Black drives can't peg my network interface. Unless you have SSDs in those NAS boxes (and if you do, you're a fool), then you aren't going to "fill" a Gigabit Ethernet pipe.

So, if you aren't going to fill it, the biggest load will be from transfer setup and request, not data throughput.
And, copying only the headers won't help with that at all.

And, it seems dodgy.

I'd use something like SyncBack to copy only those files that have changed, in their entirety, and be done with it.  Copying and syncing parts-of-files is possible with stuff like rsync on POSIX, but not really recommended if there are other options, and I wouldn't call it "conservative".
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Backing up metadata when stored in files
« Reply #5 on: January 24, 2012, 04:08:12 pm »

rsync is perfectly sound technology and has been around and proven itself for years.

But it doesn't do what the OP wants. Rsync can minimize the amount of data that has to cross the wire by sending only the changed portions of a file, but it does not store block-level change differences. You need a block-level (vs. file-level) backup utility for this. They exist, but not at the toy level.
Logged
The opinions I express represent my own folly.

LeoH

  • Regular Member
  • World Citizen
  • ***
  • Posts: 109
Re: Backing up metadata when stored in files
« Reply #6 on: January 25, 2012, 02:20:28 am »

For background, I have dual QNAP NAS units on a gigE network, with each box using two bonded gigE ports. Backup between the two NAS boxes averages 50 MB/sec, which translates to about 5.5 hours per terabyte (interestingly, bonded Ethernet does not buy anything for sustained transfers). I have more than 3 TB of music in the database. File system transfer rate is the obvious problem.
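The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes decimal units (1 TB = 1,000,000 MB) and a steady 50 MB/sec.

```python
def hours_to_transfer(terabytes, mb_per_sec):
    """Hours needed to move `terabytes` (decimal TB) at `mb_per_sec` (decimal MB/s)."""
    total_mb = terabytes * 1_000_000  # assumption: 1 TB = 1,000,000 MB
    seconds = total_mb / mb_per_sec
    return seconds / 3600


print(round(hours_to_transfer(1, 50), 1))  # 5.6 hours for one terabyte
print(round(hours_to_transfer(3, 50), 1))  # 16.7 hours for a 3 TB library
```

So a full pass over roughly 3 TB lands in the 15-plus hour range at this rate, which is where the long backup windows come from.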

I am experimenting with extensive metadata enhancements to the music files in an effort to create more effective access to a complex library. Sometimes there are wholesale tag changes to a large portion of the files, which can result in 15-hour backups. I often work 12-hour days, so this is the problem (cue Monty Python: "luxury! 11 of us lived in a cardboard box at the bottom of a lake!")

A safe tag backup regime could possibly be implemented by reading just the tag header, which is usually of a fixed length, and overwriting just the header of the backup file when changes are detected (assuming the matching backup file contains a header of the same length). The metadata block is clearly delineated, and the entire header, including space for album art, is measured in no more than hundreds of kilobytes, whether flac or mp3 or any other popular format. Hence, a backup of metadata changes across a 3 TB library, which would take more than 15 hours when whole files are copied, could be done in about 15 minutes. That is a huge savings in time. So, why on earth would one wish to do it the long way? It is not about network loads (sorry if I implied that in an earlier post); it is about time. And, for safety, if there were any difference in the header size, the entire file would be copied, but that would be an extremely rare event. What could possibly go wrong...go wrong...go wron...wo gron....
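To make the idea concrete, here is a minimal Python sketch for FLAC files only. It relies on the FLAC container layout (a 4-byte "fLaC" marker followed by metadata blocks, each with a 4-byte header whose top bit flags the last block and whose remaining 3 bytes give the block length). The function names flac_metadata_length and sync_tags are hypothetical, nothing here is an MC feature, and files that do not start with the FLAC marker (for example, ones with an ID3 tag prepended) simply fall back to a whole-file copy.

```python
import os
import shutil

FLAC_MARKER = b"fLaC"


def flac_metadata_length(path):
    """Return the offset of the first audio frame: the 4-byte marker plus
    every metadata block (4-byte block header + block data)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if f.read(4) != FLAC_MARKER:
            raise ValueError(f"{path}: does not start with the FLAC marker")
        offset = 4
        while True:
            block_header = f.read(4)
            if len(block_header) != 4:
                raise ValueError(f"{path}: truncated metadata")
            is_last = bool(block_header[0] & 0x80)  # top bit marks the final block
            data_len = int.from_bytes(block_header[1:], "big")  # 24-bit length
            offset += 4 + data_len
            if offset > size:
                raise ValueError(f"{path}: metadata runs past end of file")
            if is_last:
                return offset
            f.seek(offset)


def sync_tags(primary, backup):
    """Patch only the metadata region of `backup` when the layout matches;
    otherwise copy the whole file. Returns 'skipped', 'header', or 'full'."""
    if not os.path.exists(backup):
        shutil.copy2(primary, backup)
        return "full"
    try:
        p_len = flac_metadata_length(primary)
        b_len = flac_metadata_length(backup)
    except ValueError:
        shutil.copy2(primary, backup)  # unknown layout: take the long way
        return "full"
    same_layout = (p_len == b_len
                   and os.path.getsize(primary) == os.path.getsize(backup))
    if not same_layout:
        shutil.copy2(primary, backup)  # header size changed: copy everything
        return "full"
    with open(primary, "rb") as src:
        new_header = src.read(p_len)
    with open(backup, "r+b") as dst:
        if dst.read(p_len) == new_header:
            return "skipped"
        dst.seek(0)
        dst.write(new_header)  # in-place overwrite of the header only
    shutil.copystat(primary, backup)  # keep timestamps aligned with the primary
    return "header"
```

Calling sync_tags("/primary/track.flac", "/backup/track.flac") (paths made up) would report which route was taken. Note that the in-place write is not atomic, so an interrupted run could leave a backup header half-written, and this sketch does not check the audio portion at all.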

But seriously folks, I do not think that sync programs that look for differences across the entire file and update only the differences would be reliable, and they could actually introduce damage. If one is only working on metadata changes and anything outside of the tag header has changed, then there is a bigger problem which should probably not be propagated to a backup file copy. At the very least, the entire file should be copied to backup if any portion outside of the header has changed.
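One way to detect a change outside the header, sketched in Python under some assumptions: hash everything from the end of the tag header to the end of the file on both copies and compare the digests. The function name audio_digest and the audio_offset parameter (the byte offset where the header ends) are hypothetical. To avoid pulling the audio across the network, the hash would have to be computed locally on each NAS so that only the short digest is exchanged.

```python
import hashlib


def audio_digest(path, audio_offset, chunk_size=1 << 20):
    """SHA-256 of everything from `audio_offset` to the end of the file,
    i.e. the part of the file that tag edits should never touch."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        f.seek(audio_offset)
        # Read in 1 MiB chunks so large files are never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digests of primary and backup differ, something other than the tags has changed, and a header-only update should not go ahead.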

Another reason for not copying an entire large file when only a small portion of its header has changed is that it badly fragments the backup file system. This is because the updated file copied to a backup volume would be written to free space, and then the old file on the backup volume would be deleted. If only a fixed-length header on the backup volume were overwritten (block-level update), then this unnecessary file system thrashing would not occur.

And finally, for an even simpler example: if, for instance, only the 'Last Played' tag were updated, a flac file of a track from a CD would require about a 50 MB transfer for a 4-byte change. The newer HD audio files would require 250 MB or more to copy. This is extremely inefficient.

What is needed for current large music collections is a block-level application for inline (in-file) metadata backups that acts only on the tag header and is smart about any other changes in the file outside of the header. If anything out of the ordinary occurs, the program either copies the entire file or logs an error and copies nothing. Such a program should be relatively straightforward to write (though I am not necessarily volunteering to do so ;)
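The decision rule described here is small enough to write down as a sketch. Everything in it is hypothetical naming (Action, choose_action, the strict flag); the inputs are assumed to come from comparing the primary and backup copies of one file.

```python
from enum import Enum


class Action(Enum):
    SKIP = "nothing changed"
    PATCH_HEADER = "overwrite the tag header in place"
    COPY_FILE = "copy the entire file"
    LOG_ERROR = "log an error and copy nothing"


def choose_action(header_changed, same_header_size, audio_matches, strict=True):
    """Pick what to do for one file pair.

    header_changed:   tag header bytes differ between primary and backup
    same_header_size: both headers occupy the same number of bytes
    audio_matches:    everything outside the header is identical
    strict:           on an audio mismatch, refuse to copy instead of
                      propagating a possibly damaged file
    """
    if not audio_matches:
        return Action.LOG_ERROR if strict else Action.COPY_FILE
    if not header_changed:
        return Action.SKIP
    if not same_header_size:
        return Action.COPY_FILE  # the rare case: header grew or shrank
    return Action.PATCH_HEADER
```

For example, choose_action(True, True, True) returns Action.PATCH_HEADER, while a mismatch in the audio portion returns Action.LOG_ERROR unless strict is turned off.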