INTERACT FORUM


Author Topic: Free 1 Million Song Data Set  (Read 6708 times)

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Free 1 Million Song Data Set
« on: March 08, 2011, 10:25:54 am »

You guys might just want to grab this, just in case it could come in handy:

Quote
For far too long, researchers and engineers working on Music Information Retrieval (MIR) have been forced to pay a hefty ante before being able to conduct their research: namely, they’ve had to build a set of data on which to test their theories and hone their algorithms.

It may have started as a flippant suggestion for how to solve that problem, but The Million Song Dataset is now real, and anyone can download it. A collaboration between The Echo Nest and Columbia University’s LabROSA department (Laboratory for the Recognition and Organization of Speech and Audio), The Million Song Dataset has four main objectives:

    * To encourage research on algorithms that scale to commercial sizes
    * To provide a reference dataset for evaluating research
    * As a shortcut alternative to creating a large dataset with The Echo Nest’s API
    * To help new researchers get started in the MIR field.

The Million Song Dataset offers researchers, engineers and commercial developers detailed sonic and cultural attributes for each song, as well as extensive metadata, both provided by The Echo Nest.
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

Alex B

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 10121
  • The Cosmic Bird
Re: Free 1 Million Song Data Set
« Reply #1 on: March 08, 2011, 11:38:09 am »

Its size is 280 GB! ...or 180 GB if you extrapolate from the 1.8 GB sample file (1% of the set, 10,000 random songs). The latter is said to be compressed (odd that such loose data doesn't compress more).

MC should be able to store a library of 1,000,000 extensively tagged songs in about 150 MB, based on 15 MB per 100,000 files, which is about right for my library. In my experience a zipped library backup file for a 15 MB database is about 6 MB, so the complete 1 million song database should fit in a 60 MB delivery package.
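
For anyone who wants to check the numbers, here is a rough sketch of the arithmetic in this post. It assumes sizes scale linearly with song count, which is only an approximation, and the function names are made up for illustration (they are not part of MC or the dataset tools).

```python
# Back-of-the-envelope check of the size estimates quoted in this post.
# Assumption: size grows linearly with the number of songs.
# All function names here are illustrative only.

def extrapolate_dataset_gb(sample_gb: float, sample_songs: int, total_songs: int) -> float:
    """Scale the sample file's size up to the full dataset."""
    return sample_gb * (total_songs / sample_songs)

def library_size_mb(songs: int, mb_per_100k: float) -> float:
    """Estimate library size from an observed size per 100,000 files."""
    return songs / 100_000 * mb_per_100k

def zipped_size_mb(raw_mb: float, observed_raw_mb: float, observed_zip_mb: float) -> float:
    """Estimate zipped size from one observed raw/zipped pair."""
    return raw_mb * observed_zip_mb / observed_raw_mb

if __name__ == "__main__":
    total_songs = 1_000_000

    # 1.8 GB sample holding 10,000 songs (1% of the set)
    dataset_gb = extrapolate_dataset_gb(1.8, 10_000, total_songs)

    # 15 MB per 100,000 files, as observed in one personal library
    library_mb = library_size_mb(total_songs, 15.0)

    # A 15 MB database zips to about 6 MB
    package_mb = zipped_size_mb(library_mb, 15.0, 6.0)

    print(f"Extrapolated dataset size: {dataset_gb:.0f} GB")
    print(f"Estimated library size:    {library_mb:.0f} MB")
    print(f"Estimated zipped package:  {package_mb:.0f} MB")
```

The inputs are just the figures quoted above, so the result is only as good as those observations.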
The Cosmic Bird - a triple merger of galaxies: http://eso.org/public/news/eso0755

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Free 1 Million Song Data Set
« Reply #2 on: March 08, 2011, 12:45:11 pm »

I just don't understand why they aren't distributing it via bittorrent.

tunetyme

  • Galactic Citizen
  • ****
  • Posts: 410
  • Have tunes will travel
Re: Free 1 Million Song Data Set
« Reply #3 on: March 08, 2011, 01:27:14 pm »

It seems to me that this is worth having for the metadata alone.  I've jumped on it.  So much for having just 37,000+ songs.  I see this as a great opportunity to be able to preview music before buying.  I still plan to stick to lossless.

JustinChase

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3276
  • Getting older every day
Re: Free 1 Million Song Data Set
« Reply #4 on: March 09, 2011, 06:13:01 pm »

Quote
It seems to me that this is worth having for the metadata alone.  I've jumped on it.  So much for having just 37,000+ songs.  I see this as a great opportunity to be able to preview music before buying.  I still plan to stick to lossless.

Does this actually include the songs?  It sounds like it only includes the metadata and links to 30-second samples.

I could see JRiver using this to benefit their search and track lookup processes, but other than that, I don't see a good use for the "average" user.

However, I could certainly just be missing it :)
pretend this is something funny

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Free 1 Million Song Data Set
« Reply #5 on: March 09, 2011, 08:27:43 pm »

Quote
Does this actually include the songs?  It sounds like it only includes the metadata and links to 30-second samples.

I could see JRiver using this to benefit their search and track lookup processes, but other than that, I don't see a good use for the "average" user.

However, I could certainly just be missing it :)

Nope.  You got it.  It is a developer tool.

JustinChase

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3276
  • Getting older every day
Re: Free 1 Million Song Data Set
« Reply #6 on: March 09, 2011, 11:07:28 pm »

Thanks for clarifying. It seems awfully big for metadata.

I wonder if (and hope) JRiver can use it to augment YADB and/or music fingerprinting/ID.


tunetyme

  • Galactic Citizen
  • ****
  • Posts: 410
  • Have tunes will travel
Re: Free 1 Million Song Data Set
« Reply #7 on: March 11, 2011, 08:28:01 am »

Thanks for clarifying. 

I believe that it will help identify new music that I may not be familiar with.  I haven't opened it yet but I am looking forward to seeing what they have developed.  I agree, it would be great if this was available through JRiver lookup processes.

Tunetyme

JimH

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 72439
  • Where did I put my teeth?
Re: Free 1 Million Song Data Set
« Reply #8 on: March 11, 2011, 08:37:34 am »

Quote
I believe that it will help identify new music that I may not be familiar with.  I haven't opened it yet but I am looking forward to seeing what they have developed.  I agree, it would be great if this was available through JRiver lookup processes.

The Performer Store inside MC has about 8 million tracks.  It's free to play the samples.

If you find something you like, you can click on the $ sign to buy a high quality MP3 track or even a CD from Amazon.