Want to start a music website, kid? Then get your storage right

There are stacks of music websites out there, all having grown and developed during a transformational period for the music industry. The emergence of file-sharing kicked it all off with sites like Napster and The Pirate Bay, which then led to the hotly contested (and officially sanctioned) business of music streaming.

For many of the more passive listeners amongst us, a playlist and a set of headphones are the only requirements for personal music consumption. However, for those who really invest themselves in it, other information and content surrounding artists' work can really add to the experience.

This is a service that Bandtrace is looking to provide. By combining information drawn from sites including iTunes, LyricWiki, Wikipedia and MusicBrainz, the startup aims to be the IMDb of music.

The website currently delivers a mash-up of over 20,000,000 pages and 100,000,000 links, and dealing with the massive datasets naturally requires an effective, scalable storage solution.

"We settled on a graph database quite soon after we started planning the Bandtrace architecture," CEO Tommy Wassgren explained.

"Neo4j we settled on because it has the right query language and nice format for working on using a web-based API. We found that in order for Neo4j to work as good as it does, you need a fast storage solution – basically SSD disks."

Translating the information to a mobile screen also proved a headache for the Bandtrace team.

"We used an approach called 'mobile first', which is pretty much standard these days when you create websites," Wassgren continued.

"When you get a bigger browser, you have a larger display so you can show more information. When you resize the window you show more information, which is called 'responsive design'. But that was quite hard, we had to work through quite a few iterations before we got that bit right, and we're still working on it, finding features we want to add or remove. We also track user interaction which helps us decide what data is most important to users."

Wassgren advised entrepreneurs also looking to start a venture with large datasets to avoid vendor lock-in and to deploy applications in a standard way. He also stressed the importance of using many small servers, rather than just one big one, to ensure the seamless addition of nodes in future.

"We chose Elastx from the beginning because it uses the platform called Jelastic where you can scale out the number of servers," he said.

Bandtrace's large datasets are stored on an all-flash storage array from SolidFire.

"When we first spoke to Tommy about Bandtrace, we were quick to share his enthusiasm for the project," said Tim Pitcher, VP International at SolidFire.

"As the Bandtrace platform was being built from scratch, however, it needed a solution that ticked all of the boxes for its complex data requirements and would support the whole Bandtrace platform. Tommy was very keen on a scalable data solution with guaranteed IOPS (Input/Output Operations Per Second) to avoid 'noisy neighbour' issues and ensure reliable and consistent performance, and we were able to provide him with a solution that met these demands."

To see just how quickly Bandtrace copes with its vast datasets, visit the website yourself.