Social media fascinates and confuses many corporations. You can’t blame them. On one hand, it’s an opportunity to listen to their customers and drive deeper engagement. On the other hand, the wrong social media campaign could see an organisation fall flat on its face, in expensive or embarrassing ways.
This is partly a philosophical problem about how a brand wants to be perceived on social media. It’s also, in large part, a technology problem. Understanding and acting on social data is one of the defining challenges of the first big data era.
Unstructured data is pouring out of social media platforms. It’s messy and it’s unpredictable. Traditional databases are unsuitable for processing this kind of data. But when all you have is a relational database, everything can look like a row or a column. This is the wrong approach for social media. It is well documented that many organisations are not prepared for this influx of social data.
I’m CTO of a company called Stackla, a social platform that aggregates, analyses and curates social media content. In this article I want to look at how the technology decisions our company made are enabling us to help some of the biggest organisations in the world harness the power of social data.
The data revolution will not be structured
Social media is the quintessential big data problem. The raw volume of data is massive and the variety of data types is mind-blowing. Commentators have speculated that there are more types of social media data than there are 30 Rock GIFs on the internet.
Videos, Vines, images, location data, GIFs, individual metrics and personal information are just a few of the multitude of data genres generated on the social web. And every piece of it is unstructured data that doesn’t fit into the traditional IT view of information. Imagine a fast-moving river flowing with every piece of Lego ever created. With one small net you need to pick out all the grey bricks. Then cross-reference that with all the bricks that are required to make the Lego Millennium Falcon.
It’s not difficult. It’s impossible. You need to fundamentally change the way you view and understand the river.
Function and flexibility
To create a platform that was flexible and scalable enough for the entire social web, we had to look away from traditional solutions like relational databases. In 2010, after testing a number of options, we started building Stackla on MongoDB, the fastest growing database in the world.
It must be said that there are a number of so-called NoSQL (Not only SQL) databases. These databases are part of a new generation of tools that were designed to solve complex unstructured data challenges. There are a number of different flavours of NoSQL: document, graph, key-value and wide-column stores. Each has its own way of handling data. What’s consistent is that these databases have far greater data flexibility and scalability than relational databases, so they are well suited to social data.
However, our experience showed that many of these projects are too one-dimensional to take over the operational duties of a database. Many NoSQL databases lose key functionality that our database administrators had come to rely on. Ultimately we needed a database that had the innovations of NoSQL while maintaining the foundational functions of relational databases.
The other technology that matured at just the right time for us was Amazon Web Services. Building our platform on AWS’s cloud gave us the ability to deploy and scale wherever our customers needed us.
An example of functionality provided by this approach is how our database administrators use MongoDB’s location-aware sharding. The sharding keeps relevant data stored in Amazon Web Services data centers that are geographically close to local campaigns. This keeps latency low and improves end user experience.
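The idea behind location-aware placement can be sketched in a few lines of plain Python. This is only an illustration of the routing concept, not MongoDB’s API or our production configuration: the zone map, the shard names, the `region` field and the helper functions are all hypothetical.

```python
# Hypothetical zone map: a region code -> a shard hosted in a nearby data center.
# None of these names come from a real deployment.
ZONES = {
    "AU": "shard-sydney",
    "US": "shard-virginia",
    "EU": "shard-ireland",
}
DEFAULT_SHARD = "shard-virginia"  # assumed fallback for documents with no known region


def shard_for(document):
    """Return the shard whose zone matches the document's region field."""
    return ZONES.get(document.get("region"), DEFAULT_SHARD)


def place(documents):
    """Group documents by the shard they would be routed to."""
    placement = {}
    for doc in documents:
        placement.setdefault(shard_for(doc), []).append(doc)
    return placement


posts = [
    {"_id": 1, "region": "AU", "text": "Great coffee in Sydney"},
    {"_id": 2, "region": "EU", "text": "Match day in Dublin"},
    {"_id": 3, "region": "AU", "text": "Surf is up"},
    {"_id": 4, "text": "No region recorded"},
]

placement = place(posts)
for shard, docs in sorted(placement.items()):
    print(shard, [d["_id"] for d in docs])
```

In a real cluster the database does this routing itself, based on ranges of the shard key assigned to each zone. The sketch only shows why content for an Australian campaign ends up being read from storage close to Australian users.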
MongoDB also gives our developers flexibility. This was helpful in the early stages of development and became vital after we went into production, allowing us to evolve the platform as social media went off in unpredictable directions. For instance, in 2013, Twitter introduced the micro-video service Vine. The format quickly took off. Our developers relied on MongoDB’s flexible schema to incorporate a completely new type of content into the platform, without taking it offline and interrupting service.
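To show what a flexible schema means in practice, here is a minimal sketch in plain Python, with a list of dictionaries standing in for a document collection. It is not MongoDB’s API or our actual data model: the field names, the `insert`, `find` and `summarise` helpers and the example URLs are all assumptions made for illustration.

```python
# A plain list stands in for a document collection; this is not a real database client.
collection = []


def insert(doc):
    """Store a copy of the document as-is; no table definition is consulted."""
    collection.append(dict(doc))


def find(**criteria):
    """Return documents whose fields equal every given criterion."""
    return [d for d in collection
            if all(d.get(key) == value for key, value in criteria.items())]


def summarise(doc):
    """Pick a preview from whichever field this kind of document carries."""
    return doc.get("text") or doc.get("url") or doc.get("video_url") or "(no preview)"


# Content types the platform already handled (hypothetical fields).
insert({"type": "tweet", "author": "alice", "text": "Hello world"})
insert({"type": "image", "author": "bob",
        "url": "https://example.com/a.jpg", "width": 640, "height": 480})

# A new content type arrives later. It carries fields no earlier document has,
# and it is stored without altering or migrating the existing documents.
insert({"type": "vine", "author": "carol",
        "video_url": "https://example.com/v.mp4", "duration_seconds": 6})

for doc in collection:
    print(doc["type"], "->", summarise(doc))
```

With a fixed relational schema, the same change would typically mean adding columns or a new table and migrating before the first Vine could be stored. Here the only code that had to learn about the new type is the code that reads it.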
The use of commodity cloud storage and a non-relational data strategy has helped us offer the rich functionality and scalability that our clients demand.
Unknowable unknowns: Future of social data
So what’s coming next? The tea leaves are far from clear, and anyone who tells you exactly what to expect in the future of social media is probably just guessing. What we can say is this: there will be cat pictures, it will get weirder, and the variety and volume of data are likely to increase significantly.
To take advantage of these mercurial trends, organisations need platforms built on modern databases that are both functional and flexible.
Social media can be a frightening game to be in. We’ve had some success with Stackla but a new holographic social platform could come out tomorrow (it’s only a matter of time). We need to be able to quickly help our customers understand and capitalise on this new platform.
Semin Nurkic, CTO of Stackla