Business intelligence applications are moving away from the traditional connection to an OLAP data source built on a relational database system, and towards the ability to link to and consume data from a variety of disparate sources, including social networks. The ability of a modern BI application to use mashups of data, which gives it agility when integrating multiple types of data source, has led many to promote NoSQL as the next big thing within BI. Does this mean that we have seen the end of the SQL-style RDBMS within the BI area? There are many pros and cons for both systems, but I believe that there is still a place for both within the BI arena.
NoSQL implementations like Cassandra and Dynamo can scale out past the terabyte and on to the petabyte range by scaling horizontally across multiple nodes, and the cost differences between SQL and NoSQL implementations are significant. However, each type of NoSQL system uses its own proprietary code for its connections, and the system is usually set up for one particular data model. This enables super-fast performance but hinders the ability to run ad hoc queries on the data.
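To make that trade-off concrete, here is a minimal sketch in Python of key-based partitioning across nodes. It is an illustration only, not how Cassandra or Dynamo are actually implemented: those systems use consistent hashing and replication, whereas this sketch uses a simple hash-modulo scheme. The node names, the sample orders and the helper functions (node_for, put, get, find_by_product) are all hypothetical.

```python
import hashlib

# Hypothetical node names; a real cluster would have many more.
NODES = ["node-a", "node-b", "node-c"]

# One in-memory dict per node stands in for that node's storage.
cluster = {name: {} for name in NODES}


def node_for(key: str) -> str:
    """Pick the node that owns a key (simple hash-modulo, an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]


def put(key: str, row: dict) -> None:
    """Write a row to the single node that owns its key."""
    cluster[node_for(key)][key] = row


def get(key: str):
    """Fast path: a lookup by key touches exactly one node."""
    return cluster[node_for(key)].get(key)


def find_by_product(product: str) -> list:
    """Ad hoc path: a filter on a non-key field must scan every node."""
    return [
        row
        for shard in cluster.values()
        for row in shard.values()
        if row["product"] == product
    ]


put("order-1", {"product": "tea", "amount": 4.50})
put("order-2", {"product": "tea", "amount": 3.00})
put("order-3", {"product": "coffee", "amount": 2.75})

by_key = get("order-2")            # served by one node
tea_orders = find_by_product("tea")  # has to visit all nodes
```

The lookup by key is the query the model was designed for, so adding nodes spreads the load. The filter by product was not planned for, so it has to visit every node, which is the ad hoc querying problem in miniature.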
Companies are now looking to connect to social networking data so that they can trend sales and customer selections. This data is largely unstructured, and most of it is held in NoSQL form (Twitter, Facebook, etc.). The problem, as I see it, for most major business clients is deciding who within their organization can implement a NoSQL BI solution. Most of the requirements of a BI system (large data sets, speedy retrieval of data, and display of results to all business users) can be met with a NoSQL data set; however, the technology does require a different type of technical resource. One possible solution to this problem could be the Toad for Cloud Databases application by Quest Software, which I am just starting to look at in more detail. It shows great promise in its ability to interrogate cloud-style NoSQL databases like Cassandra, HBase and Azure using SQL terminology, and to transfer data between NoSQL databases and SQL databases.
The trade-off with NoSQL databases is their lack of ACID guarantees and their limited ability to support ad hoc querying. Utilizing a SQL RDBMS allows us to use standard connections between servers and clients, especially those stalwarts of BI reporting, Crystal Reports and Business Objects. It also allows for clean, easy connections when utilizing popular frameworks and formats such as .NET or XML. Most IT departments have at least one SQL data access expert in their ranks, which allows them to at least understand a BI implementation based on a SQL RDBMS.
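As a small illustration of those two RDBMS strengths, the sketch below uses Python's built-in sqlite3 module as a stand-in for a full SQL server. The sales table, its columns and the record_batch helper are hypothetical names chosen for the example. It shows a batch insert that is rolled back as a whole when one row is invalid (the atomicity part of ACID), followed by an ad hoc aggregate query that nobody had to plan for when the table was designed.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    " id INTEGER PRIMARY KEY,"
    " store TEXT NOT NULL,"
    " amount REAL NOT NULL CHECK (amount >= 0))"
)
conn.executemany(
    "INSERT INTO sales (store, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 40.0)],
)
conn.commit()


def record_batch(rows) -> bool:
    """Insert all rows or none of them."""
    try:
        # The connection context manager commits on success and
        # rolls the whole transaction back if an exception is raised.
        with conn:
            conn.executemany(
                "INSERT INTO sales (store, amount) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The second row breaks the CHECK constraint, so neither row is kept.
ok = record_batch([("south", 25.0), ("south", -5.0)])

# Ad hoc query: total sales per store, written after the fact.
totals = dict(
    conn.execute("SELECT store, SUM(amount) FROM sales GROUP BY store")
)
```

After the failed batch, the totals still reflect only the three original rows, and the GROUP BY query needed no change to the schema or to how the data was stored.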
In respect of BI, my experience has led me to believe that for the majority of EPOS-based customers, a SQL-based RDBMS application, possibly with a star schema data warehouse, will suffice and will provide both transactional integrity and the ability to scale as required. There will of course be exceptions to this model, including a requirement to scale out past the petabyte mark or a requirement for super-fast results, and it is at this point that I believe the NoSQL solutions can and should be investigated. I believe that SQL and NoSQL applications will be implemented side by side in many organizations in the future, especially as the drive to include social networking data in our results is realized. Many BI specialists, including myself, already utilize a plethora of specialized tools to deliver results to the customer, and I cannot see any reason for not adding NoSQL to the toolbox.