Creating a horizontal comparison bar chart using Tableau

I have put together a screencast with step-by-step instructions on how to create a horizontal comparison bar chart in Tableau. Done incorrectly, this can be problematic, leading to lots of time wasted trying to align two separate bar charts on a dashboard. With this method you will create a single chart making […]

Why Source to Target Mapping documents matter

We as developers really do not like writing documentation. I am quite sure that if given a choice some developers would rather walk over broken glass than sit down and create documentation for their solutions. Sure there are some exceptions to this, but in general writing documentation is not something that we enjoy nearly as […]

Quick and dirty test of Google BigQuery’s ability to scale

As a quick weekend experiment I thought it might be a good idea to look at how BigQuery scales. To test this, I used the dataset that I had already created in BigQuery for my previous blog post comparing HDInsight + Hive against BigQuery. One of the first challenges with such […]

HDInsight + Hive vs BigQuery – A Detailed Comparison

A big thank you goes to Daniel Haviv for his suggestion to use ORC with Snappy compression over Tez (with Vectorised reads) as well as the advice he provided to easily set this up. I have updated the post with the figures from this configuration. A while ago I wrote about using Google BigQuery […]

Missed opportunity: Power Query as data source in Azure Machine Learning

If you are doing anything around Data Science and Machine Learning in the Microsoft space then I am sure that you have come across Azure Machine Learning. Azure ML is Microsoft’s push to bring machine learning to the masses, kind of the same way that Microsoft has done everything in its power to bring Business Intelligence […]

A fresh approach to delivering projects using Power BI

With the recent changes to Power BI, Microsoft has unleashed the most exciting capabilities in the Microsoft BI space in a very long time. Power BI not only has the ability to promote self-service BI, it also has far greater implications for how we deliver BI projects in an agile manner. Power BI allows us to […]

Drop a column from all tables in a specific schema or database

Sometimes you just need to drop a column from all tables where it exists in a specific schema or maybe your entire database. The following script will allow you to do that and will also drop any indexes that use the specified column in order to allow the column to be dropped. Use this one […]

Implementing Linear Regression using TSQL

A few weeks ago I started the Machine Learning course, presented by Andrew Ng, on Coursera and so far I am thoroughly enjoying it. The course uses GNU Octave as the environment in which the coding takes place and so far it has proven to be a fantastic tool to use to learn the concepts […]

Centralised Logging for BI Projects based on Android Logcat

I have recently been playing around with creating a centralised logging mechanism to be used across all components of a BI solution. On some projects you can have many different modular and separately developed components that make up the solution as a whole, and each one of these components requires logging. This brings with it […]