Smooth BI processes are key to forging the analytics team’s reputation within a company. However, data engineers and the people maintaining the BI infrastructure often face an ironic challenge. Because a solid BI infrastructure only has an indirect positive impact on business users (it facilitates more efficient analytics processes, which in turn have a direct positive impact), it can be hard to justify spending resources on improving the infrastructure when the business needs insights quickly. Yet overlooking the importance of data processes is potentially the greatest threat to efficient analytics, and doing so always becomes a major liability over time; the longer you wait, the bigger the pain.
If you want to be rocking analytics, you need to minimise your BI debt over the long term, which means first focusing on building solid data foundations. BI debt can be seen as a score between 0 and 100, representing the proportion of your analytics team’s time spent fixing data processes. The higher the score, the more time spent fixing issues and the less time spent generating valuable insights. Making the wrong choices can quickly throw the balance off and drag you into the 50–100 range, especially when you grow quickly and the negative impact of mistakes can compound. This is why, at 173Tech, we have come up with a list of the top 10 tips to help you grow your analytics smoothly and maintain low BI debt over time. There is obviously more to it, but these tips will already get you a long way!
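To make the scale concrete, the score is simply the share of team time spent firefighting, expressed out of 100. A minimal sketch (the 0–100 scale is this article's framing, not a standard industry metric):

```python
def bi_debt_score(hours_fixing: float, hours_total: float) -> float:
    """Proportion of the analytics team's time spent fixing data
    processes, on the 0-100 BI debt scale described above."""
    if hours_total <= 0:
        raise ValueError("total hours must be positive")
    return 100 * hours_fixing / hours_total

# A team losing 12 of 40 weekly hours to firefighting sits at 30;
# past 50, more time goes to fixing than to generating insights.
print(bi_debt_score(12, 40))  # 30.0
```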
1. Focus on your key business challenge(s)
This seems obvious but is often overlooked. Analytics’ main aim is to give the business a better understanding and find areas of optimisation, so it is really important to focus on the right challenges. If you find a way to increase ROI by 10x on a user segment, but that segment only represents 0.5% of your user base, it might not be worth the effort. Before starting analytics projects, identify the key challenge(s) the business is facing. Is it an acquisition challenge? A retention challenge? Something else? Once this is clear, you can focus your analytics resources. Similarly, your data collection efforts should focus on the data you need to solve your key challenge(s); there is no need to collect everything under the sun when starting.
2. Keep analytics connected to the business
Analytics needs to be well connected to the business. As analytics’ aim is to generate value for business users, frequent and consistent communication is crucial. Otherwise you will most likely end up with analytics projects that do not address the business’ most important challenges. Ideally, you also want to avoid having analytics sit under a specific function (e.g. finance, marketing, engineering, etc.), as doing so will bias the focus of analytics towards the function it reports to.
3. Use an analytics warehouse
Once you are committed to investing in advanced analytics, you should use an analytics warehouse (e.g. Redshift, Snowflake, BigQuery, etc.) to centralise your data. These databases are optimised for running calculations over extremely large volumes of data. Using an operational database (e.g. PostgreSQL, MySQL, etc.) might be fine in the short term, while data volumes remain limited, but it will not scale well when volumes grow significantly: a lot of analytics time will be lost waiting for long-running queries and on unnecessary optimisation work.
4. Automate data extraction
When building your first analytics infrastructure, you will most likely have only one or two analysts in-house, so it is crucial to automate as much of the heavy lifting as possible. A few services (e.g. Fivetran, Stitch, etc.) let you automate data extraction and load the data into your analytics warehouse without writing any code. There will be situations where you need custom extraction, however; in that case you can use SAYN, our open-source data processing framework tuned for simplicity and flexibility. It is worth assessing the cost vs. benefit of automating data extraction when you start, as this can get you going quicker.
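When custom extraction is needed, the core of the job is usually a small extract-and-flatten-and-load script. The sketch below is illustrative Python only: the endpoint, field names and record shape are invented, and it does not use SAYN's actual API.

```python
import json
from urllib.request import urlopen

def extract_orders(api_url):
    """Fetch one page of orders from a (hypothetical) JSON API."""
    with urlopen(api_url) as resp:
        return json.load(resp)["orders"]

def flatten_orders(raw_records):
    """Turn nested API records into flat rows ready to load
    into a warehouse table."""
    rows = []
    for rec in raw_records:
        rows.append({
            "order_id": rec["id"],
            "user_id": rec["customer"]["id"],
            "amount": rec["total"]["amount"],
            "currency": rec["total"]["currency"],
        })
    return rows

# In a real pipeline you would then bulk-insert `rows` into the
# warehouse and schedule the whole script to run incrementally.
```

Keeping the flattening step a pure function, separate from the network call, also makes the trickiest part of the pipeline easy to test.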
5. Do not rely on out-of-the-box analytics tools
Out-of-the-box analytics tools (e.g. Mixpanel, Amplitude, Google Analytics, etc.) are often an easy way to get started: send them your data and you get direct access to pre-made dashboards. However, they do not tend to scale well, for two main reasons. First, as data volumes grow, pricing grows significantly as well, eventually becoming more expensive than your own analytics warehouse. Second, the analysis possibilities are limited, as you have little flexibility in how you can process the data. The real value of analytics lies in going beyond averages!
6. Model your data
Data modelling is a no-brainer. It is a set of automated transformation processes that organise the data in your warehouse so it can easily be consumed by reporting tools or analysts (more details in my article on building advanced analytics for startups). It is the main way to create a single source of truth, control SQL quality and ensure the integrity of your numbers, and it will save you tons of time in the long run! Without data modelling, analysts will often re-write the same query multiple times with small but significant differences. In addition, complex logic written by hand is prone to human error, which automation avoids.
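As a toy illustration of a single source of truth, consider defining daily active users once, centrally, instead of every analyst re-deriving it. The event shape here is invented; in practice this logic would live in a modelled SQL table, but the principle is the same:

```python
from collections import defaultdict

def model_daily_active_users(events):
    """One canonical DAU definition: distinct user_ids per calendar day.
    Every dashboard reads this output instead of re-writing the logic,
    so small differences (e.g. counting duplicate events) cannot creep in."""
    users_by_day = defaultdict(set)
    for e in events:
        users_by_day[e["date"]].add(e["user_id"])
    return {day: len(users) for day, users in users_by_day.items()}

events = [
    {"date": "2021-03-01", "user_id": "a"},
    {"date": "2021-03-01", "user_id": "a"},  # duplicate event counts once
    {"date": "2021-03-01", "user_id": "b"},
    {"date": "2021-03-02", "user_id": "a"},
]
print(model_daily_active_users(events))  # {'2021-03-01': 2, '2021-03-02': 1}
```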
7. Centralise data and business logic in your warehouse
Your analytics warehouse should become the central location for data logs and data modelling. That way, your source of truth sits in one place and it is easy to connect multiple data sources (e.g. product, CRM, marketing, etc.). Because all your data lives in the same place, joining it is easy as long as you have a common identifier. Most of your analytics insights will come from joining data from multiple sources and creating a single customer view! In addition, ensure all your business logic (i.e. data modelling) sits solely in your warehouse and not in your visualisation tool; otherwise things will become messy and harder to maintain as you scale.
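Stitching sources together on a common identifier is straightforward once everything lives in one place. A minimal sketch (field names invented) of joining CRM and product usage data into a single customer view:

```python
def single_customer_view(crm_records, product_records):
    """Join CRM and product usage data on user_id, producing one
    record per customer (a left join from the CRM side)."""
    usage = {r["user_id"]: r for r in product_records}
    view = []
    for c in crm_records:
        u = usage.get(c["user_id"], {})
        view.append({
            "user_id": c["user_id"],
            "plan": c["plan"],
            # Customers with no product activity default to 0 sessions.
            "sessions": u.get("sessions", 0),
        })
    return view

crm = [{"user_id": "u1", "plan": "pro"}, {"user_id": "u2", "plan": "free"}]
product = [{"user_id": "u1", "sessions": 42}]
print(single_customer_view(crm, product))
# [{'user_id': 'u1', 'plan': 'pro', 'sessions': 42},
#  {'user_id': 'u2', 'plan': 'free', 'sessions': 0}]
```

In practice this join would be a modelled table in the warehouse; the point is that it only works because both sources share an identifier and live in the same place.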
8. Choose a visualisation tool with a logical layer
You want two key things from your visualisation tool: the ability to define business metrics on top of your data model (i.e. a logical layer) and a drag-and-drop interface. The former ensures the reliability of your numbers (metrics are defined by the analytics team, so everyone sees the same thing). The latter significantly reduces the time analytics spends producing dashboards, as you do not need to (re-)write SQL many times and business users can create their own dashboards. A tool that forces you to write SQL to build visualisations will become a major pain when you scale. Metabase is not perfect but is a good start (free and open source); Looker is a good solution when you need more muscle.
9. Use version control
Version control enables your team to develop and test in parallel environments without affecting what runs in production. If a mistake happens, it also makes it easy to revert to the previous version of the code. Version control is used extensively in software development, and analytics processes can benefit significantly from it.
10. Use peer reviews
Because two pairs of eyes are always better than one, build peer reviews into your processes. This will increase the quality of the output and reduce the number of mistakes that slip through. In the end, it will significantly help you build the reputation of, and trust in, the analytics team within your organisation.
That’s all folks! Keeping your BI debt low will significantly increase the impact of your analytics team over time. It will enable you to dedicate most of your resources to meaningful and impactful projects and strengthen the relationship between the business and analytics. We hope these 10 tips will help you build your A-game analytics team. Happy implementation :)