ONYX REPORTING LTD.

VIDEO: 10 Tips to Polish your Dashboards & Simplify Charts

28/1/2021

I had the opportunity to appear as a guest speaker on Janice Taylor's webinar series on Dashboard Design and Best Practices, https://www.reportsyourway.com/.  During this 15-minute presentation, I reviewed 10 easy checkpoints dashboard designers can implement to produce concise and effective dashboards.


Preventing Stockout and Improving Supply Chain Logistics with Data

21/5/2020


The Problem.

Acme Corp is a national distributor of electronics and hardware to small and medium-sized companies.  Because size and cashflow prevent Acme Corp's SMB customers from carrying large inventories, they frequently stock out or request expensive last-minute air freight deliveries.

Acme Corp recognized that by combining customer sell-out with product shipment data, they could apply forecasting models that would automate and optimize product replenishment for their SMBs.  This would deliver a better customer experience, while preventing stockout and creating new opportunities for optimizing product distribution logistics.

Unfortunately, although SMBs were happy to send till data to Acme Corp, the Domo implementation team did not have access to reliable starting inventories at the customer locations, so they approached me to design a process for deriving a starting inventory.

In this tutorial, we'll:
  • build a dataflow that derives starting inventory
  • learn why certain ETL tiles (blocking functions) have such a drastic impact on query performance, and how you can optimize around them.

While this blog post is more developer-focused, it's important to recognise that these types of data normalization activities are the bridge step to building forecasting models and implementing workflow automation.
  • Ex.  If we build a forecasting model around customer sales but don't consider available inventory, it's impossible to gauge whether the lack of sales is 'normal customer behaviour' or attributable to stock out.
  • In a similar vein, before 'advanced analytics' can begin, we must carefully evaluate whether the available data represents reality.  If we don't apply domain expertise to the problem, we might dismiss rows with negative inventory as invalid or outlier measurements and therefore exclude them from our forecasting analysis.

With any project like this, validation is almost more important than the solution itself.  Make sure to identify and confirm assumptions with the domain experts before proceeding to the next step.

​In any case... on to the tutorial!

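The full tutorial walks through the dataflow step by step.  To give a flavour of the core logic first, here's a minimal sketch in R: given dated per-SKU shipment (in) and till-sales (out) records, the smallest starting inventory that keeps the running stock balance non-negative is the worst cumulative deficit observed.  All column and object names below are hypothetical.

#derive the smallest starting inventory that keeps the running balance non-negative
#(assumes columns date, units_shipped, units_sold; all names hypothetical)
derive_starting_inventory <- function(df) {
  df <- df[order(df$date), ]                            #ensure chronological order
  net_flow <- cumsum(df$units_shipped - df$units_sold)  #running balance if we started from zero
  max(0, -min(net_flow))                                #the worst deficit becomes the starting stock
}

#apply per SKU (assumes a data frame 'till_data' with an sku column)
starting_inventory <- sapply(split(till_data, till_data$sku), derive_starting_inventory)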

Data Science + Domo: KMeans

18/8/2018

Disclaimer:  Let's get it out of the way.  I am employed by Domo; however, the views and ideas presented in this blog are my own and not necessarily the views of the company.

In this post I'm going to show you how to use Domo with its built-in data science functions to perform KMeans clustering using the Hotels dataset from Data Mining Techniques for Marketing, Sales, and Customer Relationship Management, 3rd Edition, by Gordon S. Linoff and Michael J. A. Berry (Wiley, 2011).  Then I'll show you how you can do the same thing using Domo's platform to host the data and R to transform and analyze the data.

Why are we doing this?
From the platform perspective, my goal is to showcase how Domo can support Business Analytics and Data Science workflows.

From the business perspective, an organization may want to cluster Hotels to facilitate creating 'hotel personas'.  These personas (ex. luxury business hotel versus weekend warrior favorite) may enable the creation of marketing campaigns or 'similar hotel' recommendation systems. 
  
​Disclaimer 2:  I do not own the Hotels dataset.

The Domo Platform Organizes Datasets

Step 1:  Upload the Hotels dataset to the Domo datacenter using the file connector. 

Duration:   < 2 minutes or 8-ish clicks.

Domo enables analytics workflows by assembling all the recordsets in one place.  The data can be sourced internally from a data warehouse, POS system, Excel spreadsheet or SSAS cube or come from external sources like  Facebook, Google Analytics, or Kaggle.
Once you've uploaded data to Domo you can apply Tags to keep track of your datasets or implement security to control who has access to your data (even down to the row-level).

Additional notes for InfoSec:
Given that the largest risk in data security is user behavior, finding solutions that are preferable to personal GitHub or Dropbox accounts, USB sticks or (God forbid) the desktop remains a priority.  One of Domo's most understated highlights is its ability to surface IT governed and cleansed datasets to analysts in a single place that's highly accessible yet fortified with bulletproof (Akamai) security.

Notes for architects and engineers:
Under the hood, Domo's platform combines the best of big data / data lake architecture (distributed, high availability, etc.) with (good) datamart recordset management techniques.  Data workers familiar with Hadoop or Amazon Web Services will recognize Domo as a modular yet integrated big data stack wrapped in an easy-to-use GUI that business users will appreciate and get value out of from day one.

When you set up a trial of Domo, you effectively have a free data lake that's ready for use -- complete with the ability to store literally millions of rows and/or gigabytes of data, with the performance you'd expect from Amazon S3 or Microsoft Azure storage at a fraction of the cost.  Try it.

Introducing DataFlows

To the horror of every data scientist out there, in this blog, I'll skip data preprocessing, profiling or exploration and go straight to using Magic ETL to apply KMeans clustering on a set of columns in my data.

Side Note
ETL stands for Extract, Transform and Load, and Domo provides a proprietary drag-and-drop interface (Magic ETL) for data transformation.  Technically, the data has already been extracted from your source system and loaded into Domo, so all we're left with is the transform phase.

Double Side Note for Clarity:  
Each DataSet is stored in Domo somewhere as a separate file.  
Each DataFlow has Input DataSets and creates Output DataSets - new file(s)

You have 3 types of dataflows native to the Domo platform:
  • Magic ETL (Domo's proprietary drag-and-drop tool)
  • MySQL
  • Redshift.  NOTE:  Redshift is a beta feature for transforming 5 million+ row datasets.  If you need the feature 'turned on', give me a holler.
Additionally, Domo supports data ingestion via API, ODBC, and various other SDKs.  In other words, you can transform data using virtually any platform before pushing it to Domo.  In this blog, I'll do some data analysis using R and then push the results to Domo from R.
Magic ETL has a user-friendly set of Data Science functions.  They are in Beta, so if you don't see them in Magic ETL, shoot me an email so we can get them enabled for you.

What is Clustering and why are we using it?
It is not my goal to make you a data scientist in a 5-minute blog post, but if you are interested, the Data Mining Techniques book I linked earlier may be a decent place to start.

Consider the following (completely fictitious) example:

"Jae, given your stay at the Threadneedles Hotel in London, you might like the Ace Hotel in Portland which has similar review rates by international guests, number of rooms and avg. room rates."  

In short, clustering allows you to group entities (hotels) by sets of attributes (in our case, number of rooms, number of domestic reviews, international reviews, whether the hotel is independent etc.).
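Mechanically, the grouping step itself is tiny.  Here's a minimal sketch in base R, assuming a hotels data frame with the attribute columns named below (names hypothetical):

#cluster hotels into 5 groups on a few numeric attributes (column names hypothetical)
attrs <- hotels[, c('number_of_rooms', 'domestic_reviews', 'international_reviews')]
fit <- kmeans(scale(attrs), centers = 5)
hotels$cluster <- fit$cluster   #each hotel receives a cluster label from 1 to 5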
How easy is Magic ETL? 

It's 1, 2, 3.  Drag transformation functions (1) into the workspace (2), then define the function parameters (3).

In this example, I used the previously uploaded hotels-wiley dataset as an input, then I applied a K-Means function over a subset of columns.

Note:  I explicitly defined the number of clusters to create (5).
If you don't know what KMeans does, Domo's documentation is on par with what you'd expect from any data science-lite platform.
Step 3:  Preview and Output the Dataset for visualization or further analysis

​
In the final steps, we'll add a SELECT function to choose which columns to keep.  In this case, we'll only keep columns that were used by the KMeans algorithm, as well as columns to identify the hotel (hotel_code).

Lastly, we add an Output Dataset function to create a new dataset in Domo which we can later visualize or use for further analysis.
 Beware the false prophets ...

THIS IS IMPORTANT. 
Do not fall into the trap of misinterpreting your results!

In the output dataset below, you'll see we have a new column cluster which happens to be next to a column bubble_rating.  The hasty analyst might conclude: "oh hey, there appears to be a correlation between cluster and hotel ratings." ​

​And all Data Scientists in the room #facepalm while #cheerForJobSecurity.
There are 5 clusters because, in an earlier step, we told the algorithm to create 5 clusters.  We could as easily have created 3 or 7.  There is not necessarily a correlation between which cluster a hotel ended up in and its rating.  Keep in mind, the cluster number is just an arbitrary label.  If you re-run the function, ideally you'll end up with the same groupings of hotels, but they could easily have different cluster numbers (cluster 3 could become cluster 4, and cluster 4 could become cluster 1).

Side Note which will become important later:
It would be nice if Domo included metrics for measuring separation between clusters or the strength of clusters.  We arbitrarily told KMeans to create 5 clusters.  But who knows, maybe there are really only 3 clusters.  There are quantitative methods for identifying 'the right' number of clusters, but they aren't available in Domo out of the box.

Domo embraces all Analytic Tools

Let's do it again.  In R.
​

As demonstrated, Domo has user-friendly data science functionality suitable for the novice data scientist; however, in real-world applications, analysts will likely use Domo's data science functions to build up and validate proof of concepts before transitioning to a 'proper' data science development platform to deliver analytics.

Domo has both Python and R packages that facilitate easy data manipulation in full-fledged data analytics environments.

To extract data from Domo into R:
  1. set up an access token & install the DomoR package
  2. find the dataset ID in the URL: https://<yourDomoInstance>.domo.com/datasources/0afe1c21-18f3-496a-acb5-599e07c9e33c/details/overview (the GUID after /datasources/ is the dataset ID)

The R Code:  

#install and load the packages required to extract data from Domo
install.packages('devtools', dependencies = TRUE)
library(devtools)
install_github(repo = 'domoinc-r/DomoR')
library(DomoR)

#initialize the connection to your Domo instance
domo_instance <- 'yourDomoInstance'
your_API_key <- 'yourAPIKey'
DomoR::init(domo_instance, your_API_key)

#extract data from Domo into R
datasetID <- 'yourDataSetID'
raw_df <- DomoR::fetch(datasetID)

## PROFIT!! ##
Earlier we asked the question: was 5 'the right' number of clusters?

Given the variables from before (number of rooms, number of domestic reviews, etc.), once NULL values have been removed and the variables scaled and centered, it appears that 6 clusters may have been a better choice.
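For the curious, here's a minimal sketch of that analysis in R, assuming raw_df from the fetch above and hypothetical column names, using the classic elbow method to compare candidate cluster counts:

#keep the clustering variables and drop rows with NULLs (column names hypothetical)
vars <- c('number_of_rooms', 'domestic_reviews', 'international_reviews')
hotel_df <- na.omit(raw_df[, c('hotel_code', vars)])

#scale and center the variables so no single attribute dominates the distance calculation
scaled <- scale(hotel_df[, vars])

#elbow method: total within-cluster sum of squares for k = 1..10
set.seed(42)
wss <- sapply(1:10, function(k) kmeans(scaled, centers = k, nstart = 25)$tot.withinss)
plot(1:10, wss, type = 'b', xlab = 'number of clusters k', ylab = 'total within-cluster SS')

#fit the chosen model (the elbow suggested 6) and attach the labels
fit <- kmeans(scaled, centers = 6, nstart = 25)
cluster_df <- cbind(hotel_df, cluster = fit$cluster)

cluster_df here is the data frame we'll push back up to Domo below.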

Side bar: READ THE MANUAL!  Unless you read the detailed documentation, it's unclear how Domo's KMeans handles NULL values or whether any data pre-processing takes place.  This can have a significant impact on results.

​With the revelation that we should have used 6 KMeans clusters, we can either adjust our Magic ETL dataflow in Domo, or we can use R to create / replace a new dataset in Domo!

#create a brand-new dataset in Domo from the R data frame
DomoR::create(cluster_df, 'R_hotel_KMeans')
#or overwrite an existing dataset by its ID
DomoR::replace_ds('yourDataSetID', cluster_df)

In Domo's Magic ETL, we'll bind both cluster results to the original dataset for comparison (an equivalent R merge is sketched after this list).
  1. Add the R generated dataset (R_hotel_KMeans) to Magic ETL
  2. JOIN the datasets on the hotel_code
  3. SELECT columns to keep (remove any duplicate or unwanted columns)
  4. Revel in the glory!
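For comparison's sake, the same bind could be sketched in R with a merge, assuming the Domo-generated clusters are fetched into their own data frame (the dataset ID is hypothetical):

#join the Domo-generated clusters to the R-generated clusters on hotel_code
domo_clusters <- DomoR::fetch('yourDomoKMeansDataSetID')   #hypothetical dataset ID
comparison_df <- merge(domo_clusters, cluster_df, by = 'hotel_code', suffixes = c('_domo', '_r'))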

Wrap it up

NEXT STEPS:  Create 'Cluster Personas.'  
Typically, organizations would use clusters of attributes to define 'hotel personas'. 
  1. 'Budget Business Hotel'
  2. 'Luxury Business Hotel'
  3. 'Backpackers Delight'
  4. 'Destination for Honeymooners'
  5. 'Weekend Treat'
  6. 'Long-Term Stay'
Because the attributes of 'Budget Business Hotel' and 'Backpacker's Delight' are similar, with 5 clusters, maybe they get combined into 'Budget Hotel'.  Alternatively, perhaps 'Weekend Treat' combines with 'Destination for Honeymooners' into a 'Luxury Short Stay' persona.

From there, the outcome of this clustering and persona generation may influence discount packages or marketing campaigns to appeal to specific travelers.

REMEMBER:  Clustering is not intended to predict Ratings!  If you're still stuck on that, review the purpose of clustering.  
In this quick article, I gave the most cursory of overviews of how Domo's native (and beta) features can enable data science workflows, and we explored how other analytic tools can integrate into the Domo offering.

If you're interested in finding out more, I can connect you with a Domo Sales representative (not me), or I'd love to talk to you about your analytic workflows and use case to see if Domo's platform might be a good match for your organization!

Disclaimer: the views, thoughts, and opinions expressed in this post belong solely to the author and do not necessarily reflect the views of the author's employer, Domo Inc.

12 Questions for Dashboard Design (and 5 Mistakes to Avoid)

14/12/2016

Last week we explored 6 dashboard design tips to improve the aesthetics of existing dashboards.  In this post, we share design tips from our analysts about dashboard and OLAP cube requirements gathering as well as costly mistakes to avoid.

12 Questions to Remember

Tailor your dashboard to the audience AND the process they're trying to optimize.
  • How important is this measure to a user compared to other measures?
  • What other measures would provide comparative context?
  • Is there a sequence of measures a user would follow to drill into a question?
  • How might the answer to the previous questions differ by users at different levels of the organizational hierarchy?
Express the data in a way relevant to the audience.
  • What's the scope of your dashboard? (Company, department, individuals, a supplier)
  • What level of summarization or detail is appropriate?
  • What unit of measure is appropriate for each measure?
  • What complementary information should I include?
  • Again, how might the answers differ as scope or user changes?
Match the tool to the user. 
  • Managers and executives may prefer a dashboard that provides a non-interactive executive or strategic summary; business or data analysts may prefer to construct their own analyses against raw data straight from the data warehouse; and mid-level managers may require a live operational report.
  • Did you choose the right chart?
  • Should this be a PowerBI dashboard, a SQL query, an SSRS report, a Jet Professional Report, an Excel PivotTable?
  • How often does the data need to be refreshed?  Do operational decisions depend on activity or status changes in the last 15 minutes?  Last 3 hours?  Last 24 hours?

5 Mistakes to Avoid

Kicking off a project with an overly complex problem quickly leads to project stall or paralysis.
  • Complexity can arise when a variety of inputs are required to calculate a measure (e.g. 'true profitability' or cost allocation). A bottom-up approach of first identifying simple measures which aggregate into increasingly complex summarizations can encourage quick wins and maintain momentum.
  • If your team insists on a top-down approach, divide and conquer. Decompose complex strategic objectives, goals, or measures into composite pieces.
In your data warehouse, cubes and dashboards, avoid using metrics or abbreviations whose meanings are not immediately obvious.
  • The use of generic names like "Global Dimension 2", abbreviations or entity codes may affect adoption rates and user acceptance because they only resonate with a handful of seasoned analysts and fall flat with a broader audience.
  • In your visualizations, stave off confusion or frustration by opting for verbose titles, captions, and axis labels while avoiding legends.
KISS (Keep it simple).  Avoid clutter, non-succinct graphics, or unintelligible widgets. 
  • 3D effects, sparklines, temperature gauge charts, mapping charts, and pie charts are common visualization elements that add spice and sizzle to a dashboard; however, when poorly executed, these elements take up significant space without communicating a lot of information.
  • Ask yourself if you could express the same information with less ink, color or space.
Failing to match metrics with goals.
  • "How does this dashboard aid a business process?" It's not enough to showcase the activities of a department.  Keep strategic goals or operational activities in mind. Does the dashboard provide enough context for viewers to reach an actionable conclusion?
Don't wait to get started.
  • Many are tempted to wait until development has finished before they begin laying out their dashboards, when in truth, report requirements should drive or at least inform development efforts.

Go Forth and do Great Things

Still looking for inspiration?
  • Here are some of my favorite Jet Reports from the Report Player
  • Can't create the chart you want with a PivotTable?  Use Excel's Cube Functions 
  • Need dimensions or measures added to your cubes?  Our services team has domain expertise on data warehouse and OLAP cube design for Dynamics NAV and GP.
  • Make sure to sign up for our weekly newsletter for special promotions and articles delivered straight to your inbox.  

6 Design Tips for Better Dashboards

6/12/2016

In his article "Choosing the Right Chart for your Data," Brian Petersen (VP of Professional Services at Jet Reports) writes: "Data is the foundation of effective business.  ... Being able to quickly read and analyze your data enables you ... to understand how a particular set or group of facts contributes to your overall success and steer your decisions proactively."
He then goes on to describe several common charts including:
  • Area Charts
  • Bar Charts
  • Scatter Charts and
  • Stacked Column Charts
Read the full article here:

For more seasoned analysts, the challenge is less about finding the right chart and more about laying out a dashboard that effectively communicates both summarized and detailed information. As we delve into optimizing dashboards, we move away from technical or domain expertise and toward questions of User Experience and User Interface.

For these projects, I'll leverage knowledge gleaned from Stephen Few's guide to dashboarding -- "Information Dashboard Design" which was heavily influenced by Edward Tufte's seminal work "The Visual Display of Quantitative Information".

Charts Jump off the Page with these 6 Tips

1) "Brevity is the soul of wit" -- Do not exceed 1 page.
Any CxO will describe the perfect dashboard as an interactive report where they can see all the important information on one page. Translation: Edit. Edit. Then edit again. Examine how much excessive detail or decoration you can pare away without supplying inadequate context.

​Your final product shouldn't require scrolling, changing tabs, or (ideally) a legend.

2) How good is good? -- Provide enrichment and context through comparison.
  • Same measure at a point in the past (Last Year or Last Month)
  • Target (Budget or Forecast)
  • Relationship to future target or prior prediction (% of annual budget or forecasted budget)
  • Benchmark for norm (company average or industry standard)
  • Separate but related measure
3) Consider providing non-quantitative data
If you frame dashboards around improving a process or keeping 'two fingers on the pulse of the company', then in addition to measuring activity, it may make sense to provide non-numeric data.
  • Top & bottom performers
  • Prioritized issues to investigate
  • Upcoming due dates.

4) Emphasize the important things.
By understanding how the eye travels across the page, designers can highlight, prioritize and de-emphasize.  This is particularly important when planning the placement of auxiliary elements including filters, slicers, legends and labels.
5) Maximize the "Data-Ink Ratio"
  • If 'data ink' is any line or pixel that communicates quantitative data (a line, point or bar in a chart) and 'total ink' includes all the lines used to create said chart (axes, tick marks, borders), then the data-ink ratio measures the distribution of ink used for communication versus formatting and decoration.
  • Strive to enhance data ink while reducing and de-emphasizing non-data ink (the ratio is written out after this list).
    1. Can you remove unnecessary tick marks or grid lines?
    2. Do variations in color provide additional meaning, or is the message equally clear with just one or two muted colors?
    3. Do you really need that 3-D effect or color gradients?
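For the record, this ratio can be written out as Tufte defines it: data-ink ratio = data ink / total ink used to print the graphic.  The closer the ratio is to 1, the more of your ink is doing communicative work.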
6) Organize information to support interpretation and application
  • Organize groups according to departments, entities or use
  • Co-locate and subtly delineate related data
  • Support meaningful comparison while minimizing meaningless comparisons

My favorite feature of Few's book was his analysis of sample dashboards, wherein he not only described the flaws in various dashboards but also modeled alternative ways of presenting the data.  For dashboard developers, this analysis proves invaluable for sharpening our critical eye and provides inspiration for what dashboards can (or shouldn't) look like!

Bridge the Gap between Concept and Execution

For those of you using Pivot Tables or PowerBI to access data from a Jet Enterprise cube, it can be difficult to pursue the optimal dashboard layout or chart because you're constrained by the limits of the pivot table or data model.
In a previous post: Better Dashboarding with Cube Functions, Onyx Reporting presents a tutorial for converting Pivot Tables into infinitely more manipulable Cube functions.

One Book to Rule them All

Content and images from this blog post were taken from Stephen Few's Information Dashboard Design (buy it on Amazon).

Check out our blog at www.onyxreporting.com/blog or sign up for our weekly newsletter: "The Data Dump".

If you need support knocking out a batch of reports or want to customize your Jet Reports cubes to include some new comparative measures, our services team is among the best in the business.

Are you Data Driven?

20/4/2016

The age of big data is here and it's not going away. Do you have the analytic tools to remain competitive?  
At first, traditional bricks-and-mortar stores were only competing against each other; then along came digital stores, which put on the squeeze by offering lower prices. But that was back in 2001, and today even digital stores are feeling pressure to adopt increasingly data-driven cultures in order to remain competitive against 'the next Uber' or mega-conglomerate.  With the landscape constantly changing, it's hard to know how your company stacks up.
This week I came across Daniel Egger's 20-Item Checklist for Data-Driven Companies as part of my studies for Coursera & Duke University's excellent Business Metrics for Data-Driven Companies course.
The checklist includes:
  • Use a model that modifies inventory levels at the store level by region, season, and day of the week to optimize days inventory against opportunity cost.
  • Track churn rates and have a program to contact former customers who have "gone quiet" and are potentially lost - and provide incentives for them to return.
  • Visitor "conversion" on websites is tracked at two levels - voluntary registration and first sale.
  • Special programs to distinguish, reward, and retain the highest-level recurring revenue customers - "Whales" as they are called in the gambling industry.
Even within this subset I saw analytics solutions I've implemented for clients of Onyx Reporting. If you're wondering how cubes and data warehouses play into all this, consider the challenge of analyzing your CRM data (customer conversions) alongside AR (Dynamics NAV), trouble tickets (ZenDesk) and point-of-sale data (LS Retail or POS).
If you're struggling to usher your company into the era of data-driven analytics, Onyx Reporting is ready to help.

My Favorite Things

18/4/2016

(Embedded presentation via Docs.com: 'My Favorite Things' -- Jae Wilson.)

100 free Jet Reports via the Report Player

8/1/2016

Between now and my previous post, 'Why Power BI won't replace a Data Warehouse', which built on ideas shared in 'Why people should stop confusing BI with Visualization', I've done even more thinking about why clients have unsuccessful or incomplete Business Intelligence implementations.  The good news is, it's seldom a case of bad software fit or a lack of resources or expertise, so much as uncertainty around what the desired outcome should be.  It doesn't matter if your solution is Targit, Crystal, SAP, ZapBI or Jet Enterprise: if the client doesn't start with a clear vision of what they want, then no matter what you deliver, it seems, they don't want that!
Ultimately, the root challenge is not a question of functionality or capability so much as whether the user can articulate what they want or need.  I just had a conversation with a potential client who lamented, "No one on my team except me really knows what they want."  Honestly, that degree of transparency is better than the dreaded "we just want a simple dashboard," but there's still work to be done!  
It's time to sit down with your users and map out some mock-ups of what you want your final reports to look like.  Get away from emails and lengthy paragraph-long descriptions; instead, use simplified spreadsheets with sample headers, fake data, mock-up graphs, or a list of the KPIs you want to see in your solution.  Even a seemingly obvious question like 'How do you want to consume this data (spreadsheet, pivot table, web page, or mobile device)?' can have far-reaching implications for what front-end software you use to develop your reports!  I've delivered more than a handful of dashboard workshops where my opening question, 'What do you want your dashboard to look like?', was met with dead silence. #FacePalm.
If this still seems too daunting, maybe it's time to do a little window shopping: did you know that Jet Reports has over 100 generic reports available for you to use?  I know, 'these generic reports don't quite meet your company's unique needs.'  But that's not our expectation!  If you can express what does and doesn't work about a generic report, then we can uncover what you want and need, and you'll have ideas about what it could look like on paper!  A good Jet consultant can customize any of the Report Player reports, or even build a more robust one from scratch with all the features you want!
Unfortunately consultants are sometimes slow to promote the Report Player.  "Our end users don't like it because XYZ."  Trust me, I've heard every possible reason, from both clients and consultants, but I'll shorten my counter-argument to the following: the Report Player inspires users with glimpses of what's possible.  At the end of the day, consultants can only deliver what users ask for, and clients can only ask for what they know about or have seen before.  I like to think of the Report Player as a mini IKEA catalogue: the pictures are there to whet your appetite. Don't expect completeness; instead, let the report layouts be a source of inspiration.  I love it when a client shows me a report and says: "I love it, but can it do ..."  Now we can focus the conversation to delivering exactly what you want because you know what you're looking for.

Happy Reporting!
(Embedded via Docs.com: 'The Report Player' -- Jae Wilson.)

Power BI will not replace your Data Warehouse

29/10/2015

Self Service BI is here! With Power BI, you no longer need a traditional data warehouse.  Blue is the new Black and White is the new Gold!

In my last blog post, we discussed how data visualization is just the last phase of the business intelligence process.  I sought to emphasize the value of a data warehouse used in conjunction with the sexy visualization tools.  Power BI is upsetting the market because some fear (and others purport) that it can completely supplant a data warehouse.  Let's be clear: no tool is going to replace the need for a conformed, sanitized, and policed data warehouse (a data warehouse takes the current and historical data from your ERP and restructures it for reporting and analytic purposes in a database separate from your transactional systems).

Now, for clarity of language, let’s agree to condense the definition of Business Intelligence, to “a data analysis process aimed at boosting business performance by helping stakeholders make more informed decisions” (thank you Bing search engine).  Having agreed on BI’s definition, let’s now expand our paradigm of BI tools to encompass models suitable for an individual, a small team, or an entire Enterprise.  

Hopefully everyone is still onboard with our definition of BI and the purpose of a data warehouse; here, however, is where some part ways.  Because emerging tools like Power BI can create cube structures fed directly by an ERP, controversy arises as to whether we still need a data warehouse.  While I fully endorse and look forward to more companies leveraging Power BI as a presentation tool, I do not believe that an exclusively Power BI solution will replace the data warehouse for Corporate and Team BI.  Let's back up and discuss.

Why you need a Data Warehouse. ​

 In traditional BI, whenever possible, we separate the transactional system (your ERP—whether NAV, GP, AX, Quickbooks, Sage) from the reporting environment.  You may have heard names like Inmon or Kimball, tabular versus multidimensional, and star versus snowflake schema bandied about when discussing data warehousing, cubes and your reporting environment.  Where your ERP may have data entry problems, incorrect status changes, or even truncate historical data (when you migrate or upgrade your ERP systems), the purpose of a data warehouse is to create a policed structure that measures and holds all historically relevant company activity and then gives the ability to present said measurements across a series of structured dimensions (like Customer, Vendor, or Date).  

Some cost-conscious business owners, having invested in an ERP, may balk at investing in a separate reporting solution.  "Shouldn't reporting just come with NAV?"  Yes, and no.  Actually ... mostly no.  Just as there's an industry around recording your company's transactions, there's an equally robust industry around extracting and presenting that data.  Some clients want a standalone front-end application like Targit; others need access via mobile devices and web pages; while some analysts just want their data in Excel.  Furthermore, in order to create a consolidated view of a customer or your service offerings, it's valuable to integrate data from CRM, Helpdesk software and custom CSVs generated by online sources or machines off the manufacturing line.

Admittedly, for ease of reportability, I’ve seen companies integrate these external datasources directly into their ERP, but a hybrid ERP/data warehouse solution often yields subpar results.  “The system only works if customer number in the ERP matches the account number in CRM”, “there’s no system for handling what to do if the customer information changes in CRM but doesn’t update NAV” or vice versa.  We haven’t even begun discussing how to match Tech Support tickets with an invoice line or SLA in your ERP, and you can forget about cost allocation!   

Please, stop asking your ERP to be a Unicorn and a Dragon at the same time. The ERP should capture transactions; your Data Warehouse should be your reporting solution!  And no new-fangled tool can unify those inherently disparate needs.

It is worth noting that not every reporting request requires a data warehouse.  At the Personal (or trendily named Self-Service) level, an employee on the purchasing team may need a tool that can project demand based on scheduled production and then calculate shortages based on Inventory On Hand and Inventory on Production Orders.  Alternatively, you may have already seen a Power BI demo based on basic sales analytics.  None of this simple analysis requires a data warehouse, but as data volume or sources increase, the need for an alternative to direct queries to the ERP emerges.   

Let your Analysts be Analysts

Let's pause for a moment.  Again, I like the idea of Power BI for presentation.  And if I had to guess, you were sold on Power BI by the pretty dashboarding tools too!  But that's not what we're evaluating here.  The question we're tackling is whether you need a data warehouse.

When an analyst expands their reporting requirements horizontally to span multiple departments, or vertically to roll up by dimension, we move past a 'do-it-yourself' self-service option toward needing a data warehouse where all this data has been collected and sanitized.  Consider the challenge of evaluating whether there could be cost savings in having larger POs and holding inventory versus adopting a Just-In-Time model.  Alternatively, what if we wanted a dashboard that gave us the health of our warehouse measured across 5 or 6 metrics compared against last month and last year?  Because the data may not be measured at the same granularity, we then have to agree on consistent allocation rules.  This project might be a bit too rigorous to assemble with Excel spreadsheets, and it's here where we see the benefit of a data warehouse and accompanying cube solution.

Whatever you do, don’t let vendors sell you on the idea that your analysts can build a data model themselves (unless they ARE BI developers / analysts).  Remember, you were sold on the sexy visualizations NOT the thrill of learning to build data models.  Yes, an analyst can learn how to create a pivot table or pivot chart, but that’s a far cry from asking you to dabble in MDX, DAX, SQL, SSIS, and SSAS or asking you to get around SSMS and SSDT to build star schemas optimized with indexes, appropriate data types, surrogate keys and all the other arcane concepts supporting cube-based reporting.  Let your analyst do what they do best: analyze data!  Don’t ask them to become data architects or database developers too.   

In conclusion, while it's true that modern tools are making it easier for end users to create their own small-scale data models, there's still a need for a conformed, enterprise-wide model that has been rigorously tested and reviewed for soundness of business logic.  And while we appreciate flexibility when analyzing sales trends or conducting exploratory analytics, at the Team and Corporate level the flexibility of a self-service option is borderline inappropriate.  There cannot be any creativity with reported measures like EBITDA, Profit Margin and Overhead Costs!

Does Power BI mean I can't ...

If we now agree that you need a data warehouse, the next question to tackle is “do you want Power BI?  Because if you do, you can’t …” or “you must …” or “it differs from traditional BI because …” Again, stop and step back from the ledge.  Let’s not confuse building a data warehouse with finding tools for presenting the data. Agreeing that you need a data warehouse has NOTHING TO DO with whether you use Power BI for visualization or as a self-service option. 

"But I heard I can only use a Tabular model with Power BI."  Today, that's true: Power BI's support for traditional multidimensional cubes (as opposed to the newer Tabular model) is limited.  Rumors are flying that by the end of 2016, Power BI will have better support for multidimensional cubes, which would render this concern moot.  There are existing articles comparing the merits of Tabular cubes versus Multidimensional cubes; each claims to be easier or more sophisticated than the other, but that's beyond the scope of this article.  Once you get past the Data Warehouse, I'd argue the larger concern is what kind of tools you want to use to visualize the data and what resources you have access to that can implement the data model you want.

If you want to empower your employees with more sophisticated reporting and analytics options, whatever you do, don’t let someone tell you, you don’t need a Data Warehouse. 

Many of my clients have enjoyed success working with the Jet Enterprise data warehouse automation tool, http://jetreports.com/products/jet-enterprise.php, to build their Data Warehouse.  I'd be thrilled to work with you to develop a data warehouse either from scratch or customized from the library of prebuilt projects.  Once that's done, THEN we can talk about sexy visualization tools.
​

Image from Microsoft Whitepaper: "Introducing the BI Semantic Model in Microsoft® SQL Server® 2012"

Stop buying Visualization Software

20/10/2015

​At the risk of being the unpopular consultant, I'm going to make a recommendation that no one wants to hear.  Please.  Stop spending money on visualization software.

If you Google "Top 10 BI Software" or go to a trade show, you'll see all sorts of visualization tools that promise to sing, dance, and make your dreams come true.  It'll make dashboards, KPI's, and pretty charts--maybe even warn you if you drop below Safety Stock.  Wait, put the credit card down, and please stop buying visualization software.

While it is fun believing that a shiny new software can solve your data problems, you probably already own a visualization tool that will meet 90% of your reporting and analytics needs!  Don't believe me?  I'll bet if you asked an analyst to give you numbers about <insert your business metric here>, after a few hours, they'd come back to you with an Excel Spreadsheet or Pivot Chart or two.  Turn around and ask the Finance Department for an Income Statement report broken out by Department and Region with measures for Current Year, Previous Year and %Growth.  Again, I bet you get another spreadsheet.  And why not?  No matter how unsexy it seems, Excel is the de facto tool for business analysis, and it will probably meet 90% of your reporting needs.
​
The real question, though, is how much sweating the analyst had to go through to get you those spreadsheets.  If you look at the formulas and see a slew of SUMIF()s and VLOOKUP()s, how nervous are you about presenting to the bank, auditors or in meeting rooms?  The only way it could be worse is if you saw a bunch of static numbers!  Truthfully, this is where BI comes into play.  Although our eyes may start glazing over when you start talking about SSIS, ETL, star schemas, and multidimensional models, the fact of the matter is, any visualization tool will perform better when pointed at a well-designed Data Warehouse or Cube.  In fact, if you ask the vendor giving you your visualization software demo, I'll bet they'll tell you the data is coming out of a sanitized data warehouse, cube or data model that emulates a star schema.  If they're using none of these tools, they're assuming that your ERP contains clean data.  <Insert skeptical snerk here>

So what exactly is Data Warehousing all about? 
Well, the Data Warehouse is actually the end of a tumultuous story of data access and extraction from your various source databases (Point of Sale systems, Google Analytics, Bloomberg CSVs, custom spreadsheets, or web-based Census Bureau data).  Once extracted, data undergoes rigorous application of business logic and queries to conform, validate, and clean all the disparate data; then a transformation process forges the cleaned data into a Data Model that consists of navigable dimensions and fact measurements.  Voila.  Your Data Warehouse.

Sounds dry and possibly unsexy?  Consider the alternative.  Earlier this year I worked with a client who wanted a ranked dashboard that would show the amount and volume of deals that salespersons were bringing through the door.  The first draft of the report took less than an hour to build, but it was immediately sent back for revision.  "That lead conversion number is almost triple our average," I was told.  Further investigation revealed that Status could waffle between Lead and Conversion multiple times based on a combination of factors including: data entry problems, whether the data was collected via web forms or hand written forms, or if specific intermediary statuses were made or missed.  My report needed Olympic-quality gymnastics to extract a reasonable measurement at the cost of report performance.

Suffice it to say, what should have been an easy report ended up taking more time than expected because there wasn't one source of clean data, and I'm loath to think what would happen if the client wanted the report rolled up by region or presented in a different layout.  Every client has a story of having paid a consultant an ungodly amount of money for a report that they're only marginally comfortable with and that cannot be modified to present the data a slightly different way.  So what's the solution?  A new visualization tool?  No!  Invest in a clean data warehouse that handles all the data gymnastics for you!  Don't get lured in by a vendor's siren song of sexy graphics, KPIs and dashboards if you don't have a solid data warehouse already.  Instead, I would recommend looking around for a Data Warehouse automation tool.

Many of my clients have enjoyed success working with the Jet Enterprise data warehouse automation tool, http://jetreports.com/products/jet-enterprise.php.  And I'd be thrilled to work with you to develop a data warehouse either from scratch or customized from the library of pre-built projects.  Once that's done, THEN we can talk about sexy visualization tools.