“We’re a data-driven company.”
I can already hear your eyes rolling through the screen. Let’s talk about what this phrase means today, what we as scientists and analysts really want it to mean, and how to get there. Please read this before responding to your next ad hoc request.
It’s been sexy over the last couple of years to be a self-proclaimed “data-driven” organization. Organizations usually mean: we make our decisions using data. Don’t get me wrong, it’s a great idea to support decision-making with data. But there is something insidious about this statement: it does not recognize the hard work of researchers / analysts. The focus is on data, not analysis. In other words:
We have so much information that we just have to look at it and use it to make decisions. Analysts and researchers just have to get the data for us.
Data-driven organizations: analysts fetch data.
Analysis-driven organizations: analysts find answers.
Data-driven organizations: hire more and more analyst / DS “hands”.
Analysis-driven organizations: invest in infrastructure, tools, and training.
Data-driven: “Can you pull these numbers …?”
Analysis-driven: “Can you help me think…?”
Data-driven: searches for data that justifies management decisions.
Analysis-driven: creates insights and communicates them through engaging storytelling.
I would say it’s time for organizations to move from data-driven to analysis-driven.
Unfortunately, there is no silver bullet here. Becoming successfully “analysis-driven” depends on how your organization views its analytics team. Does it see partners or subordinates? SQL monkeys or scientists?
At the end of the day, stakeholders need to get used to bringing analysts company-level problems rather than requests for data.
[For a great read on this subject, check out Pedram Navid’s article, Building the Modern Data Stack.]
Short of having long conversations with cross-functional leadership, I have found that a few procedural changes can help with this shift:
- 📝 Document your ad hoc work.
- 🍽 Document your data.
- 📚 Set a central location for finding documents.
While these don’t directly change your corporate culture, they help reduce the burden of ad hoc requests by making your previous work searchable and, dare I say it, self-service.
📝 Document [and own] your ad hoc work
As a data scientist or analyst, you are the person who knows the data best, so naturally data-related requests come to you. My default response was always to fire off the query / dashboard to stakeholders ASAP and then promptly forget about it. But then I had to repeat the work the next time I was asked. Moreover, such an ad hoc response only reinforces the notion that your relationship with decision makers is purely transactional.
Write the document instead. Start with a question:
“What question are you trying to answer?”
Then do your work rigorously and reproducibly. Working with data is, after all, a science and should be treated as one.
This has two advantages:
- It reduces the load. You will have this for later reference when you are asked the same question again (and you will be).
- It brings more weight behind your work. You are forced to take the problem more seriously. This will produce better analysis, strengthen your value as a thought partner, and improve the quality of decision making. Analysis-driven, here we come. 🚀
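One way to make “write the document” concrete is a small template that captures the question, the exact query, and the findings in one place. This is only a sketch: the `AnalysisDoc` class, its fields, and the example question and query are all hypothetical, not a prescribed format.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AnalysisDoc:
    """Hypothetical record of one ad hoc analysis."""

    question: str       # the question the stakeholder is trying to answer
    requested_by: str
    query: str          # the exact SQL used, so the work can be re-run later
    findings: list[str] = field(default_factory=list)
    written_on: date = field(default_factory=date.today)

    def to_markdown(self) -> str:
        """Render the record as a short write-up that can be searched later."""
        lines = [
            f"# {self.question}",
            f"Requested by: {self.requested_by}",
            f"Date: {self.written_on.isoformat()}",
            "",
            "## Query",
            self.query.strip(),
            "",
            "## Findings",
        ]
        lines += [f"- {finding}" for finding in self.findings]
        return "\n".join(lines)


# Example values are placeholders, not real results.
doc = AnalysisDoc(
    question="Did the spring pricing change affect weekly signups?",
    requested_by="Head of Growth",
    query="SELECT week, COUNT(*) AS signups FROM signups GROUP BY week",
    findings=["Placeholder finding: fill in once the query has been run."],
    written_on=date(2024, 1, 15),
)
print(doc.to_markdown())
```

Putting the question in the title is deliberate: the next time the same request comes in, the question is what you will search for.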
🚗 Document your data [tables + transformations]
Documenting your work is a great first step, but your analysis is only as good as the data you use to do it. If you want to trust your analysis (and if you want others to trust it as well), you need to thoroughly understand and document the tables and transformations behind your final analysis. There are two parts to this:
- Document your data.
At a minimum, keeping a shared reference document somewhere your team can access it is a good start. If you are looking for a lightweight hosted solution, Prequel provides a query-first note-taking tool with a built-in data catalog, so you can keep both your SQL work and your table documentation in one place. Data catalog companies (like Alation, Collibra, or data.world) or open source “data discovery” tools (Datahub, Amundsen, metacat, metamapper, Magda) could also fit the bill here, but they are usually a bit cumbersome to set up and maintain.
- For your transformations, use a tool such as dbt or dataform to document and version-control them.
Transformations deserve to be visible and version-controlled, not tucked away in views (although that may hold up for some time). dbt is a great tool for building, documenting, and versioning such work.
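Even without a catalog tool, table documentation can be kept honest with a small check. The sketch below assumes a hand-written dictionary of table and column descriptions (the `DATA_DICTIONARY` structure, the `signups` table, and the `undocumented_columns` helper are all hypothetical) and uses an in-memory SQLite database as a stand-in for a real warehouse.

```python
import sqlite3

# Hypothetical data dictionary: table name -> description and per-column notes.
DATA_DICTIONARY = {
    "signups": {
        "description": "One row per account created.",
        "columns": {
            "user_id": "Identifier of the new account.",
            "signup_week": "Week in which the account was created.",
        },
    },
}


def undocumented_columns(conn: sqlite3.Connection, dictionary: dict) -> list[str]:
    """Return 'table.column' names present in the database but not in the dictionary."""
    missing = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        documented = dictionary.get(table, {}).get("columns", {})
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        for column in conn.execute(f"PRAGMA table_info({table})"):
            if column[1] not in documented:
                missing.append(f"{table}.{column[1]}")
    return missing


# Stand-in database with one column that nobody has documented yet.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (user_id INTEGER, signup_week TEXT, plan TEXT)")
print(undocumented_columns(conn, DATA_DICTIONARY))  # ['signups.plan']
```

Tools like dbt cover the same ground more thoroughly; the point here is only that documentation drift is something you can detect rather than hope to notice.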
📚 Set a central location for finding your documents
Now that I have convinced you to document things, there is a glaring question:
Where do we put these documents?
You need to bring all these documents together somewhere. I have worked with teams / companies that have used Confluence, Notion, or even GitHub to manage this, which can work in a pinch. At Airbnb we used the Knowledge Repo together with Dataportal to save these as git-tracked posts, but this always felt a bit heavy for ad hoc work.
Prequel works really well for organizing work this way. It combines a Notion / Confluence-style note-taking environment with executable query blocks, so you can start your SQL work there, easily share it with the rest of your organization, and later use that work as a starting point for a full write-up.
But at the end of the day, use the right tools for you. The important thing is that your work must be visible, accessible, and searchable among the rest of the team and the company.
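Whatever tool you pick, “searchable” is the property to test for. As a rough illustration of what it means, here is a tiny keyword index over document titles and bodies. The function names and example documents are hypothetical, and any real wiki or notebook tool will do this far better.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the titles of the documents containing it."""
    index = defaultdict(set)
    for title, body in documents.items():
        for word in re.findall(r"[a-z0-9_]+", f"{title} {body}".lower()):
            index[word].add(title)
    return index


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the titles that contain every word in the query, sorted by title."""
    words = re.findall(r"[a-z0-9_]+", query.lower())
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical write-ups collected in one central place.
documents = {
    "Pricing change and signups": "Weekly signups before and after the pricing change.",
    "Signups table": "Documentation for the signups table and its columns.",
}
index = build_index(documents)
print(search(index, "pricing signups"))  # ['Pricing change and signups']
```

If a stakeholder can run the equivalent of `search` themselves, a repeated question turns into a link instead of a new request.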
Analytics organizations do not exist to write SQL. We are here to solve problems. While it is a delicate act to get stakeholders to see it this way, we can at least document our data work to make the problem-solving side of our job more visible. Show the world why analysis-driven organizations can be so powerful! 🙌