TECHNOLOGIES USED
- Node.js
- BullMQ
- Postgres
- Snowflake
- Kubernetes
OUR CLIENT
Our client was a B2B business that had three major products that assisted with and streamlined technical troubleshooting, as well as quality assurance of equipment installations. Each of the products served a different part of the service lifecycle of their customers' end users.
The client had no centralized reporting capability. All reports were built ad hoc, relying either on inefficient Elasticsearch-based solutions or on manual tools like Microsoft Excel, neither of which was fit for purpose. This created friction during development and left leadership with limited data for making strategic business decisions or quantifying the value delivered to potential customers during product trial periods.
OUR IMPACT
Immediate Value:
- Delivered a working reporting system within a tight 6-week timeframe, providing immediate ROI.
- Established Snowflake as the single source of truth for all product data.
- Replaced slow, error-prone Elasticsearch implementation with a faster, more reliable Snowflake-based solution.
- Empowered non-developers to explore data and create visualizations with Snowflake Worksheets.
- Replaced manual Excel reporting with Power BI, reducing time and effort for data scientists.
Future-Ready:
- Set up scalable cloud infrastructure (Kubernetes on AWS) for current and future microservice deployments.
- Implemented CI/CD pipeline and preview environments for reliable, efficient updates.
- Provided a roadmap for transitioning to an event-driven microservice architecture for better scalability and comprehensive data insights.
THE CHALLENGE
Our client wanted to improve three areas, each with its own challenges: in-product reports for customers, value realization reports for decision makers at customer companies, and, cutting across both, the ability to report across products.
In-product reports: These were meant for customers to track end-user engagement with the client’s products. However, they were built on Elasticsearch, a tool designed for text search rather than reporting, which made development sluggish and led to frequent errors in the reports. The technical complexity of the system also meant that only developers could create or modify reports. Each report was designed on a per-product basis, hindering scalability.
Value realization reports: These were custom-built reports the client used to demonstrate product value to prospective clients on trials. A data scientist manually created these reports in Microsoft Excel, a process that often took weeks and involved sourcing missing data directly from clients. This approach was not only time consuming but also became unsustainable as the company grew.
Cross-product reporting: The company's rapid expansion highlighted another critical issue: the need for cross-product reporting to understand customer journeys across different products. Existing reporting methods simply couldn't address this requirement.
Ultimately, the way the company handled reporting needed significant improvement. The company wanted faster report creation times, greater data reliability, and more comprehensive insights across all products to effectively support its growth and client needs.
THE SOLUTION
Understanding the client’s challenges and their end goal of a more agile, data-driven system, we drafted plans for a comprehensive architecture built on Kafka, a platform for handling real-time data streams, and Snowflake, a cloud-based data warehouse. Combining Kafka and Snowflake would allow the client to shift from their existing siloed systems to a connected, event-driven microservice architecture. This shift would turn previously siloed products and features into independent building blocks, or “services”, each designed to handle a specific task and able to be combined and used by any product in the ecosystem. With key features running as independent services shared by different products, we could report more effectively across products by reporting on those services in one place instead of in each product.
To meet the client's immediate challenges, we devised a multi-phase plan. In the first phase, we chose a product that didn’t already have reporting as our pilot project. To save the time and effort of setting up Kafka, we used their existing queue management service based on BullMQ to build a tool to process data from each product and write the processed data to the client's new Snowflake data warehouse.
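As a rough illustration of this phase, the sketch below shows the kind of processor function a queue worker might run: it takes a job payload from a product and flattens it into a row for the warehouse. The `ProductEvent` and `WarehouseRow` shapes, their field names, and the `toWarehouseRow` and `processJob` functions are hypothetical and are not the client's actual schema or code. With BullMQ, a function like this would typically be wrapped in an async processor and handed to a worker; that wiring and the Snowflake write are left out so the sketch stays self-contained.

```typescript
// Hypothetical shape of an event emitted by one of the products.
interface ProductEvent {
  product: string;
  eventType: string;
  userId: string;
  occurredAt: string; // ISO 8601 timestamp
  payload: Record<string, unknown>;
}

// Hypothetical flat row destined for a warehouse table.
interface WarehouseRow {
  product: string;
  event_type: string;
  user_id: string;
  occurred_at: string;
  payload_json: string;
}

// Minimal stand-in for the part of a queue job this sketch reads.
interface JobLike<T> {
  data: T;
}

// Validate and flatten one product event into a warehouse row.
function toWarehouseRow(event: ProductEvent): WarehouseRow {
  const occurredAt = new Date(event.occurredAt);
  if (isNaN(occurredAt.getTime())) {
    throw new Error(`Invalid timestamp: ${event.occurredAt}`);
  }
  return {
    product: event.product,
    event_type: event.eventType,
    user_id: event.userId,
    occurred_at: occurredAt.toISOString(), // normalized to UTC
    payload_json: JSON.stringify(event.payload),
  };
}

// Process one job: transform its payload and pass the row to a writer.
// The writer is injected so the warehouse client stays an assumption.
function processJob(
  job: JobLike<ProductEvent>,
  write: (row: WarehouseRow) => void
): void {
  write(toWarehouseRow(job.data));
}
```

Keeping the transformation separate from the queue and warehouse clients means the same function could be reused unchanged if the queue were later swapped for Kafka.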
To address the “in-product” reports challenge, we built a microservice fed by an hourly Snowflake sync process that transformed raw data into processed insights. This allowed in-product reporting to migrate away from the previous Elasticsearch solution to a newer, faster, more reliable version powered by Snowflake and our reporting microservice. Moving to Snowflake also meant that non-developers could explore the data themselves using Snowflake Worksheets, both to gain insights and to help develop new charts for in-product use.
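To make "transforming raw data into processed insights" concrete, here is a minimal sketch of one possible transformation: rolling raw events up into per-hour counts. The `RawEvent` and `HourlyCount` shapes and the `rollUpHourly` function are hypothetical, and the case study does not say whether the real transformation ran in application code or as SQL inside Snowflake; this version uses plain TypeScript only so that it can run on its own.

```typescript
// Hypothetical raw event as it might be read during a sync.
interface RawEvent {
  eventType: string;
  occurredAt: string; // ISO 8601 timestamp
}

// Hypothetical processed row: one count per event type per hour.
interface HourlyCount {
  hourStart: string; // ISO timestamp truncated to the hour (UTC)
  eventType: string;
  count: number;
}

// Truncate a timestamp to the start of its UTC hour.
function truncateToHour(iso: string): string {
  const d = new Date(iso);
  if (isNaN(d.getTime())) {
    throw new Error(`Invalid timestamp: ${iso}`);
  }
  d.setUTCMinutes(0, 0, 0);
  return d.toISOString();
}

// Order rows by hour, then by event type, for stable output.
function compareRows(a: HourlyCount, b: HourlyCount): number {
  if (a.hourStart !== b.hourStart) return a.hourStart < b.hourStart ? -1 : 1;
  if (a.eventType !== b.eventType) return a.eventType < b.eventType ? -1 : 1;
  return 0;
}

// Group events by (hour, event type) and count them.
function rollUpHourly(events: RawEvent[]): HourlyCount[] {
  const byKey: Record<string, HourlyCount> = {};
  const rows: HourlyCount[] = [];
  for (const event of events) {
    const hourStart = truncateToHour(event.occurredAt);
    const key = `${hourStart}|${event.eventType}`;
    const existing = byKey[key];
    if (existing !== undefined) {
      existing.count += 1;
    } else {
      const row: HourlyCount = { hourStart, eventType: event.eventType, count: 1 };
      byKey[key] = row;
      rows.push(row);
    }
  }
  return rows.sort(compareRows);
}
```

A pre-aggregated table like this is what lets a reporting service answer chart queries quickly, since it reads a few summary rows rather than scanning every raw event.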
To tackle the “value realization” reports challenge, we connected Power BI, a tool for creating visualizations from external data sources, to Snowflake. This empowered the client’s data scientist to easily create comprehensive reports and dashboards for internal use and for client presentations. With these new tools, Excel was no longer required, and report generation became significantly faster and easier.
We were able to deliver this first phase of the project in 6 weeks, solving their most immediate challenges. The project not only delivered immediate value through an initial reporting system but also set the stage for the client to achieve long-term scalability and unlock their data's full potential, fueling future growth and innovation.
THE FUTURE
In addition to building a working reporting system, we set the client up for future success by creating both a comprehensive toolkit and detailed roadmap for further development.
We set up a managed Kubernetes (EKS) cluster on AWS, providing a scalable and reliable cloud-based infrastructure to host the new reporting microservice and those to come. We also implemented automated pipelines for testing and deploying new features, which ensures the reporting system's stability and performance as the client’s development team takes over and continues to iterate. Additionally, we set up preview environments, enabling developers to test code changes in isolated environments before deploying them to production, further enhancing stability and minimizing disruption.
To streamline future development, we built a reusable library for sending data to their Snowflake data warehouse, making it easy to integrate reporting into any new product. We also created reusable data visualization components, simplifying the development of insightful in-product reports. These efforts empower the client's development team to seamlessly expand and refine the reporting system.
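One building block such a library might contain is a helper that turns a batch of rows into a single parameterized insert. The sketch below is an assumption about what that could look like, not the library we delivered: `buildBulkInsert`, the `Cell` type, and the table and column names are all hypothetical. It returns SQL text plus an array of bind rows, a form that several database drivers accept for bulk inserts; the case study does not state which driver was used.

```typescript
// Values this sketch allows in a row.
type Cell = string | number | boolean | null;

// SQL text plus one array of bind values per row.
interface BulkInsert {
  sqlText: string;
  binds: Cell[][];
}

// Only allow simple identifiers, since table and column names
// cannot be passed as bind parameters.
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function buildBulkInsert(table: string, rows: Record<string, Cell>[]): BulkInsert {
  const first = rows[0];
  if (first === undefined) {
    throw new Error("No rows to insert");
  }
  // Column order is taken from the first row and applied to every row.
  const columns = Object.keys(first);
  for (const name of [table].concat(columns)) {
    if (!IDENTIFIER.test(name)) {
      throw new Error(`Unsafe identifier: ${name}`);
    }
  }
  const placeholders = columns.map(() => "?").join(", ");
  const binds = rows.map((row) =>
    columns.map((column): Cell => {
      const value = row[column];
      return value === undefined ? null : value; // missing fields become NULL
    })
  );
  return {
    sqlText: `INSERT INTO ${table} (${columns.join(", ")}) VALUES (${placeholders})`,
    binds,
  };
}
```

Because values travel as bind parameters and identifiers are validated, product teams could call a helper like this without hand-writing SQL for each new event type.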
Looking ahead, we envision the client transitioning to a more scalable and efficient microservice architecture using Kafka. We provided detailed documentation and a roadmap outlining how they can gradually break down their existing monolithic systems into smaller, interconnected microservices. This approach allows the services to communicate independently and asynchronously, reducing each service's task to reading incoming data from Kafka and outputting the results of its jobs. This data flows into Snowflake, creating a comprehensive picture of everything that has ever happened in the system across all products, providing deeper insights and enabling the client to build more sophisticated, data-driven features and reports.