Why and how should I modernize my data platform?
- Posted on February 19, 2020
Organizations collect data with every operation and customer interaction. According to a 2018 Forbes study, 2.5 quintillion bytes of data are generated every day. The growing availability and volume of data, combined with the emergence of more efficient analytics tools, have created opportunities for companies to develop data platforms that provide mission-critical insights, fuel new revenue streams, and transform customer and employee experiences beyond cost reduction initiatives. The prevalence of these platforms has made it increasingly important for organizations to find ways to make effective use of their data.
For organizations to keep up, they must develop and evolve data platforms that enable fast, reliable insights. Cloud computing and modern analytics tools let enterprises bring data together from across business units and geographical boundaries and apply AI to learn more about their business, customers, and industry. Effective data management is foundational for success, and a robust data supply chain will become a requirement for companies looking to thrive in today’s business landscape.
What does a modern data platform look like?
Traditional data platforms lack the speed, flexibility, and scalability to deliver fast insights. They are constrained by physical infrastructure limitations, siloed operations, and the inability to evolve. Traditional IT departments struggle to keep pace with the daily enhancements to cloud capabilities.
Modern data platforms are characterized by fast, fault-tolerant infrastructure, a high degree of collaboration, and the ability to process a high volume and variety of data. They allow for self-service business intelligence and predictive analytics. A modern data platform harnesses the power of cloud computing to optimize data availability, scalability, and usability, and takes advantage of modern data processing tools to help answer the most pressing questions for your business. These technical shifts alone do not make a modern data platform produce instant results. However, combined with new questions and insights, these capabilities provide the fuel for the new revenue streams and customer impact that organizations are looking for today.
Modern data platform in practice
An effective data solution covers the lifecycle from data source to actionable insight. This includes identifying data sources, ingesting, cleansing, and storing data, training a model, and serving insights to end users.
The following points describe the logical architecture of a modern data platform. The sub-bullet points contain example tools from the Microsoft Azure suite.
- Sources: Data that has been collected by the organization
- Ingestion: Process of bringing data into the platform
  - Azure Data Factory: data integration service to move data from its source to its storage location
- Storage: Data storage for efficient retrieval
  - Azure Blob Storage: allows for object storage in the cloud
  - Azure Data Lake: enables developers to store data of any size, shape, and speed, and perform analytics across platforms and languages
- Processing: Engine for data engineering and machine learning
  - Azure Databricks: fast, easy, and collaborative Apache Spark-based big data analytics service designed for data science and data engineering
- Model and serve: Analysis and discovery
  - Azure Synapse Analytics: data warehouse that uses massively parallel processing to quickly run complex queries across petabytes of data
  - Cosmos DB: fully managed database service with turnkey global distribution and transparent multi-master replication
  - Power BI: business analytics service that provides interactive visualizations and business intelligence capabilities with a user-friendly interface
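To make the stages above concrete, here is a minimal sketch of the same lifecycle in plain Python, using only the standard library. It is an illustration under stated assumptions, not an Azure implementation: the sample CSV, the field names (`region`, `ad_spend`, `revenue`), and the function names are all hypothetical, an in-memory dictionary stands in for cloud storage, and a simple least-squares line stands in for a real model.

```python
import csv
import io
from statistics import mean

# Sources: a hypothetical export from an operational system.
RAW_CSV = """region,ad_spend,revenue
north,10,25
south,20,45
east,,30
west,30,65
"""


def ingest(raw: str) -> list[dict]:
    """Ingestion: bring source data into the platform as records."""
    return list(csv.DictReader(io.StringIO(raw)))


def cleanse(rows: list[dict]) -> list[dict]:
    """Cleansing: drop incomplete rows and cast numeric fields."""
    clean = []
    for row in rows:
        if row["ad_spend"] and row["revenue"]:
            clean.append({
                "region": row["region"],
                "ad_spend": float(row["ad_spend"]),
                "revenue": float(row["revenue"]),
            })
    return clean


def store(rows: list[dict], storage: dict) -> None:
    """Storage: keep cleansed records keyed by region for retrieval."""
    for row in rows:
        storage[row["region"]] = row


def train(storage: dict) -> tuple[float, float]:
    """Processing: fit revenue = slope * ad_spend + intercept (least squares)."""
    xs = [r["ad_spend"] for r in storage.values()]
    ys = [r["revenue"] for r in storage.values()]
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def serve(model: tuple[float, float], ad_spend: float) -> float:
    """Model and serve: answer a business question with the fitted model."""
    slope, intercept = model
    return slope * ad_spend + intercept


if __name__ == "__main__":
    lake: dict = {}  # stand-in for a storage layer
    store(cleanse(ingest(RAW_CSV)), lake)
    model = train(lake)
    print(f"{serve(model, 40.0):.1f}")  # prints 85.0
```

Each function maps to one stage, so in a real platform any of them could be swapped for a managed service without changing the overall flow. The incomplete `east` row is dropped during cleansing, which is why the model is fitted on three records.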
The value of a modern data platform lies not in any single outcome, but in the variety of business use cases it supports, the value points it creates, and the shift toward a data-driven, future-ready organization. A variety of tools and architectures can support the specific data needs of each organization (e.g., advanced analytics, real-time analytics, or a modern data warehouse).
Avanade’s take on the approach
Modernizing a data ecosystem isn’t always easy; it involves crafting a comprehensive data strategy, introducing fresh processes, and implementing new tools. Cloud data tools are designed so that organizations can start small and then scale up as they become ready to adopt the upgraded processes. For clients undertaking initiatives to modernize their data platform or introduce new functionality, it is often best to take a value- and design-led approach. This allows organizations to start small, prove value, and iteratively improve on the initial features of the upgrade.
It is important to assess the contribution of each new feature to the overarching data strategy and gauge the ability of the newest features to evolve with technological advances and scale with more data inputs.
A strong data strategy and roadmap act as a foundation for all your data modernization initiatives. The initial steps on a data modernization journey include identifying the insights, operational enhancements, and broader business initiatives where new insights could provide the most value to your business, and crafting a data strategy that will allow you to meet your goals in those areas. That means defining a clear vision of where you want to go and generating a strategy that will allow you to get there.
Are you ready to upgrade your data ecosystem? Avanade can help you get started.
Learn about Avanade’s analytics and AI practice and how our clients are getting the most out of their data.
Terje Vatle @terjevatle
Interesting article! Modern organizations may also consider ingesting structured and governed data directly into the Synapse SQL architecture of Azure Synapse Analytics without going through Azure Data Lake Storage Gen2. The SQL architecture can enable metadata preservation and provide an architecture fit for analytics use cases such as those typically covered by a data warehouse. For exploratory or more advanced analytics purposes such as AI and ML, an organization may use ADLS Gen2 with the Spark environment in Synapse. A great thing about Synapse is the integration of Power BI and now Azure Purview. However, for environments where stricter governance is needed, developers should be aware of data lineage limitations in environments such as Databricks, where metadata APIs are at best limited. Azure Purview helps to some extent in this area, but it does not cover the whole data platform, for example because it does not currently support views and stored procedures. Just my 5 cents, keep up the great work!