ML Scaling Requires Upgraded Data Management Plan


Successful data strategies rest on careful data management and yield enterprise architectures that “democratize” data access and use while delivering measurable results from machine learning platforms.

The reality, according to a study of the emerging “AI organization,” is that few data-driven organizations are able to deliver on their data strategy. A survey commissioned by Databricks and conducted by MIT Technology Review Insights found that only 13 percent of respondents are seeing real, measurable business results.

MIT Technology Review Insights said it surveyed 351 chief data officers, chief analytics officers, CIOs, CTOs and senior technology managers. Several other high-level technology leaders were also interviewed.

The move to cloud-based platforms, including databases and analytics tools with machine learning capabilities, is being held back by legacy systems and the data silos they create.

“Architecture fragmentation is a headache for many chief data officers, not just because of the silos but also because of the multitude of on-premise and cloud-based tools that many organizations use,” the MIT survey concludes. “Coupled with poor data quality, these issues mean that enterprise data platforms – and the machine learning and analytics models they support – are not fast or scalable enough to deliver the business outcomes they want.”

One consequence is the inability to scale machine learning use cases. The biggest challenge, according to more than half of the respondents, is the current lack of a central place to store and discover machine learning models.

This disconnect contributes to the inability to move AI workloads into production, an indication that “serious difficulties in collaboration between [machine learning], data [science] and business user teams are a reality,” according to 39 percent of those surveyed.

What to do? The survey predicts an accelerated shift over the next two years to cloud-native platforms that are better suited to supporting data management, especially the growing volumes of streaming and unstructured data. That shift is expected to improve data analytics and machine learning capabilities along with the data strategies they support.

In addition to cloud migrations, data managers struggling to develop new architectures that drive machine learning cite the need for open data formats and other open source standards.

The sponsor of the study, Databricks, used the results to promote its “Lakehouse” architecture, which was introduced last year and which includes real-time streaming, batch processing, SQL analytics, data science and, last but not least, machine learning.

The MIT study “suggests that companies need to create four different stacks to handle all of their data workloads: business analytics, data engineering, streaming and machine learning,” the Apache Spark developer said.

“All four stacks require very different technologies and unfortunately sometimes don’t work well together.”

The MIT survey’s “top performers” for effective data strategies were financial services companies and, surprisingly, the government and the public sector. The keys to success included less data duplication, easier data access, fast processing of large amounts of data, and improved data quality.
