Databricks to Move Signature Data-Storage Technology Upgrades to Open Source

Data-analytics company Databricks Inc. plans to open-source all of the capabilities and upgrades it has made to Delta Lake, its flagship cloud-based data-storage technology, essentially giving them away free online.

The move would enable information-technology teams at outside companies to build and operate their own custom data lakehouse, a type of digital repository in the cloud where software developers can build artificial-intelligence applications designed to glean business insights based on massive amounts of data.

Delta Lake technology is a key component of a lakehouse designed to ensure the quality and reliability of AI-ready data. Databricks initially launched Delta Lake as an open-source software project in 2019. But until now, many of the new features it has added since then were proprietary, available only to Databricks’s customers.

Based in San Francisco, Databricks makes money by renting analytics, AI and other cloud-based software designed to help companies mine insights from business data. The services are based on open-source Apache Spark, a real-time data-analytics technology that emerged from the University of California, Berkeley in 2009. Open-source developers make software available free of charge, allowing programmers to modify and share the underlying source code, and create their own apps.

Databricks co-founder and Chief Executive Ali Ghodsi said the decision to make the technology available free is aimed at attracting commercial customers who are wary of being locked into a single data-management vendor and limited to using its analytics tools.

Mr. Ghodsi said more lakehouses will drive demand for Databricks’s analytics software and other services. The company will also continue handling security, maintenance and other software issues for customers deploying its tools within their own data lakehouse, he said.

Data kept in conventional data-storage systems needs to be copied, reformatted and shifted to a separate repository, where software developers can access it to create AI apps. A data lakehouse takes that step out of the process, Mr. Ghodsi said. “The lakehouse combines these two worlds into one place, where you have all your data and where you can apply AI,” he said. “One system, one copy.”

Demand for AI and data analytics is running high, as tougher economic conditions are prompting more companies to adopt software that promises to drive better business decisions or identify ways to improve everything from supply chains to customer services.

Ganesh Jayaram, chief information officer at farm- and construction-equipment maker Deere & Co., a Databricks customer, said Databricks’s open-source move would enable the company to develop its own custom data lakehouse.

“It allows us to scale analytics, at a scale we couldn’t do in the past,” Mr. Jayaram said about data lakehouse technology.

Among other applications, the company uses AI-powered data and analytics to support predictive-maintenance systems for tractors, backhoes and other vehicles in the field, which are loaded with data-collecting sensors. In January, it unveiled a fully autonomous tractor.

But Mr. Jayaram said he also plans to increase the use of AI in efforts to optimize supply chains, marketing, finance and other internal business functions.

“When engineers use an open-source technology, they have access to the source code and can make their own custom version,” said Christopher Condo, a principal analyst at IT research firm Forrester Inc. “At minimum, the users can at least inspect how the open-source software works.”

Open-source software business models have become increasingly popular. Up to 80% of the code in new software projects is estimated to consist of third-party components, most of them open source, said Mark Driver, a vice president and analyst at IT research and consulting firm Gartner Inc.

The most common way that software developers make money from open-source tools is through a so-called freemium strategy, Mr. Driver said. Under that approach, a tech vendor gives away an app’s underlying code free, then converts a share of its users into paid customers by offering more advanced features and services.

Databricks, a nine-year-old company that is itself built in part on open-source software, has a private-market valuation of $38 billion. Earlier this year, it reported $800 million in annual recurring revenue for 2021. It hasn’t disclosed net income.

Before tech-market values began to fall this year, a Databricks initial public offering was one of the most highly anticipated in the startup market, a move that Mr. Ghodsi now says isn’t a priority. “We don’t have to worry about running out of money for a decade to come,” he said.

Write to Angus Loten at

Copyright ©2022 Dow Jones & Company, Inc. All Rights Reserved.
