Linux Foundation to promote dataset sharing and software dev techniques

At the Linux Foundation Membership Summit this week, the Linux Foundation, the nonprofit tech consortium founded in 2000 to standardize Linux and support its growth, announced two new projects: Project OpenBytes and the NextArch Foundation. OpenBytes is an “open data community” as well as a new data standard and format primarily for AI applications, while NextArch, which is spearheaded by Tencent, is dedicated to building software development architectures that support a range of environments.

Dataset holders are often reluctant to share their datasets publicly due to a lack of knowledge about licenses. In a recent Princeton study, the coauthors found that opacity around licensing, along with the creation of derivative datasets and AI models, can introduce serious ethical issues, particularly in the computer vision domain.

OpenBytes is a multi-organization effort, governed by the Linux Foundation, charged with creating an open data standard, structure, and format that reduces data contributors’ liability risks. The format for data published, shared, and exchanged will be available on the project’s future platform, ostensibly helping data scientists find the data they need and making collaboration easier.

Linux Foundation senior VP Mike Dolan believes that if data contributors understand that their ownership of data is well-protected and that their data won’t be misused, more data will become accessible. He also thinks that initiatives like OpenBytes could save substantial capital and labor now spent on repetitive data collection tasks. According to a CrowdFlower survey, data scientists spend 60% of their time cleaning and organizing data and 19% of their time collecting datasets.

“The OpenBytes project and community will benefit all AI developers, both academic and professional and at both large and small enterprises, by enabling access to more high-quality open datasets and making AI deployment faster and easier,” Dolan said in a statement.

Autonomous car company Motional (a joint venture of Hyundai and Aptiv), Predibase, Zilliz, Jina AI, and ElectrifAi are among the early members of OpenBytes.

As for NextArch, it’s meant to serve as a “neutral home” for open source developers and contributors to build an architecture that can support compatibility between microservices. “Microservices” refers to a type of architecture that structures an app as a collection of small, independently deployable services, with the aim of enabling the rapid, frequent, and reliable delivery of large and complex apps.

Cloud-native computing, AI, the internet of things (IoT), and edge computing have spurred enterprise growth and digital investment. According to market research, the digital transformation market was valued at $336.14 billion in 2020 and is expected to grow at a compound annual growth rate of 23.6% from 2021 to 2028. But a lack of common architecture is preventing developers from fully realizing these technologies’ promises, Linux Foundation executive director Jim Zemlin asserts.

“Developers today have to make what feel like impossible decisions among different technical infrastructures and the proper tool for a variety of problems,” Zemlin said in a press release. “Every tool brings learning costs and complexities that developers don’t have the time to navigate, yet there’s the expectation that they keep up with accelerated development and innovation.”

Enterprises generally see great value in emerging technologies and next-generation platforms and customer channels. Deloitte reports that the implementation of digital technologies can help accelerate progress towards organizational goals such as financial returns, workforce diversity, and environmental targets by up to 22%. But existing blockers often prevent companies from fully realizing these benefits. According to a Tech Pro survey, the remaining challenges for digital transformation implementation include securing buy-in from management and users, training employees on new technology, defining policies and procedures for governance, and ensuring that the right IT skill sets are on board to support digital technologies.

To that end, NextArch aims to improve data storage, heterogeneous hardware, engineering productivity, telecommunications, and more through “infrastructure abstraction solutions,” specifically new frameworks, designs, and methods. The project will seek to automate operations and processes to “increase the autonomy of [software] teams” and create tools for enterprises that address the problems of productization and commercialization in digital transformation.

“NextArch … understands that solving the biggest technology challenges of our time requires building an open source ecosystem and fostering collaboration,” Dolan said in a statement. “This is an important effort with a big mission, and it can only be done in the open source community. We are happy to support this community and help build open governance practices that benefit developers throughout its ecosystem.”

Originally appeared on: TheSpuzz