The metaverse has captivated our collective imagination. Rapid growth in internet-connected devices and virtual content is pushing the metaverse toward mainstream adoption, requiring businesses to go beyond traditional approaches to creating metaverse content. However, next-generation technologies such as the metaverse, which employ artificial intelligence (AI) and machine learning (ML), rely on enormous datasets to function effectively.
This reliance on large datasets brings new challenges. Technology users have become more conscious of how their sensitive personal data is acquired, stored and used, resulting in regulations designed to prevent organizations from using personal data without explicit permission.
Without large amounts of accurate data, it’s impossible to train or develop AI/ML models, which severely limits metaverse development. As this quandary becomes more pressing, synthetic data is gaining traction as a solution.
In fact, according to Gartner, by 2024, 60% of the data required for AI and analytics projects will be generated synthetically.
Machine learning algorithms generate synthetic data by training on real data to learn its behavioral patterns, then producing simulated data that retains the statistical properties of the original dataset. Such data can replicate real-world circumstances and, unlike standard anonymized datasets, it is not vulnerable to the same flaws as real data.
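The core idea can be sketched in a few lines. In this illustrative toy example (a made-up two-field dataset, not a production pipeline), the "model" is simply the estimated mean and covariance of the real data, from which brand-new records are sampled:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for sensitive real data: 1,000 records with two
# correlated fields (say, age and income).
real = rng.multivariate_normal(mean=[35.0, 52000.0],
                               cov=[[64.0, 12000.0],
                                    [12000.0, 9.0e6]],
                               size=1000)

# "Train" on the real data by estimating its statistical properties...
mu = real.mean(axis=0)
sigma = np.cov(real, rowvar=False)

# ...then sample new records that share those properties but
# correspond to no real individual.
synthetic = rng.multivariate_normal(mu, sigma, size=1000)
```

Real generators are far more sophisticated (deep generative models that capture nonlinear structure), but the privacy benefit is the same: the synthetic rows mirror the distribution without reproducing any individual's record.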
Reimagining digital worlds with synthetic data
As AR/VR and metaverse development progresses toward more accurate digital environments, new capabilities are required for humans to interact seamlessly with the digital world. These include the ability to interact with virtual objects, on-device rendering optimization using accurate eye gaze estimation, realistic user avatar representation and the creation of a solid 3D digital overlay on top of the actual environment. ML models learn 3D representations such as meshes, morphable models and surface normals from photographs, and obtaining such visual data to train these AI models is challenging.
Training a 3D model requires a large quantity of face and full body data, including precise 3D annotation. The model also must be taught to perform tasks such as hand pose and mesh estimation, body pose estimation, gaze analysis, 3D environment reconstruction and codec avatar synthesis.
“The metaverse will be powered by new and powerful computer vision machine learning models that can understand the 3D space around a user, capture motion accurately, understand gestures and interactions, and translate emotion, speech, and facial details to photorealistic avatars,” Yashar Behzadi, CEO and founder of Synthesis AI, told VentureBeat.
“To build these, foundational models will require large amounts of data with rich 3D labels,” Behzadi said.
For these reasons, the metaverse is experiencing a paradigm shift — moving away from modeling and toward a data-centric approach to development. Rather than making incremental improvements to an algorithm or model, researchers can optimize a metaverse’s AI model performance much more effectively by improving the quality of the training data.
“Conventional approaches to building computer vision rely on human annotators who cannot provide the required labels. However, synthetic data, or computer-generated data that mimics reality, has proven a promising new approach,” said Behzadi.
Using synthetic data, companies can generate customizable data that can make projects run more efficiently as it can be easily distributed between creative teams without worrying about complying with privacy laws. This provides greater autonomy, enabling developers to be more efficient and focus on revenue-driving tasks.
Behzadi says he believes coupling cinematic visual effects technologies with generative AI models will allow synthetic data technologies to provide vast amounts of diverse and perfectly labeled data to power the metaverse.
To enhance user experience, hardware devices used to step into the metaverse play an equally important role. However, hardware has to be supported by software that makes the transition between the real and virtual worlds seamless, and this would be impossible without computer vision.
To function properly, AR/VR hardware needs to understand its position in the real world so it can present users with a detailed and accurate 3D map of the virtual environment. Therefore, gaze estimation (i.e., determining where a person is looking from an image of their face and eyes) is a crucial problem for current AR and VR devices. In particular, VR depends heavily on foveated rendering, a technique in which the image in the center of the field of view is produced in high resolution and excellent detail, while the image on the periphery deteriorates progressively.
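Foveated rendering can be pictured as a simple falloff curve. In this hedged sketch (the 5-degree foveal cutoff and 1/8 minimum scale are illustrative values, not drawn from any shipping headset), the resolution budget for a pixel is a function of its angular distance from the estimated gaze point:

```python
def foveated_scale(eccentricity_deg: float,
                   fovea_deg: float = 5.0,
                   min_scale: float = 0.125) -> float:
    """Resolution scale for a pixel at a given angular distance from
    the gaze point: full detail inside the foveal region, then a
    smooth falloff toward a coarse floor in the periphery."""
    if eccentricity_deg <= fovea_deg:
        return 1.0  # center of gaze: render at full resolution
    # Inverse-linear falloff beyond the fovea, clamped to the floor.
    return max(fovea_deg / eccentricity_deg, min_scale)

# Center of gaze renders sharp; the periphery is progressively coarser.
print(foveated_scale(2.0))   # 1.0
print(foveated_scale(20.0))  # 0.25
print(foveated_scale(60.0))  # 0.125
```

Because the savings depend entirely on knowing where the user is looking, even small errors in the gaze estimate force the sharp region to be enlarged, which is why accurate gaze-estimation models matter so much here.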
According to Richard Kerris, vice president of the Omniverse development platform at NVIDIA, synthetic data generation can act as a remedy for such cases, as it can provide visually accurate examples of use cases when interacting with objects or constructing environments for training.
“Synthetic data generated with simulation expedites AR/VR application development by providing continuous development integration and testing workflows,” Kerris told VentureBeat. “Furthermore, when created from the digital twin of the actual world, such data can help train AIs for various near-field sensors that are invisible to human eyes, in addition to improving the tracking accuracies of location sensors.”
When entering virtual reality, one needs to be represented by an avatar for an immersive virtual social experience. Future metaverse environments will need photorealistic virtual avatars that represent real people and can capture their poses. However, constructing such an avatar is a tricky computer vision problem, which is now being addressed through the use of synthetic data.
Kerris explained that one of the biggest challenges for virtual avatars is how highly personalized they are. This generation of users wants a diverse variety of high-fidelity avatars, along with accessories such as clothes and hairstyles, and related emotions, without compromising privacy.
“Procedural generation of diverse digital human characters at a large scale can create endlessly different human poses and animate characters for specific use cases. Procedural generation using synthetic data helps address these many styles of avatars,” Kerris said.
Identifying objects with computer vision
To estimate the position and material properties of 3D objects in digital worlds such as the metaverse, light must interact with each object and its environment much as it does in the real world. Therefore, AI-based computer vision models for the metaverse must understand an object’s surfaces to render it accurately within the 3D environment.
According to Swapnil Srivastava, global head of data and analytics at Evalueserve, by using synthetic data, AI models can achieve more realistic tracking across body types, lighting/illumination, backgrounds and environments, among other factors.
“Metaverse/omniverse or similar ecosystems will depend highly on photorealistic expressive and behavioral humans, now achievable with synthetic data. It is humanly impossible to annotate 2D and 3D images at a pixel-perfect scale. With synthetic data, this technological and physical barrier is bridged, allowing for accurate annotation, diversity, and customization while ensuring realism,” Srivastava told VentureBeat.
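A minimal sketch of why rendered data comes with pixel-perfect labels: in a synthetic scene, the renderer already knows which object produced every pixel, so segmentation masks fall out for free. The tiny 8×8 "ID buffer" and the class names below are invented stand-ins for a real renderer's output:

```python
import numpy as np

# Stand-in for a renderer's ID buffer: because the scene is generated,
# every pixel's source object is known exactly, so the segmentation
# mask is a byproduct of rendering, not a human annotation task.
H, W = 8, 8
BACKGROUND, AVATAR, PROP = 0, 1, 2  # illustrative class IDs

id_buffer = np.full((H, W), BACKGROUND, dtype=np.uint8)
id_buffer[2:6, 1:4] = AVATAR   # the "avatar" occupies these pixels
id_buffer[5:8, 5:8] = PROP     # a "prop" occupies these

# Pixel-perfect label masks, one per class, at zero annotation cost.
masks = {cls: (id_buffer == cls) for cls in (BACKGROUND, AVATAR, PROP)}
```

The same principle extends to labels humans could never produce by hand: exact depth, surface normals and 3D landmark positions are all known quantities inside the rendering pipeline.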
Gesture recognition is another primary mechanism for interacting with virtual worlds. However, building models for accurate hand tracking is intricate, given the complexity of the hands and the need for 3D positional tracking. Further complicating the task is the need to capture data that accurately represents the diversity of users, from skin tone to the presence of rings, watches, shirt sleeves and more.
Behzadi says that the industry is now using synthetic data to train hand-tracking systems to overcome such challenges.
“By leveraging 3D parametric hand models, companies can create vast amounts of accurately 3D labeled data across demographics, confounds, camera viewpoints and environments,” Behzadi said.
“Data can then be produced across environments and camera positions/types for unprecedented diversity since the data generated has no underlying privacy concerns. This level of detail is orders of magnitude greater than what can be provided by humans and is enabling a greater level of realism to power the metaverse,” he added.
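One way to picture the diversity-by-construction argument, as a hypothetical sketch rather than any vendor's actual pipeline (the attribute lists and the 20-angle pose below are invented placeholders; real parametric hand models expose far richer shape and pose parameters), is procedural sampling, where every label is known exactly because it is chosen before rendering:

```python
import random

random.seed(7)

# Illustrative attribute ranges for generating hand-tracking data.
SKIN_TONES = ["I", "II", "III", "IV", "V", "VI"]   # Fitzpatrick scale
ACCESSORIES = ["none", "ring", "watch", "sleeve"]
CAMERAS = ["headset_left", "headset_right", "world_facing"]

def sample_hand_example():
    """Draw one fully labeled synthetic training example: every field
    is ground truth because we chose it before rendering."""
    return {
        "skin_tone": random.choice(SKIN_TONES),
        "accessory": random.choice(ACCESSORIES),
        "camera": random.choice(CAMERAS),
        # 20 joint angles in degrees, standing in for a full 3D pose.
        "joint_angles": [random.uniform(0.0, 90.0) for _ in range(20)],
    }

# Coverage of every skin tone, accessory and viewpoint is guaranteed
# by construction, not by hoping collected data happens to include it.
dataset = [sample_hand_example() for _ in range(1000)]
```

Controlling the sampling distribution directly is what lets teams balance demographics and confounds deliberately, instead of inheriting whatever bias the collected footage contains.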
Srivastava said that compared with today's platforms, the metaverse will collect more personal data, such as facial features, body gestures, and health, financial, social-preference and biometric information, among many other types.
“Protecting these personal data points should be the highest priority. Organizations need effective data governance and security policies, as well as a consent governance process. Ensuring ethics in AI would be very important to scaling effectiveness in the metaverse while creating responsible data for training, storing, and deploying models in production,” he said.
Similarly, Behzadi said that synthetic data technologies will allow building more inclusive models in privacy-compliant and ethical ways. However, because the concept is new, broad adoption will require education.
“The metaverse is a broad and evolving term, but I think we can expect new and deeply immersive experiences — whether it’s for social interactions, reimagining consumer and shopping experiences, new types of media, or applications we have yet to imagine. New initiatives like OpenSynthetics.com are a step in the right direction to help build a community of researchers and industrial partners to advance the technology,” said Behzadi.
Creating simulation-ready data sets is challenging for companies wanting to use synthetic data generation to build and operate virtual worlds in the metaverse. Kerris says that off-the-shelf 3D assets aren’t enough to implement accurate training paradigms.
“These data sets must have the information and characteristics that make them useful. For example, weight, friction and other factors must be included in the asset for them to be useful in training,” Kerris said. “We can expect an increased set of sim-ready libraries from companies, which will help accelerate the use cases for synthetic data generation in metaverse applications, for industrial use cases like robotics and digital twins.”