OpenAI launches Point-E, an AI that generates 3D models • TechCrunch
The next breakthrough to take the AI world by storm could be 3D model generators. This week, OpenAI open-sourced Point-E, a machine learning system that creates a 3D object from a text prompt. According to a paper published alongside the code base, Point-E can produce 3D models in one to two minutes on a single Nvidia V100 GPU.
Point-E doesn't create 3D objects in the traditional sense. Rather, it generates point clouds, or discrete sets of data points in space that represent a 3D shape, hence the cheeky abbreviation. (The "E" in Point-E is short for "efficiency," because it's apparently faster than previous 3D object generation approaches.) Point clouds are easier to synthesize from a computational standpoint, but they don't capture an object's fine-grained shape or texture, a key limitation of Point-E currently.
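For readers unfamiliar with the representation: a point cloud is little more than two arrays, one of XYZ coordinates and one of per-point colors, which is why it's cheap to synthesize and manipulate. A minimal sketch in Python (the point count and value ranges here are illustrative assumptions, not Point-E's actual output format):

```python
import numpy as np

# A point cloud is a discrete set of points in 3D space.
# Here: N points, each with an (x, y, z) position and an (r, g, b) color.
# The point count below is illustrative, not Point-E's actual resolution.
rng = np.random.default_rng(0)
num_points = 1024

coords = rng.uniform(-1.0, 1.0, size=(num_points, 3))  # positions in a unit cube
colors = rng.uniform(0.0, 1.0, size=(num_points, 3))   # per-point RGB in [0, 1]

# Basic operations are plain array math, which is part of why point clouds
# are easier to synthesize than meshes: e.g., recenter the cloud...
coords -= coords.mean(axis=0)

# ...or compute its axis-aligned bounding box.
bbox_min, bbox_max = coords.min(axis=0), coords.max(axis=0)
print(coords.shape, colors.shape)  # (1024, 3) (1024, 3)
```

Note there is no connectivity information here at all, which is exactly the fine-grained surface structure a point cloud omits.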
To get around this limitation, the Point-E team trained an additional AI system to convert Point-E's point clouds into meshes. (Meshes, the collections of vertices, edges, and faces that define an object, are commonly used in 3D modeling and design.) But they note in the paper that the model can sometimes miss certain parts of objects, resulting in blocky or distorted shapes.
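The mesh representation described above can be made concrete with a toy example, a tetrahedron, the simplest closed triangle mesh. The example is illustrative only and unrelated to Point-E's actual conversion model; it just shows how vertices and faces imply edges, and that Euler's formula V - E + F = 2 holds for a closed mesh of this kind:

```python
# A mesh: vertices (points in space) plus faces (index triples into the
# vertex list). Edges are implied by the faces. Example: a tetrahedron.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
faces = [
    (0, 1, 2),
    (0, 1, 3),
    (0, 2, 3),
    (1, 2, 3),
]

# Collect each face's edges as sorted pairs so shared edges deduplicate.
edges = {tuple(sorted((f[i], f[(i + 1) % 3]))) for f in faces for i in range(3)}

V, E, F = len(vertices), len(edges), len(faces)
print(V, E, F)    # 4 6 4
print(V - E + F)  # 2: Euler's formula for a closed mesh
```

A mesh conversion model that "misses certain parts of objects" would, in these terms, produce faces that fail to close the surface, which is what shows up visually as holes or distortion.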
Apart from the mesh-generating model, which stands alone, Point-E consists of two models: a text-to-image model and an image-to-3D model. The text-to-image model, similar to generative art systems like OpenAI's own DALL-E 2 and Stable Diffusion, was trained on labeled images to understand the associations between words and visual concepts. The image-to-3D model, on the other hand, was fed a set of images paired with 3D objects so that it learned to translate effectively between the two.
When given a text prompt, for example, "a 3D printable gear, a single gear 3 inches in diameter and half an inch thick," Point-E's text-to-image model generates a synthetic rendered object that is fed to the image-to-3D model, which then generates a point cloud.
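The two-stage pipeline just described is, structurally, a simple function composition. The sketch below uses placeholder functions returning dummy arrays to show the data flow; the function names, shapes, and point count are assumptions for illustration, not Point-E's actual API (in the real system both stages are diffusion models):

```python
import numpy as np

# Hypothetical stand-ins for Point-E's two stages. These just return
# deterministic dummy data of plausible shapes to show the plumbing.
def text_to_image(prompt: str) -> np.ndarray:
    """Stage 1 (placeholder): render a synthetic image of the described object."""
    rng = np.random.default_rng(len(prompt))
    return rng.uniform(0.0, 1.0, size=(64, 64, 3))  # H x W x RGB

def image_to_point_cloud(image: np.ndarray, num_points: int = 1024) -> np.ndarray:
    """Stage 2 (placeholder): produce an (x, y, z, r, g, b) point cloud."""
    rng = np.random.default_rng(image.size)
    return rng.uniform(-1.0, 1.0, size=(num_points, 6))

prompt = "a 3D printable gear, 3 inches in diameter and half an inch thick"
image = text_to_image(prompt)        # text -> synthetic rendered view
cloud = image_to_point_cloud(image)  # image -> colored point cloud
print(image.shape, cloud.shape)      # (64, 64, 3) (1024, 6)
```

Splitting the problem this way lets each stage train on the data that's abundant for it: text-image pairs for the first, image-3D pairs for the second.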
After training the models on a dataset of "several million" 3D objects and associated metadata, Point-E could produce colored point clouds that frequently matched text prompts, the OpenAI researchers say. It's not perfect: Point-E's image-to-3D model sometimes fails to understand the image from the text-to-image model, yielding a shape that doesn't match the text prompt. Still, it's much faster than the previous state of the art, at least according to the OpenAI team.
“While our method performs worse in this evaluation than more modern techniques, it produces samples in a small fraction of the time,” they wrote in the paper. “This could make it more practical for certain applications, or could enable the discovery of higher quality 3D objects.”
What are the applications, exactly? Well, the OpenAI researchers point out that Point-E’s point clouds could be used to fabricate real-world objects, for example, through 3D printing. With the additional mesh conversion model, the system could, once it’s a bit more polished, also find its way into animation and game development workflows.
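Getting a generated mesh into a 3D-printing or game-development workflow is largely a matter of file formats. As a sketch, the snippet below writes a vertices-and-faces mesh to Wavefront OBJ, a plain-text format most slicers and game engines can import; the tetrahedron data is a stand-in for whatever a generation pipeline might produce:

```python
def write_obj(path, vertices, faces):
    """Write a triangle mesh as a Wavefront OBJ file.

    OBJ is plain text: 'v x y z' lines for vertices, then 'f i j k'
    lines for faces, with 1-based vertex indices.
    """
    with open(path, "w") as fh:
        for x, y, z in vertices:
            fh.write(f"v {x} {y} {z}\n")
        for a, b, c in faces:
            fh.write(f"f {a + 1} {b + 1} {c + 1}\n")

# Illustrative tetrahedron, standing in for a generated mesh.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
write_obj("gear.obj", vertices, faces)

with open("gear.obj") as fh:
    lines = fh.read().splitlines()
print(lines[0], "|", lines[-1])  # v 0 0 0 | f 2 3 4
```

For fabrication, a slicer would then convert the closed surface into printer toolpaths; holes or distortions in the mesh, the failure mode the paper flags, are exactly what makes a model unprintable.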
OpenAI might be the latest company to jump into the 3D object generator fray, but, as alluded to above, it's certainly not the first. Earlier this year, Google released DreamFusion, an expanded version of Dream Fields, a generative 3D system that the company introduced in 2021. Unlike Dream Fields, DreamFusion requires no prior training, meaning it can generate 3D representations of objects without 3D data.
While all eyes are on 2D art generators these days, model synthesis AI could be the next big industry disruptor. 3D models are widely used in film and television, interior design, architecture, and various scientific fields. Architectural firms use them to demonstrate proposed buildings and landscapes, for example, while engineers use the models as designs for new devices, vehicles, and structures.
However, 3D models typically take a while to create, anywhere from several hours to several days. AI like Point-E could change that if the issues are ever worked out, and make OpenAI a respectable profit doing so.
The question is what kind of intellectual property disputes might arise over time. There's a large market for 3D models, with various online marketplaces, including CGStudio and CreativeMarket, allowing artists to sell content they've created. If Point-E catches on and its models make their way onto those marketplaces, modeling artists might protest, pointing to evidence that modern generative AI relies heavily on its training data: existing 3D models, in Point-E's case. Like DALL-E 2, Point-E doesn't credit or cite any of the artists who might have influenced its generations.
But OpenAI leaves that topic for another day. Neither the Point-E paper nor the GitHub page mentions copyright.
To their credit, the researchers do note that they expect Point-E to suffer from other issues, such as biases inherited from the training data and a lack of safeguards around models that could be used to create “dangerous objects.” Perhaps that's why they're careful to characterize Point-E as a “starting point” that they hope will inspire “more work” in the field of text-to-3D synthesis.