The C++23 standard will not include an asynchronous execution feature called senders and receivers, which allows code to run simultaneously across a system's different chips, such as CPUs and GPUs.
“The goal there is maybe to try to get it into the working draft next year: the [C++26] working draft, so once it’s there, people will take it a lot more seriously,” said Nevin Liber, a computer scientist at Argonne National Laboratory’s Leadership Computing Facility and a C++ committee member, during a working session at last month’s Supercomputing 2022 conference in Dallas.
Software applications written in C++ are undergoing fundamental changes, with PCs, servers, and mobile devices running code simultaneously on multiple chips. The goal with senders and receivers is to update the standard C++ framework to make it easier for developers to write applications that take advantage of the new runtime environments.
Programmers are increasingly writing code for CPUs and accelerators such as GPUs and AI chips, which are important for faster application execution.
“While the C++ standard library has a rich set of concurrency primitives…and lower-level building blocks…we lack a standard vocabulary and framework for asynchrony and parallelism that C++ programmers desperately need,” states the document that outlines the proposal.
Senders and Receivers
Currently, C++ code must be optimized for specific hardware. But senders and receivers will add a layer of abstraction so that standard C++ code runs in multiple parallel environments. The goal is to add portability, so the code works on different installations.
“We certainly have ideas of how to connect that with algorithms. My hope would be that for C++26 we can do that. You have a nice way of connecting these things and you also have… algorithms that can do asynchronous work,” said Christian Trott, a senior staff member at Sandia National Laboratories and also a member of the C++ standards committee.
The asynchronous execution feature is being driven largely by Nvidia, whose CUDA parallel programming framework is widely used in machine learning, which relies on CPU and GPU concurrency to reduce training time.
Nvidia has open-sourced its libcu++ C++ library. The company also released the CUDA 12.0 parallel programming framework last week, which supports the C++20 standard and host compilers such as GCC 10, Clang 11, and Arm C/C++ 22.x.
Senders/receivers may not have made it into C++23, but the feature will make life easier for coders in the future, Stephen Jones, a CUDA architect at Nvidia, told The New Stack.
“I feel pretty confident about 2026, but senders/receivers is a big change in C++. It’s something really very new for them to try to adopt: a kind of pipelined asynchronous execution,” Jones said.
Mature Technology Is Needed
While the delay of a key feature may not look good on paper, C++ committee members said it’s best to wait for a technology to mature before adding it as a standard. Accelerator computing is in its infancy, with ever-changing chip designs, memory, and storage requirements.
“I think we need to see more accelerators,” said James Reinders, an Intel software engineer, adding: “I think that needs a little more time to develop.”
Intel provides a tool called SYCLomatic that makes code portable across hardware by migrating CUDA code, which ties applications to Nvidia GPUs, to the cross-vendor SYCL framework. Reinders said that GPUs won’t be the only accelerators available.
Reinders also pointed to a vigorous debate over whether hooks for technologies like remote memory belong permanently in standard C++. Some are better as extensions, he said.
“Give it some time to develop and we’ll see if that’s the right thing to put in C++ or if it’s better as an extension. OpenMP has been very strong for a long time, but it has never been incorporated into Fortran or C. It’s appropriate not to overcomplicate a base language,” Reinders said.