See how two halves of a neural network collaborate, in such programs, with one constantly telling the other: That’s not good enough. Over and over. Until it is.
Large transformer models are mainstream nowadays, creating SoTA results for a variety of tasks. They are powerful but very expensive to train and use. The extremely high inference cost, in both time and memory, is a big bottleneck for adopting a powerful transformer for solving real-world tasks at scale.
Why is it hard to run inference for large transformer models? Besides the increasing size of SoTA models, there are two main factors contributing to the inference challenge (Pope et al.
A new technique greatly reduces the error in an optical neural network, which uses light to process data instead of electrical signals. With their technique, the larger an optical neural network becomes, the lower the error in its computations. This could enable them to scale these devices up so they would be large enough for commercial uses.