ONNX is meant to address one of the key problems in the machine learning ecosystem today: there is a profusion of frameworks for building and running neural networks and other machine learning systems, but they aren’t interoperable.
Using ONNX, it’s possible for Facebook to export a trained model created with PyTorch and use it with Caffe2 for inference. That’s important for taking a model created in research (something that PyTorch is good at) and bringing it to production with Caffe2. Microsoft said that it’s working on releasing a version of Cognitive Toolkit (also known as CNTK) that supports ONNX.
The system works by tracing how a neural network built with one of these frameworks executes at runtime, then using that information to create a generic computation graph that can be moved between frameworks. That’s possible because each framework ends up performing very similar computations, even though their higher-level representations differ.
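To make the tracing idea concrete, here is a toy sketch in plain Python. It is not how PyTorch, Caffe2, or ONNX actually implement tracing. The `Traced` class, the `trace` and `run` helpers, and the three-operator vocabulary are hypothetical names invented for this illustration. The sketch assumes a model is just a function over scalar values: running it once on example inputs records every operation into a list of nodes, and a separate, framework-independent interpreter can then replay that list.

```python
import json
import operator

# A tiny shared operator vocabulary (an assumption for this sketch).
OPS = {
    "add": operator.add,
    "mul": operator.mul,
    "relu": lambda v: max(v, 0.0),
}


class Traced:
    """A value that logs every operation applied to it into a shared graph."""

    def __init__(self, name, value, graph):
        self.name, self.value, self.graph = name, value, graph

    def _record(self, op, *args):
        # Compute the real result and append the operation as a graph node.
        out_name = f"t{len(self.graph)}"
        result = OPS[op](*(a.value for a in args))
        self.graph.append((op, [a.name for a in args], out_name))
        return Traced(out_name, result, self.graph)

    def __add__(self, other):
        return self._record("add", self, other)

    def __mul__(self, other):
        return self._record("mul", self, other)

    def relu(self):
        return self._record("relu", self)


def trace(fn, example_inputs):
    """Run fn once on example inputs and return the recorded graph."""
    nodes = []
    args = [Traced(name, value, nodes) for name, value in example_inputs.items()]
    out = fn(*args)
    return {"inputs": list(example_inputs), "nodes": nodes, "output": out.name}


def run(graph, inputs):
    """Replay a recorded graph on new inputs, without the original function."""
    env = dict(inputs)
    for op, in_names, out_name in graph["nodes"]:
        env[out_name] = OPS[op](*(env[n] for n in in_names))
    return env[graph["output"]]


def tiny_layer(x, w, b):
    # The "high-level" model definition, as a framework user would write it.
    return (x * w + b).relu()


graph = trace(tiny_layer, {"x": 2.0, "w": 3.0, "b": -1.0})

# The graph is plain data, so it can be serialized and loaded elsewhere.
restored = json.loads(json.dumps(graph))
result = run(restored, {"x": 1.0, "w": 2.0, "b": 0.5})
print(graph["nodes"])
print(result)
```

The recorded graph contains only operator names and tensor names, so any runtime that understands the same operator vocabulary could execute it. That shared vocabulary is the part an interchange format has to standardize.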
Right now, the biggest issue with ONNX is that it isn’t compatible with some other popular machine learning frameworks, including TensorFlow, which originated at Google, and Apache MXNet, which is Amazon’s preferred machine learning framework.
Adding that support is non-trivial: Facebook said it had to make changes to both PyTorch and Caffe2 in order to support the project. Microsoft and Facebook have said they hope the open source community will help them evolve ONNX so that more frameworks can be supported in the future.
In addition, ONNX doesn’t support some more complex networks, like those created in PyTorch with dynamic flow control. That’s something Facebook plans to add in the future.
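A likely reason, given the trace-based approach, is that a single trace only sees the branch taken for the example input. The toy sketch below illustrates that general limitation of tracing. It is not PyTorch or ONNX code, and `Traced`, `trace`, `run`, and `dynamic_model` are hypothetical names made up for the example.

```python
import operator

# Minimal operator vocabulary (an assumption for this sketch).
OPS = {"mul": operator.mul, "neg": operator.neg}


class Traced:
    """A value that logs arithmetic operations into a shared graph."""

    def __init__(self, name, value, graph):
        self.name, self.value, self.graph = name, value, graph

    def _record(self, op, *args):
        out_name = f"t{len(self.graph)}"
        result = OPS[op](*(a.value for a in args))
        self.graph.append((op, [a.name for a in args], out_name))
        return Traced(out_name, result, self.graph)

    def __mul__(self, other):
        return self._record("mul", self, other)

    def __neg__(self):
        return self._record("neg", self)

    def __gt__(self, threshold):
        # The comparison returns a plain bool, so the branch decision
        # happens in Python and never appears in the recorded graph.
        return self.value > threshold


def trace(fn, name, example):
    """Run fn once on one example input and return the recorded graph."""
    nodes = []
    out = fn(Traced(name, example, nodes))
    return {"nodes": nodes, "output": out.name}


def run(graph, inputs):
    """Replay a recorded graph on new inputs."""
    env = dict(inputs)
    for op, in_names, out_name in graph["nodes"]:
        env[out_name] = OPS[op](*(env[n] for n in in_names))
    return env[graph["output"]]


def dynamic_model(x):
    # Data-dependent control flow: the path depends on the input value.
    if x > 0:
        return x * x
    return -x


# Tracing with a positive example records only the `x * x` branch.
graph = trace(dynamic_model, "x", 3.0)

replayed = run(graph, {"x": -2.0})  # replays x * x
expected = dynamic_model(-2.0)      # the real model takes the -x branch
print(graph["nodes"], replayed, expected)
```

For a negative input, the replayed graph and the original function disagree, because the `if` was resolved once at trace time. Supporting such models requires representing the control flow itself in the exported graph, not just the operations that happened to run.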