Developed by a team of MIT graduate students, TensorFire can run TensorFlow-style machine learning models on any GPU, without requiring the GPU-specific middleware typically needed by machine learning libraries such as Keras-js.
TensorFire is another step towards making machine learning available to the broadest possible audience, using hardware and software people are already likely to possess, and via advances in how accurate model predictions can be served with a fraction of the resources previously needed.
The machine learning power is in your browser
TensorFire works using the WebGL standard, a cross-platform system for rendering GPU-accelerated graphics in browsers. WebGL supports GLSL, a C-like language used to write shaders, which are short programs used to transform data directly on the GPU.
Shaders are typically used in the WebGL pipeline to transform how graphics are rendered—for example, to render shadows or other visual effects. But TensorFire uses shaders to run in parallel the computations needed to generate predictions from TensorFlow models. TensorFire also comes with a library for importing existing TensorFlow and Keras models.
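To make the shader idea concrete, here is a minimal sketch in plain Python that emulates the programming model on the CPU. It is not TensorFire's API, and real shaders are written in GLSL; the names `matmul_kernel` and `run_kernel` are hypothetical. The point is only that each output value is computed by an independent call, which is what lets a GPU run them all at once.

```python
# CPU emulation of the fragment-shader programming model (illustrative only;
# this is not TensorFire's API, and real shaders are written in GLSL).

def matmul_kernel(a, b, row, col):
    """Compute ONE output element, much as a fragment shader computes one pixel."""
    return sum(a[row][k] * b[k][col] for k in range(len(b)))


def run_kernel(kernel, a, b):
    """Invoke the kernel once per output coordinate.

    A GPU would run these invocations in parallel. No invocation depends on
    another's result, so the order of this loop does not matter.
    """
    rows, cols = len(a), len(b[0])
    return [[kernel(a, b, r, c) for c in range(cols)] for r in range(rows)]


inputs = [[1.0, 2.0]]            # one sample with two features
weights = [[0.5, -1.0, 0.0],     # 2x3 weight matrix of a dense layer
           [0.25, 1.0, 2.0]]
outputs = run_kernel(matmul_kernel, inputs, weights)
print(outputs)  # [[1.0, 1.0, 4.0]]
```

A dense layer's matrix multiply fits this pattern well, because every output cell reads the inputs but never another output cell.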
With this framework, you can deploy a trained model directly into a web browser and serve predictions locally from the browser. The user doesn’t need to download, install, or compile anything; all the work is done directly in the browser. The data used to make the predictions is also processed entirely on the client. The brand of GPU doesn’t matter, either: Both AMD and Nvidia GPUs are supported.
One web-based example of TensorFire shows a style-transfer neural network, where the style of one piece of artwork can be mapped to another image. The slowest part of the demo is downloading the model and compiling the shader pipeline; the actual execution takes only a second or two.
TensorFire’s creators claim it’s faster than other solutions. Bouncing data between the GPU and the CPU is a common performance bottleneck, so TensorFire keeps as much data as possible on the GPU at any one time.
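The cost of bouncing data back and forth can be illustrated with a toy model. The sketch below assumes a hypothetical `FakeGpu` class that only counts copies between CPU and GPU; it does not touch real hardware and says nothing about how TensorFire is implemented internally. It compares copying results back after every layer with keeping intermediate results on the device.

```python
class FakeGpu:
    """Hypothetical stand-in for a GPU that counts CPU<->GPU copies."""

    def __init__(self):
        self.transfers = 0

    def upload(self, values):
        self.transfers += 1          # CPU -> GPU copy
        return list(values)

    def download(self, buffer):
        self.transfers += 1          # GPU -> CPU copy
        return list(buffer)

    def run(self, op, buffer):
        return [op(x) for x in buffer]   # result stays on the device, no copy


# Three toy "layers": scale, shift, then a ReLU-style clamp.
layers = [lambda x: x * 2.0, lambda x: x + 1.0, lambda x: max(x, 0.0)]


def round_trip_each_layer(gpu, data):
    """Copy to the GPU and back again around every single layer."""
    for op in layers:
        data = gpu.download(gpu.run(op, gpu.upload(data)))
    return data


def keep_on_gpu(gpu, data):
    """Upload once, chain all layers on the device, download once."""
    buffer = gpu.upload(data)
    for op in layers:
        buffer = gpu.run(op, buffer)
    return gpu.download(buffer)


naive, resident = FakeGpu(), FakeGpu()
result_a = round_trip_each_layer(naive, [-1.0, 0.5])
result_b = keep_on_gpu(resident, [-1.0, 0.5])
print(result_a == result_b, naive.transfers, resident.transfers)  # True 6 2
```

Both strategies produce the same predictions. Only the number of copies differs, and in this toy model it grows with the number of layers in the first case and stays constant in the second.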
Keep your data close and your predictions closer
The most prominent advantages of TensorFire’s approach are its portability and convenience. Modern web browsers run on nearly every operating system and hardware platform, and even low-end smartphones have generous amounts of GPU power to spare. Much of the work involved in getting useful results from machine learning models is setting up the machine learning pipeline, either to perform the training or to deliver the predictions. Boiling much of that process down to opening a web browser and clicking something is very useful, at least for certain classes of jobs.
Another advantage claimed by TensorFire’s creators is that it allows the deployment of predictions to be done entirely on the client. This won’t be as much of an advantage where both the trained model and the data are already deployed to the cloud. But it’s a good fit for applications where the deployed model is small, the data is client-side, and the user is uneasy about uploading anything.
A third advantage of TensorFire is that it theoretically loosens the restrictions on which brands of graphics cards can be used for machine learning, because it supports both Nvidia and AMD GPUs.
Historically, Nvidia’s CUDA standard has been the go-to for accelerating machine learning via GPUs, providing more performance than the more open-ended OpenCL standard, which supports a broad range of hardware. AMD has its own plans for working around OpenCL’s performance issues, but TensorFire lets users and developers sidestep the issue completely.
TensorFire also takes advantage of another growing phenomenon: making machine learning models more compact and efficient with a slight (typically undetectable) loss of accuracy. This “low-precision quantized tensor” approach means smaller models can be deployed to the client, and predictions can be made faster.
TensorFire’s makers also claim the “low-precision quantized tensor” approach allows the software to run on a broader range of GPUs and browsers, especially those that don’t support the full range of WebGL extensions.
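As a rough illustration of what low-precision quantization means, the sketch below applies generic 8-bit linear quantization to a handful of made-up weights. This is a textbook scheme chosen for clarity, not necessarily the one TensorFire uses, and the function names `quantize` and `dequantize` are hypothetical. Each weight becomes an integer from 0 to 255, which fits in a single byte, the size of one channel in an ordinary 8-bit RGBA texture.

```python
def quantize(weights, bits=8):
    """Map floats linearly onto integer codes in [0, 2**bits - 1]."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels or 1.0   # fall back to 1.0 if all weights are equal
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


weights = [-0.82, -0.11, 0.0, 0.37, 0.95]   # made-up example values
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# Rounding to the nearest code means no weight moves by more than half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(f"step size {scale:.5f}, largest error {max_error:.5f}")
```

Stored this way, each weight takes one byte instead of the four bytes of a 32-bit float, and the error per weight is bounded by half the step size. That trade is what the "slight loss of accuracy" refers to.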
Finally, the TensorFire team plans to release the library as an MIT-licensed open source project, so the acceleration work done in TensorFire could also be used in a broad range of applications—even those that don’t have anything to do with TensorFlow or machine learning. The framework’s creators note that the low-level GLSL API in TensorFire “can also be used to do arbitrary parallel general-purpose computation,” meaning that other frameworks for GPU-powered, in-browser, client-side computation could be built atop it.