We build Callisto with the mindset that it should be the best way to do data science on a Mac. Part of that is helping users get the most out of their Mac hardware by using computational libraries optimized for Apple Silicon. TensorFlow is one of the most popular machine learning libraries, so let’s take a look at what it takes to use an M1-optimized version of TensorFlow with a Jupyter notebook in Callisto.

TensorFlow has a feature called PluggableDevice which lets developers create plugins for different pieces of ML hardware. Conveniently for us, Apple has written a Metal plugin that is heavily optimized for Apple Silicon devices like the M1 and M2 chips. Now we just have to get it installed.

In theory, you should be able to just install the macOS build of TensorFlow and then the PluggableDevice plugin for Metal with these commands:

pip install tensorflow-macos
pip install tensorflow-metal

With Callisto, you can use our fancy package manager interface to install tensorflow-macos and tensorflow-metal. Unfortunately, other package dependencies mean that pip won’t install the latest tensorflow-macos, version 2.12.0, but instead falls back one version to 2.11.0. Meanwhile, pip will happily install the latest tensorflow-metal, but the PluggableDevice interface is a C API that is tightly bound to a specific TensorFlow version. Both modules install, but at runtime there’s a symbol mismatch error and the Metal plugin fails to load.
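
A quick sanity check for this kind of failure is to ask TensorFlow which devices it can actually see: if the Metal plugin loaded, a GPU shows up in the list, and if it didn’t, you only get the CPU. Here’s a minimal diagnostic sketch:

import tensorflow as tf

# Print the TensorFlow version pip actually resolved
print(tf.__version__)

# If the Metal plugin loaded, this list contains a GPU device;
# if the plugin failed to load, the list is empty
print(tf.config.list_physical_devices('GPU'))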

Cue montage of trying to install several permutations of these two packages.

To jump to the end: as suggested in this post on the Apple Dev Forum, more recent versions seem to have problems, but falling back to tensorflow-macos version 2.9.0 and tensorflow-metal version 0.5.0 works without a hitch. Pip will install those versions with the following commands:

pip install tensorflow-macos==2.9.0
pip install tensorflow-metal==0.5.0

Don’t forget, you can pin versions in Callisto’s package manager right in the package field by adding the version specifier: instead of just tensorflow-macos, use tensorflow-macos==2.9.0.
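
If you want to double-check which versions actually ended up in your environment, a small sketch using the standard library’s importlib.metadata (assuming Python 3.8 or later) will tell you:

import importlib.metadata as md

# These should print 2.9.0 and 0.5.0 after the installs above
print(md.version('tensorflow-macos'))
print(md.version('tensorflow-metal'))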

Now that we’re up and running, let’s do some tests! We want to compare running on the CPU alone versus running with the hardware-accelerated Metal GPU. Here’s a little bit of code to disable the GPU-accelerated device in TensorFlow:

import tensorflow as tf

# Check which version of TensorFlow is loaded
tf.__version__

# Set to True to hide the Metal GPU and force TensorFlow onto the CPU
disable_gpu = True

if disable_gpu:
    tf.config.set_visible_devices([], 'GPU')

# List the devices TensorFlow will actually use
tf.config.get_visible_devices()

When disable_gpu is True, you should see only one CPU device in the output; when the GPU isn’t disabled, you should see both the CPU and the GPU. TensorFlow doesn’t deal well with changing device visibility after the library is up and running, so to switch the GPU on or off, remember to restart your Jupyter kernel.

Now we’re ready to test! First I tried this Quickstart for Beginners from the TF website. Running this example on the CPU, it completed in 7 seconds. With the GPU enabled, it took 42 seconds. What, what?! It’s slower using the fancy Metal-optimized GPU plugin? Yep, turns out that’s right. As noted on Apple’s tensorflow-metal page, the CPU can be faster for small jobs. Well, that’s a little disappointing.
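
If you want to reproduce the comparison, here’s a rough sketch of the quickstart’s small MNIST model with a simple timer around model.fit. It approximates the tutorial code rather than copying it verbatim, so expect your numbers to vary:

import time
import tensorflow as tf

# The small MNIST model from the Quickstart for Beginners tutorial
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Time the training run; repeat with disable_gpu set to True and then False
start = time.perf_counter()
model.fit(x_train, y_train, epochs=5)
print(f'Training took {time.perf_counter() - start:.1f} seconds')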

Now if we look at Apple’s example on that same page, it’s got a little more heavy lifting to do. Running that on my M1 CPU takes just under half an hour: 29 minutes and 12 seconds. On the GPU, it blazes through the job in 5 minutes and 10 seconds! Cutting my run time to roughly 1/6 of the original is definitely a solid improvement. That kind of performance boost makes all the installation headaches worth it!

With tensorflow-metal on the cusp of a 1.0.0 release, we’re excited to see how we can integrate it into our builds and ship it out of the box with Callisto, but until then, these instructions should help shepherd you through a manual install.