We could write Cython/C++ code that uses multiple cores to parallelize CPU-intensive features such as the Weisfeiler-Lehman graph kernel.
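As a rough illustration of where the parallelism could go, here is a minimal pure-Python sketch (not Cython, and not an existing implementation). It assumes each graph is a pair of an adjacency dict and a label dict, and that the per-graph Weisfeiler-Lehman label refinement is the expensive step, so it can be mapped over graphs independently. The names `wl_features` and `wl_kernel_matrix` are hypothetical, and using a SHA-1 digest as the compressed label is an assumption made here so that labels agree across worker processes.

```python
from collections import Counter
import hashlib


def wl_features(graph, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for a single graph.

    graph is (adjacency, labels): adjacency maps node -> list of neighbours,
    labels maps node -> initial label string. (Assumed layout.)
    """
    adjacency, labels = graph
    labels = dict(labels)
    counts = Counter(labels.values())
    for _ in range(iterations):
        new_labels = {}
        for node, neighbours in adjacency.items():
            # Signature: own label plus the sorted multiset of neighbour labels.
            neighbour_part = ",".join(sorted(labels[n] for n in neighbours))
            signature = labels[node] + "|" + neighbour_part
            # A digest gives the same compressed label in every process,
            # unlike the built-in hash(), which is salted per process.
            new_labels[node] = hashlib.sha1(signature.encode()).hexdigest()
        labels = new_labels
        counts.update(labels.values())
    return counts


def wl_kernel_matrix(graphs, iterations=2, executor=None):
    """Return the kernel matrix as nested lists.

    executor is optional: any concurrent.futures executor. Without one,
    the features are computed serially.
    """
    if executor is None:
        features = [wl_features(g, iterations) for g in graphs]
    else:
        # Each graph is independent, so feature extraction maps cleanly
        # onto a pool of workers.
        features = list(
            executor.map(wl_features, graphs, [iterations] * len(graphs))
        )
    # Kernel value: dot product of the two label-count vectors.
    return [
        [sum(a[k] * b[k] for k in a.keys() & b.keys()) for b in features]
        for a in features
    ]
```

To actually use several cores, one would pass a `concurrent.futures.ProcessPoolExecutor` as `executor`, created under an `if __name__ == "__main__":` guard. A Cython version could presumably follow the same split, with the per-graph loop moved into compiled code.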