Actually, you do get a speedup from threads in Python when using numpy, because numpy releases the GIL internally for many of its operations, so that work really does run in parallel. (C libraries can release the GIL and get the benefit of multiple processors. Once you're back in plain Python code, though, that part won't run in parallel because of the GIL.)
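To make that concrete, here is a rough sketch of how you could check it yourself. The function names (`heavy`, `run_serial`, `run_threaded`) and the workload are made up for illustration. The only assumption is that most of the time is spent inside numpy's compiled loops on a plain float array, which is where numpy can drop the GIL.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def heavy(seed: int, n: int = 1_000_000) -> float:
    # Hypothetical workload: almost all of the time is spent inside
    # numpy's compiled code, which can release the GIL while it runs.
    rng = np.random.default_rng(seed)
    x = rng.random(n)
    return float(np.sum(np.sqrt(x * x + 1.0)))


def run_serial(seeds):
    # One call after another on the main thread.
    return [heavy(s) for s in seeds]


def run_threaded(seeds, workers=4):
    # Same calls, spread over a thread pool; map() keeps the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(heavy, seeds))


if __name__ == "__main__":
    seeds = list(range(8))

    start = time.perf_counter()
    run_serial(seeds)
    print("serial:  ", time.perf_counter() - start)

    start = time.perf_counter()
    run_threaded(seeds)
    print("threaded:", time.perf_counter() - start)
```

How much faster the threaded version is (if at all) depends on the number of cores, the array size, and how much of the work happens in numpy versus plain Python. With small arrays the Python-level overhead dominates and the GIL takes over again.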