# WGS-84 distance calculations at the speed of C

When we started out doing fleet management at Visual Units, one thing was really hard to get right - distance calculations. There was no end of information available, but most-to-all of it was on a level of mathematics far beyond a poor developer who feels that anything beyond discrete mathematics and basic geometry and statistics really should be somebody else's problem. The implementations that could be found were closed-source licensed version we really could not afford at that stage.

For a while we got by using a solution that relied on having a variant of Lambert conformal conic projection coordinates - it was sufficiently exact if not perfect, and our maps used the same projection, so it worked - although there was the added burden of transforming our stored (WGS-84) coordinates to Lambert every time we needed calculations done. A couple of years ago, however, we switched to Google Maps API and so we really had no use for Lambert - and increased load and precision demands made using the current solution a worse and worse choice.

Enter Chris Veness. Or rather, enter his implementation of the Vincenty inverse formula (pdf). Even though the math is beyond me, porting the Javascript implementation to Python was straightforward, and some testing showed that the result was both faster and had better precision than the previous solution.

Fast-forward to a few months ago, suddenly the performance is starting to look like something that could become a problem. We have many reasons for doing distance calculations, and while the batch jobs were not a problem, any amount of time that can be shaved off user-initiated actions is welcome.

So, I thought to myself, I've ported it once, how hard can it be to do it again? After all, when raw speed becomes the issue, the Python programmer reaches for C. Porting it was once again straightforward, mapping the Python function

def distance(x1, y1, x2, y2): ...

into

const double distance(const double x1, const double y1, const double x2, const double y2) { ...

The resulting C code is almost identical to he Python (and Javascript) implementations but runs about 6 times faster than the Python implementation. Allowing batch submission of calculations instead of calling once for every calculation, eliminating some FFI overhead, would increase the speed further.

$ python2.7 -m tests.test_distance Time elapsed for 100000 calculations in Python: 1952.70 C: 300.46 Factor: 6.50

Wrapping the C and calling it was simple enough using ctypes, and I've added fallback to the Python implementation if the C shared library cannot be found; a small __init__.py in the package hooks up the correct version:

from .distance import distance as _py_distance try: from ctypes import cdll, c_double dll = cdll.LoadLibrary('cDistance.so') dll.distance.restype = c_double dll.distance.argtypes = [c_double, c_double, c_double, c_double] distance = dll.distance _c_distance = dll.distance except OSError: #Fall back to Python implementation distance = _py_distance

Of course, this depends on the C code being compiled into cDistance.so and that file being available for linking - and it keeps the .so hardcoded so a windows DLL wont work. I really did intend to clean it up more before making it open source, but since I've been meaning to start open sourcing some of our tools for years now and never really found the time, I thought it would be better to thow it out there, and postpone making it pretty instead. I hope someone can find some use in this, and I'll try to get it cleaned upp and packaged Real Soon Now.