*This is part of a series of posts I’m doing on Census Bureau Demographic Data for Developers based on my experience creating a database of Census Data as part of my summer internship at Patch.com.*

Here’s the bottom line: if you’re measuring relatively short distances, or comparing areas rather than returning an actual area measurement, use a geometry. If you’re measuring relatively long distances, especially across the International Date Line, use a geography. Read on for a discussion of why.

I think everyone has a basic grasp of the problems when representing a 3-dimensional sphere (such as the Earth) on a 2-dimensional map. Geographical features are distorted, especially as you move further from the Equator.

The same is true for data shown (or projected) in a GIS program such as Esri’s ArcGIS or QGIS. Different projections attempt to correct for this error by focusing on being correct for different areas of the globe. In the case of the Census, the standard is the North American Datum of 1983 (NAD83), which strictly speaking is a datum for unprojected latitude and longitude coordinates rather than a projection.

I don’t pretend to be an expert in map projections or the process of transforming one map projection into another, but it’s important to point out that the results differ depending on whether you represent your data in 2 dimensions as a geometry or on a 3-dimensional globe as a geography.

In 2 dimensions, calculations of area and distance are extremely straightforward. The points exist on a Cartesian graph and the calculations are simple. This means computationally these functions are cheap and can be run over a large number of features very quickly.

But with geographies, the calculations are much more complex. What was simple geometry is now more complex trigonometry as you’re calculating arc length rather than a simple line length.
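To make the difference concrete, here’s a minimal sketch in Python rather than PostGIS. It assumes a perfectly spherical Earth with a mean radius of 6371.0088 km and uses the haversine formula for the arc length. The function names are my own for illustration, not PostGIS functions, and the database will not necessarily use the same model of the Earth.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean radius; a sphere is only an approximation


def planar_distance(lon1, lat1, lon2, lat2):
    """Straight-line distance on a flat grid, in the input units (degrees here)."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine arc length along the surface of a sphere, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude, measured at the Equator and at 60 degrees north.
for lat in (0.0, 60.0):
    print(lat,
          planar_distance(0, lat, 1, lat),
          round(great_circle_km(0, lat, 1, lat), 1))
```

The planar version is a single square root and reports 1.0 degree in both cases. The spherical version needs several trigonometric calls, but it shows that the same degree of longitude covers roughly 111 km at the Equator and only about half that at 60 degrees north.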

So why use a geography at all? For measuring long distances, calculations using a geography will be much more accurate, particularly if you’re crossing the International Date Line.

For example, if you use a geometry to represent the locations of Heathrow Airport in London and SFO in San Francisco, the straight-line distance is very different from the actual route flown by airplanes. Thankfully, when airlines plan routes, they take into account the curvature of the Earth to plot the shortest path.
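One way to see how far apart the two routes are is to compare their halfway points. This is a rough sketch, assuming a spherical Earth and approximate airport coordinates that I’ve filled in myself. The helper names are hypothetical.

```python
import math

# Approximate (longitude, latitude) in decimal degrees; assumed values.
LHR = (-0.4543, 51.4700)
SFO = (-122.3790, 37.6213)


def planar_midpoint(p1, p2):
    """Halfway point of the straight line drawn on a flat lon/lat grid."""
    return ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)


def great_circle_midpoint(p1, p2):
    """Halfway point along the shortest arc on a sphere."""
    lam1, phi1 = math.radians(p1[0]), math.radians(p1[1])
    lam2, phi2 = math.radians(p2[0]), math.radians(p2[1])
    bx = math.cos(phi2) * math.cos(lam2 - lam1)
    by = math.cos(phi2) * math.sin(lam2 - lam1)
    phi_m = math.atan2(math.sin(phi1) + math.sin(phi2),
                       math.hypot(math.cos(phi1) + bx, by))
    lam_m = lam1 + math.atan2(by, math.cos(phi1) + bx)
    return (math.degrees(lam_m), math.degrees(phi_m))


print("planar midpoint:      ", planar_midpoint(LHR, SFO))
print("great-circle midpoint:", great_circle_midpoint(LHR, SFO))
```

The flat-map midpoint sits at about 44.5 degrees north, near Nova Scotia, while the great-circle midpoint is above 63 degrees north in the far north of Canada. That’s why transatlantic flights to the West Coast appear to bend toward the Arctic on a flat map.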

Something similar happens if you plot the distance between San Francisco and Beijing. Remember that a 2-dimensional projection has no concept of how the Earth curves, so it will plot the distance from SFO across the US, then over Europe and Asia, exactly the opposite of the way you’d want to calculate it in real life. That’s because the Date Line (or at least the 180-degree line of longitude it notionally traces) represents the east and west edges of the projection.
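The edge problem shows up in the longitudes alone. Here’s a small sketch, again with approximate airport longitudes that are my own assumptions, comparing the span a flat calculation sees with the span you get once longitude is allowed to wrap at 180 degrees.

```python
SFO_LON = -122.3790  # assumed approximate longitude of SFO
PEK_LON = 116.6031   # assumed approximate longitude of Beijing Capital airport


def planar_lon_span(lon1, lon2):
    """Longitude difference on a flat map whose edges sit at -180 and +180."""
    return abs(lon2 - lon1)


def wrapped_lon_span(lon1, lon2):
    """Longitude difference when the map is allowed to wrap around the globe."""
    span = abs(lon2 - lon1) % 360
    return min(span, 360 - span)


print(planar_lon_span(SFO_LON, PEK_LON))   # the long way round, via Europe
print(wrapped_lon_span(SFO_LON, PEK_LON))  # the short way, across the Pacific
```

The flat calculation covers about 239 degrees of longitude, while the wrapped one covers about 121. A geography handles that wrap for you; a geometry does not know the two edges of the map touch.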

This is also important when you’re calculating areas. If you’re calculating the area of a geometry, you will get the result in the units of measure for that projection. In the case of census data, that will generally be decimal degrees, which aren’t really useful. If you cast the geometry into a geography (`::geography`), you’ll get the result in square meters, a far more understandable measure.
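To show why an area in degrees is hard to interpret, here’s a sketch that measures the same 1-degree by 1-degree box both ways. It assumes a spherical Earth with a mean radius of 6,371,008.8 m, which is a simplification, so don’t expect the numbers to match a database result exactly. The function names are made up for this example.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean radius; a sphere is only an approximation


def area_square_degrees(west, south, east, north):
    """Planar area of a lon/lat box in the coordinate units (square degrees)."""
    return (east - west) * (north - south)


def area_square_meters(west, south, east, north):
    """Area of the same box on the surface of a sphere, in square meters."""
    dlam = math.radians(east - west)
    band = math.sin(math.radians(north)) - math.sin(math.radians(south))
    return EARTH_RADIUS_M ** 2 * dlam * band


# The same one-degree box at the Equator and starting at 60 degrees north.
for south in (0.0, 60.0):
    box = (0.0, south, 1.0, south + 1.0)
    print(south, area_square_degrees(*box), round(area_square_meters(*box)))
```

Both boxes are exactly 1 square degree, but on the sphere the northern one covers a bit less than half the ground of the equatorial one. In PostGIS, casting to a geography before calling `ST_Area` is what gets you the result in square meters.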

For comparing areas of objects in the same projection, I prefer to do the comparison in the standard units for that projection (called SRID Units) and only do the conversion when I want to return an intelligible area. This preserves precision in the numbers I’m using.
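As an illustration of comparing in SRID units, here’s a sketch that works out what share of a tract falls inside some other zone, using the planar shoelace formula on coordinates in decimal degrees. The two rectangles are hypothetical, and the approach assumes the shapes are small and close together, so the distortion affects both about equally.

```python
def shoelace_area(ring):
    """Planar polygon area in the ring's own coordinate units (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical shapes in decimal degrees: a tract and the part of it inside a zone.
tract = [(-74.00, 40.70), (-73.98, 40.70), (-73.98, 40.72), (-74.00, 40.72)]
overlap = [(-74.00, 40.70), (-73.99, 40.70), (-73.99, 40.72), (-74.00, 40.72)]

# The ratio is unit-free, so there is no need to convert to square meters first.
share = shoelace_area(overlap) / shoelace_area(tract)
print(round(share, 3))
```

Because the result is a ratio, the awkward square-degree units cancel out. You only need the conversion to a geography when a person has to read the area itself.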

To repeat the statement I made at the beginning: if you’re measuring relatively short distances, or comparing areas rather than returning an actual area measurement, use a geometry. If you’re measuring relatively long distances, especially across the International Date Line, use a geography. If you need any more clarity on this, please check out the great explanation from OpenGeo.org or this great GIS Stack Exchange post.

So… would Google Maps use a geometric or a geographic algorithm?

It’s probably optimized for the scale of the operation, though they could probably eat the cost of complex calculations on a small scale without too much trouble. They also have to do some computation along streets and other access ways between points. I can ask the next time I see someone from Google Maps (which actually shouldn’t be that long, one of the perks of living in New York).