Notes on the TIGER/Line update
Jul. 26th, 2009 12:58 pm
The Census Bureau has been more or less threatening for a while to replace their custom text format for TIGER/Line data with "standard" shapefiles, and appears to have finally done it; the 2008 shapefiles appear to be the second release in the new format. This means that I can take the TIGER data and pretty straightforwardly plug it into things like mapserver and my Haskell libgdal bridge, which in turn means that I can get broader coverage than I can readily get from sources like MassGIS.
The downside isn't hugely surprising: the data is broken up into lots of little pieces. TIGER has always been broken up by county (or other statistically equivalent area), so I can download "all line data for Middlesex County, MA". But I frequently leave the county when biking, which is currently my main use for this data, so I need to figure out which counties I've been in and get the same data for each one. "Just grab all of Massachusetts" would seem plausible except that the URLs aren't hugely amenable to scripting. And then I probably want other layers, like water, so I need to grab *those*. Similarly, I'll now have a layer per type of thing per county, so I probably need to merge those, which I think ogr2ogr can do.
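A rough sketch of what that merge step might look like, in Python rather than anything I've actually run against the real data. It only builds the ogr2ogr command lines: the first call creates the merged shapefile from one county, and the later calls reopen it with `-update -append` to add the others. The function names and the file names in the comments are made up for illustration; the TIGER file naming in particular is an assumption.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def merge_commands(sources: Sequence[str], dest: str) -> List[List[str]]:
    """Build ogr2ogr invocations that merge several shapefiles into one.

    Nothing is executed here; the caller decides whether to run them.
    """
    if not sources:
        return []
    # Use the destination's base name as the layer name so every
    # append lands in the same layer.
    layer = Path(dest).stem
    first, rest = sources[0], sources[1:]
    # The first call creates the destination from the first source.
    commands = [["ogr2ogr", "-f", "ESRI Shapefile", "-nln", layer, dest, first]]
    # Each later call opens the destination for update and appends.
    for src in rest:
        commands.append(["ogr2ogr", "-update", "-append", "-nln", layer, dest, src])
    return commands


def run_merge(sources: Sequence[str], dest: str) -> None:
    """Run the merge; assumes ogr2ogr is on the PATH."""
    for cmd in merge_commands(sources, dest):
        subprocess.run(cmd, check=True)


# Hypothetical usage, with assumed per-county file names:
#   run_merge(["tl_2008_25017_edges.shp", "tl_2008_25025_edges.shp"],
#             "ma_edges.shp")
```

This assumes every county file for a given layer has the same attribute columns, which seems likely for files from one release but which I haven't checked. The same loop would have to be repeated per layer type (edges, water, and so on).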
The other half of the job is just rejiggering everything. If I have mapserver input files that reference the MassGIS data, I need to change file names and feature codes to reference TIGER instead. A toy I've been playing with on and off for a while generates cue sheets from a list of streets and MassGIS data, so I'll need to update it to understand TIGER's record layout instead. And so on.
I'll probably eventually do this -- an end goal is to produce an RSS feed for bike trips with little overview maps, and being limited to Massachusetts isn't good enough -- but maybe it'll be during a future vaguely productive weekend.