MAF/TIGER Accuracy Improvement Project
Jul. 4th, 2005 09:49 am
6.170 introduced me to the Census Bureau's TIGER/Line data set, and I've been experimenting with it on and off for a couple of years now. The bike trip mapping plot has revealed several gaps in the data that I probably wouldn't have found if I weren't trying to find routes with it. For example, Farm Street in Dover is broken into two segments, joined by TLID 87283093, a very short segment labelled "Census 2000 collection block boundary not represented by existing physical feature" that happens to connect Farm Street to Farm Street. Eliot Street in Natick and Washington Street in Wellesley don't line up at the city line in spite of being the same road. That sort of thing.
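For the curious, here's roughly how one might hunt for the Farm Street kind of gap automatically. This is only a sketch under heavy assumptions: the tuple layout, the node labels, and the function name `find_bridging_segments` are all made up for illustration, and real TIGER/Line records carry coordinates and many more fields than this. The idea is just to flag unnamed segments whose two endpoints both touch a segment with the same street name.

```python
from collections import defaultdict

# Hypothetical, simplified records: (tlid, name, start_node, end_node).
# The TLIDs and node labels here are invented, not real TIGER/Line data.
SEGMENTS = [
    (1, "Farm St", "a", "b"),
    (2, None, "b", "c"),      # unnamed connector, like the boundary segment
    (3, "Farm St", "c", "d"),
    (4, "Main St", "d", "e"),
]


def find_bridging_segments(segments):
    """Return TLIDs of unnamed segments that join two same-named segments."""
    # Collect the street names that touch each node.
    names_at_node = defaultdict(set)
    for _tlid, name, start, end in segments:
        if name is not None:
            names_at_node[start].add(name)
            names_at_node[end].add(name)

    # An unnamed segment is a "bridge" if some street name appears
    # at both of its endpoints.
    bridges = []
    for tlid, name, start, end in segments:
        if name is None and names_at_node[start] & names_at_node[end]:
            bridges.append(tlid)
    return bridges


if __name__ == "__main__":
    print(find_bridging_segments(SEGMENTS))
```

A router could then treat any flagged segment as part of the street it bridges. The Natick/Wellesley kind of mismatch is a different problem, since the endpoints don't coincide at all, and this sketch wouldn't catch it.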
In poking around at newer TIGER data, I discovered that there's a $200 million federal project to fix these sorts of inaccuracies. Which is great for people like me who use this data this way. But I'm sure the same data is available from commercial sources; it's probably not cheap, but, $200 million? Is TIGER really anything more than a data set used internally by the Census Bureau and by a small number of dedicated amateurs?
...this document discusses the scope of the project a little more. A large part of the project sounds like "redesign our internal database, it's 15 years old" more than "update the data", and also "make it possible for Census field agents to do their jobs and update the database without paper maps". And there's a requirement to support every type of address in the United States, not just the 90% or 99% case. Actually, this is a kind of interesting read if you're curious how the data got put together originally and why it has the problems it has. So the money is mostly sustaining this goofy constitutional requirement that we go around and count people every ten years; it feels a little more sensible now.