County Polygons

Redfist
Global Moderator
Posts: 59
Joined: January 22nd, 2012, 3:38 pm

County Polygons

Post by Redfist »

So... I started looking into adding support for Counties as regions and have run into some data accuracy issues. I'm 99% sure I know how we should approach this but I wanted to gather some opinions.

First some background. Counties have shapes (duh). Sometimes those shapes are very irregular/complex. I downloaded the coordinates from the National Atlas of the US (most official source I know of) and got some pretty detailed polygons! One county in Arizona, for example, is defined by a perimeter of ~1300 coordinates. That's not bad! But guess what, it's not perfect...

(SIDE NOTE: I saw the coordinate set that some people use in GSAK for Utah counties (for example) and they only have ~15 points. That now seems horribly inaccurate in comparison.)

I've found instances where the state assigned by Groundspeak is actually wrong (by a few feet). There is an Arizona cache which is actually in New Mexico. When I check against my AZ polygons, that cache doesn't get assigned a county.

Another problem is that even these highly complex polygons aren't entirely accurate. Some county perimeters are actually way more complex than that (think winding rivers). However, the coordinate set I have is the most accurate I know of. I'll add 2 pictures to illustrate this point.
Attachment: GoogleEarthWithNationalAtlasPolygons.jpg
This is a particularly bad area, rendering the coordinates on the edge of AZ and CA.
Attachment: GoogleMaps.jpg
Here is the same area as viewed on Google Maps. The state boundary there may be computer generated, it may have been human generated. I don't know.

My assumption is that no matter how hard we try to find a more accurate coordinate polygon definition per state, we'll still eventually have some areas of inaccuracy.

So... that now comes down to the question "What do we do about it?"

There are a few options that I see but would love input.
1. Ignore it. Out of the small sample I compared against (13922 caches) only ~8 failed to get a county assignment (because the error pushed it outside of AZ). However, there may well have been other misassigned counties within the boundaries of the state. Being generous, let's estimate that at 100-200 caches out of 13922. That's an error rate of ~1%. That might be "good enough".
2. Find some other super accurate coordinate set. As far as I know, the best source is what I'm already using. It is the National Atlas after all.
3. Have a process where we can "appeal" county designations. This should be ***INFREQUENT*** as it would place a burden on Corfman Clan and me to police. We could potentially add a mechanism to grant privileges to others to help w/ that burden but I definitely would NOT want the general public to be able to edit that (since it would affect the integrity of leaderboards).
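
For reference, the numbers in option 1 work out as below. This is only the arithmetic from the figures quoted above; the 100-200 range is a guess, not a measurement, and the function name is just for illustration.

```python
# Figures from the post: 13922 caches sampled, ~8 left without a county,
# and a guessed 100-200 misassigned inside the state boundary.
SAMPLE_SIZE = 13922


def error_rate_percent(bad_caches: int, total: int = SAMPLE_SIZE) -> float:
    """Return the share of bad assignments as a percentage of the sample."""
    return 100.0 * bad_caches / total


for label, count in [("unassigned", 8), ("guess, low", 100), ("guess, high", 200)]:
    print(f"{label:12s} {count:4d} caches -> {error_rate_percent(count):.2f}%")
```

So the edge failures alone are well under a tenth of a percent, and the guessed range brackets the ~1% figure.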

WRT #3, it's easy for Corfman Clan and me to notice if something failed around the edge of the state. It's WAY less obvious if something was inaccurate within the state.

Thoughts?
Redfist
Global Moderator
Posts: 59
Joined: January 22nd, 2012, 3:38 pm

Re: County Polygons

Post by Redfist »

Another minor point to keep in mind - even if that Arizona/California border was computer generated (i.e., from a coordinate set somewhere), that may only be at the state boundary level. That ideal data set (if it exists) may not have county boundaries.
bikephotog
Benefactor
Posts: 10
Joined: January 18th, 2012, 3:24 pm

Re: County Polygons

Post by bikephotog »

FYI -

http://gis.utah.gov/sgid-vector-downloa ... c=Counties

Most states should have something similar to the Utah AGRC. If you can work with shapefiles, this should work for you. I did not look to see how accurate these are.

BP
Redfist
Global Moderator
Posts: 59
Joined: January 22nd, 2012, 3:38 pm

Re: County Polygons

Post by Redfist »

Thanks - I'll look.

What I got from the National Atlas was shapefiles and I figured out how to grab the coordinate polygons from those. One thing I have to keep in mind is that the states have to fit together without gaps. I.e., if I have perfect data for state X but slightly off data for adjacent state Y, there will likely be either overlaps or gaps along the border.
bikephotog
Benefactor
Posts: 10
Joined: January 18th, 2012, 3:24 pm

Re: County Polygons

Post by bikephotog »

If you can find someone on LCP with ESRI Arc software experience, it would be quick and easy for them to align the shapefiles of adjacent areas. There is a process called snapping that aligns the boundaries of adjacent shapefiles perfectly, with no gaps or overlaps. You can define which shapefile snaps and which is the base shape. I could help; however, I am on sabbatical from the university where I teach and will not have access to that kind of software again until 2013. BP
Anywhere, anytime
Corfman Clan
Global Moderator
Posts: 914
Joined: January 17th, 2012, 12:21 am

Re: County Polygons

Post by Corfman Clan »

Redfist wrote:I've found instances where the state assigned by groundspeak is actually wrong (by a few feet). There is an Arizona cache which is actually in New Mexico. When I check against my AZ polygons, that cache doesn't get assigned a county.
In actuality, the cache owner assigns the state when he/she fills out the hide-a-cache form. If you wanted to, you could change the state of all your caches to any state other than the one they are actually in. As far as I know, Geocaching.com makes no attempt to assign the correct state/country to a cache. Perhaps we want to add a state/country override capability so we can fix any caches with mis-assigned states/countries.
Redfist wrote:Another problem is that even these highly complex polygons aren't entirely accurate. Some county perimeters are actually way more complex than that (think winding rivers). However, the coordinate set I have is the most accurate I know of.
For complicated boundaries along state borders, you might think of uncomplicating the border by extending it into the next state, and then being cognizant of the state the county and cache are in when deciding what county to assign.
Redfist wrote:My assumption is that no matter how much we try to find a more accurate coordinate polygon definition per state, we'll still eventually have some areas of inaccuracy.

So... that now comes down to the question "What do we do about it?"

There are a few options that I see but would love input.
1. Ignore it. Out of the small sample I compared against (13922 caches) only ~8 failed to get a county assignment (because the error pushed it outside of AZ). However, there may well have been other misassigned counties within the boundaries of the state. Being generous, let's estimate that at 100-200 caches out of 13922. That's an error rate of ~1%. That might be "good enough".
2. Find some other super accurate coordinate set. As far as I know, the best source is what I'm already using. It is the National Atlas after all.
3. Have a process where we can "appeal" county designations. This should be ***INFREQUENT*** as it would place a burden on Corfman Clan and me to police. We could potentially add a mechanism to grant privileges to others to help w/ that burden but I definitely would NOT want the general public to be able to edit that (since it would affect the integrity of leaderboards).

WRT #3, it's easy for Corfman Clan and me to notice if something failed around the edge of the state. It's WAY less obvious if something was inaccurate within the state.

Thoughts?
Is it worth using complicated polygons instead of the simpler ones that, for example, GSAK uses? How often would a cache be assigned to the wrong county because of that? How much more work will it take to decide whether a cache is in this county or that if the county polygon has 1300 points as opposed to 15? I'd at least consider using some simplified polygons when able to speed things up.

I think we will need some form of number 3.

Also, keep in mind that different states have counties with the same name. For example, there is a Lincoln County in CO, NV, & NM and a San Juan County in CO, NM, & UT.
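
A minimal sketch of what that implies for the lookup, assuming counties are keyed by (state, name) and that only the counties of the cache's own state are tested. The rectangles and their coordinates are made-up placeholders for the real polygons, and `find_county` is a hypothetical helper, not existing code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical placeholder shapes: (min_lon, min_lat, max_lon, max_lat) boxes
# stand in for the real county polygons. The coordinates are made up.
Box = Tuple[float, float, float, float]
CountyKey = Tuple[str, str]  # (state, county name)

COUNTIES: Dict[CountyKey, Box] = {
    ("CO", "Lincoln"): (-104.0, 38.5, -103.0, 39.5),
    ("NV", "Lincoln"): (-115.9, 36.8, -114.0, 38.7),
    ("NM", "Lincoln"): (-106.0, 33.0, -104.9, 34.3),
}


def in_box(lon: float, lat: float, box: Box) -> bool:
    """Placeholder containment test; a real one would test the polygon."""
    min_lon, min_lat, max_lon, max_lat = box
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def find_county(state: str, lon: float, lat: float) -> Optional[CountyKey]:
    """Test only counties in the cache's own state, so a shape that spills
    over the state line can never claim a cache from the neighboring state."""
    for key, box in COUNTIES.items():
        if key[0] != state:
            continue
        if in_box(lon, lat, box):
            return key
    return None


print(find_county("NV", -115.0, 37.5))  # matches the NV entry only
print(find_county("CO", -115.0, 37.5))  # same point, wrong state: no match
```

Keying on the pair also keeps the three Lincoln counties apart on any leaderboard.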
Redfist
Global Moderator
Posts: 59
Joined: January 22nd, 2012, 3:38 pm

Re: County Polygons

Post by Redfist »

I believe the more complex polygons, being more precise (yet still not perfect), will reduce the number of misassigned caches. Once the polygons are created, it's no longer a matter of code correctness, although it may become an issue of performance (TBD). Point.STIntersects(poly) does all the work regardless of the # of points in the poly.
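
To give a feel for the performance side, here is a rough sketch of a point-in-polygon test. It is NOT what STIntersects does internally; it is a plain planar ray-casting check that treats lon/lat as x/y, with a bounding-box shortcut in front. The cost of the full test grows linearly with the number of points, so 1300 points is roughly 90x the work of 15 per polygon tested, but the box check rejects most polygons before that work starts. All names here are made up for the example.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (lon, lat), treated as planar x/y
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bounding_box(polygon: List[Point]) -> Box:
    """Smallest axis-aligned rectangle around the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contains(point: Point, polygon: List[Point], box: Optional[Box] = None) -> bool:
    """Cheap box rejection first, full O(n) test only when the box matches."""
    min_x, min_y, max_x, max_y = box if box is not None else bounding_box(polygon)
    x, y = point
    if not (min_x <= x <= max_x and min_y <= y <= max_y):
        return False
    return point_in_polygon(point, polygon)


# An L-shaped test polygon: the notch is inside the box but outside the shape.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
print(contains((1, 3), l_shape), contains((3, 3), l_shape), contains((5, 1), l_shape))
```

In practice the box for each county would be computed once and stored next to the polygon rather than recomputed per lookup.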

What I had in mind to deal w/ inaccuracies is some form of #3. Perhaps a mechanism where someone can report a cache that should be changed along w/ a recommendation of what to change. We can then approve/deny the change, which would be just updating 1 row/column in a table. I'd likely want to have a separate table that explicitly lists the overrides so that if we ever mass regenerate data, we don't lose the overrides. We could just reapply the override table over top of the live Geocache table (to avoid having to involve the override table when querying the Geocache table).

ReportPolyProblem: Inserts a row into Overrides table (status = pending) and sends a notification. When we approve, change status=applied and change the Geocache row.
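
A rough sketch of that flow, using an in-memory SQLite database purely for illustration. The column names, the `approve` and `reapply_overrides` helpers, and the sample cache code are all assumptions, not the real schema, and the notification step is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Geocache (code TEXT PRIMARY KEY, state TEXT, county TEXT);
CREATE TABLE Overrides (
    id     INTEGER PRIMARY KEY,
    code   TEXT NOT NULL,
    county TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
);
""")


def report_poly_problem(conn: sqlite3.Connection, code: str, county: str) -> int:
    """Record a requested change as pending (the notification is omitted here)."""
    cur = conn.execute(
        "INSERT INTO Overrides (code, county) VALUES (?, ?)", (code, county)
    )
    conn.commit()
    return cur.lastrowid


def approve(conn: sqlite3.Connection, override_id: int) -> None:
    """Mark the override as applied and update the live Geocache row."""
    row = conn.execute(
        "SELECT code, county FROM Overrides WHERE id = ?", (override_id,)
    ).fetchone()
    if row is None:
        raise ValueError(f"no override with id {override_id}")
    conn.execute("UPDATE Geocache SET county = ? WHERE code = ?", (row[1], row[0]))
    conn.execute("UPDATE Overrides SET status = 'applied' WHERE id = ?", (override_id,))
    conn.commit()


def reapply_overrides(conn: sqlite3.Connection) -> None:
    """After a mass regeneration, copy every applied override back over Geocache."""
    conn.execute("""
        UPDATE Geocache
        SET county = (
            SELECT o.county FROM Overrides o
            WHERE o.code = Geocache.code AND o.status = 'applied'
            ORDER BY o.id DESC LIMIT 1
        )
        WHERE code IN (SELECT code FROM Overrides WHERE status = 'applied')
    """)
    conn.commit()


# Hypothetical cache that the polygons failed to place in any county.
conn.execute("INSERT INTO Geocache VALUES ('GCTEST', 'AZ', NULL)")
override_id = report_poly_problem(conn, "GCTEST", "Apache")
approve(conn, override_id)
```

Since queries only ever read the Geocache table, the Overrides table is touched just on report, approve, and regenerate.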
Ranger Alpha
Posts: 6
Joined: March 31st, 2012, 10:47 pm

Re: County Polygons

Post by Ranger Alpha »

Team Opjim
Posts: 71
Joined: January 18th, 2012, 7:41 pm

Re: County Polygons

Post by Team Opjim »

I suggest the following:
1. Do it as accurately as possible.
2. Create an option where the CO can define which region/county the cache falls into. Obviously, the CO is the one best able to determine where it should fall.
3. Have a limited number of users with the ability to assist you with changes or appeals, maybe 1-2/region. This will result in Redfist and Corfman Clan only needing to wade in on truly unusual circumstances.