Forum OpenACS Q&A: Zip Code Discrepancies

Collapse
Posted by Gilbert Wong on
I've been working on a site which is using a zip code distance search and I think I found some errors with the current zip code data in ref-us-zipcodes.  The latitudes/longitudes of several zip codes which I checked by hand are significantly off.  The zip codes with the largest errors were in California.  I only checked a few zip codes in CA, but the fact that I'm seeing bad data is disconcerting.  For instance, if you use the lat/long values for 94304 (Palo Alto, CA) and you look it up on mapquest.com, it brings up Morgan Hill, CA, which is about 30 miles south.  😊

After doing some research on the US Census Bureau's website (where the zip codes came from), I found the most recent zip code data from the 2000 census.  The latitude/longitude values for those zip codes which had errors in the old data set matched to within a within a few miles.  The only problem is that the data is not in the same format as the 1990 or the 1999 data.

I'm going to update my database with the "correct" data.  If people are interested, I can dump the us_zipcodes table and post/upload it somewhere.

Gilbert

Collapse
2: Re: Zip Code Discrepancies (response to 1)
Posted by Jon Griffin on
Gilbert,
I maintain ref-* so please email me in private and I will update the ref-zipcodes, as there is a way to deprecate reference data in acs-reference.
I think I was using 1999 data, but it has been a while.
Collapse
3: Re: Zip Code Discrepancies (response to 2)
Posted by tammy m on
Hi

Has ref-zipcodes been updated and if so where can I get a most recent copy?

Also, any tips on how to implement a zip code proximity search for my website? Has anyone already done this with OACS that I could make use of?!

thanks:)

Collapse
4: Re: Zip Code Discrepancies (response to 1)
Posted by Jon Griffin on
Yes, head.
Collapse
5: Re: Zip Code Discrepancies (response to 3)
Posted by Gilbert Wong on

Here's the plpgsql function we use. It takes two zipcodes and calculates the distance between them. You can find the formulas on the internet.

create function zip_distance (char,char)
returns float8 as '
declare 
    p_zip1              alias for $1;
    p_zip2              alias for $2;
    v_lat               float8;
    v_lon               float8;
    v_lat_ref           float8;
    v_lon_ref           float8;
    v_distance          float8;
    lat_rad             float8;
    lon_rad             float8;
    lat_ref_rad         float8;
    lon_ref_rad         float8;
    dlon                float8;
    dlat                float8;
    a                   float8;
    c                   float8;
    d                   float8;
begin
    select latitude::float8, longitude::float8
        into v_lat, v_lon from us_zipcodes where zipcode = p_zip1;
    select latitude::float8, longitude::float8
        into v_lat_ref, v_lon_ref from us_zipcodes where zipcode = p_zip2;
    lat_rad := v_lat * 0.017453293;
    lon_rad := v_lon * 0.017453293;
    lat_ref_rad := v_lat_ref * 0.017453293;
    lon_ref_rad := v_lon_ref * 0.017453293;
    dlon := lon_rad - lon_ref_rad;
    dlat := lat_rad - lat_ref_rad;
    a := pow(sin(dlat/2),2) + (cos(lat_rad)*cos(lat_ref_rad)*(pow(sin(dlon/2),2)));
    c := 2*atan2(sqrt(a),sqrt(1-a));
    d := 3963 * c;
    return trunc(d);

end;' language 'plpgsql';

In your query, do something like this:

...
where zip_distance(:ref_zipcode,zipcode) <= $search_distance
...
Gilbert