Forum OpenACS Q&A: PortalHeapMemoryFree error

Collapse
Posted by Brian Mueller on
When doing a large query from /intranet/employees/admin/index.tcl, I get the following error:
NOTICE:  PortalHeapMemoryFree: 0x0x41a37610 not in alloc set!
NOTICE:  PortalHeapMemoryFree: 0x0x41a37cc8 not in alloc set!
NOTICE:  PortalHeapMemoryFree: 0x0x41a37ca0 not in alloc set!
ERROR:  btree scan list trashed; can't find 0x0x41a37bf0
The same thing happens running the query in psql. The PostgreSQL.org doxlist faq says...
5.2) What does the message "NOTICE:PortalHeapMemoryFree: 
0x402251d0 not in alloc set!" mean?

You are pfree'ing something that was not palloc'ed. 
Beware of mixing malloc/free and palloc/pfree. 
Where should I be looking to solve this? The postmaster process looks like so
  PID TTY      STAT   TIME COMMAND
  476 ?        S      0:04 /usr/local/pgsql/bin/postmaster -B 1000 -
S -o -S 8000 -D/usr/local/pgsql/data
Postgres is still accessible from other pages on this and other OpenACS sites running on the same machine. Thanks for your help.
The troublesome query (either half of the union can be run successfully):
select users.user_id , coalesce(info.salary, 0) as salary, 
users.last_name || ', ' || users.first_names as name,
                info.supervisor_id, info.years_experience as 
n_years_experience, info.salary_period, info.referred_by,
                to_char(info.start_date,'Mon DD, YYYY') as 
start_date_pretty,
                (case when info.project_lead_p = 't' 
then 'Yes'::varchar else 'No'::varchar end) as project_lead,
                (case when info.team_leader_p = 't' 
then 'Yes'::varchar else 'No'::varchar end) as team_lead,
                (case when supervisor_id is NULL 
then 'missing'
                 else (select s.first_names || ' ' || s.last_name 
from users s where s.user_id=supervisor_id) end) as supervisor_name,
                (case when info.referred_by is NULL 
then 'nobody'
                 else (select r.first_names || ' ' || r.last_name 
from users r where r.user_id=referred_by) end) as referral_name
           from users_active users, im_employee_info info, 
user_group_map ugm
          where users.user_id = ugm.user_id
            and ugm.group_id = 8
            and users.user_id = info.user_id
union
select users.user_id , 0 as salary, users.last_name || ', ' || 
users.first_names as name,
                '' as supervisor_id, '' as n_years_experience, '' as 
salary_period, '' as referred_by,
                '' as start_date_pretty,
                'No'::varchar as project_lead,
                'No'::varchar as team_lead,
                'missing' as supervisor_name,
                'nobody' as referral_name
           from users_active users, user_group_map ugm
          where users.user_id = ugm.user_id
            and ugm.group_id = 8
            and not exists (select 1 from im_employee_info
                            where user_id = users.user_id)
       order by upper(users.last_name),upper(users.first_names)

Collapse
Posted by Don Baccus on
Ouch ... what version are you running, PG 7.0.2?

The best thing to do is to submit this problem to the postgres bugs mailing list.  You'll have to register (it's a majordomo list).  The core developers monitor this list and are responsive, even when people do things like try running PG 6.5's pg_dump on a PG 7.0.2 database :)

(you know who you are, and I'm still giggling over it!)

You can find the PG bug mailing list from the "info central" part of www.postgresql.org.

Collapse
Posted by Jade Rubick on
I'm having the same problem. Here's what free says:
[jade@intranet /tmp]$ free
             total       used       free     shared    buffers     cached
Mem:        127848     121164       6684      49288       3448      26440
-/+ buffers/cache:      91276      36572
Swap:       409616      14900     394716
I'll follow your advice, Don, but it looks like this problem is fairly pervasive if it's not just me... Brian, how much memory and swap do you have? The intranet should be working, right?
Collapse
Posted by Jade Rubick on
The bug is the oldest bug in 7.02: http://www.postgresql.org/bugs/bugs.php?4~2.
Collapse
Posted by Brian Mueller on
I posted this item to the mailto:pgsql-bugs@postgresql.org mailing list, and Tom Lane sent the following response (I haven't implemented it yet):

"[I]t looks like this problem has already been fixed in current CVS sources --- it's a foulup in subplan handling for APPEND plans, which basically means that you need sub-SELECTS in a UNION to cause it.

Easiest patch for 7.0.* is to grab src/backend/optimizer/plan/setrefs.c
out of the CVS server or a recent nightly snapshot tarball and drop it
into 7.0.* sources as-is."

Collapse
Posted by Jade Rubick on
It works now 😊
Collapse
Posted by Jade Rubick on
Hmm, after reintalling OpenACS, I'm getting the same problems again. I dropped the user and reinstalled OpenACS cleanly, so I don't know why this would happen...

This is with a recompiled version WITH the fixed file.

Collapse
Posted by Ken Chakiris on
I followed the suggestion in Brian's comment and it worked!

I downloaded setrefs.c v1.63, copied to the postsql src directory in the src/backend/optimizer/plan subdirectory.  I recompilied postgres and installed it.  I then killed postmaster and shut down the nsd server for OpenACS.  I restarted postmaster and the nsd process.  I retried the intranet and did't get the previous errors or any additional new ones (as far as I could tell).

Note I didn't use the most recent setrefs.c as it had a comment that one needed to do an initdb. I used the next to most recent. I didn't reload the data model at all. Everything seemed to work fine with my previous data.