Forum OpenACS Q&A: learn to debug an error

Collapse
Posted by Yves Goldberg on
Hello,

I would like to learn how to debug an error. anyone can point me to the right direction?

I have 2 installations of openacs. one 5.1 and one HEAD.

I think I have the same error in them. Below is the error message I get in the HEAD installation.

I would like to LEARN TO FIND a sollution to this error MYSELF.

what should I do?

I tried to grep:
"where last_queried < current_timestamp - 1" within categories package and found it in the postgres procs.

maybe the problem is the syntax used?

=====================================
06/Jul/2004:00:16:01
    Error: Ns_PgExec: result status: 7 message: ERROR:  Unable to identify an operator '-' for types 'timestamptz' and 'integer'
    You will have to retype this query using an explicit cast

06/Jul/2004:00:16:01
    Error: Database operation "dml" failed

ERROR:  Unable to identify an operator '-' for types 'timestamptz' and 'integer'
    You will have to retype this query using an explicit cast

SQL:
        delete from category_search
        where last_queried < current_timestamp - 1

Database operation "dml" failed

ERROR:  Unable to identify an operator '-' for types 'timestamptz' and 'integer'
    You will have to retype this query using an explicit cast

SQL:
        delete from category_search
        where last_queried < current_timestamp - 1

    while executing
"ns_pg_bind dml nsdb0 {
        delete from category_search
        where last_queried < current_timestamp - 1
      }"
    ("uplevel" body line 1)
    invoked from within
"uplevel $ulevel [list ns_pg_bind $type $db $sql"
    ("postgresql" arm line 2)
    invoked from within
"switch $driverkey {
                oracle {
                    return [uplevel $ulevel [list ns_ora $type $db $sql] $args]
                }
      ..."
    invoked from within
"db_exec dml $db $full_statement_name $sql"
    ("uplevel" body line 2)
    invoked from within
"uplevel 1 $code_block "
    invoked from within
"db_with_handle -dbn $dbn db {
            db_exec dml $db $full_statement_name $sql
        }"
    (procedure "db_dml" line 100)
    invoked from within
"db_dml delete_old_searches """
    (procedure "category_synonym::search_sweeper" line 2)
    invoked from within
"category_synonym::search_sweeper"
    ("eval" body line 1)
    invoked from within
"eval [concat [list $proc] $args]"
    (procedure "ad_run_scheduled_proc" line 42)
    invoked from within
"ad_run_scheduled_proc {t f {0 16} category_synonym::search_sweeper {} 1089038332 0 f}"
================================================

Collapse
Posted by Jeff Davis on
Well, first find the function category_synonym::search_sweeper and the corresponding xql file for the tcl file and the query whose name is category_synonym::search_sweeper.delete_old_searches (the function with the query dot the query name). You get that information from the traceback.

The thing thats breaking is that postgres does not know how to subtract 1 from a timestamptz; you can tell which types it knows how to do "-" to via

psql yourserver
\do -
which should list all the "-" operator types.

Looking at that you should see:

   Schema   | Name |        Left arg type        |       Right arg type        |         Result type         |             Description             
------------+------+-----------------------------+-----------------------------+-----------------------------+-------------------------------------
...
 pg_catalog | -    | time with time zone         | interval                    | time with time zone         | minus
which tells you what you need rather than an integer is an interval. So change the 1 to an interval of 1 day in the xql file, and reload changed files for that package at /acs-admin/apm/ and see if that fixes it.

Collapse
Posted by Jeff Davis on
Oh, and to find the function you could either go into /api-doc/ and search for it which should tell you which file it comes from or do something like
grep -r category_synonym::search_sweeper *
in the packages directory.

I guess the real trick is knowing how to read the traceback; in this case you look for the first function down the list which is not in the db API (the db API might have a few bugs but the vast majority of problems like this are in the query itself not in the db API so that's the place to start).

Collapse
Posted by Yves Goldberg on
ok Jeff,

I've edited the file category-synonyms-procs-postgresql.xql and changed in category_synonym::search_sweeper.delete_old_searches

from:

"where last_queried < current_timestamp - 1"

to:

"where last_queried < current_timestamp - '1 day'::INTERVAL"

I've restarted the server and find in the log what's below.

Does that mean I've corrected the former problem and this is a new one to debug or I made a mystake with the 'INTERVAL' syntax?

Thanks a lot for your help Jeff.

Yves.

==========================================
Errors since Tue Jul  6 01:38:52 2004

06/Jul/2004:05:23:03
    Error: Ns_PgExec: result status: 7 message: ERROR:  Function trunc(interval) does not exist
    Unable to identify a function that satisfies the given argument types
    You may need to add explicit typecasts

06/Jul/2004:05:23:04
    Error: Aborting transaction due to error:
Database operation "0or1row" failed (exception NSDB, "Query was not a statement returning rows.")

ERROR:  Function trunc(interval) does not exist
    Unable to identify a function that satisfies the given argument types
    You may need to add explicit typecasts

SQL:
        select trunc(current_timestamp - max(coalesce(modified_date, insertion_date))) from robots

=============================================

Collapse
Posted by Jeff Davis on
Yeah, that is a new (but somewhat related) bug; both of these are related to changes in postgres from 7.2 to 7.3 for timestamp handling and this package was never updated.

In this case, trunc on an interval used to return the number of days but that was removed in 7.3, now it should use date_part('days',...).

the other thing to do after you fix these is to submit a patch in bug tracker so everyone else can benefit from your bug fixing :)

There are some instructions for that here It's good if they are attached to a bug report but even if not submitting patches is the best way (short of committing directly) to make sure the fixes find their way into the release.

Collapse
Posted by Yves Goldberg on
1/ I've posted a bug report and a resolve to it Jeff. Can you tell me if this is right? I don't know how to use cvs and specifficaly diff to create a patch though:

https://openacs.org/bugtracker/openacs/bug?filter%2estatus=resolved&bug%5fnumber=1958

2/ in packages/robot-detection/tcl/robot-detection-procs-postgresql.xql

I would need to change:
===============================================
select trunc(current_timestamp - max(coalesce(modified_date, in\
sertion_date))) from robots
===============================================
to use
===============================================
"date_part('days',current_timestamp - max(coalesce(modified_date, insertion_date)))".
===============================================

Is this the correct way to use date_part?

thanks

Collapse
Posted by Yves Goldberg on
does anyone can tell me if the following is correct? I will then open a bug and a resolve for this:

===========================
date_part('days',current_timestamp - max(coalesce(modified_date, insertion_date)))
===========================

and is this usefull for the community that way (I don't know how to create a diff file to upload) but would be ready to learn.

Collapse
Posted by Yves Goldberg on
I made a grep in package to find occurance of date_part. I found one in acs-tcl/tcl/community-core-procs-postgresql.xql

trunc(date_part('epoch',age(password_changed_date))/(60*60*24))

so maybe the right way to patch my bug is:

change in packages/robot-detection/tcl/robot-detection-procs-postgresql.xql:

===============================================
select trunc(current_timestamp - max(coalesce(modified_date, insertion_date))) from robots
===============================================
to use
===============================================
select trunc(date_part('days',current_timestamp - max(coalesce(modified_date, insertion_date))))
===============================================

Does anyone is willing to give me a feedback on this?

TIA

Collapse
Posted by Jarkko Laine on
Yves,

No, you don't need to and shouldn't use trunc anymore. So your previous solution should've been correct. So use only

date_part('days',current_timestamp - max(coalesce(modified_date, insertion_date)))

or EXTRACT which is the SQL standard (I'm not sure if it works already in 7.3, but in 7.4 sure). Please read http://www.postgresql.org/docs/7.4/interactive/functions-datetime.html for information on how these functions should be used.

Collapse
Posted by Yves Goldberg on
Thanks Yarkko,

I've created a bug and resolve report at:
https://openacs.org/bugtracker/openacs/bug?filter%2estatus=resolved&bug%5fnumber=1961

with the following:

in packages/robot-detection/tcl/robot-detection-procs-postgresql.xql

change:
===============================================
select trunc(current_timestamp - max(coalesce(modified_date, in\
sertion_date))) from robots
===============================================
with
===============================================
select date_part('days',current_timestamp - max(coalesce(modified_date,insertion_date))) from robots
===============================================

Thank you again.

Collapse
Posted by Jarkko Laine on
Yves,

The state 'Resolved' in bug tracker should only be used when the bug has been fixed in OpenACS CVS repository. Now you have a solution but it's not yet fixed in CVS. If you please, you could upload a patch to bug tracker, which would enhance the possibilities that someone with commit rights will fix the bug soon:

  1. Go to the directory in your checkout where the file in question lies.
  2. Write cvs diff -Nu robot-detection-procs-postgresql.xql > patch-file
  3. Go to your bug in bug tracker.
  4. Hit "Submit a Patch".
  5. Fill in the form and submit the patch file you got from diff.
Collapse
Posted by Jarkko Laine on
Oh yes, and please read the patch instructions Jeff pointed to earlier if you want to understand better the reasoning for this workflow.
Collapse
Posted by Yves Goldberg on
Thank you Jarkko for this information. But I don't have my server under CVS so I can't create diff files I believe.

I hope it will still be usefull to the community - to do it the way I did in bugtrakker.

Yves.

Collapse
Posted by Jarkko Laine on
You can use the "Bad way" in Jeff's doc even if you don't have a CVS checkout. That means using just diff instead of cvs diff.

The reason why using diff (or cvs diff) is useful is that you can then apply the patch file directly to the files being fixed with 'patch' command without hand editing, and thus diminish the chances of typos.

Anyway all bug seeking and fixing is always desirable and admirable, thanks for your good work so far. We just try to teach newcomers to use the common guidelines because it makes life easier for us lazy old bastards :)

Collapse
Posted by Yves Goldberg on
ok Jarkko. I uploaded a patch for robot detection. Would you be willing to check if I done it right?

TIA

Collapse
Posted by Jarkko Laine on
Yes, Yves, you did it really well. There seems to be a <-sign missing from the start of your xml version line but otherwise it looks fine.

It's quite a trivial thing to fix now, so I assigned it to Worker Bee, since I can't fix and test it myself right at this moment.