Forum OpenACS Q&A: Re: OpenACS Keepalive how to do it for windows based installation

Hi Sujata

What does the AOLserver error log report before the service stops? Are you using a Windows service to start and stop AOLserver?

best wishes
Brian

Thanks for the reply.

I am trying to get the server logs. I will post the relevant contents once I get them.

Yes, I am using a Windows service to start and stop AOLserver.

OK, I got the logs. I see this message repeated several times every day in the error.log file, but we don't have any problem accessing the server.

However, when the server stopped responding, this was the last message. All other messages are of the "Notice" type, so I am not posting them.

Please let me know if you can make anything of it. It generally happens on weekends, and then we cannot access project-open on Mondays.

==========================================================

[24/Mar/2012:05:31:46][1244.3180][-sched:23-] Notice: im_mail_import.process_mails0: Error creating '/web/projop/Maildir/spam' folder: 'mkdir ("/web/projop/Maildir/spam") failed: no such file or directory'
[24/Mar/2012:05:31:54][1244.3136][-sched:16-] Notice: acs-mail-lite: about to load qmail queue
[24/Mar/2012:05:31:54][1244.3136][-sched:16-] Notice: acs_mail_lite::load_mail_dir: queue_dir=''
[24/Mar/2012:05:31:54][1244.3136][-sched:16-] Notice: acs_mail_lite::load_mail_dir: queue dir = /new/*, no messages
[24/Mar/2012:05:32:16][1244.1684][-sched:25-] Notice: sync: uid=32644, pid=50898, day=2012-03-20 00:00:00+05:30
[24/Mar/2012:05:32:16][1244.1684][-sched:25-] Error: Tcl exception:
ambiguous option "file": must be authpassword, authuser, channel, close, content, contentlength, contentsentlength, contentchannel, copy, driver, encoding, files, fileoffset, filelength, fileheaders, flags, form, headers, host, id, isconnected, location, method, outputheaders, peeraddr, peerport, port, protocol, query, request, server, sock, start, status, url, urlc, urlencoding, urlv, version, or write_encoded
while executing
"ns_conn $var"
(procedure "ad_conn" line 90)
invoked from within
"ad_conn file"
(procedure "ad_parse_template" line 15)
invoked from within
"ad_parse_template -params [list [list exception_count $exception_count] [list exception_text $exception_text]] "/packages/acs-tcl/lib/ad-return-com..."
(procedure "ad_return_complaint" line 2)
invoked from within
"ad_return_complaint 1 "

  • [_ intranet-core.lt_Unable_to_determine_I]

    [_ intranet-core.lt_Maybe_somebody_has_ch]""
    (procedure "im_company_internal_helper" line 5)
    invoked from within
    "im_company_internal_helper"
    ("eval" body line 1)
    invoked from within
    "eval $script"
    invoked from within
    "ns_cache eval util_memoize $script {
    list $current_time [eval $script]
    }"
    (procedure "util_memoize" line 20)
    invoked from within
    "util_memoize [list im_company_internal_helper]"
    (procedure "im_company_internal" line 2)
    invoked from within
    "im_company_internal"
    (procedure "im_cost::new" line 3)
    invoked from within
    "im_cost::new -cost_name $cost_name -user_id $hour_user_id -creation_ip "0.0.0.0" -cost_type_id [im_cost_type_timesheet]"
    ("uplevel" body line 5)
    invoked from within
    "uplevel 1 $code_block "
    ("uplevel" body line 1)
    invoked from within
    "uplevel 1 $code_block "
    invoked from within
    "db_with_handle -dbn $dbn db {
    set selection [db_exec select $db $full_statement_name $sql]

    set counter 0
    while { [db_getrow $..."
    (procedure "db_foreach" line 36)
    invoked from within
    "db_foreach hours $sql {

    ns_log Notice "sync: uid=$hour_user_id, pid=$project_id, day=$day"
    set cost_name "Timesheet $hour_date $project_nr $user_na..."
    (procedure "im_timesheet2_sync_timesheet_costs" line 48)
    invoked from within
    "im_timesheet2_sync_timesheet_costs"
    ("eval" body line 1)
    invoked from within
    "eval [concat [list $proc] $args]"
    (procedure "ad_run_scheduled_proc" line 42)
    invoked from within
    "ad_run_scheduled_proc {t f 61 im_timesheet2_sync_timesheet_costs {} 1331615237 0 f}"
    [26/Mar/2012:10:10:34][1244.1376][-thread1376-] Notice: nsmain: AOLserver/4.5.1 stopping
    [26/Mar/2012:10:10:34][1244.1376][-thread1376-] Notice: driver: stopping: nssock
    [26/Mar/2012:10:12:50][208.336][-thread336-] Notice: nsmain: AOLserver/4.5.1 starting
    ========================================================

    Thanks,
    Sujata

    Hi

    First of all, if the service isn't stopping, you can just kill the process. I don't think it's necessary to reboot the server.

    In terms of what's causing your issue, it's not too clear. Did somebody manually stop the server at 26/Mar/2012:10:10:34? I find it hard to believe that the Tcl exception at 24/Mar/2012:05:32:16 actually caused the server to freeze up.

    One thing you could try before rebooting is to telnet to port 80 (as described at http://philip.greenspun.com/seia/basics ) to see whether the server is really down.
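A rough scripted equivalent of the telnet check, as a minimal Python sketch. The function name `port_is_open` and the default timeout are illustrative assumptions, not part of AOLserver or OpenACS. It only tests whether something accepts TCP connections on the given port.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("localhost", 80)` could be run on the server itself. A True result shows only that the listener accepts connections, not that pages are actually being served, so a manual request over telnet is still the more telling test.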

    The Tcl exception appears to be caused by a call to ad_conn from within a scheduled proc. This is a bug; see https://openacs.org/forums/message-view?message_id=162503 for an example. The scheduled proc needs to take into account that ad_conn is not available when there is no connection.

    Hope this helps
    Brian

    Thanks for your reply.

    Yes. Because the project-open web page was not accessible, I assumed the server was not responding, so we tried to stop the service manually on 26 March. As it was taking too long to stop, we simply rebooted the machine.

    Next time it happens, I will try telnet and see. We observed the issue on 24 March and again on 13 April, so we will have to wait a few weeks for it to reappear; the most probable dates are 28-29 April or 5-6 May.

    I will post my observations and let you know what actually happened.

    I searched for "how to kill the AOL-server in Win XP" but could not find satisfactory answers. Below is what I found:

    1) Go to Task Manager, select "nsd.exe", and choose "End Process Tree".

    2) Run "netstat -ano", find the PID listening on port 8000 (the actual project-open port), and then run "taskkill /f /PID 5072" with that PID.
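The PID lookup in method 2 can be sketched in Python as below. This is a minimal sketch that assumes the usual five-column layout of Windows `netstat -ano` output (Proto, Local Address, Foreign Address, State, PID). The function name `pids_listening_on` is hypothetical, and the code only parses text; it does not run `netstat` or `taskkill` itself.

```python
from typing import List


def pids_listening_on(netstat_output: str, port: int) -> List[int]:
    """Return sorted PIDs of TCP listeners on `port` found in `netstat -ano` text."""
    pids = set()
    for line in netstat_output.splitlines():
        parts = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State, PID
        if len(parts) != 5 or parts[0] != "TCP" or parts[3] != "LISTENING":
            continue
        # The local address looks like 0.0.0.0:8000 or [::]:8000;
        # the port is whatever follows the last colon.
        local_port = parts[1].rsplit(":", 1)[-1]
        if local_port == str(port) and parts[4].isdigit():
            pids.add(int(parts[4]))
    return sorted(pids)
```

The captured text could come from `subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout`, and the resulting PID would then be passed to `taskkill /f /PID` by hand, as in method 2.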

    With both methods, the Services panel then shows AOLserver-projop as "Starting", but it does not do anything, and I have to stop it manually again. If I stop it immediately, it does stop and I can then restart it.

    I am new to this, and I would really appreciate it if you could confirm which method should be used to kill the server, or whether I am doing it completely wrong.

    Thanks,
    Sujata

    For a forced kill, either of those methods should be fine. This obviously applies only when the normal shutdown isn't responding. Keep an eye on the error log for the "Notice: nsmain: AOLserver/4.5.1 stopping" message; sometimes it can just take a long time to stop.
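One way to watch for that message is to scan the error log for the nsmain lines. The following is a minimal Python sketch, assuming the line format shown in the log excerpt earlier in this thread. The function name `nsmain_events` is hypothetical, and only the "stopping" and "starting" messages seen in that excerpt are matched.

```python
import re
from typing import Iterable, List, Tuple

# Matches e.g.
# [26/Mar/2012:10:10:34][1244.1376][-thread1376-] Notice: nsmain: AOLserver/4.5.1 stopping
NSMAIN_RE = re.compile(
    r"^\[([^\]]+)\]\[[^\]]*\]\[[^\]]*\]\s+"
    r"Notice: nsmain: AOLserver/\S+ (stopping|starting)\s*$"
)


def nsmain_events(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, event) pairs for nsmain stopping/starting lines."""
    events = []
    for line in lines:
        match = NSMAIN_RE.match(line.strip())
        if match:
            events.append((match.group(1), match.group(2)))
    return events
```

Passing an open log file to `nsmain_events` yields the shutdown and startup times in order, so a "stopping" entry with no later "starting" entry suggests the shutdown is still in progress or hung.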

    There was a lengthy discussion recently on the AOLserver mailing list about these occasional problems with Windows versions of AOLserver shutting down. I'm not sure what the final resolution was. You can see some of the threads here: http://permalink.gmane.org/gmane.comp.web.aolserver/16523

    Brian