Forum OpenACS Development: Problem with sending out the email-address-verification email for new users (needs a package; exploring.)

When I try to send the email, it fails, shows me a blank page entitled account-closed, and the error log shows that "package require Trf" failed.

I don't yet know what this package is, I do know it's in ActiveTCL and NOT regular get-the-source-and-build-it tcl.

It's also not in tcllib, and I just read tcllib does use it.

Remember that in many countries, it's required by law that the verification email thing works.

I will explore this and report back. Any members of the openacs core team (OCT) using ActiveTCL?

-Jim

Correction to "When I try to send the email, it fails"; this should read: "when I try to register as a new user, the openacs instance fails to send the verification email"
Hi Jim,

i am pretty sure this comes from tcllib. what version of tcllib are you using with OpenACS? in case you have xotcl-core installed, you can see the version in use via: http://yourhost/xotcl/version-numbers

Checked VERY carefully already, it's not there.

It is here tho: http://sourceforge.net/projects/tcltrf/files/tcltrf/2.1.4/

I did this in the trf source dir:

mkdir build
cd build
../configure --prefix=(my prefix) --with-tcl=(my prefix)/lib
make
make install

I compiled it (there's no C code in tcllib, so it can't be there), installed it, and got past that.

New issue: something's trying to call ds_init which doesn't appear to be there... let's see, where does that come from... developer-support? I'm looking...

Checked VERY carefully already, it's not there.

The question is, what "it" is. i count 15 occurrences of "package require Trf" in tcllib 1.15, most of these are in a catch. It might be the case that you have an older version of tcllib installed, or it might be the case that you have some other package installed that does the "require" command without catch. in case you have a recent version of tcllib and the error is triggered from there, writing a bug report/patch for the tcllib people might be useful.

if by "it" you mean the trf package: trf is a binary package, so it is by definition not part of tcllib.

-gn
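
For readers following along, the "require inside a catch" pattern mentioned above can be sketched roughly as follows. This is a minimal illustration, not code taken from tcllib; the helper name optional_require is hypothetical, and it assumes Tcl 8.5 or later (for the {*} expansion syntax).

```tcl
# Hypothetical helper (not part of tcllib or OpenACS): try to load an
# optional package without aborting the caller when it is missing.
# Returns the loaded version, or an empty string if unavailable.
proc optional_require {name args} {
    if {[catch {package require $name {*}$args} version]} {
        return ""
    }
    return $version
}

set trf_version [optional_require Trf 2.0]
if {$trf_version eq ""} {
    puts "Trf not available; continuing without it"
} else {
    puts "Trf $trf_version loaded"
}
```

With this shape, a missing binary package only changes which branch runs; an unprotected "package require Trf" somewhere else would still raise an error.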

Yes, trf is the "it" which is not present in tcllib...

Yes, it is a binary package, tcllib doesn't have any binaries

Anyway, yeah, I'll try a newer tcllib if there is one.

Still looking for ds_init, it's starting to look like it's been removed from developer support, not sure about that yet.

Gonna run an errand and come back to this.

-Jim

tcllib should not raise an error when a binary package is not present. What version of tcl are you using?
ds_init comes from (is defined in) the developer support. It is called from the request processor (acs-tcl/tcl/request-processor-procs.tcl), protected by a "catch". What version of OpenACS are you using exactly?
Oh, and no xotcl-core on my system yet.
Oh, and version of tcllib on my system is 1.15

-Jim

version of openacs is 5.8, version of ds is a 5.6.0d2 but it's requiring an acs-kernel of 5.8.1d1

-Jim

This means that you are probably using a version of oacs-5-8 post the 5.8.0 release, and you upgraded either via CVS or via "upgrade from repository". From your other posting, i assume the latter. "Upgrade from repository" takes the files for the matching release, which were tagged in this particular case with "openacs-5-8-compat". OpenACS.org builds - based on the compat tags - every night .apm files which are used for upgrade. The tagging happens just occasionally and is done in the oacs-5-8 branch for the development of OpenACS 5.8.1, which will contain as well the application packages...

what can/should we do: you can either stick with the released OpenACS 5.8.0 (tar file release), or you can install via CVS using the oacs-5-8 tag to get the newest set of packages from the branch, or we (whoever that is) can advance the compat tags and do some testing, such that upgrade from repository will work.

Here's what I'm doing now... I removed the "catch {ds_init}" from the request processor proc, and then the errors that came up no longer mentioned ds_init.

About that, I scoured everything looking for mention of ds_init, I even dumped the database into a text file and grepped (in case the string ds_init was mistakenly or purposely placed in a parameter or elsewhere in the db).

Other than the rp proc, there is no mention of ds_init anywhere.

There IS mention of ds_init on the openacs website in the api docs. I'd be curious to know (1) what version of developer-support exists on openacs.org, and (2) whether there is actually a definition of ds_init somewhere in the files of openacs.org; no hurry on this... would you be willing to look at some point?

-Jim

jim,

there seems to be something strange in your installation: it seems to me, as if the Tcl "catch" command stopped working on your installation. The command "catch {ds_init}" seems to raise an error, "catch {package require Trf}" seems to raise an error as well (since most "package require Trf" commands are wrapped in a catch).

ds_init is defined in the packages/acs-developer-support/tcl/acs-developer-support-procs.tcl in around line 683:
http://fisheye.openacs.org/browse/~br=oacs-5-8/OpenACS/openacs-4/packages/acs-developer-support/tcl/acs-developer-support-procs.tcl?hb=true

OpenACS.org uses acs-core (including developer support) from the most recent checkout of the oacs-5-8 branch. If you installed from cvs, you would have the same version. So, you probably installed via the web interface ("from repository") and got an earlier version, as discussed earlier in this thread.

... nevertheless, the "catch {ds_init}" command should not raise an error. What version of Tcl are you using?

Tried again to add a user, still problems sending the verify email. Here is the error msg:

[31/Mar/2014:15:31:53][30656.7fb91b7e8700][-conn:mu-new:0] Error: Error sending registration confirmation to dev1@jam.sessionsnet.org.

    while executing
"ad_raise notfound"
    (procedure "rp_serve_abstract_file" line 32)
    invoked from within
"rp_serve_abstract_file "$root/$extra_url""
    ("uplevel" body line 2)
    invoked from within
"uplevel $code"
    invoked from within
"smtp::sendmessage ::mime::17 -originator bounce-2536-FF2AF25A99C424DCFB07A6552BBC66F242459615-376@jam.sessionsnet.org -header {From {"Jazz Theory and ..."
    ("eval" body line 1)
    invoked from within
"eval $cmd_string"
    (procedure "acs_mail_lite::smtp" line 30)
    invoked from within
"acs_mail_lite::smtp -multi_token $tokens  -headers $headers_list  -originator $originator"
    (procedure "acs_mail_lite::send_immediately" line 155)
    invoked from within
"acs_mail_lite::send_immediately  -to_addr $to_addr  -cc_addr $cc_addr  -bcc_addr $bcc_addr  -from_addr $from_addr  -reply_to $reply_to  -subject $subj..."
    (procedure "acs_mail_lite::send" line 12)
    invoked from within
"acs_mail_lite::send -send_immediately  -to_addr $user(email)  -from_addr "\"$system_name\" <[parameter::get -parameter NewRegistrationEmailAddress -de..."
    (procedure "auth::send_email_verification_email" line 9)
    invoked from within
"auth::send_email_verification_email -user_id $user_id"
    ("uplevel" body line 2)
    invoked from within
"uplevel $body "
    invoked from within
"smtp::sendmessage ::mime::21 -originator bounce-2536-6BED452BBFA1887EE9CE07F4A7A18410495A80BB-376@jam.sessionsnet.org -header {From jim@jam.sessionsne..."
    ("eval" body line 1)
    invoked from within
"eval $cmd_string"
    (procedure "acs_mail_lite::smtp" line 30)
    invoked from within
"acs_mail_lite::smtp -multi_token $tokens  -headers $headers_list  -originator $originator"
    (procedure "acs_mail_lite::send_immediately" line 155)
    invoked from within
"acs_mail_lite::send_immediately  -to_addr $to_addr  -cc_addr $cc_addr  -bcc_addr $bcc_addr  -from_addr $from_addr  -reply_to $reply_to  -subject $subj..."
    (procedure "acs_mail_lite::send" line 12)
    invoked from within
"acs_mail_lite::send -send_immediately  -to_addr $user(email)  -from_addr $system_owner  -subject $subject  -body $body"
    (procedure "auth::password::email_password" line 48)
    invoked from within
"auth::password::email_password  -username $username  -authority_id $authority_id  -password $password  -from [parameter::get -parameter NewRegistratio..."
    ("uplevel" body line 2)
    invoked from within
"uplevel $body "

i've just now started my server from a fresh checkout from oacs-5-8, and tagged the following packages with openacs-5-8-compat:
oacs-5-8 acs-admin
oacs-5-8 acs-api-browser
oacs-5-8 acs-authentication
oacs-5-8 acs-automated-testing
oacs-5-8 acs-bootstrap-installer
oacs-5-8 acs-content-repository
oacs-5-8 acs-core-docs
oacs-5-8 acs-datetime
oacs-5-8 acs-developer-support
oacs-5-8 acs-events
oacs-5-8 acs-kernel
oacs-5-8 acs-lang
oacs-5-8 acs-mail-lite
oacs-5-8 acs-messaging
oacs-5-8 acs-reference
oacs-5-8 acs-service-contract
oacs-5-8 acs-subsite
oacs-5-8 acs-tcl
oacs-5-8 acs-templating
oacs-5-8 acs-translations
oacs-5-8 ajaxhelper
oacs-5-8 attachments
oacs-5-8 calendar
oacs-5-8 categories
oacs-5-8 faq
oacs-5-8 file-storage
oacs-5-8 general-comments
oacs-5-8 intermedia-driver
oacs-5-8 news
oacs-5-8 notifications
oacs-5-8 oacs-dav
oacs-5-8 openacs-default-theme
oacs-5-8 ref-countries
oacs-5-8 ref-language
oacs-5-8 ref-timezones
oacs-5-8 rss-support
oacs-5-8 search
oacs-5-8 tsearch2-driver
oacs-5-8 xotcl-core
oacs-5-8 xotcl-request-monitor
oacs-5-8 xowiki
To be on the safe side, i've bumped the version numbers of acs-tcl and acs-developer-support, such that install-from-repository will pick them up. tomorrow morning CET the .apm files will be regenerated with the files above.

i've also upgraded openacs.org to tcllib 1.15, to have the same versions there as well. Still, everything works as expected.

Maybe the following might help you: when openacs has an incorrect SMTPHost setup, the error message from smtp::sendmessage is rather crude

error reading "sock10": connection refused
The easiest way to test your sendmail setup is probably to open ds/shell and try something like the following
acs_mail_lite::send_immediately -from_addr jim@xx.com -to_addr jim@xxx.com -subject hi -body hi
Note, that none of this explains, why "catch" apparently does not work for you. You have not answered what tcl-versions you are using.
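
As a side note for anyone debugging a similar setup: before involving acs-mail-lite at all, one can check whether anything is listening on the configured SMTP host and port. The following is only a rough sketch; the proc name smtp_reachable is hypothetical, and "localhost" and port 25 are assumptions standing in for whatever SMTPHost and the SMTP port are set to. A successful connect only shows that something accepts connections there, not that mail delivery works.

```tcl
# Hypothetical diagnostic helper: try to open a TCP connection to the
# mail host. Returns a two-element list: {1 message} on success,
# {0 error-text} on failure.
proc smtp_reachable {host {port 25}} {
    if {[catch {socket $host $port} result]} {
        # $result holds the error text, e.g. a "connection refused" message
        return [list 0 $result]
    }
    # $result is the channel name; we only wanted to know it opens
    close $result
    return [list 1 "connected to ${host}:${port}"]
}

# Assumed values; substitute the SMTPHost and port of your instance.
puts [smtp_reachable localhost 25]
```

This can be run from ds/shell or from a plain tclsh on the same machine.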
I tried running acs_mail_lite::send_immediately from the shell with parameters almost exactly as you suggested, and got this in the return box:

The exact line I used from the shell is near the bottom of this, except I obfuscated the actual "to" addr.

Back in a few hours...

ERROR:

    while executing
"ad_raise notfound"
    (procedure "rp_serve_abstract_file" line 32)
    invoked from within
"rp_serve_abstract_file "$root/$extra_url""
    ("uplevel" body line 2)
    invoked from within
"uplevel $code"
    invoked from within
"smtp::sendmessage ::mime::25 -originator bounce-2536-F6B959E1B3FE54F3CB1FBEDFAD213955D3697DF6-376@jam.sessionsnet.org -header {From jim@xx.com} -heade..."
    ("eval" body line 1)
    invoked from within
"eval $cmd_string"
    (procedure "acs_mail_lite::smtp" line 30)
    invoked from within
"acs_mail_lite::smtp -multi_token $tokens  -headers $headers_list  -originator $originator"
    (procedure "acs_mail_lite::send_immediately" line 155)
    invoked from within
"acs_mail_lite::send_immediately -from_addr jim@xx.com -to_addr dev1@obfus.foo -subject "ds/shell test" -body "ds/shell test""
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 [string map {"\\\r\n" " "} $script]"


tcl version on my system is 8.5.13
strange. That's the version to use. When you type in the ds/shell "catch {ds_init}" then you should see either a "0" or "1" in the reply box, but not an error message. Can you confirm that?

Since you have neither confirmed nor denied that you used "install/upgrade from directory", i still assume that you installed that way. The .apm files have been rebuilt by now. have you upgraded from repository? same results?

The smtp client implementation of tcllib 1.15 starts more or less with a
catch {package require Trf 2.0}
What do you get in the message box of ds/shell when you type this command in? Btw, i checked just now: openacs.org has no trf installed.
Well I actually built and installed the latest Trf, so just the package require itself would return 2.1.4 and so the catch returns 0.
Yes, as mentioned before, I did try that with the ds I had installed (that did not have ds_init) and from both the tclsh I built for naviserv and from ds/shell, catch {ds_init} returned 1.

Since then, I upgraded ds, and now ds_init is present, but I haven't tried the same test as above (which should return 0, yes?)

No, I didn't use install/upgrade from directory, I used install/upgrade from repo.

-Jim

yes, "catch {ds_init}" should return 0. what puzzles me most is that you wrote that you got an error from "catch {ds_init}", and that by removing this line, the error disappeared. from your last posting, i get the impression that catch works... very weird.

anyhow, do a "upgrade from repository", maybe you get some more updates this way.

Yaknow, I have a thought...

I've been pondering why the comment "If you use this, I will kill you" was placed on the commentary to ad_raise... in the year 2000...

all ad_raise does is...

return -code error -errorcode [list "AD" "EXCEPTION" $exception] $value

Could this cause something else to seem to be the line an error occurred on?

Heya Gustaf,

What I realized about our exchange while I was looking at things was, at the time I was changing things faster than I was telling you, and I know from the past this can get confusing and makes it hard to know what to suggest. Sorry for that, I'll try to keep you more informed next time you're helping me to look at things.

Along those same lines, I have no idea how catch seemed to be failing. I tested catch on random strings (worked, returned 1) and on catch {ds_init}, which works: before upgrading ds it returned 1, after, it returned 0. Likewise, catch {package require Trf} returns 1 before Trf is installed and 0 afterwards. As mentioned below, an interaction between return -code error, ad_raise and ad_try occasionally causes information about errors to be lost, and so it also causes messages to become uninformative, but I'm getting ahead of myself.

I continued to look at the mail situation, and mostly the error I was getting announced itself as "Error, ad_raise notfound". When I wanted to test the mail sending in a tclsh shell (in order to look at the situation completely free of openacs), this required me to stop using acs_mail_lite::send_immediately and to start using smtp::sendmessage, which is not in openacs (it's in tcllib).

When I was building up the call (it required some other stuff, like headers and a mime part to send), I did it in ds/shell, and consequently this showed me reasonably informative errors, which allowed me to fix the problems as they came up, until there was one point where it again showed "Error: ad_raise notfound". When I moved the test (with the setup) to run on the tclsh that was built for naviserv, I actually got an informative result, which I'll get to momentarily.

What I discovered about ad_raise is that it does one simple return statement; it is meant to raise an exceptional condition, to be caught by ad_try. It does something like return -code error -errorcode [list this is an exception $exception_name], and my belief is that when smtp::sendmessage also returns an error code, it gets caught by the request processor as if it were one of these exceptions, and partly because of that, some details about the actual error are either lost or just not reported properly. I don't have complete details yet on exactly what happens.
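
To make the mechanics a bit more concrete, here is a small stand-alone sketch of how an error raised with a custom -errorcode can be told apart from an ordinary error once it has been caught. It is an illustration only: demo_raise is a hypothetical stand-in built from the one-line body of ad_raise quoted earlier in this thread, classify_error is a made-up helper, and none of this is the actual request processor or ad_try code. It assumes Tcl 8.5 or later (three-argument catch with an options dictionary).

```tcl
# Hypothetical stand-in for ad_raise, using the return statement quoted
# earlier in the thread. Note that the error *message* is $value, which
# is empty unless the caller passes one.
proc demo_raise {exception {value ""}} {
    return -code error -errorcode [list AD EXCEPTION $exception] $value
}

# Hypothetical helper: run a script in the caller's scope and report
# which kind of failure occurred, if any.
proc classify_error {script} {
    if {![catch {uplevel 1 $script} msg opts]} {
        return [list ok $msg]
    }
    set code [dict get $opts -errorcode]
    if {[lrange $code 0 1] eq [list AD EXCEPTION]} {
        # Raised via the ad_raise-style convention: exception name + message
        return [list ad_exception [lindex $code 2] $msg]
    }
    # Any other error: keep both the error code and the original message
    return [list other $code $msg]
}

puts [classify_error {demo_raise notfound}]
puts [classify_error {error "421: 4.3.0 collect: Cannot write"}]
```

One thing this shows is that an exception raised this way without a value carries an empty message, so a handler that reports only the message has nothing to print. That would be consistent with the blank "ERROR:" line in the trace above, though that connection is a guess and not something verified against the OpenACS sources.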

Lastly, when I try to run smtp::sendmessage in the tclsh shell, the precise code I'm running is:

package require mime

set part [mime::initialize -canonical text/plain -string hi]

package require smtp

smtp::sendmessage \
    $part \
    -originator jim@jam.sessionsnet.org \
    -recipients dev1@jam.sessionsnet.org \
    -header {message-id 9123@jam.sessionsnet.org} \
    -header {date {Thu Apr  3 01:42:29 PDT 2014}}

And as I'm typing each separate command, I can observe that all but the last return no errors. The error returned by the last one (the smtp::sendmessage command) is:

421: 4.3.0 collect: Cannot write ./dfs338xUr4013746 (bfcommit, uid=0, gid=104): No such file or directory

This message is more informative, but I don't know what parts of it mean. I see it's running as root; I don't know what "collect" is, and I have no idea how a write to a file in . can fail with file/dir not found.

-Jim

Hi Jim,

I think the catch not catching is due to flooding a sequence of catches with errors. I've been searching for a reference about this without success.

I believe I ran into the problem once due to a permissions issue, where I inadvertently changed the permission of a file that nsd had previously checked permission on and was accessing or writing to but subsequently the OS denied. nsd then spun with high CPU and diagnosing was difficult because catch didn't work as expected.

So, if Gustaf hasn't identified the exact issue, I do believe he is on to a central cause, namely that a file permission has changed for nsd, perhaps a lib file.

cheers,

actually, the error messages for SMTP errors "4.3.0 collect: Cannot write ..." hint at a permission problem of the mail delivery system on SMTPHost, not a permission problem with nsd. in this particular case, chmod 1777 /var/spool/mqueue might help (whether the recommended value of 1777 is ideal can be discussed, but if the problem persists afterwards, one can at least potentially rule out permission problems). Another possible cause might be that sendmail runs under the wrong permissions.

Maybe the package parameter SMTPHost should point to a different mailhost with a correct sendmail/postfix/... setup. maybe jim is trying this on a new instance (fresh linux/bsd/..., fresh openacs database with default SMTPHost, etc).

nevertheless, the error feedback from openacs (acs-mail-lite, maybe acs-tcl, templating involved) should be improved.

I've committed a change that prevents potential errors from smtp::sendmessage from being swallowed silently at higher calling levels. I bumped the version numbers as well, such that tomorrow one can get this change via "install from repository". I hope this improves the situation.
Gustaf, I have another suggestion for a commit to acs-mail-lite, and it depends on your read of the tcllib smtp::sendmessage. The question being: if one provides username and password, does smtp::sendmessage know to use authenticated smtp, and (main point is) if one does -not- provide these, does it use the original unauthenticated protocol?

If so... I have a suggestion, and I'll post it a bit later, meanwhile I'm going to test a coupla more times.

-Jim

I finally got success by providing -servers {a.smart.host} to smtp::sendmessage. This solution completely bypasses using the virtual server machine (aka localhost), and instead uses a machine nearby.

Initial tests on my changes to acs_mail_lite::send_immediately show it needs more work. On that now, results coming soon.

-Jim

After setting the smarthost parameter, acs_mail_lite::send_immediately works too, and with my to-be-proposed changes. One more test...

-Jim

I wanted to change how it's decided whether to use smtp auth or not, so I added code that either adds the smtp password and username, or does not add them, depending on whether the user and password are set in the acs-mail-lite parameters.

The diff:

--- cut here ---
diff -Naur /home/mu-new/openacs-5.8.0/packages/acs-mail-lite//tcl/acs-mail-lite-procs.tcl acs-mail-lite//tcl/acs-mail-lite-procs.tcl
--- /home/mu-new/openacs-5.8.0/packages/acs-mail-lite//tcl/acs-mail-lite-procs.tcl      2013-08-29 02:53:44.000000000 +0400
+++ acs-mail-lite//tcl/acs-mail-lite-procs.tcl  2014-04-04 14:38:06.000000000 +0400
@@ -141,7 +141,16 @@
        foreach header $headers {
            append cmd_string " -header {$header}"
        }
-        append cmd_string " -servers $smtp -ports $smtpport -username $smtpuser -password $smtppassword"
+        append cmd_string " -servers $smtp -ports $smtpport"
+
+      set smtppass_p [expr {$smtppassword ne ""}]
+      set smtpuser_p [expr {$smtpuser ne ""}]
+
+      # change the condition as you like: right now, both user and pass must be set to use auth.
+      if { $smtpuser_p && $smtppass_p } {
+          append cmd_string " -username $smtpuser -password $smtppassword"
+      }
+
        ns_log Debug "send cmd_string: $cmd_string"
        eval $cmd_string
    }
--- cut here ---

-Jim
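
A possible variation on the diff above, offered only as a sketch: instead of appending to a string and running it through eval, the command can be built as a Tcl list, so that a password or header containing spaces or braces cannot change how the command is parsed. The proc name build_smtp_args and the example values are hypothetical; the option names (-header, -servers, -ports, -username, -password) are the ones already used in the diff. This is not the code that is in acs-mail-lite, and it assumes Tcl 8.5 or later if the result is run with {*}.

```tcl
# Hypothetical helper: assemble the smtp::sendmessage call as a proper
# Tcl list instead of a string that is later passed to eval.
proc build_smtp_args {token headers smtp smtpport smtpuser smtppassword} {
    set cmd [list smtp::sendmessage $token]
    foreach header $headers {
        lappend cmd -header $header
    }
    lappend cmd -servers $smtp -ports $smtpport
    # Same condition as in the diff: use auth only when both are set.
    if {$smtpuser ne "" && $smtppassword ne ""} {
        lappend cmd -username $smtpuser -password $smtppassword
    }
    return $cmd
}

# Example with made-up values; no mail is sent here, the command is
# only printed for inspection.
set cmd [build_smtp_args ::mime::1 {{Subject hi}} smart.host.example 25 "" ""]
puts $cmd

# To actually send (requires tcllib's smtp package and a real mime token):
#   {*}$cmd
```

With an empty user or password the list simply lacks the -username/-password pair, which matches the intent of the diff.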

When looking a little closer, I noticed a problem:

When the user fills out the registration form and clicks OK, the system sends the registration email, which the new user receives. But, (when the user clicks OK on the reg form) it still shows a blank page entitled Account Closed. Everything else seems to be working, but the UI makes it seem it got stuck or something's wrong.

When the user receives the verification email, it contains the link, which verifies the user properly.

There are two cases. One: if the verification email used send_immediately, I guess it's not reporting an error to the caller.

Two: if the verification email is queued, there are two questions. First, how is the user informed that s/he needs to check their email? Second, how would the system know whether an error occurred in sending the mail?

I altered my copy of acs_mail_lite::send_immediately, maybe something I did caused this problem. I'll attach my code to the next message so you can see and comment.

-Jim

To clarify further, what should happen after the new user clicks OK on the registration form, is the user should be told to expect a verification email in the next few minutes, and the system is not showing a web page with that message.

Can anyone confirm that in openacs-5.8 it's having the user wait for the verification email?

-Jim

It seems like both sendmail and exim4 are running on the machine... I'll look into that.

Sendmail (according to ps aux) is running as root, while exim is running as userid 109. So one could figure sendmail can't be the permission problem... still, sendmail has pieces and maybe they run as different users. I dunno, maybe I'll replace it all with qmail (which I might be able to get working with webmail if I even want to do that), or with exim (which has an easy setup).

One thing we knew a few days ago, was the email problem existed completely outside openacs, and we found this out when I tried sending from a tclsh.

One thing, I used the unix tool "mail" to send a mail, and that worked.

Anyway, still exploring. I'm also going to look into one of Torbin's suggestions, that is to try a smarthost other than localhost, and configure acs-mail-lite accordingly.

-Jim