Forum .LRN Q&A: Re: Appreciate help with dotLRN performance

Posted by Shankar Venkatagiri on
Thanks for the pointer. I went ahead and vacuumed the database (vacuumdb -a -f -v). I didn't see any progress even after this. More specifically, dotLRN drags when I try to load the Class Home page, which is a set of portlets. Any help?

Shankar

Posted by Roberto Mello on
You forgot to analyze the database. That's the -z flag for vacuumdb. The -z flag is more important than the -f (full) flag for performance. I usually vacuum analyze my databases several times a day, but only vacuum full once a day, depending on DML operations of the database.

-Roberto
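
For reference, the two routines described above can be sketched as a small command builder. This is only an illustration, not anyone's actual maintenance script: the helper name `vacuumdb_command` and the split into "routine" and "nightly" runs are assumptions. Only the flags come from this thread: `-a` (all databases), `-f` (full), `-z` (analyze), `-v` (verbose). The code builds argument lists and prints them; it does not run anything against a database.

```python
def vacuumdb_command(full=False, analyze=True, all_databases=True, verbose=True):
    """Build a vacuumdb argument list (hypothetical helper; nothing is executed)."""
    cmd = ["vacuumdb"]
    if all_databases:
        cmd.append("-a")  # every database in the cluster
    if full:
        cmd.append("-f")  # full vacuum: slower, run it rarely
    if analyze:
        cmd.append("-z")  # refresh the planner statistics
    if verbose:
        cmd.append("-v")
    return cmd


# Frequent run: vacuum plus analyze, no full vacuum.
routine = vacuumdb_command()
# Occasional run (for example once a day): full vacuum plus analyze.
nightly = vacuumdb_command(full=True)

print(" ".join(routine))
print(" ".join(nightly))
```

Either list could be handed to a scheduler or to `subprocess.run` once you have checked it against your own PostgreSQL version.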

Posted by Shankar Venkatagiri on
Thanks for the tip. I did go ahead and analyze the db. Not sure I understand all of the output, but I will have someone here look at it.

What I am positive about is that when I load the Class Home (set of portlets) the nsd processes take up a huge chunk of memory running whatever script they run. Apologies for the ignorance here:

  PID USER      PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM  TIME COMMAND
 1748 shikshan   16   0 62528  61M  2336 S     2.9  6.1  0:13 nsd
 1747 shikshan   15   0 62528  61M  2336 S     0.5  6.1  0:14 nsd

Also, the cache goes up significantly when this happens. Any help will be welcomed.

Shankar

Posted by Andrew Piskorski on
A resident set size of 61 MB for your AOLserver is a "huge chunk of memory"? I don't think so. In your top output above, note that nsd is only taking 6% of your memory. That's not large, that's trivially small.
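
As a rough sanity check on those numbers, the RSS and %MEM columns from the top output can be turned into an estimate of the machine's total memory. This is a back-of-the-envelope sketch: the variable names are made up for the example, and the result is only as precise as top's rounded figures.

```python
# Values read off the top output posted earlier in this thread.
rss_mb = 61        # RSS column for nsd, in MB
mem_percent = 6.1  # %MEM column for nsd

# If 61 MB is 6.1% of physical memory, the total follows by division.
estimated_total_mb = rss_mb / (mem_percent / 100)

print(f"estimated physical memory: about {round(estimated_total_mb)} MB")
```

In other words, the server is using about 61 MB on a machine with roughly 1 GB of RAM.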
Posted by Shankar Venkatagiri on
Thanks for the clarification. What I notice is that these processes don't "quit". Here's a sampler:

shikshan  1789  0.6  4.7 54276 48848 ?      S    11:36  0:12 [nsd]
shikshan  1790  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1791  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1792  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1797  0.0  4.7 54276 48848 ?      S    11:36  0:01 [nsd]
shikshan  1798  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1799  0.0  4.7 54276 48848 ?      S    11:36  0:01 [nsd]
shikshan  1800  0.0  4.7 54276 48848 ?      S    11:36  0:01 [nsd]
shikshan  1801  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1802  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]

Any clues? Also, would using Apache instead of AOLserver improve my situation?

Thanks in advance -
Shankar

Posted by Jeff Davis on
AOLserver is multithreaded, so what you are seeing is multiple threads in one process, not processes that fail to exit.

Using Apache might improve your situation immensely, but I doubt it will do so if you intend to run OpenACS.

Posted by Shankar Venkatagiri on

Thanks for the pointer, Jeff. The ps output I posted was taken almost 10 minutes after my last interaction with dotLRN. The same processes are still there now, five hours after I posted the previous message. Could this indicate processes that never exit?

I will go ahead and test dotLRN with Apache. I do not, however, seem to understand the distinction between OpenACS and dotLRN. My bad!

Shankar

Posted by Jeff Davis on
Neither dotLRN nor OpenACS will work under Apache (well, you might be able to fight with mod_aolserver for a few months and get it to run acceptably, but I certainly would not recommend it). I don't really think of dotLRN as being separate from OpenACS; rather, it is a particular install of OpenACS.

You also don't seem to understand the difference between a thread and a process. AOLserver is multithreaded: it creates threads within the server process to handle requests, and typically those threads do not go away until the server process exits. ps on Linux has the annoying habit (not sure whether you would call it a feature or a bug) of displaying threads as if they were processes. These threads don't take any memory beyond what the server already uses. Note how they are all listed with the same size and the same start time; that's because it is all just the same server process.
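
To make the "same size, same start time" observation concrete, here is a minimal sketch that groups the ps rows quoted earlier in this thread by command, virtual size, RSS and start time. The function name `group_by_footprint` and the grouping rule are assumptions made for this example. An identical footprint is only a heuristic hint that the rows are threads of one process, not proof.

```python
from collections import defaultdict

# The ps rows quoted earlier in this thread.
PS_SAMPLE = """\
shikshan  1789  0.6  4.7 54276 48848 ?      S    11:36  0:12 [nsd]
shikshan  1790  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1791  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1792  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1797  0.0  4.7 54276 48848 ?      S    11:36  0:01 [nsd]
shikshan  1798  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1799  0.0  4.7 54276 48848 ?      S    11:36  0:01 [nsd]
shikshan  1800  0.0  4.7 54276 48848 ?      S    11:36  0:01 [nsd]
shikshan  1801  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
shikshan  1802  0.0  4.7 54276 48848 ?      S    11:36  0:00 [nsd]
"""


def group_by_footprint(ps_text):
    """Group rows that share command, VSZ, RSS and start time (heuristic only)."""
    groups = defaultdict(list)
    for line in ps_text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        pid, vsz, rss = int(fields[1]), int(fields[4]), int(fields[5])
        start, command = fields[8], fields[10]
        groups[(command, vsz, rss, start)].append(pid)
    return dict(groups)


groups = group_by_footprint(PS_SAMPLE)
for (command, vsz, rss, start), pids in groups.items():
    # Summing RSS per row would count the same memory once per thread.
    print(f"{command}: {len(pids)} rows started {start}, "
          f"shared RSS {rss} KB (naive sum would be {rss * len(pids)} KB)")
```

All ten rows fall into a single group, which is what you would expect if they are threads of one nsd server process and not ten separate processes.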