Forum OpenACS Q&A: Need advice on improving performance

Hi!

I'm running multiple OpenACS sites on one server and have had some
performance problems (slow site response). I'm hoping somebody here can
help without telling me to "buy another server, more RAM" and so on. I
can't, because this server is rented and we don't have more money to
invest in it right now. So your help will be really appreciated.

Here is the situation:
CPU: Celeron 600 MHz
RAM: 512 MB
hda (IDE HDD, 20 GB), which holds:
The OS, Red Hat Linux 7.1
1 GB swap
PostgreSQL 7.1.3
Most of the source pages for the sites

hdb (IDE HDD, 20 GB):
mostly used for backup and for the source pages of two sites

--------------------------------

The following sites are running:
www.yosisideral.net
www.lectario.com
www.descubra.info
www.eeduk.com
www.viaro.net
www.elencuentro.info

Here is more info:

iostat
Linux 2.4.2-2 (fj.viaro.net)    10/01/2002

avg-cpu:  %user   %nice    %sys   %idle
           3.52    0.00    1.98    2.89
Device:        tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0        2.34         6.95         0.67  531691040  173359450
dev3-1        0.04         0.75         1.49    1759474    3491758



[root@fj /root]# iostat -x /dev/hda
Linux 2.4.2-2 (fj.viaro.net)    10/01/2002

avg-cpu:  %user   %nice    %sys   %idle
           3.52    0.00    1.98    2.89
Device:  rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await  svctm  %util
hda       14.13   2.80 14.22  6.37    6.95    0.67    14.61     0.16    7.00   0.63   0.13

[root@fj /root]# iostat -x /dev/hdb
Linux 2.4.2-2 (fj.viaro.net)    10/01/2002

avg-cpu:  %user   %nice    %sys   %idle
           3.52    0.00    1.98    2.89
Device:  rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await  svctm  %util
hdb        0.18   0.17  0.02  0.02    0.75    1.49    54.30     0.04  879.15 134.42   0.06



 free
             total       used       free     shared    buffers     cached
Mem:        512344     509620       2724          0      13160     130024
-/+ buffers/cache:     366436     145908
Swap:      1052216     161100     891116

Posted by Jonathan Ellis on
Your system is reporting that the CPU isn't very busy, so if all pages are sluggish the disk is the likely cause.  What is your PG shared buffer size?  You may want to increase it.  Too bad you're not running 7.2; if you were, you could check the PG cache hit rate directly instead of guessing.  It would be more informative to see iostat output from a period of sluggishness rather than the global averages since reboot.  From your averages it doesn't look like you'd gain much from moving things to different disks, but that could be misleading.
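
The difference between averages since reboot and a measurement taken during a slow period can be sketched as follows. This is only an illustration: the DiskSample structure and the sample numbers are invented for the example and are not taken from the server above. The idea is to take two readings of the cumulative counters and divide the difference by the elapsed time.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class DiskSample:
    """Cumulative counters for one device at one moment (hypothetical structure)."""
    timestamp: float      # seconds since some reference point
    blocks_read: int      # total blocks read since boot
    blocks_written: int   # total blocks written since boot


def interval_rates(earlier: DiskSample, later: DiskSample) -> Tuple[float, float]:
    """Return (blocks read per second, blocks written per second) between two samples."""
    elapsed = later.timestamp - earlier.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    read_rate = (later.blocks_read - earlier.blocks_read) / elapsed
    write_rate = (later.blocks_written - earlier.blocks_written) / elapsed
    return read_rate, write_rate


# Invented numbers: two samples taken 10 seconds apart during a busy period.
before = DiskSample(timestamp=0.0, blocks_read=1000, blocks_written=500)
after = DiskSample(timestamp=10.0, blocks_read=6000, blocks_written=700)
print(interval_rates(before, after))  # (500.0, 20.0)
```

iostat can do this itself when given an interval argument (for example "iostat -x 5"): the first report is the average since boot and each later report covers only the preceding interval, so the reports after the first are the ones worth posting.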

Another thing you can check is how many threads you are giving nsd.  I know from experience that setting this too high can cause slowdown for no apparent reason.

I run PG with this flag:  -B 1000  (I think that is the buffer size). Is that size good?

I'll try decreasing the thread configuration.

I'll post the iostat output. It is long, but it seems that what is opened most is PG and its data directory.

I misunderstood when you said:
check is how many threads you are giving nsd
I was thinking of the database connection pool... =)
Well, you mean how many nsd processes I have:


[nsadmin@fj aol32]$ ps ax | grep nsd
 3358 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 2
 3359 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 2
 3360 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 2
 3361 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 2
 3362 ?        S      0:06 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 2
 3546 ?        S      0:06 /usr/local/aolserver/bin/nsd -u nsadmin -g web -c not
 3547 ?        S      0:02 /usr/local/aolserver/bin/nsd -u nsadmin -g web -c not
 3548 ?        S      0:00 /usr/local/aolserver/bin/nsd -u nsadmin -g web -c not
 3549 ?        S      0:06 /usr/local/aolserver/bin/nsd -u nsadmin -g web -c not
 3553 ?        S      0:00 /usr/local/aolserver/bin/nsd -u nsadmin -g web -c not
 3554 ?        S      0:00 /usr/local/aolserver/bin/nsd -u nsadmin -g web -c not
10449 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -t descubra.tcl
10450 ?        S      0:03 ../aol33/bin/nsd76 -u nsadmin -g web -t descubra.tcl
10451 ?        S      0:03 ../aol33/bin/nsd76 -u nsadmin -g web -t descubra.tcl
10456 ?        S      0:14 ../aol33/bin/nsd76 -u nsadmin -g web -t descubra.tcl
10561 ?        S      0:07 ../aol33/bin/nsd76 -u nsadmin -g web -t descubra.tcl
24160 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -ft
24471 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -ft
24472 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -ft
24474 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -ft
24603 ?        S      0:13 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -ft
32655 ?        S      0:01 ../aol33/bin/nsd76 -ft gnc.tcl
32660 ?        S      0:00 ../aol33/bin/nsd76 -ft gnc.tcl
32671 ?        S      6:06 ../aol33/bin/nsd76 -ft gnc.tcl
  985 ?        S      0:00 ../aol33/bin/nsd76 -ft gnc.tcl
 3734 ?        S      0:00 ../aol33/bin/nsd76 -ft gnc.tcl
 2715 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -kt open1.tcl
 3356 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -kt open1.tcl
 3357 ?        S      0:02 ../aol33/bin/nsd76 -u nsadmin -g web -kt open1.tcl
 4059 ?        S      0:14 ../aol33/bin/nsd76 -u nsadmin -g web -kt open1.tcl
 4566 ?        S      0:12 ../aol33/bin/nsd76 -u nsadmin -g web -kt open1.tcl
23430 ?        S      0:01 /usr/local/aolserver/bin/nsd8x -kt viaro.tcl
23431 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -kt viaro.tcl
23432 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -kt viaro.tcl
23519 ?        S      0:05 /usr/local/aolserver/bin/nsd8x -kt viaro.tcl
24720 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -kt viaro.tcl
24721 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -kt viaro.tcl
 4803 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -t elencunetro.t
 4824 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t elencunetro.t
 4825 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t elencunetro.t
 5413 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -t eeduk.tcl
 5420 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t eeduk.tcl
 5426 ?        S      0:04 ../aol33/bin/nsd76 -u nsadmin -g web -t eeduk.tcl
 6038 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t elencunetro.t
 6479 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t eeduk.tcl
20382 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t elencunetro.t
12018 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 1
12019 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 1
12020 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 1
12041 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -t 1
12277 ?        S      0:02 /usr/local/aolserver/bin/nsd8x -ft emisorasunidas.tcl
12282 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -ft emisorasunidas.tcl
12296 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -ft emisorasunidas.tcl
12311 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -ft emisorasunidas.tcl
13536 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -ft emisorasunidas.tcl
13537 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -ft emisorasunidas.tcl
13685 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t eeduk.tcl
 1749 ?        S      0:24 /usr/local/aolserver/bin/nsd -u nsadmin -g web -i -c
 1808 ?        S      0:00 /usr/local/aolserver/bin/nsd -u nsadmin -g web -i -c
 1809 ?        S      0:00 /usr/local/aolserver/bin/nsd -u nsadmin -g web -i -c
 1825 ?        S      0:00 /usr/local/aolserver/bin/nsd -u nsadmin -g web -i -c
14003 ?        S      0:00 /usr/local/aolserver/bin/nsd -u nsadmin -g web -i -c
14004 ?        S      0:00 /usr/local/aolserver/bin/nsd -u nsadmin -g web -i -c
31042 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -t lectario.tcl
31046 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t lectario.tcl
31047 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t lectario.tcl
31929 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t lectario.tcl
 7688 ?        S      0:00 ../aol33/bin/nsd76 -u nsadmin -g web -t lectario.tcl
22965 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -kt open1.tcl
23582 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -kt open1.tcl
 5770 ?        S      0:01 ../aol33/bin/nsd76 -u nsadmin -g web -kt open1.tcl
20692 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -u nsadmin -g web -ft
20693 ?        S      0:00 /usr/local/aolserver/bin/nsd8x -ft emisorasunidas.tcl
Sorry again, I was thinking of lsof when you said iostat....
Posted by Jonathan Ellis on
-B 1000 is very small for a production DB.  Each buffer is 8k (unless you changed it at compile time), so this is only 8M.  My server also runs with 512M of memory, and I run with -B 32000.  You may not want it that high, but 1000 is definitely too small.
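
The arithmetic behind those numbers, as a minimal sketch. It assumes the default 8k buffer size mentioned above; the function name is made up for the example.

```python
BUFFER_SIZE_KB = 8  # default buffer size per the post; can be changed at compile time


def shared_buffer_mb(num_buffers: int, buffer_size_kb: int = BUFFER_SIZE_KB) -> float:
    """Memory taken by "-B num_buffers", in megabytes (1 MB = 1024 KB)."""
    return num_buffers * buffer_size_kb / 1024


for buffers in (1000, 32000):
    print(f"-B {buffers}: {shared_buffer_mb(buffers):.1f} MB")
# -B 1000: 7.8 MB
# -B 32000: 250.0 MB
```

So -B 1000 is just under 8M, while -B 32000 takes about half of a 512M machine. A value that large will typically also need the kernel's shared memory limit (SHMMAX) raised before the postmaster will start.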
These are all my flags for PG:

-N 100 -B 1000 -o "-S 2000"

Any other recommendations?

Thanks,

Posted by Jonathan Ellis on
Those are the only ones you should need to play with in 7.1.3.  Here -N and -S are probably fine as they are.  You'll know if you need to increase -N because you'll see nsd complaining in the log that it can't connect.  The sort memory specified in -S is per backend, not shared, so making it large "just in case" isn't a great idea.  2M is plenty for most uses.  With -B you should make it as large as you can without causing other stuff to swap when PG is under load.
Posted by russ m on
SORT_MEM is actually per-sort, not per-backend. From the docs:
SORT_MEM specifies the amount of memory to be used by internal sorts and hashes before switching to temporary disk files. The value is specified in kilobytes, and defaults to 512 kilobytes. Note that for a complex query, several sorts and/or hashes might be running in parallel, and each one will be allowed to use as much memory as this value specifies before it starts to put data into temporary files. And don't forget that each running backend could be doing one or more sorts. So the total memory space needed could be many times the value of SORT_MEM.
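
To put rough numbers on "many times the value of SORT_MEM", here is a small sketch using the flags posted earlier in the thread (-N 100, -S 2000). The number of simultaneous sorts per backend is purely an assumption for illustration, and the function name is made up. The result is an upper bound for the case where every backend is sorting at once, not a prediction of actual usage.

```python
def worst_case_sort_mb(sort_mem_kb: int, backends: int, sorts_per_backend: int = 1) -> float:
    """Upper bound on sort memory, in MB, if every backend runs
    `sorts_per_backend` sorts at once and each uses its full SORT_MEM."""
    return sort_mem_kb * backends * sorts_per_backend / 1024


# -S 2000 (kilobytes per sort) and -N 100 (maximum backends), from the thread.
print(worst_case_sort_mb(2000, 100))      # 195.3125
print(worst_case_sort_mb(2000, 100, 2))   # 390.625 (assumes two sorts per backend)
```

On a 512M machine even the one-sort-per-backend bound is a large share of memory, which is why keeping -S modest matters more as -N and -B grow.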