High Load


Postby Vijay » Sat Sep 23, 2006 2:31 am

Hello,
We are facing a problem with load on a server at a new site.
The server is a dual Xeon 3.0 with a SCSI drive and 2 GB RAM.
Even with only 5 agents logged in, the server shows a load of 100.

Here are the running processes:

top - 02:31:08 up 1:48, 1 user, load average: 111.48, 72.85, 37.61
Tasks: 131 total, 1 running, 129 sleeping, 0 stopped, 1 zombie
Cpu(s): 19.2% us, 25.1% sy, 8.4% ni, 14.8% id, 31.1% wa, 0.2% hi, 1.3% si
Mem: 2068224k total, 2013624k used, 54600k free, 7300k buffers
Swap: 2097136k total, 0k used, 2097136k free, 1778992k cached


PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27673 root 15 0 103m 27m 4520 S 77.7 1.3 9:03.79 asterisk
3139 mysql 16 0 114m 33m 3260 S 6.3 1.6 1:49.48 mysqld
25206 root 21 2 5016 3812 1516 S 3.3 0.2 0:00.10 AST_send_action
25224 root 18 2 5020 3816 1516 S 3.0 0.2 0:00.09 AST_send_action
25279 root 20 2 5020 3816 1516 S 3.0 0.2 0:00.09 AST_send_action
25230 root 20 2 5016 3812 1516 S 2.7 0.2 0:00.08 AST_send_action
25242 root 21 2 5020 3816 1516 S 2.7 0.2 0:00.08 AST_send_action
25251 root 22 2 5016 3812 1516 S 2.3 0.2 0:00.07 AST_send_action
25256 root 18 2 5020 3816 1516 S 2.0 0.2 0:00.06 AST_send_action
8531 nobody 16 0 6124 4156 1516 S 1.7 0.2 0:01.62 httpd
3179 nobody 16 0 6108 4148 1532 S 1.3 0.2 0:09.74 httpd
6808 nobody 16 0 6488 4548 1588 S 1.0 0.2 0:11.26 httpd
15227 nobody 16 0 6476 4448 1568 S 1.0 0.2 0:09.76 httpd
27752 root 17 2 4520 3316 1532 S 1.0 0.2 0:01.51 AST_manager_sen
3348 nobody 16 0 5928 3868 1532 S 1.0 0.2 0:02.62 httpd
3349 nobody 16 0 6440 4468 1532 S 1.0 0.2 0:03.11 httpd
1254 root 15 0 0 0 0 S 0.7 0.0 0:14.14 kjournald
14821 root 15 0 4776 3620 1536 S 0.7 0.2 0:06.88 AST_VDauto_dial
15956 nobody 16 0 6488 4536 1568 S 0.7 0.2 0:09.23 httpd
16375 nobody 16 0 6468 4540 1572 S 0.7 0.2 0:08.63 httpd
29954 nobody 15 0 6472 4492 1568 S 0.7 0.2 0:07.76 httpd
30880 nobody 16 0 6476 4536 1568 S 0.7 0.2 0:02.93 httpd
4424 nobody 16 0 5824 3816 1524 S 0.7 0.2 0:01.68 httpd
8698 nobody 15 0 6124 4160 1532 S 0.7 0.2 0:02.29 httpd
19742 nobody 16 0 5820 3764 1524 S 0.7 0.2 0:00.84 httpd
4047 nobody 15 0 6440 4520 1572 S 0.3 0.2 0:07.40 httpd
23663 nobody 16 0 5952 3896 1532 S 0.3 0.2 0:03.58 httpd
27735 root 17 2 5440 4204 1532 S 0.3 0.2 0:00.92 AST_manager_lis
3329 nobody 16 0 6436 4512 1568 S 0.3 0.2 0:01.42 httpd
20569 nobody 16 0 5624 3592 1504 S 0.3 0.2 0:00.14 httpd
24140 root 16 0 2064 1120 836 R 0.3 0.1 0:00.07 top
1 root 16 0 684 252 216 S 0.0 0.0 0:01.50 init
2 root RT 0 0 0 0 S 0.0 0.0 0:00.05 migration/0
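
A rough sketch, in Python, of how the header of a saved top capture like the one above could be checked for I/O wait. The function name `parse_top_summary` is made up for illustration, and the regular expressions assume this particular top header layout. A load average above 100 with only 1 running task and 31.1% wa may indicate tasks blocked on I/O rather than a CPU-bound loop, but that is a reading of the numbers, not a diagnosis.

```python
import re

def parse_top_summary(text):
    """Extract load averages and CPU percentages from top's header lines."""
    summary = {}
    load = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text)
    if load:
        summary["load"] = tuple(float(value) for value in load.groups())
    for line in text.splitlines():
        if line.startswith("Cpu(s):"):
            # Fields look like "31.1% wa"; key each percentage by its label.
            pairs = re.findall(r"([\d.]+)%\s*(\w+)", line)
            summary["cpu"] = {label: float(pct) for pct, label in pairs}
    return summary

# Header lines copied from the capture above.
header = """top - 02:31:08 up 1:48, 1 user, load average: 111.48, 72.85, 37.61
Tasks: 131 total, 1 running, 129 sleeping, 0 stopped, 1 zombie
Cpu(s): 19.2% us, 25.1% sy, 8.4% ni, 14.8% id, 31.1% wa, 0.2% hi, 1.3% si"""

info = parse_top_summary(header)
print("1-minute load:", info["load"][0])
print("I/O wait %:", info["cpu"]["wa"])
```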


Any idea what could be wrong here?

Postby Vijay » Sat Sep 23, 2006 2:33 am

Could this be related to a memory leak in Asterisk?

Postby mflorell » Sat Sep 23, 2006 7:13 am

Holy &*$%!

A loadavg of 111! I'm surprised your command line was even responding. There could be any number of issues causing this.

Have you tried "show full processlist" in MySQL to see if your DB is locked up in some heavy query?

In the Asterisk CLI, is it locked in a loop or something?

Postby Vijay » Sat Sep 23, 2006 8:42 am

Here is the output of the SQL query:
+------+------+-----------------+----------+---------+------+-------+-----------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+------+-----------------+----------+---------+------+-------+-----------------------+
| 1296 | cron | 127.0.0.1:54018 | asterisk | Sleep | 206 | | NULL |
| 1297 | cron | 127.0.0.1:54020 | asterisk | Sleep | 205 | | NULL |
| 1298 | cron | 127.0.0.1:54022 | asterisk | Sleep | 0 | | NULL |
| 1303 | cron | 127.0.0.1:54030 | asterisk | Sleep | 2 | | NULL |
| 1304 | cron | 127.0.0.1:54031 | asterisk | Sleep | 0 | | NULL |
| 1305 | cron | 127.0.0.1:54032 | asterisk | Sleep | 0 | | NULL |
| 1316 | cron | 127.0.0.1:54046 | asterisk | Sleep | 11 | | NULL |
| 1320 | root | localhost | asterisk | Query | 0 | NULL | show full processlist |
+------+------+-----------------+----------+---------+------+-------+-----------------------+
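
For anyone who wants to check such output automatically, here is a minimal Python sketch that parses the ASCII table printed by the mysql client and keeps only connections that are doing work. The helper names `parse_processlist` and `busy_queries` and the 5-second threshold are assumptions for illustration. The parser splits on "|", so it would break on a query whose text itself contains that character.

```python
def parse_processlist(table_text):
    """Turn the mysql client's ASCII table into a list of dicts keyed by column."""
    def split(line):
        return [cell.strip() for cell in line.strip().strip("|").split("|")]

    rows = [line for line in table_text.splitlines() if line.startswith("|")]
    if not rows:
        return []
    header = split(rows[0])
    return [dict(zip(header, split(line))) for line in rows[1:]]

def busy_queries(rows, min_seconds=5):
    """Keep connections that are not sleeping and have run for a while."""
    return [row for row in rows
            if row["Command"] != "Sleep" and int(row["Time"]) >= min_seconds]

# Three rows copied from the output above.
sample = """
| Id | User | Host | db | Command | Time | State | Info |
| 1296 | cron | 127.0.0.1:54018 | asterisk | Sleep | 206 | | NULL |
| 1298 | cron | 127.0.0.1:54022 | asterisk | Sleep | 0 | | NULL |
| 1320 | root | localhost | asterisk | Query | 0 | NULL | show full processlist |
"""

rows = parse_processlist(sample)
print(busy_queries(rows))
```

On the output posted here the filter returns nothing, which matches what the table shows: every connection is sleeping except the processlist query itself.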


Asterisk is not in a loop either. Usually, when we are not dialing, the load on the server stays around 0.05 (that is the normal case on our other servers), but on this server some software seems to be causing trouble. Even when idle it shows a load of 1.25, and after 1 hour of dialing with 6 agents it goes up to 150. Here is the output of top again, with no dialing happening. Asterisk does not seem to be stuck in a loop, because stopping Asterisk does not bring the load down either. Any guess what the issue could be?

top - 08:42:04 up 4:40, 3 users, load average: 1.00, 1.02, 1.00
Tasks: 102 total, 1 running, 94 sleeping, 0 stopped, 7 zombie
Cpu(s): 0.2% us, 0.2% sy, 0.0% ni, 99.1% id, 0.4% wa, 0.0% hi, 0.0% si
Mem: 2068224k total, 1344632k used, 723592k free, 117276k buffers
Swap: 2097136k total, 0k used, 2097136k free, 950800k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3185 mysql 16 0 99.8m 14m 3228 S 1.3 0.7 0:11.98 mysqld
21907 root 16 0 0 0 0 Z 0.3 0.0 0:00.07 AST_VDhopper.pl <defunct>
1 root 16 0 680 252 216 S 0.0 0.0 0:01.36 init
2 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
3 root 39 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
4 root RT 0 0 0 0 S 0.0 0.0 0:00.02 migration/1
5 root 39 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/1
6 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/2
7 root 39 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/2
8 root RT 0 0 0 0 S 0.0 0.0 0:00.03 migration/3
9 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/3
10 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/0
11 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/1
12 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/2
13 root 10 -5 0 0 0 S 0.0 0.0 0:00.01 events/3
14 root 10 -5 0 0 0 S 0.0 0.0 0:00.01 khelper
15 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kthread
20 root 10 -5 0 0 0 S 0.0 0.0 0:00.17 kblockd/0
21 root 10 -5 0 0 0 S 0.0 0.0 0:00.02 kblockd/1
22 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kblockd/2
23 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kblockd/3
24 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kacpid
134 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 khubd
367 root 15 0 0 0 0 S 0.0 0.0 0:03.39 pdflush
368 root 15 0 0 0 0 S 0.0 0.0 0:03.42 pdflush
369 root 16 0 0 0 0 S 0.0 0.0 0:06.05 kswapd0
370 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 aio/0
371 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 aio/1
372 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 aio/2
373 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 aio/3
404 root 16 0 0 0 0 S 0.0 0.0 0:00.00 shpchpd_event
992 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kseriod
1083 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_2
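
One thing that can be checked on a box in this state: Linux counts tasks in uninterruptible sleep (state D) in the load average, so an idle load pinned near 1.00 with the CPU 99% idle can come from a single stuck task. The Python sketch below lists processes in state D or Z by reading /proc. The function names are made up for illustration, the sample stat line is constructed rather than taken from this server, and this is only one possible cause.

```python
import glob

def parse_stat(stat_line):
    """Return (pid, command, state) from one /proc/<pid>/stat line."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or ")", so split on the first "(" and the last ")".
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    command = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, command, state

def stuck_processes():
    """List processes in uninterruptible sleep (D) or zombie (Z) state."""
    found = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                pid, command, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state in ("D", "Z"):
            found.append((pid, command, state))
    return found

# Constructed example line, shaped like a zombie entry.
example = "21907 (AST_VDhopper.pl) Z 1 21907 21907 0 -1 0"
print(parse_stat(example))

for pid, command, state in stuck_processes():
    print(pid, command, state)
```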

Postby mflorell » Sat Sep 23, 2006 10:43 am

Something is certainly wrong somewhere. Have you tried running something like memtest for several hours to see if there is an issue with RAM going bad?

We had some bad RAM and it made the load on the server go up until we replaced it.

What Linux distro are you using?

Postby Vijay » Sat Sep 23, 2006 11:09 am

Everything is as per the scratch install.
Slackware 10.2 with a 2.6 kernel.
Everything was fine on this server until 2 days ago, when the load suddenly started increasing. The server did restart once because of a power failure, and I don't know whether that caused the problem. It does not look like a hardware issue, though, because when I boot into the other kernel, 2.4 (the default one), the load is normal (0.08 without calls). But I am using ztdummy, so that kernel will not work for us.

Postby dev_4901 » Sat Sep 23, 2006 4:39 pm

Vijay,

This is what I found on a website about /proc/loadavg :

This file contains information about the system load. The first three numbers are the load averages: the number of tasks that are runnable or waiting on disk I/O, averaged over the last 1, 5, and 15 minutes. The next entry shows the number of currently runnable tasks (those scheduled to run rather than blocked in a system call) and, after the slash, the total number of tasks on the system. The final entry is the process ID of the most recently created process.

Example output:

0.55 0.47 0.43 1/210 12437
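
As a small illustration, this Python sketch splits a line in that format into named fields. The `LoadAvg` type and `parse_loadavg` function are names invented here, and the code assumes the five-field layout shown in the example output.

```python
from typing import NamedTuple

class LoadAvg(NamedTuple):
    one_min: float
    five_min: float
    fifteen_min: float
    runnable: int   # tasks currently runnable
    total: int      # all tasks on the system
    last_pid: int   # PID reported in the final field

def parse_loadavg(line):
    """Parse one line in /proc/loadavg format."""
    one, five, fifteen, tasks, last_pid = line.split()
    runnable, total = tasks.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(runnable), int(total), int(last_pid))

sample = parse_loadavg("0.55 0.47 0.43 1/210 12437")
print(sample)
```

On a live Linux system the same function could be fed the contents of /proc/loadavg, for example `parse_loadavg(open("/proc/loadavg").read())`.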

I ask because I'm facing this high load problem too.
Are you using G729 as the voice codec?

Dev Singhal.

Postby Vijay » Sun Sep 24, 2006 9:37 am

I am not using G729; I am transcoding from Alaw to GSM.
Moreover, I have reinstalled the kernel, and with the new kernel memory usage is normal; with about 20 agents the load goes up to 1.78 at most.
But if I boot back into the old kernel, the memory leak is still there, and I don't know what is causing it. I have not removed that kernel, so that I can debug it and we have a solution ready if any of us faces this problem in future. I will let you know if I find the reason for the high load in that kernel.
Any help from the forum will also be appreciated.

