Asterisk keeps channels open sometimes then crashes

gvv
Posts: 29
Member Since:
2008-01-31

I've prepared a system which is going into production from tomorrow, but I have a big issue with it. Calls are going well most of the time, but it seems that at random some ZAP channels are not disconnected properly.

We have a 6-channel ISDN line from KPN in the Netherlands. About 80% of the time all is OK, but the other 20% of the time the channel stays occupied even though both ends (internal SIP/external POTS) have hung up.

If I run show channels, sometimes I get no result back, and about an hour later Asterisk crashes!

I'm using TrixBox 2.6.0.3 with the install-ZAPHFC script 1.4.0 and a Junghanns QuadBRI card.

this is my zaptel.conf:
# Span 1: ztqoz/1/1 "quadBRI PCI ISDN Card 1 Span 1 [TE] (cardID 0)"
span=1,1,1,ccs,ami
# termtype: te
bchan=1-2
dchan=3
# Span 2: ztqoz/1/2 "quadBRI PCI ISDN Card 1 Span 2 [TE] (cardID 0)"
span=2,2,1,ccs,ami
# termtype: te
bchan=4-5
dchan=6
# Span 3: ztqoz/1/3 "quadBRI PCI ISDN Card 1 Span 3 [TE] (cardID 0)"
span=3,3,1,ccs,ami
# termtype: te
bchan=7-8
dchan=9
# Span 4: ztqoz/1/4 "quadBRI PCI ISDN Card 1 Span 4 [TE] (cardID 0)"
span=4,4,1,ccs,ami
# termtype: te
bchan=10-11
dchan=12
# Global data
loadzone = nl
defaultzone = nl

this is my zapata.conf:
[channels]
language=nl
switchtype=euroisdn
pridialplan = unknown
prilocaldialplan = unknown
nationalprefix=0
internationalprefix=00
usecallingpres=yes
echocancel=yes
echocancelwhenbridged=yes
echotraining=100
rxgain=0.0
txgain=0.0
; Span 1: ztqoz/1/1 "quadBRI PCI ISDN Card 1 Span 1 [TE] (cardID 0)"
group=0,11
context=from-pstn
switchtype = euroisdn
signalling = bri_cpe
channel => 1-2
group=
context=default
; Span 2: ztqoz/1/2 "quadBRI PCI ISDN Card 1 Span 2 [TE] (cardID 0)"
group=0,12
context=from-pstn
switchtype = euroisdn
signalling = bri_cpe
channel => 4-5
group=
context=default
; Span 3: ztqoz/1/3 "quadBRI PCI ISDN Card 1 Span 3 [TE] (cardID 0)"
;group=0,13
;context=from-pstn
;switchtype = euroisdn
;signalling = bri_cpe
;channel => 7-8
;group=
;context=default
; Span 4: ztqoz/1/4 "quadBRI PCI ISDN Card 1 Span 4 [TE] (cardID 0)"
group=0,14
context=from-pstn
switchtype = euroisdn
signalling = bri_cpe
channel => 10-11
group=
context=default
group=1
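
Editor's sketch, not from the original post: one quick sanity check for a setup like this is to compare the B-channels declared in zaptel.conf with the channels actually handed to Asterisk in zapata.conf. The helper names below (expand, channels_for) are made up for illustration, and the parsing is deliberately simplistic: it assumes one "key = value" or "key => value" per line, with "#" comments in zaptel.conf and ";" comments in zapata.conf.

```python
import re

def expand(ranges):
    """Expand a spec such as '1-2' or '4-5,7' into a set of channel numbers."""
    channels = set()
    for part in ranges.split(","):
        part = part.strip()
        if not part:
            continue
        lo, _, hi = part.partition("-")
        channels.update(range(int(lo), int(hi or lo) + 1))
    return channels

def channels_for(text, key, comment):
    """Collect channel numbers from every non-commented 'key = ...' or 'key => ...' line."""
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*=>?\s*(.+)$")
    found = set()
    for line in text.splitlines():
        line = line.split(comment, 1)[0]  # drop comments, including commented-out lines
        match = pattern.match(line)
        if match:
            found |= expand(match.group(1))
    return found

if __name__ == "__main__":
    # Paths are the usual defaults; adjust if your install differs.
    with open("/etc/zaptel.conf") as f:
        bchans = channels_for(f.read(), "bchan", "#")
    with open("/etc/asterisk/zapata.conf") as f:
        used = channels_for(f.read(), "channel", ";")
    print("B-channels not given to Asterisk:", sorted(bchans - used))
    print("Asterisk channels with no B-channel:", sorted(used - bchans))
```

With the two files as posted above, this should list channels 7 and 8 as not given to Asterisk (span 3 is commented out in zapata.conf) and nothing in the second list.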

I hope I missed something here, because I don't know where to look.
What I saw is that the current BRIstuff from Junghanns is named test6, for asterisk 1.4.17
If there are no hints, does it make sense to downgrade to Asterisk 1.2.26 to use their production BRIstuff? And does trixbox 2.6.0.3 work with Asterisk 1.2.26, or do I also need to downgrade trixbox?

Thanks, and happy Easter!



W1zz
Posts: 562
Member Since:
2006-05-31
IIRC this has been discussed before.

Use the TB Forum Search and search for "KPN"

--

Alan

install-ZAPHFC

Look here for more help.
Current version is 1.4.0 (25 January 2008)



gvv
Posts: 29
Member Since:
2008-01-31
Not having D-Channel issues

Hi Alan, thanks for your reply.
I do not have D-channel issues with "empty HDLC frame or bad CRC received".

Asterisk is not giving any issues on this.
I had read before about KPN dropping the D-channel when it is not in use, so I called KPN about this.

They mentioned that this could only be the issue on a single ISDN line using PTMP.
We have multiple ISDN (total 6 channels) called "ISDN meervoudig" and it's using PTP.

I don't see D-channels going down, but I do have issues with some channels not being freed correctly; I have to soft hangup these by hand.

Apart from the D-channel issue, I don't see anything specific to KPN in the search, maybe you can point me in the right direction.
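
Editor's sketch, not from the original post: the manual cleanup described above can be scripted by passing the CLI command soft hangup to the running Asterisk through asterisk -rx. The function names are hypothetical, and the channel name Zap/2-1 is only an example; use whatever zap show channels reports as stuck.

```python
import subprocess

def soft_hangup_args(channel):
    """Build the argv that asks a running Asterisk to soft-hangup one channel."""
    return ["asterisk", "-rx", f"soft hangup {channel}"]

def soft_hangup(channels, run=subprocess.run):
    """Request a soft hangup for each named channel and return the commands issued.

    'run' is injectable so the function can be tried without a live Asterisk.
    """
    issued = []
    for channel in channels:
        args = soft_hangup_args(channel)
        run(args, check=False)  # do not raise if Asterisk reports an error
        issued.append(args)
    return issued

if __name__ == "__main__":
    # Example channel name only; replace with the channel that is stuck.
    soft_hangup(["Zap/2-1"])
```

This only removes the symptom; it does not explain why the channel was left open.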



W1zz
Posts: 562
Member Since:
2006-05-31
At the asterisk CLI turn on BRI debug. Watch the BRI dialogue to see if the reason is obvious.

The thread I was thinking of was:

http://www.trixbox.org/forums/trixbox-forums/help/hfc-isdn-card-i...

--

Alan



gvv
Posts: 29
Member Since:
2008-01-31
Wait and see

I've enabled BRI debug on all 4 spans, so now we have to wait for it.

It hasn't happened today, I'll try to post when it happens again.

The only part of the debug I don't understand is this:

2 -- Restarting T203 counter
4 -- Restarting T203 counter
1 -- Restarting T203 counter
2 -- Restarting T203 counter
4 -- Restarting T203 counter
1 -- Restarting T203 counter
1 -- Restarting T203 counter
1 -- Restarting T203 counter

This repeats every 30 seconds or so, even with no active calls. Is this correct behaviour?
FYI span3 is not used at the moment.

also I sometimes see:
Unable to set TOS to 184



gvv
Posts: 29
Member Since:
2008-01-31
Here it is

Ok, now I have an open channel with nobody connected at this time...

show channels = empty

zap show channels shows channel 2 as connected

zap show channel 2:

Channel: 2
File Descriptor: 11
Span: 1
Extension: [my external number]
Dialing: no
Context: from-pstn
Caller ID: [Cust number]
Calling TON: 17
Caller ID name:
Destroy: 0
InAlarm: 0
Signalling Type: ISDN PRI
Radio: 0
Owner: Zap/2-1
Real: Zap/2-1
Callwait:
Threeway:
Confno: -1
Propagated Conference: -1
Real in conference: 0
DSP: yes
Relax DTMF: no
Dialing/CallwaitCAS: 0/0
Default law: alaw
Fax Handled: no
Pulse phone: no
Echo Cancellation: 128 taps, currently ON
PRI Flags: Call
PRI Logical Span: Implicit
Actual Confinfo: Num/0, Mode/0x0000
Actual Confmute: No
Hookstate (FXS only): Onhook

tail /var/log/messages:
Mar 26 13:33:07 trixbox ntpd[2243]: synchronized to 192.87.106.3, stratum 1
Mar 26 13:50:20 trixbox ntpd[2243]: synchronized to 192.87.36.4, stratum 1
Mar 26 13:57:50 trixbox ntpd[2243]: synchronized to 192.87.106.3, stratum 1
Mar 26 14:04:56 trixbox ntpd[2243]: synchronized to 192.87.36.4, stratum 1
Mar 26 15:13:21 trixbox ntpd[2243]: synchronized to 192.87.106.3, stratum 1
Mar 26 16:19:46 trixbox ntpd[2243]: synchronized to 192.87.36.4, stratum 1
Mar 26 16:27:59 trixbox kernel: zaptel Disabled echo canceller because of tone (rx) on channel 1

I couldn't find anything in the debugging.
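
Editor's sketch, not from the original post: the mismatch described here (show channels empty, but zap show channel 2 still listing an owner) can be detected mechanically. The code below parses the "Key: value" lines of zap show channel output and flags a channel whose Owner is not among the active channels. The function names are hypothetical, and treating an empty owner or "<None>" as idle is an assumption about what an idle channel prints.

```python
def parse_zap_channel(text):
    """Turn the 'Key: value' lines of 'zap show channel N' output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            info[key.strip()] = value.strip()
    return info

def looks_stuck(info, active_channels):
    """True if the Zap channel has an owner that 'show channels' does not list."""
    owner = info.get("Owner", "")
    if owner in ("", "<None>"):  # assumed idle markers
        return False
    return owner not in active_channels

if __name__ == "__main__":
    sample = "Channel: 2\nSpan: 1\nOwner: Zap/2-1\nReal: Zap/2-1\n"
    info = parse_zap_channel(sample)
    # 'show channels' returned nothing, so the active list is empty.
    print(info["Owner"], "stuck:", looks_stuck(info, active_channels=[]))
```

Run against the output above with an empty active list, this would flag Zap/2-1 as stuck.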



W1zz
Posts: 562
Member Since:
2006-05-31
Was this a fax call by chance?

--

Alan



gvv
Posts: 29
Member Since:
2008-01-31
No fax

Hi Alan,

I've checked this, and it wasn't a fax call, it was an international call to +35



W1zz
Posts: 562
Member Since:
2006-05-31
So is there anything different in what comes from your International switches?

Is it a consistent behavior from your ISC's?

--

Alan



gvv
Posts: 29
Member Since:
2008-01-31
Not only international

Hi Alan,

It's not only international numbers, and I don't treat them differently. For now the trixbox is only connected to POTS via the Junghanns card, and there's one dialplan that just forwards calls through to our SIP phones.

At this moment I'm installing a clean ISO 2.6.0.5 and I was wondering if I should go for another zaptel/asterisk combo than the *1.4.17.
Can you advise?

At this site they patched the bristuff to a new version:
http://updates.xorcom.com/astribank/bristuff/1.4/

0.4.0-test6-xr2
- Zaptel 1.4.9.2.xpp.r5566
- Asterisk 1.4.18.1
- Addons 1.4.6
- Zaptel drivers moved to kernel/: adapted zaptel patchs.
- libpri patch: Reset T303 timer on ALERTING and PROGRESS response
to setup.
- Disabled ast-send-message and its users due to its API changes.

This is with asterisk 1.4.18.1 and a newer zaptel.
Would it make sense to try this, or am I looking in the wrong direction, and should 1.4.17 be fine as well?

Also I've upgraded the 2.4 to 2.6 in the past and see some issues with this also.

Thanks a zillion!



gvv
Posts: 29
Member Since:
2008-01-31
It appears to be an asterisk bug

I've reinstalled my server with a new trixbox 2.6.0.5 and also tried a new bristuffed version of asterisk 1.4.18. Still same issue.

There are actually 2 major issues:

- Channels keep themselves open sometimes
- Asterisk hangs completely and no calls are possible

Before Asterisk hangs, show channels returns nothing at all. This happens with both Asterisk 1.4.17 and 1.4.18. A few minutes later Asterisk hangs and no calls are possible.
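
Editor's sketch, not from the original post: since an unresponsive show channels seems to precede the hang by a few minutes, a small watchdog can probe the CLI with a timeout and raise an alarm early. The function name cli_responds and the 10-second timeout are arbitrary choices for illustration; what to do on failure (alert, restart) is left open.

```python
import subprocess

def cli_responds(command, timeout=10.0):
    """Return True if the command finishes successfully within 'timeout' seconds."""
    try:
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # the child is killed if it takes longer
            check=True,       # a non-zero exit status counts as a failure
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return False
    return True

if __name__ == "__main__":
    if cli_responds(["asterisk", "-rx", "show channels"]):
        print("Asterisk CLI answered")
    else:
        print("Asterisk CLI did not answer in time; a hang may be coming")
```

Run from cron every minute or so, this would at least give a warning before calls stop working.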

My problem is described in this bug:
http://bugs.digium.com/view.php?id=11712

Especially this post is giving me hopes:
http://bugs.digium.com/view.php?id=11712#83963

There is a patch attached that would resolve the issue with this but as I'm rather new to trixbox/asterisk I would rather ask for some advice here...

the solution is given here:
http://bugs.digium.com/view.php?id=11712#83974

But I don't know what I have to do with this. Can someone please advise whether this could also be applied to a bristuffed version of Asterisk, and how to patch it?
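
Editor's sketch, not from the original post: a patch attached to a bug report is normally applied to the Asterisk source tree with the patch utility, after which Asterisk has to be rebuilt and reinstalled. The code below only wraps a dry run of GNU patch to find out whether the diff applies cleanly and at which -p strip level. The source directory and patch file name are hypothetical placeholders, and on a bristuffed tree some hunks may not apply because bristuff modifies the same files.

```python
import os
import subprocess

def patch_args(source_dir, patch_file, strip=0, dry_run=True):
    """Build a GNU patch command line; the patch path is made absolute because -d changes directory first."""
    args = ["patch", "-d", source_dir, f"-p{strip}", "-i", os.path.abspath(patch_file)]
    if dry_run:
        args.append("--dry-run")  # test only, leave the source tree untouched
    return args

def first_clean_strip_level(source_dir, patch_file, levels=(0, 1), run=subprocess.run):
    """Return the first -p level at which a dry run succeeds, or None if none does."""
    for strip in levels:
        result = run(patch_args(source_dir, patch_file, strip), capture_output=True)
        if result.returncode == 0:
            return strip
    return None

if __name__ == "__main__":
    # Both paths are placeholders, not the real locations on a trixbox system.
    src = "/usr/src/asterisk"
    diff = "/tmp/issue.diff"
    level = first_clean_strip_level(src, diff)
    if level is None:
        print("Patch does not apply cleanly; do not apply it blindly")
    else:
        print("Would apply with:", " ".join(patch_args(src, diff, level, dry_run=False)))
```

If the dry run fails on a bristuffed tree, that already answers the question of whether the fix can be dropped in as-is.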

I have these problems daily and it's only for a 7 user environment. I have to restart asterisk at least 2/3 times a day.

I also read that this problem doesn't occur with Asterisk 1.2. Is it an option to downgrade? And can this be done with trixbox 2.6, or do I need 2.2 for that?

Thanks again



W1zz
Posts: 562
Member Since:
2006-05-31
With the problems you are describing it might be easier to debug if you do a manual build of just asterisk, freepbx, BRIstuff. If this is too daunting then earlier versions of TB would be a way to go.

--

Alan



gvv
Posts: 29
Member Since:
2008-01-31
Probably going back to 2.2 is better for me

Hello Alan,

My first experience with Linux / Asterisk / trixbox was in December 2007, so I would not call myself an expert. I've read tons of forums, tutorials, etc., but this is probably a bridge too far for me, especially when I want it to run in production.

If I go back to TrixBox 2.2 with Asterisk, what functionality would I be missing? Is this described somewhere?
Also, does zaptel work well in 2.2?
Can I update via the package manager / YUM, or should I stay away from this because it would update me back to trixbox 2.6?

Thanks!



SchlingBlade
Posts: 114
Member Since:
2007-11-29
This appears to be the same issue I am experiencing on a clean 2.6.0.7 install with about 50 users. Rebooting the box once a day seems to be helping, but it doesn't eliminate the issue.



gvv
Posts: 29
Member Since:
2008-01-31
Keep it or downgrade

Hi SchlingBlade, are you keeping it like this, or are you also considering a downgrade to 2.2?



SchlingBlade
Posts: 114
Member Since:
2007-11-29
I like the features of 2.6, so I'm going to see if I can make it work.

I'm reading up on the changelog for Asterisk 1.4.19, which was released on April 2. There may be some fixes in there.



gvv
Posts: 29
Member Since:
2008-01-31
Did you read my bug findings

In my post, 3 or 4 posts up, I referenced the bug report. It seems rather fresh, so I wouldn't expect this to be fixed in 1.4.19.

Do you use it in combination with BRISTUFF / zaptel?



SchlingBlade
Posts: 114
Member Since:
2007-11-29
http://bugs.digium.com/view.php?id=12307

http://bugs.digium.com/view.php?id=11712

Looks like some of the fixes they implemented to fix locking still had issues. This has been addressed again in Asterisk 1.4.19.



SchlingBlade
Posts: 114
Member Since:
2007-11-29
2008-02-25 23:42 +0000 [r104102-104106] Russell Bryant

* apps/app_chanspy.c: This patch fixes some pretty significant
  problems with how app_chanspy handles pointers to channels that
  are being spied upon. It was very likely that a crash would occur
  if the channel being spied upon hung up. This was because the
  current ast_channel handling _requires_ that the object is locked
  or else it could disappear at any time (except in the owning
  channel thread). So, this patch uses some channel datastore magic
  on the spied upon channel to be able to detect if and when the
  channel goes away. (closes issue #11877) (patch written by me,
  but thanks to kpfleming for the idea, and to file for review)

* main/utils.c: Improve the lock tracking code a bit so that a
  bunch of old locks that threads failed to lock don't sit around
  in the history. When a lock is first locked, this checks to see
  if the last lock in the list was one that was failed to be
  locked. If it is, then that was a lock that we're no longer
  sitting in a trylock loop trying to lock, so just remove it.
  (inspired by issue #11712)

And yes, this system has a full PRI attached via a Sangoma A101 card. I was using a Digium TE210P originally, and replaced it in attempts to isolate the problem. The issue remains no matter what hardware I've tried so far.



gvv
Posts: 29
Member Since:
2008-01-31
Stuck with my own lack of knowledge

As I'm rather new to linux / trixbox I wouldn't even dream right now about building my own patched asterisk 1.4.19 with BRISTUFF, so I guess it's better for me to go back to trixbox 2.2

Do you know what I'm going to miss in regards to functionality?



SchlingBlade
Posts: 114
Member Since:
2007-11-29
If you can survive with the current config, it might be worth waiting to see if Trixbox will come out with a new package for asterisk based off of 1.4.19.

With 2.2 you go backwards with FreePBX. I've never run 2.2 in production, so I'm not too familiar with it. I know it uses Asterisk 1.2, and older software in general.



gvv
Posts: 29
Member Since:
2008-01-31
Also need zaptel/bristuff

For my QuadBRI I'm also bound to the manufacturer creating a new bristuffed version of asterisk. As I understand it, it's a custom patched version of Asterisk, so it would not allow me to upgrade if I don't have a new patched version.

For now I'm working with the bristuffed version from here:
http://updates.xorcom.com/astribank/bristuff/1.4/

This has just recently been patched to v1.4.18, but install-ZAPHFC and Junghanns are both still on v1.4.17.



an3s
Posts: 1
Member Since:
2008-02-08
Having the same problem here

I've been battling this same problem for months now: Trixbox 2.6.1.1 with Junghanns QuadBRI and Bristuff on a 6-line PTP KPN connection (20 users, about 400 calls a day). There's another 2-line PTMP KPN connection on the 4th BRI interface, but that is functioning just fine.

Channels on the PTP lines are randomly NOT hung up several times a day. When these channels are not closed, the web interface becomes very slow and unresponsive, but CPU load stays down. I haven't had any crashes that I can relate to this issue. No relevant information in the debug logs.

Recently I rebuilt Asterisk to version 1.4.21.2 with Bristuff 0.4.0-RC3c, but the problem remains.

I wonder if the OP ever got his problem solved. This one is freaking me out.

Gonna rebuild soon on another machine with mISDN to test if this will eliminate the problem. If so, it might just be another bristuff bug.


