[Unbound-users] unbound-1.5.0 crash(arc4random) on rhel7
Jarno Huuskonen
2014-11-20 18:18:31 UTC
Hi,

I just started testing unbound-1.5.0 on rhel7 and I'm getting random
crashes. A couple of times unbound crashed while running under gdb
(run -d); each time it crashed in arc4random:
#0 arc4random () at compat/arc4random.c:220
#1 0x00007f5261e39075 in arc4random_uniform (upper_bound=29768)
   at compat/arc4random_uniform.c:51
#2 0x00007f5261e397a3 in ub_random_max (x=<optimized out>,
   state=<optimized out>) at util/random.c:110
...

compile options:
%configure --with-libevent --with-pthreads --with-ssl \
--disable-rpath --enable-debug --disable-static \
--with-conf-file=%{_sysconfdir}/%{name}/unbound.conf \
--with-pidfile=%{_localstatedir}/run/%{name}/%{name}.pid \
--with-rootkey-file=%{_sharedstatedir}/unbound/root.key \
--enable-sha2 --disable-gost

unbound.conf has: num-threads: 2

The crash usually happens when I use a simple test ruby script that
sends queries (with a few hundred threads).

Any ideas how I could debug this further? Or can I provide
more information so that somebody else could look into this?

-Jarno
--
Jarno Huuskonen
W.C.A. Wijngaards
2014-11-21 08:23:41 UTC
Hi Jarno,
Post by Jarno Huuskonen
Hi,
I just started testing unbound-1.5.0 on rhel7 and I'm getting random
crashes. A couple of times unbound crashed while running under gdb
(run -d); each time it crashed in arc4random:
#0 arc4random () at compat/arc4random.c:220
#1 0x00007f5261e39075 in arc4random_uniform (upper_bound=29768)
   at compat/arc4random_uniform.c:51
#2 0x00007f5261e397a3 in ub_random_max (x=<optimized out>,
   state=<optimized out>) at util/random.c:110
...
compile options:
%configure --with-libevent --with-pthreads --with-ssl \
--disable-rpath --enable-debug --disable-static \
--with-conf-file=%{_sysconfdir}/%{name}/unbound.conf \
--with-pidfile=%{_localstatedir}/run/%{name}/%{name}.pid \
--with-rootkey-file=%{_sharedstatedir}/unbound/root.key \
--enable-sha2 --disable-gost
unbound.conf has: num-threads: 2
The crash usually happens when I use a simple test ruby script that
sends queries (with a few hundred threads).
There is a race condition in the new arc4random fallback code.
Post by Jarno Huuskonen
Any ideas how I could debug this further ? Or can I provide more
information if somebody else could look into this ?
This is the bugfix for it:

Index: compat/arc4_lock.c
===================================================================
--- compat/arc4_lock.c (revision 3276)
+++ compat/arc4_lock.c (working copy)
@@ -53,8 +53,10 @@
 
 void _ARC4_LOCK(void)
 {
-    if(!arc4lockinit)
+    if(!arc4lockinit) {
+        arc4lockinit = 1;
         lock_quick_init(&arc4lock);
+    }
     lock_quick_lock(&arc4lock);
 }
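
For readers following the thread: as far as the hunk shows, the unpatched function never set arc4lockinit, so a thread could re-run lock_quick_init on a lock that another thread was already holding. A general way to rule out this kind of race is to let the threading library guarantee one-time initialization. The sketch below illustrates that idea with POSIX pthread_once. It is not Unbound's code and not the patch above; every demo_* name is hypothetical, and pthread_mutex_t is only assumed as a stand-in for the lock type behind lock_quick_init.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for illustration; not Unbound's real symbols. */
static pthread_mutex_t demo_lock;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;
static int demo_init_calls = 0;  /* how many times the lock was initialized */
static long demo_counter = 0;    /* shared state protected by demo_lock */

enum { DEMO_THREADS = 8, DEMO_ITERS = 10000 };

/* pthread_once runs this exactly once, however many threads race here. */
static void demo_lock_init(void)
{
    pthread_mutex_init(&demo_lock, NULL);
    demo_init_calls++;
}

/* Lazily initialize the lock (safely), then take it. */
static void demo_arc4_lock(void)
{
    pthread_once(&demo_once, demo_lock_init);
    pthread_mutex_lock(&demo_lock);
}

static void demo_arc4_unlock(void)
{
    pthread_mutex_unlock(&demo_lock);
}

/* Each worker increments the shared counter under the lock. */
static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < DEMO_ITERS; i++) {
        demo_arc4_lock();
        demo_counter++;
        demo_arc4_unlock();
    }
    return NULL;
}

/* Start the workers, wait for all of them, return the final counter. */
static long demo_run(void)
{
    pthread_t tids[DEMO_THREADS];
    for (int i = 0; i < DEMO_THREADS; i++) {
        int rc = pthread_create(&tids[i], NULL, demo_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < DEMO_THREADS; i++)
        pthread_join(tids[i], NULL);
    return demo_counter;
}
```

Calling demo_run() once from main (built with -pthread) should leave the counter at DEMO_THREADS * DEMO_ITERS with the lock initialized a single time. Where lazy initialization is not needed, a statically initialized mutex (PTHREAD_MUTEX_INITIALIZER) would avoid the init flag altogether.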


Best regards,
Wouter
Jarno Huuskonen
2014-11-21 12:17:37 UTC
Hi Wouter,

Thanks, that was fast :)
Post by W.C.A. Wijngaards
There is a race condition in the new arc4random fallback code.
Is this fallback used when the system (g)libc doesn't have arc4random?
Post by W.C.A. Wijngaards
Post by Jarno Huuskonen
Any ideas how I could debug this further ? Or can I provide more
information if somebody else could look into this ?
...

Looks good, I ran my test scripts for about 15mins w/out crashes.

-Jarno
--
Jarno Huuskonen
W.C.A. Wijngaards
2014-11-21 12:35:54 UTC
Hi Jarno,
Post by Jarno Huuskonen
Hi Wouter,
Thanks, that was fast :)
Post by W.C.A. Wijngaards
There is a race condition in the new arc4random fallback code.
Is this fallback used when the system (g)libc doesn't have
arc4random ?
Yes, to get good random numbers, and it was changed in the 1.5.0 release.
Post by Jarno Huuskonen
Post by W.C.A. Wijngaards
Post by Jarno Huuskonen
Any ideas how I could debug this further ? Or can I provide
more information if somebody else could look into this ?
...
Looks good, I ran my test scripts for about 15mins w/out crashes.
Thank you!

Best regards,
Wouter
