Discussion:
[Unbound-users] AAAA filter patch proposal
Stephane Lapie
2014-11-13 03:08:10 UTC
Hello,

There was a post several days ago about AAAA filter, and questions about
an implementation as a Python module by Christophe.

https://unbound.net/pipermail/unbound-users/2014-October/003579.html

I work at the same company as him (a Japanese ISP; all of them are
subject to NTT's flaky IPv6 practices), and have been working with him
on this issue.

To sum up the scenario we are trying to fix:
- Customers in Japan have a physical carrier (NTT in most cases), on top
of which they get their own Internet provider, to which they connect via
PPPoE. In our case, we currently provide IPv4 service only.
- Some customers also get an on-demand video service from their carrier,
NTT, which gives them a default IPv6 route via IPoE, or they raise a
second PPPoE session (NTT's usage terms allow up to 2 sessions for
this scenario).
- The catch: said IPv6 default route does not lead out to the Internet,
and only enables access to NTT's closed network. Therefore, a customer
in this configuration trying to access any IPv6 site is in for a world
of pain as their browser times out and retries, hopefully falling back
to IPv4. This prompted every Internet service provider in Japan to
either provide native IPv6 or to filter AAAA records for sites that are
not IPv6-only, which in practice pretty much means everyone uses BIND.
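
The filtering rule described above can be sketched in a few lines of Python. This is only an illustration of the policy, not code from any resolver: the Record type, the filter_aaaa() name, and the has_a parameter are hypothetical, and the addresses are documentation-range examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    rtype: str  # "A", "AAAA", ...
    data: str


def filter_aaaa(qtype, answers, has_a):
    """Return the records a filtering resolver would hand to the client.

    qtype   -- query type asked by the client ("A", "AAAA", ...)
    answers -- list of Record objects found in the upstream reply
    has_a   -- True if the queried name is known to have an A record
    """
    if qtype == "AAAA" and not has_a:
        # IPv6-only name: keep AAAA, otherwise the site becomes unreachable.
        return list(answers)
    # Every other case: strip AAAA so clients never try the dead IPv6 route.
    return [r for r in answers if r.rtype != "AAAA"]


if __name__ == "__main__":
    reply = [Record("www.example.com.", "AAAA", "2001:db8::1")]
    print(filter_aaaa("AAAA", reply, has_a=True))   # dual-stack name
    print(filter_aaaa("AAAA", reply, has_a=False))  # IPv6-only name
```

The point of the has_a argument is that the decision for an AAAA query cannot be made from the AAAA reply alone: the resolver must already know whether an A record exists for the same name.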


To answer Bill Manning's earlier statement: we cannot change providers,
first because we ARE a provider, but also because even if we wanted to
change carriers, everyone in Japan is entirely dependent on the dominant
physical carrier, i.e. NTT. Since the problem happens at a lower layer
than the ones we have control over (we work at the PPPoE-encapsulated
level, they work at the Ethernet level), we have no control whatsoever
over said carrier-provided route, short of providing IPv6 service
ourselves, either over our PPPoE with an overriding route, or via IPoE
(and then telling NTT to buzz off and providing our own route, if we had
one). This is obviously scheduled as the proper solution, but it
requires overhauling all of the network infrastructure, which cannot be
done instantly.

Also, thanks a lot to Daisuke Higashi for his statement. Using
"private-address/private-domain" is what we initially planned on doing,
but this gets scary when we think about "what if NTT springs yet
another domain on us that we need to allow access to?" or
"what if another customer tries to access yet another IPv6-only site in
the future?", and the "whack-a-mole" administration nightmare it might
become.



In the meantime, we still need to get rid of BIND, which can't handle
the resource exhaustion DDoS attacks
(http://dnsamplificationattacks.blogspot.jp/2014/02/authoritative-name-server-attack.html)
we have been seeing since February of this year. This is where we wanted
to go with unbound, but since it does not have AAAA-filter
functionality, we could not use it in production for most of our
customers.

This is where Christophe attempted to intercept queries with Python,
and we found out that:
- The Python API does not allow spawning subqueries (for each AAAA query,
the relevant A record has to exist in cache for the AAAA filter logic to
work).
- The Python API does not allow looking up the RRset cache for a given
record (in this case, an A record matching the queried name).
- The Python API does not allow easy scrubbing of packets (it IS
possible, but very painful).



I therefore came to the conclusion that the Python API was not
appropriate to do these things, and that the most appropriate place to
implement a filtering/scrubbing logic was the iterator module itself.

I coded the following patch (also attached):
http://www.yomi.darkbsd.org/~darksoul/aaaa-filter-iterator.patch
It has been under test and running in pre-production for roughly a month
now (while undergoing some tuning as I came to understand how
the state machine works).

The patch provides:
- an "aaaa-filter" config option, which is off by default so as not to be
intrusive (I am fully aware this functionality is enough of an
abomination as is). It can also be used in conjunction with
private-address/private-domain without any issues.
- the relevant manual entry
- modifications to iterator/iter_scrub.c in scrub_sanitize() to remove
AAAA records for queries that either are not of AAAA type, OR that did
return an A record, IF cfg->aaaa_filter is enabled.
- modifications to iterator/iter_utils.c to provide AAAA filter "on/off"
info to the iterator environment.
- modifications to iterator/iterator.c:
-- a new ASN_FETCH_A_FOR_AAAA_STATE, which we branch into from
QUERYTARGETS_STATE if this is an AAAA query (modifies iter_handle(),
iter_state_to_string(), iter_state_is_responsestate())
-- an asn_processQueryAAAA() function that spawns a subquery and flags
the parent query as "already fetching an A subquery" so as not to loop
-- modifications to iter_inform_super() to handle the new state for AAAA
parent queries having an A subquery
-- an asn_processAAAAResponse() function that basically takes after what
error_supers() and processTargetResponse() do, except that it does not
alter the target query counters
- modifications to iterator/iterator.h: declaration of new flags for
iter_env (configuration option) and iter_qstate (status flag), and the
new iter_state
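
For readers who do not want to read the C patch, the control flow listed above can be modeled roughly as follows. This is a toy Python model and not the patch itself: the State enum, the Query class, the resolve() function, and the zone dictionary are all hypothetical stand-ins, and the real iterator handles events, caching, and errors that are ignored here. Only the flag name fetch_a_for_aaaa and the idea of a dedicated "fetch A for AAAA" state are taken from the description.

```python
from enum import Enum, auto


class State(Enum):
    QUERYTARGETS = auto()
    FETCH_A_FOR_AAAA = auto()  # stands in for ASN_FETCH_A_FOR_AAAA_STATE
    FINISHED = auto()


class Query:
    def __init__(self, name, qtype):
        self.name = name
        self.qtype = qtype
        self.state = State.QUERYTARGETS
        self.fetch_a_for_aaaa = False  # set on the parent once the A subquery is sent
        self.a_found = False
        self.answer = []


def resolve(query, zone, aaaa_filter=True):
    """Drive one query through the toy state machine.

    zone maps (name, rtype) -> list of address strings and stands in for
    both the cache and the upstream servers. Returns the number of A
    subqueries that were spawned.
    """
    subqueries = 0
    while query.state is not State.FINISHED:
        if query.state is State.QUERYTARGETS:
            if (aaaa_filter and query.qtype == "AAAA"
                    and not query.fetch_a_for_aaaa):
                # Branch out once; the flag keeps us from looping back here.
                query.fetch_a_for_aaaa = True
                query.state = State.FETCH_A_FOR_AAAA
            else:
                query.answer = list(zone.get((query.name, query.qtype), []))
                if aaaa_filter and query.qtype == "AAAA" and query.a_found:
                    query.answer = []  # scrub: the name is reachable over IPv4
                query.state = State.FINISHED
        elif query.state is State.FETCH_A_FOR_AAAA:
            # The "subquery": look up the A record for the same name.
            subqueries += 1
            query.a_found = bool(zone.get((query.name, "A")))
            query.state = State.QUERYTARGETS  # hand control back to the parent
    return subqueries


if __name__ == "__main__":
    zone = {
        ("dual.example.", "A"): ["192.0.2.1"],
        ("dual.example.", "AAAA"): ["2001:db8::1"],
        ("v6only.example.", "AAAA"): ["2001:db8::2"],
    }
    for name in ("dual.example.", "v6only.example."):
        q = Query(name, "AAAA")
        spawned = resolve(q, zone)
        print(name, "subqueries:", spawned, "answer:", q.answer)
```

The flag on the parent query is what makes the detour happen exactly once: when control returns to the query-targets step, the flag is already set, so the AAAA lookup proceeds normally and the scrub decision is made from the result of the A subquery.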



At this point, the patch is pretty much stable and performing as
expected, but I am looking for pointers on what I could improve in
the patch, especially style-wise, so that it stays applicable for as
long as possible. In its current state, I can apply it to versions up to
1.4.22.

I also know from previous postings that unbound development staff's
opinion is that this functionality as a whole would harm IPv6 adoption,
and therefore can probably not be officially endorsed, but I still
intend to provide it freely (my company has given approval) to people
suffering from the same scenario. (that is, mainly Japanese users at
this point...)

Thanks for your time,
--
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo
W.C.A. Wijngaards
2014-11-13 09:10:51 UTC
Hi Stephane,
The patch looks to have nice clean code.

If you are looking for feedback on the code, this is what I can find:
iterator.h, comment for fetch_a_for_aaaa is misleading: it should say
that a subquery has been made for fetching A records. It now reads as if
the flag is set in the subquery, but it is set in the superquery (to
avoid asking twice).

iterator.c, asn_processAAAAResponse: this routine can be shortened, I
think. After the lines that change super_iq->state and call
log_query_info, it can simply return. However, the current code does not
fail either; returning early might just be more 'optimal' and save the
state machine some work.

Thank you for publishing the patch. Are you all right if I put this
patch in the source contrib/ directory to make it more easily
available to the users?

We don't provide support for the contrib material, but it may be
useful for users in weird circumstances.

Best regards,
Wouter
Stephane LAPIE
2014-11-13 10:30:42 UTC
Hello Wouter,

Many thanks for the quick reply and review.

Right, that comment in iterator.h is a remnant of something I initially
planned on doing before changing my mind along the way.
I just fixed it.

As for processAAAAResponse, I wrote it this way to get a verbose trace
of what is going on should it fail (based mainly on how the other
handler functions operate).
Actually, thinking back, the name of this function is probably not the
best...

Of course, I am totally fine with contributing this patch.
That said, I was wondering whether there is a show-stopper to integrating
it into the main code, since I provide an option to turn this behavior on
or off, and it is designed so as not to affect the default behavior. (Of
course, this would add one config/environment flag check per query at
execution.)

Now, I do understand this is a feature you are not exactly keen on in
the first place, let alone on providing support for it.
However, if I can further brush up the code to make it seaworthy for the
main repository, I am fine with putting in the effort.

I understand a lot of Japanese users would be extremely thankful for
easy availability of this feature via their favorite distribution,
instead of manual building/packaging.
(Of course, there would remain the option of applying the contrib/ patch
at distro packaging level, like Debian and *BSD do, but this would
multiply efforts)

Cheers,
--
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo