Are You Being Naughty?

October 14, 2013

I have an AIX server—on the Internet—and I have been naughty! Shame on me!

My intent is that this server is “open” just enough that “random” activity looking for servers to breach does not take it down. I say “random” because I doubt my ISP would be happy if I were the target of directed or sustained attacks. So, I try not to be too inviting.

So, how have I been naughty? By being complacent about excessive processor usage by the named9 process. What was my bad behavior? Noticing, but choosing to ignore, this “cry for help.” Naughty me.

What should I have done?

In an open environment, by definition, eye-catching activity should raise my attention (curiosity) enough that I look for a root cause. Finally, after weeks of complacency, a different unusual activity prodded me into action.

At the start of a system maintenance cycle, I changed the default route from the external interface to the internal interface, effectively removing the server’s active, open connection to the Internet. However, after this change I continued to see a high volume of incoming packets on my Internet-facing interface. My expectation was that the interface would quickly go idle, but it did not. Finally, alarm bells went off in my head.
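
For reference, this kind of runtime route switch can be done on AIX along these lines (a sketch only; the internal gateway address below is purely hypothetical, not my actual configuration):

route delete default              # drop the current default route (pointing at the external interface)
route add default 10.0.0.1        # hypothetical gateway reachable via the internal interface
netstat -rn | grep -w default     # verify where the default route now points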

How could this be happening?!

The interface facing the “open” side (the NAT router) was still active, so incoming packets kept arriving. What was missing on my side was a route for responding to those packets. Complacency aside, and finally awake, I started tcpdump and saw numerous incoming UDP packets directed at my named service.

00:00:00.000000 IP XXX.4.173.222.54122 > XXX.168.2.1.53:  10754+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000584 IP XXX.168.2.1.53 > XXX.4.173.222.54122:  10754- 0/13/16 (529)
00:00:00.020531 IP XXX.210.112.24.8865 > XXX.168.2.1.53:  26644+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000385 IP XXX.168.2.1.53 > XXX.210.112.24.8865:  26644- 0/13/16 (529)
00:00:00.005970 IP XXX.210.112.24.9471 > XXX.168.2.1.53:  11736+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000361 IP XXX.168.2.1.53 > XXX.210.112.24.9471:  11736- 0/13/16 (529)
00:00:00.027968 IP XXX.4.173.222.29306 > XXX.168.2.1.53:  22637+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000422 IP XXX.168.2.1.53 > XXX.4.173.222.29306:  22637- 0/13/16 (529)
00:00:00.000650 IP XXX.210.112.24.50321 > XXX.168.2.1.53:  48240+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000361 IP XXX.168.2.1.53 > XXX.210.112.24.50321:  48240- 0/13/16 (529)
00:00:00.088836 IP XXX.4.173.222.11542 > XXX.168.2.1.53:  50151+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000381 IP XXX.168.2.1.53 > XXX.4.173.222.11542:  50151- 0/13/16 (529)
00:00:00.003490 IP XXX.4.173.222.19393 > XXX.168.2.1.53:  8824+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000142 IP XXX.4.173.222.28585 > XXX.168.2.1.53:  20359+ [1au] A (QM)? fkfkfkfa.com. (41)

Feeling Guilty …

I realized I should have suspected something long ago, as I had wondered why the named process was frequently the largest consumer of CPU cycles. Now I had a possible explanation for this unexpected load. Fortunately, I did not have to pay dearly for my lack of attention: the load was neither extreme nor damaging to what the server was supposed to be doing. However, I realize this could have been traumatic.

What was the root cause of failure here?

I made an assumption and accepted the unusual behavior rather than investigating it. Another way to describe this: I was being lazy. This is my wake-up call—“Do as I preach!” (FYI: I had accepted the high CPU usage as a result of queries being made by tests I was running on other servers. Now, in hindsight, I realize I was being naïve.)

Taking Action

As this is a well-defined situation (protocol UDP, port number 53), I used smit to create an ipsec filter rule that would “shun” any host that tried to query my server using UDP. In other words, dynamically block any host (IP address) that sends a packet to UDP port 53 incoming on the en0 interface (address XXX.168.2.1). For now, I’m leaving the TCP port open. I may need to add a rule to block TCP requests that originate “outside” while permitting TCP queries originating on my server to continue.

The effective command created is:

/usr/sbin/genfilt -v 4 -a 'H' \
  -s '0.0.0.0' -m '0.0.0.0' -d '192.168.2.1' -M '255.255.255.255' \
  -c 'udp' -o 'any' -O 'eq' -P '53' -r 'B' -w 'I' -l 'N' -t '0' -i 'en0'

Explanation:

Keyword                                  Description
genfilt                                  generate a filter rule
-v 4                                     IP Version 4
-a 'H'                                   Action: shun host
-s '0.0.0.0' -m '0.0.0.0'                Any source address; -m 0.0.0.0 matches any source address because the match pattern is 0 bits
-d '192.168.2.1' -M '255.255.255.255'    Only this destination address, because -M 255.255.255.255 is a 32-bit match
-c 'udp'                                 Protocol UDP
-o 'any'                                 Any source port number
-O 'eq' -P '53'                          Destination port number equals 53
-r 'B'                                   Apply to both forwarded (routed) packets and packets destined for or originating from the local host
-w 'I'                                   Direction: incoming
-l 'N'                                   No logging
-t '0'                                   Not associated with a tunnel (tunnel ID 0)
-i 'en0'                                 Interface en0
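
A new genfilt rule lands in the ODM but is not necessarily active yet; as mentioned again near the end of this post, I activate the rule changes with mkfilt. Roughly:

mkfilt -v4 -u     # activate the updated IPv4 filter rules stored in the ODM
lsfilt -v4 -O     # list the rules to confirm the new entry is present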

Immediate Results

About 15 minutes after first activating the packet filter, I had 60 dynamically generated deny-host rules similar to those shown below. When I checked a day later, the count was down to about 30. When I started writing this blog post (after about three days running), the count was down to one.
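
A quick way to get that count is a one-liner built on the same lsfilt command shown further below:

lsfilt -v4 -O -a | grep -c deny    # count the dynamically generated deny rules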

Since I had not saved any data, I was afraid I would need some time to gather new data for my examples. I need not have feared. The moment I turned ipsec off, activity jumped as if someone was watching. One positive denial (a response) was enough to wake up others. The result: I immediately had the data I needed to write this blog post. Remember, only one IP address was still being actively blocked, but within seconds there were three systems making DNS requests (see the lsfilt output below). And as I finish the post, the number of systems continues to grow even though no more replies are being sent; it is currently holding steady between five and nine dynamic deny rules. My guess is that these systems share information with one another, and once one reports success, more start trying.

The initial collection was directly to the screen (CLI) using:


michael@x054:[/home/michael]tcpdump -i en0

When I saw the predominant activity on UDP port 53, I started collecting data to /tmp/named.tcpdump with the following command:

michael@x054:[/home/michael]tcpdump -U -w /tmp/named.tcpdump -i en0 -n -ttt port 53 &

With the data stored, I could experiment with different flags to get a better understanding of my system’s behavior. I took a liking to the tcpdump -ttt option.

Word to the wise: Do not forget to kill tcpdump -w after you have collected enough data.
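
One way to do that, assuming the capture was started as a background job in the current shell (the job number here is hypothetical):

kill %1                                                          # stop the backgrounded tcpdump job
# or, if the job number is unknown, find the process by its command line:
ps -ef | grep '[t]cpdump -U -w' | awk '{print $2}' | xargs kill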

michael@x054:[/home/michael]tcpdump -r /tmp/named.tcpdump -i en0 -n -ttt port 53 |\
   sed -e "s/ [0-9]*\./ XXX./g" | head -15

reading from file /tmp/named.tcpdump, link-type 1
00:00:00.000000 IP XXX.4.173.222.54122 > XXX.168.2.1.53:  10754+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000584 IP XXX.168.2.1.53 > XXX.4.173.222.54122:  10754- 0/13/16 (529)
00:00:00.020531 IP XXX.210.112.24.8865 > XXX.168.2.1.53:  26644+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000385 IP XXX.168.2.1.53 > XXX.210.112.24.8865:  26644- 0/13/16 (529)
00:00:00.005970 IP XXX.210.112.24.9471 > XXX.168.2.1.53:  11736+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000361 IP XXX.168.2.1.53 > XXX.210.112.24.9471:  11736- 0/13/16 (529)
00:00:00.027968 IP XXX.4.173.222.29306 > XXX.168.2.1.53:  22637+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000422 IP XXX.168.2.1.53 > XXX.4.173.222.29306:  22637- 0/13/16 (529)
00:00:00.000650 IP XXX.210.112.24.50321 > XXX.168.2.1.53:  48240+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000361 IP XXX.168.2.1.53 > XXX.210.112.24.50321:  48240- 0/13/16 (529)
00:00:00.088836 IP XXX.4.173.222.11542 > XXX.168.2.1.53:  50151+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000381 IP XXX.168.2.1.53 > XXX.4.173.222.11542:  50151- 0/13/16 (529)
00:00:00.003490 IP XXX.4.173.222.19393 > XXX.168.2.1.53:  8824+ [1au] A (QM)? fkfkfkfa.com. (41)
00:00:00.000142 IP XXX.4.173.222.28585 > XXX.168.2.1.53:  20359+ [1au] A (QM)? fkfkfkfa.com. (41)

I wish I could give an accurate description of what this output means. Unfortunately, I’m not certain. From what I could find, I became concerned that these queries are meant to poison my DNS process. What jumps out is the repeated queries for the domain fkfkfkfa.com. What I am not showing (for brevity) is the sudden jump in activity from several locations within seconds of deactivating the filter!

Monitoring Continuing Activity

The command below gives me an indication of how busy the outside is by counting the number of deny filters the shun-host action is creating dynamically. Notice the -a flag, which the lsfilt command needs in order to show the dynamic, active filters:

michael@x054:[/home/michael]lsfilt -v4 -O -a | grep deny

2|deny|XXX.112.51.63|255.255.255.255|0.0.0.0|0.0.0.0|yes|all|any|0|any|0|both|inbound|no|all packets|0|en0|~314|||
3|deny|XXX.4.173.222|255.255.255.255|0.0.0.0|0.0.0.0|yes|all|any|0|any|0|both|inbound|no|all packets|0|en0|~184|||
4|deny|XXX.210.112.24|255.255.255.255|0.0.0.0|0.0.0.0|yes|all|any|0|any|0|both|inbound|no|all packets|0|en0|~184|||
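
To watch how that number changes over time, a trivial loop is enough (the five-minute interval is arbitrary):

while true
do
    printf "%s  active deny rules: %s\n" "$(date)" "$(lsfilt -v4 -O -a | grep -c deny)"
    sleep 300    # check every five minutes
done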

Although my prompt (michael@x054) may not look like it, I have been running all of these commands with an euid of 0 (aka superuser): tcpdump -U -w ..., genfilt ..., and (not shown here) mkfilt -v4 -u to activate the changes to the rules in the ODM. If you want to clear all active rules, the simplest way is to deactivate and then reactivate the ipsec_v4 device using these two commands:

rmdev -l ipsec_v4

mkdev -l ipsec_v4
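
Whether the rules defined in the ODM come back automatically after the mkdev may depend on your setup; I simply re-activate them explicitly, so the full clear-and-restart cycle looks roughly like this:

rmdev -l ipsec_v4    # deactivate IPsec v4; all active rules, including the dynamic deny rules, are removed
mkdev -l ipsec_v4    # make the ipsec_v4 device available again
mkfilt -v4 -u        # re-activate the filter rules defined in the ODM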

If you know enough to explain the meaning of the tcpdump example above, please share that knowledge in a comment, and maybe also suggest a book or a site where people like me can learn more!
