Updates
- 01-27ter: rp_filter information added
- 01-27bis: Note that the 9.04 box works on a different interface.
- 01-27: Added interface configuration information and analysis of a packet.
Original Post
I've got two extremely similar hardware configurations (SuperMicro 1U systems with dual Xeon CPUs and two Ethernet ports on board), one running Ubuntu 8.04 (Linux 2.6.24-26-server) and one running Ubuntu 9.04 (Linux 2.6.28-17-server). These both have eth1 connected to the same network on which various other servers are sending broadcast UDP packets to various ports. On both hosts, using tcpdump on eth1, I can see these broadcast UDP packets arriving.
However, while on the 8.04 box I can have a simple program listen to them just fine, on the 9.04 box an identical program never receives them. As a high-level overview, here is a sample Haskell program that works on one but not the other (using identical versions of GHC on both):
import Network.Socket
port = 5515
main :: IO ()
main = do
    sock <- socket AF_INET Datagram defaultProtocol
    bindSocket sock $ SockAddrInet (fromIntegral port) iNADDR_ANY
    loop sock
  where
    loop sock = do
        msg <- recv sock 2048
        print msg
        loop sock
In case the issue happened to be something very strange in GHC (though it's an identical build on both), I wrote a C program to do the same thing:
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#define BUFLEN 512
#define NPACK 10
#define PORT 5515
/* Print the failing call's error message and exit. */
void diep(const char *s)
{
    perror(s);
    exit(1);
}
/* Print the current value of SO_BROADCAST on socket s. */
void showb(int s)
{
    int val, retval;
    socklen_t len = sizeof(val);
    retval = getsockopt(s, SOL_SOCKET, SO_BROADCAST, &val, &len);
    printf("showb retval=%d val=%d\n", retval, val);
}
int main(int argc, char **argv)
{
    struct sockaddr_in si_me, si_other;
    int s, i;
    socklen_t slen;
    ssize_t n;
    char buf[BUFLEN];
    if ((s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1)
        diep("socket");
    showb(s);
    i = 1;
    if (setsockopt(s, SOL_SOCKET, SO_BROADCAST, &i, sizeof(i)) == -1)
        diep("setsockopt");
    showb(s);
    memset(&si_me, 0, sizeof(si_me));
    si_me.sin_family = AF_INET;
    si_me.sin_port = htons(PORT);
    si_me.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(s, (struct sockaddr *) &si_me, sizeof(si_me)) == -1)
        diep("bind");
    puts("Listening.");
    for (i = 0; i < NPACK; i++) {
        /* slen is a value-result argument, so reset it before each call. */
        slen = sizeof(si_other);
        /* Leave room for a terminating NUL so buf can be printed with %s. */
        n = recvfrom(s, buf, BUFLEN - 1, 0, (struct sockaddr *) &si_other, &slen);
        if (n == -1)
            diep("recvfrom()");
        buf[n] = '\0';
        printf("Received packet from %s:%d\nData: %s\n\n",
               inet_ntoa(si_other.sin_addr), ntohs(si_other.sin_port), buf);
    }
    close(s);
    return 0;
}
You'll note that, just for fun in this case, I'm also turning on the SO_BROADCAST flag on the socket, and confirming that it gets turned on, though it makes no difference to the behaviour of the program, which is the same. Even if I copy the binary built on 8.04 to the 9.04 box, or vice versa, in all cases the program running on the 8.04 box sees the UDP broadcast packets and the 9.04 box does not.
What am I doing wrong?
Update 01-27:
Here's the output of ip link and ip addr for the working (8.04) host:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1362 qdisc pfifo_fast qlen 1000
link/ether 00:30:48:d3:4b:06 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:30:48:d3:4b:07 brd ff:ff:ff:ff:ff:ff
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1362 qdisc pfifo_fast qlen 1000
link/ether 00:30:48:d3:4b:06 brd ff:ff:ff:ff:ff:ff
inet 192.168.228.130/28 brd 192.168.228.143 scope global eth0
inet6 fe80::230:48ff:fed3:4b06/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:30:48:d3:4b:07 brd ff:ff:ff:ff:ff:ff
inet 172.40.4.130/24 brd 172.40.4.255 scope global eth1
inet6 fe80::230:48ff:fed3:4b07/64 scope link
valid_lft forever preferred_lft forever
And for the non-working (9.04) server:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1362 qdisc pfifo_fast state UP qlen 1000
link/ether 00:30:48:d9:38:da brd ff:ff:ff:ff:ff:ff
3: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:1b:21:36:19:fd brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 100
link/ether 00:30:48:d9:38:db brd ff:ff:ff:ff:ff:ff
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1362 qdisc pfifo_fast state UP qlen 1000
link/ether 00:30:48:d9:38:da brd ff:ff:ff:ff:ff:ff
inet 192.168.228.132/28 brd 192.168.228.143 scope global eth0
inet6 fe80::230:48ff:fed9:38da/64 scope link
valid_lft forever preferred_lft forever
3: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:1b:21:36:19:fd brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 100
link/ether 00:30:48:d9:38:db brd ff:ff:ff:ff:ff:ff
inet 172.40.4.132/24 brd 172.40.4.255 scope global eth1
inet6 fe80::230:48ff:fed9:38db/64 scope link
valid_lft forever preferred_lft forever
Note that for both cases, eth1 is the port on which the broadcasts are arriving.
Here is a full decode (from tshark on the non-working 9.04 server) of a sample broadcast packet that the program is not receiving:
Frame 193555 (271 bytes on wire, 271 bytes captured)
Arrival Time: Jan 25, 2010 08:00:00.535345000
[Time delta from previous captured frame: 0.001508000 seconds]
[Time delta from previous displayed frame: 0.000000000 seconds]
[Time since reference or first frame: 6590.956186000 seconds]
Frame Number: 193555
Frame Length: 271 bytes
Capture Length: 271 bytes
[Frame is marked: False]
[Protocols in frame: eth:ip:udp:data]
Ethernet II, Src: Cisco_aa:c0:28 (00:d0:bb:aa:c0:28), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Destination: Broadcast (ff:ff:ff:ff:ff:ff)
Address: Broadcast (ff:ff:ff:ff:ff:ff)
.... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
.... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
Source: Cisco_aa:c0:28 (00:d0:bb:aa:c0:28)
Address: Cisco_aa:c0:28 (00:d0:bb:aa:c0:28)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
Type: IP (0x0800)
Internet Protocol, Src: 192.166.1.120 (192.166.1.120), Dst: 255.255.255.255 (255.255.255.255)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 257
Identification: 0xfad3 (64211)
Flags: 0x04 (Don't Fragment)
0... = Reserved bit: Not set
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 252
Protocol: UDP (0x11)
Header checksum: 0xc0f9 [correct]
[Good: True]
[Bad : False]
Source: 192.166.1.120 (192.166.1.120)
Destination: 255.255.255.255 (255.255.255.255)
User Datagram Protocol, Src Port: 56172 (56172), Dst Port: 5515 (5515)
Source port: 56172 (56172)
Destination port: 5515 (5515)
Length: 237
Checksum: 0x01ba [correct]
[Good Checksum: True]
[Bad Checksum: False]
Data (229 bytes)
0000 41 37 30 33 34 30 38 30 30 30 30 30 30 31 31 30 A703408000000110
0010 4b 52 53 50 49 4f 50 4b 32 49 4b 52 34 32 30 31 KRSPIOPK2IKR4201
0020 45 32 32 32 35 33 30 30 32 31 30 30 30 30 30 30 E222530021000000
0030 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
0040 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
0050 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
0060 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
0070 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
0080 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
0090 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
00a0 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
00b0 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
00c0 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
00d0 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 0000000000000000
00e0 30 30 30 30 ff 0000.
Data: 413730333430383030303030303131304B525350494F504B...
I've diffed this against the same packet from a dump taken on the working 8.04 server, and the packets themselves are identical; the only difference is in the frame number (within the pcap file) and the time the packet was received (1.224 milliseconds difference, which seems high given that the two hosts use the same NTP server, but not totally unreasonable).
Update 01-27bis
I've experimented further, generating my own broadcast packets on the 8.04 host and sending them to 9.04 host, and the 9.04 host receives the packets just fine when the 8.04 host sends them and they arrive on either eth0 or eth1.
Update 01-27ter
The output of sp 3; sysctl -a 2>/dev/null | grep '\.rp_filter' | sort
on the 8.04 host is:
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 0
net.ipv4.conf.eth1.rp_filter = 0
net.ipv4.conf.lo.rp_filter = 1
and on the 9.04 host is:
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth1.rp_filter = 1
net.ipv4.conf.eth2.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 0
So the issue is the net.ipv4.conf.eth1.rp_filter sysctl setting. With it set to 0 on the 8.04 box, reverse path checking is turned off for that interface, which means that a packet may come in from any source on any interface. On the 9.04 box, it is set to 1, which is strict checking: the kernel rejects packets arriving on an interface if replies to those packets would go out a different interface.
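To check the same setting from a program instead of from sysctl, the value can be read out of /proc. This is only a minimal sketch in C, assuming the usual mapping of the sysctl key net.ipv4.conf.<iface>.rp_filter to the file /proc/sys/net/ipv4/conf/<iface>/rp_filter; the function names read_int_file and read_rp_filter are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Read one integer from the start of a file.
   Returns -1 if the file cannot be opened or holds no integer. */
int read_int_file(const char *path)
{
    FILE *f;
    int value;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Look up rp_filter for one interface name such as "eth1".
   Assumption: the sysctl is exposed under /proc/sys/net/ipv4/conf/. */
int read_rp_filter(const char *iface)
{
    char path[256];
    int n;

    assert(iface != NULL);
    n = snprintf(path, sizeof(path),
                 "/proc/sys/net/ipv4/conf/%s/rp_filter", iface);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;
    return read_int_file(path);
}
```

Calling read_rp_filter("eth1") on each host should give the same numbers as the sysctl output above. Since the setting itself is never negative, -1 is free to serve as the error value.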
The packets arriving on eth1 addressed to 255.255.255.255 are ones that I should not be receiving if everything were functioning properly, because 255.255.255.255 is the local network broadcast address, and yet the source of those packets (192.166.1.120) is not on the local network (172.40.4.0/24). So essentially, something is misconfigured on the network where I receive a feed, and I have to deal with this misconfiguration.
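The "source is not on the local network" observation can be checked mechanically by masking both addresses with the interface's prefix length and comparing. The following is a rough sketch in C, not anything the kernel itself runs; the helper name in_subnet is hypothetical, and the addresses in the usage note are the ones from the ip and tshark output earlier in this post.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Return 1 if addr lies inside the network net/prefix_len, 0 if it does
   not, and -1 if either string is not a valid dotted-quad IPv4 address. */
int in_subnet(const char *addr, const char *net, int prefix_len)
{
    struct in_addr a, n;
    uint32_t mask;

    assert(prefix_len >= 0 && prefix_len <= 32);
    if (inet_pton(AF_INET, addr, &a) != 1 ||
        inet_pton(AF_INET, net, &n) != 1)
        return -1;
    /* Shifting a 32-bit value by 32 is undefined, so treat /0 separately. */
    mask = prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
    /* Compare the network parts in host byte order. */
    return (ntohl(a.s_addr) & mask) == (ntohl(n.s_addr) & mask);
}
```

With the eth1 address of the 9.04 host, in_subnet("192.166.1.120", "172.40.4.132", 24) returns 0, while the 8.04 host's address gives in_subnet("172.40.4.130", "172.40.4.132", 24) == 1. That matches the experiment above, where broadcasts sent from the 8.04 host were received without trouble.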