Has anyone out there done the work of sitting on top of a packet capture interface (like jpcap) with an implementation of UDPSocket
(for UDP datagrams) and InputStream
(for TCP streams)?
I suppose it wouldn'开发者_如何学Pythont be too hard to do given the callback API in jpcap, but has anyone out there already done it? Are there any issues with doing this (do I have to figure out how to reassemble a TCP stream myself, for example?)
I have not done this particular thing, but I do do a lot of work with parsing captured packets in C/C++. I don't know if there exist Java libraries for any of this.
Essentially, you need to work your way up the protocol stack, starting with IP. The pcap data starts with the link-level header, but I don't think there's much in it that you're concerned about, other than ignoring non-IP packets.
The trickiest thing with IP is reassembling fragmented datagrams. This is done using the More Fragments bit in the Flags field and the Fragment Offset field, combined with the Identification field to distinguish fragments from different datagrams Then you use the Protocol field to identify TCP and UDP packets, and the Header Length field to find the start of the corresponding header.
The next step, for both TCP and UDP, is demultiplexing, separating out the various connections in the captured packet stream. Both protocols identify connections (well, UDP doesn't have connections per se, but I don't have a better word handy) by the 4-tuple of the source and destination IP address and the source and destination port, so a connection would be a sequence of packets that matches on all 4 of these values.
Once that's done, for UDP, you're just about finished, unless you want to check the checksum. The Length field in the UDP header tells you how long the packet is; subtract 8 bytes for the header and there's your data.
TCP is somewhat more complicated, as you do indeed have to reassemble the stream, This is done using the sequence number in the header, combined with the length. The sum of these two tells you the next sequence number in the stream. Remember that you're keeping track of the traffic in two directions.
(This is a lot easier than writing an actual TCP implementation, as then you have to implement the Nagle algorithm and other minutiae.)
There's a lot of information on the net about the header formats; google "IP header" for starters. A network analyzer like Wireshark is indispensable for this work, as it will show you how your captured data is supposed to look. Indeed, as Wireshark is open source, you can probably find out a lot by looking at how it does things
Tcp reassembly can be done with JNetPcap. Here is a complete example:
final String SOME_PORT = 8888;
StringBuilder errbuf = new StringBuilder();
Pcap pcap = Pcap.openOffline("/dir/someFile.pcap", errbuf); //Can be replace with .openLive(...)
if (pcap == null) {
System.err.printf("Error: "+errbuf.toString());
return;
}
//Handler that receive Tcp Event one by one
AnalyzerListener<TcpStreamEvent> handler = new AnalyzerListener<TcpStreamEvent>() {
@Override
public void processAnalyzerEvent(TcpStreamEvent evt) {
JPacket packet = evt.getPacket();
Tcp tcp = new Tcp();
if (packet.hasHeader(tcp)) {
//Limiting the analysis to a specific protocol
if (tcp.destination() == SOME_PORT || tcp.source() == SOME_PORT) {
String data = new String(tcp.getPayload());
System.out.println("Capture data:{"+data+"}");
}
}
}
};
TcpAnalyzer tcpAnalyzer = JRegistry.getAnalyzer(TcpAnalyzer.class);
tcpAnalyzer.addTcpStreamListener(handler, null);
//Starting the capture
pcap.loop(Pcap.LOOP_INFINATE, JRegistry.getAnalyzer(JController.class), null);
精彩评论