@(#) $Header: /tcpdump/master/tcpdump/README,v 1.68 2008-12-15 00:05:27 guy Exp $ (LBL)

TCPDUMP 4.x.y
Now maintained by "The Tcpdump Group"
See 		www.tcpdump.org

Please send inquiries/comments/reports to:
	tcpdump-workers@lists.tcpdump.org

Anonymous Git is available via:
	git clone git://bpf.tcpdump.org/tcpdump

Version 4.x.y of TCPDUMP can be retrieved with the CVS tag "tcpdump_4_xrely":
	cvs -d :pserver:cvs.tcpdump.org:/tcpdump/master checkout -r tcpdump_4_xrely tcpdump

Please submit patches by forking the branch on GitHub at

	http://github.com/mcr/tcpdump/tree/master

and issuing a pull request.

formerly from 	Lawrence Berkeley National Laboratory
		Network Research Group <tcpdump@ee.lbl.gov>
		ftp://ftp.ee.lbl.gov/tcpdump.tar.Z (3.4)

This directory contains source code for tcpdump, a tool for network
monitoring and data acquisition.  This software was originally
developed by the Network Research Group at the Lawrence Berkeley
National Laboratory.  The original distribution is available via
anonymous ftp to ftp.ee.lbl.gov, in tcpdump.tar.Z.  More recent
development is performed at tcpdump.org, http://www.tcpdump.org/

Tcpdump uses libpcap, a system-independent interface for user-level
packet capture.  Before building tcpdump, you must first retrieve and
build libpcap, also originally from LBL and now being maintained by
tcpdump.org; see http://www.tcpdump.org/ .

Once libpcap is built (either install it or make sure it's in
../libpcap), you can build tcpdump using the procedure in the INSTALL
file.

The program is loosely based on SMI's "etherfind" although none of the
etherfind code remains.  It was originally written by Van Jacobson as
part of an ongoing research project to investigate and improve TCP and
Internet gateway performance.  The parts of the program originally
taken from Sun's etherfind were later re-written by Steven McCanne of
LBL.  To ensure that there would be no vestige of proprietary code in
tcpdump, Steve wrote these pieces from the specification given by the
manual entry, with no access to the source of tcpdump or etherfind.

Over the past few years, tcpdump has been steadily improved by the
excellent contributions from the Internet community (just browse
through the CHANGES file).  We are grateful for all the input.

Richard Stevens gives an excellent treatment of the Internet protocols
in his book ``TCP/IP Illustrated, Volume 1''. If you want to learn more
about tcpdump and how to interpret its output, pick up this book.

Some tools for viewing and analyzing tcpdump trace files are available
from the Internet Traffic Archive:

	http://www.acm.org/sigcomm/ITA/

Another tool that tcpdump users might find useful is tcpslice:

	ftp://ftp.ee.lbl.gov/tcpslice.tar.Z

It is a program that can be used to extract portions of tcpdump binary
trace files. See the above distribution for further details and
documentation.

Problems, bugs, questions, desirable enhancements, etc. should be sent
to the address "tcpdump-workers@lists.tcpdump.org".  Bugs, support
requests, and feature requests may also be submitted on the GitHub issue
tracker for tcpdump at

	https://github.com/mcr/tcpdump/issues

Source code contributions, etc. should be sent to the email address
above or submitted by forking the branch on GitHub at

	http://github.com/mcr/tcpdump/tree/master

and issuing a pull request.

Current versions can be found at www.tcpdump.org.

 - The TCPdump team

original text by: Steve McCanne, Craig Leres, Van Jacobson

-------------------------------------
This directory also contains some short awk programs intended as
examples of ways to reduce tcpdump data when you're tracking
particular network problems:

send-ack.awk
	Simplifies the tcpdump trace for an ftp (or other unidirectional
	tcp transfer).  Since we assume that one host only sends and
	the other only acks, all address information is left off and
	we just note if the packet is a "send" or an "ack".

	There is one output line per line of the original trace.
	Field 1 is the packet time in decimal seconds, relative
	to the start of the conversation.  Field 2 is delta-time
	from last packet.  Field 3 is packet type/direction.
	"Send" means data going from sender to receiver, "ack"
	means an ack going from the receiver to the sender.  A
	preceding "*" indicates that the data is a retransmission.
	A preceding "-" indicates a hole in the sequence space
	(i.e., missing packet(s)), a "#" means an odd-size (not max
	seg size) packet.  Field 4 has the packet flags
	(same format as raw trace).  Field 5 is the sequence
	number (start seq. num for sender, next expected seq number
	for acks).  The number in parens following an ack is
	the delta-time from the first send of the packet to the
	ack.  A number in parens following a send is the
	delta-time from the first send of the packet to the
	current send (on duplicate packets only).  Duplicate
	sends or acks have a number in square brackets showing
	the number of duplicates so far.

	Here is a short sample from near the start of an ftp:
		3.00    0.20   send . 512
		3.20    0.20    ack . 1024  (0.20)
		3.20    0.00   send P 1024
		3.40    0.20    ack . 1536  (0.20)
		3.80    0.40 * send . 0  (3.80) [2]
		3.82    0.02 *  ack . 1536  (0.62) [2]
	Three seconds into the conversation, bytes 512 through 1023
	were sent.  200ms later they were acked.  Shortly thereafter
	bytes 1024-1535 were sent and again acked after 200ms.
	Then, for no apparent reason, 0-511 is retransmitted, 3.8
	seconds after its initial send (the round trip time for this
	ftp was 1sec, +-500ms).  Since the receiver is expecting
	1536, 1536 is re-acked when 0 arrives.
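
	The relative-time, delta-time and retransmission fields described
	above can be sketched in a few lines of awk.  This is only an
	illustration, not the send-ack.awk in this directory: it assumes
	a simplified, hypothetical input of the form
	"<time> <send|ack> <seq>" instead of raw tcpdump output, and the
	function name send_ack_sketch is made up for the example.

```shell
# Sketch only: the input format "<time> <send|ack> <seq>" is an assumption,
# not the format that tcpdump itself prints.
send_ack_sketch() {
    awk '
        NR == 1 { start = $1; prev = $1 }   # first line starts the conversation
        {
            mark = ""
            if ($2 == "send") {
                if ($3 in sent) mark = "* " # this sequence number was sent before
                sent[$3] = 1
            }
            # relative time, delta from previous line, marker, type, sequence
            printf "%.2f %.2f %s%s %s\n", $1 - start, $1 - prev, mark, $2, $3
            prev = $1
        }'
}

printf '%s\n' '0.00 send 0' '3.00 send 512' '3.20 ack 1024' '3.80 send 0' |
    send_ack_sketch
```

	The last input line repeats sequence number 0, so it should come
	out marked with "*".  The hole ("-"), odd-size ("#") and
	duplicate-count fields are left out of this sketch.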

packetdat.awk
	Computes chunk summary data for an ftp (or similar
	unidirectional tcp transfer). [A "chunk" refers to
	a chunk of the sequence space -- essentially the packet
	sequence number divided by the max segment size.]

	A summary line is printed showing the number of chunks,
	the number of packets it took to send that many chunks
	(if there are no lost or duplicated packets, the number
	of packets should equal the number of chunks) and the
	number of acks.

	Following the summary line is one line of information
	per chunk.  The line contains eight fields:
	   1 - the chunk number
	   2 - the start sequence number for this chunk
	   3 - time of first send
	   4 - time of last send
	   5 - time of first ack
	   6 - time of last ack
	   7 - number of times chunk was sent
	   8 - number of times chunk was acked
	(all times are in decimal seconds, relative to the start
	of the conversation.)

	As an example, here is the first part of the output for
	an ftp trace:

	# 134 chunks.  536 packets sent.  508 acks.
	1       1       0.00    5.80    0.20    0.20    4       1
	2       513     0.28    6.20    0.40    0.40    4       1
	3       1025    1.16    6.32    1.20    1.20    4       1
	4       1561    1.86    15.00   2.00    2.00    6       1
	5       2049    2.16    15.44   2.20    2.20    5       1
	6       2585    2.64    16.44   2.80    2.80    5       1
	7       3073    3.00    16.66   3.20    3.20    4       1
	8       3609    3.20    17.24   3.40    5.82    4       11
	9       4097    6.02    6.58    6.20    6.80    2       5

	This says that 134 chunks were transferred (about 70K
	since the average packet size was 512 bytes).  It took
	536 packets to transfer the data (i.e., on the average
	each chunk was transmitted four times).  Looking at,
	say, chunk 4, we see it represents the 512 bytes of
	sequence space from 1561 to 2048.  It was first sent
	1.86 seconds into the conversation.  It was last
	sent 15 seconds into the conversation and was sent
	a total of 6 times (i.e., it was retransmitted every
	2 seconds on the average).  It was acked once, 140ms
	after it first arrived.
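
	The mapping from sequence numbers to chunks can be sketched as
	follows.  This is not the packetdat.awk in this directory; it is
	a minimal example that assumes a hypothetical input of
	"<time> <seq>" send records, sequence numbers starting at 1 and
	chunks numbered from 1 (as the sample output above suggests).
	The function name chunk_summary is made up, and only the chunk
	number, first and last send time, and send count are printed.

```shell
# Sketch only: reads hypothetical "<time> <seq>" send records on stdin.
# $1 of the function is the packet (segment) size, defaulting to 512.
chunk_summary() {
    awk -v packetsize="${1:-512}" '
        {
            # assumed numbering: sequence 1 falls in chunk 1
            chunk = int(($2 - 1) / packetsize) + 1
            if (!(chunk in sends)) first[chunk] = $1   # time of first send
            last[chunk] = $1                           # time of last send so far
            sends[chunk]++
            if (chunk > nchunks) nchunks = chunk
        }
        END {
            for (c = 1; c <= nchunks; c++)
                if (c in sends)
                    printf "%d %.2f %.2f %d\n", c, first[c], last[c], sends[c]
        }'
}

printf '%s\n' '0.00 1' '0.28 513' '5.80 1' '6.20 513' | chunk_summary 512
```

	With a packet size of 512, sequence numbers 1 and 513 fall in
	chunks 1 and 2, and each chunk in this small input is counted
	as sent twice.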

stime.awk
atime.awk
	Output one line per send or ack, respectively, in the form
		<time> <seq. number>
	where <time> is the time in seconds since the start of the
	transfer and <seq. number> is the sequence number being sent
	or acked.  I typically plot this data looking for suspicious
	patterns.
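
	As a rough illustration of this output format (not the stime.awk
	in this directory), the sketch below reads a hypothetical
	simplified trace of "<time> <send|ack> <seq>" lines and prints
	one "<time> <seq. number>" line per send, with the time taken
	relative to the first line.  The function name send_times and
	the input format are assumptions made for the example.

```shell
# Sketch only: the input format "<time> <send|ack> <seq>" is an assumption.
send_times() {
    awk '
        NR == 1 { start = $1 }                          # start of the transfer
        $2 == "send" { printf "%.2f %s\n", $1 - start, $3 }'
}

printf '%s\n' '10.00 send 512' '10.20 ack 1024' '10.20 send 1024' | send_times
```

	Changing the test to $2 == "ack" would give the corresponding
	ack times; either output is two columns suitable for a plotting
	program.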


The problem I was looking at was the bulk-data-transfer
throughput of medium delay network paths (1-6 sec.  round trip
time) under typical DARPA Internet conditions.  The trace of the
ftp transfer of a large file was used as the raw data source.
The method was:

  - On a local host (but not the Sun running tcpdump), connect to
    the remote ftp.

  - On the monitor Sun, start the trace going.  E.g.,
      tcpdump host local-host and remote-host and port ftp-data >tracefile

  - On local, do either a get or put of a large file (~500KB),
    preferably to the null device (to minimize effects like
    closing the receive window while waiting for a disk write).

  - When the transfer is finished, stop tcpdump.  Use awk to make up
    two files of summary data (maxsize is the maximum packet size,
    tracedata is the file of tcpdump trace data):
      awk -f send-ack.awk packetsize=maxsize tracedata >sa
      awk -f packetdat.awk packetsize=maxsize tracedata >pd

  - While the summary data files are printing, take a look at
    how the transfer behaved:
      awk -f stime.awk tracedata | xgraph
    (90% of what you learn seems to happen in this step).

  - Do all of the above steps several times, both directions,
    at different times of day, with different protocol
    implementations on the other end.

  - Using one of the Unix data analysis packages (in my case,
    S and Gary Perlman's Unix|Stat), spend a few months staring
    at the data.

  - Change something in the local protocol implementation and
    redo the steps above.

  - Once a week, tell your funding agent that you're discovering
    wonderful things and you'll write up that research report
    "real soon now".
