@(#) $Header: /tcpdump/master/tcpdump/README,v 1.58 2000/12/08 06:59:11 mcr Exp $ (LBL)

TCPDUMP 3.6
Now maintained by "The Tcpdump Group"
See 		www.tcpdump.org

Please send inquiries/comments/reports to 	tcpdump-workers@tcpdump.org

Anonymous CVS is available via:
	cvs -d cvs.tcpdump.org:/tcpdump/master login
	(password "anoncvs")
	cvs -d cvs.tcpdump.org:/tcpdump/master checkout tcpdump

Version 3.6 of TCPDUMP can be retrieved with the CVS tag "tcpdump_3_6":
	cvs -d cvs.tcpdump.org:/tcpdump/master checkout -r tcpdump_3_6 tcpdump

Please send patches against the master copy to patches@tcpdump.org.

formerly from 	Lawrence Berkeley National Laboratory
		Network Research Group <tcpdump@ee.lbl.gov>
		ftp://ftp.ee.lbl.gov/tcpdump.tar.Z (3.4)

This directory contains source code for tcpdump, a tool for network
monitoring and data acquisition.  This software was originally
developed by the Network Research Group at the Lawrence Berkeley
National Laboratory.  The original distribution is available via
anonymous ftp to ftp.ee.lbl.gov, in tcpdump.tar.Z.  More recent
development is performed at tcpdump.org, http://www.tcpdump.org/

Tcpdump uses libpcap, a system-independent interface for user-level
packet capture.  Before building tcpdump, you must first retrieve and
build libpcap, also originally from LBL and now being maintained by
tcpdump.org; see http://www.tcpdump.org/ .

Once libpcap is built (either install it or make sure it's in
../libpcap), you can build tcpdump using the procedure in the INSTALL
file.

The program is loosely based on SMI's "etherfind" although none of the
etherfind code remains.  It was originally written by Van Jacobson as
part of an ongoing research project to investigate and improve tcp and
internet gateway performance.  The parts of the program originally
taken from Sun's etherfind were later re-written by Steven McCanne of
LBL.  To ensure that there would be no vestige of proprietary code in
tcpdump, Steve wrote these pieces from the specification given by the
manual entry, with no access to the source of tcpdump or etherfind.

Over the past few years, tcpdump has been steadily improved by the
excellent contributions from the Internet community (just browse
through the CHANGES file).  We are grateful for all the input.

Richard Stevens gives an excellent treatment of the Internet protocols
in his book ``TCP/IP Illustrated, Volume 1''. If you want to learn more
about tcpdump and how to interpret its output, pick up this book.

Some tools for viewing and analyzing tcpdump trace files are available
from the Internet Traffic Archive:

	http://www.acm.org/sigcomm/ITA/

Another tool that tcpdump users might find useful is tcpslice:

	ftp://ftp.ee.lbl.gov/tcpslice.tar.Z

It is a program that can be used to extract portions of tcpdump binary
trace files. See the above distribution for further details and
documentation.

Problems, bugs, questions, desirable enhancements, etc.
should be sent to the address "tcpdump-workers@tcpdump.org".

Source code contributions, etc. should be sent to the email address
"patches@tcpdump.org".

Current versions can be found at www.tcpdump.org

 - The TCPdump team

original text by: Steve McCanne, Craig Leres, Van Jacobson

-------------------------------------
This directory also contains some short awk programs intended as
examples of ways to reduce tcpdump data when you're tracking
particular network problems:

send-ack.awk
	Simplifies the tcpdump trace for an ftp (or other unidirectional
	tcp transfer).  Since we assume that one host only sends and
	the other only acks, all address information is left off and
	we just note if the packet is a "send" or an "ack".

	There is one output line per line of the original trace.
	Field 1 is the packet time in decimal seconds, relative
	to the start of the conversation.  Field 2 is delta-time
	from last packet.  Field 3 is packet type/direction.
	"Send" means data going from sender to receiver, "ack"
	means an ack going from the receiver to the sender.  A
	preceding "*" indicates that the data is a retransmission.
	A preceding "-" indicates a hole in the sequence space
	(i.e., missing packet(s)); a "#" means an odd-size (not max
	seg size) packet.  Field 4 has the packet flags
	(same format as raw trace).  Field 5 is the sequence
	number (start seq. num for sender, next expected seq number
	for acks).  The number in parens following an ack is
	the delta-time from the first send of the packet to the
	ack.  A number in parens following a send is the
	delta-time from the first send of the packet to the
	current send (on duplicate packets only).  Duplicate
	sends or acks have a number in square brackets showing
	the number of duplicates so far.

	Here is a short sample from near the start of an ftp:
		3.00    0.20   send . 512
		3.20    0.20    ack . 1024  (0.20)
		3.20    0.00   send P 1024
		3.40    0.20    ack . 1536  (0.20)
		3.80    0.40 * send . 0  (3.80) [2]
		3.82    0.02 *  ack . 1536  (0.62) [2]
	Three seconds into the conversation, bytes 512 through 1023
	were sent.  200ms later they were acked.  Shortly thereafter
	bytes 1024-1535 were sent and again acked after 200ms.
	Then, for no apparent reason, 0-511 is retransmitted, 3.8
	seconds after its initial send (the round trip time for this
	ftp was 1sec, +-500ms).  Since the receiver is expecting
	1536, 1536 is re-acked when 0 arrives.
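
As a rough illustration of the send-ack.awk fields described above,
the sketch below derives the two time fields and the "*" and "-"
markers for sends only.  It is not the script shipped in this
directory: the function name mark_sends and the input format (one
"<time> <seq> <length>" line per send, times in decimal seconds) are
assumptions made for this example, and times are taken relative to
the first input line.

```shell
# Sketch only; not the send-ack.awk shipped in this directory.
# Assumed (hypothetical) input: "<time> <seq> <length>" per send.
mark_sends() {
    awk '
        NR == 1 { start = $1; prev = $1; expected = $2 + 0 }
        {
            mark = " "
            if ($2 + 0 < expected) {
                mark = "*"      # sequence space already sent: retransmission
            } else if ($2 + 0 > expected) {
                mark = "-"      # gap before this packet: hole
            }
            # field 1: time since first packet, field 2: time since previous packet
            printf "%.2f %.2f %s send %d\n", $1 - start, $1 - prev, mark, $2
            if ($2 + $3 > expected) {
                expected = $2 + $3      # next sequence number we expect to see
            }
            prev = $1
        }'
}

# Example: the third send repeats old data, the fourth skips ahead.
printf '%s\n' '3.00 512 512' '3.20 1024 512' '3.80 0 512' '4.00 2048 512' |
    mark_sends
```

With this input the third send is flagged "*" because sequence
number 0 lies below the highest byte already sent, and the fourth is
flagged "-" because it starts past the next expected byte.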

packetdat.awk
	Computes chunk summary data for an ftp (or similar
	unidirectional tcp transfer). [A "chunk" refers to
	a chunk of the sequence space -- essentially the packet
	sequence number divided by the max segment size.]

	A summary line is printed showing the number of chunks,
	the number of packets it took to send that many chunks
	(if there are no lost or duplicated packets, the number
	of packets should equal the number of chunks) and the
	number of acks.

	Following the summary line is one line of information
	per chunk.  The line contains eight fields:
	   1 - the chunk number
	   2 - the start sequence number for this chunk
	   3 - time of first send
	   4 - time of last send
	   5 - time of first ack
	   6 - time of last ack
	   7 - number of times chunk was sent
	   8 - number of times chunk was acked
	(all times are in decimal seconds, relative to the start
	of the conversation.)
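
The chunk bookkeeping described above can be sketched as follows.
This is an illustration, not the shipped packetdat.awk: the function
name chunk_summary, its two arguments (segment size and first
sequence number) and the "<time> <seq>" input format are assumptions,
and only the send-side fields (1, 2, 3, 4 and 7) are produced.

```shell
# Sketch only; not the packetdat.awk shipped in this directory.
# Assumed (hypothetical) input: "<time> <seq>" per send.
# $1 = segment size in bytes, $2 = first sequence number of the transfer.
chunk_summary() {
    awk -v mss="$1" -v first="$2" '
        {
            c = int(($2 - first) / mss) + 1     # chunk number, counted from 1
            if (!(c in nsend)) {
                start[c] = $2                   # start sequence number
                t_first[c] = $1                 # time of first send
            }
            t_last[c] = $1                      # time of last send so far
            nsend[c]++                          # number of times chunk was sent
            if (c > maxc) maxc = c
        }
        END {
            for (c = 1; c <= maxc; c++)
                if (c in nsend)
                    printf "%d %d %.2f %.2f %d\n",
                        c, start[c], t_first[c], t_last[c], nsend[c]
        }'
}

# Example: chunk 1 is sent twice, chunks 2 and 3 once each.
printf '%s\n' '0.00 1' '0.28 513' '1.16 1025' '5.80 1' | chunk_summary 512 1
```

Dividing the offset from the first sequence number by the segment
size, rather than the raw sequence number, keeps the first chunk
numbered 1 when the transfer starts at sequence number 1.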

	As an example, here is the first part of the output for
	an ftp trace:

	# 134 chunks.  536 packets sent.  508 acks.
	1       1       0.00    5.80    0.20    0.20    4       1
	2       513     0.28    6.20    0.40    0.40    4       1
	3       1025    1.16    6.32    1.20    1.20    4       1
	4       1561    1.86    15.00   2.00    2.00    6       1
	5       2049    2.16    15.44   2.20    2.20    5       1
	6       2585    2.64    16.44   2.80    2.80    5       1
	7       3073    3.00    16.66   3.20    3.20    4       1
	8       3609    3.20    17.24   3.40    5.82    4       11
	9       4097    6.02    6.58    6.20    6.80    2       5

	This says that 134 chunks were transferred (about 70K
	since the average packet size was 512 bytes).  It took
	536 packets to transfer the data (i.e., on the average
	each chunk was transmitted four times).  Looking at,
	say, chunk 4, we see it represents the 512 bytes of
	sequence space from 1561 to 2048.  It was first sent
	1.86 seconds into the conversation.  It was last
	sent 15 seconds into the conversation and was sent
	a total of 6 times (i.e., it was retransmitted every
	2 seconds on the average).  It was acked once, 140ms
	after it first arrived.
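
The arithmetic behind that reading can be checked with a few lines
of awk.  The numbers are copied from the sample output above; the
function name check_sample is made up for this example.

```shell
# Recompute the figures quoted for the sample packetdat.awk output.
check_sample() {
    awk 'BEGIN {
        chunks = 134; packets = 536; size = 512
        printf "bytes: %d\n", chunks * size
        printf "sends per chunk: %.1f\n", packets / chunks
        # chunk 4: first send 1.86, last send 15.00, 6 sends, first ack 2.00
        printf "chunk 4 span per send: %.2f s\n", (15.00 - 1.86) / 6
        printf "chunk 4 ack delay: %.0f ms\n", (2.00 - 1.86) * 1000
    }'
}

check_sample
```

This prints 68608 bytes (roughly the 70K quoted), 4.0 sends per
chunk, about 2.19 seconds of elapsed time per send of chunk 4, and
an ack delay of 140 ms.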

stime.awk
atime.awk
	Output one line per send or ack, respectively, in the form
		<time> <seq. number>
	where <time> is the time in seconds since the start of the
	transfer and <seq. number> is the sequence number being sent
	or acked.  I typically plot this data looking for suspicious
	patterns.
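
To show the shape of that output, here is a minimal sketch that
produces "<time> <seq. number>" pairs.  It is not stime.awk or
atime.awk: those read a raw tcpdump trace, whereas this sketch
assumes its input is already in the send-ack.awk output format
shown earlier, and the function name pick_events is hypothetical.

```shell
# Sketch only; the shipped stime.awk and atime.awk read a raw trace instead.
# $1 is the event kind to keep: "send" or "ack".
pick_events() {
    awk -v kind="$1" '
        {
            # The type field may be preceded by a marker such as "*",
            # so search for it instead of assuming a fixed position.
            for (i = 1; i <= NF; i++)
                if ($i == kind) {
                    print $1, $(i + 2)      # time, then sequence number
                    break
                }
        }'
}

# Example: the sample lines from the send-ack.awk description.
pick_events send <<'EOF'
3.00    0.20   send . 512
3.20    0.20    ack . 1024  (0.20)
3.20    0.00   send P 1024
3.40    0.20    ack . 1536  (0.20)
3.80    0.40 * send . 0  (3.80) [2]
3.82    0.02 *  ack . 1536  (0.62) [2]
EOF
```

Called with "send" this keeps the three send lines; called with
"ack" it keeps the three ack lines.  Either result can be piped to
a plotting program.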


The problem I was looking at was the bulk-data-transfer
throughput of medium delay network paths (1-6 sec. round trip
time) under typical DARPA Internet conditions.  The trace of the
ftp transfer of a large file was used as the raw data source.
The method was:

  - On a local host (but not the Sun running tcpdump), connect to
    the remote ftp.

  - On the monitor Sun, start the trace going.  E.g.,
      tcpdump host local-host and remote-host and port ftp-data >tracefile

  - On the local host, do either a get or put of a large file
    (~500KB), preferably to the null device (to minimize effects
    like closing the receive window while waiting for a disk write).

  - When transfer is finished, stop tcpdump.  Use awk to make up
    two files of summary data (avgsize is the average packet size,
    tracedata is the file of tcpdump trace data):
      awk -f send-ack.awk packetsize=avgsize tracedata >sa
      awk -f packetdat.awk packetsize=avgsize tracedata >pd

  - While the summary data files are printing, take a look at
    how the transfer behaved:
      awk -f stime.awk tracedata | xgraph
    (90% of what you learn seems to happen in this step).

  - Do all of the above steps several times, both directions,
    at different times of day, with different protocol
    implementations on the other end.

  - Using one of the Unix data analysis packages (in my case,
    S and Gary Perlman's Unix|Stat), spend a few months staring
    at the data.

  - Change something in the local protocol implementation and
    redo the steps above.

  - Once a week, tell your funding agent that you're discovering
    wonderful things and you'll write up that research report
    "real soon now".