IP Fragmentation

In the beginning there was only TCP, with no separate IP. TCP was split into TCP and IP at version 3, and the suite became operational and widely used at version 4 (that’s why IP carries version number 4). The split was decided for the following reasons: the need for separate Network and Transport layers, and the excessive burden on gateways (the former name for routers) of running a single end-to-end protocol that handled both routing packets between devices and reliable communication between end hosts. The name IPv6 was chosen to avoid any confusion with IPv5, the version number of the little-used Internet Stream Protocol.

To be clear from the beginning: fragmentation can be done at the IP layer and at the MPLS layer. On a layer 2 link (like QinQ), frames larger than the layer 2 MTU are dropped (depending on the case; for more details you can check Almighty-MTUl), because at this moment there is no layer 2 fragmentation definition or RFC.

In the Internet Datagram Header format described in RFC 791, the fields that matter most for fragmentation are Total Length, Identification, Flags and Fragment Offset.

The Total Length field is 16 bits long and gives the length of the datagram in octets (bytes), including the Internet header and the actual data. The maximum datagram length this field allows is 65,535 octets. The 576-octet figure is not a minimum datagram length: it is the size that every host must be able to accept, whether the datagram arrives whole or in fragments. It was selected to allow a data block of 512 octets plus 64 octets of headers. The IP header has a minimum size of 20 octets and a maximum size of 60 octets (if the IP Options space is fully used), which leaves a margin of at least 4 octets for the headers of upper-layer protocols.
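The arithmetic behind these numbers can be checked with a few lines of Python. This is only a sketch; the constant names are made up for illustration and the values are the ones quoted above.

```python
# Values quoted in the text (RFC 791); the names are illustrative only.
MAX_TOTAL_LENGTH = 2**16 - 1      # largest value a 16-bit field can hold
HOST_MUST_ACCEPT = 576            # octets every host must be able to accept
DATA_BLOCK = 512                  # data block the 576 figure was sized for
IP_HEADER_MIN = 20                # header without options
IP_HEADER_MAX = 60                # header with the options space fully used

header_allowance = HOST_MUST_ACCEPT - DATA_BLOCK           # room left for headers
margin_with_max_header = header_allowance - IP_HEADER_MAX  # worst case
margin_with_min_header = header_allowance - IP_HEADER_MIN  # typical case

print(MAX_TOTAL_LENGTH, header_allowance,
      margin_with_max_header, margin_with_min_header)
```

With the typical 20-byte header the margin for upper-layer headers is much larger than in the worst case with a full 60-byte header.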

The Identification field is 16 bits long and holds a value assigned by the sender to aid in assembling the fragments of a datagram. It allows the receiving device to sort out which fragments belong to which datagram: every fragment of a given datagram carries the same Identification value. If one or more fragments are lost, then after the timeout period the receiving device discards all the fragments with that Identification value from its buffer. IP itself does not retransmit anything; the data arrives again only if a higher-layer protocol resends it.

The Flags field is 3 bits long and consists of the following bits:

– Bit 0 – Reserved – Must always be 0;

– Bit 1 – DF – Don’t Fragment bit – has the following 2 values:

– DF = 0 – May Fragment – The packet should be fragmented if needed.

– DF = 1 – Don’t Fragment – The packet must not be fragmented; if it cannot be forwarded without fragmentation, it is discarded.

– Bit 2 – MF – More Fragments bit – has the following 2 values:

– MF = 0 – Last Fragment – No more fragments follow this one in the current datagram, with the same Identification field. Because fragments can arrive out of order, the receiving device reassembles the datagram only once all the fragments with the same Identification field have arrived, placing them in the order specified by the Fragment Offset field.

– MF = 1 – More Fragments – More fragments are to come for the current datagram, with the same Identification field. The receiving device will allocate buffer resources for reassembly and pass all fragments containing that Identification field to the buffer.
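The way the DF bit drives the forwarding decision can be sketched in a few lines. The function names `decode_flags` and `forwarding_action` are hypothetical, chosen only for this illustration, and the sketch ignores everything a real router does besides the size check.

```python
def decode_flags(flags: int) -> dict:
    """Split the 3-bit Flags value into its bits (bit 0 is the most significant)."""
    return {
        "reserved": (flags >> 2) & 1,  # bit 0, must be 0
        "df": (flags >> 1) & 1,        # bit 1, Don't Fragment
        "mf": flags & 1,               # bit 2, More Fragments
    }


def forwarding_action(total_length: int, mtu: int, df: int) -> str:
    """Decide what happens to a packet on a link with the given IP MTU."""
    if total_length <= mtu:
        return "forward"      # fits as it is, nothing to do
    if df == 1:
        return "drop"         # too big and fragmentation is forbidden
    return "fragment"         # too big, but fragmentation is allowed


print(decode_flags(0b010))
print(forwarding_action(5000, 1500, df=0))
print(forwarding_action(5000, 1500, df=1))
```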

The Fragment Offset field is 13 bits long and indicates where in the datagram the current fragment belongs. It is measured in units of 8 octets/bytes (64 bits). The offset of the first fragment is always 0.
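To see how the four fields sit in the header, here is a minimal sketch that pulls them out of the first 8 bytes of an IPv4 header with Python's `struct` module. The function name `parse_fragment_fields` and the sample values are assumptions made for the example; the field layout is the one from RFC 791.

```python
import struct


def parse_fragment_fields(header: bytes) -> dict:
    """Extract the fragmentation-related fields from a raw IPv4 header."""
    # First 8 bytes: version/IHL, type of service, Total Length,
    # Identification, then 3 bits of Flags + 13 bits of Fragment Offset.
    ver_ihl, _tos, total_length, identification, flags_offset = struct.unpack(
        "!BBHHH", header[:8]
    )
    offset_units = flags_offset & 0x1FFF       # low 13 bits
    return {
        "header_length": (ver_ihl & 0x0F) * 4,  # IHL is in 32-bit words
        "total_length": total_length,
        "identification": identification,
        "df": (flags_offset >> 14) & 1,
        "mf": (flags_offset >> 13) & 1,
        "fragment_offset": offset_units,
        "byte_offset": offset_units * 8,        # offset is in 8-byte units
    }


# Hypothetical header start: 20-byte header, 1500 bytes total,
# Identification 0x1234, MF = 1, Fragment Offset = 185.
sample = struct.pack("!BBHHH", 0x45, 0, 1500, 0x1234, (1 << 13) | 185)
fields = parse_fragment_fields(sample)
print(fields)
```

The `byte_offset` entry is simply the Fragment Offset multiplied by 8, which is the position of the fragment's data inside the original datagram's data.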

Now, let’s suppose we have a link with the IP MTU set to 1500 bytes, on which a packet 5000 bytes long (20 bytes of IP header plus 4980 bytes of data) must be transmitted:

1. The first fragment is created by taking the first 1480 bytes of data from the original packet and adding the 20-byte IP header, including the source and destination IP addresses. The Total Length is set to 1500 bytes, including the IP header, the Identification (I) is copied from the original packet, More Fragments (MF) is set to 1 and the Offset is 0 (meaning the beginning of the “big packet”).

2. The second fragment is created by taking the next 1480 bytes of data (starting at position 1480 of the “big packet” data) and adding the 20-byte IP header, including the source and destination IP addresses. The Total Length is set to 1500 bytes, including the IP header, the Identification (I) is copied from the original packet, More Fragments (MF) is set to 1 and the Offset is set to 185 = (1500-20)/8 (meaning this fragment starts at position 1480 of the “big packet” data).

3. The third fragment is created by taking the next 1480 bytes of data (starting at position 2960 (1480*2) of the “big packet” data) and adding the 20-byte IP header, including the source and destination IP addresses. The Total Length is set to 1500 bytes, including the IP header, the Identification (I) is copied from the original packet, More Fragments (MF) is set to 1 and the Offset is set to 370 = 2*(1500-20)/8 (meaning this fragment starts at position 2960 of the “big packet” data).

4. The fourth fragment is created by taking the remaining 540 bytes of data (starting at position 4440 (1480*3) of the “big packet” data) and adding the 20-byte IP header, including the source and destination IP addresses. The Total Length is set to 560 bytes, including the IP header, the Identification (I) is copied from the original packet, More Fragments (MF) is set to 0 and the Offset is set to 555 = 3*(1500-20)/8 (meaning this fragment starts at position 4440 of the “big packet” data).
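The four steps above can be reproduced with a short Python sketch. The function name `fragment` and the dictionary keys are made up for this example, and it assumes a 20-byte header with no IP options and an original packet that is not itself a fragment.

```python
def fragment(total_length: int, mtu: int, header_len: int = 20) -> list:
    """Return the Total Length, MF and Offset of each fragment."""
    payload = total_length - header_len
    # Data per fragment must be a multiple of 8 (except in the last one).
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    pos = 0
    while pos < payload:
        data_len = min(max_data, payload - pos)
        fragments.append({
            "total_length": data_len + header_len,
            "mf": 1 if pos + data_len < payload else 0,
            "offset": pos // 8,      # in 8-byte units
            "data_start": pos,       # position in the original data
        })
        pos += data_len
    return fragments


for frag in fragment(5000, 1500):
    print(frag)
```

For the 5000-byte packet and the 1500-byte MTU this yields the same four fragments as in the steps above: three of 1500 bytes and one of 560 bytes, with offsets 0, 185, 370 and 555.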

For a better understanding, see the picture below: the initial big packet is on the first line, followed by the fragments, each of which carries its own 20-byte IP header:

The most important things to notice here are the following:

– The initial IP header is not simply copied into the fragments’ IP headers; it is transformed. Most of the fields are copied, but the Flags and Fragment Offset are recalculated for each fragment, along with the Total Length and the Header Checksum.

– Dividing the offset by 8 allows it to fit in 13 bits instead of 16. This means every fragment but the last must contain a number of data bytes that is a multiple of 8.

– The last fragment can be much smaller than the MTU; it carries the data left untransmitted by the other fragments.

– The sum of the transmitted fragments exceeds the initial “big packet” size, because a 20-byte IP header is added to each fragment (in my example, 60 bytes more). This reduces the throughput of the path; more about throughput can be found here

– Reassembly is the complement of fragmentation, but it is done only at the final destination (the destination IP). If a link further along the path from the source to the destination has a smaller MTU, each fragment larger than that MTU is fragmented again. Below is the fragmentation of the first initial fragment; the other fragments are fragmented in the same way:

– If a fragment is still missing at the destination when the reassembly timer expires, all the fragments already received are discarded and an ICMP Time Exceeded message is generated. Since IP is unreliable, it relies on higher-layer protocols such as TCP to detect that the message was not properly received and to retransmit it.
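Reassembly at the destination can be sketched in Python as follows. This is a simplified illustration, not how any particular IP stack does it: the function name `reassemble` and the tuple layout are assumptions, the fragments are taken to share one Identification value already, and the reassembly timer, duplicates and overlapping fragments are all ignored.

```python
def reassemble(fragments):
    """Rebuild the original data from (offset_units, mf, data) tuples.

    Returns the data as bytes, or None while fragments are still missing.
    """
    ordered = sorted(fragments, key=lambda f: f[0])  # order by Fragment Offset
    if not ordered or ordered[-1][1] != 0:
        return None                  # the last fragment (MF = 0) has not arrived
    data = bytearray()
    for offset_units, _mf, chunk in ordered:
        if offset_units * 8 != len(data):
            return None              # gap: a fragment in the middle is missing
        data += chunk
    return bytes(data)


# The 4980 bytes of data from the example, cut into 1480-byte pieces.
original = (bytes(range(256)) * 20)[:4980]
frags = [
    (pos // 8, 1 if pos + 1480 < len(original) else 0, original[pos:pos + 1480])
    for pos in range(0, len(original), 1480)
]

print(reassemble(reversed(frags)) == original)   # arrival order does not matter
print(reassemble(frags[:1] + frags[2:]))         # one fragment lost
```

When a fragment is missing the function keeps returning None; in a real receiver this is the state in which the reassembly timer eventually expires and the buffered fragments are discarded.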

By Mihaela Paraschivu

IP dreams


Sunday, March 16, 2014
