Besides IP fragmentation there is also MPLS fragmentation for some of the MPLS
applications, or so the RFCs say, and some form of MPLS fragmentation should
exist in every vendor's hardware and software. MPLS fragmentation should exist
for L3VPN (based, of course, on IP fragmentation, as described in RFC 3032), it
might exist for PW (described in RFC 4623), and it might also exist for VPLS if
IP fragmentation is present on the CE (RFC 4665). So MPLS fragmentation exists,
or at least it is described in the RFCs, and no, the labeled fragments will not
get lost, and no, they will not lose their source and destination because of
the fragmentation.
Again, the best scenario is to avoid fragmentation and to increase the MTU size
as much as possible, which also increases the link throughput. It would also be
best to have the same MTU size along all the paths in the network; if that is
not possible, at least keep the same MTU size within each layer (core,
aggregation and distribution), keeping in mind that the distribution layer may
contain some not-so-smart devices.
RFC 3032 - MPLS Label Stack Encoding
The MPLS packets could be “too big” for the link for one of the following reasons:
- IP packets entering the MPLS cloud are too big.
- IP packets entering the MPLS cloud get labels pushed and become too big.
- Inside the MPLS cloud, some MPLS packets get more labels pushed and become too big.
Even if the packet is too big and DF is not set, the LSR might silently discard the packet. If MPLS fragmentation is
implemented and DF is not set, then the LSR should do the following:
1. POP all the labels from the label stack to obtain the IP datagram.
2. Let N be the number of bytes in the label stack.
3. Check the DF bit of the IP datagram; if it is not set, fragment the datagram based on the IP fragmentation rules.
4. Each IP datagram fragment should be at least N bytes smaller than the MTU, so at most MTU – N bytes, where N is the value noted at point 2 (number of bytes in the label stack).
5. PUSH onto each IP datagram fragment the same label stack that the original non-fragmented packet had.
6. Forward the fragments.
If the stripped IP datagram (point 3) has the DF bit set, then the datagram
must not be fragmented or forwarded; it should be discarded and an ICMP Destination Unreachable
message (type 3, code 4 "Fragmentation Required and DF Set") should be
generated and transmitted to the source, if possible.
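The steps above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not any vendor's implementation: the `LabeledPacket` class and the `handle_oversized` function are hypothetical names, and the sketch assumes an IPv4 header of 20 bytes without options.

```python
from dataclasses import dataclass

LABEL_ENTRY_BYTES = 4   # each MPLS label stack entry is 4 bytes (RFC 3032)
IP_HEADER_BYTES = 20    # assumption: IPv4 header without options


@dataclass
class LabeledPacket:
    labels: list        # label values, top of the stack first
    ip_length: int      # total length of the IP datagram (header + data)
    df: bool            # Don't Fragment bit of the IP header


def handle_oversized(packet, mtu):
    """Return ("forward", packets), or ("icmp", None) when DF is set."""
    n = LABEL_ENTRY_BYTES * len(packet.labels)      # step 2: bytes in the label stack
    if n + packet.ip_length <= mtu:
        return "forward", [packet]                  # fits on the link, nothing to do
    if packet.df:
        return "icmp", None                         # discard, signal the source
    # Step 4: leave room for the labels. IP rule: the data part of every
    # fragment except the last one is a multiple of 8 bytes.
    max_data = (mtu - n - IP_HEADER_BYTES) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small for the label stack and the IP header")
    remaining = packet.ip_length - IP_HEADER_BYTES
    fragments = []
    while remaining > 0:
        data = min(max_data, remaining)
        # Step 5: every fragment gets a copy of the original label stack.
        fragments.append(
            LabeledPacket(list(packet.labels), IP_HEADER_BYTES + data, False)
        )
        remaining -= data
    return "forward", fragments                     # step 6
```

Calling `handle_oversized(LabeledPacket([100, 200], 5000, False), 1500)` returns four fragments, each carrying both labels, while the same packet with `df=True` returns `("icmp", None)`. The label values 100 and 200 are arbitrary.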
Consider an IP packet of 5000 bytes from an L3VPN CE entering an MPLS domain with a 1500-byte MTU.
First, 2 MPLS labels are pushed onto the packet:
Second, the big MPLS packet is fragmented based on IP fragmentation (for more details you can check IP-Fragmentation), with the label stack copied to each fragment, as you can see below:
From the above table the following can be noted:
- The label values from the initial packet are copied to all fragments.
- The Length of the IP fragments, except the last one, is 1492 bytes = 1500 bytes (MTU) – 8 bytes (2 labels).
- The Offset (indicating where in the original datagram the current fragment belongs) of each IP fragment is calculated based on the 1492-byte value.
To make an analogy between IP and MPLS fragmentation, you can check the table below, in which the same 5000-byte IP packet is fragmented in an IP cloud and when entering an MPLS cloud (considering a 1500-byte MTU, IP in the first case and MPLS in the second):
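The fragment lengths and offsets of this comparison can also be recomputed with a short sketch. It assumes that the 5000 bytes are the total length of an IPv4 datagram with a 20-byte header (no options); `fragment_plan` is a hypothetical helper written for this example only.

```python
IP_HEADER_BYTES = 20    # assumption: IPv4 header without options
LABEL_ENTRY_BYTES = 4   # one MPLS label stack entry


def fragment_plan(ip_length, mtu, label_count=0):
    """Return (fragment length, offset in 8-byte units, more-fragments flag) tuples."""
    budget = mtu - LABEL_ENTRY_BYTES * label_count   # room left for the IP fragment
    max_data = (budget - IP_HEADER_BYTES) // 8 * 8   # data is a multiple of 8 bytes
    if max_data <= 0:
        raise ValueError("MTU too small")
    payload = ip_length - IP_HEADER_BYTES
    plan, sent = [], 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more = sent + data < payload                 # MF is cleared on the last fragment
        plan.append((IP_HEADER_BYTES + data, sent // 8, more))
        sent += data
    return plan


for name, labels in (("IP", 0), ("MPLS, 2 labels", 2)):
    print(name)
    for length, offset, more in fragment_plan(5000, 1500, labels):
        print(f"  length={length} offset={offset} MF={int(more)}")
```

Under these assumptions both cases produce four fragments. In the IP cloud the lengths are 1500, 1500, 1500 and 560 bytes with offsets 0, 185, 370 and 555; with 2 labels the lengths are 1492, 1492, 1492 and 584 bytes with offsets 0, 184, 368 and 552.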
RFC 4623 - Pseudowire Emulation Edge-to-Edge (PWE3) Fragmentation and Reassembly
As the RFC itself states, fragmentation should be avoided as much as possible
due to the processing overhead, but in case fragmentation is needed the
following fragmentation and reassembly domains are defined:
- The first method is again to let the CE do the IP fragmentation and send the fragments to the PW.
- Fragmentation is done in the transmitting PE immediately before the PW encapsulation.
- Reassembly is done in the receiving PE immediately after the PW decapsulation.
Because there is no Fragment Offset as in IP, using the Sequence Number field
on fragmented packets is mandatory. For this purpose the Control Word is used,
and the reassembly capability is announced in the VC signaling with an already
defined Pseudowire Interface Parameter Sub-TLV (parameter 0x09, length 4). The
presence of this parameter in the VC label advertisement indicates that the
receiver is able to do the reassembly, not that the transmitter will use
fragmentation; the absence of this parameter tells the sender not to use
fragmentation.
The fragmentation bits (B and E) are at positions 8 and 9 of the control word
and have the following significance:
BE | Significance
00 | the entire (un-fragmented) payload is carried in a single packet
01 | the packet carrying the first fragment
10 | the packet carrying the last fragment
11 | a packet carrying an intermediate fragment
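To see how the B and E bits and the Sequence Number work together, here is a minimal sketch of a control word encoder, decoder and reassembler. It assumes the generic PW control word layout (4 zero bits, 4 flag bits, the 2 fragmentation bits, a 6-bit length and a 16-bit sequence number) and a sequence number that wraps from 65535 to 1; the function names are hypothetical and the error handling is reduced to dropping an incomplete frame.

```python
import struct

# BE values from the table (B is bit 8, E is bit 9 of the control word)
FRG_WHOLE, FRG_FIRST, FRG_LAST, FRG_MIDDLE = 0b00, 0b01, 0b10, 0b11


def build_control_word(frg, sequence, length=0, flags=0):
    """Pack the 32-bit control word in network byte order."""
    word = (
        (flags & 0xF) << 24          # bits 4-7
        | (frg & 0x3) << 22          # bits 8-9 (B and E)
        | (length & 0x3F) << 16      # bits 10-15
        | (sequence & 0xFFFF)        # bits 16-31
    )
    return struct.pack("!I", word)


def parse_control_word(raw):
    (word,) = struct.unpack("!I", raw[:4])
    return {
        "flags": (word >> 24) & 0xF,
        "frg": (word >> 22) & 0x3,
        "length": (word >> 16) & 0x3F,
        "sequence": word & 0xFFFF,
    }


def reassemble(packets):
    """packets: (control word bytes, payload) tuples in arrival order."""
    frames, parts, expected = [], None, None
    for raw, payload in packets:
        cw = parse_control_word(raw)
        in_order = expected is None or cw["sequence"] == expected
        expected = cw["sequence"] % 0xFFFF + 1    # assumed wrap: 65535 -> 1
        if cw["frg"] == FRG_WHOLE:
            frames.append(payload)
            parts = None
        elif cw["frg"] == FRG_FIRST:
            parts = [payload]                     # start a new frame
        elif parts is not None and in_order:
            parts.append(payload)
            if cw["frg"] == FRG_LAST:
                frames.append(b"".join(parts))
                parts = None
        else:
            parts = None                          # gap or orphan fragment: drop
    return frames
```

For example, three packets built with `FRG_FIRST`, `FRG_MIDDLE` and `FRG_LAST` and sequence numbers 1, 2 and 3 are joined into one frame, while the same packets with a missing sequence number produce no frame at all.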
RFC 4665 - Service Requirements for Layer 2 Provider-Provisioned Virtual Private Networks
There is no fragmentation method implemented (so far) for VPLS services; a VPLS
domain may rely on IP fragmentation only on the IP CE side.
Again, in the end, fragmentation should be avoided and eliminated as much as
possible because, besides increasing the processing overhead, the different
types of delay and the possible packet drops, it also increases the bandwidth
consumption due to the additional header overhead of each fragment... if not
avoided, then good luck with fragmenting Moby-Dick.
By Mihaela Paraschivu