Ethernet and overlay technologies that run over Ethernet (802.3, 802.1Q, 802.1ad QinQ, MPLS, 802.1ah MAC-in-MAC, Cisco FabricPath, TRILL, OTV, LISP, VxLAN, NVGRE, and STT) show up in almost every presentation and implementation. So let's start from the beginning, which is 802.3 Ethernet, and work through most of the newer Ethernet and overlay frame encapsulation methods.
The main idea of this post is to put most of the Ethernet encapsulations side by side (MPLS included, as it is L2.5, and the new Ethernet standards, as they are considered L2 routing) to see the frame fields and the total frame sizes "on the wire". The most important things in this post are the tables below; the rest only defines the fields and gives some more detail, not necessarily all of it.
The internal structure of an Ethernet frame – with IFG
IEEE 802.3 Ethernet frame - 1538 bytes/octets
Field | Pre + SFD | DMAC | SMAC | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 2 | 46-1500 | 4 | 12
IEEE 802.1Q - tagged Ethernet frame - 1542 bytes/octets
Field | Pre + SFD | DMAC | SMAC | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 2 | 46-1500 | 4 | 12
IEEE 802.1AD - double tagged Ethernet frame - QinQ - 1546 bytes/octets
Field | Pre + SFD | DMAC | SMAC | P-TAG 0x88a8 | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 4 | 2 | 46-1500 | 4 | 12
IEEE 802.3 frame with 3 MPLS Headers - 1570 bytes/octets
Field | Pre + SFD | O-DMAC | O-SMAC | Type 0x8847 | LSP Label | RSVP Label | VPN Label | I-DMAC | I-SMAC | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 2 | 4 | 4 | 4 | 6 | 6 | 4 | 2 | 46-1500 | 4 | 12
Cisco FabricPath Ethernet frame - 1558 bytes/octets
Field | Pre + SFD | ODA | OSA | FP TAG | C-DMAC | C-SMAC | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 6 | 6 | 4 | 2 | 46-1500 | 4 | 12
IEEE 802.1AH - PBB Ethernet frame - MACinMAC - 1568 bytes/octets
Field | Pre + SFD | B-DMAC | B-SMAC | B-TAG 0x88a8 | I-TAG 0x88e7 | C-DMAC | C-SMAC | P-TAG 0x88a8 | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 6 | 6 | 6 | 4 | 4 | 2 | 46-1500 | 4 | 12
TRILL Ethernet frame - 1568 bytes/octets
Field | Pre + SFD | O-DMAC | O-SMAC | O-TAG | Type | TRILL HEADER | I-DMAC | I-SMAC | I-TAG | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 2 | 8 | 6 | 6 | 4 | 4 | 2 | 46-1500 | 4 | 12
OTV Ethernet 802.1Q frame by IETF - 1596 bytes/octets
Field | Pre + SFD | O-DMAC | O-SMAC | O-TAG 0x8100 | Type 0x0800 | Outer IP header | O-UDP Header | OTV Header | I-DMAC | I-SMAC | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 2 | 20 | 8 | 8 | 6 | 6 | 4 | 2 | 46-1500 | 4 | 12
OTV Ethernet 802.1Q frame by Cisco (?) - 1596 bytes/octets
|
|||||||||||||||||
Pre
SFD |
O-DMAC
|
O-SMAC
|
O-TAG
0x8100 |
Type
0x0800 |
Outer IP header
|
GRE Header
|
MPLS label
|
OTV
|
I-DMAC
|
I-SMAC
|
C-TAG
0x8100 |
Type
0x0800 |
IP
20 bytes |
TCP
20 bytes |
Payload / Data
6-1460 |
CRC
FCS |
IFG
|
8
|
6
|
6
|
4
|
2
|
20
|
8
|
8
|
6
|
6
|
4
|
2
|
46-1500
|
4
|
12
|
LISP Ethernet 802.1Q frame - 1596 bytes/octets
Field | Pre + SFD | O-DMAC | O-SMAC | O-TAG 0x8100 | Type 0x0800 | Outer IP header | O-UDP Header | LISP Header | I-DMAC | I-SMAC | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 2 | 20 | 8 | 8 | 6 | 6 | 4 | 2 | 46-1500 | 4 | 12
VxLAN Ethernet 802.1Q frame - 1596 bytes/octets
Field | Pre + SFD | O-DMAC | O-SMAC | O-TAG 0x8100 | Type 0x0800 | Outer IP header | O-UDP Header | VxLAN Header | I-DMAC | I-SMAC | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 2 | 20 | 8 | 8 | 6 | 6 | 4 | 2 | 46-1500 | 4 | 12
NvGRE Ethernet 802.1Q frame - 1588 bytes/octets
Field | Pre + SFD | O-DMAC | O-SMAC | O-TAG 0x8100 | Type 0x0800 | Outer IP header | NvGRE 0x6558 | I-DMAC | I-SMAC | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 2 | 20 | 8 | 6 | 6 | 4 | 2 | 46-1500 | 4 | 12
STT Ethernet 802.1Q frame - 1608 bytes/octets
Field | Pre + SFD | O-DMAC | O-SMAC | O-TAG 0x8100 | Type 0x0800 | Outer IP Header | Outer TCP Header | STT Header | I-DMAC | I-SMAC | C-TAG 0x8100 | Type 0x0800 | IP (20 bytes) + TCP (20 bytes) + Payload / Data (6-1460) | CRC / FCS | IFG
Bytes | 8 | 6 | 6 | 4 | 2 | 20 | 20 | 8 | 6 | 6 | 4 | 2 | 46-1500 | 4 | 12
The minimum frame size allowed to be transmitted is 64 bytes = Header + Payload + CRC; the minimum Payload size is 46 bytes for non-tagged frames and 42 bytes (46 minus the 4-byte tag) for tagged frames.
The maximum frame size allowed to be transmitted for a non-tagged frame is 1518 bytes = Header + Payload + CRC. The Preamble + SFD and the IFG are not counted in the frame size; they matter only when calculating the maximum throughput of the link.
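The arithmetic behind the "on the wire" totals can be checked with a short sketch. The byte counts are copied from the frame tables in this post; the dictionary layout and the function names are only illustrative, and just a few of the encapsulations are listed.

```python
# Per-frame overhead in bytes, taken from the frame tables in this post.
PREAMBLE_SFD = 8
CRC = 4
IFG = 12
MAX_PAYLOAD = 1500  # IP header + TCP header + data

# Header bytes between the Preamble/SFD and the payload (illustrative subset).
HEADERS = {
    "802.3": [6, 6, 2],
    "802.1Q": [6, 6, 4, 2],
    "802.1ad QinQ": [6, 6, 4, 4, 2],
    "VxLAN": [6, 6, 4, 2, 20, 8, 8, 6, 6, 4, 2],
    "STT": [6, 6, 4, 2, 20, 20, 8, 6, 6, 4, 2],
}


def frame_size(name, payload=MAX_PAYLOAD):
    """Frame size as counted for the 64/1518 limits (no Preamble/SFD, no IFG)."""
    return sum(HEADERS[name]) + payload + CRC


def wire_size(name, payload=MAX_PAYLOAD):
    """Bytes occupied on the wire, including Preamble/SFD and IFG."""
    return PREAMBLE_SFD + frame_size(name, payload) + IFG


def efficiency(name, payload=MAX_PAYLOAD):
    """Share of the link capacity that carries payload bytes."""
    return payload / wire_size(name, payload)


for name in HEADERS:
    print(f"{name:13s} wire={wire_size(name)} efficiency={efficiency(name):.3f}")
```

With a 46-byte payload, `frame_size("802.3", 46)` gives the 64-byte minimum mentioned above.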
Something else worth mentioning here is that the frame is transmitted with the most-significant octet first (starting from the Preamble through the CRC; the IFG is just idle time), but within each octet the least-significant bit is transmitted first, meaning that the bit order of each octet is reversed when the frame is transmitted.
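A small sketch of that bit ordering, assuming nothing more than the rule stated above (octets in order, least-significant bit first inside each octet). The function name is made up for illustration.

```python
def wire_bit_order(frame: bytes) -> str:
    """Return the bits in transmission order: octets in sequence, LSB first in each octet."""
    return " ".join(format(octet, "08b")[::-1] for octet in frame)


# 0x01 is written 00000001, but its least-significant bit goes out first.
print(wire_bit_order(bytes([0x01, 0x80])))
```

This is why the least-significant bit of the first address octet is the first address bit seen on the wire.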
More details about each field in the above Ethernet frame formats can be found in the description below (which is way too big to be in the initial post).
Fields Description
Let's start with the description of each field:
1. Pre – Preamble – 7 bytes – The Preamble is an alternating
pattern of ones and zeros that tells receiving stations that a frame is coming.
2. SFD/SOD – Start of Frame Delimiter – 1 byte – Consists of an alternating pattern of ones and zeros, ending with 11, which indicates that the next bit is the first (leftmost) bit of the Destination MAC address.
3. DMAC – Destination MAC – 6 bytes – Identifies which MAC
address should receive the frame. The first 3 bytes, in transmission order,
correspond to OUI - Organizationally Unique Identifier assigned by IEEE to
each organization; the following 3 bytes are assigned by that
organization.
4. SMAC – Source MAC – 6 bytes – Identifies the sending MAC. It is always an individual address, so the leftmost bit in this field is always 0.
5. Type / Length – 2 bytes – Indicates the Length (the number of bytes contained in the Data field) if the value is less than or equal to 1500 (0x05DC HEX), or the EtherType if the value is greater than or equal to 1536 (0x0600 HEX).
The most important EtherTypes are below:
0x0800 - Internet Protocol version 4 (IPv4)
0x0806 - Address Resolution Protocol (ARP)
0x22F3 - IETF TRILL Protocol
0x8035 - Reverse Address Resolution Protocol
0x8100 - Tagged IEEE 802.1Q, Shortest Path Bridging IEEE 802.1aq
0x86DD - Internet Protocol Version 6 (IPv6)
0x8808 - Ethernet flow control
0x8809 - Slow Protocols (IEEE 802.3)
0x8847 - MPLS unicast
0x8848 - MPLS multicast
0x8863 - PPPoE Discovery Stage
0x8864 - PPPoE Session Stage
0x8870 - Jumbo Frames
0x888E - EAP over LAN (IEEE 802.1X)
0x88A8 - PB IEEE 802.1ad, Shortest Path Bridging IEEE 802.1aq
0x88CC - Link Layer Discovery Protocol (LLDP)
0x88E5 - MAC security (IEEE 802.1AE)
0x88F7 - Precision Time Protocol (IEEE 1588)
0x8902 - IEEE 802.1ag CFM Protocol / Y.1731 (OAM)
0x892F - High-availability Seamless Redundancy (HSR)
0x9100 - Q-in-Q (old)
See more details on EtherType.
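The Length versus EtherType rule for the 2-byte field can be written as a tiny sketch. The function name and the returned strings are illustrative; the thresholds are the ones given in the field description (1500 and 1536).

```python
def classify_type_length(value: int) -> str:
    """Interpret the 2-byte Type / Length field of an Ethernet frame."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("the field is only 16 bits wide")
    if value <= 1500:       # 0x05DC and below: number of bytes in the Data field
        return "length"
    if value >= 1536:       # 0x0600 and above: EtherType
        return "ethertype"
    return "undefined"      # 1501-1535 falls between the two ranges


for v in (46, 0x0800, 0x8100, 1510):
    print(hex(v), classify_type_length(v))
```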
6. Payload / Data – 46-1500 bytes – The actual Data contained in the frame. After
physical layer and link layer processing is complete this data will eventually
be sent to an upper layer protocol. In this field are encapsulated the upper
layer protocols, including IP, TCP or UDP headers.
7. CRC / FCS – Cyclic Redundancy Check / Frame Check Sequence – 4 bytes – Contains a CRC check value created by the sending station and recalculated by the receiving station to check for damage that might have occurred to the frame in transit.
8. IFG – Inter-Frame Gap – 12 bytes – Represents the minimum idle period between transmissions of Ethernet frames, a recovery time between frames which allows the devices to prepare for the reception of the next frame. It is inserted by the physical layer.
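As a rough illustration of the CRC / FCS field, the sketch below appends and checks a 4-byte check value with Python's `zlib.crc32`, which uses the same CRC-32 polynomial as Ethernet. Storing the value as 4 little-endian bytes matches how the FCS usually appears in captures, but treat that byte order, and the helper names, as assumptions of this sketch.

```python
import zlib


def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Return the frame (DMAC through payload) with a 4-byte FCS appended."""
    fcs = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + fcs.to_bytes(4, "little")


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recalculate the CRC over everything but the last 4 bytes and compare."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return (zlib.crc32(body) & 0xFFFFFFFF) == int.from_bytes(fcs, "little")


frame = append_fcs(bytes(60))           # 60 zero bytes + 4-byte FCS = 64 bytes
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]
print(len(frame), fcs_ok(frame), fcs_ok(damaged))
```

Any change to the frame, such as inserting a tag, changes the CRC input, which is why the FCS has to be recalculated after every encapsulation step.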
9. Single tagged Header - 802.1Q - C-TAG - Customer Tag - 4 bytes - Consists of the following fields. The C at the beginning of each field name stands for Customer; it is there to make the later encapsulations easier to follow:
C-TAG - 4 bytes
Field | C-TPID 0x8100 | C-PCP / C-COS | C-DEI | C-VID
Bits | 16 | 3 | 1 | 12
C-PCP, C-DEI and C-VID together form the 16-bit C-TCI.
9.1 TPID – Tag Protocol Identifier – 2 bytes / 16 bits – Set to 0x8100 HEX to identify the frame as an IEEE 802.1Q-tagged frame.
9.2 TCI – Tag Control Information – 2 bytes / 16 bits – Has the following 3 fields:
9.2.1 PCP – Priority Code Point or COS Class of Services – 3
bits – Refers
to IEEE 802.1p priority, from 0 which
represents best effort to 7 the highest priority.
9.2.2 DEI – Drop Eligible Indicator – 1 bit – Can be used together with PCP to indicate that the frame is eligible to be discarded in the presence of congestion. This bit was formerly called CFI (Canonical Format Identifier); as CFI it is always set to 0 for Ethernet switches and to 1 for Token-Ring-type networks.
9.2.3 VID – VLAN ID – 12 bits – Identifies the VLAN to which the frame belongs. The HEX value 0x000 means that the frame does not belong to any VLAN; in this case the 802.1Q tag specifies only the priority and is called a priority tag. The value 0xFFF is also reserved.
The insertion of 802.1Q
tag forces the FCS/CRC recalculation.
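The 4-byte tag layout (16-bit TPID, then PCP, DEI and VID packed into the 16-bit TCI) can be sketched with `struct`. The function names are illustrative, not part of any library.

```python
import struct


def build_dot1q_tag(pcp: int, dei: int, vid: int, tpid: int = 0x8100) -> bytes:
    """Pack TPID (16 bits) + PCP (3 bits) + DEI (1 bit) + VID (12 bits)."""
    if not (0 <= pcp <= 7 and dei in (0, 1) and 0 <= vid <= 0xFFF):
        raise ValueError("field out of range")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", tpid, tci)


def parse_dot1q_tag(tag: bytes) -> dict:
    """Split a 4-byte tag back into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {"tpid": tpid, "pcp": tci >> 13, "dei": (tci >> 12) & 1, "vid": tci & 0x0FFF}


c_tag = build_dot1q_tag(pcp=5, dei=0, vid=100)
print(c_tag.hex(), parse_dot1q_tag(c_tag))

# A QinQ header is simply an outer tag with TPID 0x88a8 followed by the C-TAG.
qinq = build_dot1q_tag(pcp=5, dei=0, vid=2000, tpid=0x88A8) + c_tag
print(qinq.hex())
```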
10. Double Tagged Header - QinQ - 802.1AD Header - 8 bytes - Consists of two 802.1Q headers (P-TAG and C-TAG):
8 bytes = P-TAG (4 bytes) + C-TAG (4 bytes)
Field | P-TPID 0x88a8 | P-PCP / P-COS | P-DEI | P-VID | C-TPID 0x8100 | C-PCP / C-COS | C-DEI | C-VID
Bits | 16 | 3 | 1 | 12 | 16 | 3 | 1 | 12
10.1 P-TAG – Provider or Outer TAG – the newly inserted 802.1AD header – The only difference from the initial 802.1Q header is the TPID value, which in this case is 0x88a8, representing the 802.1AD EtherType. The PCP/COS field represents the 802.1p priority for the Provider or Outer Tag.
10.2 C-TAG – Customer or Inner TAG – The initial 802.1Q TAG, with the TPID value of 0x8100 HEX.
Of course, in this case the FCS/CRC must also be recalculated so that it covers the second tag.
11. MPLS Headers
MPLS Label - 4 bytes
Field | Label | EXP / TC | S | TTL
Bits | 20 | 3 | 1 | 8
11.1 Label – 20 bits – The label value. The range is from 0 through 2^20-1, but label values 0-15 are reserved: 4-15 are reserved for future use and 0-3 are defined below (in RFC 3032 - MPLS Label Stack Encoding):
– 0 – IPv4 Explicit NULL value – Indicates that the label stack must
be popped and the packet must be forwarded based on IPv4 header.
– 1 – Router Alert Label – If this value is on top of the label
stack, then the packet must be delivered to local software for processing. It
can be seen as the "Router Alert Option" in IPv4 packets, for example
ping with record-route option.
– 2 – IPv6 Explicit NULL value – Indicates that the label stack must
be popped and the packet must be forwarded based on IPv6 header.
– 3 – Implicit NULL value – Indicates
that LSR pops the top label from the stack and forwards the rest of the packet
(label or unlabeled) through the outgoing interface.
11.2 EXP – TC Field – Experimental or Traffic Class Field – 3 bits – Traffic class for QoS priority and ECN (Explicit Congestion Notification). RFC 5462 renamed the EXP bits to TC.
11.3 S – Bottom of the Stack flag – 1 bit – If this is set, the current label is the last one in the stack.
11.4 TTL – Time to Live – 8 bits – The TTL field in the label is used in the same way as the TTL in the IP header. When an IP packet enters the MPLS cloud at the ingress LSR, the IP TTL value is copied (after being decremented by 1) to the MPLS TTL of the pushed label(s). At the egress LSR, the label is removed and the IP header is exposed again. The IP TTL value is then copied from the MPLS TTL value of the received top label, after decrementing it by 1.
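A minimal sketch of decoding the 4-byte label entry (Label 20 bits, EXP/TC 3 bits, S 1 bit, TTL 8 bits) and of walking a label stack until the S bit is found. The function names are illustrative.

```python
import struct


def parse_mpls_entry(entry: bytes) -> dict:
    """Decode one 4-byte MPLS label stack entry."""
    (word,) = struct.unpack("!I", entry)
    return {
        "label": word >> 12,        # top 20 bits
        "tc": (word >> 9) & 0x7,    # EXP / Traffic Class
        "s": (word >> 8) & 0x1,     # bottom-of-stack flag
        "ttl": word & 0xFF,
    }


def parse_label_stack(data: bytes) -> list:
    """Read entries from the top label down to the one with S=1."""
    stack = []
    for offset in range(0, len(data) - 3, 4):
        entry = parse_mpls_entry(data[offset:offset + 4])
        stack.append(entry)
        if entry["s"]:
            return stack
    raise ValueError("no entry with the S bit set")


# Three labels (100, 200, 300); only the last one has S=1, all with TTL 64.
raw = bytes.fromhex("00064040" "000c8040" "0012c140")
for entry in parse_label_stack(raw):
    print(entry)
```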
MPLS Frame Format with 2 and 3 Labels
There is no point in having MPLS with no applications on top of it, so I will just skip the 1-label MPLS frame. When the label stack is referred to, the leftmost label in the frame is called the Top label, Outer label or Outermost label, and the rightmost label is called the Bottom label, Inner label or Innermost label.
The frame with 2 MPLS labels can be seen in the following scenarios:
1. VPN - both Layer 2 and Layer 3 VPN, in which the Top label represents the IGP path to the BGP next-hop PE router that originates the VPN route, and the Bottom label represents the VPN route.
2. VPN over TE (PE to PE TE) - both Layer 2 and Layer 3 VPN, in which the Top label represents the RSVP tunnel label to the peer TE PE (assuming PE to PE TE, ideally with a pair of TE tunnels for inbound and outbound traffic) and the Bottom label represents the VPN route.
The frame with 3 MPLS labels represents VPN over TE (PE-P TE and PE-P-PE TE, with LDP over TE enabled), for both Layer 2 and Layer 3 VPN. The Top label is the TE label (between PE and P), the middle label is the LDP label, and finally the Bottom label represents the VPN label.
The last thing worth mentioning here is the FAT label, the Entropy label, GMPLS and MPLS-TP.
The FAT label is actually a flow label used for optimizing load-balancing over L2VPN and VPLS. Without it, the ingress PE maps all flows inside the pseudowire (PWE3) or VPLS to the same label, which can cause inefficient balancing because all the traffic takes the same path. This is why the ingress PE can insert a new label, with no reserved value but with TTL set to 0 in order to be discarded at the egress PE.
The Entropy label is a generalized extension of the FAT label. It can be applied to IP forwarding, L2VPN or L3VPN. The entropy label is not used for forwarding and is not signaled; its only purpose is to improve load-balancing. The egress LSR signals the ELI (Entropy Label Indicator, reserved value 7), which indicates that the following label is the Entropy label. The ingress LSR includes the ELI in the label stack along with the Entropy label. The S bit for the ELI is set to 0, and its TTL value is set to the same TTL value as the label above it (RFC 6790).
GMPLS (Generalized MPLS) labels can be fibers, wavelengths, timeslots and so on. MPLS-TP (Transport Profile) has the same frame format as MPLS but a restricted forwarding plane, and its control plane can be either statically defined or dynamic using GMPLS.
12. Cisco FabricPath Headers
The FabricPath encapsulation uses a MAC-in-MAC encapsulation format. The original Ethernet frame, including an 802.1Q tag, is prepended by a 48-bit outer source address (SA), a 48-bit outer destination address (DA), and a 32-bit FabricPath tag. While the outer SA and DA may appear as 48-bit MAC addresses, FabricPath switches receiving such frames on a FabricPath core port parse these fields according to the format shown below.
FabricPath header - 16 bytes = ODA (6 bytes) + OSA (6 bytes) + FP Tag (4 bytes)
ODA and OSA (same layout for both) - 6 bytes each
Field | Endnode ID | U/L | I/G | Endnode ID | RSVD | OOO/DL | Switch ID | Sub-switch ID | Port ID
Bits | 6 | 1 | 1 | 2 | 1 | 1 | 12 | 8 | 16
FP Tag - 4 bytes
Field | Etype 0x8903 | Ftag | TTL
Bits | 16 | 10 | 6
– ODA – O-DMAC – Outer Destination MAC – 48 bits / 6 bytes
– OSA – O-SMAC – Outer Source MAC – 48 bits / 6 bytes
– FP Tag – FabricPath Tag – 32 bits / 4 bytes
12.1 Endnode ID – 6 bits – Reserved, not yet used. The presence of this field may provide the future capability for a FabricPath-enabled end station to uniquely identify itself, allowing FabricPath-based forwarding decisions down to the virtual or physical end-station level.
12.2. U/L
bit – 1 bit – FabricPath switches set
this bit in all unicast outer SA and DA fields, indicating the MAC address is
locally administered (as opposed to universally unique). This is required since
the outer SA and DA fields are not in fact MAC addresses and do not uniquely
identify a particular hardware component as a standard MAC address would.
12.3 I/G
bit – 1 bit – The I/G bit serves the same
function in FabricPath as in standard Ethernet. Any multidestination addresses
have this bit set.
12.4 OOO/DL
bit – 1 bit – The function of the OOO
(out-of-order)/don't learn (DL) bit varies depending on whether the bit is set
in the outer DA (OOO) or the outer SA (DL). Reserved, not yet used.
12.5 Switch
ID – 12 bits
– Every
switch in the FabricPath domain is assigned a unique 12-bit Switch ID. In the
outer SA, this field identifies the FabricPath switch that originated the frame
(typically the ingress FabricPath edge switch). In the outer DA, this field
identifies the destination FabricPath switch.
12.6 Sub-Switch
ID – 8 bits – The sub-switch ID (sSID) field
identifies the source or destination VPC+ port-channel interface associated
with a particular VPC+ switch pair. FabricPath switches running VPC+ use this
field to identify the specific VPC+ port-channel on which traffic is to be
forwarded. The sSID value is locally significant to each VPC+ switch pair. Note
that, because this field is 8 bits, using the sSID to identify VPC+
port-channels imposes a limit of roughly 250 VPC+ port-channels per VPC+ switch
pair (244 to be precise). In the absence of VPC+, this field is always set
to 0.
12.7 Port
ID – Local
Identifier (LID) – 16 bits – Can be used to identify the specific physical or
logical interface on which the frame was sourced or is destined. When used, the
value is locally significant to each switch. This field in the outer DA allows
the egress FabricPath switch to forward the frame to the appropriate edge
interface without requiring a MAC address table lookup. For frames sourced from
or destined to a VPC+ port-channel, this field is set to a common value shared
by both VPC+ peer switches, and the sSID is used by default to select the
outgoing port instead.
12.8 Etype – EtherType – 16 bits – The EtherType value for FabricPath encapsulated frames is 0x8903.
12.9 FTAG – 10 bits – The function of the forwarding tag, or FTAG, depends on whether a particular frame is unicast or multidestination. In the case of unicast frames, the FTAG identifies the FabricPath topology the frame is traversing. The system selects a unique FTAG for each topology configured. In the case of multidestination frames, the FTAG identifies which multidestination forwarding tree in a given topology the frame should traverse.
12.10 TTL – 6 bits – The Time to Live (TTL) field serves the same
purpose in FabricPath as it does in traditional IP forwarding - each switch hop
decrements the TTL by 1, and frames with an expired TTL are discarded. The TTL
in FabricPath prevents Layer 2 bridged frames from looping endlessly in the
event that a transitory loop occurs (such as during a reconvergence event).
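The 32-bit FabricPath tag (Etype 16 bits, FTAG 10 bits, TTL 6 bits) can be unpacked as in the sketch below. It assumes the fields are packed most-significant first in the order listed above; the function name is made up for illustration.

```python
import struct

FABRICPATH_ETYPE = 0x8903


def parse_fp_tag(tag: bytes) -> dict:
    """Decode the 4-byte FabricPath tag: Etype (16) + FTAG (10) + TTL (6)."""
    etype, rest = struct.unpack("!HH", tag)
    if etype != FABRICPATH_ETYPE:
        raise ValueError("not a FabricPath tag")
    return {"etype": etype, "ftag": rest >> 6, "ttl": rest & 0x3F}


# FTAG 1 (first topology / tree), TTL 32.
print(parse_fp_tag(bytes.fromhex("89030060")))
```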
13. PBB Ethernet Header - 802.1AH - MACinMAC - 42 bytes header - Consists of the following fields:
PBB header - 42 bytes = Backbone Components (16 bytes) + Service Components (6 bytes) + C-DMAC (6 bytes) + C-SMAC (6 bytes) + QinQ Components (8 bytes)
Field | B-DMAC | B-SMAC | B-Type 0x88a8 | B-COS | B-DEI | B-VID | I-Type 0x88e7 | I-SID | I-COS | I-DEI | Reserved | C-DMAC | C-SMAC | P-TPID 0x88a8 | P-COS | P-DEI | P-VID | C-TPID 0x8100 | C-COS | C-DEI | C-VID
Bits | 48 | 48 | 16 | 3 | 1 | 12 | 16 | 24 | 3 | 1 | 4 | 48 | 48 | 16 | 3 | 1 | 12 | 16 | 3 | 1 | 12
B-Type through B-VID form the B-TAG (4 bytes); I-SID through Reserved form the I-TAG (4 bytes); P-TPID through P-VID form the P-TAG (4 bytes); C-TPID through C-VID form the C-TAG (4 bytes).
13.1 Backbone Components
– 16 bytes –
Have the following fields:
13.1.1 B-DMAC – Backbone Destination MAC – 6 bytes
13.1.2 B-SMAC – Backbone Source MAC – 6 bytes
13.1.3 B-TAG – Backbone 802.1Q TAG – 4 bytes
13.2 Service Components – 6 bytes – Have the following fields:
13.2.1 I-Type – 2 bytes –
Represents Service Ethertype, 0x88e7 HEX in this case
13.2.2 I-TAG – 4 bytes – Represents a "modified type of 802.1Q
TAG" with the following internal fields:
13.2.2.1 I-SID – 3 bytes –
Represents Service Identifier or Service Instance VLAN ID, it allows to
distinguish services within the same PBB domain
13.2.2.2 I-PCP/ I-COS – 3 bits – Same as PCP in normal 802.1Q frame header
13.2.2.3 I-DEI – 1 bit – Same as DEI in normal 802.1Q frame header
13.2.2.4 Reserved – 4 bits
13.3 QinQ Components – 8 bytes – same fields as in 802.1 AD Header
14. TRILL (Transparent Interconnection of Lots of Links) Headers
TRILL headers - 40 bytes = Outer MAC header (16 bytes) + Etype (2 bytes) + TRILL header (6 bytes) + Inner MAC header (16 bytes)
Field | O-DMAC | O-SMAC | O-TAG | Etype | V | R | M | OL | HC | E-Rbridge Nickname | I-Rbridge Nickname | I-DMAC | I-SMAC | I-TAG
Bits | 48 | 48 | 32 | 16 | 2 | 2 | 1 | 5 | 6 | 16 | 16 | 48 | 48 | 32
O-DMAC, O-SMAC and O-TAG form the outer MAC header; V through I-Rbridge Nickname form the TRILL header; I-DMAC, I-SMAC and I-TAG form the inner MAC header.
14.1 Outer MAC header 16
bytes – contains O-DMAC (Outer
Destination MAC), O-SMAC (Outer Source MAC), O-TAG (Outer 802.1Q TAG) – Those contain same fields as described above in
Ethernet frame and in 802.1Q Ethernet frame.
14.2 Etype – TRILL EtherType – 2 bytes – The value is 0x22F3, the EtherType assigned to the IETF TRILL protocol.
14.3 TRILL HEADER – 6 bytes – This contains the following fields:
14.3.1 V – Version – 2 bits – Represents
TRILL protocol version.
14.3.2 R – Reserved – 2 bits – Reserved for future use
in extensions to the TRILL version.
14.3.3 M – Multi-destination bit – 1 bit – Indicates that the frame is to be delivered to a
class of destination end stations via distribution tree and that the egress
Nickname field specifies this tree:
– M=0 – The egress RBridge Nickname contains a Nickname of the
egress Rbridge for a known unicast MAC address
– M=1 – The egress RBridge Nickname contains a Nickname that specifies a
distribution tree (RBridge that is the root of the tree)
14.3.4 OL – Option Length – 5 bits – Indicates whether the frame uses an optional capability that needs extra information to be encoded in the TRILL Header. If it is 0, there are no options present. If options are present, they follow immediately after the Ingress Rbridge Nickname field.
14.3.5 HC – Hop Count – 6 bits – An Rbridge drops frames received with a hop count of 0; otherwise it decrements the hop count.
14.3.6 E-Rbridge Nickname – Egress Rbridge Nickname – 16 bits – Both the Egress and the Ingress Nicknames are dynamically assigned values that act as abbreviations for the RBridges' IS-IS IDs, to achieve a more compact encoding, and can be used to specify potentially different trees with the same root.
– For known unicast frames (M=0), the Egress RBridge Nickname field specifies the egress RBridge (which should remove the TRILL encapsulation).
– For multi-destination TRILL frames (M=1), the Egress RBridge Nickname field contains a nickname specifying the distribution tree selected to forward the frame.
14.3.7 I-Rbridge Nickname – Ingress Rbridge Nickname – 16 bits – Is set to a nickname of the ingress Rbridge for TRILL data frames and to a nickname of the source RBridge for TRILL ESADI frames. If the RBridge setting the ingress nickname has multiple nicknames, it should use the same nickname in the ingress field whenever it encapsulates a frame with any particular Inner.Mac.SA and Inner.Vlan value.
14.4 Inner MAC header 16
bytes – Contains I-DMAC
(Inner Destination MAC), I-SMAC (Inner Source MAC), I-TAG
(Inner 802.1Q TAG) - Those contain same fields as described above in Ethernet frame
and in 802.1Q Ethernet frame.
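The 6-byte TRILL header described in 14.3 (V 2 bits, R 2 bits, M 1 bit, OL 5 bits, HC 6 bits, then two 16-bit nicknames) can be decoded as in the sketch below. It assumes the fields are packed most-significant first in the order listed; the function and key names are illustrative.

```python
import struct


def parse_trill_header(header: bytes) -> dict:
    """Decode the fixed 6-byte TRILL header (options, if any, are not parsed)."""
    flags, egress, ingress = struct.unpack("!HHH", header[:6])
    return {
        "version": flags >> 14,
        "reserved": (flags >> 12) & 0x3,
        "multi_destination": (flags >> 11) & 0x1,
        "op_length": (flags >> 6) & 0x1F,
        "hop_count": flags & 0x3F,
        "egress_nickname": egress,
        "ingress_nickname": ingress,
    }


# M=1 (multi-destination), no options, hop count 10.
hdr = parse_trill_header(bytes.fromhex("080a" "1234" "abcd"))
print(hdr)
```

With M=1, the egress nickname in the output identifies a distribution tree rather than an egress RBridge.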
15. Cisco OTV - Overlay Transport Virtualization
Cisco Overlay Transport Virtualization (OTV) is a Layer 2-over-Layer 3 "MAC-in-IP" encapsulation technology that is designed to extend the reach of Layer 2 domains across data center pods, domains, and sites. It uses stateless tunnels to encapsulate Layer 2 frames in the IP header and does not require the creation or maintenance of fixed stateful tunnels. OTV encapsulates the entire Ethernet frame in an IP and User Datagram Protocol (IP/UDP) header, so that the provider or core network is transparent to the services offered by OTV.
OTV uses Ethernet over Generic Routing Encapsulation (GRE) and adds an OTV shim to the header to encode VLAN information. The OTV encapsulation is 42 bytes, which is less than virtual private LAN service (VPLS) over GRE.
Here is what I have found on INE, in "OTV Decoded – A Fancy GRE Tunnel", a very interesting and nice way to explain OTV:
MPLS? GRE? Where did those come from? That’s right, OTV is in fact a fancy GRE tunnel. More specifically it is an Ethernet over MPLS over GRE tunnel. My poor little PINGs between R2 and R3 are in fact encapsulated as ICMP over IP over Ethernet over MPLS over GRE over IP over Ethernet (IoIoEoMPLSoGREoIP for short).
Because there are way too many fields here, I will group them by type.
OTV Headers - 72 bytes / octets
Header | Outer Ethernet 802.1Q | Outer IP header | Outer UDP Header | OTV Shim Header | Inner Ethernet 802.1Q
Bytes | 18 | 20 | 8 | 8 | 18
15.1 Outer Ethernet 802.1Q Headers – 18 Bytes
Outer Ethernet 802.1Q - 18 bytes / octets
Field | O-DMAC | O-SMAC | Type 0x8100 | O-COS | O-DEI | O-VID | Etype 0x0800
Bits | 48 | 48 | 16 | 3 | 1 | 12 | 16
Same as any 802.1Q frame, with the above fields and EtherTypes.
15.2 Outer IP Header – 20 bytes
Outer IP header - 20 bytes / octets
Field | V | IHL 5 | TOS | Total Length | Identification | Flag DF=1 | Fragment Offset | Time to Live | Protocol 17 | Header Checksum | S-IP | D-IP
Bits | 4 | 4 | 8 | 16 | 16 | 3 | 13 | 8 | 8 | 16 | 32 | 32
15.2.1 V – Version – 4 bits – Set to value 4 in decimal.
15.2.2 IHL – 4 bits – Set to
value 5 in decimal meaning there are no IP options present in an OTV
encapsulated packet.
15.2.3 TOS – Type of Service – 8 bits – The 802.1P bits from the Ethernet Frame are
copied to this field.
15.2.4 Total Length – 16 bits – The total length of the IP datagram in bytes. This
includes the IP header, the UDP header, the OTV header, and the L2 frame
without the preamble and CRC fields.
15.2.5 Identification – 16 bits – Set randomly by the OTV Edge Device.
15.2.6 Flags – 3 bits – The
DF bit should be set to 1.
15.2.7 TTL Time to Live – 8 bits – Set
by the OTV Edge Device and is configurable.
15.2.8 Protocol – 8 bits – Since
the packet is UDP encapsulated, this field is set to 17 decimal.
15.2.9 Header Checksum – 16 bits – Must
be computed by the OTV Edge Device over the IP header fields.
15.2.10 S-IP – Source Address – 32 bits – The
IP address of the OTV Edge Device doing the encapsulation of the L2 frame.
15.2.11 D-IP – Destination Address – 32 bits – The
IP unicast or multicast address set by the OTV Edge Device which is
encapsulating the L2 frame. The Edge
Device decides when the address is set to a unicast or multicast address.
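To make the outer IP header fields concrete, here is a sketch that builds a 20-byte header with the values listed above (version 4, IHL 5, DF=1, protocol 17) and computes the header checksum. The function names, the default TTL of 64 and the example addresses are assumptions of this sketch, not values from the OTV draft.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, then inverted."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_outer_ip(src: str, dst: str, payload_len: int,
                   tos: int = 0, ttl: int = 64, ident: int = 0) -> bytes:
    """Build the 20-byte outer IPv4 header for a UDP-encapsulated frame."""
    ver_ihl = (4 << 4) | 5            # version 4, IHL 5 (no IP options)
    flags_frag = 0b010 << 13          # DF=1, fragment offset 0
    header = struct.pack("!BBHHHBBH4s4s", ver_ihl, tos, 20 + payload_len,
                         ident, flags_frag, ttl, 17, 0,
                         socket.inet_aton(src), socket.inet_aton(dst))
    checksum = ipv4_checksum(header)  # computed with the checksum field at 0
    return header[:10] + struct.pack("!H", checksum) + header[12:]


hdr = build_outer_ip("192.0.2.1", "192.0.2.2", payload_len=1534)
print(hdr.hex())
```

Running the checksum over the finished header returns 0, which is the usual way a receiver verifies it.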
15.3 Outer UDP Header – 8 bytes
Outer UDP Header - 8 bytes / octets
Field | S-Port | D-Port 8472 | UDP Length | UDP Checksum
Bits | 16 | 16 | 16 | 16
15.3.1 S-Port – Source Port – 16 bits – Is
chosen by the OTV Edge Device which is encapsulating the L2 frame based on a
hash of the L2 frame. This allows packets to be load-split evenly over LAGs on
routers in the core, responsible for delivering these IP encapsulated packets.
15.3.2 D-Port – Destination Port – 16 bits –
This is an IANA assigned well-known user port number. Packets encapsulated by
an OTV Edge Device put value 8472 in the destination port field.
15.3.3 UDP Length – 16 bits - Is the length in bytes of the UDP header, the
OTV header, and the L2 frame without the preamble and CRC fields.
15.3.4 UDP Checksum – 16 bits – This is set to 0 by the OTV Edge Device when
doing encapsulation and ignored by the OTV Edge Device which is decapsulating
at the destination site.
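The source port selection can be sketched as follows. The text only says that the port is based on a hash of the L2 frame; using CRC-32 as that hash and mapping it into the 49152-65535 range are assumptions made here purely for illustration, as are the function names.

```python
import struct
import zlib

OTV_UDP_PORT = 8472
OTV_SHIM_LEN = 8


def source_port_for(l2_frame: bytes, low: int = 49152, high: int = 65535) -> int:
    """Derive a stable source port from the frame (hash choice is an assumption)."""
    return low + zlib.crc32(l2_frame) % (high - low + 1)


def build_outer_udp(l2_frame: bytes) -> bytes:
    """Build the 8-byte outer UDP header; the checksum is left at 0."""
    udp_length = 8 + OTV_SHIM_LEN + len(l2_frame)  # UDP header + OTV header + L2 frame
    return struct.pack("!HHHH", source_port_for(l2_frame), OTV_UDP_PORT, udp_length, 0)


frame = bytes(64)
print(source_port_for(frame), build_outer_udp(frame).hex())
```

Because the same frame contents always hash to the same port, packets of one flow stay on one LAG member while different flows spread out.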
15.4 OTV Shim Header – 8 bytes
OTV Shim Header - 8 bytes / octets
Field | R | R | R | R | I | R | R | R | Overlay ID | Instance ID | Reserved
Bits | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 24 | 24 | 8
15.4.1 R - Reserved bits – 7 bits at the beginning of header and 8 bits at
the end
15.4.2 I - Instance-ID –1 bit – When
set to 1, it indicates the Instance ID should be used in the forwarding lookup.
15.4.3 Overlay ID – 24 bits – Is
used only for control plane packets such as the URP/MRP (IS-IS) to identify
packets for a specific overlay.
15.4.4 Instance ID – 24 bits – Set
by the OTV Edge Device doing the encapsulation to specify a logical table that
should be used for lookup by the OTV Edge Device at the destination site.
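Packing the 8-byte shim can be sketched as below, assuming the bit order of the table above (four reserved bits, the I bit, three reserved bits, then Overlay ID, Instance ID and a reserved byte). The function name and defaults are illustrative.

```python
import struct

I_FLAG = 0x08  # fifth bit of the first byte: R R R R I R R R


def build_otv_shim(instance_id: int, overlay_id: int = 0, use_instance: bool = True) -> bytes:
    """Pack flags (8 bits) + Overlay ID (24) + Instance ID (24) + Reserved (8)."""
    if not (0 <= instance_id < 2**24 and 0 <= overlay_id < 2**24):
        raise ValueError("IDs are 24-bit values")
    flags = I_FLAG if use_instance else 0
    word1 = (flags << 24) | overlay_id
    word2 = instance_id << 8          # the low 8 bits stay reserved (0)
    return struct.pack("!II", word1, word2)


print(build_otv_shim(instance_id=5).hex())
```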
15.5 Inner 802.1Q Ethernet Headers – 18 Bytes
Inner Ethernet 802.1Q - 18 bytes / octets
Field | I-DMAC | I-SMAC | Type 0x8100 | I-COS | I-DEI | I-VID
Bits | 48 | 48 | 16 | 3 | 1 | 12
Same as any 802.1Q frame, with the above fields and EtherTypes.
See more details in draft-hasmit-otv-01.
The difference between
OTV and VPLS .
16. LISP - Location/Identifier
Separation Protocol
The Cisco
Location/Identifier Separation Protocol, or LISP, is designed to address the
challenges of using a single address field for both device identification and
topology location. LISP addresses the problem by uniquely identifying two
different number sets: routing locators (RLOCs), which describe the topology
and location of attachment points and hence are used to forward traffic, and
endpoint identifiers (EIDs), which are used to address end hosts separate from
the topology of the network.
LISP [I-D.ietf-lisp]
essentially provides an IP over IP overlay where the internal addresses are end
station Identifiers and the outer IP addresses represent the location of the end
station within the core IP network topology. The LISP overlay header uses a 24
bit Instance ID used to support overlapping inner IP addresses.
LISP Headers - 72 bytes / octets
Header | Outer Ethernet 802.1Q | Outer IP header | Outer UDP Header | LISP Shim Header | Inner Ethernet 802.1Q
Bytes | 18 | 20 | 8 | 8 | 18
16.1 Outer Ethernet 802.1Q Headers – 18 Bytes
Outer Ethernet 802.1Q - 18 bytes / octets
Field | O-DMAC | O-SMAC | Type 0x8100 | O-COS | O-DEI | O-VID | Etype 0x0800
Bits | 48 | 48 | 16 | 3 | 1 | 12 | 16
Same as any 802.1Q frame, with the above fields and EtherTypes.
16.2 Outer IP Header – 20 bytes
Outer IP header - 20 bytes / octets
Field | V | IHL 5 | TOS | Total Length | Identification | Flag DF=1 | Fragment Offset | Time to Live | Protocol 17 | Header Checksum | S-IP | D-IP
Bits | 4 | 4 | 8 | 16 | 16 | 3 | 13 | 8 | 8 | 16 | 32 | 32
Same as the OTV IP header (or any IP header) format, except that setting the DF bit is not mandatory.
LISP introduces two new numbering spaces, Endpoint Identifiers (EIDs) and Routing Locators (RLOCs), which are intended to replace most uses of IP addresses on the Internet. To provide flexibility for current and future applications, these values can be encoded in LISP control messages using a general syntax that includes Address Family Identifier (AFI), length, and value fields.
See more details on the LISP Canonical Address Format.
16.3 Outer UDP Header – 8 bytes
Outer UDP Header - 8 bytes / octets
| S-Port | D-Port 4341 | UDP length | UDP Checksum |
| 16 bits | 16 bits | 16 bits | 16 bits |
Same as the OTV UDP header, with the difference that the UDP destination port number
is 4341. When the headers are used for encapsulating L2 frames, however, the UDP
destination port is set to 8472 (same as OTV).
16.4 LISP Shim Header – 8 bytes
LISP Shim Header - 8 bytes / octets
| N | L | E | V | I | Flags | Nonce / Map Version | Instance ID / Locator Status Bits |
| 1 bit | 1 bit | 1 bit | 1 bit | 1 bit | 3 bits | 24 bits | 32 bits |
16.4.1 N – Nonce Present – 1 bit – The N bit is the nonce-present bit. When this bit is set to 1, the low-order
24 bits of the first 32 bits of the LISP header contain a Nonce. The N and V bits MUST NOT both be set in
the same packet. If they are, a decapsulating ETR MUST treat the
"Nonce/Map-Version" field as having a Nonce value present.
16.4.2 L – Locator Status Bits – 1 bit – When this bit is set to 1, the Locator Status
Bits in the second 32 bits of the LISP header are in use.
16.4.3 E – Echo-nonce-request – 1 bit – This bit
MUST be ignored and has no meaning when the N bit is set to 0. When the N bit and this bit are both
set to 1, an ITR is requesting that the nonce value in the Nonce field
be echoed back in LISP-encapsulated packets when the ITR is also an ETR.
16.4.4 V – Map Version – 1 bit – When this
bit is set to 1, the N bit MUST be 0. This bit indicates that the LISP header
is encoded as below:
LISP Shim Header - 8 bytes / octets
| N | L | E | V | I | Flags | Source Map Version | Destination Map Version | Instance ID / Locator Status Bits |
| 0 | x | 0 | 1 | x | x x x | 12 bits | 12 bits | 32 bits |
16.4.5 I – Instance ID – 1 bit – When
this bit is set to 1, the Locator Status Bits field is reduced to 8 bits and
the high-order 24 bits are used as an Instance ID. If the L-bit is set to 0,
then the low-order 8 bits are transmitted as zero and ignored on receipt. In
this case the LISP header looks like this:
LISP Shim Header - 8 bytes / octets
| N | L | E | V | I | Flags | Nonce / Map Version | Instance ID | LSBs |
| x | x | x | x | 1 | x x x | 24 bits | 24 bits | 8 bits |
16.4.6 Flags – 3 bits – Reserved for future flag use. It MUST be set to 0
on transmit and MUST be ignored on receipt.
16.4.7 Nonce – 24 bits – The LISP nonce field is a 24-bit value that is
randomly generated by an ITR when the N-bit is set to 1. Nonce generation algorithms are an
implementation matter but are required to generate different nonces when
sending to different destinations. However, the same nonce can be used for a
period of time to the same destination. The nonce is also used when the E-bit is set to request the nonce value
to be echoed by the other side when packets are returned. When the E-bit is clear but the N-bit is set,
a remote ITR is either echoing a previously requested echo-nonce or providing a
random nonce.
16.4.8 LSB – LISP Locator Status Bits – 32 bits (8 bits when the I-bit is set)
– When the L-bit is also
set, the Locator Status Bits field in the LISP header is set by an ITR to indicate
to an ETR the up/down status of the Locators in the source site. Each RLOC in a Map-Reply is assigned an
ordinal value from 0 to n-1 (when there are n RLOCs in a mapping entry). The
Locator Status Bits are numbered from 0 to n-1, starting from the least significant bit
of the field. The field is 32 bits when the
I-bit is set to 0 and 8 bits when the I-bit is set to 1. When a Locator Status Bit is set to 1, the ITR is
indicating to the ETR that the RLOC associated with the bit ordinal has up status. When
a site has multiple EID-prefixes which result in multiple mappings (where each
could have a different locator-set), the Locator Status Bits setting in an
encapsulated packet MUST reflect the mapping for the EID-prefix that the
inner-header source EID address matches. If the LSB for an anycast locator is set to 1, then there is at least
one RLOC with that address, and the ETR is considered 'up'.
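The flag rules for the shim header are easier to follow in code. The sketch below is my own illustration in Python, not taken from the LISP draft: it decodes the 8-byte shim header following the bit descriptions above (N wins over V if both are set, I shrinks the Locator Status Bits to 8 bits). The function name `parse_lisp_shim` and the dictionary keys are invented for the example.

```python
import struct

def parse_lisp_shim(data: bytes) -> dict:
    """Decode the 8-byte LISP shim header into a dict of the fields in use."""
    if len(data) < 8:
        raise ValueError("LISP shim header is 8 bytes")
    first, second = struct.unpack("!II", data[:8])
    flags = first >> 24              # N L E V I + 3 reserved flag bits
    low24 = first & 0xFFFFFF         # Nonce or the two Map Versions
    n, l, e, v, i = (bool(flags & m) for m in (0x80, 0x40, 0x20, 0x10, 0x08))
    fields = {"N": n, "L": l, "E": e, "V": v, "I": i}
    if n:
        # If N and V are both set, the field is treated as a Nonce.
        fields["nonce"] = low24
    elif v:
        fields["source_map_version"] = low24 >> 12
        fields["dest_map_version"] = low24 & 0xFFF
    if i:
        fields["instance_id"] = second >> 8   # high-order 24 bits
        lsb = second & 0xFF                   # LSBs reduced to 8 bits
    else:
        lsb = second                          # full 32-bit LSB field
    if l:
        fields["locator_status_bits"] = lsb
    return fields

# N=1, L=1, I=1, nonce 0xABCDEF, Instance ID 0x123, locators 0 and 1 up.
sample = bytes.fromhex("c8abcdef00012303")
print(parse_lisp_shim(sample))
```

In the sample, the two low-order Locator Status Bits are set, so the RLOCs with ordinals 0 and 1 are reported as up.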
16.5 Inner 802.1Q Ethernet Headers – 18 Bytes
Inner Ethernet 802.1Q - 18 bytes / octets
| I-DMAC | I-SMAC | Type 0x8100 | I-COS | I-DEI | I-VID |
| 48 bits | 48 bits | 16 bits | 3 bits | 1 bit | 12 bits |
Same as any 802.1Q frame, with the above fields and Ether-types.
See more details on the draft-ietf-lisp-23#page-19
17. VxLAN - Virtual eXtensible LANs
Virtual Extensible LAN, or VXLAN, is a Layer 2 overlay scheme over a Layer 3
network. It uses an IP/UDP encapsulation so that the provider or core network does
not need to be aware of any additional services that VXLAN is offering. A 24-bit
VXLAN segment ID or VXLAN network identifier (VNI) is included in the encapsulation
to provide up to 16 million VXLAN segments for traffic isolation and segmentation,
in contrast to the roughly 4000 segments (4094 usable IDs) achievable with 12-bit
VLAN IDs. Each of these segments represents a unique Layer 2 broadcast domain and
can be administered in such a way that it can uniquely identify a given tenant's
address space or subnet.
In short, each overlay is termed a VXLAN segment, and only VMs within the same
VXLAN segment can communicate with each other. Each VXLAN segment is scoped
through a 24-bit segment ID, hereafter termed the VXLAN Network Identifier (VNI).
This allows up to 16M VXLAN segments to coexist within the same administrative domain.
VxLAN Headers - 72 bytes / octets
| Outer Ethernet 802.1Q | Outer IP header | Outer UDP Header | VxLAN Shim Header | Inner Ethernet 802.1Q |
| 18 bytes / octets | 20 bytes / octets | 8 bytes / octets | 8 bytes / octets | 18 bytes / octets |
17.1 Outer Ethernet 802.1Q Headers – 18 Bytes
Outer Ethernet 802.1Q - 18 bytes / octets
| O-DMAC | O-SMAC | Type 0x8100 | O-COS | O-DEI | O-VID | Etype 0x0800 |
| 48 bits | 48 bits | 16 bits | 3 bits | 1 bit | 12 bits | 16 bits |
Same as any 802.1Q frame, with the above fields and Ether-types.
17.2 Outer IP Header – 20 bytes
Outer IP header - 20 bytes / octets
| V | IHL 5 | TOS | Total Length | Identification | Flag DF=1 | Fragment Offset | Time to Live | Protocol 17 | Header Checksum | S-IP | D-IP |
| 4 bits | 4 bits | 8 bits | 16 bits | 16 bits | 3 bits | 13 bits | 8 bits | 8 bits | 16 bits | 32 bits | 32 bits |
It is the same as the OTV IP header (or any IP header) format. The source IP
address is the IP address of the VTEP behind which the communicating VM (as
represented by the inner source MAC address) is running. The destination IP
address can be a unicast or multicast IP address. When it is a unicast IP
address, it is the IP address of the VTEP connecting the communicating
VM represented by the inner destination MAC address.
17.3 Outer UDP Header – 8 bytes
Outer UDP Header - 8 bytes / octets
| S-Port | D-Port 4789 | UDP length | UDP Checksum |
| 16 bits | 16 bits | 16 bits | 16 bits |
It is the same as the OTV UDP header, with the difference that the UDP destination
port number is 4789. Some early implementations of VXLAN have used other values for
the destination port. To enable interoperability with these implementations, the
destination port SHOULD be configurable. It is recommended that the source port
number be calculated using a hash of fields from the inner packet, one example being
a hash of the inner Ethernet frame's headers. This is to enable a level of entropy
for ECMP/load balancing of the VM-to-VM traffic across the VXLAN overlay.
The UDP checksum field SHOULD be transmitted as zero. When a packet is received
with a UDP checksum of zero, it MUST be accepted for decapsulation. Optionally, if
the encapsulating endpoint includes a non-zero UDP checksum, it MUST be correctly
calculated across the entire packet including the IP header, UDP header, VXLAN
header and encapsulated MAC frame. When a decapsulating endpoint receives a packet
with a non-zero checksum it MAY choose to verify the checksum value. If it chooses
to perform such verification, and the verification fails, the packet MUST be
dropped. If the decapsulating destination chooses not to perform the verification,
or performs it successfully, the packet MUST be accepted for decapsulation.
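To illustrate the source-port recommendation, here is a small Python sketch that derives the outer UDP source port from a hash of the inner Ethernet header. The choice of CRC32 as the hash, the 14-byte slice, the function name `vxlan_source_port` and the 49152-65535 port range are all my assumptions for the example; a real VTEP may hash different fields with a different function.

```python
import zlib

def vxlan_source_port(inner_frame: bytes, low: int = 49152, high: int = 65535) -> int:
    """Map a hash of the inner Ethernet header onto a UDP source port."""
    # First 14 bytes of the inner frame: DMAC (6) + SMAC (6) + Type (2).
    digest = zlib.crc32(inner_frame[:14])
    # Fold the 32-bit hash into the chosen port range (an assumed range).
    return low + digest % (high - low + 1)

# Hypothetical inner frame: two made-up MAC addresses, Type 0x0800, some payload.
frame = bytes.fromhex("0200000000aa0200000000bb0800") + b"payload"
print(vxlan_source_port(frame))
```

Because the port depends only on the inner header, all packets of one VM-to-VM conversation take the same ECMP path (no reordering), while different conversations are spread across the available paths.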
17.4 VxLAN Shim Header – 8 bytes
VxLAN Shim Header - 8 bytes / octets
| R | R | R | R | I | R | R | R | Reserved | VXLAN Network Identifier (VNI) | Reserved |
| 1 bit | 1 bit | 1 bit | 1 bit | 1 bit | 1 bit | 1 bit | 1 bit | 24 bits | 24 bits | 8 bits |
17.4.1 R – Reserved – 1 bit each – Seven 1-bit reserved flag fields; they MUST be
set to 0.
17.4.2 I – VNI flag – 1 bit – MUST be set to 1 to indicate a valid VXLAN Network ID (VNI).
17.4.3 Reserved (24 bits and 8 bits) – MUST be set to zero.
17.4.4 VXLAN Network ID (VNI) – 24 bits – Designates the individual VXLAN
overlay network on which the communicating
VMs are situated. VMs in different VXLAN overlay networks cannot communicate
with each other.
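As a quick illustration of the shim layout, the following Python sketch packs and unpacks the 8-byte VxLAN header described above: flags byte with only the I bit set, 24 reserved bits, the 24-bit VNI, and a final reserved byte. The helper names `pack_vxlan_header` and `unpack_vxlan_header` are made up for this example.

```python
import struct

VXLAN_FLAG_I = 0x08  # R R R R I R R R -> the I bit is the fifth bit of the first byte

def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VxLAN header for a given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits); word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_I << 24, vni << 8)

def unpack_vxlan_header(data: bytes) -> int:
    """Return the VNI, rejecting headers whose I flag is not set."""
    if len(data) < 8:
        raise ValueError("VxLAN header is 8 bytes")
    word1, word2 = struct.unpack("!II", data[:8])
    if not (word1 >> 24) & VXLAN_FLAG_I:
        raise ValueError("I flag not set: VNI is not valid")
    return word2 >> 8

header = pack_vxlan_header(5000)
print(header.hex(), unpack_vxlan_header(header))
```

For VNI 5000 (0x001388) the header bytes are 08 00 00 00 00 13 88 00, which makes it easy to spot the VNI in a packet capture.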
17.5 Inner 802.1Q Ethernet Headers – 18 Bytes
Inner Ethernet 802.1Q - 18 bytes / octets
| I-DMAC | I-SMAC | Type 0x8100 | I-COS | I-DEI | I-VID |
| 48 bits | 48 bits | 16 bits | 3 bits | 1 bit | 12 bits |
VXLAN is typically deployed in data centers on virtualized hosts, which may be
spread across multiple racks. The individual racks may be parts of a different
Layer 3 network or they could be in a single Layer 2 network. The VXLAN
segments/overlay networks are overlaid on top of these Layer 2 or Layer 3
networks.
See more details on
the draft-mahalingam-dutt-dcops-vxlan/?include_text=1
18. NVGRE - Network Virtualization using Generic Routing Encapsulation
Network Virtualization Using Generic Routing
Encapsulation, or NVGRE, allows the creation of virtual Layer 2 topologies on
top of a physical Layer 3 network. This design is achieved by tunneling
Ethernet frames inside an IP packet over a physical network. NVGRE supports a
24-bit segment ID or virtual subnet identifier (VSID), providing up to 16
million virtual segments that can uniquely identify a given tenant's segment or
address space.
Network virtualization involves creating virtual
Layer 2 and/or Layer 3 topologies on top of an arbitrary physical Layer 2/Layer
3 network. Connectivity in the virtual topology is provided by tunneling
Ethernet frames in IP over the physical network. Virtual broadcast domains are
realized as multicast distribution trees. The multicast distribution trees are
analogous to the VLAN broadcast domains. A virtual Layer 2 network can span
multiple physical subnets. Support for bi-directional IP unicast and multicast
connectivity is the only requirement from the underlying physical network to
support unicast communications within a virtual network. If the operator
chooses to support broadcast and multicast traffic in the virtual topology, the
physical topology must support IP multicast.
NvGRE Headers - 64 bytes / octets
| Outer Ethernet 802.1Q | Outer IP header | NvGRE Header | Inner Ethernet 802.1Q |
| 18 bytes / octets | 20 bytes / octets | 8 bytes / octets | 18 bytes / octets |
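The header totals quoted for LISP, VxLAN and NvGRE are simple sums of the per-header sizes in the overview tables. A short Python sketch of the arithmetic (the dictionary and function names are mine, the byte counts are the ones from the tables):

```python
# Per-header sizes in bytes, copied from the overview tables of each overlay.
HEADER_STACKS = {
    "LISP":  {"outer_eth_8021q": 18, "outer_ip": 20, "outer_udp": 8,
              "lisp_shim": 8, "inner_eth_8021q": 18},
    "VxLAN": {"outer_eth_8021q": 18, "outer_ip": 20, "outer_udp": 8,
              "vxlan_shim": 8, "inner_eth_8021q": 18},
    "NvGRE": {"outer_eth_8021q": 18, "outer_ip": 20,
              "nvgre": 8, "inner_eth_8021q": 18},
}

def header_bytes(overlay: str) -> int:
    """Sum all header sizes of one encapsulation."""
    return sum(HEADER_STACKS[overlay].values())

for name in HEADER_STACKS:
    print(f"{name}: {header_bytes(name)} bytes of headers")

# A 24-bit segment ID (Instance ID, VNI or VSID) versus a 12-bit VLAN ID.
print(2 ** 24, "segments with 24 bits,", 2 ** 12, "values with 12 bits")
```

NvGRE comes out 8 bytes shorter than the other two only because it has no UDP header; GRE rides directly on IP (protocol 47).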
18.1 Outer Ethernet 802.1Q Headers – 18 Bytes
Outer Ethernet 802.1Q - 18 bytes / octets
| O-DMAC | O-SMAC | Type 0x8100 | O-COS | O-DEI | O-VID | Etype 0x0800 |
| 48 bits | 48 bits | 16 bits | 3 bits | 1 bit | 12 bits | 16 bits |
Same as any 802.1Q frame, with the above fields and Ether-types.
The source Ethernet address in the outer frame is set to the MAC address associated
with the NVGRE endpoint. The destination Ethernet address is set to the MAC address
of the next-hop IP address for the destination NVE. The destination endpoint may or
may not be on the same physical subnet. The outer VLAN tag information is optional
and can be used for traffic management and broadcast scalability.
18.2 Outer IP Header – 20 bytes
Outer IP header - 20 bytes / octets
| V | IHL 5 | TOS | Total Length | Identification | Flag DF=1 | Fragment Offset | Time to Live | Protocol 47 | Header Checksum | S-IP | D-IP |
| 4 bits | 4 bits | 8 bits | 16 bits | 16 bits | 3 bits | 13 bits | 8 bits | 8 bits | 16 bits | 32 bits | 32 bits |
Same as any IP header format, with the protocol set to 47 (GRE).
18.3 NvGRE Header – 8 bytes
NvGRE Header - 8 bytes / octets
| C 0 | Reserved | K 1 | S 0 | Reserved0 | V | Protocol 0x6558 | Virtual Subnet ID (VSID) | Flow ID |
| 1 bit | 1 bit | 1 bit | 1 bit | 9 bits | 3 bits | 16 bits | 24 bits | 8 bits |
18.3.1 C – Checksum Present – 1 bit – The value is set to
zero, meaning that both the Checksum and the Reserved1 fields are not present.
18.3.2 K – Key Present – 1 bit – The bit is set to 1, meaning the Key field is present in the GRE
header.
18.3.3 S – Sequence Number Present – 1 bit – The value
is set to zero, meaning that the Sequence Number field is not present in the GRE
header.
18.3.4 Reserved0 – 9 bits – A receiver MUST discard a packet where any of bits
1-5 are non-zero, unless that receiver implements RFC 1701. Bits 6-12 are reserved for future use. These
bits MUST be sent as zero and MUST be ignored on receipt.
18.3.5 V – Version – 3 bits – The Version Number field MUST contain the value
zero.
18.3.6 Protocol – 16 bits – The Protocol Type field contains the protocol
type of the payload packet. The protocol type field in the GRE header is set to
0x6558 (transparent Ethernet bridging).
18.3.7 VSID – Virtual Subnet ID – 24 bits – The first
24 bits of the Key field are used
for the VSID. The VSID can be
crafted in such a way that it uniquely identifies a specific tenant's subnet.
The VSID is carried in an outer header, allowing unique identification of
the tenant's virtual subnet to various devices in the network. NVGRE leverages the GRE header to carry VSID information in each packet.
The VSID information in each packet can be used to build multi-tenant-aware tools
for traffic analysis, traffic inspection, and monitoring.
18.3.8 Flow ID – 8 bits – The last
8 bits of the Key field are the (optional) FlowID, which can be used to add
per-flow entropy within the same VSID; the entire 32-bit Key field MAY
be used by switches or routers in the physical network infrastructure for ECMP
(Equal-Cost Multi-Path) purposes. If a FlowID is not generated, the FlowID
field MUST be set to all zeros.
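The NvGRE header can be sketched the same way. The Python below is an illustration under my own assumptions, not code from the NVGRE draft: it packs C=0, K=1, S=0, version 0, protocol 0x6558 and a Key made of the 24-bit VSID followed by the 8-bit FlowID, and it assumes the standard GRE bit order in which K is the third bit of the first 16-bit word. The function names are invented for the example.

```python
import struct

GRE_C_BIT = 0x8000      # Checksum Present, must be 0 for NvGRE
GRE_K_BIT = 0x2000      # Key Present, must be 1 for NvGRE
GRE_S_BIT = 0x1000      # Sequence Number Present, must be 0 for NvGRE
GRE_VER_MASK = 0x0007   # Version, must be 0
PROTO_TEB = 0x6558      # transparent Ethernet bridging

def pack_nvgre_header(vsid: int, flow_id: int = 0) -> bytes:
    """Build the 8-byte NvGRE header: flags/version, protocol, Key = VSID + FlowID."""
    if not 0 <= vsid < 2 ** 24:
        raise ValueError("VSID must fit in 24 bits")
    if not 0 <= flow_id < 2 ** 8:
        raise ValueError("FlowID must fit in 8 bits")
    return struct.pack("!HHI", GRE_K_BIT, PROTO_TEB, (vsid << 8) | flow_id)

def unpack_nvgre_header(data: bytes) -> tuple:
    """Return (vsid, flow_id) after checking the fixed NvGRE flag values."""
    if len(data) < 8:
        raise ValueError("NvGRE header is 8 bytes")
    flags_ver, proto, key = struct.unpack("!HHI", data[:8])
    if flags_ver & (GRE_C_BIT | GRE_S_BIT | GRE_VER_MASK) or not flags_ver & GRE_K_BIT:
        raise ValueError("unexpected GRE flags for NvGRE")
    if proto != PROTO_TEB:
        raise ValueError("payload is not transparent Ethernet bridging")
    return key >> 8, key & 0xFF

header = pack_nvgre_header(0x123456, flow_id=7)
print(header.hex(), unpack_nvgre_header(header))
```

Since the FlowID is the low byte of the Key, a switch that hashes on the whole 32-bit Key gets per-flow entropy inside one VSID without having to look at the inner frame.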
18.4 Inner 802.1Q Ethernet Headers – 18 Bytes
Inner Ethernet 802.1Q - 18 bytes / octets
| I-DMAC | I-SMAC | Type 0x8100 | I-COS | I-DEI | I-VID |
| 48 bits | 48 bits | 16 bits | 3 bits | 1 bit | 12 bits |
Same as any 802.1Q frame, with the above fields and Ether-types.
19. STT - Stateless Transport Tunneling
Stateless Transport Tunneling (STT) is an overlay encapsulation
scheme over Layer 3 networks that uses a TCP-like header inside the IP packet.
The use of TCP fields has been proposed to provide backward compatibility with
existing NIC implementations so that their offload logic can be used, and hence STT is
specifically useful for deployments that target end systems (such as
virtual switches on physical servers). Note that, as the name implies, the TCP
fields do not use any TCP connection state.
I don't intend to describe the frame format here; more
information can be found in Draft-davie-stt-02#page-13.
And finally, I couldn't find any frame encapsulation for Juniper QFabric or Juniper MetaFabric.
Now, before going to sleep, I wonder what a frame would look like if
it were IPsec over GRE over VPLS (protected with TE PE-P FRR) over
MAC-in-MAC... for sure I wouldn't like to troubleshoot something like that, and I am
very glad that MTU 9000 is supported in almost all core routers... by Mihaela
Paraschivu, dreaming in the night...