hsflowd on Debian12/VyOS unidentified src/dst interface #69

Open
ichdasich opened this issue Nov 15, 2024 · 1 comment

ichdasich commented Nov 15, 2024

Hi,
this is probably related to #68.

I experience a similar issue on a device with two VLANs, where hsflowd fails to identify either the src or the dst interface, depending on the direction of the packet flow:

mod_pcap:macsrc=E2072CAC99F5, macdst=AEFE58D81C53
mod_pcap:srcdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=eth0.750 out=<not found> pkt_len=336 cap_len=114 mac_len=14 (E2072CAC99F5 -> AEFE58D81C53 et=0x8100)
dbg2:selected sampler eth0 ifIndex=2

Capturing on the raw interface leads to a higher number of samples with in and out .

The sFlow config is:

sflow {
  polling=30
  sampling=100
  sampling.bps_ratio=0
  agentIP=2a06:d1c3:2::1
  agent=eth0.750
  collector { ip = 2001:db8::1 udpport = 6343 }
  pcap { dev=eth0 }
}

A corresponding sflowtool dump looks as follows:

startSample ----------------------
sampleType_tag 0:1
sampleType FLOWSAMPLE
sampleSequenceNo 198
sourceId 0:2
meanSkipCount 100
samplePool 19800
dropEvents 0
inputPort 1073741823
outputPort 9
flowBlock_tag 0:1
flowSampleType HEADER
headerProtocol 1
sampledPacketSize 126
strippedBytes 4
headerLen 122
headerBytes FA-43-25-32-48-98-E2-07-2C-AC-99-F5-81-00-02-EE-86-DD-60-09-BD-CD-00-40-3A-39-2A-06-D1-C1-00-0A-00-00-01-95-01-91-01-97-01-97-2A-06-D1-C3-00-02-00-03-00-00-00-00-00-00-00-01-80-00-4F-84-46-0B-00-B5-1C-49-37-67-00-00-00-00-D7-C7-01-00-00-00-00-00-10-11-12-13-14-15-16-17-18-19-1A-1B-1C-1D-1E-1F-20-21-22-23-24-25-26-27-28-29-2A-2B-2C-2D-2E-2F-30-31-32-33-34-35-36-37
dstMAC fa4325324898
srcMAC e2072cac99f5
decodedVLAN 750
decodedPriority 0
IPSize 104
IPTOS 0
IP6_label 0x9bdcd
IPV6_payloadLen 64
IPTTL 57
srcIP6 2a06:d1c1:000a:0000:0195:0191:0197:0197
dstIP6 2a06:d1c3:0002:0003:0000:0000:0000:0001
IPProtocol 58
endSample   ----------------------
endDatagram   =================================
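
For reference, the inputPort value 1073741823 in this dump equals 0x3FFFFFFF. The sketch below shows one way to interpret such values, assuming the sFlow version 5 interface encoding (top 2 bits as a format selector, low 30 bits as the value). The function name `describe_port` is hypothetical and not part of sflowtool or hsflowd, and the spec defines the format bits for the output field, so applying them to both fields is a simplification.

```python
INTERNAL = 0x3FFFFFFF  # sFlow v5: packet originated or terminated on the agent itself


def describe_port(value: int) -> str:
    """Interpret an sFlow v5 flow-sample interface value (hypothetical helper)."""
    fmt = value >> 30          # top 2 bits select the format
    low = value & 0x3FFFFFFF   # low 30 bits carry the value
    if fmt == 0:
        if low == INTERNAL:
            return "internal"
        if low == 0:
            return "unknown"
        return f"ifIndex {low}"
    if fmt == 1:
        return f"discarded (reason {low})"
    if fmt == 2:
        return f"multiple outputs ({low} interfaces)"
    return f"reserved format {fmt}"


print(describe_port(1073741823))  # inputPort from the dump
print(describe_port(9))           # outputPort from the dump
```

Read this way, the sample reports "internal" as its input and ifIndex 9 (eth0.750) as its output.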

A capture taken with tcpdump -i eth0 -n -s 1600 -w icmp6.pap src 2a06:d1c1:000a:0000:0195:0191:0197:0197 and dst 2a06:d1c3:2:3::1 is attached:
icmp.zip

(Not the same packet as the sample, but the same path/flow.)

Could the issue be some confusion because the in and out VLANs differ? (in eth0.742, out eth0.750)

Edit: I just noticed that the debug log above is for the reverse path. In fact, I do not seem to see any flows that get correctly attributed to eth0.742.

[email protected]:~$ cat sflow.log |grep eth0.742
dbg3:reading interface eth0.742
dbg1:adaptor eth0.742 came up
dbg1:device eth0.742 Get SIOCGIFADDR failed : Cannot assign requested address
dbg1:adaptorAddOrReplace: byMac: replacing adaptor [ifindex: 7 peer: 0 nmacs: 1 mac0: E2072CAC99F5 name: eth0.710] with [ifindex: 8 peer: 0 nmacs: 1 mac0: E2072CAC99F5 name: eth0.742]
dbg1:adaptorAddOrReplace: byMac: replacing adaptor [ifindex: 8 peer: 0 nmacs: 1 mac0: E2072CAC99F5 name: eth0.742] with [ifindex: 9 peer: 0 nmacs: 1 mac0: E2072CAC99F5 name: eth0.750]
dbg1:adaptor eth0.742 has 802.1Q vlan 742
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:adaptor eth0.742 has v6 address 2a06d1c3000100040000000000000002 with scope 0x0
dbg1:adaptor eth0.742 has v6 address fe80000000000000e0072cfffeac99f5 with scope 0x20
dbg3:reading interface eth0.742
dbg1:device eth0.742 Get SIOCGIFADDR failed : Cannot assign requested address
dbg1:ETHTOOL_GMODULEINF0 eth0.742 failed : Operation not supported
dbg1:setAdaptorSpeed(ETHTOOL_GLINKSETTINGS1): eth0.742 ifSpeed == 0 (changed=NO)
dbg1:adaptor eth0.742 has 802.1Q vlan 742
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:adaptor eth0.742 has v6 address 2a06d1c3000100040000000000000002 with scope 0x0
dbg1:adaptor eth0.742 has v6 address fe80000000000000e0072cfffeac99f5 with scope 0x20
dbg3:detectInterfaceChange: testing eth0.742
dbg3:reading interface eth0.742
dbg1:device eth0.742 Get SIOCGIFADDR failed : Cannot assign requested address
dbg1:setAdaptorSpeed(ETHTOOL_GLINKSETTINGS1): eth0.742 ifSpeed == 0 (changed=NO)
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.742 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:adaptor eth0.742 has v6 address 2a06d1c3000100040000000000000002 with scope 0x0
dbg1:adaptor eth0.742 has v6 address fe80000000000000e0072cfffeac99f5 with scope 0x20
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742
dbg3:detectInterfaceChange: testing eth0.742

whereas log entries for eth0.750 do exist:

dbg3:reading interface eth0.750
dbg1:adaptor eth0.750 came up
dbg1:device eth0.750 Get SIOCGIFADDR failed : Cannot assign requested address
dbg1:adaptorAddOrReplace: byMac: replacing adaptor [ifindex: 8 peer: 0 nmacs: 1 mac0: E2072CAC99F5 name: eth0.742] with [ifindex: 9 peer: 0 nmacs: 1 mac0: E2072CAC99F5 name: eth0.750]
dbg1:adaptor eth0.750 has 802.1Q vlan 750
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:adaptor eth0.750 has v6 address 2a06d1c3000200000000000000000001 with scope 0x0
dbg1:adaptor eth0.750 has v6 address fe80000000000000e0072cfffeac99f5 with scope 0x20
agent=eth0.750
dbg3:reading interface eth0.750
dbg1:device eth0.750 Get SIOCGIFADDR failed : Cannot assign requested address
dbg1:ETHTOOL_GMODULEINF0 eth0.750 failed : Operation not supported
dbg1:setAdaptorSpeed(ETHTOOL_GLINKSETTINGS1): eth0.750 ifSpeed == 0 (changed=NO)
dbg1:adaptor eth0.750 has 802.1Q vlan 750
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:adaptor eth0.750 has v6 address 2a06d1c3000200000000000000000001 with scope 0x0
dbg1:adaptor eth0.750 has v6 address fe80000000000000e0072cfffeac99f5 with scope 0x20
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=79 cap_len=79 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=77 cap_len=77 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=79 cap_len=79 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=1284 cap_len=114 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=78 cap_len=78 mac_len=14 (F6C0A0ACE0D1 -> E2072CAC99F5 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=79 cap_len=79 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:srcdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=eth0.750 out=<not found> pkt_len=79 cap_len=79 mac_len=14 (E2072CAC99F5 -> F6C0A0ACE0D1 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=79 cap_len=79 mac_len=14 (F6C0A0ACE0D1 -> E2072CAC99F5 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=58 cap_len=58 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:srcdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=eth0.750 out=<not found> pkt_len=108 cap_len=108 mac_len=14 (E2072CAC99F5 -> FA4325324898 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=44 cap_len=44 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=108 cap_len=108 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:srcdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=eth0.750 out=<not found> pkt_len=79 cap_len=79 mac_len=14 (E2072CAC99F5 -> F6C0A0ACE0D1 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=79 cap_len=79 mac_len=14 (F6C0A0ACE0D1 -> E2072CAC99F5 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=108 cap_len=108 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:srcdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=eth0.750 out=<not found> pkt_len=79 cap_len=79 mac_len=14 (E2072CAC99F5 -> 121C4616DB6D et=0x8100)
dbg3:detectInterfaceChange: testing eth0.750
dbg3:reading interface eth0.750
dbg1:device eth0.750 Get SIOCGIFADDR failed : Cannot assign requested address
dbg1:setAdaptorSpeed(ETHTOOL_GLINKSETTINGS1): eth0.750 ifSpeed == 0 (changed=NO)
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:readL3Addresses: ifa_name=eth0.750 up=1 loopback=0 promisc=0 bond(master=0,slave=0)
dbg1:adaptor eth0.750 has v6 address 2a06d1c3000200000000000000000001 with scope 0x0
dbg1:adaptor eth0.750 has v6 address fe80000000000000e0072cfffeac99f5 with scope 0x20
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=79 cap_len=79 mac_len=14 (F6C0A0ACE0D1 -> E2072CAC99F5 et=0x8100)
mod_pcap:srcdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=eth0.750 out=<not found> pkt_len=48 cap_len=48 mac_len=14 (E2072CAC99F5 -> FEA99FA0DA9F et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=79 cap_len=79 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
mod_pcap:srcdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=eth0.750 out=<not found> pkt_len=79 cap_len=79 mac_len=14 (E2072CAC99F5 -> 121C4616DB6D et=0x8100)
mod_pcap:srcdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=eth0.750 out=<not found> pkt_len=58 cap_len=58 mac_len=14 (E2072CAC99F5 -> F6C0A0ACE0D1 et=0x8100)
mod_pcap:dstdev=eth0.750(9)(peer=0)
takeSample: hook=0 tap=eth0 in=<not found> out=eth0.750 pkt_len=79 cap_len=79 mac_len=14 (121C4616DB6D -> E2072CAC99F5 et=0x8100)
Owner

sflow commented Nov 15, 2024

With this kind of pcap sampling, hsflowd only has the MAC addresses to go on. It looks like the various eth0 VLAN sub-interfaces all have the same MAC, so they collapse together. hsflowd then recognizes that MAC as local and writes the port number as 0x3FFFFFFF to mean “internal”. That makes more sense for end hosts than for routers, but it is what is going on.
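
The collapse can be illustrated with a minimal sketch. It is not hsflowd's actual code: `Adaptor` and `AdaptorTable` are hypothetical names, and the only assumption is a lookup table keyed by MAC address, as the adaptorAddOrReplace "byMac" log lines in the report suggest. The interface names, ifIndex values, and MACs are taken from those logs.

```python
from dataclasses import dataclass


@dataclass
class Adaptor:
    name: str
    ifindex: int
    mac: str


class AdaptorTable:
    """Hypothetical MAC-keyed table: one entry per MAC, last writer wins."""

    def __init__(self):
        self.by_mac = {}

    def add_or_replace(self, adaptor):
        # An adaptor with an already-known MAC replaces the earlier entry.
        self.by_mac[adaptor.mac] = adaptor

    def lookup(self, mac):
        adaptor = self.by_mac.get(mac)
        return adaptor.name if adaptor else "<not found>"


table = AdaptorTable()
# All three VLAN sub-interfaces report the same MAC in the debug log.
for name, ifindex in [("eth0.710", 7), ("eth0.742", 8), ("eth0.750", 9)]:
    table.add_or_replace(Adaptor(name, ifindex, "E2072CAC99F5"))

print(table.lookup("E2072CAC99F5"))  # local MAC: only the last adaptor survives
print(table.lookup("AEFE58D81C53"))  # remote MAC: no local adaptor
```

With a table like this, eth0.710 and eth0.742 can never be returned by a MAC lookup, which would be consistent with no samples being attributed to eth0.742.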

The sFlow standard is only supposed to report in/out interface index numbers by physical port anyway, so it’s actually defensible, and the right place to look for the VLAN detail is in the decode of the 802.1Q header… which appears as expected.
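
As a rough illustration of where that VLAN detail lives, the sketch below decodes the first 18 bytes of the headerBytes field from the sflowtool dump in the report. The helper name `decode_dot1q` is hypothetical, and the code assumes a single 802.1Q tag (TPID 0x8100) directly after the two MAC addresses.

```python
import struct

# First 18 bytes of headerBytes from the sflowtool dump in the report.
HEADER_HEX = "FA-43-25-32-48-98-E2-07-2C-AC-99-F5-81-00-02-EE-86-DD"


def decode_dot1q(frame: bytes) -> dict:
    """Decode MACs, VLAN ID, priority and inner EtherType of a tagged frame."""
    dst, src = frame[0:6], frame[6:12]
    tpid, tci, ethertype = struct.unpack("!HHH", frame[12:18])
    if tpid != 0x8100:
        raise ValueError("frame does not carry an 802.1Q tag")
    return {
        "dst_mac": dst.hex(),
        "src_mac": src.hex(),
        "vlan": tci & 0x0FFF,   # low 12 bits of the TCI: VLAN ID
        "priority": tci >> 13,  # top 3 bits of the TCI: priority code point
        "ethertype": hex(ethertype),
    }


frame = bytes.fromhex(HEADER_HEX.replace("-", ""))
info = decode_dot1q(frame)
print(info)
```

The decoded VLAN ID is 750 and the priority is 0, matching the decodedVLAN and decodedPriority lines that sflowtool printed.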

Why only report in/out physical ports? To understand that, you have to consider that sFlow was designed for whole-network monitoring, not just for individual routers. By adopting a data model of “devices connected by links” you get link objects that mean the same thing to the neighboring devices. So you have a coherent model and can resolve the end-to-end traffic matrix for the network (and avoid being presented with potentially thousands of sub-interfaces that only mean something to the router they are on). Once you have a stable and scalable network model, with counters for every link, then all the detail about VLANs, MPLS labels, priorities, routes, congestion bits, tunnels etc. can be captured in the annotated packet samples.

So for a packet like this, which came in on one VLAN and was routed to another, the correct approach in sFlow would be to include the “extended_switch” structure and use it to indicate the in+out VLANs, regardless of whether the packet was sampled on ingress or egress. The “extended_router” structure could also be used to report CIDR prefixes and nexthop.
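
To make that concrete, here is a minimal sketch of how such a record might be laid out on the wire, assuming the sFlow version 5 definition of extended_switch (enterprise 0, format 1001, four 32-bit unsigned fields in network byte order). The helper `pack_extended_switch` is hypothetical and is not hsflowd code. The VLAN values 742 and 750 are taken from the reporter's description of the flow, and the priorities are assumed to be 0.

```python
import struct

EXTENDED_SWITCH_FORMAT = 1001  # sFlow v5 flow_data format number, enterprise 0


def pack_extended_switch(src_vlan, src_priority, dst_vlan, dst_priority):
    """Build one flow record: data_format, length, then the four fields."""
    body = struct.pack("!IIII", src_vlan, src_priority, dst_vlan, dst_priority)
    # data_format: enterprise in the top 20 bits, format in the low 12 bits.
    data_format = (0 << 12) | EXTENDED_SWITCH_FORMAT
    return struct.pack("!II", data_format, len(body)) + body


record = pack_extended_switch(src_vlan=742, src_priority=0,
                              dst_vlan=750, dst_priority=0)
print(len(record), record.hex())
```

A collector that understands this record could then report both VLANs for a routed packet even though the in/out port fields only describe the physical interface.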

We have thought about what that would look like on Linux. It probably involves either (1) querying netlink for routing info (assuming we can know which routing table was applied, since that might have been set by nftables with a ‘mark’) or (2) moving the packet sampling so that it happens inside the routing engine (e.g. FRR) where the full picture is known.

Another approach is for the sFlow collector to peer with a route reflector and retrofit those routing interpretations as best it can. That is never as good as getting the details “straight from the horse’s mouth”, but it’s good enough for many purposes.

The pcap module is quite versatile, and the kernel-based sampling that it employs is good for performance. It’s not perfect, but it has been a lot easier to use than the ULOG and NFLOG modules that we started with.

We might find that we can instrument a VPP dataplane more accurately in due course.

Thoughts?
