DPDK Segmentation Fault #19
I have tried on CentOS 7.1 and it still fails in the same place, even after upgrading DPDK to the latest version. DPDK appears to be working, but it still dumps core:
EAL: PCI memory mapped at 0x7f9258bec000
I had to make the following change to Makefile.dpdk: add -lrte_pmd_i40e
Thank you for your report. I confirmed the issue. Since linux-libos-tools was forked from net-next-nuse, DPDK (and netmap) have not been well tested. I will work on this issue, but it would also be great if you could come up with patches (and a pull request in the end). The issue in nuse_bind() is likely that NUSE should handle the socket opened by the DPDK library as a host fd, but instead it tried to look it up in the internal fd table, which is used by the NUSE application (i.e., ping).
Thanks, I'll try to have a look. I will not have time for a couple of weeks, however.
Hi Hajime,
int nuse_bind(int fd, const struct sockaddr *name, socklen_t namelen)
}
The fd in nuse_bind() is passed by vfio_mp_sync_socket_setup(). Should nuse_bind() not look up the fd in nuse_fd_table[]? If not, what do you mean by handling the fd as a host fd?
I meant that if nuse_bind() can't find an fd in nuse_fd_table, the fd wasn't opened by nuse_socket, and in that case the host system call should be used. The write(2) wrapper (nuse_write()) is already implemented like that: https://github.com/libos-nuse/linux-libos-tools/blob/master/nuse-glue.c#L279 Although nuse_bind() is not the only part of the DPDK issue, the above should be the proper way to fix it.
Do you mean like below?
int nuse_bind(int fd, const struct sockaddr *name, socklen_t namelen)
}
@vincentmli yes, something like that.
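For readers following along, the fallback discussed above can be sketched roughly as follows. This is only an illustration, not the actual NUSE code: `toy_fd_table`, `toy_sock`, `toy_lookup`, `toy_stack_bind`, and `toy_bind` are hypothetical stand-ins for `nuse_fd_table[]` and `nuse_bind()`, and the table layout is an assumption. The idea is simply that a NULL table entry means the fd was opened by the host (for example by the DPDK library), so the call goes to the host `bind(2)` instead of dereferencing a missing socket.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#define TOY_FD_MAX 1024 /* hypothetical table size */

/* Hypothetical stand-in for a library-stack socket; the real struct differs. */
struct toy_sock {
    int bind_calls; /* counts binds routed to the library stack */
};

/* Hypothetical stand-in for nuse_fd_table[]: NULL means "not opened by us". */
static struct toy_sock *toy_fd_table[TOY_FD_MAX];

/* Return the library socket for fd, or NULL if fd belongs to the host. */
static struct toy_sock *toy_lookup(int fd)
{
    if (fd < 0 || fd >= TOY_FD_MAX)
        return NULL;
    return toy_fd_table[fd];
}

/* Stand-in for the library-stack bind; here it only records the call. */
static int toy_stack_bind(struct toy_sock *s, const struct sockaddr *name,
                          socklen_t namelen)
{
    assert(s != NULL);
    (void)name;
    (void)namelen;
    s->bind_calls++;
    return 0;
}

/* bind() wrapper: use the library stack for known fds, else the host call. */
int toy_bind(int fd, const struct sockaddr *name, socklen_t namelen)
{
    struct toy_sock *s = toy_lookup(fd);

    if (s == NULL)
        return bind(fd, name, namelen); /* fd was not opened by the library */
    return toy_stack_bind(s, name, namelen);
}
```

One caveat for a preloaded library: if the wrapper itself is exported as `bind`, calling `bind()` by name would recurse, so the host function would have to be reached through a saved pointer (for example one obtained with `dlsym(RTLD_NEXT, "bind")`). Whether NUSE does it that way is not shown in this thread.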
I have struggled with several combinations of DPDK version and host kernel version, on VMs and on bare metal, and have tentatively decided to revert DPDK to the older version 1.7.1, which we tested in the past with an older host kernel (not the LibOS kernel; around 3.13). My Fedora 21 on VMware has a newer kernel (4.0.4), so it did not work there, and I have not tested on a VM yet. The following output is the result with commit 86eb208, which should work in this specific environment. uname -a
and NIC.
To build DPDK-enabled NUSE
with a nuse.conf
then execute it.
Voilà :) I will close this issue if some of you can successfully follow these instructions. Let me know how it goes for you.
Thank you for your effort; I am sorry to bother you. I couldn't use a bare-metal machine like yours, only a VM environment on KVM. For clarification: with DPDK 1.7.1, do I still need to modify nuse_bind() to handle the fd as a host fd? I tried 1.7.1 and still got the same coredump backtrace. I then modified nuse_bind() to handle the fd as a host fd and got a coredump in nuse_listen(); I modified nuse_listen() the same way and got a coredump in nuse_accept(); then I modified nuse_accept(). No more coredumps, but I got:
<5>Linux version 4.1.0-rc7+ (root@ubuntu14) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #0 Thu Jun 25 07:09:40 PDT 2015
EAL: Detected 4 lcore(s)
Here is my NIC status:
Network devices using DPDK-compatible driver
0000:00:07.0 'Virtio network device' drv=igb_uio unused=
Network devices using kernel driver
0000:00:03.0 'Virtio network device' if= drv=virtio-pci unused=igb_uio
Other network devices
root@ubuntu14:/home/vincent/net-next-nuse/arch/lib/tools# lspci
@vincentmli On my bare-metal machine, I don't need to modify nuse_bind() etc., and it works.
@thehajime I can confirm it works on my bare-metal Dell PowerEdge R220 with an Intel 82571EB running Ubuntu 14.04.1 with kernel
Linux 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Ubuntu 14.04.2 with kernel
Linux ubuntu14 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
does not work. It is odd that there would be such a difference between Ubuntu kernels 3.13.0-32 and 3.16.0-30. Is this a kernel, rump, or DPDK issue?
Hello,
Have you ever seen this before? I am running on FC21:
NUSECONF=nuse.conf ./nuse ping www.google.com
<5>Linux version 4.0.0+ (root@roman) (gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC) ) #0 Sat May 9 12:04:41 BST 2015
<6>NET: Registered protocol family 16
<6>NET: Registered protocol family 2
<6>TCP established hash table entries: 512 (order: 0, 4096 bytes)
<6>TCP bind hash table entries: 512 (order: 0, 4096 bytes)
<6>TCP: Hash tables configured (established 512 bind 512)
<6>UDP hash table entries: 128 (order: 0, 4096 bytes)
<6>UDP-Lite hash table entries: 128 (order: 0, 4096 bytes)
<6>NET: Registered protocol family 1
<6>Netfilter messages via NETLINK v0.30.
<6>nfnl_acct: registering with nfnetlink.
<6>nf_conntrack version 0.5.0 (32 buckets, 128 max)
<6>nf_tables: (c) 2007-2009 Patrick McHardy [email protected]
<6>ip_set: protocol 6
<6>ipip: IPv4 over IPv4 tunneling driver
<6>nsc: GRE over IPv4 demultiplexor driver
<6>nsc: GRE over IPv4 tunneling driver
<6>nsc: (C) 2000-2006 Netfilter Core Team
<6>Initializing XFRM netlink socket
<6>NET: Registered protocol family 10
<6>nsc: Mobile IPv6
<6>nsc: IPv6 over IPv4 tunneling driver
<6>NET: Registered protocol family 17
<6>NET: Registered protocol family 15
<6>DCCP: Activated CCID 2 (TCP-like)
<6>DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
<6>nsc: Hash tables configured (established 512 bind 512)
create vif dpdk0
address = 192.168.0.10
netmask = 255.255.255.0
macaddr = 00:01:01:01:01:01
type = 2
failed to get interface status
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 1 on socket 0
EAL: Detected lcore 8 as core 2 on socket 0
EAL: Detected lcore 9 as core 3 on socket 0
EAL: Detected lcore 10 as core 4 on socket 0
EAL: Detected lcore 11 as core 5 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 12 lcore(s)
./nuse: line 9: 1574 Segmentation fault (core dumped) LD_LIBRARY_PATH=.:../../../ LD_PRELOAD=liblinux.so:libnuse-linux.so $*
[root@roman tools]# gdb ping core.1574
GNU gdb (GDB) Fedora 7.8.2-38.fc21
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ping...Reading symbols from /root/net-next-nuse/arch/lib/tools/ping...(no debugging symbols found)...done.
(no debugging symbols found)...done.
warning: core file may not match specified executable file.
[New LWP 1574]
[New LWP 1575]
[New LWP 1576]
[New LWP 1577]
[New LWP 1578]
[New LWP 1579]
[New LWP 1580]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ping www.google.com'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f68902052a5 in nuse_bind (fd=7, name=0x7ffcf4f69fc0, namelen=110) at nuse-glue.c:198
198 struct SimSocket *kernel_socket = nuse_fd_table[fd].nuse_sock->kern_sock;
Missing separate debuginfos, use: debuginfo-install iputils-20140519-4.fc21.x86_64
(gdb) back
#0 0x00007f68902052a5 in nuse_bind (fd=7, name=0x7ffcf4f69fc0, namelen=110) at nuse-glue.c:198
#1 0x00007f68901f9026 in vfio_mp_sync_socket_setup () at /root/net-next-nuse/arch/lib/tools/dpdk/lib/librte_eal/linuxapp/eal/eal_pci_vfio_mp_sync.c:351
#2 pci_vfio_mp_sync_setup () at /root/net-next-nuse/arch/lib/tools/dpdk/lib/librte_eal/linuxapp/eal/eal_pci_vfio_mp_sync.c:379
#3 0x00007f68901f66c3 in rte_eal_pci_init () at /root/net-next-nuse/arch/lib/tools/dpdk/lib/librte_eal/linuxapp/eal/eal_pci.c:624
#4 0x00007f68901f1665 in rte_eal_init (argc=3, argv=<optimized out>) at /root/net-next-nuse/arch/lib/tools/dpdk/lib/librte_eal/linuxapp/eal/eal.c:755
#5 0x00007f689018a510 in dpdk_if_init (dpdk=0x7f6892a0c160) at nuse-vif-dpdk.c:180
#6 0x00007f689018a924 in nuse_vif_dpdk_create (ifname=0x7f6892a0df40 "dpdk0") at nuse-vif-dpdk.c:261
#7 0x00007f6890203650 in nuse_vif_create (type=NUSE_VIF_DPDK, ifname=0x7f6892a0df40 "dpdk0") at nuse-vif.c:69
#8 0x00007f6890208946 in nuse_netdev_create (vifcf=0x7f6892a0df40) at nuse.c:367
#9 0x00007f68902090cf in nuse_init () at nuse.c:524
#10 0x00007f6890cd4f2a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#11 0x00007f6890cd503b in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#12 0x00007f6890cc5d2a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#13 0x0000000000000002 in ?? ()
#14 0x00007ffcf4f6c68b in ?? ()
#15 0x00007ffcf4f6c690 in ?? ()
#16 0x0000000000000000 in ?? ()
(gdb)
[root@roman tools]# dpdk/tools/dpdk_nic_bind.py --status
Network devices using DPDK-compatible driver
0000:03:00.1 'Ethernet Controller X710 for 10GbE SFP+' drv=igb_uio unused=i40e,vfio-pci
Network devices using kernel driver
0000:03:00.0 'Ethernet Controller X710 for 10GbE SFP+' if=enp3s0f0 drv=i40e unused=igb_uio,vfio-pci
0000:04:00.0 'I350 Gigabit Network Connection' if=enp4s0f0 drv=igb unused=igb_uio,vfio-pci
0000:04:00.1 'I350 Gigabit Network Connection' if=enp4s0f1 drv=igb unused=igb_uio,vfio-pci
0000:06:00.0 'SFC9020 [Solarstorm]' if=enp6s0f0 drv=sfc unused=igb_uio,vfio-pci
0000:06:00.1 'SFC9020 [Solarstorm]' if=enp6s0f1 drv=sfc unused=igb_uio,vfio-pci
0000:08:00.0 'I350 Gigabit Network Connection' if=eno1 drv=igb unused=igb_uio,vfio-pci Active
0000:08:00.1 'I350 Gigabit Network Connection' if=eno2 drv=igb unused=igb_uio,vfio-pci
Other network devices
[root@roman tools]# more nuse.conf
interface dpdk0
address 192.168.0.10
netmask 255.255.255.0
macaddr 00:01:01:01:01:01
if macaddr is not specified, random mac addr is used.
route
network 0.0.0.0
netmask 0.0.0.0
gateway 192.168.0.1
[root@roman tools]#
I did have to patch DPDK to work with the 3.19 kernel.