[tulip] starfire bring down and up interface problem

agibson@ptm.com agibson@ptm.com
Mon, 31 Jul 2000 17:40:24 -0400


I apologize in advance: I realize this is the tulip list, but starfire does not have its own list and tulip is the closest one to it.

With the pci-scan and starfire (starfire.c:v0.15 4/07/2000) modules loaded, bringing up eth1 (the first starfire interface) works fine.  If I then ifdown and ifup the eth1 interface, the starfire driver no longer seems able to send outgoing packets, although tcpdump on that same system shows them as sent.

Ping from the system with the starfire driver (10.10.15.10), after the ifdown/ifup:
# tcpdump -n -i eth1  (tcpdump running on 10.10.15.10)
16:40:48.247496 > arp who-has 10.10.15.1 tell 10.10.15.10 (0:0:d1:ed:7f:b5)
16:40:49.247476 > arp who-has 10.10.15.1 tell 10.10.15.10 (0:0:d1:ed:7f:b5)
16:40:50.247654 > arp who-has 10.10.15.1 tell 10.10.15.10 (0:0:d1:ed:7f:b5)
16:40:51.247478 > arp who-has 10.10.15.1 tell 10.10.15.10 (0:0:d1:ed:7f:b5)
16:40:52.247485 > arp who-has 10.10.15.1 tell 10.10.15.10 (0:0:d1:ed:7f:b5)
16:40:53.280676 > arp who-has 10.10.15.1 tell 10.10.15.10 (0:0:d1:ed:7f:b5)
16:40:54.277485 > arp who-has 10.10.15.1 tell 10.10.15.10 (0:0:d1:ed:7f:b5)
16:40:55.277476 > arp who-has 10.10.15.1 tell 10.10.15.10 (0:0:d1:ed:7f:b5)


A separate ping initiated from the remote system (10.10.15.1):
# tcpdump -n -i eth1  (tcpdump running on 10.10.15.10)
16:41:17.082389 B arp who-has 10.10.15.10 tell 10.10.15.1
16:41:17.082476 > arp reply 10.10.15.10 (0:0:d1:ed:7f:b5) is-at 0:0:d1:ed:7f:b5 (0:20:af:dd:ef:5f)
16:41:18.512809 B arp who-has 10.10.15.10 tell 10.10.15.1
16:41:18.512878 > arp reply 10.10.15.10 (0:0:d1:ed:7f:b5) is-at 0:0:d1:ed:7f:b5 (0:20:af:dd:ef:5f)
16:41:19.514154 B arp who-has 10.10.15.10 tell 10.10.15.1
16:41:19.514221 > arp reply 10.10.15.10 (0:0:d1:ed:7f:b5) is-at 0:0:d1:ed:7f:b5 (0:20:af:dd:ef:5f)
16:41:20.515417 B arp who-has 10.10.15.10 tell 10.10.15.1
16:41:20.515484 > arp reply 10.10.15.10 (0:0:d1:ed:7f:b5) is-at 0:0:d1:ed:7f:b5 (0:20:af:dd:ef:5f)

NOTE: The system with the starfire driver shows the packets as transmitted (see the trace above; the light on the starfire card also blinks), but the remote system never receives them.  I verified this by running windump on the remote system.

STARFIRE DEBUG LEVEL AT 7:
SUCCESSFUL PING BEFORE BRINGING THE INTERFACE DOWN
Jul 31 14:06:34 agibson2 kernel: eth1: Tx #38 slot 6  b1010062 0f03cd82. 
Jul 31 14:06:34 agibson2 kernel: eth1: Transmit frame #39 queued in slot 7. 
Jul 31 14:06:34 agibson2 kernel: eth1: Interrupt status 9001. 
Jul 31 14:06:34 agibson2 kernel: eth1: Tx Consumer index is 7. 
Jul 31 14:06:34 agibson2 kernel: eth1: Tx completion entry 38 is 8f000030. 
Jul 31 14:06:34 agibson2 kernel: eth1: Interrupt status 0000. 
Jul 31 14:06:34 agibson2 kernel: eth1: exiting interrupt, status=0x0000. 
Jul 31 14:06:34 agibson2 kernel: eth1: Interrupt status 8101. 
Jul 31 14:06:34 agibson2 kernel:   netdev_rx() status of 17 was 60110062. 
Jul 31 14:06:34 agibson2 kernel:   netdev_rx() normal Rx pkt length 98, bogus_cnt 255. 
Jul 31 14:06:34 agibson2 kernel:   Rx data 00:00:d1:ed:7f:b5 00:20:af:dd:ef:5f 0800 69.0.0.84. 
Jul 31 14:06:34 agibson2 kernel:   exiting netdev_rx() status of 18 was 00000000 0. 
Jul 31 14:06:34 agibson2 kernel: eth1: Tx Consumer index is 7. 
Jul 31 14:06:34 agibson2 kernel: eth1: Interrupt status 0000. 
Jul 31 14:06:34 agibson2 kernel: eth1: exiting interrupt, status=0x0000. 

(INTERFACE BROUGHT DOWN AND BACK UP HERE)

Jul 31 14:06:35 agibson2 kernel: eth1: Shutting down ethercard, status was Int 0000. 
Jul 31 14:06:35 agibson2 kernel: eth1: Queue pointers were Tx 39 / 39,  Rx 18 / 18. 
Jul 31 14:06:35 agibson2 kernel:  
Jul 31 14:06:35 agibson2 kernel:   Tx ring at 01d75000: 
Jul 31 14:06:35 agibson2 kernel:  #0 desc. b1010062 0f03cd82 -> 00000000. 
Jul 31 14:06:35 agibson2 kernel:  #1 desc. b1010062 0f03cd82 -> 00000000. 
Jul 31 14:06:35 agibson2 kernel:  #2 desc. b1010062 0f03cd82 -> 00000000. 
Jul 31 14:06:35 agibson2 kernel:  #3 desc. b1010062 0f03cd82 -> 00000000. 
Jul 31 14:06:35 agibson2 kernel:  #4 desc. b1010062 0f03cd82 -> 00000000. 
Jul 31 14:06:35 agibson2 kernel:  #5 desc. b1010062 0f03cd82 -> 00000000. 
Jul 31 14:06:35 agibson2 kernel:  #6 desc. b1010062 0f03cd82 -> 00000000. 
Jul 31 14:06:35 agibson2 kernel:  #7 desc. b101006e 0f03cc62 -> 00000000. 
Jul 31 14:06:35 agibson2 kernel:   Rx ring at 01a91000 -> c1bf6000: 
Jul 31 14:06:35 agibson2 kernel:  #0 desc. 0e9a7011 -> 00000000 
Jul 31 14:06:35 agibson2 kernel:  #1 desc. 0f50a011 -> 00000000 
Jul 31 14:06:35 agibson2 kernel:  #2 desc. 0f285011 -> 00000000 
Jul 31 14:06:35 agibson2 kernel:  #3 desc. 0f284011 -> 00000000 
Jul 31 14:06:35 agibson2 kernel:  #4 desc. 08cba811 -> 00000000 
Jul 31 14:06:35 agibson2 kernel:  #5 desc. 08ff3011 -> 00000000 
Jul 31 14:06:35 agibson2 kernel:  #6 desc. 08ec9011 -> 00000000 
Jul 31 14:06:35 agibson2 kernel:  #7 desc. 0f50a811 -> 00000000 
Jul 31 14:06:40 agibson2 last message repeated 2 times
Jul 31 14:06:41 agibson2 kernel: eth1: netdev_open() irq 7. 
Jul 31 14:06:41 agibson2 kernel: eth1:  Filling in the station address. 
Jul 31 14:06:41 agibson2 kernel: eth1:  Setting the Rx and Tx modes. 
Jul 31 14:06:41 agibson2 kernel: eth1: Done netdev_open(). 
Jul 31 14:06:42 agibson2 kernel: eth1: Tx #0 slot 0  b101002a 0ffc4dc2. 
Jul 31 14:06:42 agibson2 kernel: eth1: Transmit frame #1 queued in slot 1. 
Jul 31 14:06:42 agibson2 kernel: eth1: Interrupt status 9001. 
Jul 31 14:06:42 agibson2 kernel: eth1: Tx Consumer index is 1. 
Jul 31 14:06:42 agibson2 kernel: eth1: Tx completion entry 0 is 86dc0000. 
Jul 31 14:06:42 agibson2 kernel: eth1: Interrupt status 0000. 
Jul 31 14:06:42 agibson2 kernel: eth1: exiting interrupt, status=0x0000. 
Jul 31 14:06:43 agibson2 kernel: eth1: Tx #1 slot 1  b101002a 0f2a6e62. 
Jul 31 14:06:43 agibson2 kernel: eth1: Transmit frame #2 queued in slot 2. 
Jul 31 14:06:43 agibson2 kernel: eth1: Interrupt status 9001. 
Jul 31 14:06:43 agibson2 kernel: eth1: Tx Consumer index is 2. 
Jul 31 14:06:43 agibson2 kernel: eth1: Tx completion entry 1 is 99360008. 
Jul 31 14:06:43 agibson2 kernel: eth1: Interrupt status 0000. 
Jul 31 14:06:43 agibson2 kernel: eth1: exiting interrupt, status=0x0000. 
Jul 31 14:06:44 agibson2 kernel: eth1: Tx #2 slot 2  b101002a 0ffc4dc2. 
Jul 31 14:06:44 agibson2 kernel: eth1: Transmit frame #3 queued in slot 3. 
Jul 31 14:06:44 agibson2 kernel: eth1: Interrupt status 9001. 
Jul 31 14:06:44 agibson2 kernel: eth1: Tx Consumer index is 3. 
Jul 31 14:06:44 agibson2 kernel: eth1: Tx completion entry 2 is 8c020010. 
Jul 31 14:06:44 agibson2 kernel: eth1: Interrupt status 0000. 
Jul 31 14:06:44 agibson2 kernel: eth1: exiting interrupt, status=0x0000.
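
One visible difference in the debug output above is the "Tx completion entry" word: 8f000030 for the ping that worked, and 86dc0000, 99360008 and 8c020010 after the interface was brought back up. The following is only a minimal sketch (Python, not part of the driver) for pulling those words out of a syslog capture so they can be compared side by side. The function name tx_completions and the regular expression are made up for illustration, and the bit layout of the completion word is deliberately not decoded here.

```python
import re

# Matches driver debug lines such as:
#   "eth1: Tx completion entry 38 is 8f000030."
TX_DONE = re.compile(r"(\w+): Tx completion entry (\d+) is ([0-9a-fA-F]{8})")


def tx_completions(log_text):
    """Return (interface, entry number, completion word) for each matching line.

    Hypothetical helper for illustration only; it does not interpret the word.
    """
    return [
        (m.group(1), int(m.group(2)), int(m.group(3), 16))
        for m in TX_DONE.finditer(log_text)
    ]


# The four completion lines copied from the log above.
sample = """\
Jul 31 14:06:34 agibson2 kernel: eth1: Tx completion entry 38 is 8f000030.
Jul 31 14:06:42 agibson2 kernel: eth1: Tx completion entry 0 is 86dc0000.
Jul 31 14:06:43 agibson2 kernel: eth1: Tx completion entry 1 is 99360008.
Jul 31 14:06:44 agibson2 kernel: eth1: Tx completion entry 2 is 8c020010.
"""

for iface, entry, word in tx_completions(sample):
    print(f"{iface} entry {entry:2d}: 0x{word:08x}")
```

Feeding it a full /var/log/messages capture instead of the sample string would list every completion word the driver logged at this debug level.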

The only fix I have found is to remove the module while the interface is down, reinsert the module, and then bring the interface up again.  The problem reappears every time I bring the interface down and back up.  I ran the same sequence with a regular 3Com 3c905 and did not see any problems.
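
For reference, the workaround sequence can be sketched roughly as below. This is an illustration under assumptions, not a tested script: the helper names reload_commands and reload_driver are made up, using modprobe to reinsert the module (and letting it pull in pci-scan) is an assumption about the module setup, and it defaults to a dry run that only prints the commands.

```python
import subprocess


def reload_commands(iface="eth1", module="starfire"):
    """Build the sequence: interface down, module out, module in, interface up.

    Assumption: 'modprobe' can resolve the module's dependencies (pci-scan);
    on a setup without module dependency info, insmod would be needed instead.
    """
    return [
        ["ifdown", iface],
        ["rmmod", module],
        ["modprobe", module],
        ["ifup", iface],
    ]


def reload_driver(iface="eth1", module="starfire", dry_run=True):
    """Print each command; run it as well unless dry_run is set (needs root)."""
    for cmd in reload_commands(iface, module):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


reload_driver()  # dry run: only prints the four commands
```

Calling reload_driver(dry_run=False) as root would actually execute the steps; the dry-run default is there so the sequence can be reviewed first.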

Is there a known issue with bringing the interface down and back up with the starfire driver?  Any information would be appreciated.

Red Hat 6.2, kernel 2.2.16

Adam