Ethernet issue

  • Hello,

    we've encountered an issue with the Ethernet connection on the PicoCore imx8mm over PoE. After a while (hours or even days) it seems to get into an undefined state (see below), and we lose the connection. Could this be a driver issue? How can we solve it? We are using Linux kernel fsimx8mm 5.15.131-F+S and, for now, basically just the standard default F&S image. Restarting the device fixes the issue. Please find some logs below.


    First restart:


    Second restart:

    Code
    # /etc/init.d/S40network restart
    Stopping network: ifdown: interface eth0 not configured
    OK
    Starting network: RTNETLINK answers: File exists
    Qualcomm Atheros AR8035 30be0000.ethernet-1:04: phy_poll_reset failed: -110
    fec 30be0000.ethernet eth0: Unable to connect to phy
    RTNETLINK answers: No such device
    FAIL



    Best regards,

    BS

  • Hello,


    we have been testing for the last couple of days. The error happens with and without PoE on our baseboard. The eth0 link suddenly goes down (fec 30be0000.ethernet eth0: Link is Down). We notice this as soon as our external client application loses its connection to the PicoCore. Restarting the interface is then no longer possible; it fails with: phy_poll_reset failed: -110, Unable to connect to phy, No such device.


    Do you have any ideas so far? We can arrange remote access for F&S, if this would be helpful.


    Best regards,

    BS

  • Hello,


    did you find a way to reproduce the issue faster? Maybe with high loads?


    We do not configure the HW-PHY-RESET pin in Linux, so that the settings from U-Boot are kept while booting.

    Could you try to reset the PHY "by hand" and see if it starts again when the error occurs?


    Code
    ifconfig eth0 down
    gpioset gpiochip0 5=0
    sleep 0.1
    gpioset gpiochip0 5=1
    ifconfig eth0 up



    Your F&S Support Team

  • Hello,


    thanks for your comment.


    We have been testing different scenarios over multiple days, but we still cannot reproduce the issue any faster.

    I wanted to reset the PHY manually, but it seems that gpioset needs to be added to the Buildroot image manually, as it’s not included by default. Is this correct? Do I also need to change something in the default device tree?


    Best regards,

    BS


  • Apologies for the delayed response. We conducted tests under high network load using iperf3 in bidirectional mode between the server (PicoCore) and the client.


    Server: iperf3 -s -D

    Client: iperf3 -c 10.0.0.252 --bidir -t 86400 -i 60


    This results in similar behavior, only faster, within a few seconds or minutes. We also replicated this setup on the F&S carrier board with iperf3 and observed the same behavior. Additionally, before eth0 fully disconnects, we receive multiple messages similar to the following:


    On our baseboard:

    Code
    fec 30be0000.ethernet eth0: Link is Down
    fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Half - flow control off
    fec 30be0000.ethernet eth0: Graceful transmit stop did not complete!
    fec 30be0000.ethernet eth0: Link is Down
    fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Half - flow control off
    fec 30be0000.ethernet eth0: Link is Down
    fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Half - flow control off
    fec 30be0000.ethernet eth0: Link is Down
    fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Half - flow control off
    fec 30be0000.ethernet eth0: Link is Down

    => Result after Reset:

    Qualcomm Atheros AR8035 30be0000.ethernet-1:04: attached PHY driver (mii_bus:phy_addr=30be0000.ethernet-1:04, irq=POLL)

    fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx

    IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready


    On the F&S baseboard:

    Code
    fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
    IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
    fec 30be0000.ethernet eth0: Link is Down
    (usually just one "Link is Down" notification)

    => Result after Reset:

    Qualcomm Atheros AR8035 30be0000.ethernet-1:04: attached PHY driver (mii_bus:phy_addr=30be0000.ethernet-1:04, irq=POLL)

    IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

    fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx


    Resetting the PHY via its hardware reset pin gets us back into a working state, but this is not a good solution for production:

    Code
    ifconfig eth0 down
    gpioset gpiochip0 5=0
    sleep 1
    gpioset gpiochip0 5=1
    ifconfig eth0 up
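
As a stopgap, the manual sequence above could be wrapped in a small script. This is only a sketch under the assumptions stated in this thread (reset line on gpiochip0 line 5, interface eth0); the function names and the DRY_RUN switch are made up for illustration, and gpioset must be present in the image.

```shell
#!/bin/sh
# Sketch: recover eth0 by pulsing the PHY reset line.
# Assumptions (from this thread only): reset is gpiochip0 line 5, the
# interface is eth0. All names below are illustrative, not an F&S tool.

IFACE="${IFACE:-eth0}"
GPIO_CHIP="${GPIO_CHIP:-gpiochip0}"
GPIO_LINE="${GPIO_LINE:-5}"
HOLD_SECONDS="${HOLD_SECONDS:-1}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Succeeds if the kernel reports the interface as operationally up.
link_is_up() {
    [ "$(cat "/sys/class/net/$IFACE/operstate" 2>/dev/null)" = "up" ]
}

# Same steps as the manual sequence: down, pulse reset low, up.
reset_phy() {
    run ifconfig "$IFACE" down
    run gpioset "$GPIO_CHIP" "$GPIO_LINE=0"
    run sleep "$HOLD_SECONDS"
    run gpioset "$GPIO_CHIP" "$GPIO_LINE=1"
    run ifconfig "$IFACE" up
}
```

A periodic job could then call `link_is_up || reset_phy`. Running with `DRY_RUN=1` first prints the commands without touching the hardware. This only works around the symptom and does not address the cause of the link loss.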

    Does the command gpioset gpiochip0 5=0 set the voltage level of ETH_A_D2_P to zero? I’m trying to confirm if I’ve interpreted the GPIO Reference Card for Rev. 1.30 correctly.


    Did we miss something? Do you have any suggestions on how to resolve this issue?


    Thank you

    Best regards,

    BS

  • Hello,


    gpioset gpiochip0 5=0 pulls the RESET pad of the Atheros PHY to 0. This pad is not routed to the baseboard unless you have a version without Ethernet PHY; in that case it is connected to ETH_B_D4P.


    We will set up an iperf3 test like yours over the weekend.


    Can you make sure that the power supply of your board is stable and provides enough power? It should deliver at least 5 W, so that we can rule out a power supply issue.


    How many boards are affected by this?

    Could you connect 2 PicocoreMX8MM Starter Kits and run the iperf3 test? This way we could make sure to use the same setup.


    Your F&S Support Team

  • Hello,


    thanks, just a quick reply.


    I quickly set up another PicocoreMX8MM starter kit (PCoreBBDSI Rev1.40) with the same PicoCore and ran the iperf3 test, with the same results. eth0 is down after 60 seconds. A reset gets the interface back online. The starter kit's power supply provides an output of 5.0V/3.0A. For testing with our baseboard, we are using a laboratory power supply, which should be sufficient for this purpose.


    Quote

    How many boards are affected by this?

    So far 4 boards.

    Quote

    unless you have a version without Ethernet PHY

    How can I check that?


    Have a good weekend.

    Best regards,

    BS

  • Hello,


    we conducted additional tests today on the F&S starter kit, a PicoCore with default image from the SD card directory of the Buildroot release (Linux fsimx8mm 5.15.131-F+S #2 SMP PREEMPT Fri Nov 10 18:39:16 CET 2023 aarch64 GNU/Linux), nothing included (no iperf3 or gpioset). I could reproduce the failing eth0 interface issue a couple of times by downloading a random file from the PicoCore in a loop and ping flooding:


    1.) create e.g. a random 200MB file on the rootfs

    2.) start the httpd server: httpd -p 8000

    3.) start a powershell script on the client, which downloads the random file in (1) via Invoke-WebRequest

    4.) in multiple ssh sessions run: ping <client ip> -A -s 60000

    5.) in multiple powershell sessions run: ping <server ip> -t -a -l 60000
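
The PicoCore-side steps (1), (2) and (4) can be sketched as a few shell functions. The function names, the file location and the client address are placeholders, not part of the original test; the httpd and ping options are the ones given above, plus the BusyBox httpd `-h` option to select the served directory.

```shell
#!/bin/sh
# Sketch of the PicoCore-side part of the reproduction.
# Function names and paths are illustrative assumptions.

# Step 1: create a file of random data; $1 = path, $2 = size in MiB.
make_random_file() {
    dd if=/dev/urandom of="$1" bs=1024 count="$(( $2 * 1024 ))" 2>/dev/null
}

# Step 2: serve directory $1 on port 8000 (BusyBox httpd).
start_httpd() {
    httpd -p 8000 -h "$1"
}

# Step 4: ping the client at $1 with large packets, as in the post.
ping_client() {
    ping "$1" -A -s 60000
}
```

For example, `make_random_file /root/www/random.bin 200` followed by `start_httpd /root/www`, then `ping_client <client ip>` in each SSH session. The directory `/root/www` is only an example and has to exist beforehand.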


    While it’s running, you will receive multiple messages like this until it fails to recover:

    Code
    fec 30be0000.ethernet eth0: Link is Down
    fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
    fec 30be0000.ethernet eth0: Link is Down
    fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
    fec 30be0000.ethernet eth0: Link is Down
    fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
    fec 30be0000.ethernet eth0: Link is Down
    fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
    fec 30be0000.ethernet eth0: Link is Down


    The NIC load was approximately 40% - 60%.
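
To compare test runs, the number of link drops could be counted from the kernel log. A minimal sketch, assuming the message format shown in the log above; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: count "Link is Down" events for an interface in kernel log
# text read from stdin. $1 = interface name (defaults to eth0).
count_link_down() {
    grep -c "${1:-eth0}: Link is Down"
}
```

On the target this would be used as `dmesg | count_link_down`. Note that the kernel log buffer is limited in size, so old messages may already have been dropped during long runs.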


    Best regards,

    BS

  • Hello,


    we did some tests and noticed that the PicoCoreMX8MM uses polling to get the link status of the PHY, while all other boards use an interrupt.

    Switching to interrupt mode made the link status more stable, and we could no longer reproduce the error described above.

    If you are facing similar problems, please add lines 5 and 6 below to the mdio node of your picocoremx8mm-lpddr4.dts:

    Code
    1. mdio {
    2.     #address-cells = <1>;
    3.     #size-cells = <0>;
    4.     ethphy0: ethernet-phy@0 {
    5.         interrupt-parent = <&gpio1>;
    6.         interrupts = <4 0>;
    7.     };
    8. };


    This setting will be default in future releases of fsimx8mm.
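
One way to check which mode is active after rebuilding the device tree is to look at the "attached PHY driver" message, which reports `irq=POLL` in the logs earlier in this thread. The helper below is a sketch with a made-up name; it assumes the message format shown in those logs.

```shell
#!/bin/sh
# Sketch: print the irq= value from the most recent
# "attached PHY driver" message in kernel log text read from stdin.
phy_irq_mode() {
    sed -n 's/.*attached PHY driver.*irq=\([^)]*\)).*/\1/p' | tail -n 1
}
```

Used as `dmesg | phy_irq_mode` on the target. With the interrupt configured, the value should no longer read POLL.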


    Your F&S Support Team