Posts by fs-support_HK

    We cannot reproduce the hanging boot process with our current software here. Can you tell us exactly which release version you are using? Thanks.


    Your F&S Support Team

    M.2 SSDs are available as SATA or PCIe (NVMe) devices. The PicoCoreMX8MP does not provide a SATA port, so only PCIe would be possible. But it has only one PCIe Gen 3 lane. Typically M.2 SSDs use 2 or even 4 lanes, and nowadays even Gen 4, so one lane will be rather slow. An alternative would be to use eMMC on port SD_A (this port provides an 8-bit interface) or to connect the SSD via USB 3 (up to 5 GBit/s, i.e. USB 3.2 Gen 1, a.k.a. USB 3.0).
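
    As a rough sketch of the numbers behind this comparison, the raw payload rate of a serial link can be estimated from its line rate and its line encoding. The figures used below (8 GT/s with 128b/130b encoding for PCIe Gen 3, 5 GBit/s with 8b/10b encoding for USB 3.2 Gen 1) are taken from the respective standards, not from measurements on the PicoCoreMX8MP, and the function name is only illustrative. Real throughput will be lower because of protocol overhead.

```shell
# Estimate the raw payload rate of a serial link in MByte/s.
# Arguments: line rate per lane in MBit/s, number of lanes,
# encoding numerator and denominator (e.g. 128 130 for 128b/130b).
# Integer arithmetic only, so results are rounded down.
link_mbyte_per_s() {
    echo $(( $1 * $2 * $3 / $4 / 8 ))
}

echo "PCIe Gen 3 x1: $(link_mbyte_per_s 8000 1 128 130) MByte/s"
echo "PCIe Gen 3 x4: $(link_mbyte_per_s 8000 4 128 130) MByte/s"
echo "USB 3.2 Gen 1: $(link_mbyte_per_s 5000 1 8 10) MByte/s"
```

    Under these assumptions a single Gen 3 lane comes out at about a quarter of a four-lane link, and at roughly twice the raw rate of USB 3.2 Gen 1.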


    Your F&S Support Team

    Do you have the right settings? 115200 bd, no flow control. Do you use a null-modem cable? Do you use the right UART port on the board? Some ports use TTL levels, not RS232 levels, but the debug output is on a port with RS232 levels. If you are using a serial-to-USB adapter, this might also cause problems.


    When using Linux, it is better to use a terminal program like PuTTY or Tera Term. DCUTerm does not support the terminal emulation commands that Linux devices send to switch text color, bold, italics, etc., so you will see regular output mixed with escape sequences.


    Your F&S Support Team

    Of course we take the needs of our customers seriously and we have immediately started working on the Buildroot release. It will take a few days until all builds are done, all the tests are done and the release packages are assembled.


    Because of the worldwide shortage of electronic parts, we had to do new revisions of quite a lot of our boards and modules and we are struggling with the amount of release updates that are required as a result. We try to do those releases first that we think are used by the most customers, but this is always just a rough guess. So it is actually very important that the customers tell us their immediate needs so that we can give those releases higher priority. So thanks for telling us. We try to do the release as quickly as possible.


    Your F&S Support Team

    Ok, I see you are using git in your Linux kernel directory. This has the effect that the kernel version now has a "-dirty" attached. This means that the kernel does not find its modules (on the board) because they use a different path in /lib/modules/<version>. So one dirty trick would be to create a symbolic link there that redirects the new version to the old path ...


    Code
    cd /lib/modules
    ln -s <old-version> <new-version>


    If you cannot log in, you probably have to reinstall the previous kernel, create the link and then go back to the new kernel.


    Alternatively, you have to rebuild the rootfs with all the modules, so that the module path matches the kernel version again. A clean build should be done anyway after you are done with your development, when you have committed all your changes to git and when you build your own final version to be released to your customers. Then all "-dirty" settings should be gone.
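
    To confirm that this is the problem, it can help to compare the release string of the running kernel with the directories present in /lib/modules. The following is a minimal sketch; the function name check_modules is hypothetical, and it only reports a mismatch, it does not create the link.

```shell
# Report whether a module directory exists for a kernel release.
# $1: release string (default: output of "uname -r")
# $2: modules base directory (default: /lib/modules)
check_modules() {
    release="${1:-$(uname -r)}"
    base="${2:-/lib/modules}"
    if [ -d "$base/$release" ]; then
        echo "ok: $base/$release exists"
    else
        echo "missing: $base/$release"
        echo "available module directories:"
        ls "$base"
        return 1
    fi
}
```

    On the board, calling check_modules without arguments checks the running kernel. If it reports a missing directory whose name ends in "-dirty" while the same version without "-dirty" is listed as available, the symbolic link or a rootfs rebuild as described above should help.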


    Your F&S Support Team

    Ok, here is an explanation of what happens.


    When starting up, the spi-imx.c driver calls a function to register the SPI controller. This function parses the device tree and collects all GPIO chip select numbers in an array. Then when the function returns, the driver calls gpio_request() for each entry in the array so that all these GPIOs can be used as chip selects afterwards.


    However, the above function does much more than just collecting GPIO numbers. It also parses all sub-nodes in the SPI device tree node and creates a device for each of them. This actually leads to a call of function spi_imx_setup() for each device, which is used to set the GPIO direction of the appropriate chip select to output. But as you can see, this happens *before* the original function returns to the probe function, where the GPIOs are requested. So the rule that a GPIO must be requested before it can be used is violated.


    On GPIO1, where other drivers already requested some GPIOs, this nonetheless works because the clock to this GPIO block is already active from the other GPIOs. But on GPIO3, where no other GPIOs are requested by other drivers, this fails because the clock is still off when the output direction should be set. The clock is only activated later by the gpio_request() call.


    This made it quite difficult to locate the problem. Of course I immediately thought of the clocks, but after the probe function, when the system is running, the clock to GPIO3 was active, too. So at first I did not understand at all why only the accesses to GPIO3 fail. It took quite a lot of debug output and many, many tries to understand that the clock is still off at the point in time when the GPIO direction is set.


    The attached patch will move the gpio_request() from the probe function to function spi_imx_setup(). This fixes the call sequence.


    Go to the Linux kernel source directory and call


    Code
    patch -p1 < 0001-spi-imx.c-Fix-GPIO-request-and-direction-sequence.patch


    Please note the less-than character '<' to redirect the input.


    Then recompile and reinstall the kernel. Now everything should work as expected. Sorry for the inconvenience.


    Your F&S Support Team

    Maybe no one in the forum is aware of this problem because they all use our predefined Development Machine where this problem does not exist. :-)


    Choosing the right distribution release and setting up the build machine is not as easy as it may look. First of all, the compilers and other tools must be neither too old nor too new. They must more or less match the age of the software that is to be compiled. For example the buildroot-2021.02 stuff may be targeted for at most GCC-9.x. So having GCC-10 or even GCC-11 may cause unexpected warnings and errors. Ubuntu 22.04 is rather new and may cause exactly such problems. And yes, I'm also talking of x86 tools on the build machine. The build process also builds some x86 tools and intermediate steps that are needed for the final cross-compilation process.


    The next thing is that the build process may need some additional packages installed on the build machine. For example you typically need some devel-packages, i.e. packages that provide the header files for the main package. And you also need additional packages for some tools (e.g. bison, ncurses, etc). This often depends on the configuration that you want to build. On our development machine, we have already installed all the tools and packages that we know are needed for our given configurations. Of course when using your own development machine, you are responsible for installing all these packages.


    Unfortunately, the screenshot does not give enough information to locate the problem. The compiler complains about a stray backslash, and we see that the backslash is at the beginning of a line with an #include command. This is strange. Is this already part of the original code or is this the result of some pre-processing, like the C-preprocessor inserting all macros or some code generation step that was run earlier? If this is really in the code, maybe this part of the code is not supposed to be compiled, e.g. because an earlier #if condition should not be true. This could be again the result of some different behavior of newer development tools. Or it is only the result of an earlier error that is the main reason for the build failure.


    Your F&S Support Team

    We have located the problem. It is a driver thing, rather complicated. Now we are looking for the easiest way to fix it. The patch should be available soon.


    Your F&S Support Team

    One more thing that is worth a try. In all i.MX8M device trees, NXP is using the value 0x40000 for the SPI chip select pad setting. We simply copied that without thinking too much about it. However when looking at it now, this value is rather strange. It does not match what we know about these pad settings. Typically, this value is the bit combination that should be written to the appropriate PAD settings register in the iomuxc register file. In addition, there are two special bits. If bit 31 is set, then the value should not be modified. This is for signals that are already set earlier, by the bootloader for example, and should not be changed in Linux. And if bit 30 is set, then the SION flag should be set. This bit activates the input branch of the pad, so that the value can be read back. This is necessary for signals that need to read the real pad value for loop-back purposes. This SION bit is actually in a different register, but to avoid having an additional value just for this bit, NXP uses this trick. If bit 30 is set, then the driver will set the SION bit in the other register.


    On i.MX8M CPUs, only a few bits in the PAD settings register are actually valid. No bit higher than bit 8 is ever used, so the value can always be represented by at most three hex digits. So the value 0x40000 does not make sense. It does not refer to a valid bit in the PAD settings register, but it is also not one of the two bits 30 and 31, that are handled specially by the driver.


    So it might actually be a mistake by NXP. Maybe they copied a value with SION enabled from somewhere else (0x40000xxx), removed the last three digits of the original number (these are the valid bits of the register), but forgot to append three new digits for the new bits. If this assumption is correct, then the PAD settings bits would actually be set to all zero and the unimplemented bit that was set by the 4 has no effect. This would result in a rather weak drive strength of the signal.


    So from our point of view, it makes more sense to define our own value here that actually represents the PAD settings register bits. So try to use 0x40000156 instead of 0x40000 in all the pad settings in the pinctrl_ecspi3_cs: ecspi3cs subnode. This will use maximum drive strength, a fast slew rate and a weak pull-up. Maybe this modification does the trick for you.
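
    To make the difference between the two values easier to see, here is a small sketch that splits a pad value into the fields described above: bit 31 (do not modify the pad setting), bit 30 (SION) and bits 0 to 8, which is the range of valid register bits mentioned above. The function name and the output format are hypothetical; this is not code from the kernel driver.

```shell
# Split a pinctrl pad value into the fields described in the text.
# $1: pad value, e.g. 0x40000 or 0x40000156
decode_pad() {
    v=$(( $1 ))
    no_pad_ctl=$(( (v >> 31) & 1 ))   # bit 31: do not modify the pad setting
    sion=$(( (v >> 30) & 1 ))         # bit 30: set SION in the other register
    pad=$(( v & 0x1ff ))              # bits 0..8: PAD settings register bits
    other=$(( v & 0x3ffffe00 ))       # remaining bits: no meaning described here
    printf 'no_pad_ctl=%d sion=%d pad=0x%x other=0x%x\n' \
        "$no_pad_ctl" "$sion" "$pad" "$other"
}

decode_pad 0x40000      # value found in the NXP device trees
decode_pad 0x40000156   # value suggested above
```

    The first call reports sion=0 and pad=0x0 with a stray bit in other=0x40000. The second call reports sion=1 and pad=0x156 with no stray bits.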


    Your F&S Support Team

    To keep you updated, we are working on it, but strangely enough, it is working here, as far as we can see up to now. Unfortunately it is difficult to test some pins on our baseboard, so it takes a little bit longer to check all your chip selects.

    Just one more test. If you move the <&gpio3 1 GPIO_ACTIVE_LOW> to the first position in the CS-Sequence, so that it becomes CS0, does it work then? Of course you have to move the sub-node accordingly to position 0 then. This would show if there is a general problem with the other GPIOs or if only entry 0 will work.


    I know we had some modifications in the SPI driver in the past to make this work with more than one CS signal. Maybe there were some changes when updating to the newer kernel so that something is wrong again. We have to check this, but the above test would help us in locating the problem.


    Your F&S Support Team

    Perhaps I'm missing a kernel module?

    That is also possible. For our boards, we use a rather minimized set of drivers. If you are adding your own hardware, you typically also have to activate the driver for it. So in Linux menuconfig, go to "Device Drivers" -> "Industrial I/O support" -> "Digital to analog converters" and activate "Linear Technology LTC2632-12/10/8 DAC spi driver". You can either set it to "Y", then the driver is included in the kernel image, or to "M", then the driver is built as a kernel module. Then the kernel must be rebuilt and reinstalled. In the latter case, you also have to rebuild your Buildroot/Yocto rootfs to have the new kernel module included.
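
    Before rebuilding, you can check whether the driver is already enabled in your kernel configuration. The sketch below assumes that the Kconfig symbol behind this menu entry is CONFIG_LTC2632 and that the configuration is stored in a .config file in the kernel source directory; please verify both for your kernel version. The function name is only illustrative.

```shell
# Print the state of a Kconfig symbol in a kernel configuration file:
# "y" (built in), "m" (module) or "unset".
# $1: symbol name, $2: configuration file (default: .config)
config_state() {
    line=$(grep "^$1=" "${2:-.config}") || { echo "unset"; return 0; }
    echo "${line#*=}"
}
```

    For example, calling config_state CONFIG_LTC2632 in the kernel source directory shows whether the driver is built in, built as a module or not activated at all.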

    If you are telling the driver with num-cs = <5> that you use 5 devices, then you also have to define 5 sub-nodes for these devices. If you do not need all devices right now, you can use compatible = "linux,spidev" for the others. We have more than one CS on our efusa9, so see


    arch/arm/boot/dts/efusa9qdl.dts


    in node &ecspi1 for an example with three CS signals.


    Your F&S Support Team

    As far as I know, pixels are counted from zero, i.e. starting with an even pixel. Even pixels should be on LVDS channel 0 then, odd pixels on LVDS channel 1. If the result is still showing swapped channels at the end, I believe you can fix this by using LVDS1 as display device instead of LVDS0. Then the even and odd pixels should be swapped, too.


    Your F&S Support Team

    Yes, this is possible. In the device tree, e.g. armstonea9q.dts, you have the following setting:


    Code
    /*
     * Define this for a two-channel display, i.e. one display, one framebuffer,
     * but two LVDS channels, even pixels from one channel, odd pixels from the
     * other channel. Only define either DISPLAY_LVDS0 or DISPLAY_LVDS1 in this
     * case, using the full display resolution.
     */
    //#define CONFIG_ARMSTONEA9_LVDS_SPLIT_MODE

    So just uncomment the define and then set the full resolution in your LVDS0 timings entry. That's it.


    Your F&S Support Team

    How is the reboot done? Do you really execute the "reboot" command or do you just switch the board off and on again?


    You have to be aware that if you have writable filesystems, you cannot simply switch the board off. Never! You *always* have to shut it down correctly. This is the reason why our default images on our Starterkits typically have a read-only root filesystem. Then switching off the power is possible.


    If you switch off the power while filesystems are in use, the metadata may not have been written back to the real media. For example, if you delete a file, this change is first stored in the buffer cache in RAM, but it is not yet written to flash memory. If you simply switch off and on again, the system never wrote this information to the flash, so the file reappears the next time because the metadata in flash still lists the file as valid.


    A good procedure is to keep filesystems in read-only state and only switch to write mode for the short period of time when you actually write data. After that, switch back to read-only mode immediately again. This is the safest method. Or at least use some form of sync after writing data.
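
    This procedure could be wrapped in a small helper like the following sketch. The function name with_rw and the mount point in the usage note are hypothetical, remounting requires root rights, and the MOUNT variable only exists so that the sequence can be tried out without really remounting anything.

```shell
# Command used for remounting; override it (e.g. MOUNT="echo mount")
# to print the mount calls instead of executing them.
MOUNT="${MOUNT:-mount}"

# Remount a filesystem read-write, run a command, flush all data and
# switch back to read-only.
# $1: mount point, remaining arguments: command to run while writable
with_rw() {
    mp="$1"
    shift
    $MOUNT -o remount,rw "$mp" || return 1
    rc=0
    "$@" || rc=$?          # remember the result of the command
    sync                   # write buffered data and metadata to the media
    $MOUNT -o remount,ro "$mp" || return 1
    return "$rc"
}
```

    For example, with_rw /data cp settings.conf /data/ would remount /data writable, copy the file, sync and switch back to read-only. Note that switching back to read-only can fail if some process still has a file open for writing on that filesystem.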


    Normally, using reboot should unmount all filesystems before the restart, so all metadata should be correctly saved.

    The customer who received the board in the first place got a fully functioning board with functioning software. With the serial number, he also had access to all the necessary source code, even newer versions. By deleting the software and removing all identification features, he wilfully destroyed the board. I find it somewhat unfair that you now draw the conclusion that we, the company F&S, have done something wrong so that you no longer want to recommend us. If someone wilfully deletes the BIOS on a PC and then passes the PC on, it can also no longer be used without further ado. Would this also be the fault of the manufacturer in your eyes?


    In addition we have replaced these modules with a newer board revision because there were hardware issues with these 1.00 revisions that are fixed in the new revision. This is a rather common way of doing things. And that a company does not support old revisions, that should not be in the field anyway, is also a completely normal process. So why do you blame us for this non-functioning board? This board is not supposed to exist anymore. A deliberately destroyed board of an unsupported old revision that was not directly bought from the manufacturer cannot be taken as a measure of the quality of this company.


    I understand that this is disappointing for you, but what do you expect from us?