4.14.y kernel for TS-4200?

The 2.6.36.2 kernel + Debian Stretch on the TS-4200 is getting very long in the tooth, so I thought I'd take a stab at bringing up a newer kernel on the board, with a view to eventually bringing up a new userspace (which will require a newer kernel). So far, I've imported the 2.6.36.2 configuration into a Linux 4.14.134 source tree (`make oldconfig`) and started writing a device tree definition for the board based on the old `board-ts42xx.c`. What I have so far (patch attached) builds for ARMv5 with the GCC cross-compiler available from the Debian Buster repos (`gcc-8-arm-linux-gnueabi`). To install it, I `cat` the `at91sam9g20-ts4200.dtb` file onto the end of the `zImage` and write the result to the first `da` partition of the TS-4200 SD card image.
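For anyone wanting to reproduce the steps above, here is a rough sketch of the build-and-append sequence. The `CROSS_COMPILE` prefix, the `.dtb` output path, and the `/dev/sdX1` device name are assumptions for illustration (check them against your own toolchain and card), and the appended-DTB approach only works if `CONFIG_ARM_APPENDED_DTB` is enabled in the kernel configuration.

```shell
#!/bin/sh
# Sketch only: paths, the toolchain prefix and the target device are assumptions.

# Concatenate a device tree blob onto the end of a zImage.
append_dtb() {
    zimage=$1
    dtb=$2
    out=$3
    cat "$zimage" "$dtb" > "$out"
}

# Build the kernel and the board DTB, then append the DTB to the zImage.
# Run from the top of the kernel tree, after copying the old config to .config.
build_kernel() {
    cross=${CROSS_COMPILE:-arm-linux-gnueabi-}   # assumed prefix
    make ARCH=arm CROSS_COMPILE="$cross" oldconfig
    make ARCH=arm CROSS_COMPILE="$cross" zImage at91sam9g20-ts4200.dtb
    append_dtb arch/arm/boot/zImage \
               arch/arm/boot/dts/at91sam9g20-ts4200.dtb \
               zImage-dtb
}

# Writing the result to the card is left manual on purpose, e.g.:
#   dd if=zImage-dtb of=/dev/sdX1 bs=512 conv=fsync   # /dev/sdX1 is hypothetical
```

Nothing runs until `build_kernel` is called, so the file can be sourced and the pieces tried one at a time.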

With that done, the board does begin to boot. And if you edit the `linuxrc` script to include `set -x` near the beginning, you can see that the script executes (so the shell works, and the console works) and runs until any access to the FPGA (e.g., ``let x=`devmem 0x3000000c` ``, or if you comment that out, ``eval `ts4200ctl --info` ``). Then it hangs. I suspect this is because the kernel doesn't think that range of memory is valid. But I don't know what I need to do (in terms of drivers I'm missing, or in terms of device tree nodes) to teach the kernel about the range, or how to even get any better debugging information printed to the console. As you can see in the device tree, I've tried defining the range as a syscon, but it's not clear from the boot output whether any driver is ever actually looking for that definition and doing anything with it. Any suggestions?

Samuel,


An interesting effort; we are interested to see how it plays out. Please keep us up to date with any new developments as they come.


I don't believe the kernel would hang in that case. Generally, if you attempted to access memory that you shouldn't from userspace, the kernel would yell at you, either by returning a segfault or a bad address error, or by just always returning 0x0. To me, the hang indicates that the kernel is happy with that address access, but the device there is not responding, and the kernel sits forever waiting for it. This can happen for a few reasons; one is that the FPGA needs a clock and a proper reset cycle before it will function.


The bootrom does set up the clock PCK1 to 49.5 MHz, and sets the AT91 PIO PB31 to the correct state to output the PCK1 clock. However, the advent of FDT more or less invalidates this paradigm without significant workaround (which we've had to do on other platforms that we have ported newer kernels to). This is because with the Device Tree, every peripheral is specified in some way and is then touched by the kernel. This means that if you don't specify setting up PCK1 to 49.5 MHz in your FDT, then when the kernel runs through the FDT and sets up peripherals, it's going to essentially turn that output off. The same goes for the PIO pin that the FPGA clock comes out on; if the IOMUX of that pin is not set up, the kernel will put everything in a known sane state, which would likely be I/O mode.


This has its advantages: the kernel touches all of the hardware and puts it in a known sane state. But it can invalidate what the bootrom does.


I would recommend disabling all of the calls in the initramfs to the FPGA. Try as best as you can to get to a shell. Once you are there, you can start poking around at register statuses to confirm the state of the FPGA clock. You may need to manually turn it back on, and then issue a reset via the CPU IO pin to bring the FPGA back to life.
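As a starting point for that poking around, the sketch below reads the clock and pin-mux state with `devmem`. The register addresses (PMC at `0xFFFFFC00`, PIOB at `0xFFFFF600`) and bit positions are recalled from the AT91SAM9260/9G20 register map, not taken from this thread, so treat them as assumptions and verify them against the datasheet before relying on the output.

```shell
#!/bin/sh
# Sketch only: addresses and bit positions are assumptions to verify
# against the AT91SAM9G20 datasheet.

# Succeed if bit number $2 is set in the value $1 (hex or decimal).
bit_set() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

# Read the registers and report the PCK1 / PB31 state. Needs root and devmem.
check_fpga_clock() {
    scsr=$(devmem 0xFFFFFC08 32)   # PMC_SCSR: system clock status (assumed)
    pck1=$(devmem 0xFFFFFC44 32)   # PMC_PCK1: source and prescaler (assumed)
    psr=$(devmem 0xFFFFF608 32)    # PIOB PIO_PSR: 1 = pin under PIO control (assumed)

    if bit_set "$scsr" 9; then
        echo "PCK1 enabled (PMC_PCK1 = $pck1)"
    else
        echo "PCK1 disabled (PMC_PCK1 = $pck1)"
    fi

    if bit_set "$psr" 31; then
        echo "PB31 is in PIO (GPIO) mode, so no clock reaches the pin"
    else
        echo "PB31 is handed to a peripheral function"
    fi
}
```

Calling `check_fpga_clock` once from the stock kernel and once from the new one gives a before/after comparison of what the bootrom set up versus what the new kernel left behind.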


Let me know if you have any further questions, we will do what we can to help. But this is definitely a large undertaking.

Thanks for the reply, Kris. That's really helpful. I'll be sure to update here if I get any further.

One of the difficulties I'm encountering is that because I can't write to the FPGA, I can't disable the watchdog – so if I do push forward to getting a shell, I'll have a very limited amount of time to do any poking around. On the bright side, a cold boot to Busybox is pretty quick!...

You may be able to take our stock image, boot to that, disable the WDT, and then 'kexec' the new kernel. I'm honestly not sure how well this would work.


I can offer you another option, however, this may have other unknown side effects and may void your warranty.


The WDT is fed via a 200 Hz input clock. This design choice was made because it is possible to turn off the FPGA clock to save power. The idea being that an application could feed the WDT for a period of time, turn off the FPGA clock, wait in this lower power state for some time, turn the FPGA clock back on, feed the WDT and repeat. Any failures in this process would still allow the WDT to kick the whole system. This 200 Hz input is fed from U9. If you were to remove this, it would stop the WDT clock, and should hopefully allow you to get to a shell and poke around without fear of being WDT reset.


As I said, this may have other side effects, but looking at the Verilog for that FPGA I don't expect any. And again, if there is any damage caused while removing this part, it will void the warranty on the device.


Hope this helps!


Good news: while muddling around trying to wire up the clock, I found the `clk_ignore_unused` boot argument (`Documentation/clk.txt`), which suppresses the kernel's default behaviour of disabling any clocks which it thinks aren't required by the device tree. Not a great long-term solution, but for now, setting it 1) proves that yes, the FPGA clock being disabled is my only problem, not anything related to memory mapping, and 2) lets me get booted into a shell.
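A quick way to confirm the flag actually reached the kernel is to look for it in `/proc/cmdline` after boot. The helper below is a minimal sketch; `has_bootarg` is a made-up name, and how the command line gets set on this board (bootloader, built-in `CONFIG_CMDLINE`, or something else) is not covered in this thread.

```shell
#!/bin/sh
# Sketch only: has_bootarg is a hypothetical helper, not an existing tool.

# Succeed if the whitespace-separated command line $1 contains the word $2.
has_bootarg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Report whether the running kernel was booted with clk_ignore_unused.
report_clk_flag() {
    if has_bootarg "$(cat /proc/cmdline)" clk_ignore_unused; then
        echo "clk_ignore_unused is set: unused clocks are left running"
    else
        echo "clk_ignore_unused is not set: unused clocks may be gated"
    fi
}
```

The match is on whole words, so an unrelated argument that merely contains the same string does not count.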

The environment is pretty unhappy with me because I didn't copy any of the new kernel modules over. And while the M41T00S RTC and LM73 temperature sensor are accessible (and appear to be working correctly) via sysfs, the i2c-dev device appears as `/sys/class/i2c-dev/i2c-1`, not `i2c-0`, and throws “Device or resource busy” errors on access, so `ts4200ctl` doesn't like life much. More frustratingly/worryingly, `nandctl` runs, but only one `nbd-client` process will run at a time (launching a second one crashes both with a “Device being setup by another task” kernel message); and the `/dev/nbdX` devices can't be mounted (mount outputs “mount: mounting /dev/nbd3 on /mnt/root failed: Invalid argument”, and there are no kernel messages).
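On the I²C side, one common cause of "Device or resource busy" from `/dev/i2c-N` is that a kernel driver is already bound to the target address, in which case i2c-dev refuses the plain `I2C_SLAVE` ioctl. Whether that is what is happening here is an assumption. The sketch below just lists which adapter numbers exist and which client addresses have drivers bound, using standard sysfs paths; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; the optional argument lets
# the sysfs root be overridden for testing.

# Print each i2c-dev adapter and the name its bus driver reports.
list_i2c_adapters() {
    root=${1:-/sys}
    for d in "$root"/class/i2c-dev/i2c-*; do
        [ -e "$d/name" ] || continue
        printf '%s: %s\n' "${d##*/}" "$(cat "$d/name")"
    done
}

# Print each I2C client (bus-address) that has a kernel driver bound to it.
list_bound_i2c_clients() {
    root=${1:-/sys}
    for d in "$root"/bus/i2c/devices/*-*; do
        [ -L "$d/driver" ] || continue
        printf '%s -> %s\n' "${d##*/}" "$(basename "$(readlink "$d/driver")")"
    done
}
```

If the RTC and temperature sensor addresses show up in the second list, a userspace tool that talks to the same addresses will collide with the in-kernel drivers, and the adapter number in the first list shows what bus number such a tool would need to open.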

Still, one thing at a time. I'm going to get the kernel modules sorted, then see where I get from there. Thanks for all the help.

That is definitely a start!


I will caution you that NAND functionality is likely going to be the most complex, even though it's fully in userspace for this exact reason. The problem is that NBD got a huge overhaul a number of years back. It remained backwards compatible to some extent for a while, but I'm not sure of the current state of it. Modern nbd-client binaries only want to be run once, and only want to be passed the whole disk, which should have a proper partition table. From there, it creates /dev/nbdXpY devices, where /dev/nbdX is the whole disk.

I figured NAND would be a fairly large hurdle, and I'd planned to leave it for last – just wanted to see if I could quickly assess how far away from working it was. Thanks for the pointers once more. With only one, whole-disk `nbd-client` instance running (and both the `nandctl` and `nbd-client` block sizes set to 512B), `fdisk -l` evaluates the partition table at the root of the NAND, although it thinks it's 549.8GB in size. And I can mount and operate on the /dev/nbd0pY partitions. I'll keep working off the SD card for now just because my development loop of pull the SD card/attach it to my workstation/blast my modifications onto it/pull it/put it back in the TS-4200 is pretty fast and convenient. But it's nice to know that the NAND might not be too far off from working once I get there.

Anyway, with correct modules in place, I can boot into the full SD card environment (not just the fastboot/Busybox environment). I've got Ethernet, so that's nice. Next challenge still seems to be the userspace I²C access I mentioned before, probably followed by porting the MUXIRQ driver for PC/104. I see there's a similar driver for the TS-4800 in mainline already (`ts4800-irqc`), but from a quick glance at the TS-4800 documentation, it seems like the interface is quite different, so I think it makes more sense to keep the TS-4200 driver separate.
