Lime2 fileserver HD noise?

Started by mbosschaert, November 10, 2024, 05:27:19 PM

mbosschaert

For a long time I've been struggling with random lockups of my Lime2 Rev.L boards when a HD is connected (see earlier posts). After some time the board starts freezing for seconds or minutes at a time (e.g. running dmesg or ps ax may take minutes to complete) and the logs fill up with hard disk errors. As reported earlier, I'm running the latest Olimex Debian kernels, have a sufficient, stabilized power supply, tried to de-noise the HD power supply, reduced the max CPU frequency, etc., without any improvement. One of my fileservers recently crashed again with fatal HD errors, after which I had to run a forced fsck to bring it back. After recovering the HD I reinstalled everything, but rebooted without putting the board and HD back in the enclosure. It has now been running for some days with no lockups, and the board remains as responsive as all the boards I run without HDs connected.

The enclosure I'm using was designed to be as compact as possible while holding both the 2.5" HD and the Lime2 board.

When I took the fileserver out of the cabinet, I found that the bottom of the enclosure, precisely over the CPU location of the Lime2, was badly degraded: the PETG was brittle, and apparently the CPU had heated up so much that the material had discolored to a brownish tint.

In the enclosure, the axis of the hard disk and the CPU of the Lime2 are only a few mm apart.



Would it be possible that the HD produces some RF or magnetic radiation which interferes with the Lime2 processor, creating unpredictable and irregular behaviour?

If this could be the case, what would be a solution (other than designing a larger enclosure)? Would it help to add an alu-foil layer between the HD and the board, connected to the board ground? What other solutions should I consider? Could the wiring (original Olimex power and data wires) cause any trouble? As Olimex still sells the complete sets for NAS, and even provides for a metal enclosure (not physically separating the HD from the board) I do not understand why my setup gives me troubles now for years.


LubOlimex

#1
Interesting pictures. It appears the meltdown is just below the A20 main chip.

My first advice: if you think the case is the cause of the freezes, why don't you test one setup that freezes often without the case and see if it stops freezing? This would be a clear indicator that the problems are related to the case and possible overheating.

Now about the damage to the case: assuming that the board works alright and there is no hardware problem (e.g. a short circuit or a failed component), it is probably caused by a combination of things: extra heat from the hard disk, a small case with no extra holes for air, and a thermally susceptible material used for the cover. It is also not clear where the box is placed, but if it sits on a flat surface with the whole cover in contact with that surface, this will make the thermal issues worse. Avoid placing it directly on a flat surface; use some spacers (rubber feet, bolts, or plastic separators at the four corners of the case).

My advice is to try drilling a few holes so some air can get through, and also to think about fitting a slim aluminum piece over the processor and the RAM. In my tests, this piece lowers the CPU temperature by around 10 degrees C:

https://www.olimex.com/Products/Components/Misc/ALUMINIUM-HEATSINK-20x20x6MM/

But I guess it won't fit between the HDD and the board in your enclosure, so you might need to experiment with another form of aluminum piece to put over the chips. Whatever aluminum piece you get, make sure to attach it over both the A20 CPU and the RAM chips, since these are the components that might heat up and cause issues, and make sure it is attached with an adhesive that is not a thermal insulator. Mind that the aluminum piece should be firmly attached: it is metal, and if it touches certain GPIOs, component legs, or power supply lines it might cause a short circuit.

Quote: Would it be possible that the HD produces some RF or magnetic radiation which interferes with the Lime2 processor, creating unpredictable and irregular behaviour?

No.

Quote: As Olimex still sells the complete sets for NAS, and even provides for a metal enclosure (not physically separating the HD from the board) I do not understand why my setup gives me troubles now for years.

We haven't had reports of the LIME2-SERVER setup overheating, but our case is not that tight, and there are ventilation holes at the bottom and larger holes around the connectors, so some air can circulate. Check this picture of the bottom of the case:

https://www.olimex.com/Products/OLinuXino/Home-Server/LIME2-SERVER-NO-HDD/images/LIME2-SERVER7.jpg
Technical support and documentation manager at Olimex

mbosschaert

Thanks @LubOlimex for your thoughts.
I've now started monitoring the temperature on 5 servers, all with a SATA HD connected. I've left them in their enclosures as they were and will see whether peaks in cpu_temp coincide with the irregular behaviour described above.
I will report back, probably in a few weeks.
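
For anyone who wants to log temperature in a similar way, here is a minimal Python sketch. It is not the script used on the servers above, just one possible approach. It assumes the kernel exposes the SoC temperature in millidegrees Celsius at /sys/class/thermal/thermal_zone0/temp (the zone index may differ per kernel and board, so check /sys/class/thermal/ first). The log file name cpu_temp.log and all function names are made up for illustration.

```python
import time
from pathlib import Path
from typing import Optional

# Assumption: SoC temperature in millidegrees C; the zone index may differ.
THERMAL_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw: str) -> float:
    """Convert a sysfs reading such as '48200\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp(sensor: Path = THERMAL_PATH) -> float:
    """Read the current temperature from the given sysfs file."""
    return parse_millidegrees(sensor.read_text())


def format_log_line(timestamp: float, temp_c: float) -> str:
    """Build one timestamped log line so it can be matched against kernel logs."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
    return f"{stamp} cpu_temp={temp_c:.1f}C"


def monitor(log_file: Path, interval_s: float = 60.0,
            samples: Optional[int] = None,
            sensor: Path = THERMAL_PATH) -> None:
    """Append one reading per interval; run forever if samples is None."""
    taken = 0
    while samples is None or taken < samples:
        line = format_log_line(time.time(), read_cpu_temp(sensor))
        with log_file.open("a") as fh:
            fh.write(line + "\n")
        taken += 1
        if samples is None or taken < samples:
            time.sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical log location; pick any writable path.
    monitor(Path("cpu_temp.log"))
```

The timestamps in the log can then be compared with the times of the hard disk errors in the kernel log to see whether freezes line up with temperature peaks.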