Like many of you, I’ve had some frustrations with my L3+ units, mostly surrounding the hash boards and their various failures. Bitmain released an L3+ Maintenance Guide in 2018 that did a great job summarizing some basic failures and how to identify them, and then the Internet took over with the various fixes. I’ve reviewed Bitmain’s manual many times and have boiled down most of the pertinent info below, and I’ve added my two cents as well. Make that more than two cents: I dive a little deeper into some of their topics and try to explain them in more detail so they’re easier to understand. Their manual can be cryptic and it’s missing a lot of the middle pieces that tie things together, so I’ve attempted to fill in the blanks as well. Some of the graphics and schematics are pulled from their document, so I’d like to acknowledge them here.
This is meant to be a living document so I’ll update it as I find new solutions to problems. Furthermore there are several areas where I couldn’t find enough information or wasn’t able to reverse engineer how something functions so I denoted those with an assumption note.
If you need to order any of these parts you can find a link to vendors here.
I’m all about sharing so I’m putting this all out there to help everyone out, but please, if you re-use some of this give me a shout-out. I’ve had things plagiarized in the past and while it’s not anything earth shattering, it’s also a work in progress so I don’t want someone putting info out there that’s not complete nor something they can explain as to how they came to a specific conclusion. It not only makes them look bad, but it can cause distrust in the data being put out there. Long story short, share it and shout outs.
A general note on testing voltages at the various test points: you must have the hash board power and I/O cable plugged in, along with a working L3 control board or test fixture. The PIC (which provides the signal to turn on the 10V) won’t initialize the 10V buck controller until it receives the programming information from the control board. Without that, the 10V rail never comes up, so the 14.2V boost and the individual power regulators won’t have power either.
Another quick note: if you don’t have one, a hot air rework station will make some of these fixes easier. You can pick up a station complete with a hot air gun and soldering iron for about $50. I use the FEITA 8586 SMD Hot Air Rework Station from Amazon.
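To make that power-up order easier to picture, here is a minimal Python sketch of the dependency chain described above. The stage names are my own labels, and lumping all of the 1.8V regulators under the 10V rail is a simplification, so treat it as an illustration rather than a schematic.

```python
# Each stage lists what it needs before it can come up.
# Stage names are hypothetical labels, simplified from the note above.
DEPENDS_ON = {
    "control_board_link": [],            # power + I/O cable to a working control board
    "pic": ["control_board_link"],       # PIC waits for programming info
    "10V_buck": ["pic"],                 # PIC enables the 10V buck controller
    "14.2V_boost": ["10V_buck"],         # boost is fed from the 10V rail
    "1.8V_regulators": ["10V_buck"],     # simplification: all LDOs need the 10V rail
}


def dead_if_missing(missing):
    """Return every stage that cannot come up when `missing` is absent."""
    dead = {missing}
    changed = True
    while changed:
        changed = False
        for stage, needs in DEPENDS_ON.items():
            if stage not in dead and any(n in dead for n in needs):
                dead.add(stage)
                changed = True
    return sorted(dead)


if __name__ == "__main__":
    # With no control board attached, nothing downstream has power to measure.
    print(dead_if_missing("control_board_link"))
```

The practical takeaway is the same as the note: a dead reading on a downstream test point means nothing until you know everything upstream of it is alive.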
PIC (programmable interface controller)
U3 (U74 on V1.6) is a PIC16LF1704/5 8-bit microcontroller. This is what makes the hash board, well, hash. If it is not programmed properly the hash board may exhibit all sorts of strange errors: anything from missing chips to bad temperature readings, no or partial hashing, or the board not being recognized at all.
The PIC takes the frequency and voltage information from the L3 Control Board (what you set through the software interface) and adjusts the voltage output and frequency for the BM1485 chain. The PIC communicates with the other ASICs and ICs over I2C (SDA and SCL, plus GND), which is basically a chip-to-chip communication protocol.
It is possible to reprogram the PIC with code pulled from one of your other hash boards, which I will cover in another post at a later date (it’s quite an in-depth process and involves a PICkit 3 programmer and a little luck).
Common Failures related to U3/U74 (PIC)
No 10V found (measured at both sides of C948 or between either pin of L2 and ground.)
- The PIC may have dropped its code or may have cold solder joints. There are two methods to replace the code. The first is flashing a custom firmware that will update it; blissz did a great job walking folks through this. I haven’t tried this method myself, although it seems to be successful and a little more straightforward. The other method is to attach a PICkit 3, manually pull the code from a working hash board in your system, then dump it into your bad hash board.
Improper voltage (less than 10V) measured at C948.
- Your buck converter may have failed or may have cold solder joints. First try reflowing the joints, and if that doesn’t work, replace the part. Depending on the voltage it puts out, you may still be able to run the board at a lower frequency. I have one board with a failing buck converter (it only puts out around 8V, and it varies) that I’ve tested at various frequencies and have working up to 300MHz. That’s more of an FYI and something my curiosity had me research. You should always make the repair rather than operate a board in abnormal conditions; I’m just waiting for a replacement part at the moment.
Temperature Sensor
U87 (U88 on V1.6) is a T451/TMP451 temperature sensor. This is powered by the 1.8V voltage regulator in the first chain. You can get failures if the voltage regulator (U86 on V1.6 hash board) is damaged by a voltage spike/surge.
Reviewing the schematic of the temperature sensor, we can see that the T451 communicates on the hash board over I2C. Besides its own built-in temperature sensor, it has an external sensing input (pin 2 (D+) and pin 3 (D-)) that I believe is wired to ASIC 1.
TEMP_P and TEMP_N connect to the PNP or NPN transistor in the ASIC (BM1485) itself that is used to read the temperature. This is most likely a thermal substrate transistor that feeds the TMP451, which processes the signal and then reports the result out via the I2C bus. I also see there are pads across the board for numerous T451s, probably an enhanced feature they never implemented?
Common Failures related to U87/U88 (Temperature Sensor)
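If you ever sniff the I2C bus and want to sanity check what the sensor is reporting, here is a small Python sketch of how a TMP451 reading decodes. It is based on my reading of the TI TMP451 data sheet (whole degrees in the high byte, sixteenths of a degree in the upper nibble of the low byte, and a 64 degree offset in extended range mode). The function name is mine, and I have not confirmed which range mode the L3+ firmware uses, so that part is an assumption.

```python
def tmp451_celsius(high_byte, low_byte, extended_range=False):
    """Convert a TMP451 high/low temperature byte pair to degrees Celsius.

    Assumption: format taken from the TMP451 data sheet, not from Bitmain.
    """
    if not (0 <= high_byte <= 0xFF and 0 <= low_byte <= 0xFF):
        raise ValueError("register values must be single bytes")
    # High byte is whole degrees; upper 4 bits of the low byte are 1/16 degree steps.
    temp = high_byte + (low_byte >> 4) * 0.0625
    if extended_range:
        # Extended range mode stores the reading with a +64 offset.
        temp -= 64
    return temp


if __name__ == "__main__":
    print(tmp451_celsius(0x19, 0x80))  # 25.5
```

A reading that decodes to something wildly off (or never changes) while the 1.8V supply on pin 1 is good points toward the sensor or its D+/D- connection rather than the power domain.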
“Failure to get temp data” error in L3+ Kernel log.
- Check voltage levels between pin 1 (1.8V) and pin 5 (GND). Improper voltage from the first LDO chain can keep the temperature sensor from working properly.
- Your temperature sensor may have failed or may have cold solder joints. First try reflowing the joints and if that doesn’t work, replace the part.
BM1485 ASIC
There are a total of 72 BM1485 ASICs on board the L3+ hash board. There is no data sheet readily available for the chip, but from what I have gleaned from the info Bitmain has put out, there aren’t too many failures that are easily tied to these (at least nothing that’s an easy DIY fix). These are powered by 12 individual 1.8V power domains.
(*Note: assumption) The clock signal comes from Y1, a 25 MHz crystal, and each board communicates with each other through the CI/CO signals for synchronization.
If one BM1485 in the chain goes bad, it may break communication to the BM1485s down the line. I’ve found that these can be either fairly simple or really hard to diagnose. One key here is that if you get an error that shows a lower number of ASICs, like 27 ASICs found, then ASIC 28 may be your problem. You can verify this by checking the test points I have listed below under the 1.8V Power Domain.
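The "27 found, so look at 28" rule of thumb is easy to put into a few lines of Python. This is just a sketch: the function name is mine, and numbering the domains from 1 starting at ASIC 1 (6 ASICs each, from 72 ASICs over 12 domains) is my assumption about how you would label them on the bench.

```python
ASICS_PER_BOARD = 72
ASICS_PER_DOMAIN = 6  # 72 ASICs spread over 12 power domains


def first_suspect(asics_found):
    """Return (suspect ASIC number, power domain number), or None if all were found.

    Assumption: domains are numbered from 1, starting at ASIC 1.
    """
    if not 0 <= asics_found <= ASICS_PER_BOARD:
        raise ValueError("ASIC count must be between 0 and 72")
    if asics_found == ASICS_PER_BOARD:
        return None
    suspect = asics_found + 1
    domain = (suspect - 1) // ASICS_PER_DOMAIN + 1
    return suspect, domain


if __name__ == "__main__":
    print(first_suspect(27))  # (28, 5)
```

So a kernel log showing 27 ASICs would send you to ASIC 28, which sits in the fifth group of six (ASICs 25-30). It’s only a starting point, since the chip before it or the domain’s regulator could just as easily be the real culprit.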
Furthermore errors such as CRC5 (as found in the kernel log) often point to a bad ASIC. I recently diagnosed this in my article on What to do with 0 asics found and CRC5 errors.
Without a data sheet there is quite a bit of mystery behind how this actually works, but a little reverse engineering and research has led me to assume some of the following (key word is assume):
(*Note: big time assumptions here)
The frequency on each BM1485 (as set in your “Miner Configurations” pages) is set through programming received on CI/CO (command input/output) and RI/RO (respond input/output.) A command string is sent to the BM1485 to set the PLL divide for the various frequency levels on the BM1485.
U1 (ASIC 1) is actually the BM1485 ASIC that the temperature is read from. The TMP451 reads the temperature data from pins 6 and 7 of the BM1485, processes this and returns the information on the I2C bus to pins 15 and 16 of the BM1485.
One question I’m still waiting on Bitmain to answer (not sure if they will ever get to it) is whether or not RF and TF are wired up to each ASIC for the I2C bus.
Common Failures related to the BM1485 ASIC
ASICs showing as “x” or no ASICs found.
- Check voltage levels at the following test points on each voltage domain (set your ground lead on the negative side of the tantalum capacitor or ground plane closest to each set of test points). Make sure the boards are plugged into power and the I/O cables are connected to the control board. Without this the hash board won’t initialize the voltage domains (if using a test fixture, go ahead and press the start button, or IP_Sig on the control board).
- RST – 1.8V
- BO – 0V
- RI – 1.6V – 1.8V
- CO – 1.6V – 1.8V
- CLKO – 0.9V
Note: The signals for RST, BO, CO, and CLKO run from U1 to U72. RI actually runs in reverse, going from U72 to U1.
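If you are logging multimeter readings for a lot of test points, a small checker like this Python sketch can flag the odd ones out. The expected values are the ones listed above; the 0.1V tolerance and the function name are my own assumptions, so adjust to taste.

```python
# Expected test point voltages (low, high) from the list above.
EXPECTED = {
    "RST": (1.8, 1.8),
    "BO": (0.0, 0.0),
    "RI": (1.6, 1.8),
    "CO": (1.6, 1.8),
    "CLKO": (0.9, 0.9),
}
TOLERANCE = 0.1  # assumption: allowable meter/board variation in volts


def check_test_points(readings):
    """Return a list of problems for one set of test point readings (in volts)."""
    problems = []
    for name, (low, high) in EXPECTED.items():
        value = readings.get(name)
        if value is None:
            problems.append(f"{name}: not measured")
        elif not (low - TOLERANCE <= value <= high + TOLERANCE):
            problems.append(f"{name}: {value:.2f}V, expected {low}V to {high}V")
    return problems


if __name__ == "__main__":
    # Example readings with a dead clock output.
    print(check_test_points({"RST": 1.79, "BO": 0.0, "RI": 1.7, "CO": 1.65, "CLKO": 0.0}))
```

Because RST, BO, CO, and CLKO travel from U1 toward U72 while RI travels the other way, the first test point set that fails (walking in the direction of the signal) is usually the most interesting one.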
1.8V Power Domains for BM1485
The L3+ has a total of 72 ASIC chips spread across 12 voltage domains (with 6 BM1485s each). Each voltage domain is controlled by a 1.8V voltage regulator, and the part used has changed over the various versions of the hash board.
* Version 1.5 and earlier hash boards – One SPX5205M5-L-1-8 1.8V voltage regulator (pkg SOT23-5) for each of the 12 voltage domains (6 – BM1485 ASICs per domain.) These are generally labeled as R18 on the top along with the date code.
* Some versions of the hash board have also used the SGM2202-1.8 (pkg SOT23-5.) These are generally labeled as G49xx on the top (xx is date code.)
* Version 1.6 hash board – U75, U76, U77, U78, U79, U80, U81, U82, U83, U84, U85, and U86 are LN1134A182MR 1.8V voltage regulators (pkg SOT23-5L) for each of the 12 voltage domains (6 – BM1485 ASICs per domain.) The first 10 domains are generally labeled 4VK4. Make sure to use higher input voltage LN1134’s for the last 2 domains as their input voltage from the boost circuit is greater than the other domains. The last two may be labeled as 4AK4.
These are extremely susceptible to damage from a voltage spike out of the boost circuit. To measure the output of each one, measure between pin 5 (output) and pin 2 (ground). You can measure the input voltage between pin 1 or pin 3 and pin 2.
Common Failures related to 1.8V Voltage Domains
ASICs showing as “x” or no ASICs found.
- Check voltage levels at the following test points on each voltage domain (set your ground lead on the negative side of the capacitor closest to each set of test points). Make sure the boards are plugged into power and the I/O cables are connected to the control board. Without this the hash board won’t initialize the voltage domains.
- RST – 1.8V
- BO – 0V
- RI – 1.6V – 1.8V
- CO – 1.6V – 1.8V
- CLKO – 0.9V
Note: The signals for RST, BO, CO, and CLKO run from U1 to U72. RI actually runs in reverse, going from U72 to U1.
If you encounter a voltage domain whose regulator does not output 1.8V on pin 5, the regulator most likely needs to be replaced. It’s common for these to go out, especially during voltage spikes. The boost circuit has been known to knock out the later domains (ASICs 49-54, 55-60, 61-66, and 67-72) when it fails.
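For anyone recording the regulator output of all 12 domains, this Python sketch maps each weak reading back to the ASICs it powers. The domain numbering (1 through 12, starting at ASIC 1), the 1.7V cutoff, and the function names are assumptions on my part.

```python
ASICS_PER_DOMAIN = 6
DOMAINS = 12


def domain_asics(domain):
    """Return (first ASIC, last ASIC) for a domain numbered 1 to 12."""
    if not 1 <= domain <= DOMAINS:
        raise ValueError("domain must be between 1 and 12")
    first = (domain - 1) * ASICS_PER_DOMAIN + 1
    return first, first + ASICS_PER_DOMAIN - 1


def weak_domains(ldo_outputs, minimum=1.7):
    """List (domain, first ASIC, last ASIC) for every regulator reading below `minimum`.

    ldo_outputs: regulator output readings in volts, domain 1 first.
    minimum: assumed cutoff, a little under the nominal 1.8V.
    """
    return [
        (domain, *domain_asics(domain))
        for domain, volts in enumerate(ldo_outputs, start=1)
        if volts < minimum
    ]


if __name__ == "__main__":
    readings = [1.8] * 10 + [1.2, 1.1]
    print(weak_domains(readings))  # [(11, 61, 66), (12, 67, 72)]
```

If the weak readings cluster in the last few domains like in this example, look hard at the boost circuit before ordering a pile of regulators.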
10V Circuit
U1 is an LM27402 synchronous buck controller. On the V1.5 it is called U74, and on the V1.6 it is called U200. On versions 1.5 and later it is a uP9305W synchronous buck controller, with additional components that were not on the earlier versions.
This output value is set by the PIC and is derived from the value that the user selects in the firmware (i.e. if you change the chain voltage from the “Miner Configuration -> Advanced Settings” page this is where that voltage changes.)
Q1, Q3 (not installed), Q4, and Q5 are TPHR9003NL N-channel MOSFETs, used as the switching transistors in the 10V circuit. These rarely fail and they are fairly difficult to isolate.
On version 1.6 there are additional components in the 10V circuit: Q30 and Q32 are 8040 NPN silicon epitaxial planar transistors, and Q31 and Q33 are 6040 PNP silicon epitaxial planar transistors.
Common Failures related to the 10V circuit
Improper voltage (less than 10V) measured at C948.
- Your buck converter may have failed or may have cold solder joints. First try reflowing the joints, and if that doesn’t work, replace the part. Depending on the voltage it puts out, you may still be able to run the board at a lower frequency. I have one board with a failing buck converter (it only puts out around 8V, and it varies) that I’ve tested at various frequencies and have working up to 300MHz. That’s more of an FYI and something my curiosity had me research. You should always make the repair rather than operate a board in abnormal conditions; I’m just waiting for a replacement part at the moment.
14.2V Boost Circuit
U111 is an RT8537 switching power supply. It boosts the 10V supply (coming through the inductor L2) to 14.2V. It’s generally considered the most finicky part of the hash board and seems to fail quite often. Early on, Bitmain discovered that components were oxidizing and leading to failure, so they started covering the circuit in epoxy. Why, you may ask? I have no idea why this would oxidize any more or less than other parts of the board, but they seemed convinced this would fix the problem. At the end of the day, nope: they still fail and we still add external boost circuits.
Basically this circuit boosts 10V from the buck controller to 14.2V. L1 is used to store the energy for the boost and D1 isolates the 14.2V BOOST_OUT through reverse protection. When you install an external boost circuit you have to remove D1 to eliminate the possibility of interference from the old boost circuit.
Failure of the boost circuit can also cause damage to the 1.8V voltage domains.
Common Failures related to the 14.2V boost circuit
Improper voltage at D1 (less than 14V, generally much less.)
- Your on-board boost circuit may have failed. Install an external boost circuit and remove D1 to isolate the on-board boost from the external boost.
- Your 10V from the buck controller may also be off. This should be researched and solved before assuming the boost circuit is bad.
- If your boost circuit is failing it may still show 14V, but the last two power domains may appear to have bad LDOs (good input voltage, but an output below 1.8V). The LDOs are most likely fine. Try installing an external boost circuit, as this should fix the problem.
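The order of those checks matters, so here is a rough Python sketch of the triage as a decision function. The threshold defaults (9.8V, 14.0V, 1.7V) and the function name are my assumptions; in particular the 10V rail is set by the chain voltage in the firmware, so adjust `buck_min` for your own settings.

```python
def triage_boost(v_buck, v_d1, last_two_ldo_out,
                 buck_min=9.8, boost_min=14.0, ldo_min=1.7):
    """Suggest a next step from three sets of readings (all in volts).

    v_buck: the 10V rail measured at C948
    v_d1: boost output measured at D1
    last_two_ldo_out: regulator outputs of the last two power domains
    Thresholds are assumed values, not Bitmain specifications.
    """
    if v_buck < buck_min:
        return "10V rail is low: sort out the buck controller first"
    if v_d1 < boost_min:
        return "boost has failed: install an external boost and remove D1"
    if any(v < ldo_min for v in last_two_ldo_out):
        return "boost is suspect despite 14V: try an external boost before replacing LDOs"
    return "boost circuit looks OK"


if __name__ == "__main__":
    # Boost reads fine at D1, but the last domain sags.
    print(triage_boost(10.0, 14.2, [1.8, 1.3]))
```

The point of checking the 10V rail first is that a sagging buck output makes the boost look bad when it isn’t.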
Other Components (data later on)
Q10 is a 1AMP SOT-23 package which I believe is a MMBT3904 NPN general purpose amplifier. More on this later once I figure out what’s going on here.
Y1 is a 25.000 MHz crystal that’s used as the clock signal for all ASICs. If this goes out, nothing works. You’ll need an oscilloscope to verify it.
The 10V circuit and 14V boost circuit are running normally, but the hash board is still not showing up or being detected. What are the other possible causes?
Sorry, my English is not good, I’m from Indonesia.
Please help me sir.
Try checking out these solutions: https://asicbasics.com/fix-it/fix-it-l3/
Best of luck!