We engineers spend far too large a share of our lives trying to repair the damage that the gremlins do to our designs and products. You know what I mean–the odd and sometimes downright evil quirks that show up when you least expect them–software super-bugs, local fluctuations in the laws of physics, that really weird circuit glitch that shows up between -4 and -2 °C and only on Tuesdays when it’s dark. When the design just doesn’t work right; when manufacturing yields go from three 9s to three 7s overnight; when your user interface flashes blue once every 297 hours….
We all curse these gremlins and we all try to get ahead of them–to design smarter, write better code, check our work again and again. But still they come, like an alien flood, to jack our schedules and interfere with our four hours of sleep a night.
Beyond all of the above, there is one set of tools in our tool chest which can help: the sometimes maligned Failure Mode and Effects Analysis, or FMEA. Basically an FMEA is a highly structured gedanken experiment whereby the cross-functional product development team attempts to predict possible failure modes and rate them for Severity (what bad things will happen if these failure modes do occur), Occurrence (how often the mode might occur) and Detection (how easily these failures can be detected). FMEAs can be done on designs (DFMEAs) or processes (PFMEAs) or frankly just about any aspect that fits the product. DFMEAs, for example, can be done at the component level all the way up through the system level. For sample templates and score sheets see our website’s download section, http://www.zebulonsolutions.com/index_files/Page380.htm.
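To make the scoring a little more concrete, here is a minimal sketch in Python. It rests on assumptions that are not spelled out above: each of the three ratings is scored from 1 to 10, a higher Detection score means the failure is harder to catch, and the three are multiplied into a Risk Priority Number (RPN), which is one common convention and not the only way to rank failure modes. The `FailureMode` class, the `rank_by_rpn` helper and the example scores are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of a hypothetical FMEA score sheet (1-10 scales assumed)."""
    name: str
    severity: int    # 1 = negligible effect, 10 = catastrophic
    occurrence: int  # 1 = very rare, 10 = almost certain
    detection: int   # 1 = easily caught, 10 = nearly undetectable

    def __post_init__(self):
        # Reject scores outside the assumed 1-10 range.
        for field in ("severity", "occurrence", "detection"):
            value = getattr(self, field)
            if not 1 <= value <= 10:
                raise ValueError(f"{field} must be 1-10, got {value}")

    @property
    def rpn(self) -> int:
        # Risk Priority Number: a common (assumed) way to combine the scores.
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Illustrative scores only; a real team would argue over every one of these.
modes = [
    FailureMode("UI flashes blue", severity=3, occurrence=2, detection=8),
    FailureMode("Cold-temperature circuit glitch", severity=7, occurrence=3, detection=9),
    FailureMode("Connector fatigue", severity=8, occurrence=4, detection=2),
]

for m in rank_by_rpn(modes):
    print(f"{m.rpn:4d}  {m.name}")
```

With these made-up numbers the cold-temperature glitch lands at the top of the list, mostly because it is so hard to detect. That is the sort of ranking the team would then turn into action items.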
The good news is that FMEAs really work in terms of identifying where the gremlins might be hiding before they manifest themselves as schedule slips or field failures. The bad news is that the medicine is viewed by many as being nearly as bad as the disease. To do an FMEA right you must lock your best people in a windowless room for 12 hours a day, confiscate their phones and cut off their WiFi (supplying lots of coffee and munchies, however, is allowed). Then they must crawl through the design (or process or whatever) component by component, subsystem by subsystem. And then there is the process itself, which must be followed–an outside facilitator is pretty much mandatory just to keep it under control.
But when the team finally stumbles back into the light of day, rubbing their eyes in the bright moonlight (unless your company is in the north of Sweden on Midsummer's Day, you are unlikely to finish during daylight), they are finally armed to fight the gremlins. It should be noted, however, that following up on the action items (AIs) is also mandatory; see below.
I am actually not much of an expert on this, but my partner at Zebulon Solutions, Keith Howard, really is. He grew up in the automotive industry, where such practices are very ingrained, and has facilitated dozens of FMEAs and taught the subject as well. We started offering DFMEA facilitation as a service a few months back, as part of our productization services offering, and to my surprise (but not Keith’s) we had three such engagements in as many months, for a diverse set of customers spanning broad market areas. Since I am personally heavily involved with one of these customers as an interim VP of Engineering, I can safely say that while we did not find all the gremlins, we did gain some ground back from them. And frankly, at least once a month a gremlin pops up and, when I look back at the DFMEA report, I find that we did ID that gremlin too; we just did not follow up rigorously enough (see above–attack those AIs!). That is perhaps a subject for a future blog on the relative uselessness of information which is not acted upon. But still we, and our customers, learned from these DFMEAs where the gremlin caves are, where the gaps in our force fields might be, and what weapons may thwart the flood.
Good luck with your battle against the gremlins.