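Failing to manage the technical debt that builds up during iterative development is the biggest reason agile efforts fail. In 2010 Robert Martin, a leading figure in the agile community (known as “Uncle Bob”, Bob being a nickname for Robert, and the initiator of the 2001 Snowbird meeting that produced the Agile Manifesto), wrote an article called “The Land that Scrum Forgot” (article link: https://www.scrumalliance.org/community/articles/2010/december/the-land-that-scrum-forgot; see also https://www.youtube.com/watch?v=hG4LH6P8Syk) that lists many harmful practices that create technical debt. The problem is not limited to agile development. It is just as common under traditional methods, where shortcuts taken under all kinds of excuses are everywhere, and shortcuts are how Tech meets Debt. Many American software organizations are no exception. Below are ten real-world examples of how technical debt arises, given by students in my “Agile Project Management” class. As the old saying goes, borrowing is nothing to fear; what matters is paying it back, so the debt has to be managed seriously.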
1. In a few instances, I’ve encountered code bases stuck on old versions of a programming language. At a previous company, the original software was developed using an old version of Java. As new versions of Java were released, no time was taken to update the deprecated parts of the code for the newer version. Because of this lack of maintenance, the thought of refactoring the code became more daunting and was therefore pushed back even further. The software engineers couldn’t use the new language features, and newer libraries might not be compatible.
2. Many times when velocity is emphasized in a software company, unit testing is what suffers. This might be fine for a simple piece of code, but as the code becomes more complex, or if another developer builds on it later, changing the underlying code might have unexpected consequences. The lack of testing is technical debt. Unit tests give confidence that the code continues to work as expected without having to ask the previous developer what a function was doing.
3. An example of when I incurred technical debt is when I wrote a tree-like structure with some complicated logic inside. I knew it worked for what it was supposed to do, but it wouldn’t scale. Of course, months later, I was asked to add a few conditions to the structure, and by that time I had already forgotten how my own code worked. I had inflicted technical debt on myself. I ended up refactoring the code to make it cleaner, and the feature took a bit longer to complete because it required that refactor.
4. A friend of mine worked for a local company, and he had constant headaches due to unrealistic deadlines set by management. He and his team could see the code base getting more and more fragile with each release, but there wasn’t sufficient time to fix it. So each release ended up taking longer and longer as the code broke in unpredictable ways. Management was obviously upset that deadlines were being missed, but no one was willing to allow the time to fix the underlying problem.
5. Allowing Tool Rot: Only in rare circumstances will my employer retire a product. This means that on occasion we need to fix a bug in 20-year-old source code. If the compiler or IDE that was used ran on DOS or Win 95, you can bet that it will not run on Win 10. If developer resources were unlimited, it would be good to fire up (even better, recreate / reinstall) the development environment every year, to make sure that it still runs and that we can get an md5sum match on the binary. This way, a forced change in the build environment would be caught and could be addressed proactively rather than reactively, with a customer breathing down our necks. An aspect of this can be viewed as an argument FOR having technical debt: upgrading tools and their licenses because it is January, and the boss says we HAVE to build the project even though there are no bugs/changes to be addressed, is a waste of time and money… there may never be a need to deliver updated binaries. Since resources in the real world are limited, a balance must be struck, and a more reasonable interval should be chosen.
6. Allowing Code Rot: Code that has not been touched will not literally rot (the ones and zeroes should be safe in the configuration management system), but it can suddenly be found to no longer work. This can happen if a shared library evolves but not all projects that use it are brought up to date with the changes. When someone later tries to build against the updated library, errors or bugs appear. In some cases this may be intentional: a discovered bug forces an interface change, which leads to compilation errors. In other cases it may be accidental: a change was made without considering all projects, leading to a hidden bug. A more thorough impact analysis of the library changes would reveal that the module is used in other projects, and those projects should then be included in the analysis.
7. Weak collaboration / documentation: Where I work, we are a small team and don’t always collaborate enough on projects. As a result, one developer might be the only one to touch an element of a project (maybe the firmware, or perhaps one of the supporting PC applications). When this developer leaves the company, much of the knowledge leaves with them. This problem could be addressed by the XP practice of pair programming, as well as by better peer review of the source AND the documentation. Simply producing documentation is not enough; the team needs confidence that the documentation effectively conveys the system architecture and behavior.
8. Undefined SCM collaboration strategies: It isn’t rare to find repositories that aren’t configured to follow any particular collaboration strategy. Teams need to configure their repositories to prevent changes from being pushed directly to the main development branch. Tags need to be used exclusively for project releases. Projects’ commit histories need to be kept clean. Branches need to follow naming conventions and have merging strategies that make them eligible for merging onto master.
9. Minor or temporary fixes for major bugs increase technical debt. Management and/or the customer may simply want to release a product even though it has known issues. When this occurs, the options are either to delay the release and fix the known bugs completely, or to apply a small quick fix (find a way around the problem) and live with it. Although the quick fix gets the product up and running faster, it doesn’t get rid of the problem. The real fix is pushed to a later date instead of the team being allowed to extend development and finish the product. This increases the technical debt because the known issues are still there, so users may waste time on the workaround, and a new project will surely be needed later to fix the issues. A whole new project or process to fix leftover issues takes more time than fixing them while the development project was still open.
10. Technical debt due to lack of design. Last year, our team was tasked with adding 6 new types of functionality to our software. Each of the types was broken into 2-3 subordinate functions, and the work was distributed to the engineers based on skill level, experience, and workload. We came upon a situation where we needed to pass a relatively complicated function to the most junior engineer on the team. We gave him the opportunity to trade, or to simply work with a more experienced co-worker, and he politely refused. He was eager to get started, and was sure he could do it all on his own. At my place of business, when new procedures are required that will help the system perform tasks it has never done before, we hold design meetings with all the team members. This allows everybody to help find use cases, edge cases, test procedures, and potential pitfalls. At the end of each meeting, the engineer responsible for that function creates a sequence diagram, which we peer review, and then we all go code and test our software. In this specific case, the young engineer created a UML diagram, and in the peer review we told him to fix several things. Our process is such that once the peer review is completed and defects are assigned, we trust the engineer to take care of them from there. In this case the young man just took notes on what we told him and never updated his diagram… furthermore, he coded his function more or less from memory, since the software couldn’t possibly have worked the way his sequence diagram was written. The result was buggy code, missed requirements, and a software segment that needed to be almost completely rewritten! Luckily for us, we caught it only a few weeks after it was completed, so it didn’t cause any real damage.
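The unit-testing example above argues that tests record what a function is supposed to do, so a later developer doesn't have to ask the original author. Here is a minimal sketch of that idea in Python. The function `shipping_cost` and its pricing rule are hypothetical, invented purely for illustration; they don't come from any of the students' projects.

```python
import unittest


def shipping_cost(weight_kg: float) -> float:
    """Hypothetical rule: flat fee of 5.0 up to 1 kg, then 2.0 per extra kg."""
    if weight_kg < 0:
        raise ValueError("weight must be non-negative")
    if weight_kg <= 1.0:
        return 5.0
    return 5.0 + 2.0 * (weight_kg - 1.0)


class ShippingCostTest(unittest.TestCase):
    """Each test pins down one piece of behavior a future change must keep."""

    def test_flat_fee_up_to_one_kg(self):
        self.assertEqual(shipping_cost(0.5), 5.0)
        self.assertEqual(shipping_cost(1.0), 5.0)

    def test_surcharge_above_one_kg(self):
        self.assertEqual(shipping_cost(3.0), 9.0)

    def test_negative_weight_is_rejected(self):
        with self.assertRaises(ValueError):
            shipping_cost(-1.0)
```

Saved as a module, the tests can be run with `python -m unittest`. If someone later changes the pricing logic, a failing test points at the behavior that broke.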
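The tool-rot example above suggests periodically rebuilding an old product and checking for an md5sum match against the released binary. Below is a rough Python sketch of that comparison step. The function names and the idea of passing in the recorded checksum as a string are assumptions for illustration; how the old toolchain is actually rebuilt is outside the sketch.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_matches_release(rebuilt_binary: Path, released_md5: str) -> bool:
    """True if the freshly rebuilt binary has the checksum recorded at release.

    released_md5 is assumed to be the hex digest stored alongside the release
    (for example, as printed by md5sum); case and whitespace are ignored.
    """
    return md5_of_file(rebuilt_binary) == released_md5.strip().lower()
```

A mismatch doesn't say what changed, only that the build environment no longer reproduces the shipped binary. That is the early warning the student describes wanting before a customer is waiting on a fix.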
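The SCM example above calls for branch naming conventions and for blocking direct pushes to the main branch. Here is a small sketch of such checks in Python, of the kind that could be called from a repository hook or a CI step. The prefixes (`feature/`, `bugfix/`, `hotfix/`, `release/`) and the protected set are one possible convention assumed here, not something the student's team specified.

```python
import re

# Assumed convention: "<type>/<lowercase-slug>", e.g. "feature/login-form".
BRANCH_PATTERN = re.compile(
    r"(feature|bugfix|hotfix|release)/[a-z0-9][a-z0-9._-]*"
)

# Branches that should only change through reviewed merges (assumption).
PROTECTED_BRANCHES = {"master"}


def is_valid_branch_name(name: str) -> bool:
    """Check a branch name against the assumed naming convention."""
    return BRANCH_PATTERN.fullmatch(name) is not None


def direct_push_allowed(branch: str) -> bool:
    """Direct pushes are refused for protected branches."""
    return branch not in PROTECTED_BRANCHES
```

For example, `is_valid_branch_name("feature/login-form")` is accepted, while `is_valid_branch_name("Fix_stuff")` is rejected because it has no recognized prefix. In practice most hosting platforms offer built-in branch protection, which would normally be preferred over a hand-written check.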