Sundarrajk's Weblog

Archive for the ‘Software’ Category

Your Code As a Crime Scene: Use Forensic Techniques to Arrest Defects, Bottlenecks, and Bad Design in Your Programs by Adam Tornhill
My rating: 4 of 5 stars

A very different way of looking at problems in the code. The author suggests various non-traditional means of identifying problems in software development.

The signals he uses include the number of times a piece of code has changed, code churn (the number of lines added and removed), the number of developers working on a single piece of code, and pieces of code that change together.

The primary requirement for all of this is that version control be used correctly. Every developer should have a personal login ID and should use it to check in at regular intervals.

Many of the techniques provided in the book are available in the form of the tool from…. These are only starting tools. They need to be coupled with other tools like d3.js to get a good representation of the status of the code.

A must read for all software developers.


How to Stop Sucking and Be Awesome Instead by Jeff Atwood
My rating: 4 of 5 stars

Another nice book of blog entries from Jeff Atwood.

The first few blog entries are about how one should determine very early on whether one can program, and drop out of a programming career if one cannot. He argues that the “sheep that can program and goats that cannot program” should be separated out early in their careers so that software can become better.

One of the key observations that I liked is: “You have to truly believe, as a company, and as peers, that crucial innovations and improvements can come from everyone at the company at any time, in bottom-up fashion – they aren’t delivered from on high at scheduled release intervals in the almighty master plan.”

In another blog entry he speaks about how important it is to persuade others to do something. He refers to a piece of dialogue from the movie based on Idi Amin, in which Amin is speaking to his trusted aide, a Scottish doctor.
Amin: I want you to tell me what to do!
Garrigan: You want me to tell you what to do?
Amin: Yes. You are my advisor. You are the only one I can trust here. You should have told me not to throw the Asians out in the first place!
Garrigan: I did!
Amin: But you did not persuade me, Nicholas. You did not persuade me!

Not a very atypical dialog one is likely to have with either one’s manager or client. 🙂

Another important piece of advice, with which I could not agree more since I have given the same advice to many others who have asked for my opinion: “Whatever project you are working on, consider it an opportunity to learn and practice your craft. It is worth doing because, well it is worth doing. The journey of the project should be its own reward regardless of whatever happens to lie at the end of that journey.”
The corollary is another thing that I keep stressing: “Never get attached to a project. Execute the project to the best of your abilities, learn along the way, and if for some reason beyond your control the project fails or does not see the light of day, so be it.”

Speaking on merit-based growth in an organization, Jeff has the following to say: “Remove barriers that rob people in management and in engineering of their right to pride of workmanship. This means [among other things] abolishment of annual or merit rating and management by objectives.” “Even people who think of themselves as Deming-ites have trouble with this one. They are left gasping. What the hell are we supposed to do instead? Deming’s point is that MBO and its ilk are copouts. By using simplistic extrinsic motivators to goad performance, managers excuse themselves from harder matters such as investment, direct personal motivation, thoughtful team formation, staff retention, and ongoing analysis and redesign of work procedures. Our point here is somewhat more limited: Any action that rewards team members differentially is likely to foster competition. Managers need to take steps to decrease or counteract this effect.”

In one blog entry Jeff compares the F-86 and the MiG-15. The latter was far superior to the former, but the fighter pilots preferred the former. The difference was that the F-86 had hydraulic flight controls compared to the manual flight controls of the MiG-15. This meant that each maneuver increased the fatigue of the MiG-15 pilot, even though he might have out-maneuvered the F-86 pilot. The F-86 pilot, being less fatigued, could maneuver more quickly than the MiG-15 pilot, and this tilted the pilots to favour the F-86 despite its limited abilities. Jeff calls this Boyd’s Law of Iteration, which states “Speed of iteration beats quality of iteration”. Jeff argues the same is true for software development, although he says in other places that quality cannot be sacrificed beyond a point.

There is a whole set of blog entries on User Interface and Usability. One book the author highly recommends is Don’t Make Me Think by Steve Krug… and another one is Rocket Surgery Made Easy, again by Steve Krug….
He speaks about Fitts’s law, which in practice means “Put all commonly accessed UI elements on the edges of screen. Because the cursor automatically stops at the edges, they will be easier to click on. Make clickable areas as large as you can. Larger targets are easier to click on”. One should not ignore the corollary of this rule, which would read “Make all the clicks that the user must be kept safe from as difficult as possible”. Jeff refers to this principle as the “seat ejector” button. This button should be easy to find in an emergency, but should not be placed such that the pilot ends up pressing it instead of turning on the navigation lights. Buttons like “delete all my mails” should be available, but should be placed such that no user would click them by mistake.

Speaking on the importance of saying no to demands, Jeff says “It is easy to dismiss Just say No as a negative mindset, but I think it is a healthy and natural reaction to the observation that optimism is an occupational hazard of programming”. I cannot agree with him more, as I have been pulled up for saying no many a time in my career.

Speaking about usability, Jeff argues that most users of an application do not progress beyond the intermediate stage. He argues that most either move from the novice to the intermediate stage quite quickly or drop off as a user of the system if they find it too difficult to use. Once they reach this stage they remain in it for a long time, and only a very few move to the expert level. Given this, he argues that software should be targeted at these users rather than at the novices or the experts. He states that most marketing people would advocate making software for the novice, as novices are the users that marketing people encounter most of the time, while software developers would want to address the experts, as they are likely to be geeks themselves and would want maximum flexibility and the most features.

Writing on security and hacking, the author says that today one needs social skills rather than technical skills to hack into a site. Technology has advanced to a level where hacking into a website has been made difficult enough, but people are not immune to social engineering, which is a much easier way of hacking into a system.

On the whole a very good read.


Eloquent JavaScript by Marijn Haverbeke
My rating: 3 of 5 stars

A good introductory book for beginners in JavaScript. Any beginner can go through the book and start writing really good JavaScript.

The author introduces modules, regex, the DOM and the HTTP protocol, and also gives directions for writing two different fun programs with JavaScript.

Some of the chapters could be seen as outdated given the changes that have come into JavaScript, but nevertheless still a relevant book.

After this has been read one can read “JavaScript – The Good Parts” by Douglas Crockford.


Release It!: Design and Deploy Production-Ready Software (Pragmatic Programmers) by Michael T. Nygard
My rating: 4 of 5 stars

The author asserts that the software of today is built to pass QA tests and not for the rigours of the Production environment. The author provides tips for designing systems which will withstand the assaults they will have to face in Production environments.
The author states that the decisions made upfront are the ones that impact the system the most and are the most difficult to reverse or change; ironically, they are taken when knowledge about the required system is minimal.
The author wryly notes that decrees such as “Use EJB container-managed persistence!”, “All UIs shall be constructed with JSF!” and “All that is, all that was and all that shall ever be lives in Oracle!” are handed down by architects in ivory towers.

In the stability section the author speaks about how to create and maintain stable systems. The first example the author gives is of an airline company which had the following code:

. . .
public class FlightSearch implements SessionBean {
    private MonitoredDataSource connectionPool;
    public List lookupByCity(. . .) throws SQLException, RemoteException {
        Connection conn = null;
        Statement stmt = null;
        try {
            conn = connectionPool.getConnection();
            stmt = conn.createStatement();
            // Do the lookup logic
            // return a list of results
        } finally {
            if (stmt != null) {
                stmt.close(); // if this throws, the conn.close() below is skipped
            }
            if (conn != null) {
                conn.close();
            }
        }
    }
}

This looks and feels good. But if stmt.close() ever throws an exception, conn.close() will never be called, so connections leak from the connection pool until every connection in the pool is used up.
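
One way to avoid this kind of leak in newer Java is try-with-resources, which closes every declared resource even when an earlier close throws. The following is only a minimal sketch, not the fix given in the book: PooledConnectionStub and StatementStub are hypothetical stand-ins for java.sql.Connection and java.sql.Statement (both of which are AutoCloseable in the real JDBC API), used so that the example runs without a database.

```java
// Hypothetical stand-in for a pooled java.sql.Connection.
class PooledConnectionStub implements AutoCloseable {
    boolean returnedToPool = false;

    StatementStub createStatement() {
        return new StatementStub();
    }

    @Override
    public void close() {
        // A real pooled connection would be handed back to the pool here.
        returnedToPool = true;
    }
}

// Hypothetical stand-in for java.sql.Statement whose close() always fails.
class StatementStub implements AutoCloseable {
    @Override
    public void close() throws Exception {
        throw new Exception("statement close failed");
    }
}

class SafeCleanupSketch {
    // Returns true if the connection was released even though closing
    // the statement threw an exception.
    static boolean lookupReleasesConnection() {
        PooledConnectionStub observed = new PooledConnectionStub();
        try (PooledConnectionStub conn = observed;
             StatementStub stmt = conn.createStatement()) {
            // Do the lookup logic here.
        } catch (Exception e) {
            // The close failure still surfaces, but only after conn.close() has run.
        }
        return observed.returnedToPool;
    }

    public static void main(String[] args) {
        System.out.println("connection released: " + lookupReleasesConnection());
    }
}
```

Resources are closed in the reverse order of declaration, so the statement is closed first and the connection is still closed afterwards.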

The author suggests that one should be prepared for as many points of breakage as possible. Tight coupling between systems leads to cascading failures. To avoid this there should be loose coupling between systems. As a corollary, calls across systems should be asynchronous. This is not always possible, and where it is possible it complicates communication. So one needs to make a considered call on where to have asynchronous processing and where not to.

In chapter 4 the author discusses anti-patterns that lead to failures. The first anti-pattern is that all points of integration are fragile and can lead to failures. It is highlighted that most connections are based on TCP/IP. In TCP/IP the first step is a three-way handshake to set up the connection between the two systems that need to communicate. The requestor first sends a SYN packet. The listener acknowledges it with a SYN-ACK packet, and finally the requestor sends an ACK packet to complete the three-way handshake and establish the connection. If there is no listener then the failure is quick, as the OS responds with a RESET packet telling the requestor that nothing is listening. This is a manageable situation. But if the listener is slow then the request will languish in the listen queue until it times out. The typical timeout is in minutes, which means the requestor can wait a long time before realising there is a problem.
The classic case of a firewall killing the TCP connection between the application server and the database server because the connection had been idle for a long time is quoted as an example.
The next example is about how difficult it is to time out HTTP connections in Java.
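
To make the timeout point concrete, here is a minimal sketch (my own illustration, not code from the book) using a plain java.net.Socket rather than HTTP so that it runs without any external server. The "silent" listener is a local ServerSocket that never sends anything; the class and method names are hypothetical. java.net.URLConnection offers the analogous setConnectTimeout and setReadTimeout methods for HTTP.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class SocketTimeoutSketch {
    // Connects to a local listener that never sends a byte and reports
    // whether the read gave up after readTimeoutMillis instead of hanging.
    static boolean readTimesOut(int connectTimeoutMillis, int readTimeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // Bound the time spent establishing the connection.
            client.connect(
                new InetSocketAddress(loopback, silentServer.getLocalPort()),
                connectTimeoutMillis);
            // Bound the time spent waiting for each read.
            client.setSoTimeout(readTimeoutMillis);
            client.getInputStream().read(); // blocks until data, EOF, or timeout
            return false;
        } catch (SocketTimeoutException e) {
            return true; // the caller got control back instead of waiting forever
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(1000, 200));
    }
}
```

Without the setSoTimeout call the read in this sketch would block indefinitely, which is exactly the situation the author warns about.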
It is stressed again and again that it is better to be cynical than optimistic when developing software. Be prepared for the worst.
It is suggested that two key mechanisms for preventing problems from cascading from one layer to another are circuit breakers (ceasing to retry a transaction after a particular number of failed attempts, and/or tracking the status of the underlying layer and not invoking it while there is a problem) and timeouts (giving up after waiting a reasonable amount of time for the underlying layer to respond).
The storing of large datasets in the session is highlighted as one of the more frequent ways of running out of memory. It is suggested that either the session be kept light or a SoftReference be used for storing large datasets to prevent out of memory errors.
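
A minimal sketch of the SoftReference idea, assuming the large dataset can be reloaded on demand. The SoftlyHeld wrapper and its loader are hypothetical names of mine; only java.lang.ref.SoftReference is from the JDK. The garbage collector is allowed to clear a soft reference when memory runs low, so the wrapper must always be ready to reload.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

// Holds a reloadable value through a SoftReference so that the garbage
// collector may reclaim it under memory pressure.
class SoftlyHeld<T> {
    private final Supplier<T> loader;
    private SoftReference<T> ref = new SoftReference<>(null);
    private int loads = 0;

    SoftlyHeld(Supplier<T> loader) {
        this.loader = loader;
    }

    synchronized T get() {
        T value = ref.get();
        if (value == null) {
            // Either never loaded, or cleared by the collector: load again.
            value = loader.get();
            loads++;
            ref = new SoftReference<>(value);
        }
        return value;
    }

    // Test hook that mimics the collector clearing the reference.
    synchronized void simulateCollection() {
        ref.clear();
    }

    synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        SoftlyHeld<int[]> results = new SoftlyHeld<>(() -> new int[1_000_000]);
        results.get();
        results.simulateCollection();
        results.get();
        System.out.println("loads: " + results.loadCount());
    }
}
```

The session would then hold the small wrapper rather than the dataset itself; the cost is a reload whenever the collector has reclaimed the data.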
The author rightly points out that the usage of the synchronized keyword can be dangerous in a highly concurrent environment.
The advice is also to test third party libraries for breakability.
The author coins a new term, “Attack of Self Denial”, where an event is published which leads to a flood of requests to the specified application. E.g. the news of a deep discount on a product for a retailer could be a cause for an “Attack of Self Denial”. One needs to be prepared beforehand to handle such situations better.
A very good suggestion is that if it is not possible to build a shared nothing architecture, then limit the number of systems sharing the resources. E.g. instead of sharing the sessions with all the application servers, share them between two application servers so that the replication factor is limited.
One key point that the author brings out is that most systems “treat the database with far too much trust” and this is the major cause of problems in most systems. The author illustrates this with an example of how making an unbounded query resulted in continuous crashes at a retailer. The author suggests that one always limit the results fetched from the database as a precaution.
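
The idea of never trusting a result set to be small can be sketched as below. This is an illustration under my own assumptions: the rows come from a plain Iterator so that the example runs without a database, and fetchAtMost is a hypothetical helper. With real JDBC the analogous guard is java.sql.Statement.setMaxRows, ideally combined with a limit in the query itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class BoundedFetchSketch {
    // Reads at most maxRows items from the source, however many it offers.
    static <T> List<T> fetchAtMost(Iterator<T> rows, int maxRows) {
        List<T> result = new ArrayList<>();
        while (result.size() < maxRows && rows.hasNext()) {
            result.add(rows.next());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> unbounded = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(fetchAtMost(unbounded.iterator(), 3));
    }
}
```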

The Stability Patterns
In this the author lists the patterns that will help the system be more stable.
1. Timeout: Use timeouts whenever interacting with a third party, especially when this involves some form of network, even if it is within a LAN. It is a fail fast pattern to be used along with a circuit breaker, where if a few requests time out then the resource is marked as down till it is found to be good again. The retry to check if the resource is good can be done at a regular interval suitable for the resource, i.e. delay the retry.
2. Circuit Breaker: Akin to the electrical circuit breaker, a software circuit breaker prevents the entire software from collapsing under stress by stopping requests to the faulty interface. The users may see errors if this happens to be a crucial interface, but this is better than the user not being able to use the whole system. Typically if an interface times out or fails frequently then the circuit breaker can mark this interface as broken for some time. After a suitable amount of time it can retry the interface and, if it is found functional, close the circuit once again, enabling the execution of the specific interface. All opening and closing of the circuit breaker should be logged and made visible to the operations team so that they are aware of the change in the status.
3. Bulkheads: Bulkheads are compartments in a ship which prevent the ship from sinking if there is damage to the hull. Each bulkhead stops the water from entering beyond it. Similarly, the use of multiple servers to deploy applications is one form of bulkhead. Compartmentalizing the application so that a problem in one compartment does not impact the others creates bulkheads in the application. One example quoted is that of the airlines, where the ticketing system, flight status system, flight search system and checkin system could all be deployed separately so that one does not interfere with the other.
Another example is if there are two systems which require the same service and both the systems are critical; it then makes sense to have a separate setup of the common service for each of the two systems. A problem accessing the common service from one system will not impact the service access of the other system.
Grouping threads into pools for specific purposes within a single process will ensure that a problem in one thread pool does not prevent the process from servicing other types of requests.
The negative side of bulkheads is that it can make optimization of resource usage difficult. One would potentially have to provide more capacity than actually required.
4. Steady State: Maintaining a steady state of the systems is very important. Any kind of fiddling with the system for any reason can lead to instability. At the same time, to maintain steady state some cleaning up is required. Log files will be generated by the applications and it is important to have a process that will keep removing the log files at a rate equal to or greater than the rate of generation. Similarly, archival of records in a database is important to ensure that the queries on the database continue to run consistently.
It is important to ensure that one has a finite, controlled number of entries in the in memory cache. Use an LRU or LFU mechanism to keep clearing the cache if it is expected to keep growing beyond known values.
5. Fail Fast: Quickly failing a request is very important to the health of the transaction. One should have the statuses of all the external systems upfront, before beginning to process a transaction, and if any of the external systems is in a state which means that the transaction will fail, then it is better to fail the transaction immediately. This will ensure that no compute power is wasted in processing doomed transactions.
6. Handshaking: It is important to have handshaking between any two systems so that the server process has the ability to state that it has its hands full and cannot respond and the client does not waste time trying to make a request which is going to take the server a long time to respond. This helps in failing fast.
7. Test Harness: A test harness should be able to emulate bizarre problems, like accepting a connection but not sending any response, resetting the connection without ever accepting it, not responding for a very long time, sending out large amounts of data as a response, and so on. Testing the system against such a test harness will help show how it will behave under unexpected conditions.
8. Decoupling Middleware: A middleware typically helps shield the requestor from the nitty-gritties of the server and also from the failures of the server. It helps decouple two systems while integrating them.
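
Of the patterns above, the circuit breaker is the one that benefits most from seeing some code. What follows is a minimal sketch under my own assumptions and not the implementation from the book: CircuitBreakerSketch, its thresholds and the explicit nowMillis parameter (passed in so that the behaviour can be exercised without sleeping) are all hypothetical. A production version would also need finer-grained locking and real logging instead of System.out.

```java
import java.util.function.Supplier;

class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long retryAfterMillis;
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    CircuitBreakerSketch(int failureThreshold, long retryAfterMillis) {
        this.failureThreshold = failureThreshold;
        this.retryAfterMillis = retryAfterMillis;
    }

    synchronized State state() {
        return state;
    }

    // Runs protectedCall unless the circuit is open, in which case it fails fast.
    synchronized <T> T call(Supplier<T> protectedCall, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < retryAfterMillis) {
                throw new IllegalStateException("circuit open: failing fast");
            }
            transition(State.HALF_OPEN); // let one trial call through
        }
        try {
            T result = protectedCall.get();
            consecutiveFailures = 0;
            transition(State.CLOSED);
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                openedAt = nowMillis;
                transition(State.OPEN);
            }
            throw e;
        }
    }

    // Every state change is reported so that operations can see it.
    private void transition(State next) {
        if (next != state) {
            System.out.println("circuit breaker: " + state + " -> " + next);
            state = next;
        }
    }

    public static void main(String[] args) {
        CircuitBreakerSketch breaker = new CircuitBreakerSketch(2, 1000L);
        Supplier<String> flaky = () -> {
            throw new RuntimeException("integration point down");
        };
        for (int i = 0; i < 3; i++) {
            try {
                breaker.call(flaky, System.currentTimeMillis());
            } catch (RuntimeException e) {
                System.out.println("call " + i + " failed: " + e.getMessage());
            }
        }
    }
}
```

Once the threshold is reached, further calls are rejected immediately without touching the faulty interface, which is the fail fast behaviour described in items 1, 2 and 5.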

The author very rightly concludes that “Sadly, the absence of a problem is not usually noted. You might be salvaging a badly botched implementation in which case you now have an opportunity to look like a hero. On the other hand, if you’ve done a great job of designing a stable system from the beginning, it’s unlikely that anyone will notice your system’s lack of downtime. That’s just the way it is. Deliver an unbreakable system, and users will surely go on to complain about something else. That’s just what users do. In fact, with a system that never goes down, the users will most likely complain that it’s slow. Next, you’ll look at capacity and performance and how to get the most out of your resources.”

In a case study it is illustrated how the usage of sessions killed the application. The bots and the regular users increased the number of sessions far beyond what the system could handle and the site crashed. This was later resolved by supporting sessions through URL rewriting, so that no new sessions are created by the bots, and also by creating a throttling mechanism to control the total number of sessions in the system. The key learning is that the performance test only tested the happy paths and never situations like bots hitting the site.
When planning for capacity it is important to ensure that the software written is optimal and has minimum wastage. If this is not done it would lead to increasing costs of resources required to run the application. As an example if an HTML page has 1K of junk data, this will translate into 1GB of extra bandwidth usage if there are a million requests to this page. The cost of resources multiplies as the usage of the application increases.

Some good patterns to follow are:
1. Pool resources, size them properly and monitor them.
2. Use caching, limit the maximum memory that can be used by the cached objects and monitor the hit ratio.
3. Precompute whatever is possible and recompute only when absolutely necessary.
4. Tune Garbage Collection
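
The caching advice in point 2 can be sketched with java.util.LinkedHashMap, whose access-order mode and removeEldestEntry hook give a simple LRU policy. This is a minimal sketch under my own assumptions: BoundedLruCache is a hypothetical name, and it caps the number of entries rather than the memory used, which is only a rough proxy for the memory limit the author recommends.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedLruCache<K, V> {
    private final Map<K, V> entries;
    private long hits = 0;
    private long misses = 0;

    BoundedLruCache(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict once the cap is exceeded
            }
        };
    }

    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    synchronized int size() {
        return entries.size();
    }

    // Fraction of lookups served from the cache; worth monitoring over time.
    synchronized double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        BoundedLruCache<String, Integer> cache = new BoundedLruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");
        cache.put("c", 3); // "b" is now the least recently used entry
        System.out.println("size=" + cache.size() + " hitRatio=" + cache.hitRatio());
    }
}
```

A persistently low hit ratio suggests the cache is costing memory without saving much work.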

Some Network points
1. Servers in production tend to be multi-homed and it is important to bind the applications to the right home to prevent security issues.
2. Given the above scenario it becomes important to configure the network routing correctly.
3. Use Virtual IPs where native clustering of applications is not possible. Applications need to be written keeping in mind that this will be the case in production systems.

Some Security aspects:
1. Follow the principle of “least privilege”. This states that every action should be done with the least privilege required to execute the action. Run each application with its own user, so that if one application is compromised it is only that application and none of the others.
2. Ensure that the passwords used to access other services are secured properly. Ensure that the memory dumps of the processes will not reveal the passwords. Keep the passwords away from the installation directory.

Some Availability Aspects
1. The cost of a system grows exponentially with the required availability. Availability should be defined realistically, not idealistically.
2. The SLAs should be well defined and measurable. SLAs should be defined by features and dependent on 3rd party SLAs available. The location from where the application is accessed also matters.
3. Load Balancing and Reverse Proxies should be used to balance the load across the multiple servers and across the various tiers.
4. Clustering will be required in scenarios where the servers need to communicate with each other to exchange some data.

To ensure reliability of the system the topology of the QA environment should be same as that of the Production although the capacity may be far lower.
Configuration of the application and environment related configuration should be separated out.
Application should be able to announce if it has not started properly.
Provide command line options to configure the systems. GUI can be used when sufficient time is at hand and automation is not required.

Every system needs to be transparent, i.e. it needs to show what it is using and what it is doing. Without this information it is very difficult to manage the system. While it is necessary to know the status of the individual parts, it is important to also know the status across all the parts of the system. This helps in analysing any problem that is manifesting in the system.
It is not necessary to log the stack trace of a business exception, like a validation error which states that a mandatory parameter was not entered. It is vital to log the stack trace when a non-business exception occurs.
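
A minimal sketch of this logging rule using java.util.logging. BusinessException and ExceptionLoggingSketch are hypothetical names of mine; a real application would have its own exception hierarchy and logging framework.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical marker type for expected, user-facing failures.
class BusinessException extends RuntimeException {
    BusinessException(String message) {
        super(message);
    }
}

class ExceptionLoggingSketch {
    private static final Logger LOG = Logger.getLogger("app");

    static void logFailure(Logger log, Exception e) {
        if (e instanceof BusinessException) {
            // Expected failure: the message is enough, no stack trace.
            log.log(Level.INFO, "Request rejected: {0}", e.getMessage());
        } else {
            // Unexpected failure: pass the exception so the trace is recorded.
            log.log(Level.SEVERE, "Unexpected failure", e);
        }
    }

    public static void main(String[] args) {
        logFailure(LOG, new BusinessException("mandatory parameter missing"));
        logFailure(LOG, new IllegalStateException("connection pool exhausted"));
    }
}
```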
It is important to have a network separate from the production data network for monitoring traffic.
A good monitoring system provides visibility into business outcomes and not just technical parameters.

A very good comparison between crystals and tight coupling in software design.
“A cluster of objects that can exist together only in a tight collaboration resembles a crystal in a metal. The objects stay together in a tightly bound relationship, just as the atoms in a crystal are tightly bound. In metal, small crystals mean greater malleability. More malleable metals recover from stress better. Large crystals encourage crack formation. In software, large “crystals” make it harder to change the software. When objects in one grain participate in multiple collaboration patterns, they bridge two crystals, forming a larger grained crystal—further reducing the malleability of the software.
There is no limit to how far this region of tightly bound crystals can spread. In the extreme case, the crystal grows until it is the boundary of the application. When that happens, every object suits exactly one purpose to which it is supremely adapted. It fits perfectly into place and ultimately relates to every other object. These crystal palaces might even be beautiful in a baroque sort of way. They admit no improvement, in part, because no incremental change is possible and, in part, because nothing can be moved without moving every other object. These tend to be dead structures. Developers tiptoe through crystal palaces, speaking in hushed tones and trying not to touch anything.”


Ship It! by Jared Richardson
My rating: 3 of 5 stars

A collection of lessons learned by various developers in the trenches. The book starts off with a quote from Aristotle: “We are what we repeatedly do. Excellence, then, is not an act, but a habit.” The book strengthens this argument by stating “Extraordinary products are merely side effects of good habits.” So the first tip of the book is “Choose your habits”. Do not follow something just because it is popular or well known or is practised by others around you.

The author says that there are three aspects that one needs to pay attention to:

  1. Techniques: How the project is developed, e.g. Daily Meetings, Code Reviews, Maintaining a To Do List etc.
  2. Infrastructure: Tools used to develop the project, e.g. Version Control, Build Scripts, Running Tests, Continuous Build etc.
  3. Process: The process followed in developing the applications: Propose Objects, Propose Interfaces, Connect Interfaces, Add Functions, Refactor, Refine, Repeat.

Tools and Infrastructure

The author highlights the need for a proper tool for Source Control Management. The author also issues a warning that the right tool should be chosen. A tool should not be chosen because it is backed by a big ticket organization. Vendors would push for “supertools”, but one needs to exercise discretion when choosing between the tools.

Good Development Practices

  1. Develop in a Sandbox, i.e. changes of one developer should not impact the other until the changes are ready.
  2. Each developer should have a copy of everything they need for development, this includes web server, application server, database server, most importantly source code and anything else.
  3. Once all the changes by the developer are finished they should check them in to the Source Control so that the others can pick them up, integrate them with their own code and make any changes needed for the integration.
  4. The checked in changes should be fine grained.

Tools Required for ensuring Good Development Practices

  1. SCM
  2. Build Scripts
  3. Track Issues

What to keep in SCM?

  1. While it can be debated whether runtimes like Java need to be kept in the SCM, it is important that all the third party libraries (jars, dlls) and configuration templates be available in the SCM. Note that it is the templates that are stored, because the actual configuration contents can change from environment to environment.
  2. Anything that is generated as part of the build process (jars, dlls, exes, war) should not be stored in the SCM.

What a Good SCM should offer

  1. Ensure that the usage of SCM is painless to the developers. The interactions with the SCM should be fast enough to ensure that the developers do not hesitate to use it.
  2. The minimal set of activities that the SCM should support is:
  • Check out the entire project.
  • Look at the differences between your edits and the latest code in the SCM.
  • View the history for a specific file—who changed this file and when did they do it?
  • Update your local copy with other developers’ changes.
  • Push (or commit) your changes to the SCM.
  • Remove (or back out) the last changes you pushed into the SCM.
  • Retrieve a copy of the code tree as it existed last Tuesday.

Script the Build

Once the required artefacts are checked out from the SCM it should be possible for any developer to run a script and have a working system (sandbox) of her own to work on. For this one needs a Build Script. This should be a completely automated build requiring no manual intervention or steps. This build script should be outside of the IDE so that it can be used irrespective of the IDE being used. The IDE could use the same script for local builds.
Once the one step/command build script is ready, automate the build. Ideally, every time code is checked in the following should be done.

  1. Checkout the latest code and build
  2. Run a set of smoke tests to ensure that the basic functionality is not broken.
  3. Configure the build system to notify the stakeholders of the newly checked-in code, the build and the test results.

This is Continuous Integration.

Tracking the Issues

It is important to track the issues that are reported for the application so that they can be followed up and fixed.
At a bare minimum one needs to know the following about an issue:

  • What version of the product has the issue?
  • Which customer encountered the issue?
  • How severe is it?
  • Was the problem reproduced in-house (and by whom, so they can help you if you’re unable to reproduce the problem)?
  • What was the customer’s environment (operating system, database, etc.)?
  • In what version of your product did the issue first occur?
  • In what version of your product was it fixed?
  • Who fixed it?
  • Who verified the fix?

Some more that will help in the long term

  • During what phase of the project was the bug introduced?
  • The root cause of the bug
  • The sources that were changed to fix the problem. If the checkin policy demands that the checkin comment indicate the reason for the fixes, then it should be possible to correlate the checkin with the issue that they fixed or requirement that they addressed.
  • How long did it take to fix the error? (Time to analyze, Fix, Test)

Some warning signs that things are not OK with the issue system

  • The system isn’t being used.
  • Too many small issues have been logged in the system.
  • Issue-related metrics are used to evaluate team member performance.

Tracking Features

Just as it is important to track the issues, it is important to track the features that have been planned for the application.
The system used to track issues may also be used to track the features as long as it provides the ability to identify them separately.

Test Harness

Have a good Test Harness which can be used to run automated tests on the system.

  1. Use a standard Test Harness which can generate all the required reports.
  2. Ensure that every team member uses the same tool.
  3. Ensure that the tool can be run from the command line. This will enable driving it from an external script or a tool.
  4. Ensure that the tool is flexible to test multiple types of applications and not specific to a particular type.

Different types of testing need to be planned for

  1. Unit Testing – Testing small pieces of code. This forces the developers to break up the code into smaller pieces, which makes the code easier to maintain and understand, reduces copy paste, and ensures that the overall functionality is impacted minimally, if at all, by refactoring.
  2. Functional Testing – Testing all the functions of the application.
  3. Performance Testing – Testing the application to ensure that the application is performing within acceptable limits and meets the SLAs.
  4. Load Testing – This is similar to the Performance Testing. The goal of this is to ensure that the application does not collapse under load.
  5. Smoke Testing – This is a light-weight testing which will test the key functionality of the application. This should be included as part of Continuous Integration so that any breakage in key functionality comes to light very quickly.
  6. Integration Testing – This ensures that the integration of the modules within the application and the integration of the application with the external systems is functioning correctly.
  7. Mock Client Testing – This mocks the client requests and ensures that the client gets the right response within the expected time period.

Pragmatic Project Techniques

Some of the good practices to follow when working in projects are as follows:

  1. Maintaining a list of activities to do. This should be visible and accessible to everybody on the project. Even the client should have visibility of the list so that they can check the speed and prioritize the items in the list. Each item should have a target time. The list should reflect the current status and should not be out of date.
  2. Having Tech Leads in the project is important. The Tech Lead should guide the team in the selection and utilization of the technology. The Tech Lead should be responsible for ensuring that the deadlines are realistic. The Tech Lead should act as the bridge between the developers and the management. It is an important role, to be played by a person with the right temperament.
  3. Coordinating and communicating on a daily basis is very important. Meetings need to be set up on a daily basis. These meetings should be short and to the point, with everybody sharing details of what they are doing and what they plan to do. The team should highlight any problems it is facing. The solutions to these problems should not be part of this meeting, but should be worked out separately.
  4. Code review is a crucial part of the project and every piece of code should be reviewed. Some good practices of code review are:
    1. Review only a small amount of code at any time.
    2. A piece of code should not be reviewed by more than two people.
    3. Code should be reviewed frequently, possibly several times a day.
    4. Consider pair programming as a continuous code review process.

Tracer Bullet Development

Just as a tracer bullet can be fired at night to track the path before aiming the real bullet, it should be possible to predict the path of the project using the chosen process.


Have a process to follow.
The process followed should not claim exclusive credit for the success of projects. If it does, be suspicious of it.
Follow a process that embraces periodic reevaluation and inclusion of whatever practices work well for the projects.


  • Define the layers that will exist in the application.
  • Define the interfaces between the layers.
  • Let each layer be developed by a separate team, relying on the interface promised by the adjacent layers.
  • Keep it flexible so that the interface can be changed as it is hard to get the interfaces perfect the first time around.
  • First create the large classes like the Database Connection Manager, Log Manager, etc., required for each layer, then write the fine-grained classes.
  • Collaboration between the teams developing the different layers is key to the success. These collaborations will Trace the Path that the project will take.
  • Do not let an architect sitting in an ivory tower dictate the architecture.
  • It is dangerous to have one person driving the whole project. If this person leaves, the project will come to a standstill.
  • Create stubs, or mock the interfaces of the adjacent layers so that it becomes easy to test.
  • Code the tough and key pieces first and test them before addressing the simpler ones. It may take time to show progress, but when the progress happens it will be very quick.
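
The points about defining interfaces between layers and stubbing the adjacent layer could look something like the sketch below. All names (`CustomerStore`, `StubCustomerStore`, `GreetingService`) are hypothetical and chosen only for illustration; the book does not give this example.

```python
from typing import Protocol


class CustomerStore(Protocol):
    """Interface promised by the (hypothetical) data layer."""

    def name_for(self, customer_id: int) -> str: ...


class StubCustomerStore:
    """Stub used until the real data layer exists; returns canned data."""

    def name_for(self, customer_id: int) -> str:
        return {1: "Asha"}.get(customer_id, "unknown")


class GreetingService:
    """Business layer; depends only on the interface, not on a database."""

    def __init__(self, store: CustomerStore) -> None:
        self.store = store

    def greet(self, customer_id: int) -> str:
        return f"Hello, {self.store.name_for(customer_id)}!"
```

Because `GreetingService` only relies on the interface, the team writing it can test against the stub while another team builds the real store, and the interface can still be adjusted when it turns out to be imperfect.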

Common Problems and How to Fix Them

What to do when legacy code is inherited?

  1. Build it – Learn to build it and script the build.
  2. Automate it – Automate the build.
  3. Test it – Test to understand what the system does and write automated test cases.

Don’t change legacy code unless you can test it.
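
One way to "test to understand" inherited code is to record what it does today and assert that this does not change. The sketch below assumes a hypothetical `legacy_shipping_cost` function; the recorded values come from running the code as it stands, not from any specification.

```python
def legacy_shipping_cost(weight_kg):
    """Hypothetical inherited function whose rules nobody documented."""
    if weight_kg <= 0:
        return 0
    cost = 50 + 10 * weight_kg
    if weight_kg > 20:
        cost -= cost // 10  # undocumented bulk discount
    return cost


# Outputs recorded by running the current code, not taken from a specification.
RECORDED_BEHAVIOUR = {0: 0, 1: 60, 20: 250, 21: 234}


def behaviour_unchanged():
    """Return True if the function still matches the recorded outputs."""
    return all(
        legacy_shipping_cost(weight) == expected
        for weight, expected in RECORDED_BEHAVIOUR.items()
    )
```

With a check like this in place, a later refactoring that accidentally alters the discount rule is caught immediately.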

Some other tips from the chapter

  1. If code is found unsuitable for automated testing, refactor it slowly so that it becomes amenable to automated testing.
  2. If a project keeps breaking repeatedly, automated test cases that emulate the user actions will help reduce the incidents.
  3. Ensure that the automated tests are updated whenever the code or logic changes; otherwise they become useless.
  4. It is important to have Continuous Integration so that the automated tests can be run regularly.
  5. Early check-ins (in fact daily, or more than once a day) and quick updates by the developers are important so that integration problems are detected as early as possible.
  6. It is important to communicate with the customers and get regular feedback.
  7. Best way to show the customer the progress of the project is to show them a working demo of the application.
  8. Introduce a process change when the team is not under pressure. Point out the benefit the stakeholders will have with the new process. Show them the benefit of the process/practice rather than talk and preach about it.

A wonderful Dilbert quote from the book
“I love deadlines. I especially love the swooshing sound they make as they go flying by.” — Scott Adams

Some Excerpts from the book

View all my reviews

When I started programming,
I just talked to the customer,
And coded and everyone was rocking,
Nothing was only bluster.

Then they said we need to do waterfall
Somebody talked to the customer,
Someone else coded leading to downfall,
And the project manager went a fluster.

Now they are saying use Agile,
Everybody talking to the customer,
It is making everything fragile,
And the project manager has lost his luster.

These are excerpts from the book: the summaries from the end of each chapter.

Chapter 1 – A Method in Madness

  • Make sure to do the following:
    • Work out why the software is behaving unexpectedly.
    • Fix the problem.
    • Avoid breaking anything else.
    • Maintain or improve overall quality.
    • Ensure that the same problem does not occur elsewhere and cannot occur again.
  • Leverage your software’s ability to show you what’s happening.
  • Work on only one problem at a time.
  • Make sure that you know exactly what you’re looking for:
    • What is happening?
    • What should be happening?
  • Check simple things first.

Chapter 2 – Reproduce

  • Find a reproduction before doing anything else.
  • Ensure that you’re running the same version as the bug was reported against.
  • Duplicate the environment that the bug was reported in.
  • Determine the input necessary to reproduce the bug by:
    • Inference
    • Recording appropriate inputs via logging
  • Ensure that your reproduction is both reliable and convenient through iterative refinement:
    • Reduce the number of steps, amount of data, or time required.
    • Remove nondeterminism.
    • Automate.
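
As a small illustration of removing nondeterminism and automating the reproduction, the sketch below assumes the bug sits in a hypothetical function `shuffle_and_pick` that depends on random ordering. Passing in a seeded generator makes every run follow the same path.

```python
import random


def shuffle_and_pick(items, rng):
    """Hypothetical code under investigation; its result depends on random order."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[0]


def reproduce(seed):
    """Run the scenario with a fixed seed so every run takes the same path."""
    rng = random.Random(seed)
    return shuffle_and_pick(["a", "b", "c", "d"], rng)
```

If the application logs the seed it used, the failing run can be replayed by calling `reproduce` with that seed as often as needed.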

Chapter 3 – Diagnose

  • Construct hypotheses, and test them with experiments.
    • Make sure you understand what your experiments are going to tell you.
    • Make only one change at a time.
    • Keep a record of what you’ve tried.
    • Ignore nothing.
  • When things aren’t going well:
    • If the changes you’re making don’t seem to be having an effect, you’re not changing what you think you are.
    • Validate your assumptions.
    • Are you facing multiple interacting causes or a changing underlying system?
  • Validate your diagnosis.

Chapter 4 – Fix

  • Bug fixing involves three goals:
    • Fix the problem.
    • Avoid introducing regressions.
    • Maintain or improve overall quality (readability, architecture, test coverage, and so on) of the code.
  • Start from a clean source tree.
  • Ensure that the tests pass before making any changes.
  • Work out how you’re going to test your fix before making changes.
  • Fix the cause, not the symptoms.
  • Refactor, but never at the same time as modifying functionality.
  • One logical change, one check-in.

Chapter 5 – Reflect

The chapter quotes “The six stages of debugging”, which reads as follows:

  1. That can’t happen.
  2. That doesn’t happen on my machine.
  3. That shouldn’t happen.
  4. Why is that happening?
  5. Oh, I see.
  6. How did that ever work?

  • Take the time to perform a root cause analysis:
    • At what point in your process did the error arise?
    • What went wrong?
  • Ensure that the same problem can’t happen again:
    • Automatically check for problems.
    • Refactor code to remove the opportunity for incorrect usage.
    • Talk to your colleagues, and modify your process if appropriate.
  • Close the loop with other stakeholders.

Chapter 6 – Discovering that you have a problem

  • Make the most of your bug-tracking system:
    • Pick one at an appropriate level of complexity for your particular situation.
    • Make it directly available to your users.
    • Automate environment and configuration reporting to ensure accurate reports.
  • Aim for bug reports that are the following:
    • Specific
    • Unambiguous
    • Detailed
    • Minimal
    • Unique
  • When working with users, do the following:
    • Streamline the bug-reporting process as much as possible.
    • Communication is key—be patient and imagine yourself in the user’s shoes.
  • Foster a good relationship with customer support and QA so you can leverage their support during bug fixing.
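
Automating environment and configuration reporting, mentioned above, might be sketched as below using Python's standard `platform` and `sys` modules. The `app_version` parameter and the choice of fields are assumptions made for illustration.

```python
import platform
import sys


def environment_report(app_version):
    """Collect details a bug report needs so the user does not type them by hand."""
    return {
        "app_version": app_version,  # supplied by the application (hypothetical)
        "python": platform.python_version(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "executable": sys.executable,
    }
```

Attaching such a report to every bug submission helps with the later step of reproducing the bug in the same version and environment.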

Chapter 7 – Pragmatic Zero Tolerance

  • Detect bugs as early as possible, and fix them as soon as they come to light.
  • Act as though bug-free software was an attainable goal, but temper perfectionism with pragmatism.
  • If you find yourself faced with a poor quality codebase, do the following:
    • Recognize there is no silver bullet.
    • Make sure that the basics are in place first.
    • Separate clean code from unclean, and keep it clean.
    • Use bug triage to keep on top of your bug database.
    • Incrementally clean up bad code by adding tests and refactoring.

Chapter 8 – Special Cases

  • When patching an existing release, concentrate on reducing risk.
  • Keep on the lookout for compatibility implications when fixing bugs.
  • Ensure that you have completely closed any timing windows, not just decreased their size.
  • When faced with a heisenbug, minimize the side effects of collecting information.
  • Fixing performance bugs always starts with an accurate profile.
  • Even the most restricted communication channel can be enough to extract the information you need.
  • Suspect your own, ahead of third-party, code.

Chapter 9 – The Ideal Debugging Environment

  • Automate your tests, ensuring that they do the following:
    • Unambiguously pass or fail
    • Are self-contained
    • Can be executed with a single click
    • Provide comprehensive coverage
  • Use branches in source control sparingly.
  • Automate your build process:
    • Build and test the software every time it changes.
    • Integrate static analysis into every build.

Chapter 10 – Teach your Software to Debug Itself

  • Use assertions to do the following:
    • Both document and automatically validate your assumptions
    • Ensure that your software, although robust in production, is fragile during debugging
  • Create a debug build that
    • Is compiled with debug-friendly compiler options
    • Allows key subsystems to be replaced by debugging equivalents
    • Builds in control that will prove useful during diagnosis
  • Detect systemic problems, such as resource leaks and exception handling issues, preemptively.
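
The use of assertions to both document and validate assumptions could look like this in Python. The function `average_latency` and its two assumptions are hypothetical examples, not taken from the book.

```python
def average_latency(samples_ms):
    """Return the mean of a list of latency samples in milliseconds."""
    # Assumption documented and checked: callers never pass an empty list.
    assert len(samples_ms) > 0, "samples_ms must not be empty"
    # Assumption documented and checked: latencies are never negative.
    assert all(s >= 0 for s in samples_ms), "latency cannot be negative"
    return sum(samples_ms) / len(samples_ms)
```

Running Python with the `-O` option strips `assert` statements, which is one way to keep the software fragile during debugging while the checks stay out of production runs.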

Chapter 11 – Anti Patterns

  • Keep on top of your bug database to ensure that it accurately reflects your true priorities.
  • The polluter pays—don’t allow anyone to move onto a new task until they’ve completely finished their current one. If bugs come to light in their work, they fix them.
  • Make a single team responsible for a product from its initial concept through deployment and beyond.
  • Firefighting will never fix a quality problem. Take the time to identify and fix the root cause.
  • Avoid “big bang” rewrites.
  • Ensure that your code ownership strategy is clear.
  • Treat anything you don’t understand as a bug.