Sundarrajk's Weblog

Archive for the ‘Software Development Process’ Category

Your Code As a Crime Scene: Use Forensic Techniques to Arrest Defects, Bottlenecks, and Bad Design in Your Programs by Adam Tornhill
My rating: 4 of 5 stars

A very different way of looking at problems in the code. The author suggests various non-traditional means of identifying problems in software development.

The metrics suggested include the number of times a piece of code has changed, code churn (the number of lines added and removed), the number of developers working on a single piece of code, and pieces of code that change together.

The primary requirement for all of this is that version control be used correctly. Every developer should have a personal login ID and should use it to check in at regular intervals.

Many of the techniques provided in the book are available in the form of a tool from https://github.com/adamtornhill/code-…. This is only a starting point; it needs to be coupled with other tools like d3.js (https://d3js.org/) to get a good representation of the state of the code.

A must read for all software developers.

View all my reviews

Release It!: Design and Deploy Production-Ready Software (Pragmatic Programmers) by Michael T. Nygard
My rating: 4 of 5 stars

Introduction
The author asserts that software today is built to pass QA tests rather than to survive the rigours of the production environment. He provides tips for designing systems that will withstand the assaults they will face in production.
The author states that the decisions made upfront are the ones that impact the system the most and are the most difficult to reverse or change, yet ironically they are taken when knowledge about the required system is at its minimum.
The author wryly notes that decrees such as “Use EJB container-managed persistence!”, “All UIs shall be constructed with JSF!”, and “All that is, all that was and all that shall ever be lives in Oracle!” are handed down by architects in ivory towers.

In the stability section the author speaks about how to create and maintain stable systems. The first example he gives is of an airline company which had the following code:

package com.example.cf.flightsearch;
. . .
public class FlightSearch implements SessionBean {
    private MonitoredDataSource connectionPool;
    public List lookupByCity(. . .) throws SQLException, RemoteException {
        Connection conn = null;
        Statement stmt = null;
        try {
            conn = connectionPool.getConnection();
            stmt = conn.createStatement();
            // Do the lookup logic
            // return a list of results
        } finally {
            if (stmt != null) {
                stmt.close();
            }
            if (conn != null) {
                conn.close();
            }
        }
    }
}
This looks and feels good. But if stmt.close() ever throws an exception, conn.close() will never be called, resulting in connections leaking from the connection pool until all the connections in the pool are used up.
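
One way to close this leak (a sketch, not the author's exact fix) is to let the language guarantee the cleanup. With Java 7+ try-with-resources both resources are closed in reverse order, and an exception thrown by stmt.close() no longer prevents conn.close() from running; on older Java the same effect needs nested try/finally blocks. The class and method names below are illustrative only.

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Collections;
import java.util.List;
import javax.sql.DataSource;

public class FlightSearchLookup {
    private final DataSource connectionPool; // stands in for the book's MonitoredDataSource

    public FlightSearchLookup(DataSource connectionPool) {
        this.connectionPool = connectionPool;
    }

    public List<String> lookupByCity(String city) throws SQLException {
        // Both resources are closed automatically, in reverse order, even if one close() throws.
        try (Connection conn = connectionPool.getConnection();
             Statement stmt = conn.createStatement()) {
            // Do the lookup logic and return a list of results.
            return Collections.emptyList(); // placeholder for the real result
        }
    }
}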

The author suggests that one should be prepared for as many points of breakage as possible. Tight coupling between systems leads to cascading failures; to avoid this, systems should be loosely coupled. As a corollary, calls across systems should be asynchronous. This is not always possible, and even where possible it complicates communication, so one needs to make a considered decision about where to use asynchronous processing and where not to.

In chapter 4 the author discusses anti-patterns that lead to failures. The first anti-pattern is that all points of integration are fragile and can lead to failures. It is highlighted that most connections are based on TCP/IP. In TCP/IP the first step is a three-way handshake to set up the connection between the two systems that need to communicate. The requestor sends a SYN packet, the listener acknowledges it with a SYN-ACK packet, and finally the requestor sends an ACK packet to complete the three-way handshake and establish the connection. If there is no listener the failure is quick, as the OS responds with a RESET (RST) packet telling the requestor that its request has no listener. This is a manageable situation. But if the listener is slow, the request will languish in the listen queue until it times out, and the typical timeout is in minutes. This means the requestor can wait a long time before realising there is a problem.
The classic example of a firewall silently killing a long-idle TCP connection between the application server and the database server is quoted.
The next example is about how difficult it is to time out HTTP connections in Java.
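
For what it is worth, JDK 5 and later do expose explicit timeouts on HttpURLConnection. A minimal sketch (assuming a modern JDK, since InputStream.readAllBytes() needs Java 9+) of setting them so a call cannot hang indefinitely:

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class TimedHttpCall {

    public static byte[] fetch(String urlString) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlString).openConnection();
        conn.setConnectTimeout(2_000); // milliseconds allowed to establish the TCP connection
        conn.setReadTimeout(5_000);    // milliseconds allowed to wait for data once connected
        try (InputStream in = conn.getInputStream()) {
            return in.readAllBytes();  // fail fast instead of waiting on a slow listener forever
        } finally {
            conn.disconnect();
        }
    }
}
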
It is stressed again and again that it is better to be cynical than optimistic when developing software. Be prepared for the worst.
Two key mechanisms to avoid problems cascading from one layer to another are suggested: circuit breakers, i.e. stopping retries of a transaction after a particular number of failed attempts and/or tracking the status of the underlying layer and declining to invoke it while it is known to be in trouble; and timeouts, i.e. waiting only a reasonable amount of time for the underlying layer to respond.
Storing large datasets in the session is highlighted as one of the more frequent causes of running out of memory. It is suggested that either the session be kept light or a SoftReference be used to hold large datasets and prevent out-of-memory errors.
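
A minimal sketch of the SoftReference idea (the class and names are illustrative, not from the book): values are held softly, so the garbage collector may reclaim them under memory pressure instead of the JVM running out of memory.

import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** A cache whose values the garbage collector may reclaim under memory pressure. */
public class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    public void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            map.remove(key); // the value was collected (or never cached); drop the stale entry
        }
        return value;
    }
}
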
The author rightly points out that the use of the synchronized keyword can be dangerous in a highly concurrent environment.
The advice is also to test third-party libraries for breakability.
The author coins the term “Attack of Self-Denial” for situations where an announcement leads to a flood of requests to the application, e.g. the news of a deep discount on a product at a retailer. One needs to be prepared beforehand to handle such situations.
A very good suggestion is that if it is not possible to build a shared-nothing architecture, then limit the number of systems sharing a resource, e.g. instead of replicating sessions across all the application servers, share them between pairs of application servers so that the replication factor is limited.
One key point the author makes is that most systems “treat the database with far too much trust”, and this is a major cause of problems. He illustrates this with an example of how an unbounded query resulted in continuous crashes at a retailer, and suggests always limiting the results fetched from the database as a precaution. A sketch of such a bounded query follows.
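
A small sketch of bounding a query at the JDBC level (the table and column names are hypothetical): the WHERE clause narrows the query, and setMaxRows() acts as a safety net so the result set can never grow without bound.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class OrderDao {
    private static final int MAX_ROWS = 500; // hard cap on what we are willing to load

    public List<String> recentOrderIds(Connection conn, String customerId) throws SQLException {
        String sql = "SELECT order_id FROM orders WHERE customer_id = ? ORDER BY created_at DESC";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, customerId);
            ps.setMaxRows(MAX_ROWS); // the driver stops returning rows beyond this limit
            ps.setFetchSize(100);    // stream rows in batches instead of buffering everything
            try (ResultSet rs = ps.executeQuery()) {
                List<String> ids = new ArrayList<>();
                while (rs.next()) {
                    ids.add(rs.getString("order_id"));
                }
                return ids;
            }
        }
    }
}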

The Stability Patterns
———————-
In this section the author lists the patterns that will help make a system more stable.
1. Timeout: Use timeouts whenever interacting with a third party, especially when any form of network is involved, even if it is within a LAN. It is a fail-fast pattern best used along with a circuit breaker: if a few requests time out, the resource is marked as down until it is found to be good again, and the check for recovery is retried at an interval suitable for that resource, i.e. the retry is delayed.
2. Circuit Breaker: Akin to an electrical circuit breaker, a software circuit breaker prevents the entire system from collapsing under stress by stopping requests to a faulty interface. Users may see errors if this happens to be a crucial interface, but that is better than the whole system being unusable. Typically, if an interface times out or fails frequently, the circuit breaker marks it as broken for some time; after a suitable interval it retries the interface and, if it is found functional, closes the circuit once again, enabling calls to that interface. All opening and closing of circuit breakers should be logged and made visible to the operations team so that they are aware of the change in status. (A minimal sketch of such a breaker appears after this list.)
3. Bulkheads: Bulkheads are compartments in a ship which prevent the ship from sinking if there is damage to the hull; each bulkhead stops the water from spreading beyond it. Deploying an application on multiple servers is one form of bulkhead. Compartmentalizing the application so that a problem in one compartment does not impact the others is another way of creating bulkheads. One example quoted is that of an airline, where the ticketing, flight status, flight search and check-in systems could all be deployed separately so that one does not interfere with the others.
Another example: if two critical systems require the same service, it makes sense to have a separate setup of the common service for each of them, so that a problem accessing the common service from one system does not impact the other system’s access to it.
Grouping threads into pools dedicated to specific purposes within a single process ensures that a problem in one thread pool does not prevent the process from servicing other types of requests.
The negative side of bulkheads is that they can make it difficult to optimize resource usage; one may have to provision more capacity than is strictly required.
4. Steady State: Maintaining a steady state is very important; any kind of fiddling with the system can lead to instability. At the same time, some cleaning up is required to maintain a steady state. Log files will be generated by the applications, and it is important to have a process that removes them at a rate equal to or greater than the rate of generation. Similarly, archival of records in the database is important to ensure that queries on the database continue to run consistently.
It is also important to have a finite, controlled number of entries in any in-memory cache. Use an LRU or LFU eviction mechanism to keep clearing the cache if it is expected to grow beyond known bounds.
5. Fail Fast: Quickly failing a request is very important to the health of the system. One should check the status of all the external systems upfront, before beginning to process a transaction; if any external system is in a state that means the transaction will fail, it is better to fail the transaction immediately. This ensures that no compute power is wasted processing doomed transactions.
6. Handshaking: It is important to have handshaking between two systems so that the server can signal that it has its hands full and cannot respond, and the client does not waste time making a request that will take the server a long time to answer. This helps in failing fast.
7. Test Harness: A test harness should be able to emulate bizarre problems, like accepting a connection but never sending a response, resetting the connection without ever accepting it, not responding for a very long time, or sending back huge amounts of data as a response. Testing against such a test harness helps show how the system will behave under unexpected conditions.
8. Decoupling Middleware: Middleware typically shields the requestor from the nitty-gritty of the server and also from the failures of the server. It helps decouple two systems while integrating them.
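
As referenced under pattern 2, here is a minimal circuit-breaker sketch (not the author's implementation; the thresholds, names and cool-off policy are assumptions): it opens after a number of consecutive failures, rejects calls while open, and allows a retry after a cool-off period.

import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Opens after N consecutive failures, rejects calls while open, retries after a cool-off. */
public class CircuitBreaker {
    private final int failureThreshold;
    private final long retryAfterMillis;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private final AtomicLong openedAt = new AtomicLong(0); // 0 means the circuit is closed

    public CircuitBreaker(int failureThreshold, long retryAfterMillis) {
        this.failureThreshold = failureThreshold;
        this.retryAfterMillis = retryAfterMillis;
    }

    public <T> T call(Callable<T> operation) throws Exception {
        long opened = openedAt.get();
        if (opened != 0 && System.currentTimeMillis() - opened < retryAfterMillis) {
            throw new IllegalStateException("Circuit open: downstream is marked as failing");
        }
        try {
            T result = operation.call(); // the operation itself should carry its own timeout
            consecutiveFailures.set(0);
            openedAt.set(0);             // success: close the circuit again
            return result;
        } catch (Exception e) {
            if (consecutiveFailures.incrementAndGet() >= failureThreshold) {
                openedAt.set(System.currentTimeMillis()); // open the circuit; log it for operations
            }
            throw e;
        }
    }
}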

The author very rightly concludes that “Sadly, the absence of a problem is not usually noted. You might be salvaging a badly botched implementation in which case you now have an opportunity to look like a hero. On the other hand, if you’ve done a great job of designing a stable system from the beginning, it’s unlikely that anyone will notice your system’s lack of downtime. That’s just the way it is. Deliver an unbreakable system, and users will surely go on to complain about something else. That’s just what users do. In fact, with a system that never goes down, the users will most likely complain that it’s slow. Next, you’ll look at capacity and performance and how to get the most out of your resources.”

In a case study it is illustrated how the usage of sessions killed an application. Bots and regular users together pushed the number of sessions far beyond what the system could handle and the site crashed. This was later resolved by supporting sessions through URL rewriting, so that no new sessions were created for the bots, and by creating a throttling mechanism to control the total number of sessions in the system. The key learning is that the performance tests only covered happy paths and never situations like bots hitting the site.
When planning for capacity it is important to ensure that the software written is optimal and has minimal wastage; otherwise the cost of the resources required to run the application keeps increasing. As an example, if an HTML page carries 1 KB of junk data, a million requests to that page translate into roughly 1 GB of extra bandwidth usage. The cost of such waste multiplies as the usage of the application increases.

Some good patterns to follow are:
1. Pool resources, size them properly and monitor them.
2. Use caching, limit the maximum memory that can be used by the cached objects and monitor the hit ratio (see the sketch after this list).
3. Precompute whatever is possible and recompute only when absolutely necessary.
4. Tune Garbage Collection
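
For point 2, a minimal sketch of a size-bounded LRU cache using the standard LinkedHashMap (the entry cap is an assumption): the eldest entry is evicted once the cap is exceeded, so the cache cannot grow without limit. The memory used by cached objects and the hit ratio would still need to be tracked separately, as the book advises.

import java.util.LinkedHashMap;
import java.util.Map;

/** A simple LRU cache limited to a fixed number of entries. */
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // access-order, so the least recently used entry ages out first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the eldest entry once the cap is exceeded
    }
}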

Some Network points
1. Servers in production tend to be multi-homed, and it is important to bind applications to the right interface to prevent security issues.
2. Given the above, it becomes important to configure the network routing correctly.
3. Use Virtual IPs where native clustering of applications is not possible. Applications need to be written keeping in mind that this will be the case in production systems.

Some Security aspects:
1. Follow the principle of “least privilege”: every action should be performed with the least privilege required to execute it. Run each application as its own user, so that if one application is compromised only that application is affected and none of the others.
2. Ensure that the passwords used to access other services are secured properly, that memory dumps of the processes will not reveal them, and that they are kept away from the installation directory.

Some Availability Aspects
1. The cost of a system grows exponentially with the required availability. Availability should be defined realistically, not idealistically.
2. SLAs should be well defined and measurable. They should be defined per feature and take into account the SLAs of the third parties being depended upon. The location from which the application is accessed also matters.
3. Load Balancing and Reverse Proxies should be used to balance the load across the multiple servers and across the various tiers.
4. Clustering will be required in scenarios where the servers need to communicate with each other to exchange some data.

To ensure reliability, the topology of the QA environment should be the same as that of Production, although the capacity may be far lower.
Application configuration and environment-specific configuration should be kept separate.
The application should be able to announce if it has not started properly.
Provide command-line options to configure the systems; a GUI can be added when sufficient time is at hand and automation is not required.

Every system needs to be transparent, i.e. it needs to show what it is using and what it is doing; without this information it is very difficult to manage the system. While it is necessary to know the status of the individual parts, it is also important to know the status across all the parts of the system. This helps in analysing any problem that manifests in the system.
It is not necessary to log the stack trace of a business exception, such as a validation error stating that a mandatory parameter was not entered. It is vital, however, to log the stack trace when a non-business exception occurs.
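
A small sketch of that logging discipline (the service, its exception and the field names are all illustrative): business exceptions get a short message, unexpected exceptions get the full stack trace.

import java.util.logging.Level;
import java.util.logging.Logger;

public class OrderService {
    private static final Logger LOG = Logger.getLogger(OrderService.class.getName());

    /** Illustrative business exception: a validation failure the caller can correct. */
    public static class ValidationException extends RuntimeException {
        public ValidationException(String message) {
            super(message);
        }
    }

    public void placeOrder(String orderId, int quantity) {
        try {
            if (quantity <= 0) {
                throw new ValidationException("Quantity must be positive for order " + orderId);
            }
            // ... actual order processing would go here ...
        } catch (ValidationException e) {
            LOG.warning("Order rejected: " + e.getMessage()); // message only, no stack trace
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "Order processing failed for " + orderId, e); // full stack trace
            throw e;
        }
    }
}
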
It is important to have a network separate from the production data network for monitoring traffic.
A good monitoring system provides visibility into business outcomes and not just technical parameters.

The author draws a very good comparison between crystals in metal and tight coupling in software design:
“A cluster of objects that can exist together only in a tight collaboration resembles a crystal in a metal. The objects stay together in a tightly bound relationship, just as the atoms in a crystal are tightly bound. In metal, small crystals mean greater malleability. More malleable metals recover from stress better. Large crystals encourage crack formation. In software, large “crystals” make it harder to change the software. When objects in one grain participate in multiple collaboration patterns, they bridge two crystals, forming a larger grained crystal—further reducing the malleability of the software.
There is no limit to how far this region of tightly bound crystals can spread. In the extreme case, the crystal grows until it is the boundary of the application. When that happens, every object suits exactly one purpose to which it is supremely adapted. It fits perfectly into place and ultimately relates to every other object. These crystal palaces might even be beautiful in a baroque sort of way. They admit no improvement, in part, because no incremental change is possible and, in part, because nothing can be moved without moving every other object. These tend to be dead structures. Developers tiptoe through crystal palaces, speaking in hushed tones and trying not to touch anything.”

View all my reviews

Ship It! by Jared Richardson
My rating: 3 of 5 stars

A collection of lessons learned by various developers in the trenches. The book starts off with a quote from Aristotle: “We are what we repeatedly do. Excellence, then, is not an act, but a habit.” It strengthens this argument by stating “Extraordinary products are merely side effects of good habits.” So the first tip of the book is “Choose your habits”: do not follow something just because it is popular, well known or practised by others around you.

The author says that there are three aspects that one needs to pay attention to:

  1. Techniques: How the project is developed, e.g. daily meetings, code reviews, maintaining a to-do list, etc.
  2. Infrastructure: The tools used to develop the project, e.g. version control, build scripts, running tests, continuous build, etc.
  3. Process: The process followed in developing the application: Propose Objects, Propose Interfaces, Connect Interfaces, Add Functions, Refactor, Refine, Repeat.

Tools and Infrastructure

The author highlights the need for a proper Source Control Management tool and warns that the right tool should be chosen: a tool should not be picked just because it is backed by a big-ticket organization. Vendors will push their “supertools”, but one needs to exercise discretion when choosing between tools.

Good Development Practices

  1. Develop in a sandbox, i.e. the changes of one developer should not impact the others until the changes are ready.
  2. Each developer should have a copy of everything they need for development; this includes the web server, application server, database server and, most importantly, the source code.
  3. Once all the changes by a developer are finished they should be checked in to the source control so that the others can pick them up, integrate them with their code and make any changes they need for the integration.
  4. The checked in changes should be fine grained.

Tools Required for ensuring Good Development Practices

  1. SCM
  2. Build Scripts
  3. Track Issues

What to keep in SCM?

  1. While it can be debated whether runtimes like Java need to be kept in the SCM, it is important that all third-party libraries (jars, dlls) and configuration templates be available in the SCM. Configuration templates are needed because the actual contents can change from environment to environment.
  2. Anything that is generated as part of the build process (jars, dlls, exes, war) should not be stored in the SCM.

What a Good SCM should offer

  1. Ensure that the usage of SCM is painless to the developers. The interactions with the SCM should be fast enough to ensure that the developers do not hesitate to use it.
  2. The minimal set of activities that should be supported by the SCM is:
  • Check out the entire project.
  • Look at the differences between your edits and the latest code in the SCM.
  • View the history for a specific file—who changed this file and when did they do it?
  • Update your local copy with other developers’ changes.
  • Push (or commit) your changes to the SCM.
  • Remove (or back out) the last changes you pushed into the SCM.
  • Retrieve a copy of the code tree as it existed last Tuesday.

Script the Build

Once the required artefacts are checked out from the SCM, it should be possible for any developer to run a script and have a working system (sandbox) of her own to work on. For this one needs a build script. The build should be completely automated, requiring no manual intervention or steps, and the build script should live outside the IDE so that it can be used irrespective of the IDE being used; the IDE can use the same script for local builds.
Once the one-step/one-command build script is ready, automate the build. Ideally, every time code is checked in the following should happen:

  1. Check out the latest code and build it.
  2. Run a set of smoke tests to ensure that the basic functionality is not broken.
  3. Have the build system notify the stakeholders of the newly checked-in code, the build and the test results.

This is Continuous Integration

Tracking the Issues

It is important to record the issues that are reported against the application so that they can be tracked and fixed.
At a bare minimum one needs to know the following about an issue:

  • What version of the product has the issue?
  • Which customer encountered the issue?
  • How severe is it?
  • Was the problem reproduced in-house (and by whom, so they can help you if you’re unable to reproduce the problem)?
  • What was the customer’s environment (operating system, database, etc.)?
  • In what version of your product did the issue first occur?
  • In what version of your product was it fixed?
  • Who fixed it?
  • Who verified the fix?

Some more that will help in the long term

  • During what phase of the project was the bug introduced?
  • The root cause of the bug
  • The sources that were changed to fix the problem. If the check-in policy demands that the check-in comment indicate the reason for the fix, then it should be possible to correlate check-ins with the issues they fixed or the requirements they addressed.
  • How long did it take to fix the error? (Time to analyze, Fix, Test)

Some warning signs that things are not OK with the issue system

  • The system isn’t being used.
  • Too many small issues have been logged in the system
  • Issue-related metrics are used to evaluate team member performance.

Tracking Features

Just as it is important to track the issues, it is important to track the features that have been planned for the application.
The system used to track issues may also be used to track the features as long as it provides the ability to identify them separately.

Test Harness

Have a good Test Harness which can be used to run automated tests on the system.

  1. Use a standard Test Harness which can generate all the required reports.
  2. Ensure that every team member uses the same tool.
  3. Ensure that the tool can be run from the command line. This will enable driving it from an external script or a tool.
  4. Ensure that the tool is flexible to test multiple types of applications and not specific to a particular type.

Different types of testing need to be planned for:

  1. Unit Testing – Testing small pieces of code. This forces the developers to break the code into smaller pieces, which makes it easier to maintain and understand, reduces copy-paste, and ensures that overall functionality is, if at all, minimally impacted by refactoring (see the JUnit sketch after this list).
  2. Functional Testing – Testing all the functions of the application.
  3. Performance Testing – Testing the application to ensure that the application is performing within acceptable limits and meets the SLAs.
  4. Load Testing – This is similar to the Performance Testing. The goal of this is to ensure that the application does not collapse under load.
  5. Smoke Testing – This is a light-weight testing which will test the key functionality of the application. This should be included as part of Continuous Integration so that any breakage in key functionality comes to light very quickly.
  6. Integration Testing – This ensures that the integration of the modules within the application and the integration of the application with the external systems is functioning correctly.
  7. Mock Client Testing – This mocks the client requests and ensures that the client gets the right response within the expected time period.
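
As referenced under Unit Testing above, a minimal JUnit 4 sketch (the class under test and its behaviour are invented for illustration). Run from the command line via the build script, such tests can also double as the smoke suite for Continuous Integration.

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class PriceCalculatorTest {

    /** Hypothetical class under test: applies a percentage discount to a price. */
    static class PriceCalculator {
        double discounted(double price, double discountPercent) {
            return price - (price * discountPercent / 100.0);
        }
    }

    @Test
    public void appliesTenPercentDiscount() {
        assertEquals(90.0, new PriceCalculator().discounted(100.0, 10.0), 0.0001);
    }

    @Test
    public void zeroDiscountLeavesPriceUnchanged() {
        assertEquals(100.0, new PriceCalculator().discounted(100.0, 0.0), 0.0001);
    }
}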

Pragmatic Project Techniques

Some of the good practices to follow when working in projects are as follows:

  1. Maintain a list of activities to do. This should be visible and accessible to everybody on the project; even the client should have visibility into the list so that they can check the pace and prioritize the items on it. Each item should have a target time, and the list should reflect the current status and never be out of date.
  2. Having Tech Leads in the project is important. The Tech Lead should guide the team in the selection and use of technology, should be responsible for ensuring that the deadlines are realistic, and should act as the bridge between the developers and the management. It is an important role to be played by a person with the right temperament.
  3. Coordinating and communicating on a daily basis is very important. Meetings should be set up daily; they should be short and to the point, with everybody sharing what they are doing and what they plan to do, and highlighting any problems they are facing. The solutions to these problems should not be part of the meeting but should be worked out separately.
  4. Code review is a crucial part of the project and every piece of code should be reviewed. Some good practices of code review are
    1. Review only a small amount of code at a time
    2. A piece of code should not be reviewed by more than two people
    3. Code should be reviewed frequently, possibly several times a day
    4. Consider pair programming as a continuous code review process.

Tracer Bullet Development

Just as a tracer bullet can be fired at night to track the path before aiming the real rounds, the process adopted should make it possible to see the path the project will take early on.

Process

Have a process to follow.
The process followed should not claim exclusivity in the success of projects; if it does, be suspicious of it.
Follow a process that embraces periodic re-evaluation and the inclusion of whatever practices work well for your projects.

Executing

  • Define the layers that will exist in the application.
  • Define the interfaces between the layers.
  • Let each layer be developed by a separate team, relying on the interface promised by the adjacent layers.
  • Keep it flexible so that the interface can be changed as it is hard to get the interfaces perfect the first time around.
  • First create the large classes like the Database Connection Manager, Log Manager etc required for each layer, then write the fine grained classes.
  • Collaboration between the teams developing the different layers is key to the success. These collaborations will Trace the Path that the project will take.
  • Do not let an architect sitting in an ivory tower dictate the architecture.
  • It is dangerous to have one person driving the whole project. If this person leaves, the project will come to a standstill.
  • Create stubs or mocks of the interfaces of the adjacent layers so that testing becomes easy (see the sketch after this list).
  • Code the tough and key pieces first and test them before addressing the simpler ones. It may take time to show progress, but when the progress happens it will be very quick.
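
As referenced in the stubbing point above, a small sketch of coding one layer against the promised interface of the adjacent layer, with a hand-written stub standing in for the real implementation (all names are invented for illustration):

import java.util.Arrays;
import java.util.List;

/** Interface promised by the adjacent (data-access) layer. */
interface FlightRepository {
    List<String> findFlightsByCity(String city);
}

/** Hand-written stub so the layer above can be built and tested before the real layer exists. */
class StubFlightRepository implements FlightRepository {
    @Override
    public List<String> findFlightsByCity(String city) {
        return Arrays.asList("AA101", "BA202"); // canned data, no database required
    }
}

/** The layer under development, written against the interface only. */
class FlightSearchService {
    private final FlightRepository repository;

    FlightSearchService(FlightRepository repository) {
        this.repository = repository;
    }

    int countFlights(String city) {
        return repository.findFlightsByCity(city).size();
    }
}

class TracerBulletDemo {
    public static void main(String[] args) {
        FlightSearchService service = new FlightSearchService(new StubFlightRepository());
        System.out.println(service.countFlights("London")); // prints 2 using the stubbed data
    }
}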

Common Problems and How to fix Them

What to do when legacy code is inherited?

  1. Build it – Learn to build it and script the build.
  2. Automate it – Automate the build.
  3. Test it – Test to understand what the system does and write automated test cases.

Don’t change legacy code unless you can test it.

Some other tips from the chapter

  1. If code is found unsuitable for automated testing, refactor it slowly so that it becomes amenable to automated testing.
  2. If a project keeps breaking repeatedly, automated test cases emulating the user actions will help reduce the incidents.
  3. Ensure that the automated tests are updated whenever the code/logic changes; otherwise they become useless.
  4. It is important to have Continuous Integration so that the automated tests are run regularly.
  5. Early check-ins (daily, or even more than once a day) and quick updates by the developers are important so that integration problems are detected as early as possible.
  6. It is important to communicate with the customers and get regular feedback.
  7. The best way to show the customer the progress of the project is to show them a working demo of the application.
  8. Introduce a process change when the team is not under pressure. Point out the benefit the stakeholders will gain from the new process; show them the benefit of the process/practice rather than talk and preach about it.

A wonderful Dilbert quote from the book
“I love deadlines. I especially love the swooshing sound they make as they go flying by.” — Scott Adams

Some Excerpts from the book

View all my reviews

When I started programming,
I just talked to the customer,
And coded and everyone was rocking,
Nothing was only bluster.

Then they said we need to do waterfall
Somebody talked to the customer,
Someone else coded leading to downfall,
And the project manager went a fluster.

Now they are saying use Agile,
Everybody talking to the customer,
It is making everything fragile,
And the project manager has lost his luster.

A few weeks ago a blog was published asking the rhetorical question “Can software be created in factories?”. My good friend pointed me to the Wikipedia article on “Software Factory”. What I would like to point out is that the statement “Software factory refers to a structured collection of related software assets that aids in producing computer software applications or software components according to specific, externally defined end-user requirements through an assembly process. [1] A software factory applies manufacturing techniques and principles to software development to mimic the benefits of traditional manufacturing. Software factories are generally involved with outsourced software creation.” from the Wikipedia article is completely incorrect and fallacious, despite the number of individuals who believe it.

Martin Fowler’s bliki entry, Code as Documentation, has a link to Jack Reeves’s famous essay “What is Software Design?”.

This article first appeared in 1992 in the C++ Journal. It was written by Jack Reeves, who had been in the industry for more than 10 years at the time; the trigger was the fact that C++ had taken the software world by storm and was being seen as the panacea for all the problems plaguing the software industry at that time.

He summarizes the article as follows:

To summarize:

  • Real software runs on computers. It is a sequence of ones and zeros that is stored on some magnetic media. It is not a program listing in C++ (or any other programming language).
  • A program listing is a document that represents a software design. Compilers and linkers actually build software designs.
  • Real software is incredibly cheap to build, and getting cheaper all the time as computers get faster.
  • Real software is incredibly expensive to design. This is true because software is incredibly complex and because practically all the steps of a software project are part of the design process.
  • Programming is a design activity—a good software design process recognizes this and does not hesitate to code when coding makes sense.
  • Coding actually makes sense more often than believed. Often the process of rendering the design in code will reveal oversights and the need for additional design effort. The earlier this occurs, the better the design will be.
  • Since software is so cheap to build, formal engineering validation methods are not of much use in real world software development. It is easier and cheaper to just build the design and test it than to try to prove it.
  • Testing and debugging are design activities—they are the software equivalent of the design validation and refinement processes of other engineering disciplines. A good software design process recognizes this and does not try to short change the steps.
  • There are other design activities—call them top level design, module design, structural design, architectural design, or whatever. A good software design process recognizes this and deliberately includes the steps.
  • All design activities interact. A good software design process recognizes this and allows the design to change, sometimes radically, as various design steps reveal the need.
  • Many different software design notations are potentially useful—as auxiliary documentation and as tools to help facilitate the design process. They are not a software design.
  • Software development is still more a craft than an engineering discipline. This is primarily because of a lack of rigor in the critical processes of validating and improving a design.
  • Ultimately, real advances in software development depend upon advances in programming techniques, which in turn mean advances in programming languages. C++ is such an advance. It has exploded in popularity because it is a mainstream programming language that directly supports better software design.
  • C++ is a step in the right direction, but still more advances are needed.

The points to note with respect to the factory aspect of software are highlighted in red. Note that the author states that coding is design, and one cannot dispute this fact. And since one does not design in a factory, software development cannot be considered to happen in a factory. It may look like splitting hairs, but for anybody who is coding, be it a novice who started yesterday or somebody who has been doing it for donkey’s years, it is apparent that this is indeed a fact: one keeps designing with practically every line of code.

Another interesting excerpt is “In software engineering, we desperately need good design at all levels. In particular, we need good top level design. The better the early design, the easier detailed design will be. Designers should use anything that helps. Structure charts, Booch diagrams, state tables, PDL, etc.—if it helps, then use it.”

This was the statement of the author in the essay published in 1992. Writing about this in 2005 the author says “Today, I would phrase it differently. I would say we need good architectures (top level design), good abstractions (class design), and good implementations (low level design). I would also say something about using UML diagrams or CRC cards to explore alternatives.”
This is what the author is referring to from the earlier article: “We must keep in mind, however, that these tools and notations are not a software design. Eventually, we have to create the real software design, and it will be in some programming language. Therefore, we should not be afraid to code our designs as we derive them.”

The author goes on to say “This is fundamental. I am not arguing that we should not “do design.” However you want to approach the process, I simply insist that you have not completed the process until you have written and tested the code.”

Note that the author bolsters the argument that software development involves design at every stage; it is not limited to a single design phase.

Another interesting statement in the second essay is “When the document is detailed enough, complete enough, and unambiguous enough that it can be interpreted mechanistically, whether by a computer or by an assembly line worker, then you have a design document. If it still requires creative human interpretation, then you don’t.” This again goes to prove that software cannot be created in factories.
One final argument to support the claim that software cannot be created in a factory: “The problem with software is – design is not just important, it is basically everything. Saying that programmers should not have to design is like saying fish should not have to swim. When I am coding, I am designing. I am creating a software design out of the void.”

Update
When this was sent to a few people, the reply I got back was “In the Indian IT industry, there is no such thing as a ‘Less Able Programmer’. All donkeys can be ‘processed’ to become a stallion. All crows can become swans…”, and all that can be said about this sad state of affairs is that this belief is exactly the bane of the Indian IT industry and, in my black, cynical, negative opinion, is going to lead to the downfall of what we today consider to be a cash cow.

In my interactions with various people working in the IT world I have noticed that some have a tendency to use the word “factory” to describe a location where people are either writing new applications or maintaining existing software. Something about the word “factory” irritates me; I do not get a comfortable feeling when somebody equates software development/maintenance to the tasks performed in a factory.

What, I think, these people fail to realize or admit is that in a factory the tasks tend to be repetitive and hence “teachable” and “learnable”. This is the reason we see so much automation in factories; they hardly have any human intervention.

Unlike manufacturing factories, the software “factory” is full of people. Except for a very few processes in the development cycle, software creation cannot be automated; human intervention is required at almost every stage. Software requires a human touch during creation.

From a maintenance and support perspective, too, software needs humans to address any issues that come up in production. Very little of this can be automated.

Given all this, it gives me the creeps when somebody refers to a “software factory”.

1. Thou shalt not copy-paste code
2. Thou shalt name appropriately even if it means typing long names.
3. Thou shalt write and automate Unit Test Cases
4. Thou shalt write small methods and small classes. (#)
5. Thou shalt document, not how it works, not what it does($), but how it can be used and why it exists
6. Thou shalt have only two levels of indentation in any method (see the sketch after the notes below).
7. Thou shalt not write god classes (*).
8. Thou shalt check in your logically complete code changes to the version control system at least once a day.
9. Thou shalt update your development environment from the version control system at least once a day.
10. Thou shalt be humble enough to accept errors or inefficiencies in your code as pointed out by your reviewers, peers or juniors and be open to correct them.
Notes
(#) Because all good things come in small packages.
($) What it does or contains should be known from the name.
(*) God classes are classes that do too many things; the whole application ends up depending on a few such classes.
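
As referenced in commandment 6, a small sketch of how guard clauses and extracted methods keep every method within two levels of indentation (the invoice example is invented purely for illustration):

import java.util.List;

public class InvoiceMailer {

    public void sendReminders(List<Invoice> invoices) {
        for (Invoice invoice : invoices) {
            sendReminderIfUnpaid(invoice); // delegate instead of nesting an if inside the loop
        }
    }

    private void sendReminderIfUnpaid(Invoice invoice) {
        if (invoice.isPaid()) {
            return; // guard clause keeps the rest of the method at a single level
        }
        // ... compose and send the reminder e-mail ...
    }

    /** Illustrative invoice type. */
    public static class Invoice {
        private final boolean paid;

        public Invoice(boolean paid) {
            this.paid = paid;
        }

        public boolean isPaid() {
            return paid;
        }
    }
}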
