Minimizing Work in Progress within a Scrum Sprint is Important, too!

 

How many times have you heard the following from the testers in a team attempting to be agile?

“There was not time to test at the end of the sprint. We just got most of the features on the last day of the sprint. How are we supposed to test those features?”

Frankly, I think it happens all the time. This is one of the reasons that some teams try to have the testing teams test features in a following sprint. The problems is that so many teams attempt to apply a variant of Scrum that can be called Scrumerfall. Basically, these teams are treating a sprint like a miniature waterfall project like the following diagram shows.

This team is based on a real scenario that I’ve seen. The team had three developers, on the first day of the sprint, they all took a PBI and drove it. But what happened? The variation in completion of requirements and development resulted in really squeezing the testers that were on this team. Sure they were working on test cases and test planning initially, and even participating in requirement meetings. Something obviously wasn’t working very well. For instance, how well do you think Feature 3 was actually tested?

If that’s all we do when setting up scrum teams, we’ve failed. We haven’t allowed testing to happen sufficiently, and we haven’t really built quality into the process. We have limited the amount of work in progress (on paper), but technical debt obviously is growing here no matter what.

An alternate reality of Sprint 23 above is executed by the same team, but we’ve set a WIP (Work In Progress) limit of 2 PBI at a time. Let’s see what happens:

What’s different here? The total duration of Features 1 and 2 shrunk a little bit because I had to double up the effort short term (we forced the team to collaborate on those features, rather than each going in their own directions). Feature three testing also grew in duration which illustrated it probably needed to get tested more even in the first case. Additionally, the team is often performing all types of the functions in there, from requirements, development, and testing throughout almost the entirety of the sprint. The team was increasing its design, build, deploy cycles and having many of those throughout the sprint, rather than just at the end.

At the end of the day, forcing WIP limits can increase the throughput and quality of features that are running through the system.

Why Minimize Work in Progress is Important in Software Development

Many years ago, The Goal was written by Eliyahu Goldratt as a business novel to illustrate the fallacies in thinking that companies in a manufacturing world tend to have about how to maximize the profit. In this novel, the protagonist learned how to do so for his plant by applying the theory of constraints. As bottlenecks were identified and solved within a sequential manufacturing process, the flow of the whole system was improved when the group stopped optimizing for utilization, and started optimizing for flow. In this process, the team learned to keep the total work in progress to a minimum, so that as a product was started, they worked to get that piece to completion and shipped as quickly as possible. In that world, inventory was a form of work in progress and that was real dollars to be received if it was shipped.

In software development teams, we have our own form of work in progress. Everything that we spend between requirements and Released Code, is our Inventory. It represents time and dollars that have gone into the process that are not being realized in terms of value for our customers. In the below diagram, everything before Released Code is “inventory.”

This is one underlying reason why Scrum and Agile have taken hold in our software development community. Scrum in particular prescribes a batch size (a sprint’s worth of work) that represents the maximum amount of inventory the team should take on. In general, Scrum teams should have “shippable” software and that distinction between shipped and “shippable” is a compromise because our customers may not be able to accept the software at a frequency of “a sprint.”

Before software is shipped, or deployed to production, represents an expense to the company. Consider the following two cases, one is an agile team, and the other is a waterfall project:

Waterfall Team (18 Month Project, 15 team Members)

  • Requirements Team (3 Members Each Making 80K/Year, 5 Months of Work): $100,000
  • Development Team (8 Members Each Making 80K/Year, 9 Months of Work): $480,000
  • QA Team (4 Members Each Making 80K/Year, 4 Months of Work): $106,666

    In whole, the company has $686,666 of inventory. And if this is a general standard operating practice, they are continually (just in IT) maintaining that much inventory on the books (that’s equivalent to having that many products sitting in a warehouse and not being sold).

Scrum Team (18 Month Project, 15 Team Members (there are two teams)). Below is the amount of inventory the company is holding onto, assuming deployments every two weeks. Assuming maximum amount of inventory below (last day of the sprint).

  • 15 Cross Functional Team Members (2 Weeks of WIP): $46,153

Two scrum teams at any given point have six (6%) of the total inventory being maintained. They then ship their software, and the company can realize value from that software much sooner in the process of building that product. Of course there are caveats here in that a software development product must have a minimal viable product, which is assumed to be less than the total scope. Whenever the scrum teams meet that MVP, the company can start transitioning some of these costs into capital expenditures.

This is not true just of Scrum teams, but teams that apply Lean principals and rely on Kanban boards and release continuously also are taking advantage of the significant savings possible in in this world. Rather than measuring a fixed batch size, they are in fact measuring lead time and time to resolution. Either way, the company again is realizing the value much sooner for their efforts.

So we work to get the boxes out of the warehouse and into the customers hands sooner.

Choosing the Right Level of Communication from Release Management

We leverage Release Management for Visual Studio quite often internally and at client sites to enable them to implement release processes for their organizations. One of the strengths of Release Management is its ability to model moderately complex workflows and approval processes within the enterprise. One of the challenges is to get the right level of communication from this tool. If a group has too little communication then people are not informed when they need to take part in a workflow. If there is too much and people begin to start to ignore it completely. Both of these outcomes result in decreased alignment and harm adoption of Release Management within organizations. This post is aimed at highlighting a common problem and a few solutions for it.

For this post, I’ll be using the following simple environment promotion process (click here for explanations of validation, approval, and acceptance):


Problem: QA Manager is Getting Too Many Notifications

Each Release that is created has a targeted environment. The target environment will indicate how far (from left to right in the above diagram) that release will be promoted if all approvals are made. Typically Development will “automate” validation and approval in the development environment to ensure that the noise to them for doing many builds is minimal. In an environment where any build can potentially make its way to production, this is bound to happen. The development team is in effect shifting the noise to the QA group/owner.

Solution A:

Development team changes most builds to only target Dev, and then if a release needs to go to QA and beyond: Dev can retarget an existing release to QA, or create another release that targets QA.

To retarget an existing build, within Release Management, navigate to Releases (tab), Releases (hyperlink below), Open the relevant release, and click on properties:

After that, simply change the Target Stage of the release, and Release management will prompt you to accept the new target stage.

Solution B:

Development team accepts the notification load by adding a member or members as Approvers of the Dev build when it’s satisfactory to promote beyond.

Conclusion

As this one issue highlights, where users are involved in the release process, a team needs to be particularly deliberate about how communication is handled. Before tools like Release Management, it was largely the responsibility of Development and QA to discuss when the next build would happen. These conversations should still be taking place regardless, but now Release Management will help in that process if it’s helping organize the communication. That will only happen as long as both parties in the example transaction above feel like they are getting a fair amount of communication.

 

 

 

Why Agile Transformations Can Fail: Sense of Urgency

At Polaris Solutions, we work with a number of customers that are looking to become more agile in their software development practices. We often help them move in that direction through adoption of new practices, tools, and roles within their organizations. Scale (the number of people/teams affected) and Scope (the degree of change involved) are the two largest variables that go into planning such a change. The bigger the scope and scale together tend to require more effort in order to ensure that the change will be successful and our customers will realize more value. In order to be successful at agile transformations you need more than agile knowledge and experience, you need to be experienced at managing change.

Over the years, as we’ve been involved in numerous transformations, we’ve noticed several patterns for success and many anti-patterns. In this post, which is the first in a series of posts that I will write, I want to deal with one common way that these efforts can go sideways or fail: Not establishing any sense of urgency.

The first question to ask – is why the transformation is needed? Is it actually an urgent need? Does your team or organization feel compelled to change now or is continuing standard operating practices considered generally acceptable? Before a team or organization is going to change, they will need to recognize the pain of not moving. It has been suggested that human beings change behavior only in response to pain (either emotional or physical), therefore, the greater the actual perceived or actual pain, the more the higher motivation to fix it. This is true for agile transformations as well.

Providing strong focus and investment in “the case” for an agile transformation in the enterprise will pay off many times over. The following are some suggestions to ensure that a transformation gets off to right foot that we’ve gathered from our experiences:

  • Ensure Awareness – I’ve seen too many organizations hope to enact change after 2 weeks of training and never talking again about agile. There are many degrees of education and experiences you can provide a team to illustrate the value of moving to agile. Giving awareness and knowledge can help make the case for agile as it largely makes sense to most team members. “Early and often” is a good rule of thumb when communicating in larger organizations.
  • Identify Champions and Hold
    Outs – The United States military would like to ensure that it has overwhelming force when engaging in battles, and you should as well when endeavoring to change a company’s culture towards agile. You have friends and allies that can be helpful – do not try to lead a change like this by yourself.
  • What Will Happen if We Don’t Change – Answer that question for yourself and for your team or organization. Perhaps the answer to this is “quality issues are causing us loss in customers or contracts.” You can use a combination of logic and reason and emotion to communicate this concept, but I have found that if it can be tied back to metrics and numbers – most people will understand and align with you.

None of these items really have to do with technology or agile in particular, but the people in our industry that help lead these changes need to consider these concepts if their engagements are to be successful.

Release Management’s Traffic Overview Explained

I wanted to write another post to help clear up some functionality in Release Management (for Visual Studio 2013). Many teams that I have opened up Release Management (RM) to see the traffic overview page and wondered how that data is actually calculated. What are App/Traffic/Fail metrics to be watching and why is this important. The pro-tip here is that if you hover over these numbers in the tool, you can see a better description.

Each of your Release Paths (pipelines) are listed down this page. Each of your stages is shown for each pipeline. You can get a sense of how many steps are required for an application to reach its ultimate destination (in this case “prod-client”), which is the right-most stage for any given pipeline.

Under each stage, you have:

  • App – which are the number of apps that have been released to that stage (all time). Note, you can have multiple applications going through a shared release path. If you have two applications that follow the same approval chain and environments, have them share Release Paths.
  • Traffic – Number of releases in the last 5 days to have been released to that stage (or have failed to get to that stage, but attempted)
  • Fail – Number of failed releases in the last 5 days to have been released to that stage.

You should probably see higher numbers on the left than you do assuming you’re like most development shops.

So what can you do on this screen?

If you double click on any of the stages, you’ll get more details about the apps that have been transitioned through that Release Path. It will show you the all-time traffic through each of those stages (more than the last 5 days).

Release Management Approvals Explained

Microsoft’s Release Management for Visual Studio 2013 has a lot of great functionality around authorization and security for releases that are being promoted through environments within the enterprise. One of the great strengths of Release Management is its ability to track actions being performed around deployment in an auditable fashion. Every single approval and manual step performed by humans is captured by the tools (assuming you’re respecting the rules of the game), and every single automated action has an owner that is tracked as that automation is performed in a company’s IT environments.

A Release Path in Release Management defines the environments by which applications will be promoted. It contains the definition of which environments are relevant and who will be participating in that promotion process. The release path below is simple one that contains only a Dev and Prod:

For each environment below, one can define relevant approvers in the workflow. I’m highlighting what I think is the most important of these:

Acceptance: This person or group of people is responsible for accepting a new release into that environment. During the release process, Release Management (RM) will optionally email everyone that is named in that box. And if any one of them approve it, then Release Management will start the automated deployment process.

Validation: This person or group of people is responsible for indicating, generally, that the automated deployment worked. Perhaps as part of the process, the team decides to run a quick smoke test against the environment before considering it “validated.” The team has the ability to define what “validated” means and it could vary by team and environment. It could mean that the QA team has run a regression test, or perhaps the developers have performed a quick smoke test. In this case, as in the case of Acceptors, if any one member of the group approves, then the release is considered validated for that environment.

Approvers: After an environment is validated, then there’s another step called Approval. This is a separate from validation in that – it’s not indicating that the release is working in some fashion, but rather this is the chance for a person or team to approve that the release can be considered “worthy of being promoted beyond.” For a development environment, the development team may be signifying that the release is worthy of moving to QA or PROD. This Approval step is different from the other two kinds in that multiple people or groups can be added. All of the ones listed must approve before the release is considered “Approved” in that environment.

 

Below is an example chart that I’ve used to map out roles and responsibilities in the release process. This chart was created for a traditionally managed IT organization with separate QA and DEV teams.

Environment

Development

Testing

Production

Acceptance

[All Developers]

Recommend leaving automated to lower friction of deploying to a development environment.

 

[Testing Leads]

Recommended a small group make this acceptance to prevent any loss of work from testing team due to errant releases to the Test environment.

[Infrastructure Team Leads]

Team required to indicate that production was ready for a release and the change management process was respected.

Validation

[All Developers]

Recommend leaving this automated to prevent friction of deploying to development team.

[All Developers]

Likely, the developers would do a quick smoke test on the QA environment to ensure it’s in working order before testers start performing their tests.

[All Testers]

Whomever has the skills to indicate that the release successfully made it to the environment.

Approval

[Development Leads]

 

Agrees that the release is good enough for QA to begin testing.

[Testing Leads] and [Product Owner]

 

Required both sign offs before the release is readied for production.

[Product Owner]

 

Someone that signs off that the release was a success and new features are working as expected.

 

I have a few patterns that I’ve noticed that seem to be worthy of reminding:

  • Validation and Approval can be confusing. They both take place after the deployment, but I like to suggest that they mean very different things. Validation is something that happens to suggest that the release is good for the “current environment” while “approval” really indicates that the release is good enough for a “future environment.”
  • Acceptance or Approval both are gatekeepers to prevent the releasing of a build to a new environment. When teams use both of these, it can be confusing which one is going to be used for tracking “ready for next.” I’d suggest “Approval” to be used particularly between Test and Prod as the “this is ready” as opposed to acceptance is “the environment is ready” for the new release.

I’d be curious if you have any other patterns of approvals/disapprovals or if you have used these states in other ways that defined above. Please comment and share.

Recreating TFS Shelvesets in New Environments

Invariably, when you work in the ALM field enough, you get your share of source code migration projects. Lately, I’ve had a few of those, and many of them between two separate Team Foundation Server instances. In the world of source code migration, often consultants will look to recommend either tip-only migration or the use of automated tools to essentially play-back all the changesets/checkins from the old system into the new one. The tip-only migration means to get-latest the code from the old, and check that latest version into the new system. Although typically far simpler, this approach of checking in a single latest version does have its complications. Lately, I’ve worked with several clients that have had invested in shelvesets that need to be brought over. In those cases, the teams had access to both the old TFS server and the new one at the same time. I had written up some instructions on how to do this easily for the developers so that they could bring over their own shelvesets at their convenience before the old system was taken down.

This approach is not-automated and requires the users’ participation, but is a decent option for small teams.

What you need is a developer’s local workstation with workspaces mapped to both the old and the new TFS servers.

  1. Ensure that a local workspace for the old environment is mapped and up to the version of code that your shelveset is based on (often this is “latest”). At this point, TFS believes that the local workstation has the latest version of the code.
  2. In Windows Explorer, navigate to that location and Delete everything under the workspace mappings to that the folder itself is empty. TFS may detect that as a bunch of deletes, but that’s okay.
  3. Unshelve the shelveset to be migrated. At this point, the old workspace’s folders should only have the shelveset files in it.
  4. Paste that set of files into the new workspace in the right place in the folder structure.
  5. If using VS 2012 or VS 2013, Open up the Pending Changes Window and ensure to promote those changes that should show up now as Pending Changes.
  6. If your original shelveset included any deletions, the developer will need to go perform those deletions in the new workspace as well.
  7. Shelve the changes.

At this point, you should have a shelveset mapped to the correct folder path in your new environment ready to be used by the developer.

Promote the Bits or Promote the Code

Release management in the enterprise has design patterns just like software has. A design pattern in the software design is a way to communicate recurring structures of code and logic in applications architectures. They help technologists in various roles communicate intent and structure in a succinct way. If both people are aware of a particular design pattern, then a huge short-cut in communication can be reasonably be taken. Invariably in my consulting career when doing a tools-focused engagement, we start to talk about how to do automated builds and deployments. One of the first decisions is whether or not to couple build and deployment and this is a “release” pattern that enterprise teams are choosing all the time.

Promote the Code.

This pattern was earlier defined commonly as “Source Code Promotion Modeling” – essentially having a branch for every environment, and create a build for every branch that would deploy to that environment. This can be represented by the diagram below showing the artifacts that would typically be created in this situation.

This has traditionally been less expensive to implement. It requires developer skillsets in branching and merging which are ubiquitous, and it requires very easy configuration strategy for builds. The downsides are two-fold: (1) Code is recompiled in each environment and therefore what was tested in a lower environment is not being promoted, but rather in best cases the actual source code was that gets recompiled and (2) Depending on implementation, even the same code cannot be guaranteed to be the same in the lower environments if reliance on merging with the possibility of two-way merging can happen.

Promote the Bits

In this world, the source code is compiled once, but released/deployed potentially man times. This has the result that the compiled binaries that were tested in a lower environment are identical to the ones that are being promoted to the higher environments. Typically, the application is just re-configured in each of the environments that are required. The downside has traditionally that this is certainly more sophisticated and thus more expensive to deploy in enterprises that are just adopting new patterns. What is changing however is the cost to move to this pattern has started to decrease in the Microsoft community due to Microsoft’s entry into the market through “Release Management for Visual Studio.” Having an anointed Microsoft solution, in reality, has pushed this scenario more and more into the mainstream of enterprise software development.

Neither of these concepts are new. Using Team Foundation Server and Release Management, both scenarios are fairly easy and cost effective to implement. I have found that these two phrases: Promote the Bits or the Code — definitely have simplified the conversations with my clients and allowed us to move into the things that in fact do vary from client to client. If you’re small and looking for some quick wins – Promote the Code can work. If you’re subject to regulatory compliance such as SOX or other – Promote the Bits looks to be the answer.

Deliberate Risk Taking in Software Teams

Risk is a difficult concept to understand and deal with. As an industry, when building software, we talk about lowering our overall risk and that will increase our chances of success. So many things can potentially go wrong on projects that we are simply conditioned to remove those possibilities as much as possible. In many cases, I would agree with that approach, but just as in the field of Enterprise Risk Management (ERM), our industry needs more nuanced ways of looking at risk in order to move forward and striking the right balance between debilitating risk aversion and wild-west gun slinging. Today this is managed by our industry gurus, our project managers, and scrum masters using intuition and experience. Perhaps if we formalized the thinking about risk, we can increase our overall competency beyond our sages of wisdom.

ERM categorizes risks into two categories: ancillary and core.

  • A core risk is one that the company wants to take on because often it is the reason for them being in business and earning revenues. The business is said to be exploiting this risk and deriving profits from doing so. If you’re a consulting company, such as Polaris Solutions, you really want to take on as many fixed-bid projects as you can – as long as you know they are the types of projects your company can do well. Similarly, a staff augmentation firm that might not have strong engagement / project management competencies may want to avoid those same engagements. The consulting group should take on the risk because in theory it should be compensated for taking on the risk and it knows it can deliver on or ahead of schedule. Additionally, imagine a cloud hosting service provider like Microsoft Azure. People pay Microsoft so they can offload their risk of outages and complexity in infrastructure. If Microsoft hires a lot of really smart people that do infrastructure management, it wants the risks associated with that work because it can charge a price for that. In theory – their competency is keeping the datacenter functioning well.
  • An ancillary risk is one that the company would love to reduce, mitigate, or offload to a third party whenever it can. Companies extend credit to their customers all the time through “accounts receivable” – which are promises of customers to pay invoices within 30/60/90/etc. days. There is a risk always that a customer might not pay their invoice to the company. Most companies don’t like that risk and that’s why many have shrunken payment terms, or outsource collections from the outset to someone that is good at “collecting.” At my company, Polaris, there’s the risk that an employee gets hurt on the job. I don’t personally like having that risk, so we buy (in addition to compliance reasons) workers compensation insurance which prevents me from having to pay the medical bills directly. That’s an ancillary risk that I don’t want, and is not core to what we do.

These concepts can apply to software development teams as well. Unfortunately, most of us treat all risks in the same way – they are things that we want to remove from our paths. Some examples of risks that indeed are ancillary for most enterprise development teams:

  • Laptop / hard drive failure results in source code getting lost.
  • Changing business conditions resulting in changing requirements (*note on this in my next post)
  • Dependencies on other teams’ output not getting built in time
  • Dependencies on third party libraries that are unexpectedly failing
  • Not considering product like hardware when testing and sizing applications results in scalability challenges

If those are ancillary risk, then what would possibly be a core risk to the team? A software development team exists to build software, and therefore the core risks are often surrounding the software itself. Some examples might be:

  • An experiment in new design patterns results in slower code
  • An architecture spike didn’t produce a successful POC
  • Some new feature might be more complicated than we thought it was going to be

These do share a commonality that they are the actual product, and sometimes within the teams own control. I’ve seen agile and waterfall teams unnecessarily pad estimates in such a way to reduce their risk of missing commitments to the project or product. By padding estimates, especially on an agile team, they are ensuring that fewer User Stories are being accepted by the team. Those teams often always meet their commitments and then have time left over. If this culture persists, the team takes on less and less over time. Their overall value to the business becomes less and less as they are marginalized from key product enhancements. In extreme cases, when left unchecked, this leads to team members being “let go.”

In this post, I’m not arguing to take on all commitments you can, but I am pushing that we recognize when our teams are starting to look “risk averse” and are being overall safe beyond what is acceptable. As members of the team, we can recognize this most easily, but a trained scrum master can recognize this through the empirical data on sprints. The moral of the story is that padding estimates is not avoiding the same type of risk as adding another hard drive to prevent hardware failure. So think differently about these cases, and we should ensure that our teams keep striking a good balance between too much/too little core risk.

What do you think – do you see any core risks that your team is silently mitigating?