In my journey through the world of software development, one practice has consistently stood out as a game changer for organisations striving for more frequent delivery: the use of feature flags. This approach not only facilitates continuous delivery but also allows teams to deploy new features to production incrementally, ensuring that they can gather valuable feedback before fully rolling out changes to all users.
Embracing Incremental Delivery
When we talk about continuous delivery, it’s essential to accept that we might deploy features that aren’t quite ready for the end user. This doesn’t mean we’re throwing caution to the wind; rather, it’s about delivering small increments of functionality. For instance, you might have a feature that requires multiple deployments to reach its final form. Imagine you want to deliver 10% of a feature now, with the remaining 90% to follow later. This is where feature flags come into play.
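To make that concrete, here is a minimal sketch of the idea, assuming a hypothetical in-memory flag store. The names `FLAGS`, `is_enabled` and `render_board` are illustrative only and not taken from any real product; the point is simply that unfinished code can be deployed while staying unreachable until the flag is switched on.

```python
# Hypothetical in-memory flag store; a real system would load this from
# configuration or a flag service rather than a module-level dict.
FLAGS = {"new_boards": False}


def is_enabled(flag_name: str) -> bool:
    """Return True only if the flag exists and is switched on."""
    return FLAGS.get(flag_name, False)


def render_board() -> str:
    # The unfinished code path is deployed but unreachable while the flag is off.
    if is_enabled("new_boards"):
        return "new boards (work in progress)"
    return "old boards"
```

Unknown flags default to off, so a typo in a flag name fails safe rather than exposing half-built functionality.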
A Real-World Example: Azure DevOps
The Azure DevOps team exemplifies this practice. When you log into their platform, clicking your profile picture in the top right corner reveals a preview features option that lists the feature flags available to you. Here’s how the team takes a flag through its lifecycle:
- Development Phase: When developers update an existing feature, they often ship the smaller changes directly. Net new functionality, or the replacement of a whole feature, goes behind a feature flag.
- Internal Testing: Initially, the feature is enabled only for the developers’ accounts. They monitor telemetry to assess performance and identify any issues.
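- Private Preview: Once the team is satisfied with the internal testing, they may open the feature to everyone within Microsoft who has opted in, and then to external customers who ask to have it enabled for their account. The flag is off by default, so users explicitly opt in to test the feature, providing feedback and telemetry data.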
- Public Preview: If the feature performs well, it transitions to a public preview, where it becomes available to all users who wish to try it out. This stage is crucial for gathering broader feedback.
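The stages above can be sketched as a small model of who can see a flag and what its default is at each stage. This is my own illustration, not how Azure DevOps implements it: the `Stage` names, the `Flag` class and its fields are all assumptions, and I have added a final "on by default" stage to cover the point where the feature is switched on for everyone but can still be turned off.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    INTERNAL = 1         # only the team that built the feature
    PRIVATE_PREVIEW = 2  # explicitly invited accounts, off by default
    PUBLIC_PREVIEW = 3   # visible to every account, off by default
    ON_BY_DEFAULT = 4    # on for everyone, users may still turn it off


@dataclass
class Flag:
    name: str
    stage: Stage
    team_accounts: set = field(default_factory=set)
    invited_accounts: set = field(default_factory=set)
    # Per-account choices: True = turned on, False = turned off.
    choices: dict = field(default_factory=dict)

    def is_visible(self, account: str) -> bool:
        """Can this account see the flag in its preview-features list?"""
        if self.stage is Stage.INTERNAL:
            return account in self.team_accounts
        if self.stage is Stage.PRIVATE_PREVIEW:
            return account in self.team_accounts | self.invited_accounts
        return True

    def is_enabled(self, account: str) -> bool:
        if not self.is_visible(account):
            return False
        # The owning team gets the feature by default at every stage;
        # everyone else only once the flag is on by default.
        default = account in self.team_accounts or self.stage is Stage.ON_BY_DEFAULT
        return self.choices.get(account, default)
```

Moving a feature forward is then just a change to `stage`; each account's explicit choice is preserved across stages, which is what lets a user keep a feature switched off after it becomes the default.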
The Importance of Feedback
Feedback is the lifeblood of this process. When users turn off a feature, they’re prompted to provide reasons. This engagement is invaluable. I’ve personally experienced this when I’ve turned off a feature and received follow-up communication from the product manager. They genuinely want to understand what’s missing or what could be improved. This level of engagement not only helps refine the feature but also fosters a sense of community between the development team and users.
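A rough sketch of that opt-out prompt might look like the following. Everything here is hypothetical (the `OptOutFeedback` record, `turn_off` and `follow_up_list` are names I made up for illustration); it only shows the shape of the idea: store the reason alongside the choice, and keep track of who agreed to be contacted.

```python
from dataclasses import dataclass


@dataclass
class OptOutFeedback:
    account: str
    flag_name: str
    reason: str
    ok_to_contact: bool


def turn_off(choices: dict, feedback_log: list, account: str,
             flag_name: str, reason: str = "", ok_to_contact: bool = False) -> None:
    """Record the user's choice and keep whatever feedback they offered."""
    choices[(account, flag_name)] = False
    feedback_log.append(
        OptOutFeedback(account, flag_name, reason.strip(), ok_to_contact)
    )


def follow_up_list(feedback_log: list, flag_name: str) -> list:
    """Accounts that turned the flag off and agreed to be contacted."""
    return [f.account for f in feedback_log
            if f.flag_name == flag_name and f.ok_to_contact]
```

The product manager's follow-up then starts from `follow_up_list`, so nobody is contacted who did not agree to it.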
Continuous Monitoring and Adaptation
As the feature progresses through its lifecycle, the team continuously monitors its performance. They assess whether enough users are engaging with the feature and whether it meets their needs. If the telemetry indicates that the feature isn’t performing well, they can quickly disable it, ensuring that it doesn’t negatively impact the user experience.
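Disabling a misbehaving feature can also be automated. The sketch below assumes a very simple telemetry shape (request and error counts) and arbitrary thresholds; `Telemetry`, `should_disable`, `apply_kill_switch` and the default values are all illustrative assumptions rather than anything the Azure DevOps team has published.

```python
from dataclasses import dataclass


@dataclass
class Telemetry:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Avoid dividing by zero when the feature has had no traffic yet.
        return self.errors / self.requests if self.requests else 0.0


def should_disable(telemetry: Telemetry, max_error_rate: float = 0.05,
                   min_requests: int = 100) -> bool:
    """Trip the kill switch only once there is enough traffic to judge."""
    if telemetry.requests < min_requests:
        return False
    return telemetry.error_rate > max_error_rate


def apply_kill_switch(flags: dict, flag_name: str, telemetry: Telemetry) -> bool:
    """Turn the flag off for everyone if telemetry looks bad; report whether it tripped."""
    if flags.get(flag_name) and should_disable(telemetry):
        flags[flag_name] = False
        return True
    return False
```

The minimum-traffic check matters: without it, a single early error on a feature with a handful of users would switch the feature off before there was any real signal.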
The Long Road to Full Deployment
It’s worth noting that transitioning a feature from development to full deployment can be a lengthy process. For example, the Azure DevOps team took nearly three years to fully migrate users from old boards to new boards. This was due to the need for ongoing adjustments and improvements based on user feedback and performance metrics.
Conclusion: The Power of Feature Flags
Incorporating feature flags into your development process is a powerful strategy for enabling continuous delivery while safeguarding your users and system. By allowing for incremental releases and gathering real-time feedback, you can ensure that you’re building the right features that truly meet user needs.
As you consider your own practices, I encourage you to reflect on how feature flags could enhance your delivery process. Embrace the opportunity to engage with your users, gather insights, and ultimately deliver a product that resonates with them. After all, in the world of software development, it’s not just about delivering features; it’s about delivering value.
So one of the key practices that I think is hugely valuable for organisations moving towards more frequent delivery, getting things in front of customers as quickly as possible, is feature flags. If you want to be able to do continuous delivery, you have to accept that you’re going to be deploying new features to production that you’re not ready for customers to use yet. You might have a feature that takes multiple deployments to get to the point where you’re happy to have it in production. Even if the end of the feature is much further down the line, say you’ve got 10% you want to do before the majority of users are accessing it, you’re going to be delivering 1% at a time, just as a random example.
A great example of this, again, is the Azure DevOps team; they do a great job of this. You can open up dev.azure.com and, when you’re logged in, you’ll see a button in the top right with a picture of you or a blank picture. Click on that and it’ll have a preview features option, and when you click that, it will list all of the feature flags that are available.
What they effectively do is this: if they’re writing an update to an existing feature that’s in production, it just goes out, unless they’re replacing a whole feature like they did recently with Azure Boards. They had old boards and new boards; they were doing a big update and they had a feature flag for that. Typically, if you’re adding functionality to an existing feature, you’re probably just shipping it, because it’s a smaller change. But if you’re adding net new functionality to your product, a new feature, a new capability, then they hide it behind a feature flag.
The feature flag would then be turned on in the developers’ own environment for testing. Once they’re happy with their testing, they turn it on for the product group. So the account that they’re working in has that new feature, and the developers that built the feature are looking at the telemetry. It’s off for 99% of all the users in the system; it’s on just for their own engineering team. And they’re looking at the telemetry: is this performing? Is it doing the right thing? Is it causing a lot of exceptions? They’re looking at all of that data.
Once they’re happy that that’s good, they maybe open it up to everybody within Microsoft that’s opted in to see those things. So then everybody inside Microsoft can turn that feature on and off. The feature flag’s visible. If you turn off a feature flag, it asks you why you’re turning it off and if it’s okay to contact you. Very prudent. I always give a message and I always tick the box to say it’s okay to contact me, and I do get contact from those teams, because they’re interested in why you’re turning it off: “You’ve said why, but I don’t quite understand what you mean.”
And then they’ll contact you and try and figure it out. What’s missing that you need? Is there an expectations gap? Is it bad, not what I need? Is it causing errors and performing badly on my machine? Those could all be reasons why I want to turn it off.
But at some point, they’re going to want to open it out to a wider audience, more people in the world. What they typically do is publish a post on the Azure DevOps team blog that says, “Here’s a new feature we’re working on. We’re looking for people to help us kick the tyres.” “Kick the tyres” is just a way of saying try it out and see if it works. They give an email address for you to email with your account, and they’ll enable it for your account. Then on your account you’ll get that feature flag; it will be off by default, and you can go turn it on. Again, they’re looking for people turning it off and for feedback. They’re looking for telemetry from people leaving it on. Is it performing well?
That’s their private preview. You have to explicitly opt in to that capability, and that means you’re limiting your audience. They’ll be able to look across all of the times they’ve done this and have an idea how many people they’re going to get. With that number of people, they’re monitoring the telemetry and asking: do we have enough people using it? If not, they might have to do another blog post, or find another way to get people in. Does the telemetry look good? Is this viable for the next stage?
The next stage after that might be a public preview, and in a public preview that feature flag becomes visible to everybody on the service. They put it out in a blog post saying, “This is now available to everybody. We’d like you to try it and kick the tyres.” So for people that want to go switch it on, it’s now easy.
They don’t have to email; it’s just there. They can go turn it on and see what it’s like. Some people just discover it; other people are reading the blogs and find it. And again, the team is looking at telemetry: do we have enough people using it? Do we need to promote it again? Do users care? Do they use this feature? Is it performing well? They’re looking at all of that data to decide whether they need to invest more in that feature, whether anybody cares about it, or whether they just need to stop. So it’s pivot, ante up or call; that’s your poker analogy. Is this feature viable?
If they get enough telemetry and everything looks good, the next stage is they turn on the feature for everybody on the service, but they leave the feature flag in place so people can turn it off. So they’re interested in that feedback: who’s turning it off? Why are they turning it off? Can we have a chat with them and find out what’s missing, what they need? You’re engaging with your customers. The engineering team that built that functionality is continuously engaging with real customers in production, either through telemetry or by actually talking to them, because you want to find out what’s going on.
I’ve turned off a feature before because I didn’t like it, and said so: this is bad; it’s not working for me. Tick the box, hit submit, and a couple of days later I’ve got a communication from the product manager for that feature, the person who’s managing it and bringing it through, basically asking: what does it need? What’s missing? Why do you not like it? All useful. And do we need to keep making changes to it before it goes live?
At any point in that story, you can pull that feature. If it’s performing badly, you can just make that feature flag disappear, and it’s gone for everybody.
And you can automate that as well: you could have automation that checks performance and automatically gets rid of that feature if it’s impacting the system. That should just be built into the system.
Then, once everybody’s got it turned on and people aren’t turning it off, everything’s generally okay. There are probably a few laggards; there are probably a few unhappy people when you change things. You’ve moved their cheese; that’s just the cost of doing business. The feature flag is on for everybody, so you start removing that feature flag from the system, and that feature’s just in production.
That could take a very long time. I know that when the Azure DevOps team did their boards, new boards replacing old boards, it took them nearly two and a half, nearly three years to get everybody over to the new boards, because they kept hitting missing functionality. They kept hitting performance problems, and then they had to rework the new boards again, and it took them a long time to get from “we need to replace the boards” to “it’s done”.
So, feature flags and how you enable them: they can be behind the scenes, or they can be something that you give actual users access to. This is one of the core practices that enable you to do continuous delivery to production while protecting your system, protecting your users, and getting the telemetry you need to understand whether you’ve built the right thing in production.