The Problem With Feature Flags
By Noam Yadgar
What is a Feature-Flag?
A feature-flag service is essentially a configuration server that lets your applications read configuration values (i.e., feature flags or toggles) in real time. For instance, if you’ve deployed a new feature that spans your stack, you’ll want every microservice involved to be switched over to it by a single flag. This approach lets your teams deploy changes at their own pace while ensuring the new feature becomes available only when all relevant services are ready to support it.
flowchart LR
fe(Frontend)
srv(Webserver)
fe --> |GET /feature_a|ff(FeatureFlags)
srv --> |GET /feature_a|ff(FeatureFlags):::ffc
ff -.-> |if true => Render button| fe
ff -.-> |if true => Expose API| srv
classDef ffc fill:#9bf,stroke:#68c;
This example, while arguably one of the best use cases for a feature flag, already presents an inherent issue concerning its lifecycle once the flag is no longer needed.
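The flow in the diagram can be sketched in a few lines of Python. This is only an illustration, not a real client: `FLAG_STORE`, `get_flag`, `frontend_render` and `webserver_routes` are hypothetical names, and the in-memory dictionary stands in for the remote feature-flag service.

```python
# Hypothetical in-memory stand-in for the remote feature-flag service.
FLAG_STORE = {"feature_a": False}


def get_flag(name):
    """Simulates GET /feature_a; unknown flags default to off."""
    return FLAG_STORE.get(name, False)


def frontend_render():
    """Frontend: render the button only while the flag is on."""
    widgets = ["header"]
    if get_flag("feature_a"):
        widgets.append("feature_a_button")
    return widgets


def webserver_routes():
    """Webserver: expose the API only while the flag is on."""
    routes = ["/health"]
    if get_flag("feature_a"):
        routes.append("/api/feature_a")
    return routes
```

Both services ship their new code behind the same flag, so flipping `FLAG_STORE["feature_a"]` to `True` turns on the button and the API together. Note that both conditional branches remain in the code after the flip.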
A temporary fixture? Or part of the architecture?
Temporary fixtures are an essential part of engineering. For instance, if a main road upgrade necessitates blocking the existing road during construction, engineers might build a temporary bypass. Naturally, this temporary road will be removed once construction is complete. Therefore, a road upgrade project could involve:
- Building and maintaining a temporary road
- Upgrading the main road
- Removing the temporary road
A temporary flag
In our example, once all services are ready to support the new feature, the flag is switched on, and that very act makes it obsolete. Removing a feature flag is a gradual process: services that use the flag are updated with new code that no longer relies on it. Only when no service depends on the flag can it be safely deleted from the feature-flag service.
Implementing a temporary feature flag will inevitably require additional effort for its deprecation. A poorly deprecated flag can lead to various undesirable side effects, including:
- Unnecessary API calls to the feature-flag service.
- ‘Rotten’ code—code written around the feature flag’s conditional block when it should ideally replace the deprecated functionality.
- Unnecessary code branches and workarounds that could negatively impact the overall solution.
- Premature deletion of the flag, causing unintended issues.
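One way to guard against the last item, premature deletion, is to track which services still read a flag and refuse to delete it until none remain. The sketch below assumes such bookkeeping exists; `FlagRegistry` and its methods are hypothetical and do not correspond to any particular feature-flag product.

```python
class FlagRegistry:
    """Hypothetical registry of which services still read each flag."""

    def __init__(self):
        self._consumers = {}  # flag name -> set of service names

    def register(self, flag, service):
        """Record that a service reads the flag."""
        self._consumers.setdefault(flag, set()).add(service)

    def unregister(self, flag, service):
        """Record that a service shipped code that no longer reads the flag."""
        self._consumers.get(flag, set()).discard(service)

    def delete_flag(self, flag):
        """Delete the flag only if no service depends on it anymore."""
        remaining = self._consumers.get(flag, set())
        if remaining:
            raise RuntimeError(f"{flag} is still used by: {sorted(remaining)}")
        self._consumers.pop(flag, None)
```

Even in this toy form, the deprecation work is visible: every consumer has to ship a change and report it before the flag can go away.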
Whenever a temporary flag is considered as a solution, it’s worthwhile to evaluate alternative, more permanent approaches that could integrate with or even enhance your software architecture. Referring to the flowchart above, we observe a clear direction where the frontend relies on an API provided by the webserver. A potential solution involves utilizing Semantic Versioning alongside a service discovery mechanism. This allows the frontend to render markup based on the availability of specific webserver versions.
sequenceDiagram
participant fe as Frontend
participant sr as Service Registry
fe ->> sr : Get /webserver/1.x.x
sr -->> fe : /webserver/1.2.3
alt >= 1.2.x
fe ->> fe : Render button
end
This particular solution can be integrated directly into the architecture (especially within a microservices environment). While a permanent solution might initially appear more time-consuming and complex than a simple temporary feature-flag, factoring in the flag’s deprecation process often shifts the total effort in favor of the permanent approach. A significant advantage of this solution is that, unlike a specific feature-flag, it fully liberates the backend and frontend teams to deploy updates at their own pace, as it’s not tied to any single feature.
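A minimal sketch of the version check on the frontend side follows. It assumes the service registry returns a plain `MAJOR.MINOR.PATCH` string; `SERVICE_REGISTRY`, `discover` and `should_render_button` are hypothetical names, and pre-release or build metadata is deliberately not handled.

```python
def parse_version(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


# Hypothetical stand-in for the service registry's response.
SERVICE_REGISTRY = {"webserver": "1.2.3"}


def discover(service):
    """Simulates asking the registry which version of a service is live."""
    return parse_version(SERVICE_REGISTRY[service])


def should_render_button(min_version="1.2.0"):
    """Render only if the webserver is on the same major version and at
    least the minimum version that ships the new API."""
    current = discover("webserver")
    required = parse_version(min_version)
    return current[0] == required[0] and current >= required
```

The frontend never asks about a specific feature here. It only asks which webserver version is live, so the same check keeps working for every later feature without anything to clean up afterwards.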
Part of the architecture
From a technical standpoint, a feature-flag service can be positioned within the data tier, thereby becoming an integral component of your system architecture. Consider, for example, Alex, an employee at the Bank of the World (BOW), and Beth, who works at the Modern Medical Facility (MMF). Both organizations are your customers. However, BOW holds a Standard Subscription, while MMF has purchased the ProPackage.
Upon logging into your system, Alex is identified as a BOW employee. Based on BOW’s customer ID, your services retrieve the PRO_PACKAGE_ENABLED flag from the feature-flag service. A similar process occurs for Beth. Alex’s session will have the flag set to false, whereas Beth will experience the full suite of features, with the flag set to true. This approach is simple and effective, but is it truly necessary?
sequenceDiagram
actor user as User
participant app as App
participant ff as FeatureFlags
participant Database@{ "type" : "database" }
user ->> app : /login
app ->> Database : SELECT customer ...
Database -->> app : customer_id
app ->> ff : /pro_package_enabled?customer_id=$customer_id
ff -->> app : PRO_PACKAGE_ENABLED
Companies adopt different strategies, and user-based flags are occasionally deemed necessary, particularly for non-technical teams that demonstrate the product and switch between modes. However, feature flags used for user-based configuration can almost always be replaced by a more robust solution: integrating the configuration directly into the database schema. Incorporating these configurations into the data model offers two primary benefits:
- Improved Data Model Cohesion: Configurations are baked directly into the data model rather than being split and coordinated between two separate services. This establishes a stronger connection between entities through standard database features like relations and foreign keys.
- Reduced Network Communication: The feature-flag service operates remotely, with its own dedicated stack. If your application can retrieve the customer ID from its primary database, it can most likely fetch the customer’s configurations within the same transaction, thereby minimizing network calls.
sequenceDiagram
actor user as User
participant app as App
participant Database@{ "type" : "database" }
user ->> app : /login
app ->> Database : SELECT customer ... JOIN customer_configs ...
Database -->> app : customer_id, ... subscription_type
While your sales team might need to utilize multiple accounts to demo the product, your R&D teams will manage these configurations in a substantially more robust and maintainable manner.
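As a rough sketch of the schema-based approach, the example below uses Python's built-in `sqlite3` module with an in-memory database. The table and column names (`customers`, `customer_configs`, `subscription_type`) follow the diagram, but the exact schema, the sample rows and the `login` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Configuration lives next to the customer, linked by a foreign key.
    CREATE TABLE customer_configs (
        customer_id       INTEGER PRIMARY KEY REFERENCES customers(id),
        subscription_type TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'BOW'), (2, 'MMF');
    INSERT INTO customer_configs VALUES (1, 'standard'), (2, 'pro');
    """
)


def login(customer_name):
    """Fetch the customer and its configuration in a single query."""
    row = conn.execute(
        """
        SELECT c.id, cfg.subscription_type
        FROM customers AS c
        JOIN customer_configs AS cfg ON cfg.customer_id = c.id
        WHERE c.name = ?
        """,
        (customer_name,),
    ).fetchone()
    if row is None:
        raise LookupError(f"unknown customer: {customer_name}")
    customer_id, subscription_type = row
    return {
        "customer_id": customer_id,
        "pro_package_enabled": subscription_type == "pro",
    }
```

Calling `login("BOW")` and `login("MMF")` yields the same per-session information the flag provided, with one database round trip and no second service to keep in sync.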
Should we always avoid feature flags?
Feature flags are not inherently bad. In certain rare scenarios, a feature flag may be the most appropriate solution, particularly once business constraints are taken into account. One such case is a process that maintains critical state and must not experience any downtime.
Even though the current mainstream approach prioritizes fault tolerance over crash prevention and microservices over monoliths, certain processes hold critical state that makes them difficult to redeploy. Furthermore, deploying some processes can be so time- and energy-consuming that the cost of redeploying them for a configuration change becomes prohibitive.
Feature flags enable these processes to acquire real-time configurations seamlessly, ensuring continuous operation. While any database can facilitate this, maintaining a clear distinction between the application’s data schema and its configuration map could enhance security and separation of concerns. In this context, the feature flag service can serve as a control board for setting and toggling various configurations.
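The idea can be sketched as a long-lived worker that reads its configuration on every unit of work instead of once at startup. `ConfigSource` is a hypothetical in-memory stand-in for the feature-flag service, and the `verbose_output` key is invented for the example.

```python
class ConfigSource:
    """Hypothetical stand-in for the feature-flag service's control board."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, key, default=None):
        return self._values.get(key, default)

    def set(self, key, value):
        self._values[key] = value


class Worker:
    """Long-lived process whose in-memory state must survive config changes."""

    def __init__(self, config):
        self.config = config
        self.processed = 0  # critical state that a redeploy would reset

    def handle(self, payload):
        self.processed += 1
        # The flag is read on every call, so a toggle takes effect
        # immediately without restarting the process.
        if self.config.get("verbose_output", False):
            return f"#{self.processed}: {payload}"
        return payload
```

Toggling `verbose_output` through `ConfigSource.set` changes the worker's behavior on the next call while `processed` keeps counting, which is the property a redeploy could not offer.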