Foundations of Azure Governance

Cloud adoption continues to grow at a rapid pace.  It is truly exciting to see the pace of innovation, the increased capabilities, and the lowering bar of entry that lets companies of any size have truly world-class technology solutions.  A move to the public cloud brings some great core enhancements, such as agility and near-limitless elasticity, when compared to many private cloud or on-premises scenarios, but it can also introduce new challenges or risks that grow quickly with that same elasticity.  For example, cost management, security, and overall platform management frequently come up as stumbling points for many organizations in their cloud journey.  These issues are not just what I am seeing with the customers I work with; they were also some of the primary conclusions highlighted in the Cloud Computing Trends: 2019 State of the Cloud Survey recently published by RightScale.

With that in mind, I wanted to address some of the points I think are critical to setting the right foundation for Azure governance.  When discussing Azure governance I tend to focus on five key areas:

  • Subscription / Resource Group / Resource Organization
  • Securing Objects (RBAC)
  • Tagging
  • Policies
  • Locks

In this post, I will briefly cover each of these topics, and then post some good external references.  I also have a few follow-up posts queued up to dive into the details and will add those links as the material is posted.

When To Implement Governance?

The absolute best time to think about, plan, and implement a cloud governance strategy is before a single service is provisioned.  Unfortunately, that is not how IT initiatives often work.  Many organizations start with informal pilots or “Proof of Concept” (POC) systems that are informally shepherded into production, followed by additional services that expand organically over time.  With that in mind, I think the best answer is to make governance a priority now.  If needed, establish a formal project for cloud service optimization.

It is also important to understand that governance is, by nature, an ongoing activity.  As the needs of the organization change and the services provided by the platform continue to change, the governance approach and policies should also evolve and be reviewed periodically.

Organization – Subscription / Resource Group / Resource

Deciding how to organize your subscriptions, resource groups, and the individual resources may seem pretty arbitrary, but the decisions can have a really big impact on your ability to secure and manage the resources over time.

It is a big topic, and one that I will devote an entire post to soon.  As you read the rest of the post you will see how the organization decisions impact, or are impacted by, the decisions for the other foundational points addressed here.

Here are some organization considerations:

  • Service/Application Lifecycle:  Group services that you deploy and update together
  • Ownership Scope:  Consider the ownership of the service or application (division, department, etc).
  • Environment:  Consider the segmentation of services for dev, test, staging, and production and how this maps to the security assignments.
  • Policy Scope:  Consider the policies that may be assigned.
  • Location:  Consider which region(s) the services will run in.  While a given resource group can contain services running in different regions, it may be beneficial for management purposes to structure your organization hierarchy to better support reporting and automation requirements.
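
As a rough illustration of the lifecycle and environment considerations above, here is a minimal Python sketch that groups a planned set of services into proposed resource groups.  Everything in it is an assumption made for illustration: the service names, owners, regions, and the rg-<application>-<environment> naming pattern are hypothetical, and nothing here calls Azure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedService:
    """One service to deploy; all field values below are hypothetical examples."""
    name: str
    application: str   # lifecycle: services deployed and updated together
    owner: str         # ownership scope, e.g. a department
    environment: str   # dev, test, staging, or prod
    region: str        # Azure region the service will run in


def plan_resource_groups(services):
    """Group services by (application, environment) into proposed resource groups."""
    groups = defaultdict(list)
    for svc in services:
        # Hypothetical naming pattern: rg-<application>-<environment>
        groups[f"rg-{svc.application}-{svc.environment}"].append(svc.name)
    return dict(groups)


services = [
    PlannedService("web-frontend", "shop", "retail", "prod", "eastus"),
    PlannedService("orders-db", "shop", "retail", "prod", "eastus"),
    PlannedService("web-frontend-dev", "shop", "retail", "dev", "eastus"),
    PlannedService("payroll-api", "payroll", "finance", "prod", "westeurope"),
]

plan = plan_resource_groups(services)
for group_name, members in sorted(plan.items()):
    print(group_name, members)
```

Grouping by application and environment is only one possible choice; ownership, policy scope, or region could just as easily be part of the grouping key if those drive your security or reporting needs.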

MS Docs References:  https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-overview

Securing Objects (RBAC)

Resources deployed through Azure Resource Manager (as opposed to the classic deployment model) are secured using role-based access control (RBAC).  RBAC provides a standard set of built-in roles that can be applied to the objects within your Azure tenant.

These roles can be applied to any object including Subscriptions, Resource Groups and the underlying Resources.  These considerations therefore should be a core part of the decisions for how to organize Subscriptions, Resource Groups, and Resources as previously covered.

When I work with companies to define their foundational design I consider these goals:

  • Provide the minimum access needed with a least-privilege approach.
  • Manage roles at the highest object level possible.

Least Privilege Approach

Simply put, only provide access to the objects and actions that are needed by the account.  For those not familiar with the principle of least privilege, the Wikipedia article on the topic gives a good definition.

Thinking about how that applies to our Azure environment, it means you want to associate the user with the right role on the right object(s).  When not followed, we might see a large number of users added as administrators or contributors to an entire subscription.  This scenario is very risky and can lead to either malicious or accidental changes that negatively impact production services.

Where to Apply RBAC

The decision of where to apply RBAC role assignments is a great example of where a little planning can go a long way.  Establish a default assignment behavior, and be consistent.  Generally speaking, I look to apply those assignments at the highest level possible.  Normally, I like to organize resources around a fairly granular set of resource groups.  In this case, I would define my global administrators as well as other globally scoped roles such as policy administrators at the subscription level.  Then at the resource group level, I would define the resource group owners, contributors, readers, and more granular resource-type roles.
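
To make the effect of assignment scope concrete, here is a small Python sketch of how a role assigned at a parent scope carries down to the objects beneath it.  It is a simplified local model, not the Azure API: the principals, the subscription ID "sub-1", and the resource names are hypothetical, and it ignores details such as deny assignments and management groups.

```python
SUB = "/subscriptions/sub-1"
RG = SUB + "/resourceGroups/rg-shop-prod"
SITE = RG + "/providers/Microsoft.Web/sites/shop-web"

# (principal, role, scope) tuples; principals and scopes are hypothetical.
assignments = [
    ("platform-admins", "Owner", SUB),
    ("policy-admins", "Resource Policy Contributor", SUB),
    ("shop-team", "Contributor", RG),
    ("auditors", "Reader", RG),
]


def effective_roles(assignments, principal, scope):
    """Return the roles a principal holds at `scope`, including inherited ones.

    A role assigned at a parent scope applies to every child scope beneath it.
    """
    target = scope.lower().rstrip("/")
    roles = set()
    for who, role, assigned_scope in assignments:
        parent = assigned_scope.lower().rstrip("/")
        # Match the scope itself or anything nested under it.
        inherited = target == parent or target.startswith(parent + "/")
        if who == principal and inherited:
            roles.add(role)
    return roles


print(effective_roles(assignments, "shop-team", SITE))        # inherited from the resource group
print(effective_roles(assignments, "shop-team", SUB))         # nothing at the subscription level
print(effective_roles(assignments, "platform-admins", SITE))  # inherited from the subscription
```

Because assignments flow downward, a single Contributor assignment on the resource group covers every resource inside it, which is why a granular resource group layout keeps the number of assignments small without widening access.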

MS Docs References: https://docs.microsoft.com/en-us/azure/role-based-access-control/overview

Tagging

With Tagging, Microsoft provides administrators with a logical way to identify and associate metadata with resources.  This metadata forms a taxonomy that can be flexible and evolve over time.

Uses for Tags:

  • Leverage in your resource and cost management reporting.
  • Drive criteria (conditional logic) in automation scripts.
  • Drive navigation links between related resources.

Tagging is another topic where I draw from my extensive experience with content management on the SharePoint platform.  As with SharePoint, the more items you have, the more important it becomes to leverage this flexible metadata.

This is a topic I put a lot of focus on and one I see too many companies struggle to adopt with consistency.  To help address that, I look to apply Policies related to tags.  Policies are addressed in the next section.

It is important to understand that Tags can be applied to both resource groups and the underlying resources.  However, if you apply the tag at the resource group level, it is not inherited by the resources contained in the group.  This can limit the usefulness of the tags in reports and scripts that work at the resource level.  This is also an important example of how resource organization may be important.  For example, if you have a Cost Center tag applied to a Resource Group, then all of the underlying resources should be associated with that Cost Center.
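
The sketch below shows the gap in plain Python: a resource group carries a Cost Center tag, but a resource inside it does not, so a resource-level report would miss it.  The inventory structure, names, and tag values are hypothetical, and the helper that copies the tag down is only an illustration of the idea, not an Azure feature or API call.  For simplicity it matches tag names exactly.

```python
import copy

# Hypothetical inventory: group tags and per-resource tags are stored separately,
# mirroring the fact that a group's tags are not inherited by its resources.
inventory = {
    "rg-shop-prod": {
        "tags": {"CostCenter": "1234"},
        "resources": {
            "shop-web": {},
            "shop-db": {"CostCenter": "1234"},
        },
    },
}


def resources_missing_tag(resource_groups, tag_name):
    """List (group, resource) pairs where the resource itself lacks `tag_name`."""
    missing = []
    for group_name, group in resource_groups.items():
        for resource_name, resource_tags in group["resources"].items():
            if tag_name not in resource_tags:
                missing.append((group_name, resource_name))
    return missing


def copy_group_tag_to_resources(resource_groups, tag_name):
    """Return a copy where each resource missing `tag_name` gets its group's value."""
    updated = copy.deepcopy(resource_groups)
    for group in updated.values():
        if tag_name in group["tags"]:
            for resource_tags in group["resources"].values():
                # setdefault keeps any value already set on the resource.
                resource_tags.setdefault(tag_name, group["tags"][tag_name])
    return updated


print(resources_missing_tag(inventory, "CostCenter"))
fixed = copy_group_tag_to_resources(inventory, "CostCenter")
print(resources_missing_tag(fixed, "CostCenter"))
```

Copying the group value down only makes sense when every resource in the group really does belong to the same Cost Center, which is one more reason to settle the resource group layout first.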

You can find some more detailed thoughts in the article:  Azure Resource Tagging Overview

MS Docs References: https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-using-tags

Policies

The Azure Policy service and underlying policies can be used to help support your overall governance efforts.  While some administrators have a negative impression of policies and see them as limiting, they are not necessarily there to slow you down or prevent innovation.  I generally lump policies into three key categories:

  1. Compliance Requirements:  Rules that monitor things that have to be done to satisfy the organization’s compliance and regulation requirements.
  2. Company Requirements:  Rules that monitor things that have to be done to satisfy the organization’s internal requirements.
  3. Good Hygiene Requirements:  Rules that monitor general best practices or help to ensure that the platform will be easy to maintain.

While I find the overall use of policies inconsistent and lightly applied, I definitely think more organizations should consider that last category to help promote good hygiene.  A great example of this is enforcing the use of tags.  Plans to leverage tags in your Cost Management reports or automation scripts quickly fail when tags are not properly applied to the right objects.  Policies can quickly address this, and make it easier to find configuration or metadata problems that could be lurking in your tenant.
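
To give a feel for what a tag-enforcement rule looks like, here is a hedged Python sketch.  The dictionary follows my understanding of the documented Azure Policy definition structure (an "if" condition on a tag field and a "then" effect), but treat the exact shape as an assumption and check it against the policy documentation before use.  The evaluate function is a toy written for this post that only understands this one rule shape; it is not how the Azure Policy service is implemented.

```python
import json


def require_tag_policy(tag_name):
    """Build a policy definition body that denies resources missing `tag_name`."""
    return {
        "mode": "Indexed",
        "policyRule": {
            "if": {"field": f"tags['{tag_name}']", "exists": "false"},
            "then": {"effect": "deny"},
        },
    }


def evaluate(policy, resource_tags):
    """Toy evaluator: return the effect if the rule matches, otherwise None."""
    condition = policy["policyRule"]["if"]
    # Pull the tag name back out of the "tags['<name>']" field expression.
    tag_name = condition["field"][len("tags['"):-len("']")]
    tag_exists = tag_name in resource_tags
    wants_exists = condition["exists"] == "true"
    if tag_exists == wants_exists:
        return policy["policyRule"]["then"]["effect"]
    return None


policy = require_tag_policy("CostCenter")
print(json.dumps(policy, indent=2))
print(evaluate(policy, {}))                        # untagged resource
print(evaluate(policy, {"CostCenter": "1234"}))    # tagged resource
```

Starting with an audit-style effect instead of deny is a gentler way to roll a rule like this out, since it reports problems without blocking deployments.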

MS Docs References:  https://docs.microsoft.com/en-us/azure/governance/policy/overview

MS Azure Policy GitHub Repo: https://github.com/Azure/azure-policy

Locks

Simply put, and as the name implies, Locks allow administrators to prevent bad things from happening to the objects they are assigned to.  Locks can be assigned to a subscription, resource group, or an individual resource and provide an additional protection layer (on top of RBAC security) to prevent accidental changes or deletions.

The locks come in two flavors:

  • CanNotDelete:  Authorized users can make changes, but cannot delete the object.
  • ReadOnly:  No changes can be made.

Users with permission to manage locks (among the built-in roles, Owner and User Access Administrator) can still remove a lock, but having that extra step greatly reduces accidental changes or deletions.

Where to apply these locks, and which option to use will depend on the criticality of the resource and how often it needs to change.

Take core network or firewall resources as an example: these should remain pretty static and should be tightly managed for both security and availability reasons.  This is a great use case for the ReadOnly lock.  Thinking about the scope of the locks, I would like to draw a quick dotted line back to the Organization topic.  Applying a lock to the uppermost object is easiest and simplifies management compared with setting and managing the lock at the granular resource level.

Another example would be protecting a standard compute resource such as a VM or Azure Web App that may be pretty static.  Creating a CanNotDelete lock will ensure that a stray click or script does not accidentally remove the resource.
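
The two examples above can be modeled in a few lines of Python.  This is a simplified local sketch, not the Azure API: the scopes and resource names are hypothetical, and it reduces operations to read, write, and delete.  It assumes, as the platform documentation describes, that a lock on a parent scope is inherited by everything beneath it and that the most restrictive inherited lock wins.

```python
SUB = "/subscriptions/sub-1"
NETWORK_RG = SUB + "/resourceGroups/rg-network-prod"
FIREWALL = NETWORK_RG + "/providers/Microsoft.Network/azureFirewalls/fw-core"
VM = SUB + "/resourceGroups/rg-shop-prod/providers/Microsoft.Compute/virtualMachines/shop-vm"

# Hypothetical lock assignments: scope -> lock level.
locks = {
    NETWORK_RG: "ReadOnly",   # one lock on the group covers all network resources
    VM: "CanNotDelete",       # resource-level lock on a single VM
}

LOCK_STRICTNESS = {"CanNotDelete": 1, "ReadOnly": 2}


def effective_lock(locks, scope):
    """Return the most restrictive lock applying to `scope`, or None."""
    target = scope.lower().rstrip("/")
    applicable = []
    for locked_scope, level in locks.items():
        parent = locked_scope.lower().rstrip("/")
        # A lock applies to its own scope and to everything nested under it.
        if target == parent or target.startswith(parent + "/"):
            applicable.append(level)
    return max(applicable, key=LOCK_STRICTNESS.get, default=None)


def is_allowed(locks, scope, operation):
    """Check a simplified operation ('read', 'write', or 'delete') against locks."""
    lock = effective_lock(locks, scope)
    if lock == "ReadOnly":
        return operation == "read"
    if lock == "CanNotDelete":
        return operation != "delete"
    return True


print(effective_lock(locks, FIREWALL))       # inherited from the resource group
print(is_allowed(locks, FIREWALL, "write"))  # blocked by ReadOnly
print(is_allowed(locks, VM, "write"))        # CanNotDelete still allows changes
print(is_allowed(locks, VM, "delete"))       # but not deletion
```

Locks are evaluated separately from RBAC, so in this model a user could hold a Contributor role on the VM and still be stopped from deleting it until someone with lock permissions removes the lock.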

MS Docs References:  https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-lock-resources

Wrap Up

As you can probably see, each of these foundational points can have an overlapping impact on the other areas, making it important that you think through the decisions, test the implementation, validate the outcomes, and iterate.
