Multi-Account Access Advisor
MAY - AUG 2018
How do we use service access data to help cloud administrators discover and remedy vulnerabilities across all the accounts in their AWS organization?
Amazon Web Services
Identity and Access Services
UX Design Intern
Sole designer on a team with a PM and Lead Developer
Usability Testing, Data Presentation, Information Architecture
Many Amazon Web Services (AWS) customers will have multiple AWS accounts for different purposes (e.g. one for their developers to experiment in and test new features, and one for their consumer-facing product).
AWS Organizations is an AWS service that allows customers to link their many accounts together, logically group them into Organizational Units (OUs), and place restrictions on these groups or individual accounts. These restrictions are called Service Control Policies (SCPs) and they prevent accounts from accessing certain AWS services that could threaten account security. For example, there may be a service that has a "delete entire database" action in it. If a user were to accidentally trigger this in a developer sandbox account, it may not be a huge deal. But, doing so in the production account could be catastrophic for our customers' products.
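As a rough illustration of what such a restriction looks like, the sketch below builds a deny-style SCP document as a Python dictionary and prints it as JSON. The service prefix and action name (`exampledb:DeleteDatabase`) are hypothetical stand-ins for the "delete entire database" action described above, not a real AWS action.

```python
import json

# Minimal SCP sketch. "exampledb:DeleteDatabase" is a hypothetical
# action name used only for illustration.
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyDatabaseDeletion",
            "Effect": "Deny",
            "Action": ["exampledb:DeleteDatabase"],
            "Resource": "*",
        }
    ],
}

# Serialize to the JSON text a policy editor would expect.
scp_json = json.dumps(scp, indent=2)
print(scp_json)
```

Attached to the production account or its OU, a policy of this shape would block the action there while leaving the sandbox account unaffected.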
THE PROBLEM SPACE
Given these consequences, AWS encourages its customers to use the Least Permissive Model, meaning that any service an account does not need should be restricted by an SCP. However, customers had no way to see which services their accounts could access but were not using, so they could not scope their SCPs more tightly. I worked with my team to scope and design a product that gives customers visibility into their accounts' permissions usage and incentivizes them to use the Least Permissive Model in their Organizations.
Wants to be aware of what is happening in their AWS accounts
Wants to follow the Least Permissive Model
Wants to keep applications running smoothly
Has no way of viewing account activity at a high level
Customers have to log into each account and check the activity of individual users within that account to have a sense of what is occurring.
Creates over- or under-permissive policies
In an effort to follow the Least Permissive Model, administrators create overly strict SCPs that block developers from working until they have requested access to the tools they need. This friction often leads administrators to create overly permissive SCPs to avoid the issue in the future, which leaves their system vulnerable.
Is afraid to break applications
Blindly adding a more restrictive SCP to an account could remove its access to services or actions that it actually needs, which administrators cannot risk.
REQUIREMENTS - MVP
Conveniently view service and action activity within an AWS Organization at the OU, Account, and individual user levels
Easily identify where SCPs are over-permissive
Be alerted of suspicious activity within their organization
Receive warnings if removing access to services in use
Automatically receive SCP recommendations that permit only the necessary services and actions
FIRST CHALLENGE: LOCATION
When my PM presented the project to me, his plan was to make a brand new AWS service for this feature. I advised against this for two reasons:
AWS had over 150 services, many of which depended on one another. The current layout of services was daunting and did not make it clear how the services related to one another. I did not want to add to this confusion with yet another service.
SCP creation and management resided in the AWS Organizations service. I felt that if a customer gained insight on how to adjust their SCPs using our new feature, this information should be in proximity to where they can take action on it.
After meeting with my team and the AWS Organizations team, we all agreed that this feature should live in the Organizations console. This choice presented some extra challenges, as the Organizations console was scheduled to be redesigned using an updated design system in a few months. To prevent my design work from being scrapped in this update, I took on the task of designing the update in addition to adding the original project to the console.
First, I explored creating a tree visual of the Organization with indicators on OUs and accounts that had unused services, as well as on the SCPs attached to that section of the tree that grant access to those unused services. Customers can set the time range for the tracking period, and my PM suggested 90 days as the default threshold for "a long time" without a service being used.
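The "unused within the tracking period" rule can be sketched as follows. This is a minimal illustration, not the shipped logic: the function name, the shape of the `last_accessed` mapping, and the example services and dates are all assumptions; only the 90-day default comes from the text above.

```python
from datetime import datetime, timedelta

DEFAULT_TRACKING_DAYS = 90  # default tracking period suggested by the PM

def unused_services(last_accessed, now, tracking_days=DEFAULT_TRACKING_DAYS):
    """Return permitted services with no access inside the tracking period.

    last_accessed maps a service name to the datetime of its most recent
    use, or None if no use was ever recorded (hypothetical data shape).
    """
    cutoff = now - timedelta(days=tracking_days)
    return sorted(
        service
        for service, when in last_accessed.items()
        if when is None or when < cutoff
    )

# Made-up example data for one account.
now = datetime(2018, 8, 1)
last_accessed = {
    "s3": datetime(2018, 7, 20),       # used recently
    "dynamodb": datetime(2018, 3, 1),  # last used before the cutoff
    "sqs": None,                       # never used
}
flagged = unused_services(last_accessed, now)
print(flagged)
```

An OU or account would receive an indicator in the tree whenever this list is non-empty for it.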
Next, I tried for a design more consistent with the AWS design language by using tables for access data within the OUs and account details. In this example, I also explored how a customer would be able to drill into service usage and see if any actions were unused during the tracking period to be scoped out.
We decided to go with the table designs since they were most consistent with the available AWS component system. However, this direction came with its own challenges, since it is not intuitive to view tree structures in a table format and access information easily gets buried. In order to surface information while also catching all over-permitted entities, I created two tables with action usage data - one for OUs that starts with the shallowest entities and gets deeper, and one with just accounts for maximum granularity. Any entities that have not used all the actions they have access to across their permitted services are highlighted for further review.
Customers can see a more detailed table of permissions usage of services and actions by clicking into the OU or account of interest. For example, the following could be a permissions access table for an OU of developer accounts.
I also added access information on the SCP list so that customers could audit from the policy side too. For example, a customer has a policy that grants access to a service but none of its attachees uses that service. Instead of adding new SCPs to each one of the entities it affects to scope that permission out, the customer can edit the existing SCP to exclude the service.
Then my team's developer realized we had hit a hitch with our available dataset: we could not track the usage of many AWS actions and even some entire services. Consider, for example, a service with 100 actions total, of which only 50 are trackable. An account with permissions to the whole service would display permissions access data of only 50/100 actions accessed even if the account uses all 100 actions.
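The arithmetic behind this problem can be reproduced in a few lines. The sketch below uses made-up action names and a hypothetical `accessed_label` helper; it only mirrors the 100-action example above.

```python
# Hypothetical service with 100 actions; only the first 50 are trackable.
all_actions = {f"svc:Action{i}" for i in range(100)}
trackable = {f"svc:Action{i}" for i in range(50)}

permitted = set(all_actions)      # the account may use the whole service
actually_used = set(all_actions)  # and really does use every action

# The dataset can only ever observe usage of trackable actions.
observed = actually_used & trackable

def accessed_label(observed, permitted):
    """Naive 'accessed/permitted' label, as in the original #/## column."""
    return f"{len(observed & permitted)}/{len(permitted)}"

label = accessed_label(observed, permitted)
print(label)  # 50/100, although the account uses all 100 actions
```

The label understates real usage by exactly the number of untrackable actions, which is why the naive ratio could mislead customers.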
My team and I decided it was most important to be as upfront and honest about the data we were showing customers as possible. We did not want to cause undue panic or give customers a false impression of their Org activity. To this end, we replaced the Accessed Actions column and #/## styling with two separate columns with a link to documentation with the technical qualifications of the data. Additionally, we called out any services where tracking was unavailable.
USABILITY TESTING AND DESIGN REVIEW
I made a fuller prototype of my designs and I wrote and conducted a usability study with over five IT professionals familiar with permissions management. Here are some examples of the tasks we gave users to measure my design's effectiveness:
View the activity in your Organization
View the action-level activity in your Production OU
Scope down a policy to be more restrictive
METRICS FOR DESIGN SUCCESS
Navigates to the AWS Organizations console
Easily navigates through the Org to the Production OU, identifies Permissions Access table, navigates to the action-level information, understands the data shown and feels it is sufficient for gauging activity
Easily navigates through Policies pages, identifies policies that are over-permissive, able to verbalize which ways they would restrict a policy based on over-permissions
RESULTS - WHAT WENT WELL
All users found value in having access to Organization-wide data.
Most users were able to understand the data presented in the Permissions Access tables at a high level.
When participants were asked to complete action-oriented goals, like editing policies, they were able to verbalize how they would apply the data they saw to inform the decisions they made.
RESULTS - WHAT DID NOT GO WELL
Navigation between pages took a long time for users. Many took time to read over and understand all data in every data table even if it was not relevant to completing their goal before navigating to a new page.
Participants did not understand the lower-level differences between actions that were trackable or not. This distinction on service-level tables confused them. Even when they did understand, it did not seem to provide goal-related value to them.
Certain table manipulation functionalities, such as complex filtering and timeframe adjustment, went largely untouched.
When given the first prompt to check on Organization activity, most users clicked on the service AWS IAM, which deals with permissions for users within one AWS account. This might have meant that customers expect activity data to reside in this service, or that they are accustomed to only having user-level data to go on.
I also reviewed this project with the central AWS design team to get feedback and advice on my designs. They suggested that I add clearer calls to action to help customers understand why the information we are surfacing is useful to them.
Based on the results of the usability study and the feedback I received from the central design team, I made the following improvements:
Simplified the data shown to improve customer understanding
It was important to make sure customers understood the limits to the data we were showing them, but our current presentation left them confused and with false impressions. Rather than showing an incomplete dataset on usage, I refocused the data presentation on the valuable information that we can show them: that they have permitted actions going unaccessed. This way, any number > 0 in this column serves as an indicator that there is opportunity to scope permissions.
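One way to compute such a column is sketched below. It assumes the count is restricted to trackable actions so that untrackable ones never register as unaccessed; that restriction, the function names, and the example data are assumptions for illustration, not the shipped logic.

```python
def unaccessed_permitted(permitted, trackable, observed):
    """Permitted actions that are trackable but were never observed in use."""
    return (permitted & trackable) - observed

def needs_review(permitted, trackable, observed):
    """Any count above zero signals an opportunity to scope permissions."""
    return len(unaccessed_permitted(permitted, trackable, observed)) > 0

# Hypothetical service: 100 actions, of which the first 50 are trackable.
permitted = {f"svc:Action{i}" for i in range(100)}
trackable = {f"svc:Action{i}" for i in range(50)}

# Account A was observed using every trackable action.
observed_a = set(trackable)
# Account B was observed using only 45 of the trackable actions.
observed_b = {f"svc:Action{i}" for i in range(45)}

count_a = len(unaccessed_permitted(permitted, trackable, observed_a))
count_b = len(unaccessed_permitted(permitted, trackable, observed_b))
print(count_a, needs_review(permitted, trackable, observed_a))
print(count_b, needs_review(permitted, trackable, observed_b))
```

Under this assumption, an account that uses everything the dataset can see shows zero, and only genuinely observable gaps are flagged.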
Additionally, I removed low-priority data, such as Region Last Accessed, from the table and replaced the time filters with a set range of 90 days. Users could add back the data and change the tracking period via settings.
Created info banners as calls to action
I experimented with varying levels of security threat explanations and action urgency, with dismissible banners and explicit tables for actionable issues.
Restructured the OU and Policy details pages
I reorganized the long scrolling of pages into tabbed sections based on information categories.
I presented my designs to senior-level managers and received positive feedback on the project. The managers were particularly happy with my decision to place the features within the AWS Organizations product, and commented that this was a start to the discussion of better organizing the AWS permissions management features - they also felt that the number of services was confusing and that AWS did not clearly explain the relationships between them.
The project was launched publicly a year later. Interestingly, my team chose to put this permissions access information into the IAM service. This was a good choice, since the usability studies showed that customers associate IAM with analyzing services access data and it is a step in the right direction to consolidating permissions management features.