The Relativity of Wrong: shadow IT

The value of automation in the DataCenter

Everyone is aware of the value of the automation.
Many companies and individual engineers implemented various ways to save time, from shell scripts to complex programs and to fully automated IaaS solutions.

It helps reducing the so called "Shadow IT", a phenomenon that happens when developers can't get a fast enough response from the IT of the company and rush to the public cloud to get what they need. Doing that they complete and release their project soon, but sometimes troubles start with the production phase of the deployment (unexpected additional budget for the IT, new technologies that they are not ready to manage, etc.).

shadow IT happens when corporate IT is not fast enough

For sure, some departments are organized in silos (a team responsible for servers, one for storage, one for networking, one for virtual machines, of course one for security...) and the provisioning of even simple requests takes too long.

process inefficiency due to silos and wait time

Pressure on the infrastructure managers

So there is inefficiency in the company, that affects the business outcome of every project.
Longer time to market for strategic initiatives, higher costs for infrastructure and people.
Finger pointing starts, to identify who is responsible for the bottleneck.

The efficiency of teams and individuals is questioned, and responsibility is cascaded through the organization from project managers to developers, to the server team, to the storage team and generally the network is at the end of the chain... so that they have no one else to blame.

Those on the top (they consider themselves on top of the value chain) believe - or try to demonstrate - that their work is slowed down by the inefficiency of the teams they depend on. They try to suggest solutions like: "you said that your infrastructure is programmable, now give me your API and I will create everything I need on demand".

Of course this approach could bring some value (not much, as we'll see in the rest of the post) but it mines the relevance of the specialists teams that are supposed to manage the infrastructure according to best practices, to apply architectural blueprints that have been optimized for the company's specific business, to know the technology in deeper detail.
So they can't accept to be bypassed by a bunch of developers that want to corrupt the system playing with precious assets with their dirty hands.

The definitive question is: who owns the automation?
Should it be left to people that know what they need (e.g. Developers)?
Should it be owned by people that know how technology works, and at the end of the day are responsible for the SLA including performances, security and reliability that could be affected by a configuration made by others (i.e. IT Administrators)?

In my opinion, and based on the experience shared with many customers, the second answer is the correct one.
By definition the developer is not an expert on security: if he can easily program a switch via its REST API to get a network segment, it’s not the same when traffic needs to be secured and inspected.

The IT Admin patrolling the infrastructure

Offering a self service catalog (or API)

A first, immediate solution could be the introduction of an easy automation tool like Cisco UCS Director, that manages almost every element in a multi vendor Data Center infrastructure: from servers to networks to storage to virtualization in a single dashboard. But what is more interesting is that every atomic action you do in the GUI is also reflected in a task in the automation library, that allows you to create custom workflows lining all the tasks for a process that you want to automate.
A common example of automation workflow is the creation of a 4-hypervisors server farm.
A single workflow starts from the SAN storage creating a volume and 4 LUN, where the hypervisor will be installed to enable remote boot for the servers. Then a network is created (or the existing management network will be used) and 4 Service Profiles (the definition of a server in Cisco UCS) are created from a template, with individual ip address, mac address and wwn for each network interface. Then, zoning and masking are executed to map every new server to a specific LUN and the service profiles are associated to 4 available servers (either blades or rack mount servers). The hypervisors are installed using the PXE boot, writing the bytes in the remote storage, configured and customized, and finally added to a (new) cluster in the hypervisor manager (e.g. vCenter).

All this process takes less then one hour: you could launch it and go to lunch, when you're back you'll find the cluster up and running. Compare it to a manual provisioning of the same server farm, eventually performed by a number of different teams (see the picture above): it would take days, sometimes weeks.
Other use cases are simpler: maybe just creating a 3 tier application with VM and dedicated networks.

Once the automation workflow has been built and validated, it can be used by the IT admin or by the Operations everyday, to save time and ensure consistent outcome (no manual errors). But it can also be offered as a service to all the departments that depend on the IT for their projects.

You can build a service catalog with enterprise features: multitenancy, role based access control, reporting, chargeback, approvals, etc. But you can also offer (secured) access to the API to launch the workflow, offering a degree of autonomy to your consumers. Eventually, using a resource quota: you don’t want everyone to be able to create dozens of VMs every hour if the capacity of the system can't sustain it.

They will appreciate the efficiency improvement, for sure.

What's in it for me?

If you allow your internal clients to self serve, you will:

get less requests for trivial tasks, that consume time and give no satisfaction (let them play with it),
be the hero of the productivity increase (no requests pending in your queue)
dedicate your time and skill to designing the architectural blueprint that will be offered as a service to your clients (so that everybody plays according to your rules)
use policy based provisioning, so that you define the rules just once and map them to tenants and environments: every deployment will inherit them
maintain control on resource consumption and system capacity, hence on costs and budget
increase your relevance: they will come to you to discuss their needs, propose new services, collaborate in governance

Example: network provisioning

The discussion above is valid for the entire infrastructure in the Data Center.
Now I tell you the story of a customer that implemented it specifically for the networking.

They were influenced by the trend about SDN and initially they were caught in the marketing trap "SDN means software implemented networking, hence overlay". Then they realized the advantage provided by ACI and selected it as the SDN platform ("software defined networking", thanks to the software controller and the ACI policy model).

Developers and the Architecture department asked to access the API exposed to self provision what they needed for new projects, but this was seen as an invasion of the property (see the picture with the dirty hands).

It would have worked, but it implied a transfer of knowledge and delegation of responsibility on a critical asset. At the end of the day, if developers and software designers had knowledge in networking, specialists would not exist.

So the network admins built a number of workflows in UCS Director, using the hundreds of tasks offered by the automation library, to implement some use cases ranging from basic tasks (allow this VM to be reached from the DMZ) to more complex scenarios (create a new environment for a multi tier application including load balancer and firewall configuration, plus access from the monitoring tools, with a single request).

Blueprint designed in collaboration with Security and Software Architects

Graphical Editor for the workflow

These workflows are offered in a web portal (a service catalog is offered by UCSD out of the box) and through the REST API exposed by UCSD. Sample calls were provided to consumers as python clients, powershell clients and Postman collections, so that the higher level orchestration tool maintained by the Architecture dept was able to invoke the workflows immediately, inserting them in the business process automation that was already in place.

Example of python client running a UCSD workflow

All the executions of the workflows - launched through the self service catalog or through the REST API - are tracked in the system and the administrator can inspect the requests and their outcome:

The IT admin can audit the requests for the automation workflows

The Service Requests are audited and can be inspected and rolled back

Any run of the workflow can be inspected in full detail, look at the tabs in the window:

The IT admin can inspect any run of the workflows

The Admin has full control (see the tabs in the window)

References

Cisco UCS Director
Cisco ACI
ACI for Simple Minds
ACI for (Smarter) Simple Minds
Invoking UCS Director Workflows via the Northbound API

As described in my previous post about Terraform, the deployment environment for a new application can be created "on demand" by configuring physical and virtual resources.

Good open source products allow to describe the desired state and to automate the setup of a target infrastructure.

They can also deploy your software application and configure it properly.

But is some use cases this is not enough.

You might want to offer your users - depending on their needs and their skills - a visual catalog in a web portal.

You might want to apply a governance model based on policies, use different clouds as possible targets for the deployment, offer a easy way to manage the life cycle of the deployment (start, stop, scale up/down, terminate) and get reports on usage of the resources.

If this is the case, there are good solutions available.

One of these is Cisco Cloud Center, a powerful tool that offers two main use cases:

modeling the deployment of a software stack (creating a template or blueprint for common deployments) and
brokering cloud services (different resource pools available from a single catalog).

A easy to consume (and manage) self service catalog

Open Source or commercial products?

In the same project where I used Terraform to deploy Apache on Openstack, I also used Cisco Cloud Center to deploy a portal application on Openstack.

But at the same time, I offered the possibility to target the same deployment to a public cloud (AWS in this particular case) or to the private cloud (choosing between Openstack and vmware in this particular case). No duplication of the effort was needed, because the model you creat is not referred to a specific cloud as a target. It will be matched, when a user orders it, with one of the cloud avaliable for him or for his project.

So I was able to show the difference between a free, open source solution (Terraform) and a commercial product (Cloud Center) in a similar scenario.

The second option addresses different needs of the organization and offers a richer solution.

It’s up to you to evaluate which one fits your requirements better.

Modeling, policies and multitenancy

One of the differences is that Cloud Center offers a graphical editor to model the topology and the dependencies among all the building blocks of your deployment.

You have a library of services (software applications from a repository, physical and virtual services like load balancers and firewalls).

Services can be dragged and dropped in the editor, then you set their properties and dependencies.
The architecture of the application you're modeling can be based on a single server or a number of servers with different roles.

If the application architecture has multiple tiers, every tier gets its own attributes and policies: as an example, you can set the minimum and maximum number of instances in a cluster of web servers (or application servers or database servers).

Autoscaling policies will tell the orchestrator to increase or decrease the number of servers based on metrics like consumption of cpu or memory, inbound/outbound traffic, etc.

Everytime the cluster changes, the orchestrator will modify the configuration of load balancers and firewalls accordingly: no manual intervention is needed.

Models are saved in the catalog and offered to users in a multitenant organization: every tenant is given a portion of resources (target cloud environments) and services (models available in the catalog to deploy applications) that the tenant administrator can offer to his own users and groups... and sub tenants. Every tenant cannot see other tenants' stuff.

a graphical editor to model the deployment of your applications

A graphical editor to model blueprints for application deployment in hybrid cloud

Dashbooard and Reporting

Every user has a dashboard that shows the consolidated information about all the applications he has deployed (or the other users in the same tenant), and can manage the lifecycle of all the deployments.

Of course the administrator of the system sees the global view including all the assets.
Active VM per cloud and per application are shown in the dashboard, as well as associated costs.

a unified dashboard for all your deployments in all the clouds

Cloud Center's Dashboard

A powerful reporting features allows to filter deployments and costs by user or group, application, environment and cloud.
Data can also be exported in different formats, to be consumed by humans and other systems.

powerful reporting allows for governance, showback and chargeback

Unified reporting

Architecture

The architecture of the Cloud Center product is based on two Virtual Machines: the Manager (CCM) and the Orchestrator (CCO).

The Manager is the engine where policies and application models are defined, and where the user portal runs. The Orchestrator lives within each of the target clouds (indeed, there is one CCO in each cloud region), receives commands from the Manager and executes them locally using the API of the cloud platform.

Cisco provides orchestrator images that are specialized for every cloud supported by Cloud Center. So you have a single place to manage all your cloud resources, and a single model to maintain: you don't need a model, or a workflow, or a script for every target cloud where the syntax of that specific API is used. You create a single model, that is completely decoupled from the target of the deployment: this reduces the amount of work (a single model instead of many) and makes the maintenance of the model easier and more consistent (you don't have to evolve many models for the same application).

One manager drives a separate orchestrater for every cloud region you have access to

Cisco Cloud Center architecture

Comparison

Two solutions for the same use case, one for free and one at a cost?
Indeed they address different requirements: as described above, Cloud Center is for enterprise organizations that need to rationalize their usage of cloud resources. It is used by the corporate IT to provide flexibiliy and agility to their developers (within a governance model), to standardize the architecture of their projects based on blueprints (including what products, what versions, what setup configuration they prefer) and to get reports on consumption.

Service providers can use Cloud Center to broker third parties' resources, offering a single catalog to their customers. The hierarchical multi tenant organization and the sophisticated cost models that can be offered make it simple.

I suggest you to consider it if you are using, or plan to use, two or more cloud providers (counting also your private cloud or your virtualized data center). You will see an immediate benefit in terms of compliance and efficiency.

References

http://www.cisco.com/c/en/us/products/cloud-systems-management/cloudcenter/index.html
http://www.cisco.com/c/en/us/solutions/data-center-virtualization/one-enterprise-suite/index.html
http://docs.cliqr.com/display/CCD46/CloudCenter+Documentation+4.6

The Relativity of Wrong

Pages

July 28, 2017

Protecting your border or offering a service to others?

The value of automation in the DataCenter

Pressure on the infrastructure managers

Offering a self service catalog (or API)

What's in it for me?

Example: network provisioning

References

October 12, 2016

Just 1 step to deploy your applications in the cloud(s)

Open Source or commercial products?

Modeling, policies and multitenancy

Dashbooard and Reporting

Architecture

Comparison

References