April 28, 2022

Infrastructure as Code: what's the advantage

This post describes the value provided by managing the infrastructure the same way you manage the source code of software applications, applying standard tools and best practices to the automation. The reference to infrastructure, of course, includes all cloud services incorporated in your architecture.

I explore the following topics in this post. More posts will follow with a deeper investigation, showing the link between Infrastructure as Code (IaC) and DevOps.

  • What does Infrastructure as Code mean?
  • Is IaC a product I can buy?
  • Most common use cases.
  • From where do I start?
  • Resources to practice with Infrastructure as Code.


What does Infrastructure as Code mean

Infrastructure as code (IaC) is the process of managing and provisioning data center environments through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. 

The IT infrastructure managed by this process includes physical equipment, such as bare-metal servers, storage and network, as well as virtual machines and associated configuration resources. The same concept applies to public cloud resources, i.e. IaaS and PaaS services.

The definition files for Infrastructure as Code are maintained in a version control system, similarly to what we do with source code of software applications. Generally, in these files you describe the desired state of the system, rather than a sequence of commands that must be executed. This implies that you trust a component in the infrastructure, called a controller, delegating all the logic and the exception handling to it (or to more than one).
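To make the idea of a desired state more concrete, here is a minimal sketch in Python. It is not how any specific controller works: the resource names and the plan/apply functions are assumptions made for illustration. It only shows that, given a description of what you want, the controller can work out on its own which actions are needed.

```python
from typing import Dict, List, Tuple

State = Dict[str, dict]  # resource name -> attributes (hypothetical model)


def plan(desired: State, current: State) -> List[Tuple[str, str]]:
    """Return the (action, resource) pairs needed to reach the desired state."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != spec:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired: State, current: State) -> State:
    """Apply the plan to an in-memory copy of the current state."""
    result = {name: dict(attrs) for name, attrs in current.items()}
    for action, name in plan(desired, current):
        if action == "delete":
            del result[name]
        else:  # create and update both copy the desired attributes
            result[name] = dict(desired[name])
    return result
```

Once the plan has been applied, planning again against the new state returns an empty list: there is nothing left to do.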


Descriptive model, not commands.

You don't configure the individual components of the system (e.g. 20 switches, or 5 servers, or 15 virtual machines and their virtual network) one by one, in the right order, handling possible error conditions and verifying manually that everything works as expected.

You simply describe what you expect the system to look like to a software controller, which owns the configuration of all the individual components. The controller knows how to contact, provision and configure the elements, and how to make sure your intent is realised. If any command fails, everything is rolled back to ensure a clean state. The APIC controller in the Cisco ACI architecture has this role, but many other examples can be found among Cisco products, other vendors' products and open-source solutions.

It is like ordering a slice of cake versus preparing the cake yourself following the recipe from your grandma.

In other architectures you don't have a centralised controller, but the programmability of the individual targets and the API that they expose allow for a remote, automated management that is still much better than using the command line interface or any GUI offered by the device. One script could update the configuration of dozens of devices at the same time, e.g. adding a VLAN to all the switches in the network. 
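As an illustration of the "one script, many devices" case, the following Python sketch adds a VLAN to a list of switches. The Switch class is an in-memory stand-in that I made up for the example: a real script would call the API exposed by each device instead.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class Switch:
    """In-memory stand-in for a device reachable through its API (hypothetical)."""
    name: str
    vlans: Set[int] = field(default_factory=set)

    def add_vlan(self, vlan_id: int) -> bool:
        """Add the VLAN; return True only if the configuration changed."""
        if vlan_id in self.vlans:
            return False
        self.vlans.add(vlan_id)
        return True


def add_vlan_everywhere(switches: Iterable[Switch], vlan_id: int) -> Dict[str, bool]:
    """Push the same change to every switch and report which ones changed."""
    if not 1 <= vlan_id <= 4094:  # valid range for configurable VLAN IDs
        raise ValueError(f"invalid VLAN id: {vlan_id}")
    return {switch.name: switch.add_vlan(vlan_id) for switch in switches}
```

The returned dictionary tells you which devices were actually modified, which is handy for a report at the end of the run.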


Treat infrastructure like software (source control, single source of truth).

The input files for this process are text files, using different formats based on the tool you use. They might contain variables, whose values are defined externally to make the template reusable (e.g. via environment variables, databases, or systems designed to keep secrets like Vault). I use the word template for Ansible playbooks, Terraform configurations, etc.
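A small Python sketch can show what "variables defined externally" means in practice. The template text and the variable names below are invented for the example, and real tools such as Ansible and Terraform have their own formats and variable mechanisms. The point is only that the same template can be rendered with values taken from the environment.

```python
import os
from string import Template

# Hypothetical template text; the variable names are assumptions for this example.
TENANT_TEMPLATE = Template(
    "tenant: $TENANT_NAME\n"
    "vlan: $VLAN_ID\n"
    "api_password: $API_PASSWORD\n"
)


def render(template: Template, env=None) -> str:
    """Fill the template from environment variables (or from the given mapping).

    substitute() raises KeyError when a variable is undefined, so a missing
    value stops the run instead of producing an incomplete configuration.
    """
    values = os.environ if env is None else env
    return template.substitute(values)
```

Keeping the password out of the template means the file can be committed to the repository without exposing the secret.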

In any case these are text files, like the files that contain the source code of a software application. And they can be treated the same way: stored in a versioning system, edited collaboratively, subject to role-based access control, retrieved and built automatically by a pipeline orchestrator. 

When you adopt this approach, the latest validated version of the system configuration is stored in the versioning system. You can consider that one the single source of truth, rather than the current configuration of the system (which might be corrupted by uncontrolled manual changes, either made intentionally or by mistake, or consciously applied a long time ago for a reason that nobody remembers today). Instead, the last committed version in the repository is documented (including the tests that it passed) and ready to be applied again to reset the system, in case you need to fix a configuration drift, clone the environment, or address other use cases that require consistency.
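Detecting a configuration drift can be as simple as comparing the committed text with what is running. This is a minimal Python sketch using the standard difflib module. How you retrieve the running configuration depends on your devices and is not shown here.

```python
import difflib
from typing import List


def detect_drift(committed: str, running: str) -> List[str]:
    """Return unified-diff lines between the committed and the running configuration.

    An empty list means the system matches the single source of truth.
    """
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            running.splitlines(),
            fromfile="repository",
            tofile="running",
            lineterm="",
        )
    )
```

Lines starting with "-" exist only in the repository version, lines starting with "+" exist only on the running system.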


Provision and configure entire environments

One example is creating clones of a complete environment, including computing, network and storage resources, to deploy an application in the different phases of the release process. Even with different sizing, being generated from a single template (or blueprint) makes sure they are identical in the configuration that influences the behaviour of the applications deployed.
There will be no surprises due to a missing configuration of a firewall port, a datastore or a VLAN trunk: consistency is guaranteed and troubleshooting is limited.


Ensure idempotence

Idempotence is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application (you can find the complete definition here). Well-designed APIs ensure that repeated calls with the same input will not alter the state of the system. That is good because, in case of a retry, you don't risk creating duplicated resources or causing other trouble.
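The following Python sketch shows what an idempotent operation looks like from the caller's point of view. The TenantStore class is a made-up, in-memory stand-in for a provisioning API, not a real product interface.

```python
from typing import Dict


class TenantStore:
    """In-memory stand-in for a provisioning API (hypothetical)."""

    def __init__(self) -> None:
        self.tenants: Dict[str, dict] = {}

    def ensure_tenant(self, name: str, settings: dict) -> str:
        """Create or update the tenant so that it matches the given settings.

        Returns "created", "updated" or "unchanged". Calling it again with the
        same input never creates a duplicate.
        """
        existing = self.tenants.get(name)
        if existing is None:
            self.tenants[name] = dict(settings)
            return "created"
        if existing != settings:
            self.tenants[name] = dict(settings)
            return "updated"
        return "unchanged"
```

A retry after a timeout is therefore safe: the second call finds the tenant already in the requested state and does nothing.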


Is Infrastructure as Code a product I can buy?

No: IaC is a methodology, not a product. It's a set of best practices that you can gradually adopt, and learn step by step. You can start with basic use cases, like creating a new tenant with a few associated resources, if that is a recurring activity and you want to make it faster and error-proof.

Then you will grow with more complex use cases, like creating an entire test environment on demand with all needed resources, including services from a cloud platform. The adoption of Infrastructure as Code is not a big bang: you don't need to build complete automation with no manual activity in one week. You can target quick wins that validate the approach and generate momentum in the organization. The critical factor in the adoption is change management (i.e. the introduction of a new operational model and new ways of doing things), not the technology.


Of course you need supporting tools: automation frameworks like Ansible and/or Terraform (scripting languages could also do the job), versioning systems, collaboration systems. And your infrastructure needs to be programmable (public cloud is programmable by default), meaning that your servers, networks and storage should be managed via software controllers or, at least, expose well-documented APIs.


Common use cases for Infrastructure as Code


Environments on demand

IT admins and the Operations teams receive a lot of requests from the applications teams: they need a configuration change to fix a problem or to deploy a new service, they need a new test environment, they need a clone for a new tenant, etc. Most of these requests cannot be satisfied immediately because of higher priorities, or because they require collaboration between different teams, which needs planning.

If the provisioning of the system - and some "day 2 operations" - were automated, with controlled execution of validated templates, both parties (Dev and Ops) would save time and be more satisfied... and efficient, for the benefit of the entire company.


Shared resource pool to increase efficiency

Some companies keep separate environments for each stage of a project: integration test, performance test, quality assurance, production, etc. Resources are always allocated, even though they are only used - let's say - one week every three months (the interval may vary between one day and one year): only when they release a new version of the project, or have a maintenance window for deploying bug fixes.

Keeping resources allocated when they are not in use is a waste of capacity, hence a waste of money. Imagine if you multiply the waste by the number of projects. But they cannot do it differently because of the complexity and the time required to build the different environments.

If they could - and with automation they can - recreate an identical environment, end to end, whenever required, they could dispose of each environment as soon as it's no longer in use. Knowing that they can recreate it in minutes, they would reuse the returned resources for another stage or another project.
Using a shared resource pool (computing, networking and storage) for many projects would be more efficient from a cost perspective. It applies to fixed capacity (less capex: you need to buy less hardware to satisfy all the requests) but also to pay-per-use scenarios (less opex: you release resources when not needed).
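A quick calculation gives an idea of the order of magnitude in the pay-per-use case. The Python sketch below compares an environment that is always allocated with one that exists only while it is used. The hourly rate and the number of weeks are illustrative assumptions, not measured data.

```python
HOURS_PER_WEEK = 7 * 24
HOURS_PER_YEAR = 365 * 24


def always_on_cost(hourly_rate: float) -> float:
    """Yearly cost of an environment that is never released."""
    return hourly_rate * HOURS_PER_YEAR


def on_demand_cost(hourly_rate: float, weeks_used: float) -> float:
    """Yearly cost of an environment that exists only during the weeks it is used."""
    return hourly_rate * weeks_used * HOURS_PER_WEEK


def saving_ratio(weeks_used: float) -> float:
    """Fraction of the always-on cost that is saved (independent of the rate)."""
    return 1 - on_demand_cost(1.0, weeks_used) / always_on_cost(1.0)
```

With one week of use every three months (four weeks per year), the environment is paid for 672 hours instead of 8,760, a saving of roughly 92% under these assumptions.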


Disaster Recovery

In case you need to repurpose existing or new resources to recover from a disaster, recreating a clone of the system from a single source of truth is much faster and safer. Generating the new infrastructure from the same blueprint that created the old one makes sure they are identical.


Blueprints and Compliance

Subject matter experts (SMEs) from every technology domain collaborate to provision and maintain a system. Instead of being engaged every time, they could design and release Infrastructure as Code models once. Users (i.e. applications teams or other operations teams) could then use the blueprints for self-service provisioning, without depending on the availability of the SMEs. The SMEs would save time and feel safe, because the blueprints respect all the defined constraints and comply with the policies (no provisioning anarchy is allowed).


Auditing

Running automation scripts with standardised logging, or better using a pipeline orchestrator for provisioning and configuring systems, would trace what operations have been done, by whom, the input and the outcome. Very useful audit information, with no effort. 
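As an example of standardised logging, this Python sketch builds one JSON line per operation. The field names are my own choice for the example, and a pipeline orchestrator would normally record this information for you.

```python
import json
from datetime import datetime, timezone


def audit_record(user: str, operation: str, inputs: dict, outcome: str) -> str:
    """Build one JSON line describing who did what, with which input and outcome."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "operation": operation,
        "inputs": inputs,
        "outcome": outcome,
    }
    return json.dumps(record, sort_keys=True)
```

Appending these lines to a file, or sending them to a log collector, gives you a searchable trail of every automated change.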


From where do I start?


Tools: Ansible and Terraform

These are the most widely used tools for automation (with or without an Infrastructure as Code approach). They are open-source, free and easy to use. Enterprise versions also exist, and in some cases you will find them very useful. But you can start practicing with the free tools and use them for years, with great advantage, if you are the only person responsible for the infrastructure. For teamwork, you can still use the free versions and dedicate some time to building your own operational model and additional tools, or you can switch to the enterprise versions, which make it easier to scale at the enterprise level.

You can download the software from the Ansible and Terraform websites, along with good documentation and reusable examples (see below). Good tutorials are also available.


Single operating tool: Cisco Intersight 

Cisco Intersight™ is a Software-as-a-Service (SaaS) hybrid cloud operations platform which delivers intelligent automation, observability, and optimization to customers for traditional and cloud-native applications and infrastructure. It supports Cisco Unified Computing System™ (Cisco UCS®) and Cisco HyperFlex™ hyperconverged infrastructure, other Intersight-connected devices, third-party Intersight-connected devices, cloud platforms and services, and other integration endpoints. Because it’s a SaaS-delivered platform, Intersight functionality increases and expands with weekly releases.

With Intersight, you get all of the benefits of SaaS delivery and full lifecycle management of distributed infrastructure and workloads across data centers, remote sites, branch offices, and edge environments. This empowers you to analyze, update, fix, and automate your environment in ways that were not previously possible. As a result, your organization can achieve significant TCO savings and deliver applications faster in support of new business initiatives.

See also Diving Deeper into Hybrid Cloud Operations with Intersight 


Resources to practice Infrastructure as Code

DevNet - Cisco's developer community, which offers tutorials, sandboxes, labs and reusable assets. This is the Infrastructure as Code page at DevNet: https://developer.cisco.com/iac/

Terraform - documentation, download and tutorials at https://www.terraform.io/. The integration with Cisco Intersight is explained at https://www.hashicorp.com/resources/standardizing-hybrid-cloud-environments-with-hashicorpterraform-and-cisco-intersi  

Ansible - documentation, download and tutorials can be found  at https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html 

Cisco workshops on Infrastructure as Code - feel free to contact me if you're interested in participating in our free, 3 x half days hands-on workshop.








May 25, 2020

Cost control in a multicloud world - part 2


This post is the second part of a discussion about optimization of the cloud budget and cost control in your IT organization. The first part is here.

The CloudCenter Suite


The CloudCenter Suite is a key component in this framework: it is made of three modules that offer provisioning, lifecycle automation and – what we are discussing here – cost control.

The Cost Optimizer module collects data from the APIs of all the cloud providers to create a series of detailed reports that you can use to understand where and why you spend your money. You can slice and dice the information across different dimensions, and you can schedule reports or download them to feed a billing system.

complete and granular reporting of cloud spend



You can create any custom organizational hierarchy to map your business (departments, customers, projects, etc.) and give them visibility and responsibility of the budget consumption.


custom hierarchy of cost groups



budget management with Cisco CloudCenter Suite



The Cost Optimizer gives you recommendations for rightsizing, wherever your assets are deployed, analysing their behavior. You can evaluate the suggestion and then act manually, or you can enable the tool to do it for you automatically.

recommendations for rightsizing and other optimizations



It will also tell you when it makes sense to adopt reserved instances to get a discount, showing you how much you’re going to save.


Act now to reduce your spending… across all clouds


The best practices and the tools are there. You can choose among tools that are specific to a cloud provider (they all offer good solutions) or a cloud agnostic solution that works with all clouds.

With the Cisco CloudCenter Suite you get a fully detailed report of your inventory, all the services you are consuming everywhere, and the efficiency from a budget standpoint.

You also get actionable suggestions to save money that can be, if you configure the tool to do so, executed automatically. If you prefer, you can just get the list of suggested actions and implement them manually.

Thanks to the Cisco CloudCenter Suite you can set up a common governance model and a set of policies in one place, instead of replicating the build of reports and automation in every single cloud. This is based on full visibility that goes beyond the suggestions given by your service providers.

All you have to do now is to download and test the CloudCenter Suite (30 days trial), or contact us for a live demo or a discussion of your use cases.

Cost control in a multicloud world

Using public cloud is not as cheap as you expected?

Is allocating budget and verifying spending a little more difficult than they had told you?

The best practices and tools for cost control in public cloud services described in this series of two posts will probably help you improve the efficiency of your consumption. They apply to every IaaS and PaaS service in a multicloud context.

What CxOs expect from IT


Surveys show that enterprise IT is taking a leading role alongside the lines of business when they need cloud services: not just to select the best solution for them, but to help keep the financial aspects under control (source: RightScale State of the Cloud Report).


role of enterprise IT in cloud decisions



Looking at the initiatives, that is, where budget is allocated, cost control and cost saving are at the top:


Allocation of budget for cloud initiatives


Like drug dealers


Cloud providers are making a lot of money. And, of course, they want more customers and they want to retain the existing ones. Competitors steal some, and others just fly away when they realize that operating applications in production is more expensive than they had thought.

So, they offer you some candies, just to taste what a beautiful trip they can offer you. Even with a free account, you get amazing services that work very well. See also Azure and Google. They all make building business applications quick and easy.


appealing services offered by AWS


Developers are attracted by that, and they love creating cloud native applications using PaaS services. But they do not realize, or they just don’t care, how expensive it will be to run applications in production.

In addition, PaaS services generate lock-in because they are not portable across clouds.


Is public cloud easier or is it cheaper?


Now ask yourself this question: do you use the public cloud because it’s easier or because it’s cheaper? Or maybe you’re just lazy and you want to delegate the SLA responsibility to the service provider?

I will not discuss whether it’s easier in this post. But, for sure, it’s not cheaper. Budgets go out of control very often.

These are some tips from Amazon (source: AWS re:Invent 2019):

tips for saving from AWS


Of course, cloud providers want to help you to be efficient but… not too much! The two green actions look like the most effective, now let’s see if we can do anything better.


Cost control framework


We could create a cost control framework, as the large company I took this information from did.


cost saving framework



We focus mostly on two aspects: avoidance of unnecessary costs and iteration of the improvement actions.

In the avoidance best practice, we concentrate on the efficiency of VM snapshot retention: how many and for how long. Storage volumes and public IP addresses should be monitored, because they generate a cost. When you use S3 storage you can optimize the cost based on frequency of access, speed and size. VMs should not be larger than needed, because you also pay for the excess capacity. Suspending VMs at night, at the weekend or whenever the business application is not used also saves money. And so does scaling capacity dynamically.

You should monitor the situation carefully, and automation helps here. You can set up alerts and governance policies, and have reports sent to the stakeholders. Automation in provisioning, but mostly in reporting and resource adjustment, is key.
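A suspension policy of the kind described above can be expressed in a few lines. This Python sketch assumes business hours from 8 to 20 on weekdays, which is only an example. The actual call that suspends or resumes a VM depends on the cloud and is not shown.

```python
from datetime import datetime


def should_be_running(now: datetime, start_hour: int = 8, end_hour: int = 20) -> bool:
    """Return True during business hours on weekdays (assumed policy)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour
```

A scheduled job could evaluate this function every hour and suspend or resume the VMs whose state does not match the result.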


You can obtain important savings


This is what cost optimization saved in the context of a large SaaS operation. Most of the saving comes from the adoption of reserved instances where appropriate and from cleaning up environments used for proofs of concept. The total saving is $7M per year.


savings from cloud optimization



And this is what they are planning: choosing the right instance types, rightsizing, cleaning up will save $7M more per year.


expected savings from recurring optimization


Cost control: a solution


We will examine a simple three-step process to save your budget.


  1. Understand your assets and ecosystem
  2. Optimize based on best practices, on investigation through your assets or... recommendations from a tool
  3. Avoid unnecessary costs and iterate on the improvement actions

Easy principles to set a governance model are:

  • Define (and enforce) budgets for your customers, projects and departments.
  • Monitor and report on them.
  • Set policies to automate suspension, autoscaling and cleanup.
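The first two principles can be sketched in a few lines of Python. The cost group names, the amounts and the 80% alert threshold are assumptions made for the example. In practice the spend figures would come from the reports of your cost control tool.

```python
from typing import Dict, List


def over_budget(
    budgets: Dict[str, float],
    spend: Dict[str, float],
    alert_ratio: float = 0.8,
) -> List[str]:
    """Return the cost groups whose spend has reached the alert threshold of their budget."""
    return sorted(
        group
        for group, budget in budgets.items()
        if spend.get(group, 0.0) >= alert_ratio * budget
    )
```

The returned list is what you would send to the stakeholders, or feed to a policy that suspends or cleans up resources.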

Cisco offers a solution that helps you to achieve all that, quickly and easily.

We have built a framework that is based on the feedback loop suggested by best practices (e.g. the second principle of DevOps recommends looking at the entire system as a whole and iterating the optimization).


the optimization loop

In the next post you will find a solution for implementing your cost control quickly and consistently across all the clouds you use: any combination of private cloud technologies and public clouds, with no need to adopt specific solutions for each individual target.

Keep reading, it will prove useful for your strategy.



January 22, 2020

Teaching Alexa to deploy applications in any cloud

Alexa, deploy a webserver in AWS and a database in Azure!


Recently I presented a session at Codemotion Milan, with my colleague Stefano Gioia.
We demonstrated how the API exposed by the Cisco CloudCenter Suite can be easily integrated from within an Alexa skill.


Of course, this is not something you would do in real life in a production environment 🙂
But it’s a simple and fun way to show how easy the integration is, and we found it attractive for our customers and partners.
Instead of Alexa you could use any client, like a custom script from the command line, a workflow engine, a web portal or an ITSM system like ServiceNow to achieve the same result.
Any program that can do a REST call can drive CCS (CloudCenter Suite) externally to orchestrate the lifecycle of a software deployment.
If you use Alexa, you will code the REST client logic in the serverless implementation of the “skill” that is executed as a Lambda function. You can use different languages to create the skills: we chose node.js for the demo.

We decided to show some basic CCS features like deploying any kind of software in any cloud, or measuring the cost of all the services we are consuming in all our clouds (for running VMs and containers, for consuming cloud services like load balancers or network bandwidth, for running serverless functions, etc.). Of course there is much more in the product, but we wanted to keep the demo light and fun.


The 3 modules of the Cisco CloudCenter Suite


Everything you can do in the CloudCenter web portal can also be done through its REST API, and the CCS documentation shows examples you can easily reuse and adapt.

These are the API targets, for the different modules, that we used in the implementation of the skill:

Suite Admin API
/suite-idm/
E.g.: https://na.cloudcenter.cisco.com/suite-idm/api/v1/tenants

Workload Manager API
/cloudcenter-ccm-backend/api/v2/apps
E.g.: https://na.cloudcenter.cisco.com/cloudcenter-ccm-backend/api/v2/apps

Action Orchestrator API
/be-console/
E.g.: https://na.cloudcenter.cisco.com/be-console/api/v1/workflow

Cost Optimizer API
/cloudcenter-shared-api/
E.g.: https://na.cloudcenter.cisco.com/cloudcenter-shared-api/api/v1/costByProvider?cloudGroupId
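To give an idea of what a REST client for these targets looks like, here is a minimal Python sketch (our demo used node.js, so this is not the code of the skill). It only prepares the request without sending it. The bearer token header is an assumption: check the CCS documentation for the authentication method that applies to your installation.

```python
from urllib.parse import urljoin
from urllib.request import Request

BASE_URL = "https://na.cloudcenter.cisco.com/"


def build_request(path: str, token: str) -> Request:
    """Prepare (but do not send) a GET request for one of the API targets.

    The Authorization header format is an assumption made for this sketch.
    """
    return Request(
        urljoin(BASE_URL, path.lstrip("/")),
        headers={"Accept": "application/json", "Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the prepared request to urllib.request.urlopen would perform the call. For example, build_request("/suite-idm/api/v1/tenants", token) targets the Suite Admin API listed above.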



The next picture shows the high level process that allows a user to get something done by Alexa, just by speaking to an Echo device or to the Alexa mobile application.
The speech recognition system translates the user’s voice to text, then the “intent” of the user is matched to one of the functions available in the skill. Skills and their intents are executed based on patterns and keywords that the system is able to recognize in the natural language, thanks to machine learning algorithms.




Recognizing an intent triggers your custom code, which is generally a Lambda function (the Amazon developer console makes it easy to write the code and to host it in the Lambda service, and also provides reusable examples). The outcome is rendered as audio or, depending on the device, also as video.
In our specific demo, we put the client code for the Cisco CloudCenter API in the serverless implementation of the intent.
These are the commands that we can give Alexa:

  • give me the list of existing tenants
  • list all configured target clouds
  • deploy a database or a web server in a cloud
  • show current cost of all cloud services
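The mapping between a recognized intent and the code that serves it can be pictured as a simple dispatch table. The Python sketch below is not the code of our skill (which was written in node.js) and does not use the Alexa SDK: the intent names, the slot names and the replies are invented for the example. In the real skill each handler would call the CloudCenter API.

```python
from typing import Callable, Dict


def list_tenants(slots: dict) -> str:
    # A real handler would call the Suite Admin API here.
    return "Here is the list of tenants."


def deploy(slots: dict) -> str:
    # A real handler would call the Workload Manager API here.
    return f"Deploying a {slots['service']} in {slots['cloud']}."


# Hypothetical intent names; a real skill defines its own.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "ListTenantsIntent": list_tenants,
    "DeployIntent": deploy,
}


def handle(intent_name: str, slots: dict) -> str:
    """Route a recognized intent to its handler and return the text to be spoken."""
    handler = HANDLERS.get(intent_name)
    if handler is None:
        return "Sorry, I cannot do that yet."
    return handler(slots)
```

The slots carry the variable parts of the sentence, such as the service to deploy and the target cloud.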

Here you can see a sample of the capabilities of Alexa when it calls the Cisco CCS API:



To build a new Alexa custom skill, as the skill developer you have to:

  1. Define the requests the skill can handle
  2. Define the name Alexa uses to identify your skill, called the invocation name
  3. Define the utterances and input variables, called slots
  4. Write the code to fulfill the request
  5. Test it from the developer console or from your Alexa device




The documentation at the Amazon Developer console contains excellent tutorials to build Alexa skills.
You can learn easily to create a Hello World skill, then you are ready to incorporate the client code to call the CloudCenter Suite API.
Stefano has published his examples on GitHub here; feel free to test them yourself.

This demo shows that it's easy to build a client to drive the API exposed by CCS.
And it helps position the CloudCenter Suite as a mediation layer in your architecture, to orchestrate the lifecycle management and to define a governance model including cloud cost control.




July 19, 2019

Just one button to provision a production-grade Kubernetes cluster

(this is a guest post, authored by my esteemed colleague Fabio Di Niro)

Do you remember?


I bet all of you who are working or playing with Kubernetes remember perfectly the first time you tried to install it.
And the second.
And the third.
...
And the one that finally worked out.

And if you’re a professional, you also remember the long path that brought you to the Kubernetes expertise you need to install and fine-tune production-grade clusters.
Or, if you’re not a Kubernetes professional, you probably remember how much time it took you to find someone able to perform a valid Kubernetes install... and how much it cost.

To save our customers all this time and effort, Cisco released the Cisco Container Platform (CCP), a turnkey solution to easily provision production-grade Kubernetes clusters on-prem or in the cloud in minutes, with a few mouse clicks and requiring little to no knowledge of K8s.
All the needed integrations with network, storage, computing and security are done automatically by CCP so that the provisioned K8s clusters are ready to run in production.
Clusters provisioned by CCP are already equipped with finely configured monitoring and logging tools like Fluentd, Grafana, Elasticsearch and Kibana.
Through the Container Network Interface (CNI) you can choose whether to leverage Cisco ACI as network infrastructure or Calico (no dependence on the underlying infrastructure).

This is already great, but I decided to create a demo that pushes the simplicity of those “few mouse clicks” to its limit, making it possible to create a production-grade cluster in just one click.







Introducing the Kubernetes dash button.

The concept is fairly simple: build a dash button that, once pressed, creates a production grade Kubernetes cluster ready to use.

Leveraging the rich set of Cisco Container Platform (CCP) APIs, this is almost too easy, so I decided to add a few more features on top:

- I wanted to provision the cluster and access it just through the dash button, so CCP had to display the IP address of the master node of the new cluster on the dash button itself
- The start and finish of the cluster provisioning process had to be confirmed, so communication with the dash button had to be bidirectional
- I wanted a battery life long enough that I would not have to recharge the button every day, so I needed electronics able to sleep or hibernate
- My lab, where I have the infrastructure and the CCP, is behind a proxy, so I can’t listen for incoming calls inside the lab; I can only initiate communications from the lab. So, I needed a way to turn the “push” of the button into a “pull” of the button press information
- I wanted to use the button everywhere I go without worrying about the local Wi-Fi settings



How it works

To satisfy all the above requirements I added a couple of elements in the picture, ending up with the following architecture:



The button is based on an Arduino ESP32 board. It connects via Wi-Fi to my smartphone and uses its internet connection, so I can use the button wherever my phone has a data signal. The button leverages a publish-subscribe message service (MQTT) in the cloud to work around the proxy in front of my lab and reach a couple of scripts that call the right API in the Cisco Container Platform to trigger the provisioning of a shiny new Kubernetes cluster.
Once the cluster is provisioned, the IP address of the master node is returned to the dash button, which shows it on its display. At this point the cluster is ready to accept connections and be used.
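The message flow can be simulated in a few lines of Python. This is not the code of the project (that is in the repository linked below): the Broker class is an in-memory stand-in for the cloud MQTT service, the topic names are invented, and polling replaces the subscription that a real MQTT client would keep open. It only shows why both sides can work with outbound connections.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Optional


class Broker:
    """In-memory stand-in for a cloud MQTT broker (simulation only)."""

    def __init__(self) -> None:
        self._topics: Dict[str, Deque[str]] = defaultdict(deque)

    def publish(self, topic: str, message: str) -> None:
        self._topics[topic].append(message)

    def poll(self, topic: str) -> Optional[str]:
        """Return the oldest pending message, or None if the topic is empty."""
        queue = self._topics[topic]
        return queue.popleft() if queue else None


def lab_worker(broker: Broker, provision: Callable[[str], str]) -> None:
    """Runs inside the lab: pulls a button press and publishes the master node IP."""
    request = broker.poll("button/pressed")  # hypothetical topic name
    if request is not None:
        broker.publish("cluster/ready", provision(request))  # hypothetical topic name
```

The button publishes to one topic and reads the answer from the other, while the scripts in the lab do the opposite. Neither side has to accept incoming connections.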

A 3D printed enclosure completed my project. I started from an existing model, but then I decided to leverage the capability of CCP to deploy K8s clusters on-prem or in the cloud, so I designed the two different enclosures you can see in the picture, to have two different dash buttons for the two different deployment targets.
All the code and 3D designs have been released and are publicly available at: https://github.com/fdiniro/CCPDashButton




Now, before doing my demo, I can ask my customers: “How much time and effort does it take you to install a production-grade, fully operationalized and secured Kubernetes cluster?” and whatever answer I get, I know I can reply “I can do it in 2 minutes, blindfolded and cuffed”.

You can see the recorded demo here: https://youtu.be/-F-xR0XNPBs



June 20, 2019

A new community for Cisco Multicloud software users

Today we are launching a new project: a local community of people interested in the Cisco software solution for multicloud.

Before you go on reading, this is our next meeting:
Rome and Vimercate, October 23, 2019 - Cisco offices (details below)

Like many open source communities (e.g. meetups on various technologies) our goal is to spread information, share experience and offer access to experts to discuss your own use cases. In our opinion, this could be beneficial for customers, partners and people that are just curious about multicloud and the open source technologies. Of course, the expected result is also to facilitate sales of Cisco technology, products and projects.





That is the value for Cisco, but what's in it for you? 

We think that by joining this community (meetings will be in Italian) you could learn about the solutions offered by Cisco for multicloud in an informal context (and for free), understand which use cases you can implement and how, see how Cisco technologies integrate with open source technologies like Kubernetes, Docker and others, learn how to adopt DevOps and, why not, learn the open source stuff regardless of the integration with Cisco products. We will also offer hands-on labs and activities that combine learning and fun (e.g. teaching Alexa to deploy a database, or create a Kubernetes cluster).
In addition, you get some pizza and beer for free :-)
At the end of the day, it's an easy learning opportunity, a (bidirectional) exchange of experience, and a stage where you can show, if you like, your knowledge and share the experience you've gained in your projects, offering help to others or receiving support from peers and from the Cisco experts.




We are starting an experiment

We are starting with a few meetings planned, see the agenda below.
We thought that scheduling outside office hours would make it easier for you to join, avoiding conflicts and positioning this community as a place where you go for fun. Or, at least, with no relation to your role in your company. As an individual you learn subjects that make your work easier and your resume more interesting, so maybe this is worth dedicating 2-3 hours of your spare time (e.g. every second Thursday of each month, from 5 to 8 pm, pizza included).
We will ask for your feedback about the schedule, so that we can move to a different time of the day or cadence according to your preference.
And, of course, about the subjects we're going to address in the next meetings, so that we stay relevant for you.
An additional value is introducing you to the official Cisco DevNet community, which is our developer community.
We are going to leverage a lot of amazing material from DevNet, including documentation, tutorials and sandboxes where you can experiment with no need to install the products and no fear of destroying the environment: it's there for your enjoyment and will be reset after you use it.
Every event will be split in 2-3 sessions:
  • one based on presentation/demo of a Cisco product, 
  • one based on a subject from the open source world that is not necessarily related to Cisco. Sometimes the integration of open source and Cisco API will be demonstrated.

We will keep each session short and crisp and every event will have a hands-on activity to keep you awake.
When possible we'll add a lab activity that participants can do directly from their laptop.



Next topics

This is a temporary, draft list of the subjects we could offer in the first series of meetings: we can prioritize them based on internal feedback or through a survey with the attendees of the first meetings (or sent remotely to the community).
  • Amazon Alexa integration
  • DevOps - CI/CD with CloudCenter - (VM)
  • DevOps - CI/CD with CloudCenter - (containers)
  • The ACI CNI (technical intro)
  • The ACI CNI (use cases, operational model)
  • Multicloud cost control
  • Serverless lab
  • DevNet sandboxes
  • DevNet Express - programmability
  • Meet The Engineer or design clinic - bring your own use cases
  • ACI and Terraform
  • Managing k8s clusters in cloud and on prem
  • Automating the Software-Defined WAN

The next event we have planned is at the Cisco offices in Rome and Vimercate, on October 23, 2019.

Address: 

Time: 

  • from 5 pm to 8 pm (pizza and beer included)


This is the proposed agenda (no technical requirements to attend):

- Why this meetup (10')
Container track: container 101 (theory, use cases, application architectures)
- Cost control with CloudCenter (30')
- Pizza & beer
- DevOps: Testing methodology (30')

Registration required


To register for the event, please click here.
See you soon!




References

Cisco DevNet - https://developer.cisco.com 



March 7, 2019

DevOps with CloudCenter and Kubernetes in a multicloud environment

A short description of what DevOps is and how it helps companies compete in their business through faster innovation, followed by a demonstration of how the Cisco multicloud portfolio helps in the adoption of DevOps practices [see also this post for more detail].

If your time for reading is limited, here is the structure of the post: you can jump to the paragraphs you are interested in.

(The business view starts here)
The need for digital innovation.
  New services, better quality
  Frequent releases
DevOps is not a technology or a product.
  Cultural change and collaboration (break silos in the organization)
  Small teams responsible for a service’s lifecycle end to end
DevOps principles.
  Feedback loop
  CI/CD – Continuous Integration and Continuous Deployment

(The technical view starts here)
Cisco Multicloud approach.
Cisco CloudCenter Suite (CCS)
Cisco Container Platform (CCP)
Our lab
Application
Infrastructure
Demo flow
Implementation
Conclusions
DevOps makes it faster and easier
Cultural change is needed (incentives)
Cisco offers an effective toolset to help the adoption of DevOps practices


The need for digital innovation.

Whatever your business is, your customers expect more and more services, greater efficiency and value added by innovation.
Providing new business services (generally supported by software applications) to customers and anticipating your competitors’ moves attracts new customers and retains the existing ones.
Often the lines of business are not satisfied with the support they receive from corporate IT in terms of flexibility and speed to start a new project, especially if new technologies or skills are required (e.g. cloud native applications).
The perceived quality of IT also depends on how frequently fixes for broken services are released, and on a process that keeps bugs from reaching the production environment by intercepting them in good functional and reliability tests.
Frequent releases and the quality of the code can benefit a lot from automation in all the phases of a software project, though end-to-end automation is not absolutely necessary: it is just much better. The fundamental pillars are a good organization of the work and processes that cover every need (no gaps in responsibility, no grey areas in communication among different departments, shared objectives instead of finger pointing).

The next picture shows the evolution of methodologies and the impact on the value perceived by the business. The small star represents the moment when business value is realized by a release of the application in production.
With traditional waterfall projects, it happens only at the end of the project (by the way, with a lot of uncertainty due to delays and unexpected troubles during the development and test phases).
The agile methodology reduces the risk by repeating shorter cycles of design, coding and testing that can address any surprise and correct the course of the project sooner. But the deployment in production still happens at the very end of the project.
The innovation allowed by Continuous Integration and Continuous Deployment brings the application into production at every cycle (new releases or bug fixes), ensuring optimal quality and a deterministic outcome: the business will appreciate a benefit in terms of time to market for their initiatives.

Picture 1 - CI/CD offers more business value



DevOps is not a technology or a product.

DevOps means collaboration between Developers and Operations.
The work of those responsible for the design and implementation of the code does not finish when a new build of the application is released. Developers should also collaborate in testing the system, releasing it to production, operating it and measuring its KPIs.
The Operations team should not just execute a defined process to maintain the system, but should collaborate from the design phase of the application and, most importantly, provide constructive feedback from the production environment that helps improve and extend the application in the next development cycles.

Collaboration and the feedback loop are foundational principles in DevOps, as described in the next paragraph. [See The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations]

A cultural change. 

A cultural change (breaking silos in the organization) should be promoted, with incentives and gradual adoption of practices that will improve with time: the entire organization and the individuals have to digest a new way of working, openly analyze the outcome, contribute to the progress with personal feedback and suggestions. 
Everybody should feel that they have a common goal and that they are collaborating toward everyone’s success.
A great book describing this cultural change is The Phoenix Project.

Small teams responsible for a service’s lifecycle end to end.

DevOps practices suggest that the entire lifecycle of a service is managed by a single team: from the inception phase and the requirements analysis, to the implementation, test, release and operations. The team can be more efficient, and provide better quality, if its members know everything about the service and can react to any problem quickly, as well as evolve it based on new requirements.
The team should include representatives from different departments (lines of business, IT Architecture, Operations…) that bring their skill and experience, so a new organizational model can be required in your company: maybe a dotted line reporting structure with functional responsibilities.
It is not necessary to build a team for each service: some services can be grouped in one team, especially if they belong to the same business area or if they are the building blocks for a composite application (in a microservices architecture).

DevOps principles.

Gene Kim defines the principles that all of the DevOps patterns can be derived from (the Three Ways) in the books “The DevOps Handbook” and “The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win.” He asserts that the Three Ways describe the values and philosophies that frame the processes, procedures and practices of DevOps, as well as the prescriptive steps.

The First Way – Systems Thinking
•  Understand the entire flow of work
•  Seek to increase the flow of work
•  Stop problems early and often – Don’t let them flow downstream
•  Keep everyone thinking globally
•  Deeply understand your systems

First Way Goals
•  One source of truth – Code, environment and configuration in one place
•  Consistent release process – Automation is essential (one click)
•  Decrease cycle times, Faster release cadence

The Second Way – Feedback Loops
•  Understand and respond to the needs of all customers (internal and external)
•  Shorten and amplify all feedback loops
•  With feedback comes quality

Second Way Goals
•  Defects and performance issues fixed faster
•  Ops and InfoSec user stories appear as part of the application
•  Everyone is communicating better
•  More work getting done

The Third Way – Synergy
•  Consistent process and effective feedback result in agility
•  Now use that agility to experiment
•  You only learn from failure – So fail often, but recover quickly

Third Way Goals
•  Ability to anticipate, even define new business needs through visibility in the systems
•  Ability to test and optimize new business opportunities in the system while managing risk
•  Joy

Now that you have an idea of what DevOps is, let’s have a look at a solution from Cisco that could make it easier to adopt DevOps practices. Remember that DevOps cannot be bought: it is the set of good practices that you define and refine through continuous improvement, based on direct experience. Automation is only a part of the story.


Cisco Multicloud approach.

Cisco knows that many customers are using at least one private or public cloud, but most of them use at least two: that implies a need for consistent governance, security, networking, analytics and automation that apply to every environment.
The multicloud portfolio includes products and reference architectures that span all the areas mentioned above and make adoption simpler.

This post explains how we have built a demo using products in the automation bucket to support a DevOps use case (i.e. Continuous Integration and Continuous Deployment, aka CI/CD).

The two products are the Cisco CloudCenter Suite (CCS) and the Cisco Container Platform (CCP), briefly described in the following paragraphs before we go to the demo.


Cisco CloudCenter Suite

The Cisco CloudCenter Suite is a solution that helps the IT organization sustain the pressure from developers and lines of business to deploy and operate a large number of applications and middleware platforms, a task made more complex by the availability of different possible targets (private and public clouds for running VMs and containers).



Picture 2 - CloudCenter addresses the many-to-many complexity



The CloudCenter Suite is a single tool to automate the deployments and broker resources from any cloud. It helps to enforce a single governance model including cost control, approval processes, security policies and consistent architecture across different clouds.
You don’t have to learn and use separate tools from the cloud providers, nor replicate the automation blueprints using the native automation technologies in each cloud (e.g. CloudFormation, Heat, PowerShell): you only create a single model and CloudCenter translates it into calls to the specific APIs exposed by each private cloud, public cloud and Kubernetes cluster.
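
As a rough illustration of the single-model idea, the sketch below describes one deployment in a cloud-neutral way and lets small per-target translators turn it into target-specific requests. Everything in it (the Blueprint class, the translator functions, the request fields, the target names) is hypothetical and invented for this example: it is not the CloudCenter model format nor its API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blueprint:
    """Cloud-neutral description of what should run (hypothetical model)."""
    name: str
    image: str
    instances: int


def to_vm_cloud(bp: Blueprint) -> dict:
    # Hypothetical request for a VM-based cloud: one VM per instance.
    return {"action": "create_vms", "count": bp.instances,
            "name_prefix": bp.name, "boot_image": bp.image}


def to_kubernetes(bp: Blueprint) -> dict:
    # Hypothetical request for a Kubernetes cluster: one Deployment with N replicas.
    return {"kind": "Deployment", "name": bp.name,
            "replicas": bp.instances, "container_image": bp.image}


# One translator per target type; adding a target does not change the blueprint.
TRANSLATORS = {"vm-cloud": to_vm_cloud, "kubernetes": to_kubernetes}


def translate(bp: Blueprint, target: str) -> dict:
    """Turn the single blueprint into a request for the chosen target."""
    try:
        return TRANSLATORS[target](bp)
    except KeyError:
        raise ValueError(f"unsupported target: {target}") from None


if __name__ == "__main__":
    bp = Blueprint(name="demo-web", image="demo/web:1.0", instances=3)
    for target in TRANSLATORS:
        print(target, translate(bp, target))
```

The point of the design is that the blueprint is written once, while the knowledge about each cloud stays inside its translator.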



Picture 3 - CloudCenter translates a single blueprint to API calls for all clouds



Everything you do in CloudCenter can also be done through its API, which makes it easy to orchestrate it externally (e.g. from Jenkins, through a plugin that Cisco ships so that you can insert multicloud deployments in your CI/CD pipeline).

The current version of the CloudCenter Suite also includes additional modules like the Cost Optimizer and the Action Orchestrator: a useful enhancement to create a governance model and make operations easy in a heterogeneous multi-cloud environment.


Cisco Container Platform

The Cisco Container Platform is another software product from Cisco, which you can see as a tool for Operations to create and manage enterprise-grade Kubernetes clusters.
It creates, fully configures and manages (upgrades, scales, monitors) Kubernetes clusters on-premises and in the public cloud for you.
It takes care of all the complexity of the integration with networking (options offered out of the box are Calico, Contiv and Cisco ACI), storage, security (SSO and RBAC are added to Kubernetes) as well as centralized monitoring and logging while shipping 100% open source binaries from the upstream repositories.


Our lab

We have built a simple application based on a microservices architecture, as shown by the picture. 


Picture 4 - the microservices application built for the demo

The source code of the 5 components is stored in a GitHub repository, where new versions of the application are committed (saved) by developers. At each commit, the Jenkins orchestrator gets the source code and compiles it, building the container images ready to deploy the application.

The images are saved in a shared container registry (Harbor, see next picture) where Cisco CloudCenter will be able to retrieve them when asked by Jenkins to deploy the application. Based on input parameters provided by Jenkins, CloudCenter will target the deployment to the most appropriate environment for the current phase of the project.

In our demo lab, the environments are “integration test”, “performance test” and “production”.
They correspond to three different Kubernetes clusters that have been created in the private cloud (the first two) and in the public cloud (for the production environment).
Each environment has different policies set, that will be inherited by every application that is deployed there: for security, networking, autoscaling, etc.
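
To make the idea of policies inherited from the environment more concrete, here is a minimal sketch. The three environment names and their private/public placement come from our lab; the policy names and values, the Environment class and the plan_deployment function are made up for illustration and do not reflect how CloudCenter stores or applies policies.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """A deployment target with policies set once, at the environment level."""
    name: str
    cloud: str
    policies: dict = field(default_factory=dict)


# Policy names and values are invented for illustration.
ENVIRONMENTS = {
    "integration test": Environment("integration test", "private",
                                    {"autoscaling": False, "network": "isolated"}),
    "performance test": Environment("performance test", "private",
                                    {"autoscaling": True, "network": "isolated"}),
    "production": Environment("production", "public",
                              {"autoscaling": True, "network": "internet-facing"}),
}


def plan_deployment(app: str, phase: str) -> dict:
    """Pick the environment for the current phase; the app inherits its policies."""
    env = ENVIRONMENTS[phase]
    return {"app": app, "environment": env.name, "cloud": env.cloud,
            "policies": dict(env.policies)}  # copy, so callers cannot alter the environment


if __name__ == "__main__":
    for phase in ENVIRONMENTS:
        print(plan_deployment("TheWall", phase))
```

The application itself carries no policy: whoever triggers the deployment only names the phase, and the rules come with the target.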

The 3 Kubernetes clusters mentioned above have been generated by the Cisco Container Platform (though we could have created them manually in each cloud). The value of using CCP lies in consistent operations, speed and ease: in a few minutes we created 3 production-ready clusters, fully integrated with networking, storage, security, monitoring and logging, without even touching the K8s installer or the underlying infrastructure.
The 2 clusters named “integration test” and “performance test” were created automatically inside VMs in a local VMware environment, while the cluster named “production” was created in the Amazon cloud (CCP uses the API exposed by the Amazon EKS service to do everything automatically, including the integration with AWS IAM for security).

The automated deployments will repeat, in the three environments, in a sequence that alternates them with the necessary tests and ensures the quality of the release. Though in the real world you might want to run more complex testing activities, this is a meaningful example of the efficiency you can achieve thanks to full automation of the process.
The pipeline can still be extended by adding further tests, like code quality inspection and more.


Picture 5 - the CI/CD cycle

Demo flow

The next picture is a sequence diagram showing all the actions that we have automated.
We used a color code to represent the phases that are commonly referred to as Continuous Integration (the green part) and Continuous Deployment (the orange part).
CCC stands for Cisco CloudCenter, while K8s dev, test and prod represent the 3 Kubernetes clusters mentioned above.
The entire process is completely automated and brings a new version of the application to the production deployment without any human intervention.
This complete automation is often referred to as Continuous Deployment. Though it is very useful and adopted by big players like Facebook (whose pipeline is more complex than our simplified demo), it is not very common among the customers I generally meet.
Those that adopted DevOps still prefer to have some human checks in between the activities, so that they feel they have better control over the process and its quality.
When they have more experience, they will probably be confident enough to delegate every check to the automation tools.


Picture 6 - a sequence diagram showing the automated actions


Implementation

The automation is based on Jenkins, an open source orchestrator that benefits from the availability of hundreds of plugins: it can automate almost every component in your IT ecosystem, including Cisco CloudCenter of course.

In the Jenkins dashboard you can build different projects, like in the picture below. A project is a sequence of steps, using plugins to drive activities in the systems you want to automate (e.g. pull the source code from the repository, compile it, build container images, trigger a cloud deployment through CloudCenter, etc.).

Picture 7 - Jenkins projects


Projects can call other projects, to make your orchestration modular and reusable. In the picture above, the project TheWall (that is the name of our demo application) calls the other 5 projects in a sequence, checking that the outcome is positive before calling the next project.

So, we are able to automate the deployments in the 3 Kubernetes clusters and to run the functional test and the performance test of the application using an external tool (we used another open source product called Apache JMeter).

The functional test is a sequence of user transactions, executed by the test tool using a pool of user identities and a pool of input data, where assertions about the expected result are validated automatically. If the page generated by the application differs from the expected result, an error is logged, and the test can be considered failed. So, the functional test ensures that the application behaves as expected from a functional standpoint (and you can avoid a manual test for user acceptance).
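
The logic of such a functional test can be sketched in a few lines. This is not how JMeter is configured (there you define the test plan in the tool itself): it is only a plain illustration of the idea, where the application is replaced by a stand-in function and the users, input data and expected page fragments are made up.

```python
# Stand-in for the application under test; a real test would send HTTP requests.
def render_wall_page(user: str, message: str) -> str:
    return f"<h1>Wall of {user}</h1><p>{message}</p>"


USERS = ["alice", "bob"]                 # pool of user identities (made up)
MESSAGES = ["hello", "testing 1-2-3"]    # pool of input data (made up)


def run_functional_test(app=render_wall_page) -> list:
    """Run every user/input combination and collect assertion failures."""
    failures = []
    for user in USERS:
        for message in MESSAGES:
            page = app(user, message)
            # Assertions: fragments that the generated page must contain.
            expected = [f"Wall of {user}", message]
            for fragment in expected:
                if fragment not in page:
                    failures.append(f"{user}/{message!r}: missing {fragment!r}")
    return failures


if __name__ == "__main__":
    errors = run_functional_test()
    for line in errors:
        print("ERROR", line)
    print("functional test", "FAILED" if errors else "PASSED")
```

A single missing fragment is enough to log an error, and any logged error marks the whole test as failed.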

The performance test, executed by the same tool, stresses the application and the infrastructure from a performance standpoint. A large number of concurrent users are simulated by the tool, invoking a sequence of user transactions with random wait time, reproducing a situation similar to the workload in a production environment. Response times are tracked and so are any errors, allowing the tool to declare whether the test succeeded or not.
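
A toy version of the same idea, written with the Python standard library only, could look like the sketch below. The transaction is a stand-in that just sleeps for a millisecond, and the number of users, the wait times and the pass threshold are arbitrary values chosen for the example: a real load test would use JMeter (or a similar tool) against the deployed application.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def fake_transaction() -> None:
    """Stand-in for one user transaction against the application."""
    time.sleep(0.001)


def simulate_user(transactions: int, seed: int, action=fake_transaction) -> tuple:
    """One simulated user: returns (response_times, error_count)."""
    rng = random.Random(seed)
    times, errors = [], 0
    for _ in range(transactions):
        time.sleep(rng.uniform(0.0, 0.005))   # random wait time between transactions
        start = time.perf_counter()
        try:
            action()
        except Exception:
            errors += 1
        else:
            times.append(time.perf_counter() - start)
    return times, errors


def run_performance_test(users=20, transactions=5, max_avg_seconds=0.5,
                         action=fake_transaction) -> dict:
    """Run all users concurrently and decide pass/fail from times and errors."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(
            lambda seed: simulate_user(transactions, seed, action), range(users)))
    times = [t for user_times, _ in results for t in user_times]
    errors = sum(err for _, err in results)
    average = sum(times) / len(times) if times else float("inf")
    return {"requests": users * transactions, "errors": errors,
            "average_seconds": average,
            "passed": errors == 0 and average <= max_avg_seconds}


if __name__ == "__main__":
    print(run_performance_test())
```

The verdict combines the two things the tool tracks: no errors, and an average response time under the chosen threshold.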

Based on the outcome produced by JMeter, the Jenkins orchestrator will continue with the Continuous Deployment pipeline or abort it, notifying the developers that something went wrong and a correction is required.
In this case, the CI/CD cycle will restart from the beginning: source code modified and committed, application built and deployed to the first environment, test executed, application promoted to the next environment and tested… until the pipeline is completely executed without any warning or error and the application is released automatically in production.
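
The promotion logic described above can be summarized with the following sketch. It is not Jenkins code: the four hooks (build, deploy, test, notify) are hypothetical callables supplied by the caller, and running a test after every deployment, production included, is a simplification made for the example.

```python
STAGES = ["integration test", "performance test", "production"]


def run_pipeline(build, deploy, test, notify) -> bool:
    """Build once, then deploy and test stage by stage, stopping at the first failure.

    All four arguments are callables supplied by the caller (hypothetical hooks).
    Returns True only if the release reached the last stage.
    """
    if not build():
        notify("build failed")
        return False
    for env in STAGES:
        deploy(env)
        if not test(env):
            notify(f"tests failed in {env}")
            return False          # abort: nothing is promoted further
    return True


if __name__ == "__main__":
    ok = run_pipeline(
        build=lambda: True,
        deploy=lambda env: print("deploying to", env),
        test=lambda env: env != "performance test",   # simulate a failing stage
        notify=lambda msg: print("notify developers:", msg),
    )
    print("released" if ok else "aborted")
```

When the run is aborted, the only way forward is a new commit, which starts the whole sequence again from the build.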

The next picture shows the execution of the Jenkins pipeline for three different builds of the application. The most recent execution failed because the modification of the source code introduced an error that blocked the build. The other two executions succeeded, as demonstrated by the green color of every step in the pipeline.

Picture 8 - Jenkins pipeline



Jenkins logs all the activities, so that you can check what happened during the automated process.
The next picture shows the output of the sub-project named TheWall_Deploy_Test, which is the 7th stage in the pipeline in the previous picture.
It uses the API exposed by CloudCenter to deploy the application “TheWall” to a test environment running Kubernetes that is robust enough to sustain the workload of the performance test (the functional test can also be executed in a smaller cluster with less computing power).


Picture 9 - output from the Jenkins CI/CD pipeline


You don’t have to code the API calls, because CloudCenter ships a plugin for Jenkins that integrates into its user interface graphically. But if you prefer, Jenkins can run scripts and commands from the CLI for you.

Conclusions

DevOps makes it faster and easier

If you adopt a DevOps methodology you bring agility to an extreme and get a business outcome from the fast release of applications. 

Cultural change is needed (use incentives)

DevOps is not a matter of technology. Your people need to work in a different way: no finger pointing between Developers and Operations, no “it’s not my job”, everybody should commit towards a common goal and enjoy their common achievement.
Initially it will be difficult: you have to teach them little by little. Offer incentives to people who show a collaborative attitude and a spirit of innovation, let them feel like the heroes of the new adventure and guarantee that failures will not create any trouble. You learn from your mistakes and there is no magic wand to start directly with a perfect solution.
Besides, traditional methodologies also generate project failures: the difference is that DevOps anticipates problems and you discover them sooner, so the business impact is much smaller.

Cisco offers an effective toolset to help the adoption of DevOps practices

We all agree that DevOps is not a product. But once you start working you will see that automation helps the CI/CD process run smoothly. You can find great open source (and free) tools (e.g. Jenkins, JMeter, Ansible) to support your project teams, but if you also adopt Cisco CloudCenter and the Cisco Container Platform your professional life will be much easier.

Credits

The demo lab described in the post has been built with two colleagues and friends, whom I want to thank here: Stefano Gioia and Riccardo Tortorici.

References

Jenkins – https://jenkins.io