Showing posts with label infrastructure as code. Show all posts

June 9, 2023

Infrastructure as Code: making it easy with Nexus as Code

In previous posts I've described the advantage provided by managing the infrastructure the same way developers manage the application code.

Infrastructure as Code means using the same toolset (version control systems, pipeline orchestrators, automated provisioning) and same processes for building, integrating, testing and releasing the system that are used in the release cycle of a software application. This approach has a positive impact on speed, reliability and security end to end.


Together with Ansible, Terraform is one of the most used tools in the automated provisioning space, and many organizations use it when they adopt Infrastructure as Code. The availability of plugins (Terraform Providers) for almost every possible target (physical and virtual servers, network and storage, cloud services, etc.) makes it a common platform for automation: a "de facto" standard.

Like many other technology vendors, Cisco offers Terraform Providers wrapping the APIs of its products, especially for Data Center and Cloud technologies. The Nexus family of switches, which includes the ACI fabric architecture, is no exception. You can provision and manage the ACI fabric easily with Terraform (as well as with Ansible), and many examples and reusable assets are available at DevNet.

Generally, Terraform Providers surface the object model of the target system so that resources and their relationships can be managed easily in a configuration plan, representing the desired state of the system. You need to understand how that particular system works and, in some cases, to manage the relationships among managed object identifiers explicitly.

This is an example of creating a tenant in ACI, and a VRF contained in it:



Some engineers find this object model, and the use of HCL (HashiCorp Configuration Language), easy and comfortable. Others, perhaps due to limited experience, would prefer an easier syntax and a simpler object model.

For this reason Cisco has created a module called Nexus as Code, which sits on top of the standard ACI provider for Terraform, hiding the perceived complexity and offering a simplified object model. Objects that are contained in each other are simply nested, and represented in a way that's very close to the conceptual representation of the logical architecture (represented by the following picture).


Nexus as Code can be seen as an (optional) component in the Terraform solution to automate ACI and other network controllers from Cisco.



Using a configuration language as simple as YAML, nesting is represented with indentation in Nexus as Code. This example corresponds to the HCL snippet above:


This format is particularly suitable for copy/paste operations, which make it easy to clone and modify a template so it is ready for a new project.

If you start from the example above, by simply copying one line you can have one more VRF created and contained in the same tenant. That is definitely simpler than doing the same in an HCL file, and encouraging for network engineers the first time they use Terraform.
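To make the nesting idea concrete, here is a minimal Python sketch. It does not use Nexus as Code or Terraform: a nested dictionary stands in for the parsed YAML file, and the key names (`tenants`, `vrfs`, `name`) are illustrative assumptions rather than the module's documented schema. It only shows how a nested description can be walked to list each object together with its parent, and how "copying one line" adds one more VRF.

```python
def flatten(desired):
    """Walk the nested tenant -> VRF structure and list every object with its parent."""
    resources = []
    for tenant in desired.get("tenants", []):
        resources.append(("tenant", tenant["name"], None))
        for vrf in tenant.get("vrfs", []):
            # The parent is implied by the nesting, so nobody writes it explicitly.
            resources.append(("vrf", vrf["name"], tenant["name"]))
    return resources


# Hypothetical desired state: one tenant containing one VRF.
desired = {"tenants": [{"name": "demo", "vrfs": [{"name": "vrf-a"}]}]}

# The equivalent of copying one line in the YAML file: one more VRF in the same tenant.
desired["tenants"][0]["vrfs"].append({"name": "vrf-b"})

resources = flatten(desired)
for kind, name, parent in resources:
    print(kind, name, parent)
```

In a flat object model the relationship between `vrf-b` and `demo` would have to be stated explicitly; here it follows from where the entry sits in the tree.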

All you need is a folder to store one or more YAML files defining the desired state of the ACI fabric, and an installation of the Terraform binary (available as a free download). After that, you will just use the following two commands:

terraform init (which makes sure that the needed providers are installed, and downloads them automatically if necessary)


terraform apply (which reads the input, evaluates the changes required to align the state of the target fabric to the desired state, then calls the API of the ACI controller)



When you confirm the apply, you will see the log of the execution, and a final message will tell you that the job is done.



I believe that Nexus as Code is a powerful tool that may help engineers approach the IaC (Infrastructure as Code) methodology more easily, without the stress of learning complex new technologies and tools.

Being based on standard, open-source tools, it does not introduce any lock-in with Cisco technologies.
It simply translates easy-to-manipulate YAML files, which describe your desired state, into plain Terraform plans that are executed automatically.

So you can start adopting the same tools and the same processes that developers use in building, integrating, testing and releasing the system, obtaining the same benefits in terms of speed, consistency, security and self-documentation.

Don't be shy, start today to experiment and see how easy it is 😜



July 4, 2022

Infrastructure as Code: tools and processes

In a previous post we saw that Infrastructure as Code is a way of managing the infrastructure and the cloud resources, consisting of a set of processes and best practices.

But there is also a need for a set of tools, and this post will offer an overview of the most used tools in the industry. Most of them are free, open source tools, matched by a vendor-supported version that requires a license or a subscription. There are also SaaS versions that offer a free tier. 

I'm describing the following tools in this post:

  • Version Control Systems
  • Automation tools (Ansible, Terraform) 
  • Accessory tools for "scaffolding" (Vault, etc).

But, before we explore the tools, just a few more words about the process.

Programmable infrastructure

The simple fact that the infrastructure is programmable via the API it exposes does not mean that anyone can come and change its configuration and/or state.


We don't want anarchy, and still less do we want programmers to do whatever they like, bypassing the owners of the target technology domain. The administrators, who are also the SMEs (subject matter experts), are responsible for the reliability, performance and security of the system and cannot afford to have a naive developer compromise it.



So what we mean by treating the infrastructure as code is applying the same processes, and the same tools, as we do with the source code of the applications. The infrastructure provisioning and configuration should follow the same process that we implement for the applications: write the code, version the code, test it statically for quality and security, deploy it automatically, test it dynamically (functional, performance, reliability and security tests), then deploy it in the production environment. Generally, this happens within a CI/CD pipeline with a good level of automation (but the same sequence could also be executed manually).

Now that we have agreed on the basic principles, let's have a look at the tools.

Version Control Systems

Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later.

The purpose of version control is to allow software teams to track changes to the code, while enhancing communication and collaboration between team members. Version control facilitates a continuous, simple way to develop software.

Since we want to manage the infrastructure as developers do with the software applications, we use the same organization for the files describing the desired state of the infrastructure (remember, infrastructure includes physical, virtual and cloud resources).

A central repository is the single source of truth. Local working copies can be used to evolve the code, to create new versions and test them. After validation, a new version is committed in the VCS (version control system). The most used tools are GitHub and GitLab, but there is a large choice available.

Pipeline orchestrators 

When a new version is created, a number of activities must take place: they can be executed manually or, better, automated to increase speed and reduce vulnerability to human errors. 

If any task or test fails, a notification is sent to the right stakeholder to solve the problem and the pipeline aborts. A new pipeline cycle will restart after the issue has been fixed. You might build a single pipeline for the infrastructure and for the application deployment, or more often separate them into distinct processes: depending on the work organization and the availability of resources, there is no strict need to rebuild the target environment every time a new application version is released.
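The abort-and-notify behaviour described above can be sketched in a few lines of Python. This is not how any specific orchestrator is implemented: the stage names, the `run_pipeline` function and the `notify` callback are all hypothetical, chosen only to illustrate the control flow.

```python
def run_pipeline(stages, notify):
    """Run stages in order; on the first failure, notify and abort the pipeline."""
    completed = []
    for name, step in stages:
        try:
            step()
        except Exception as exc:
            # Tell the right stakeholder what broke, then stop: later stages never run.
            notify(f"Stage '{name}' failed: {exc}")
            return False, completed
        completed.append(name)
    return True, completed


def failing_test():
    # Hypothetical test stage that detects a problem.
    raise RuntimeError("security scan found an issue")


messages = []
ok, done = run_pipeline(
    [("lint", lambda: None), ("security-test", failing_test), ("deploy", lambda: None)],
    notify=messages.append,
)
print(ok, done, messages)
```

Because the pipeline stops at the failing stage, the deploy step is never reached with unvalidated code; a new run starts from the beginning once the issue is fixed.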

Orchestrators for CI/CD pipelines can be open-source or commercial products or can be engines incorporated in the version control system.

The following picture shows an example of pipeline:



Automation tools (Ansible, Terraform) 

Those are not the only tools available for automation, but they are by far the most used.

They access the target system (infrastructure, cloud and components in the software stack) remotely, with no need for a local agent.

Generally the target APIs are wrapped in plugins for the automation engine (called Ansible modules and Terraform providers) that are built and supported either by the vendor of the target technology or by the open source community.

Ansible was born for managing servers, so its approach is more orientated towards configuration management. Terraform excels at provisioning resources, and brings you to concepts like immutable infrastructure (see below).

Both tools are great and let you define the desired state of the system, making sure that the current state matches the desired state. If it does not, changes are executed automatically: Terraform, when a resource cannot be updated in place, destroys it and recreates it as it needs to be, while Ansible typically changes the configuration of existing resources, being more procedural than declarative.
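The comparison between current state and desired state can be illustrated with a small Python sketch. It is a deliberately simplified model, not the algorithm used by Terraform or Ansible: both states are plain dictionaries keyed by a resource name, and the `plan` function and the sample resources are hypothetical.

```python
def plan(current, desired):
    """Compare two states and return what to create, update and delete."""
    to_create = {k: v for k, v in desired.items() if k not in current}
    to_update = {k: v for k, v in desired.items() if k in current and current[k] != v}
    to_delete = sorted(k for k in current if k not in desired)
    return to_create, to_update, to_delete


# Hypothetical states: what exists on the target versus what the files describe.
current = {"vrf-a": {"tenant": "demo"}, "vrf-old": {"tenant": "demo"}}
desired = {"vrf-a": {"tenant": "demo"}, "vrf-b": {"tenant": "demo"}}

to_create, to_update, to_delete = plan(current, desired)
print("create:", to_create)
print("update:", to_update)
print("delete:", to_delete)
```

When the two states already match, all three results are empty and nothing is touched, which is what makes it safe to run the tool repeatedly.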

Immutable Infrastructure

An approach to managing services and software deployments wherein components are replaced rather than changed. They are effectively redeployed each time any change occurs.

Traditionally, an application or service update requires that a component is changed in production, while the complete service or application remains operational. Immutable infrastructure instead relies on instancing "golden images", where components are assembled on computing resources to form the service or application. 

Once the service or application is instantiated, its components are set - thus, the service or application is immutable, unable to change. When a change is made to one or more components of a service or application, a new golden image is assembled, tested, validated and made available for use. Then the old instance is discontinued, to free the computing resources within the environment for other tasks.

You can find a very good description and reusable examples at these two websites:

Immutable Infrastructure with Ansible, Packer and Terraform on Azure - https://devopsand.beer/2022/03/26/immutable-Infrastructure-with-ansible-packer-and-terraform-on-azure/

Immutable Infrastructure Using Packer, Ansible, and Terraform - https://medium.com/paul-zhao-projects/immutable-infrastructure-using-packer-ansible-and-terraform-a275aa6e9ff7

Accessory tools for "scaffolding" 

The automation you can build with these tools is amazing, and it saves you time and trouble (sometimes also money as a consequence). As a single individual, or as part of a small team, you are much more productive thanks to the reuse of scripts and blueprints, less troubleshooting, and faster provisioning.

When the size of the operations team, or of the organization made of different teams, grows beyond a handful of people, some coordination issues start being visible.

  • If many people use the same scripts (playbooks, configuration plans, etc.) those resources need to be accessible in a central repository (generally a VCS) and you need to enforce RBAC (role based access control) to protect them. 
  • Credentials to access the target systems cannot be stored within the code in the VCS, so you need to store them separately and pass them in as variables. 
  • If a change is pushed to the environment, people need to be notified (even more if a pipeline fails and someone has to fix it).
  • Bespoke IaC pipelines can stretch across personal machines or shared VMs, creating a management nightmare.
  • Terraform state files contain sensitive information, which requires special handling and access control.
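The point about credentials in the list above can be sketched as follows: the code in the repository only names the variables it expects, and the values are supplied from outside at run time. This is a minimal Python illustration; the variable names `ACI_USERNAME` and `ACI_PASSWORD` and the `load_credentials` helper are assumptions made for the example, and a secrets manager such as Vault could populate the environment instead of a person.

```python
import os


def load_credentials(env=None):
    """Read credentials from the environment instead of hard-coding them in the repo."""
    env = os.environ if env is None else env
    required = ("ACI_USERNAME", "ACI_PASSWORD")  # hypothetical variable names
    missing = [key for key in required if key not in env]
    if missing:
        # Fail early and clearly, without ever printing a secret value.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"username": env["ACI_USERNAME"], "password": env["ACI_PASSWORD"]}


# Usage: pass a mapping explicitly (handy in tests) or rely on the real environment.
creds = load_credentials({"ACI_USERNAME": "admin", "ACI_PASSWORD": "example-only"})
print(creds["username"])
```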

So you start defining processes to work in an ordered manner, and adopting accessory tools to store the secrets (one example is Vault, to store credentials in a safe, centralized place). The governance work and the tools you start accumulating are known as scaffolding, and can rapidly become such a burden that they outweigh the advantages you've got from adopting the Infrastructure as Code approach (this happens only at a large scale and if you don't have experienced staff).

A solution to this problem is offered by the enterprise versions of the tools (both Ansible and Terraform), which are also available as a SaaS option. The paid versions - which are also supported by the vendors - offer everything you need for collaboration in large teams and spare you the investment of building the whole operational framework yourself.

I'm not saying that you absolutely need those versions, but consider that the miracles an engineer can do with the free, single binary file, local setup of the automation tools are less likely to be seen on a larger team scale when the IaC best practices are broadly adopted. There will be an inflection point where the benefits provided by the enterprise edition justify the cost of the solution.


April 28, 2022

Infrastructure as Code: what's the advantage

This post describes the value provided by managing the infrastructure the same way you manage the source code of software applications, applying standard tools and best practices to the automation. The reference to infrastructure, of course, includes all cloud services incorporated in your architecture.

I explore the following topics in this post. More posts will follow with a deeper investigation, and to show the link between Infrastructure as Code (IaC) and DevOps.

  • What does Infrastructure as Code mean?
  • Is IaC a product I can buy?
  • Most common use cases.
  • From where do I start?
  • Resources to practice with Infrastructure as Code.


What does Infrastructure as Code mean

Infrastructure as code (IaC) is the process of managing and provisioning data center environments through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. 

The IT infrastructure managed by this process includes both physical equipment, such as bare-metal servers, storage and network, as well as virtual machines, and associated configuration resources. The same concept applies to public cloud resources, i.e. IaaS and PaaS services.

The definition files for Infrastructure as Code are maintained in a version control system, similarly to what we do with source code of software applications. Generally, in these files you describe the desired state of the system, rather than a sequence of commands that must be executed. This implies that you trust a component in the infrastructure, called a controller, delegating all the logic and the exception handling to it (or to more than one).


Descriptive model, not commands.

You don't configure the individual components of the system (e.g. 20 switches, or 5 servers or 15 virtual machines and their virtual network) one by one, in the right order, handling any error conditions and verifying manually that everything works as expected.

You simply describe what you expect the system to look like to a software controller, which owns the configuration of all the individual components. The controller knows how to contact, provision and configure the elements and to make sure your intent is realised. If any command fails, everything is rolled back to ensure a clean state. The APIC controller in the Cisco ACI architecture has this role, but many examples can be found among Cisco products and other vendors', and open-source solutions.

It is like ordering a slice of cake versus preparing the cake yourself following the recipe from your grandma.

In other architectures you don't have a centralised controller, but the programmability of the individual targets and the API that they expose allow for a remote, automated management that is still much better than using the command line interface or any GUI offered by the device. One script could update the configuration of dozens of devices at the same time, e.g. adding a VLAN to all the switches in the network. 
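As a rough illustration of the "one script, dozens of devices" idea, the Python sketch below loops over a list of switches and applies the same change to each. The `push` callable is a hypothetical placeholder for whatever API the devices expose; no real device library is used, and the hostnames are invented.

```python
def add_vlan_everywhere(devices, vlan_id, name, push):
    """Apply the same VLAN to every device and record the outcome per device."""
    results = {}
    for device in devices:
        try:
            push(device, vlan_id, name)
            results[device] = "ok"
        except Exception as exc:
            # One unreachable switch should not prevent the others from being updated.
            results[device] = f"failed: {exc}"
    return results


def fake_push(device, vlan_id, name):
    """Stand-in for a real API call; pretends that one switch is unreachable."""
    if device == "switch-03":
        raise ConnectionError("unreachable")


switches = [f"switch-{n:02d}" for n in range(1, 5)]
results = add_vlan_everywhere(switches, 120, "web-tier", fake_push)
print(results)
```

Without a controller, the script itself has to collect the outcome for every target and decide what to do about failures.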


Treat infrastructure like software (source control, single source of truth).

The input files for this process are text files, using different formats based on the tool you use. They might contain variables, whose values are defined externally to make the template reusable (e.g. via environment variables, databases, or systems designed to keep secrets like Vault). I use the word template for Ansible playbooks, Terraform plans, etc.
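Here is a minimal sketch of a reusable template whose values are defined externally, using Python's standard `string.Template`. The template text and the variable names are invented for the example; Ansible and Terraform have their own variable mechanisms, which this does not reproduce.

```python
from string import Template

# The template is what you keep in version control; it contains no project-specific values.
template = Template("tenant: $tenant_name\nvrf: $vrf_name\n")

# The values come from outside the template (a file, a database, environment variables...).
values = {"tenant_name": "demo", "vrf_name": "vrf-a"}

rendered = template.substitute(values)
print(rendered)
```

`substitute` raises a `KeyError` if a value is missing, so an incomplete set of inputs is caught before anything is sent to the target system.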

In any case these are text files, like the files that contain the source code of a software application. And they can be treated the same way: stored in a versioning system, edited collaboratively, subject to role-based access control, retrieved and built automatically by a pipeline orchestrator. 

When you adopt this approach, the latest validated version of the system configuration is stored in the versioning system. You can consider that one the single source of truth, rather than the current configuration of the system (which might be corrupted by uncontrolled manual changes, made either intentionally or by mistake, or consciously applied a long time ago for a reason that nobody remembers today). Instead, the last committed version in the repository is documented (including the tests that it passed) and ready to be applied again to reset the system, in case you need to solve a configuration drift, or to clone the environment, or for other use cases that require consistency.


Provision and configure entire environments

One example is creating clones of a complete environment, including computing, network and storage resources, to deploy an application in the different phases of the release process. Even with different sizing, being generated by a single template (or blueprint) makes sure they are identical in the configuration that influences the behaviour of the applications deployed.
There will be no surprises due to a missing configuration of a firewall port, a datastore or a VLAN trunk: consistency is guaranteed, troubleshooting is limited.


Ensure idempotence

Idempotence is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application. Well-designed APIs ensure that repeated calls with the same input will not alter the state of the system. This is good because, in case of a retry, you don't risk creating duplicate resources or causing other trouble.
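A tiny Python sketch of an idempotent operation: calling it a second time with the same input leaves the state exactly as the first call left it. The `ensure_tenant` function and the in-memory set are hypothetical stand-ins for an API call and the target system.

```python
def ensure_tenant(tenants, name):
    """Make sure the tenant exists; return True only if something actually changed."""
    if name in tenants:
        return False  # already in the desired state: do nothing
    tenants.add(name)
    return True


tenants = set()
first = ensure_tenant(tenants, "demo")   # creates the tenant
retry = ensure_tenant(tenants, "demo")   # a retry: no duplicate, no error
print(first, retry, tenants)
```

The function is named "ensure" rather than "create" on purpose: it describes the state to reach, not the action to perform, so a retry is always safe.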


Is Infrastructure as Code a product I can buy?

No: IaC is a methodology, not a product. It's a set of best practices that you can gradually adopt, and learn step by step. You can start with basic use cases, like creating a new tenant with a few associated resources, if that is a recurring activity and you want to make it faster and error-proof.

Then you will grow with more complex use cases, like creating an entire test environment on demand with all needed resources, including services from a cloud platform. The adoption of Infrastructure as Code is not a big bang, you don't need to build complete automation with no manual activity in one week. You can target quick wins, that validate the approach and generate momentum in the organization. The critical factor in the adoption is change management (i.e. the introduction of a new operational model and new ways of doing things), not the technology. 


Of course you need supporting tools: automation frameworks like Ansible and/or Terraform (scripting languages could possibly do the job too), versioning systems, collaboration systems. And your infrastructure needs to be programmable (public cloud is programmable by default), meaning that your servers, networks and storage should be managed via software controllers or, at least, expose well-documented APIs.


Common use cases for Infrastructure as Code


Environments on demand

IT admins and the Operations teams receive a lot of requests from the applications teams: they need a configuration change to fix a problem or to deploy a new service, they need a new test environment, they need a clone for a new tenant, etc. Most of these requests cannot be satisfied immediately because of higher priorities, or because they require the collaboration of different teams, which needs planning.

If the provisioning of the system - and some "day 2 operations" - were automated, with controlled execution of validated templates, both parties (Dev and Ops) would save time and be more satisfied... and efficient, for the benefit of the entire company.


Shared resource pool to increase efficiency

Some companies keep separate environments for each stage of a project: integration test, performance test, quality assurance, production, etc. Resources are always allocated, even though they are only used - let's say - one week every three months (the interval may vary between one day and one year): only when a new version of the project is released, or when there is a maintenance window for deploying bug fixes.

Keeping resources allocated when they are not in use is a waste of capacity, hence a waste of money. Imagine if you multiply the waste by the number of projects. But they cannot do it differently because of the complexity and the time required to build the different environments.

If they could - and with automation they can - recreate an identical environment, end to end, whenever required, they could dispose of each environment as soon as it's no longer in use. Knowing that they can recreate it in minutes, they would reuse the returned resources for another stage or another project.
Using a shared resource pool (computing, networking and storage) for many projects would be more efficient from a cost perspective. It applies to fixed capacity (less capex: you need to buy less hardware to satisfy all the requests) but also to pay-per-use scenarios (less opex: you release resources when they are not needed).


Disaster Recovery

In case you need to repurpose existing or new resources to recover from a disaster, recreating a clone of the system from a single source of truth is much faster and safer. Generating the new infrastructure from the same blueprint that created the old one makes sure they are identical.


Blueprints and Compliance

Subject matter experts from every technology domain that collaborate to provision and maintain a system, instead of being engaged every time, could design and release Infrastructure as Code models once. Users (i.e. applications teams or other operations teams) could then use the blueprints for a self-service provisioning, without depending on the availability of the SME. The SME would save their time, feeling safe because the blueprints respect all the defined constraints and comply with the policies (no provisioning anarchy is allowed). 


Auditing

Running automation scripts with standardised logging, or better using a pipeline orchestrator for provisioning and configuring systems, would trace what operations have been done, by whom, the input and the outcome. Very useful audit information, with no effort. 


From where do I start?


Tools: Ansible and Terraform

Those are the most widely used tools for automation (with or without an Infrastructure as Code approach). They are open-source, free and easy to use. Enterprise versions also exist, and in some cases you will find them very useful. But you can start practicing with the free tools and use them for years, with great advantage, if you are the only person responsible for the infrastructure. In case of teamwork, you can still use the free versions and dedicate some time to building your own operational model and additional tools, or you can switch to the enterprise versions, which make it easy to scale at the enterprise level.

You can download the software from the Ansible and Terraform websites, along with good documentation and reusable examples (see below). Good tutorials are also available.


Single operating tool: Cisco Intersight 

Cisco Intersight™ is a Software-as-a-Service (SaaS) hybrid cloud operations platform which delivers intelligent automation, observability, and optimization to customers for traditional and cloud-native applications and infrastructure. It supports Cisco Unified Computing System™ (Cisco UCS®) and Cisco HyperFlex™ hyperconverged infrastructure, other Intersight-connected devices, third-party Intersight-connected devices, cloud platforms and services, and other integration endpoints. Because it’s a SaaS-delivered platform, Intersight functionality increases and expands with weekly releases.

With Intersight, you get all of the benefits of SaaS delivery and full lifecycle management of distributed infrastructure and workloads across data centers, remote sites, branch offices, and edge environments. This empowers you to analyze, update, fix, and automate your environment in ways that were not previously possible. As a result, your organization can achieve significant TCO savings and deliver applications faster in support of new business initiatives.

See also Diving Deeper into Hybrid Cloud Operations with Intersight 


Resources to practice Infrastructure as Code

DevNet - Cisco's developers community, that offers tutorials, sandboxes, labs and reusable assets. This is the Infrastructure as Code page at DevNet: https://developer.cisco.com/iac/

Terraform - documentation, download and tutorials at https://www.terraform.io/. The integration with Cisco Intersight is explained at https://www.hashicorp.com/resources/standardizing-hybrid-cloud-environments-with-hashicorpterraform-and-cisco-intersi  

Ansible - documentation, download and tutorials can be found  at https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html 

Cisco workshops on Infrastructure as Code - feel free to contact me if you're interested in participating in our free, 3 x half days hands-on workshop.








July 19, 2019

Just one button to provision a production-grade Kubernetes cluster

(this is a guest post, authored by my esteemed colleague Fabio Di Niro)

Do you remember?


I bet all of you who are working or playing with Kubernetes remember perfectly the first time you tried to install it.
And the second.
And the third.
...
And the one that finally worked out.

And if you’re a professional you remember also the long path that brought you to own the expertise on Kubernetes that you need to install and fine-tune production grade clusters.
Or, if you’re not a Kubernetes professional, you probably remember how much time it took you to find someone able to perform a valid Kubernetes install... and how much it cost.

To save our customers all this time and effort, Cisco released the Cisco Container Platform (CCP), a turnkey solution to easily provision production-grade Kubernetes clusters on-prem or in the cloud in minutes, with a few mouse clicks and requiring little to no knowledge of K8s.
All the needed integrations with network, storage, computing and security are done automatically by CCP so that the provisioned K8s clusters are ready to run in production.
Clusters provisioned by CCP are already equipped with finely configured monitoring and logging tools like Fluentd, Grafana, Elasticsearch and Kibana.
Through the Container Network Interface (CNI) you can choose whether to leverage Cisco ACI as network infrastructure or Calico (no dependence on the underlying infrastructure).

This is already great, but I decided to create a demo that pushes the simplicity of those “few mouse clicks” to its limit, making it possible to create a production-grade cluster in just one click.







Introducing the Kubernetes dash button.

The concept is fairly simple: build a dash button that, once pressed, creates a production grade Kubernetes cluster ready to use.

Leveraging the rich set of Cisco Container Platform (CCP) APIs, this is almost too easy, so I decided to add a few more features on top:

- I wanted to provision the cluster and access it just through the dash button. So, I wanted CCP to display on the dash button itself the IP address of the master node of the cluster created
- The start and finish of the cluster provisioning process had to be confirmed, so the communication with the dash button had to be bi-directional
- I wanted a decent battery life that would spare me from recharging the button every day, so I needed electronics able to sleep or hibernate
- My lab, where I have the infrastructure and the CCP, is behind a proxy, so I can’t listen for calls inside the lab, I can only initiate communications from the lab. So, I needed a way to turn the “push” of the button into a “pull” of the button press information
- I wanted to use the button everywhere I go without worrying about the local Wi-Fi settings



How it works

To satisfy all the above requirements I added a couple of elements in the picture, ending up with the following architecture:



The button is based on an Arduino ESP32 board. It connects via Wi-Fi to my smartphone and uses its internet connection, so I can use the button everywhere my phone has a data signal. The button leverages a publish-subscribe message service (MQTT) in the cloud to bypass the limitation of the proxy my lab sits behind, and reaches a couple of scripts that call the right API in the Cisco Container Platform to trigger the provisioning of a shiny new Kubernetes cluster.
Once the cluster is provisioned, the IP address of the master node is returned to the dash button, which shows it on its display. At this point the cluster is ready to accept connections and be used.
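The "push turned into a pull" pattern described above can be sketched in Python without any real MQTT client. This is not the code of the project (which is published in the repository linked below): the `Broker` class is an in-memory stand-in for the cloud message service, the topic names are invented, and a real MQTT subscriber receives messages over a connection it opened outbound rather than by explicit polling. The sketch only shows that the lab side never has to accept an inbound call.

```python
import queue


class Broker:
    """In-memory stand-in for a cloud publish-subscribe service."""

    def __init__(self):
        self._topics = {}

    def publish(self, topic, message):
        self._topics.setdefault(topic, queue.Queue()).put(message)

    def poll(self, topic):
        """Return the next message on a topic, or None if nothing is waiting."""
        try:
            return self._topics.setdefault(topic, queue.Queue()).get_nowait()
        except queue.Empty:
            return None


def lab_worker(broker, provision):
    """Runs inside the lab: pulls a pending request, provisions, reports back."""
    request = broker.poll("button/pressed")  # hypothetical topic name
    if request is None:
        return False
    master_ip = provision(request)
    broker.publish("cluster/ready", master_ip)  # the button reads this topic
    return True


broker = Broker()
broker.publish("button/pressed", "on-prem")               # the button side
handled = lab_worker(broker, lambda target: "192.0.2.10")  # fake provisioning call
answer = broker.poll("cluster/ready")                      # what the display would show
print(handled, answer)
```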

A 3D printed enclosure completed my project. I started from an existing model, but then I decided to highlight the capability of CCP to deploy K8s clusters on-prem or in the cloud, so I designed the two different enclosures you can see in the picture: two different dash buttons for the two different deployment targets.
All the code and 3D designs have been released and are publicly available at: https://github.com/fdiniro/CCPDashButton




Now, before doing my demo, I can ask my customers: “How much time and effort does it take you to install a production-grade, fully operationalized and secured Kubernetes cluster?” and whatever answer I get, I know I can reply “I can do it in 2 minutes, blindfolded and cuffed”.

You can see the recorded demo here: https://youtu.be/-F-xR0XNPBs



March 7, 2019

DevOps with CloudCenter and Kubernetes in a multicloud environment

A short description of what DevOps is and how it helps companies compete in their business with faster innovation, followed by a demonstration of how the Cisco multicloud portfolio helps in the adoption of DevOps practices [see also this post for more detail].

If your time for reading is limited, here is the structure of the post: you can jump to the paragraphs you are interested in.

(The business view starts here)
The need for digital innovation.
  New services, better quality
  Frequent releases
DevOps is not a technology or a product.
  Cultural change and collaboration (break silos in the organization)
        Small teams responsible for a service’s lifecycle end to end
DevOps principles.
  Feedback loop
  CI/CD – Continuous Integration and Continuous Deployment

(The technical view starts here)
Cisco Multicloud approach.
Cisco CloudCenter Suite (CCS)
Cisco Container Platform (CCP)
Our lab
Application
Infrastructure
Demo flow
Implementation
Conclusions
DevOps makes it faster and easier
Cultural change is needed (incentives)
Cisco offers an effective toolset to help the adoption of DevOps practices


The need for digital innovation.

Whatever your business is, your customers expect more and more services, greater efficiency and value added by innovation.
Providing new business services (generally supported by software applications) to customers and anticipating your competitors’ moves attracts new customers and retains the existing ones. 
Often the lines of business are not satisfied with the support they receive from the corporate IT in terms of flexibility and speed to start a new project, especially if new technologies or skills are required (e.g. cloud native applications).
The perceived quality of IT also depends on how frequently fixes for broken services are released, and on the process that prevents bugs from reaching the production environment by intercepting them with good functional and reliability tests.
Frequent releases and the quality of the code can benefit a lot from automation in all the phases of a software project, though end-to-end automation is not strictly necessary: it is just much better. The fundamental pillars are a good organization of the work and processes that ensure every need is covered (no gaps in responsibility, no grey areas in communication among different departments, shared objectives instead of finger pointing).

The next picture shows the evolution of methodologies and their impact on the value perceived by the business. The small star represents the moment when business value is realized by releasing the application in production.
With traditional waterfall projects, this happens only at the end of the project (and with a lot of uncertainty, due to delays and unexpected troubles during the development and test phases).
The agile methodology reduces the risk by repeating shorter cycles of design, coding and testing that can address any surprise and correct the course of the project sooner. But the deployment in production still happens at the very end of the project.
Continuous Integration and Continuous Deployment bring the application into production at every cycle (new releases or bug fixes), ensuring optimal quality and a deterministic outcome: the business will appreciate the benefit in terms of time to market for their initiatives.

Picture 1 - CI/CD offers more business value



DevOps is not a technology or a product.

DevOps means collaboration between Developers and Operations.
The work of those responsible for designing and implementing the code does not finish when a new build of the application is released. Developers should also collaborate in testing the system, releasing it into production, operating it and measuring its KPIs.
The Operations team should not just execute a defined process to maintain the system, but should collaborate from the design phase of the application onwards and, most importantly, provide constructive feedback from the production environment that helps improve and extend the application in the next development cycles.

Collaboration and the feedback loop are foundational principles in DevOps, as described in the next paragraphs. [See The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations]

A cultural change. 

A cultural change (breaking silos in the organization) should be promoted with incentives and a gradual adoption of practices that will improve over time: the entire organization and its individuals have to digest a new way of working, openly analyze the outcome, and contribute to progress with personal feedback and suggestions.
Everybody should feel that they share a common goal and are collaborating toward everyone’s success.
A great book describing this cultural change is The Phoenix Project.

Small teams responsible for a service’s lifecycle end to end.

DevOps practices suggest that the entire lifecycle of a service is managed by a single team: from the inception phase and the requirements analysis to implementation, test, release and operations. The team can be more efficient, and provide better quality, if its members know everything about the service and can react quickly to any problem, as well as evolve the service based on new requirements.
The team should include representatives from different departments (lines of business, IT Architecture, Operations…) who bring their skills and experience, so a new organizational model may be required in your company: maybe a dotted-line reporting structure with functional responsibilities.
It is not necessary to build a team for each service: some services can be grouped in one team, especially if they belong to the same business area or if they are the building blocks for a composite application (in a microservices architecture).

DevOps principles.

Gene Kim defines the principles that all of the DevOps patterns can be derived from (the Three Ways) in the books “The DevOps Handbook” and “The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win.” He asserts that the Three Ways describe the values and philosophies that frame the processes, procedures and practices of DevOps, as well as the prescriptive steps.

The First Way – Systems Thinking
•  Understand the entire flow of work
•  Seek to increase the flow of work
•  Stop problems early and often – Don’t let them flow downstream
•  Keep everyone thinking globally
•  Deeply understand your systems

First Way Goals
•  One source of truth – Code, environment and configuration in one place
•  Consistent release process – Automation is essential (one click)
•  Decrease cycle times, Faster release cadence

The Second Way – Feedback Loops
•  Understand and respond to the needs of all customers (internal and external)
•  Shorten and amplify all feedback loops
•  With feedback comes quality

Second Way Goals
•  Defects and performance issues fixed faster
•  Ops and InfoSec user stories appear as part of the application
•  Everyone is communicating better
•  More work getting done

The Third Way – Synergy
•  Consistent process and effective feedback result in agility
•  Now use that agility to experiment
•  You only learn from failure – So fail often, but recover quickly

Third Way Goals
•  Ability to anticipate, even define new business needs through visibility in the systems
•  Ability to test and optimize new business opportunities in the system while managing risk
•  Joy

Now that you have an idea of what DevOps is, let’s have a look at a solution from Cisco that could make it easier to adopt DevOps practices. Remember that DevOps cannot be bought: it is the set of good practices that you define and refine through continuous improvement, based on direct experience. Automation is only a part of the story.


Cisco Multicloud approach.

Cisco knows that many customers are using at least one private or public cloud, and most of them use at least two: that implies a need for consistent governance, security, networking, analytics and automation that apply to every environment.
The multicloud portfolio includes products and reference architectures that span all the areas mentioned above and make adoption simpler.

This post explains how we have built a demo using products in the automation bucket to support a DevOps use case (i.e. Continuous Integration and Continuous Deployment, aka CI/CD).

The two products are the Cisco CloudCenter Suite (CCS) and the Cisco Container Platform (CCP), briefly described in the following paragraphs before we go to the demo.


Cisco CloudCenter Suite

CloudCenter Suite is a solution that helps the IT organization sustain the pressure from developers and lines of business to deploy and operate a large number of applications and middleware platforms, a task made more complex by the variety of possible targets (private and public clouds for running VMs and containers).



Picture 2 - CloudCenter addresses the many-to-many complexity



The CloudCenter Suite is a single tool to automate the deployments and broker resources from any cloud. It helps to enforce a single governance model including cost control, approval processes, security policies and consistent architecture across different clouds.
You don’t have to learn and use separate tools from the cloud providers, nor replicate the automation blueprints using the native automation technologies in each cloud (e.g. CloudFormation, Heat, PowerShell): you only create a single model and CloudCenter translates it into calls to the specific API exposed by each private and public cloud and by Kubernetes clusters.
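
To make the "single model, many targets" idea concrete, here is a minimal sketch in Python. It is not how CloudCenter is implemented and it does not use any real cloud API: the `Blueprint` fields, the target names and the generated request strings are all hypothetical, chosen only to show one model being translated by per-target functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Blueprint:
    """A cloud-agnostic description of what should run (hypothetical fields)."""
    name: str
    image: str
    instances: int


def to_vm_cloud(bp: Blueprint) -> List[str]:
    """Translate the model for a VM-based cloud: one request per instance."""
    return [
        f"create-vm name={bp.name}-{i} image={bp.image}"
        for i in range(bp.instances)
    ]


def to_kubernetes(bp: Blueprint) -> List[str]:
    """Translate the model for Kubernetes: one request with a replica count."""
    return [
        f"apply-deployment name={bp.name} image={bp.image} replicas={bp.instances}"
    ]


# One translator per supported target; adding a cloud means adding an entry.
TRANSLATORS: Dict[str, Callable[[Blueprint], List[str]]] = {
    "vm-cloud": to_vm_cloud,
    "kubernetes": to_kubernetes,
}


def translate(bp: Blueprint, target: str) -> List[str]:
    """Return the target-specific requests for the same blueprint."""
    try:
        translator = TRANSLATORS[target]
    except KeyError:
        raise ValueError(f"unsupported target: {target}") from None
    return translator(bp)
```

The blueprint never changes when a new target is added: only a new translator is registered, which is the property that removes the need to rewrite automation for each cloud.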



Picture 3 - CloudCenter translates a single blueprint to API calls for all clouds



Everything you do in CloudCenter can be done through its API, which makes it easy to orchestrate it externally (e.g. from Jenkins, through a plugin that Cisco ships so that you can insert multicloud deployments in your CI/CD pipeline).

The current version of the CloudCenter Suite also includes additional modules like the Cost Optimizer and the Action Orchestrator: a useful enhancement to create a governance model and make operations easy in a heterogeneous multi-cloud environment.


Cisco Container Platform

Cisco Container Platform is another software product from Cisco, which you can see as a tool for Operations to create and manage enterprise-grade Kubernetes clusters.
It creates, fully configures and manages (upgrades, scales, monitors) Kubernetes clusters on-premises and in the public cloud for you.
It takes care of all the complexity of the integration with networking (options offered out of the box are Calico, Contiv and Cisco ACI), storage, security (SSO and RBAC are added to Kubernetes) as well as centralized monitoring and logging while shipping 100% open source binaries from the upstream repositories.


Our lab

We have built a simple application based on a microservices architecture, as shown by the picture. 


Picture 4 - the microservices application built for the demo

The source code of the 5 components is stored in a GitHub repository, where new versions of the application are committed (saved) by developers. At each commit, the Jenkins orchestrator gets the source code and compiles it, building the container images ready to deploy the application.

The images are saved in a shared container registry (Harbor, see next picture) where Cisco CloudCenter will be able to retrieve them when asked by Jenkins to deploy the application. Based on input parameters provided by Jenkins, CloudCenter will target the deployment to the most appropriate environment for the current phase of the project.

In our demo lab, the environments are “integration test”, “performance test” and “production”.
They correspond to three different Kubernetes clusters that have been created in the private cloud (the first two) and in the public cloud (for the production environment).
Each environment has different policies set, that will be inherited by every application that is deployed there: for security, networking, autoscaling, etc.
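
The mapping between project phase and target environment can be pictured as a simple lookup. The sketch below is an illustration only: the three environment names and their private/public location come from our lab, while the policy keys and values are invented placeholders, and `plan_deployment` is a hypothetical helper, not a CloudCenter function.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Environment:
    cluster: str
    location: str
    # Policies inherited by every application deployed here.
    # Keys and values are placeholders for this sketch.
    policies: Dict[str, str] = field(default_factory=dict)


ENVIRONMENTS: Dict[str, Environment] = {
    "integration test": Environment(
        "integration test", "private cloud",
        {"autoscaling": "off", "network": "isolated"},
    ),
    "performance test": Environment(
        "performance test", "private cloud",
        {"autoscaling": "on", "network": "isolated"},
    ),
    "production": Environment(
        "production", "public cloud",
        {"autoscaling": "on", "network": "public"},
    ),
}


def plan_deployment(application: str, phase: str) -> Dict[str, object]:
    """Pick the environment for the current phase and attach its policies."""
    env = ENVIRONMENTS[phase]  # a KeyError signals an unknown phase
    return {
        "application": application,
        "cluster": env.cluster,
        "location": env.location,
        "policies": dict(env.policies),  # copy: the caller cannot alter the model
    }
```

The caller (Jenkins, in our demo) only passes the phase; everything else is decided by the governance model stored with the environment.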

The 3 Kubernetes clusters mentioned above have been generated by the Cisco Container Platform (though we could have created them manually in each cloud). The value in using CCP lies in consistent operations, speed and ease: in a few minutes we created 3 production-ready clusters, fully integrated with networking, storage, security, monitoring and logging, without even touching the K8s installer or the underlying infrastructure.
The 2 clusters named “integration test” and “performance test” were created automatically inside VMs in a local VMware environment, while the cluster named “production” was created in the Amazon cloud (CCP uses the API exposed by the Amazon EKS service to do everything automatically, including the integration with AWS IAM for security).

The automated deployments will repeat, in the three environments, in a sequence that alternates them with the necessary tests and ensures the quality of the release. Though in the real world you might want to run more complex testing activities, this is a meaningful example of the efficiency you can achieve thanks to full automation of the process.
The pipeline can still be extended with additional tests, such as code quality inspection and more.


Picture 5 - the CI/CD cycle

Demo flow

The next picture is a sequence diagram showing all the actions that we have automated.
We used a color code to represent the phases that are commonly referred to as Continuous Integration (the green part) and Continuous Deployment (the orange part).
CCC stands for Cisco CloudCenter, while K8s dev, test and prod represent the 3 Kubernetes clusters mentioned above.
The entire process is completely automated and brings a new version of the application to the production deployment without any human intervention.
This complete automation is often referred to as Continuous Deployment. Though very useful and adopted by big players like Facebook (whose pipeline is more complex than our simplified demo), it is not very common among the customers I generally meet.
Those who have adopted DevOps still prefer to have some human checks between the activities, so that they feel they have better control over the process and its quality.
As they gain experience, they will probably become confident enough to delegate every check to the automation tools.


Picture 6 - a sequence diagram showing the automated actions


Implementation

The automation is based on Jenkins, an open source orchestrator that benefits from the availability of hundreds of plugins: it can automate almost every component in your IT ecosystem, including Cisco CloudCenter of course.

In the Jenkins dashboard you can build different projects, like in the picture below. A project is a sequence of steps, using plugins to drive activities in the systems you want to automate (e.g. pull the source code from the repository, compile it, build container images, trigger a cloud deployment through CloudCenter, etc.).

Picture 7 - Jenkins projects


Projects can call other projects, to make your orchestration modular and reusable. In the picture above, the project TheWall (that is the name of our demo application) calls the other 5 projects in a sequence, checking that the outcome is positive before calling the next project.

So, we are able to automate the deployments in the 3 Kubernetes clusters and to run the functional test and the performance test of the application using an external tool (we used another open source product called Apache JMeter).

The functional test is a sequence of user transactions, executed by the test tool using a pool of user identities and a pool of input data, where assertions about the expected result are validated automatically. If the page generated by the application differs from the expected result, an error is logged, and the test can be considered failed. So, the functional test ensures that the application behaves as expected from a functional standpoint (and you can avoid a manual test for user acceptance).
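
The logic of such a functional test can be sketched in Python as follows. This is not a JMeter test plan: it is a simplified illustration in which `fetch_page` is any function returning the page for a user and a path, `fake_app` is a made-up stand-in for the application, and the assertion is reduced to checking that an expected text appears in the page.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Transaction:
    """One user transaction with the assertion on its expected result."""
    user: str
    path: str
    expected_text: str


def run_functional_test(
    fetch_page: Callable[[str, str], str],
    transactions: List[Transaction],
) -> List[str]:
    """Execute every transaction and return the list of logged errors.

    An empty list means the functional test passed.
    """
    errors: List[str] = []
    for tx in transactions:
        page = fetch_page(tx.user, tx.path)
        if tx.expected_text not in page:
            errors.append(f"{tx.user} {tx.path}: expected {tx.expected_text!r}")
    return errors


def fake_app(user: str, path: str) -> str:
    """Hypothetical stand-in for the deployed application."""
    pages = {"/": "Welcome to TheWall", "/login": f"Hello {user}"}
    return pages.get(path, "404 Not Found")
```

In the real pipeline the orchestrator only needs the final verdict: no errors means the build can move on, any error means it stops.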

The performance test, executed by the same tool, stresses the application and the infrastructure from a performance standpoint. A large number of concurrent users are simulated by the tool, invoking a sequence of user transactions with random wait times, reproducing a situation similar to the workload in a production environment. Response times are tracked, and so are any errors, allowing the tool to declare whether the test succeeded or not.
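
A minimal sketch of the same idea in Python, using only the standard library. It is not how JMeter works internally and the pass criterion (no errors, average response time under a threshold) is an assumption made for the example; `transaction` is any callable representing one user transaction.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def simulate_user(
    transaction: Callable[[], None],
    iterations: int,
    max_wait: float,
    rng: random.Random,
) -> Tuple[List[float], int]:
    """Run the transaction several times with a random wait before each call.

    Returns the response times of the successful calls and the error count.
    """
    timings: List[float] = []
    errors = 0
    for _ in range(iterations):
        time.sleep(rng.uniform(0, max_wait))  # random wait time between requests
        start = time.perf_counter()
        try:
            transaction()
        except Exception:
            errors += 1
        else:
            timings.append(time.perf_counter() - start)
    return timings, errors


def run_performance_test(
    transaction: Callable[[], None],
    users: int,
    iterations: int,
    max_wait: float,
    max_avg_seconds: float,
) -> bool:
    """Simulate concurrent users and decide whether the test succeeded."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(simulate_user, transaction, iterations, max_wait,
                        random.Random(seed))
            for seed in range(users)
        ]
        results = [future.result() for future in futures]

    timings = [t for user_timings, _ in results for t in user_timings]
    errors = sum(user_errors for _, user_errors in results)
    if errors or not timings:
        return False
    return sum(timings) / len(timings) <= max_avg_seconds
```

Each simulated user gets its own random generator, so the wait times differ between users without sharing state across threads.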

Based on the outcome produced by JMeter, the Jenkins orchestrator will continue with the Continuous Deployment pipeline or abort it, notifying the developers that something went wrong and a correction is required.
In this case, the CI/CD cycle will restart from the beginning: source code modified and committed, application built and deployed to the first environment, test executed, application promoted to the next environment and tested… until the pipeline is completely executed without any warning or error and the application is released automatically in production.
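
The "continue or abort" behavior can be summarized with a short Python sketch. It is a simplification of what the Jenkins pipeline does, written under the assumption that every stage reports a simple success or failure; the stage names follow our demo, while `run_pipeline` itself is a hypothetical helper.

```python
from typing import Callable, List, Tuple

# A stage is a name plus an action that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first failure.

    Returns the overall outcome and a log of what was executed.
    """
    log: List[str] = []
    for name, action in stages:
        if action():
            log.append(f"{name}: ok")
        else:
            log.append(f"{name}: FAILED, pipeline aborted")
            return False, log
    return True, log
```

For example, if the performance test fails, the production deployment is never attempted:

```python
stages = [
    ("build", lambda: True),
    ("deploy to integration test", lambda: True),
    ("functional test", lambda: True),
    ("deploy to performance test", lambda: True),
    ("performance test", lambda: False),
    ("deploy to production", lambda: True),
]
succeeded, log = run_pipeline(stages)
```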

The next picture shows the execution of the Jenkins pipeline for three different builds of the application. The most recent execution failed because the modification of the source code introduced an error that blocked the build. The other two executions succeeded, as demonstrated by the green color of every step in the pipeline.

Picture 8 - Jenkins pipeline



Jenkins logs all the activities, so that you can check what happened during the automated process.
The next picture shows the output of the sub-project named TheWall_Deploy_Test, which is the 7th stage in the pipeline in the previous picture.
It uses the API exposed by CloudCenter to deploy the application “TheWall” to a test environment running Kubernetes that is robust enough to sustain the workload of the performance test (while the functional test can also be executed in a smaller cluster with less computing power).


Picture 9 - output from the Jenkins CI/CD pipeline


You don’t have to code the API calls, because CloudCenter ships a plugin for Jenkins that integrates into its user interface graphically. But if you prefer, Jenkins can run scripts and commands from the CLI for you.

Conclusions

DevOps makes it faster and easier

If you adopt a DevOps methodology you take agility to the extreme and get a business outcome from the fast release of applications.

Cultural change is needed (use incentives)

DevOps is not a matter of technology. Your people need to work in a different way: no finger pointing between Developers and Operations, no “it’s not my job”, everybody should commit towards a common goal and enjoy their common achievement.
Initially it will be difficult: you have to teach them little by little. Offer incentives to people who show a collaborative attitude and a spirit of innovation, let them feel like the heroes of the new adventure and assure them that failures will not get them into trouble. You learn from your mistakes and there is no magic wand to start directly with a perfect solution.
In addition, traditional methodologies also generate project failures: the difference is that DevOps anticipates problems and you discover them sooner, so the business impact is much smaller.

Cisco offers an effective toolset to help the adoption of DevOps practices

We all agree that DevOps is not a product. But once you start working you will see that automation helps the CI/CD process run smoothly. You can find great open source (and free) tools, e.g. Jenkins, JMeter and Ansible, to support your project teams, but if you also adopt Cisco CloudCenter and the Cisco Container Platform your professional life will be much easier.

Credits

The demo lab described in the post has been built with two colleagues and friends, whom I want to thank here: Stefano Gioia and Riccardo Tortorici.

References

Jenkins – https://jenkins.io 






August 1, 2018

Lifecycle of an application in CloudCenter with CI/CD

In a previous post we demonstrated how to automate the setup of a Continuous Integration / Continuous Deployment environment.

Now we will demonstrate how to use it: a developer can create an application that will be compiled, then built and deployed into a test environment automatically using this CI/CD toolset. 
The next picture shows the list of operations automated by the CI/CD toolset.

Application deployment: sequence of automated operations



These are the lifecycle steps that we will demonstrate in this post:
1.    Deploy the PetClinic application (introduced in the previous post) automatically.
2.    Push the Java source code to a repository (SVN).
3.    Create the next release of the application by modifying the Java source code and saving it as a new version in the repository.
4.    Watch the Jenkins orchestrator create the new build, save it in the binaries repository (Artifactory) and use CloudCenter to deploy it.

1 - Deploy the PetClinic application automatically. 

We start by deploying the Java application PetClinic, using the Application Profile created in CloudCenter, into our development environment in the lab. The correct behavior of the application is tested by accessing its home page and verifying that it shows correctly in the browser.

 2 - Push the Java source code to a repository (SVN).

We then push the Java source code of the PetClinic application into the repository (SVN) that was created in our previous task, committing it as the initial release of the application.


Source code control: committing the java code into the SVN repository



An automated build of the application and its deployment follow, as shown in the workflow above, thanks to the Jenkins orchestrator and CloudCenter. If we access the Jenkins GUI (see next picture) through a web browser and select the project “repo1”, we can see that Jenkins is currently creating a new build: look at the progress bar. As soon as the build finishes, the binaries are copied into the Artifactory repository of binary files and the Jenkins job called “deploy” starts.

Jenkins: following the build process



If we access the “deploy” job in Jenkins, we can see that a new build of the PetClinic application has been sent to CloudCenter to be deployed. This is made possible by the plugin that integrates Jenkins with CloudCenter.

Jenkins: deployment of the PetClinic application through CloudCenter




CloudCenter: viewing the deployment details of the deployed PetClinic application



PetClinic: home page of the deployed application



 3 - Create the next release of the application by modifying the Java source code and saving it as a new version in the repository.

Let’s assume now that another developer is working on improving the front end of the application and is ready to commit a major chunk of code. For the sake of the demonstration we will only change the picture and the text on the homepage, but we will show that, as soon as we commit the modifications to SVN, the new application is automatically deployed in the Development environment via CloudCenter (as you already know, the Development environment could be in any on-premises/hosted private or public cloud).

Application lifecycle: creating a new version of the code of PetClinic



We will modify the file petclinic/src/main/webapp/WEB-INF/jsp/welcome.jsp, changing the text and the pet image (see next picture). Once we are done, we save the new version of the file and commit it (right click, SVN Commit). After waiting a few minutes for the whole chain of operations to finish, we will find a new deployment of the application in CloudCenter -> Projects -> Project PetClinic. Now we can navigate the application, in the test environment, to see what the new release looks like: you can see that the puppy picture and the text message have been updated according to the edit done by the developer.

PetClinic: home page of the modified application

Conclusion  


The two use cases shown in this series of posts:
•    creation of a CI/CD environment as a service, and
•    automated deployment of every new release of an application
demonstrate the power of CloudCenter as an orchestrator in deploying applications across a multicloud environment.

Every stage of the project (dev, test, prod…) can be associated with a different deployment environment, potentially in different clouds, each having its own set of configurations, policies and rules. This information is stored in CloudCenter as part of the governance model you build for your IT.

The application will move automatically from one phase of the project to the next one if it passes the specific tests (i.e. integration, functional and performance tests, run by the automation tool) after each deployment.

In future posts we’ll show how to also automate these tests in a CI/CD pipeline.
We will use open source tools like Apache JMeter to run functional tests designed together with the application and automated by scripts stored in the same source code repository.
And we will run performance tests with the same tool, of course after CloudCenter has moved the deployment to a target environment that is able to sustain the load we generate.

Credits 


This post is co-authored with a colleague of mine, Stefano Gioia.

References:

CloudCenter