Getting started with the basics of building your own cloud
My daily routine involves a lot of AWS cloud infrastructure. And let me tell you, AWS has grown to such an extent that it has become a synonym for cloud. They have grown by leaps and bounds in the past few years, and believe me, the other major players are not even near them in the cloud arena (yes, of course Google and Microsoft have their own cloud solutions, which are pretty brilliant for all use cases, but nobody has the user/customer base that AWS has for its public cloud).
Nothing can match the flexibility, elasticity, and ease of use that the cloud provides. I remember when I used to work with physical machines: I had to literally wait for hours to get one up and running for an emergency requirement, and if I needed additional storage for that machine, I had to wait some more. With the cloud, you can spin up a few servers in seconds (believe me, in seconds) and test whatever you want.
What is OpenStack Cloud?
A year ago I happened to read an article from Netcraft about their findings on AWS. According to them, in 2013 AWS had already crossed the mark of 158K public-facing computers.
Now imagine getting the same features that AWS provides from something open source that you can build in your own data centre. Isn’t that amazing? Well, that’s the reason why tech giants like IBM, HP, Intel, Red Hat, Cisco, Juniper, Yahoo, Dell, NetApp, VMware, GoDaddy, PayPal, and Canonical (Ubuntu) support and fund such a project.
This open source project is called OpenStack, and it is currently supported by more than 150 tech companies worldwide. It started as a joint project by NASA and Rackspace in 2010 (both had been independently developing their own projects, which were later brought together under the name OpenStack). NASA was behind a project called Nova (analogous to Amazon EC2, providing compute), and Rackspace built a tool called Swift (a highly scalable object storage solution, very similar to AWS S3).
Apart from these, there are other components that make OpenStack very similar to the AWS cloud (we will discuss each of them shortly, and in upcoming tutorials we will configure each of them to build our own cloud).
OpenStack can be used by anybody who wants their own cloud infrastructure, similar to AWS. Although its origins trace back to NASA, it’s no longer actively developed or supported by NASA.
And they are currently leveraging AWS public cloud infrastructure :)
If you simply want to use an OpenStack public cloud, you can use Rackspace Cloud, eNovance, HP Cloud, etc. (these are very similar to the AWS cloud), each with its associated cost. Apart from these public OpenStack offerings, there are plug-and-play cloud services, where you get a dedicated hardware appliance for OpenStack. Just purchasing it and plugging it in gives you an OpenStack cloud without any further configuration.
Let’s now discuss some of the crucial components of OpenStack, which together make a robust cloud like any commercial cloud (such as AWS), but in your own datacenter, completely managed and controlled by your team.
When you talk about cloud, the first thing that comes to mind is virtualization, because virtualization is the technology that made this cloud revolution possible. Virtualization is basically the method of slicing the resources of a physical machine into smaller parts as required; those slices act as independent hosts, sharing the machine’s resources with the other slices. This enables optimal use of computing resources.
- OpenStack Compute (Nova): One of the main components of a cloud is virtual machines that can scale on demand. In OpenStack this need is fulfilled by Nova, the software component that offers and manages virtual machines.
Apart from compute, the second major requirement is storage. There are two different types of storage in the cloud. The first is block storage, or normal disk storage (very similar to the way you take a RAID partition on one of your servers, format it, and use it for all kinds of local storage needs), where your operating system files are installed, etc.
- OpenStack Block Storage (Cinder): works like attaching and detaching an external hard drive to your operating system for its local use. Block storage is useful for database storage or as raw storage for the server (format it, mount it, and use it), or you can combine several volumes for distributed file system needs (for example, you can build a large Gluster volume out of several block storage devices attached to a virtual machine launched by Nova).
The second type of storage fulfills the need to scale without bounds. You need storage that can grow without worry, where what you store is static objects. It can be used for large static data like backups, archives, etc. It is accessed through its own API and can be replicated across datacenters to withstand large disasters.
- OpenStack Object Storage (Swift): is suitable for storing multimedia content like videos and images, virtual machine images, backups, email storage, archives, etc. This type of data needs to grow without any limitation and needs to be replicated. This is exactly what OpenStack Swift is designed to do.
Last but not least comes networking. Networking in the cloud has become so mature that you can create your own private networks and access control lists, create routes between networks, interconnect different networks, connect to a remote network using VPN, etc. Almost all of these needs of an enterprise cloud are taken care of by OpenStack networking.
- OpenStack Networking (nova-network or Neutron): When I say OpenStack networking, think of it as something that manages networking for all our virtual hosts (instances) and provides IP addresses, both private and public. You might be thinking that networking in virtualization is quite easy: set up a bridge adapter and route all traffic through it, as with many virtual adapters. But here we are talking about an entire cloud, which should have public IPs that can be attached to and detached from the instances we launch inside it, there must be one fixed IP for each instance, and there must never be a single point of failure.
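The difference between a fixed IP and an attachable public IP can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the `ToyNetwork` class, its method names, and the address ranges are hypothetical and have nothing to do with the real OpenStack networking API.

```python
import ipaddress


class ToyNetwork:
    """Toy model: one fixed private IP per instance, movable public IPs."""

    def __init__(self, private_cidr, public_ips):
        # Hypothetical address pools, chosen only for illustration.
        self._private_hosts = iter(ipaddress.ip_network(private_cidr).hosts())
        self.free_public = list(public_ips)
        self.fixed = {}      # instance name -> fixed private IP
        self.floating = {}   # public IP -> instance name

    def launch(self, instance):
        # Every instance gets exactly one fixed IP when it is launched.
        self.fixed[instance] = str(next(self._private_hosts))
        return self.fixed[instance]

    def attach_public(self, instance):
        # Take a public IP from the pool and map it to the instance.
        if instance not in self.fixed:
            raise KeyError(f"unknown instance: {instance}")
        ip = self.free_public.pop(0)
        self.floating[ip] = instance
        return ip

    def detach_public(self, ip):
        # The public IP goes back to the pool; the fixed IP is untouched.
        self.floating.pop(ip)
        self.free_public.append(ip)


net = ToyNetwork("10.0.0.0/29", ["203.0.113.10", "203.0.113.11"])
web_fixed = net.launch("web-1")
web_public = net.attach_public("web-1")
net.detach_public(web_public)
```

After the detach, `web-1` still has its fixed address, while the public address is free to be attached to some other instance.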
In my opinion, OpenStack networking is the most complex part, and it needs to be designed with extreme care. We will discuss it in detail in a dedicated post because of its complexity and importance. It can also be done with two different tools: one is called nova-network, and the other is called Neutron. Please note that every component of an OpenStack cloud needs special attention on its own, as each is distinct, and they work together to form a cloud. Hence I will be doing a dedicated post for each of the major components.
OpenStack is highly configurable, and for this very reason it’s quite difficult to cover all of its possible configurations in a tutorial. You will see this later, when we start configuring things in the upcoming series of posts.
|Component Name|Used for|Similar to|
|---|---|---|
|Horizon|A dashboard for end users or administrators to access the other backend services|AWS Management Console|
|Nova Compute|Manages virtualization and takes requests from end users, through the dashboard or API, to create virtual instances|Amazon EC2 (Elastic Compute Cloud)|
|Cinder|Block storage, directly attachable to any virtual instance, similar to an external hard drive|EBS (Elastic Block Store)|
|Glance|Maintains a catalog of images and acts as a kind of repository for them|AMI (Amazon Machine Images)|
|Swift|Object storage that your applications or instances can use to store static objects like multimedia files, backups, images, archives, etc.|AWS S3|
|Keystone|Manages authentication services for all components: credentials, authentication, and authorization for users|AWS Identity and Access Management (IAM)|
By now you might have an idea of what an OpenStack cloud actually is. Let’s now answer some questions that can help in understanding a little more about what OpenStack really is, or rather how these individual components fit together to form a cloud.
What is Horizon Dashboard?
It’s nothing but a web interface for users and administrators to interact with your OpenStack cloud. It’s basically a Django web application served by Apache through mod_wsgi. Its primary objective is to talk to the backend APIs of the other components and execute requests initiated by users. It interacts with the Keystone authentication service to authorize requests before doing anything.
Does nova-compute perform virtualization?
Well, nova-compute is basically a daemon that does the job of creating and terminating virtual machines. It does this through the hypervisor’s API calls. There is a library called libvirt. Libvirt is nothing but an API for interacting with Linux virtualization technologies (it’s free and open source software that needs to be installed alongside Nova as a dependency).
Basically, libvirt gives nova-compute the ability to send API requests to KVM, Xen, LXC, OpenVZ, VirtualBox, VMware, and Parallels hypervisors.
So when a user in OpenStack requests to launch a cloud instance, what actually happens is that nova-compute sends requests to the hypervisor using libvirt. Other than libvirt, nova-compute can also send requests directly to the Xen API, the vSphere API, etc. This wide support for different virtualization technologies is the main strength of Nova.
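The idea of one daemon forwarding the same launch request to different virtualization backends can be pictured with a small dispatch sketch. Everything here is hypothetical and for illustration only: the class names, the `spawn` method, and the `DRIVERS` table are my own and are not Nova’s actual driver code, and the methods just return strings instead of calling libvirt or the Xen API.

```python
class HypervisorDriver:
    """Hypothetical common interface that every backend implements."""

    def spawn(self, name):
        raise NotImplementedError


class LibvirtDriver(HypervisorDriver):
    def spawn(self, name):
        # A real driver would call into the libvirt library at this point.
        return f"libvirt: define and start domain {name}"


class XenApiDriver(HypervisorDriver):
    def spawn(self, name):
        # A real driver would talk to the Xen API directly at this point.
        return f"xenapi: create VM {name}"


# Hypothetical lookup table from a configured driver name to its class.
DRIVERS = {"libvirt": LibvirtDriver, "xenapi": XenApiDriver}


def launch_instance(driver_name, instance_name):
    """Pick the configured driver and forward the launch request to it."""
    driver = DRIVERS[driver_name]()
    return driver.spawn(instance_name)


result = launch_instance("libvirt", "demo-vm")
```

The caller never needs to know which hypervisor sits underneath; swapping `"libvirt"` for `"xenapi"` changes the backend without touching the request itself.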
How does Swift work?
Well, Swift is a highly scalable object store. Object storage is a big topic in itself, so I recommend reading the post below.
Unlike files on a block storage file system, objects are not organized in a hierarchical namespace but in a flat one. Although it can give you the illusion of folders with contents inside, all files in all folders live in a single namespace, which makes scaling much easier compared to block storage.
Swift combines multiple commodity servers and backend storage devices into one large pool of storage, sized to the requirements of the end user. It can be scaled out simply by adding more nodes in the future.
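A flat namespace with the illusion of folders is easy to picture with a plain Python dictionary. This is a minimal sketch, not Swift’s API: the object names and the `list_folder` helper are made up for illustration. Every object lives under one full name, and a "folder" is just a shared prefix.

```python
# One flat mapping from full object name to content; there are no directories.
objects = {
    "photos/2014/beach.jpg": b"jpeg bytes",
    "photos/2014/city.jpg": b"jpeg bytes",
    "backups/db.tar.gz": b"archive bytes",
}


def list_folder(store, prefix, delimiter="/"):
    """Emulate a folder listing by cutting names at the next delimiter."""
    entries = set()
    for name in store:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        # Keep only the part up to (and including) the next delimiter, if any.
        head, sep, _ = rest.partition(delimiter)
        entries.add(prefix + head + sep)
    return sorted(entries)


top_level = list_folder(objects, "")
inside_2014 = list_folder(objects, "photos/2014/")
```

Listing a "folder" is only a filter over names, so nothing has to be moved or restructured when the store grows.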
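To get a feel for how objects can be spread over a pool of nodes, and why adding a node does not force everything to move, here is a small sketch. Please note the assumption: I am using rendezvous hashing here because it fits in a few lines, and it is not Swift’s own placement mechanism. The node names and the `pick_nodes` function are hypothetical.

```python
import hashlib


def pick_nodes(object_name, nodes, replicas=3):
    """Return the `replicas` nodes that should hold copies of the object."""

    def score(node):
        # Hash node and object name together so each object ranks nodes differently.
        return hashlib.sha256(f"{node}/{object_name}".encode()).hexdigest()

    # The nodes with the lowest scores hold the copies; the result is deterministic.
    return sorted(nodes, key=score)[:replicas]


nodes = ["node1", "node2", "node3", "node4"]
before = pick_nodes("backups/db.tar.gz", nodes)
# Scale out by adding one more node to the pool.
after = pick_nodes("backups/db.tar.gz", nodes + ["node5"])
```

With this scheme, the new placement is either unchanged or differs only by the new node taking over one copy, so growing the pool moves very little data.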
What is Keystone?
It’s a single point of contact for policy, authentication, and identity management in an OpenStack cloud. It can work with different authentication backends like LDAP, SQL, or a simple key-value store.
Keystone has two primary functions:
- Manage users: keeping track of all users and their permissions.
- Service list/catalog: providing information about which services are available and their respective API endpoint details.
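These two functions can be sketched together as a toy identity service. This is a rough illustration under my own assumptions, not Keystone’s API: the `ToyIdentity` class, its methods, the user, and the endpoint URL are all made up, and passwords are kept in plain text only because this is a toy.

```python
import secrets


class ToyIdentity:
    """Toy identity service: user tracking plus a service catalog."""

    def __init__(self):
        self._users = {}     # username -> password (plain text: toy only)
        self._tokens = {}    # issued token -> username
        self._catalog = {}   # service type -> API endpoint URL

    def add_user(self, username, password):
        self._users[username] = password

    def register_service(self, service_type, url):
        self._catalog[service_type] = url

    def authenticate(self, username, password):
        # Issue a random token only when the credentials match.
        if self._users.get(username) != password:
            raise PermissionError("invalid credentials")
        token = secrets.token_hex(16)
        self._tokens[token] = username
        return token

    def endpoint_for(self, token, service_type):
        # The catalog is only revealed to holders of a valid token.
        if token not in self._tokens:
            raise PermissionError("invalid token")
        return self._catalog[service_type]


identity = ToyIdentity()
identity.add_user("alice", "s3cret")
# Hypothetical endpoint, made up for this example.
identity.register_service("compute", "http://controller.example:8774/")
token = identity.authenticate("alice", "s3cret")
compute_url = identity.endpoint_for(token, "compute")
```

A client first authenticates, then asks the catalog where a service lives, and only then talks to that service’s endpoint.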
What is OpenStack Cinder?
As discussed before and shown in the diagram, Cinder is nothing but a block storage service. It provides a software block storage layer on top of basic traditional block storage devices to the instances that nova-compute launches.
In simple terms, Cinder does the job of virtualizing pools of block storage (any traditional storage device) and making them available to end users via an API. Users use those virtual block storage volumes inside their virtual machines without knowing where a volume is actually deployed in the architecture or any details about the underlying storage device.
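The pooling idea can be sketched as follows. This is a minimal illustration under my own assumptions, not Cinder’s implementation: the `ToyVolumeService` class, the backend names, and the "most free space first" placement rule are all hypothetical. The point is that the user only ever sees a volume ID, never the device behind it.

```python
import itertools


class ToyVolumeService:
    """Toy block storage pool: users get volume IDs, not devices."""

    def __init__(self, backends):
        # backends: backend name -> free capacity in GB (hypothetical devices).
        self._free = dict(backends)
        self._placement = {}            # volume id -> (backend, size in GB)
        self._ids = itertools.count(1)

    def create_volume(self, size_gb):
        # Place the volume on whichever backend currently has the most free space.
        backend = max(self._free, key=self._free.get)
        if self._free[backend] < size_gb:
            raise RuntimeError("no backend has enough free capacity")
        self._free[backend] -= size_gb
        volume_id = f"vol-{next(self._ids)}"
        self._placement[volume_id] = (backend, size_gb)
        return volume_id                # the backend name is never exposed

    def delete_volume(self, volume_id):
        # Give the capacity back to the backend that held the volume.
        backend, size_gb = self._placement.pop(volume_id)
        self._free[backend] += size_gb


service = ToyVolumeService({"lvm-a": 100, "san-b": 50})
first = service.create_volume(80)
second = service.create_volume(40)
```

Here the two volumes end up on different backends, but the caller cannot tell and does not need to.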