Silicon Valley Road Trip – Day 2

Today we had the pleasure of visiting two other companies in Silicon Valley: Nutanix and PernixData. Although these 2 companies cannot be called startups anymore, they had quite some interesting stuff to show us.

At Nutanix we had a 1 hour meeting with @StevenPoitras (who wrote the Nutanix Bible). If you have ever worked with Nutanix, you have probably visited his web page. If not, you should!

Nutanix and their solution are new to me. So what I really liked is that Steven didn’t present a slide deck but just asked what we would like to talk about. Before we knew it, we were in a discussion about the position of Nutanix and their new Acropolis Hypervisor and the impact on existing VMware, KVM and OpenStack customers. Although sometimes the filter kicked in about what Steven can and cannot say, I really liked the discussion. After 1.5 hours Steven noticed that it was time for lunch. And with a few slices of pizza the discussion continued. Steven told us that most of their Acropolis customers (the hypervisor is based on KVM) are in Asia and that we only see the tip of the iceberg (and then the filter kicked in again 🙂 ).
After more than 2.5 hours we had to go, but we were not finished!

Next was PernixData with @Frankdenneman. After a tour through their building (including the nicely transformed pictures of Star Wars characters with the faces of @SatyamVaghani and Frank Denneman) we were offered a beer (it was after 12 PM) and went to the boardroom.
Here Frank showed us their latest product, FVP Architect. FVP Architect comes with a VIB module in the VMware hypervisor that gathers all storage metadata and sends it to a database where it’s crunched. This gives VMware and storage administrators a real-time and historical overview of how their environment is performing and what can be improved. It also gives an in-depth view of what type of storage workload is running and which virtual machine is generating it, giving you the option to adjust this VM instead of just buying more IOPS for your storage.
After a while we had a discussion about what happens when a virtual machine writes a block of data and how this is handled by the VMkernel on its way to storage. I had the misconception that the VMkernel always writes in 8 KB blocks to VMFS (8 KB is used for sub-allocation). So Frank pulled in one of the VMFS inventors, @mnv104, and we were taught a lesson in how the VMkernel sends data to the storage layer. Wow!

I know Frank has told me a lot of times that most people who were working on the cool stuff in the VMkernel now work at PernixData, but now I experienced it!
After a quick visit from Satyam it was time to leave. This was a really cool visit!


Silicon Valley Road Trip – Day 1

On the 27th and 28th of August I’m on a road trip through Silicon Valley with fellow colleagues @rutgerkoster, @nielshagoort and @frankdenneman.

These days we’re going to visit 4 startup companies that are so kind to spend a couple of hours with us to tell us about their latest innovations.

Yesterday we had the pleasure to visit Rubrik and Platform9.

Rubrik developed a Converged Data Management time machine/backup appliance for the midrange and upper market that can be set up in 15 minutes. Chris Wahl took us through a 1 hour presentation which he is also going to present at VMworld in San Francisco and probably also in Barcelona.

The Converged Data Management solution of Rubrik can be scaled from 3 nodes to an unlimited number of nodes. This enables a backup solution with an unlimited retention time without the need for tapes. And let’s be honest: tape should be dead, but if you want a retention time of more than a year it’s hard to get around tapes these days. Their solution can be attached to the Amazon cloud with S3, so you’re able to do a tier 2 backup in house (or in a second datacenter) and a tier 3 backup in the cloud. When a file or a virtual machine needs to be restored, you just query the database of Rubrik (which is a Cassandra database) for that specific file or virtual machine and the version you want to restore. It doesn’t matter if this file is situated in your private datacenter or in the public cloud. This is completely transparent and encrypted. Using the cloud for tier 3 backup can be useful for companies that need a long retention time, like legal or health care.

I especially like 2 things about their solution.

In traditional backup solutions you create a backup job where you configure several settings like destination and retention time. In Rubrik you create a backup policy and attach this policy to a virtual machine. This policy has all the necessary settings configured. The dashboard of Rubrik gives an overview of how long you can keep your backups based on the current capacity. If after a few years you need more capacity, you just add more nodes to the solution, enabling you to grow on demand whenever you need to.

Because Rubrik is API driven, you can create custom backup scripts as you need them. Currently Rubrik has no support for application-aware backups, for example of Exchange. This is one they are definitely looking into. But with the APIs (which are well documented) it shouldn’t be that hard to create consistent application-aware backups.

Platform9 was so kind to tell us their story during lunch. Platform9 developed a cloud solution for administrating and automating a multi-hypervisor environment in the private cloud, which can be spread across the globe. Their main focus is OpenStack, but Docker, KVM and VMware are also supported.

During the first part of our meeting we saw a slide deck presented by Sirish Raghuram. After about 30 minutes the projector was turned off and we had a really nice and inspiring discussion with Sirish Raghuram and Madhura Maskasky about their product and what the world needs at this moment.

Traditional solutions are normally VM based and not workload based. Workload based means that if your web server environment needs more performance, you add more performance to your web server environment as a whole. This can be a web server, or maybe a proxy with a load balancer configured, or both. This is a completely different approach to deploying services. It doesn’t matter where it runs or on which hypervisor it runs, you just need the capacity for this type of service.

That there is going to be a multi-hypervisor world in the next couple of years is for sure. I know that a lot of companies don’t want to support more than 1 hypervisor because of the cost of managing different platforms, but what if you have 1 management tool that can do it all? Yes, of course you still need to support vSphere, KVM and/or OpenStack. But I’m convinced the installation and maintenance of these hypervisors will be simplified in the next couple of years. And does the other hypervisor need to be on-premises? Or can this also be the public cloud?
A capacity calculation is no longer needed if you can just add resources to your environment as you grow or shrink. Yes, of course there are licenses involved. But the licensing model as we know it right now will change. It has to! The number 1 cost in a datacenter at the moment is licensing. This is why the public cloud can be interesting for companies: you do not need to worry about licensing, hardware cost, etc. But I can imagine you don’t want all your data in the cloud. Putting your data in the cloud is easy. Getting it out can be painful. So a transparent solution supporting multiple hypervisors is a welcome solution!




Silicon Valley Road Trip

A few weeks ago I received a tweet from @frankdenneman asking if I was interested in joining him and a few other guys for a short road trip through Silicon Valley to visit a couple of start-ups before attending VMworld San Francisco.

I didn’t have to think long about whether I wanted to join, but I had to discuss this with my boss at home (yeah, all you guys out there know what I mean 🙂 )

Frank made a list of the companies that are interesting for us to visit:

The companies that aren’t confirmed yet:

First of all, thanks to the companies that are willing to make the effort to welcome us and show us their company! Of course we all know what your company does, but it’s always great to meet passionate people and share information and thoughts.

If you’re working for a company that hasn’t confirmed that we’re welcome, please do! We would sure like to visit you!


Nutanix .NEXT first day

When I got to the keynote this morning, the word was out that some pretty cool announcements would be made. And that turned out to be true.

One of the biggest announcements is Acropolis.
This is Nutanix’s own hypervisor platform with their management interface Prism. The hypervisor is based on KVM but is modified to run optimally with the Nutanix filesystem. Acropolis is not the name of the hypervisor but of the solution. And it can manage not only KVM but also VMware and Hyper-V. During the keynote we saw a demo in which an ESXi host was reinstalled with KVM, managed from the Prism interface.
The vision of Nutanix is a multi-hypervisor world where you put your workload on the hypervisor you think fits it best, all managed from one interface. Even the creation of one or more VMs is done from Prism.

I must say that I’m impressed with the capabilities of Acropolis. One single interface for multiple hypervisors. Nice!

Still, I don’t think that a company wants to run multiple hypervisors in its environment, even if they are managed from one interface. Every hypervisor must be managed and configured for performance, security and patches. The same goes for upgrades. Most companies will choose one hypervisor as their standard.

On the other hand, if you choose a web-scale solution like Nutanix, you have the possibility to change your hypervisor without losing your management interface, reducing the learning curve for your IT department. And you have the possibility to use a different (cheaper) hypervisor for your R&D or lab environment.

Some other features of Acropolis are:

  • IP address management for virtual machines.
  • Collection of statistics of different hypervisors.
  • HA scheduler
  • VM console (done with VNC)
  • Configuring network settings for VMs.

Being an old Novell consultant/trainer, I’m pleased that an open source project like KVM gets a boost in the enterprise from a company like Nutanix. Although not everything is possible at the moment (VDI for example), I’m convinced this is only a matter of time.

I’m impressed with the work that Nutanix has done in the last 5 years and I’m curious what’s coming .NEXT.


Hyper converged vs. traditional SAN/NAS?

The hyper converged market is booming. Vendors like Nutanix, SimpliVity or VMware with their VSAN are at almost every event spreading the word. But is hyper converged the solution to everything? As always the answer to this question is: it depends.

Let me start by explaining the basics of a hyper converged system.

Hyper converged is a method of combining the local storage of multiple servers into one logical shared storage pool. So when a hypervisor does a read/write, it’s talking to local storage.
Most of these solutions use auto-tiering with SSD or RAM in every host, so when the hypervisor does a read/write, it’s talking to a local SSD.
Hyper converged solutions make sure that when a VM runs on host A, the data of this VM is also on the local storage of host A. When this VM gets moved to another host, the data of that VM is moved as well.

Talking to local SSD, flash or RAM has one advantage over talking to SSD, flash or RAM in a SAN/NAS, and that is latency. When talking to a SAN/NAS you always have to go through a storage network. This storage network brings extra latency that local SSD, flash or RAM doesn’t have. It doesn’t matter if this storage network is Ethernet or Fibre Channel, although Fibre Channel has lower latency than Ethernet.
OK, when a host writes a block of data, this block has to be replicated to another host, so your storage network can be a bottleneck again. But for reads you cannot beat local storage when it comes to IOPS and latency, even when you have an all-flash storage array.

You can ask yourself: Does the extra latency that a storage network adds matter? And of course the answer again is: It depends.

When you have, for example, a desktop workload (VDI, RDS), a database workload or a web server, the low latency and high IOPS can be a great advantage. When you just have a file or print server, you can accept a higher latency and fewer IOPS.

So what is the disadvantage of a hyper converged solution?
In my opinion it is the amount of TB you can have in a hyper converged solution and the costs that come with it.
In a SAN/NAS you can easily have 1 PB of storage. Yes, this isn’t storage with 100K IOPS and a latency of less than 1 ms, but who cares? We all know that in most cases 80% of the storage in our environment holds data that hasn’t been touched in the last 2 years.

There is also a third option: a solution like PernixData FVP. In this case you add flash or RAM to your hosts and use it as an intelligent caching layer for your SAN/NAS. This brings the same advantage as a hyper converged solution.
A solution like PernixData has 2 disadvantages:

  1. When you read a block for the first time, you have to fetch it from your SAN/NAS first.
  2. When you write data, it lands on local SSD, flash or memory, but after a while it has to be written to your SAN/NAS. This can only be as fast as your SAN/NAS is.
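
Both disadvantages are easy to see in a toy model. The sketch below is a generic host-side write-back cache in front of a slower backend, not how PernixData FVP is actually implemented. The class name `HostCache` and both latency figures are made up for illustration.

```python
class HostCache:
    """Toy write-back cache on a host, in front of a slower SAN/NAS."""

    LOCAL_MS = 0.1    # assumed latency of local flash (made-up figure)
    BACKEND_MS = 5.0  # assumed latency of the SAN/NAS (made-up figure)

    def __init__(self):
        self.cache = {}     # block number -> data held on local flash
        self.backend = {}   # block number -> data stored on the SAN/NAS
        self.dirty = set()  # blocks written locally but not yet destaged

    def read(self, block):
        """Return (data, latency in ms). The first read of a block is a miss."""
        if block in self.cache:
            return self.cache[block], self.LOCAL_MS
        data = self.backend.get(block)
        self.cache[block] = data  # keep a copy for the next read
        return data, self.BACKEND_MS

    def write(self, block, data):
        """Acknowledge the write at local speed and remember it as dirty."""
        self.cache[block] = data
        self.dirty.add(block)
        return self.LOCAL_MS

    def destage(self):
        """Flush dirty blocks to the SAN/NAS; this runs at backend speed."""
        elapsed = 0.0
        for block in sorted(self.dirty):
            self.backend[block] = self.cache[block]
            elapsed += self.BACKEND_MS
        self.dirty.clear()
        return elapsed
```

The first `read()` of a block pays the backend latency and every later read is local. A `write()` is acknowledged at local speed, but `destage()` still takes as long as the SAN/NAS needs.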

So, to come to a conclusion: I think that when you have a workload that demands low latency and high IOPS you cannot beat hyper converged. If you want to have a lot of TB, then you want to go for a traditional SAN/NAS, with or without a solution like PernixData.

Do you have to make a choice? Of course not, why not use them both? Use the type of storage you think is best for each type of workload.
Use the best of both worlds in your world!




Most simple web server ever!

When I install VMware vSphere on a server, I always (OK, 9 out of 10 times) use a scripted install. A scripted installation has a couple of advantages over a manual installation:

  • Every host is installed with the same configuration, limiting the possibility of a misconfiguration by an administrator (even after a year).
  • The installation is finished in less time.
  • You have a good disaster recovery option for your hosts available.
  • You can (re)install multiple hosts at the same time.
  • You have good documentation of how your hosts are configured.
  • You can have a cup of coffee while installing 🙂

Almost every server has an IPMI, iLO or DRAC interface, enabling you to connect an ISO file to a virtual CD-ROM drive so you can start your ESXi installation.

In the past I used a USB stick to host the ks.cfg file. So when the ESXi installation kicks off, you just point the installer to the kickstart file on the USB stick and you’re ready to go.
But for a couple of years now ESXi has more often been installed on a local SD card. This SD card is connected to an internal USB interface. So you have to configure your kickstart file to install on the correct USB device or you will overwrite the USB stick where your kickstart file is located (yes, it happened to me a couple of times).
In large environments you may want to use a PXE boot environment where you can host your kickstart file on an NFS share or a web server. But what if you only have to install a couple of hosts in a small environment?

A simple solution to this problem is to host a small web server on your laptop where you host the kickstart file.

For Microsoft Windows there are a couple of simple small web servers available, but I’m using a Mac for my daily work. So while searching the Internet for a good, small and free web server I stumbled on a thread where a guy pointed out that Mac OS X has its own built-in web server in Python!

Just open a terminal window and run the following command:

python -m SimpleHTTPServer 8080

This command will start a simple, small and free web server on your Mac on port 8080, serving the current directory (beware of your firewall!). Just as simple as that!
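
The `SimpleHTTPServer` module only exists in Python 2. In Python 3 it was merged into `http.server`, so there the one-liner becomes `python3 -m http.server 8080`. Below is a minimal sketch of the same idea as a small script, assuming Python 3.7 or later. The function name `make_kickstart_server` and its defaults are just my illustration.

```python
import functools
import http.server


def make_kickstart_server(directory=".", port=8080):
    """Return an HTTP server that serves the files in `directory`."""
    # SimpleHTTPRequestHandler accepts a `directory` argument since Python 3.7.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # An empty host string means: listen on all interfaces of the laptop.
    return http.server.ThreadingHTTPServer(("", port), handler)


# To actually serve the current directory (stop with Ctrl+C):
#     make_kickstart_server().serve_forever()
```

The only thing the script adds over the one-liner is that you can point it at the directory holding your ks.cfg without having to `cd` there first.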

I haven’t tested this on a Windows installation where Python is installed, but I can imagine that it will work.


Multiple 1G or 10G in your data center?

The last couple of weeks I had the same discussion with 3 different customers: are we going to use multiple 1G connections (in a LACP or not) to the core switch, or 2 10G connections?

When designing a VMware vSphere environment including a storage design, this is one of the first questions that pops up.

As always the truth is in the middle; both solutions have their pros and cons.

But first, let me clear up something that you all probably know: 10 x 1G connections to your core switch aren’t the same as 1 x 10G connection to your core switch. Usually I use the following example to explain the difference to a customer.

You can compare 10 x 1G connections to a highway with 10 lanes where every car (packet) can drive at the speed of 1G. So the maximum throughput per session is no more than the speed of one lane, in this case 1G. Every vSphere host has 1 connection to the highway where it can place cars on the road. The advantage is that if you have more vSphere hosts, you can put more cars on the road at the same time.

If you have a 10G connection (or in many cases 2 x 10G because you want to be redundant) you have one lane on the highway where all the cars can drive at the speed of 10G. Because the cars drive at a higher speed, there is more room on the highway to place cars on. With multiple 1G links this is different: even when you bundle them in a LACP (which is available with a distributed switch from version 5), a single session cannot go faster than 1G.

10G is especially nice when you have an NFS or iSCSI storage solution. Not that most storage solutions use the whole 2 x 10G (in fact, most cannot even fully utilize 1 10G connection), but the session between your vSphere hosts and your storage solution can go faster than 1G.

This is in my opinion the most important reason why you want 10G in your data center: you can have more than 1G per session.
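
To put some numbers on the highway example, here is a rough sketch. It assumes ideal links without any protocol overhead, and that a single session always stays on one physical link. The function names and the 100 GB example are my own illustration.

```python
def session_speed_gbps(link_speed_gbps, link_count):
    """Maximum speed of ONE session.

    A single session stays on a single link, so `link_count` is
    deliberately ignored: more lanes do not make one car faster.
    """
    return link_speed_gbps


def aggregate_speed_gbps(link_speed_gbps, link_count):
    """Maximum speed of all sessions together, spread over every link."""
    return link_speed_gbps * link_count


def transfer_seconds(size_gigabytes, speed_gbps):
    """Ideal time to move `size_gigabytes` at `speed_gbps` (gigabit/s)."""
    size_gigabits = size_gigabytes * 8
    return size_gigabits / speed_gbps


# Moving 100 GB (for example one large virtual disk) in a single session:
over_10x1g = transfer_seconds(100, session_speed_gbps(1, 10))  # 800.0 seconds
over_1x10g = transfer_seconds(100, session_speed_gbps(10, 1))  # 80.0 seconds
```

Both setups have the same aggregate bandwidth of 10G, but the single session finishes ten times faster on the 10G link.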

In the past the price of a 10G solution was a bit of an issue. Nowadays you can have redundant 10G L3 core switches from vendors like HP and Cisco for under 15K, SFPs and cabling included.

Another pro is the cabling in your rack. You have a cleaner rack that is easier to administer when you have only 2 cables per host instead of 6 or 8. Fewer cables also means that your core switch can do with fewer ports, resulting in a smaller switch. Of course this switch has to handle a lot of traffic, but rack space is also important nowadays.

To sum up: if you have the budget, I would go for 10G. Not that we’re going to use the full 10G, but we sure want to use more than 1G. You have to look at your budget and talk to your stakeholders about what the best solution is for your design.




1 Month after VMworld: What did it bring me?

It has been more than a month since I went to VMworld in Barcelona. You see many blog posts about new startups, technology and interesting sessions. I think that VMworld is more than only the (great) technology stuff you can hear about. After all, you can download almost all the sessions afterwards. I think the whole event is interesting. This includes talking to vendors and other virtualization fanatics and meeting new people.

Here is a summary of what it brought me.


Of course you get a lot (and I mean a lot) of gifts. Besides pencils, peppermints, stress balls and notepads I got the following interesting goodies.

  • SimpliVity was so nice to give all vExperts a Raspberry Pi. I’m really excited about this small computer and the stuff that you can do with it. Unfortunately I haven’t had the time to decide what to do with it. But in a few weeks it’s Christmas and that is the perfect time to play around with it.
  • Nutanix gave all vExperts a personal vExpert glass, even though I didn’t talk to the right person at VMworld. They contacted me after VMworld through Twitter and asked me for my address.
  • PernixData gave me a really nice polo that I can wear during my training classes.

SimpliVity, Nutanix and PernixData thanks! I really appreciate it!


VMworld is also about networking. I had really nice chats with people from Nexenta, PernixData, Nutanix and VMware. Especially talking to gurus is great. You can get so much information from them, especially at one of the (vendor) parties. Here you can meet and talk with a lot of VMware gurus in an informal way. As you all know, the parties of VMworld are famous. Probably everybody who went to the Veeam party had a hard time waking up the next morning -:)

Every year I speak to some people that I only see at VMworld. It’s great to catch up and talk about virtualization or other stuff. 1 month after VMworld one of those people contacted me to ask if I was interested in a job designing a 2500 Horizon View environment with Nutanix servers. Although I had no experience working with Nutanix, VMworld gave me enough knowledge to get this job, and I will probably be playing with Nutanix till the summer of 2015.

As you can see, VMworld is more than only technical sessions and commercial talks from vendors. The whole event is great and I will go for sure next year!



Nutanix alert not cleared in Prism

When I visit a Nutanix customer, I always check the cluster health and whether there are any alerts.

One customer had an error that hadn’t been acknowledged for more than 45 days. That alert repeated itself every hour, resulting in more than 500 critical alerts.

After resolving the issue (by rebooting the CVM) I wanted to clear the critical alerts in Prism. The procedure for this is quite simple: just go to Alerts | mark all alerts | click Acknowledge and Resolve.

In this case I noticed that the alerts were not cleared, although they were acknowledged and resolved. Repeating the action did not solve the problem.

The solution was to resolve the Alert through NCLI with the following script:

for alert in `ncli alert ls | grep ID | awk '{print $3}'`; do echo "Resolving alert $alert"; ncli alert resolve ids=$alert; sleep 2; done

You may have to run the script multiple times until all Alerts are cleared.
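
If you would like to see which IDs the one-liner is going to pick up before resolving anything, the `grep ID | awk` part can be mimicked in a few lines of Python. This is only a sketch: the sample text is hypothetical and shaped purely after what the one-liner implies (a line containing `ID` whose third whitespace-separated field is the identifier). Real `ncli alert ls` output may look different.

```python
def extract_alert_ids(ncli_output):
    """Mimic `grep ID | awk '{print $3}'` on the text of `ncli alert ls`."""
    ids = []
    for line in ncli_output.splitlines():
        if "ID" not in line:       # grep ID (case-sensitive substring match)
            continue
        fields = line.split()      # awk splits on runs of whitespace
        if len(fields) >= 3:       # shorter lines are skipped here
            ids.append(fields[2])  # awk's $3 is the third field
    return ids


# Hypothetical sample, NOT captured from a real cluster.
sample = """\
    ID                        : 1111-aaaa
    Message                   : example alert
    ID                        : 2222-bbbb
"""

alert_ids = extract_alert_ids(sample)
```

Printing `alert_ids` first is a cheap dry run before feeding the IDs to `ncli alert resolve`.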


How to check the Nutanix cluster metadata store

Just a quick note.

While upgrading 32 Nutanix nodes for a customer, I wanted to make sure that every node is part of the metadata store.

This can be achieved by entering the command:

nodetool -h localhost ring

Output will look like:

nutanix@NTNX-14SX31290007-C-CVM:$ nodetool -h localhost ring
Address         Status State   Load            Owns    Token
                                                       w7iesrvWOTsU53XPvNTlWCVgub36H9PIJcE3nYDS2rHb4N7XzJEUpGplSYYc
                Up     Normal  1.99 MB         7.18%   0ZbVFDk5VFoTcX78ofInvdnKw0uwRcxGqQsWrwEqmAFERi9ylGNkW869cgAe
                Up     Normal  3.32 MB         11.31%  7aP5UmSTUO2pKO0I6AoPOit7jPzFGXlrDPYvimtqqT3qYI5el9ZXHUxEdZZf
                Up     Normal  3.42 MB         6.38%   BXeWgqIKH85IUcIITh1fjgcoYcv1EzVO15vjlxqPKFCUbHEqRhXMLCwh9xy7
                Up     Normal  3.16 MB         7.14%   FxyoAoJjF94Yd548AgmY8sXaoLtw5YRyF6dpDTbnXLzOCThcxzM9JrGmd2wa
                Up     Normal  2.51 MB         5.83%   JaB5mJO5Q7DtPoHbPJd7QFRop57CY3dw3V0LwuJk58EnRPDxmRBr6FMHLN3R
                Up     Normal  1.52 MB         6.40%   NYI0KclyR6E8A6Yuok88lWE0yyzBil7vYe6dNrJfaB9iYem7X1D0sHJp2oyZ
                Up     Normal  1.36 MB         6.28%   RRnlFLPALjpn3sCsA7qaMq7Msf8BEEpds0uSokeYLtlYYrb9gBr3mUwLVcBS
                Up     Normal  1.56 MB         6.81%   VfPTSjlJyU6ZFogXfaHlFKBwScidSEs61CLtW51mMb2SCOcTZauL7lcxXrzE
                Up     Normal  1.32 MB         3.85%   Y3SsPwG3mIRdzZVeED7hKZoCkH32NbDuPqBSH9moN6vtwvD8OGrFR3ovNXyi
                Up     Normal  2.31 MB         7.43%   cewHQfBTGVNdNy1BSABrh3DrI7XmbBLtF1EcvZ248cNygdiyAeYv0rkSapAL
                Up     Normal  2.82 MB         6.36%   gbLIMnJplxOXX4Jbk4jNxXnMo8njhxcrj6RFsMtZ5hQ7ha6hjT2wsi57lsrn
                Up     Normal  3.16 MB         6.06%   kMR8DmhH2i1YuRgL746IOOXlsv5hitFUHjLO78K1dAlnPBcLdeXKjjXh2EVF
                Up     Normal  2.74 MB         6.24%   oEEJaKrjE1ukEkzB6n6fOVUyNK3P8qLkoFyHqrTzXItNBtNS1fxYBv8DvmjW
                Up     Normal  1.71 MB         6.70%   sNu0haLD0Zmb3JjtUFkk0Iffuqf0EwspDDbfVD4bDtiHCCarjVF18nY1zBea
                Up     Normal  1.8 MB          6.03%   w7iesrvWOTsU53XPvNTlWCVgub36H9PIJcE3nYDS2rHb4N7XzJEUpGplSYYc
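
With 32 nodes you don’t want to check every row by eye. Here is a minimal sketch that scans the text of `nodetool -h localhost ring` for nodes that are not Up/Normal. It assumes the column order shown above (address, status, state, load, owns, token). The sample rows, their addresses and the short tokens are hypothetical.

```python
import re

# One data row: optional address, status, state, load with unit,
# ownership percentage and token. The header line and the first
# token-only line do not match and are skipped.
ROW = re.compile(
    r"^\s*(?:(?P<address>\S+)\s+)?"
    r"(?P<status>Up|Down)\s+(?P<state>\w+)\s+"
    r"(?P<load>[\d.]+ \w+)\s+(?P<owns>[\d.]+)%\s+(?P<token>\S+)\s*$"
)


def parse_ring(text):
    """Return one dict per node row found in the `nodetool ring` text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


def unhealthy_nodes(rows):
    """Return the rows whose status/state is anything but Up/Normal."""
    return [r for r in rows if r["status"] != "Up" or r["state"] != "Normal"]


# Hypothetical sample (documentation addresses, shortened tokens).
sample_output = """\
Address         Status State   Load            Owns    Token
                                                       tokC
192.0.2.11      Up     Normal  1.99 MB         40.00%  tokA
192.0.2.12      Up     Normal  3.32 MB         35.00%  tokB
192.0.2.13      Down   Normal  1.80 MB         25.00%  tokC
"""

ring = parse_ring(sample_output)
problems = unhealthy_nodes(ring)
```

In a healthy ring `problems` is empty and the `owns` percentages add up to 100.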