Multiple 1G or 10G in your data center?

Over the last couple of weeks I had the same discussion with three different customers: are we going to use multiple 1G connections (in a LACP or not) to the core switch, or two 10G connections?

When designing a VMware vSphere environment, including a storage design, this is one of the first questions that pops up.

As always, the truth is somewhere in the middle; both solutions have their pros and cons.

But first, let me clarify something that you probably already know: 10 x 1G connections to your core switch isn't the same as 1 x 10G connection to your core switch. I usually use the following example to explain the difference to a customer.

You can compare 10 x 1G connections to a highway with 10 lanes where every car (packet) can drive at a speed of 1G. The maximum throughput per session is no more than the speed of one lane, so in this case 1G. Every vSphere host has one on-ramp to the highway where it can place cars on the road. The advantage is that with more vSphere hosts, you can put more cars on the road at the same time.

If you have a 10G connection (or in many cases 2 x 10G, because you want to be redundant) you have one lane on the highway where all the cars can drive at a speed of 10G. Because the cars can drive at a higher speed, there is more room on the highway to place cars on. Even when you configure a LACP on your 1G links (which is available with a distributed switch from version 5), you cannot get a higher session speed than 1G.
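To make the arithmetic behind the highway analogy concrete, here is a back-of-the-envelope sketch in plain shell. The numbers are illustrative only (Gbit/s, ignoring overhead), not measurements.

```shell
#!/bin/sh
# Illustrative only: aggregate vs per-session bandwidth in Gbit/s.
links=10          # number of 1G links in the LACP bundle
link_speed=1      # speed of a single lane (one 1G link)

aggregate=$((links * link_speed))
echo "10 x 1G : aggregate ${aggregate}G, but max ${link_speed}G per session"
echo "1 x 10G : aggregate 10G, and up to 10G for a single session"
```

The point the output makes: both designs offer the same aggregate bandwidth, but only the 10G design lets a single session exceed 1G.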

10G is especially nice when you have an NFS or iSCSI storage solution. Not that most storage solutions use the whole 2 x 10G (in fact, most cannot even fully utilize a single 10G connection), but the session between your vSphere hosts and the storage solution can be more than 1G.

This is, in my opinion, the most important reason why you want 10G in your data center: you can have more than a 1G connection per session.

In the past the price of a 10G solution was a bit of an issue. Nowadays you can have redundant 10G L3 core switches from vendors like HP and Cisco for under 15K, SFPs and cabling included.

Another pro is the cabling in your rack. You have a cleaner rack that is easier to administer when you have only two cables per host instead of six or eight. Fewer cables means your core switch can do with fewer ports, resulting in a smaller switch. Of course this switch has to handle a lot of traffic, but rack space is also important nowadays.

To summarize: if you have the budget, I would go for 10G. Not that we're going to use the full 10G, but we sure want to use more than 1G. You have to discuss with your budget holders and stakeholders what the best solution is for your design.


1 month after VMworld: what did it bring me?

It has been more than a month since I went to VMworld in Barcelona. You see many blog posts about new startups, technology and interesting sessions. I think VMworld is more than only the (great) technology stuff you can hear about; eventually you can download almost all the sessions afterwards. The whole event is interesting. This includes talking to vendors, other virtualization fanatics and meeting new people.

Here is a summary of what it brought me.

Gifts

Of course you get a lot (and I mean a lot) of gifts. Besides pencils, peppermints, stress balls and notepads, I got the following interesting goodies.

  • SimpliVity was so nice to give all vExperts a Raspberry Pi. I'm really excited about this small computer and the stuff you can do with it. Unfortunately I haven't had the time to decide what to do with it yet. But in a few weeks it's Christmas, and that is the perfect time to play around with it.
  • Nutanix gave all vExperts a personal vExpert glass. Even though I didn't talk to the right person at VMworld, they contacted me after VMworld through Twitter and asked for my address.
  • PernixData gave me a really nice polo that I can wear during my training classes.

SimpliVity, Nutanix and PernixData: thanks! I really appreciate it!

Networking

VMworld is also about networking. I had really nice chats with people from Nexenta, PernixData, Nutanix and VMware. Especially talking to gurus is great: you can drain so much information from them, especially at one of the (vendor) parties, where you can meet and talk with a lot of VMware gurus in an informal way. As you all know, the parties at VMworld are famous. Probably everybody who went to the Veeam party had a hard time waking up the next morning :-)

Every year I speak to some people that I only see at VMworld. It's great to catch up and talk about virtualization or other stuff. One month after VMworld, one of those people contacted me to ask if I was interested in a job designing a 2,500-seat Horizon View environment with Nutanix servers. Although I don't have experience working with Nutanix yet, VMworld gave me enough knowledge to get this job, and I will probably be playing with Nutanix until the summer of 2015.

As you can see, VMworld is more than only technical sessions and commercial talks from vendors. The whole event is great and I will surely go again next year!


Nutanix alert not cleared in Prism

When I visit a Nutanix customer, I always check the cluster health and whether there are any alerts.

One customer had an error that hadn't been acknowledged for more than 45 days, and the alert repeated itself every hour, resulting in more than 500 critical alerts.

After resolving the issue (rebooting the CVM), I wanted to clear the critical alerts in Prism. The procedure is quite simple: go to Alerts | mark all alerts | click Acknowledge and Resolve.

In this case I noticed that the alerts were not cleared, even though they were acknowledged and resolved. Repeating the action did not solve the problem.

The solution was to resolve the alerts through NCLI with the following script:

for alert in `ncli alert ls | grep ID | awk '{print $3}'`; do echo "Resolving alert $alert"; ncli alert resolve ids=$alert; sleep 2; done

You may have to run the script multiple times until all Alerts are cleared.

How to check the Nutanix cluster metadata store

Just a quick note.

While upgrading 32 Nutanix nodes for a customer, I wanted to make sure that every node was part of the metadata store.

This can be achieved by entering the command:

nodetool -h localhost ring

The output will look like this:

nutanix@NTNX-14SX31290007-C-CVM:10.83.9.152:~$ nodetool -h localhost ring
Address         Status State   Load     Owns    Token
                                                w7iesrvWOTsU53XPvNTlWCVgub36H9PIJcE3nYDS2rHb4N7XzJEUpGplSYYc
10.83.9.153     Up     Normal  1.99 MB  7.18%   0ZbVFDk5VFoTcX78ofInvdnKw0uwRcxGqQsWrwEqmAFERi9ylGNkW869cgAe
10.83.9.169     Up     Normal  3.32 MB  11.31%  7aP5UmSTUO2pKO0I6AoPOit7jPzFGXlrDPYvimtqqT3qYI5el9ZXHUxEdZZf
10.83.9.165     Up     Normal  3.42 MB  6.38%   BXeWgqIKH85IUcIITh1fjgcoYcv1EzVO15vjlxqPKFCUbHEqRhXMLCwh9xy7
10.83.9.156     Up     Normal  3.16 MB  7.14%   FxyoAoJjF94Yd548AgmY8sXaoLtw5YRyF6dpDTbnXLzOCThcxzM9JrGmd2wa
10.83.9.161     Up     Normal  2.51 MB  5.83%   JaB5mJO5Q7DtPoHbPJd7QFRop57CY3dw3V0LwuJk58EnRPDxmRBr6FMHLN3R
10.83.9.173     Up     Normal  1.52 MB  6.40%   NYI0KclyR6E8A6Yuok88lWE0yyzBil7vYe6dNrJfaB9iYem7X1D0sHJp2oyZ
10.83.9.176     Up     Normal  1.36 MB  6.28%   RRnlFLPALjpn3sCsA7qaMq7Msf8BEEpds0uSokeYLtlYYrb9gBr3mUwLVcBS
10.83.9.152     Up     Normal  1.56 MB  6.81%   VfPTSjlJyU6ZFogXfaHlFKBwScidSEs61CLtW51mMb2SCOcTZauL7lcxXrzE
10.83.9.181     Up     Normal  1.32 MB  3.85%   Y3SsPwG3mIRdzZVeED7hKZoCkH32NbDuPqBSH9moN6vtwvD8OGrFR3ovNXyi
10.83.9.164     Up     Normal  2.31 MB  7.43%   cewHQfBTGVNdNy1BSABrh3DrI7XmbBLtF1EcvZ248cNygdiyAeYv0rkSapAL
10.83.9.177     Up     Normal  2.82 MB  6.36%   gbLIMnJplxOXX4Jbk4jNxXnMo8njhxcrj6RFsMtZ5hQ7ha6hjT2wsi57lsrn
10.83.9.168     Up     Normal  3.16 MB  6.06%   kMR8DmhH2i1YuRgL746IOOXlsv5hitFUHjLO78K1dAlnPBcLdeXKjjXh2EVF
10.83.9.172     Up     Normal  2.74 MB  6.24%   oEEJaKrjE1ukEkzB6n6fOVUyNK3P8qLkoFyHqrTzXItNBtNS1fxYBv8DvmjW
10.83.9.180     Up     Normal  1.71 MB  6.70%   sNu0haLD0Zmb3JjtUFkk0Iffuqf0EwspDDbfVD4bDtiHCCarjVF18nY1zBea
10.83.9.160     Up     Normal  1.8 MB   6.03%   w7iesrvWOTsU53XPvNTlWCVgub36H9PIJcE3nYDS2rHb4N7XzJEUpGplSYYc
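With 32 nodes, eyeballing this output is error-prone. A small sketch that counts the Up/Normal rows could look like this; for the example it parses a saved sample file with made-up addresses and states, but in practice you would redirect the live `nodetool -h localhost ring` output instead.

```shell
#!/bin/sh
# Write a made-up sample of the ring output; normally you would run:
#   nodetool -h localhost ring > /tmp/ring.txt
cat > /tmp/ring.txt <<'EOF'
Address         Status State   Load      Owns   Token
10.83.9.153     Up     Normal  1.99 MB   7.18%  token1
10.83.9.169     Up     Normal  3.32 MB  11.31%  token2
10.83.9.165     Down   Normal  3.42 MB   6.38%  token3
EOF

# Count data rows (lines starting with an IP) and healthy Up/Normal rows.
total=$(grep -c '^[0-9]' /tmp/ring.txt)
healthy=$(awk '/^[0-9]/ && $2 == "Up" && $3 == "Normal"' /tmp/ring.txt | wc -l)
echo "$healthy of $total nodes are Up/Normal in the metadata store"
```

If `healthy` and `total` differ, one or more CVMs are not participating in the metadata store and need attention before you continue the upgrade.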


Updating a single Nutanix node

All the controller VMs of a Nutanix cluster have to run the same version of NOS. When you upgrade a cluster, all the CVMs are upgraded one by one. If you want to add a new node to the cluster, that node has to have the same version of NOS.

So what if you bought new Nutanix servers and they have an older version of NOS installed? You cannot add such a node to the cluster because the NOS version is not the same.

There is a command you can issue from a CVM that is part of the cluster to upgrade a single Nutanix node.

  1. Log in to a CVM that's part of the cluster.
  2. Issue: /home/nutanix/cluster/bin/cluster -u [IP of the node that will be upgraded] upgrade_node

Note that this command only upgrades the node; it doesn't downgrade it.