Viva la HashTable

We always try to fully optimize the scripts and applications we develop, because once the infrastructure gets large enough the gain is far from insignificant (as we saw previously with the vCheck optimizations).

While we were looking to reduce the execution time of some scripts (or should I say OneLiners), we found a new way to speed scripts up, in the same spirit as NoSQL: rely on key-value associations rather than a complex relational schema.

This is done with hash tables that are populated at the beginning of the script and used afterward. The hash table structure is very useful for search operations because lookups are very cheap, with an average cost of O(1) in Big O notation (i.e. roughly constant, however many entries the table holds).
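To make the O(1) point concrete, here is a minimal sketch in plain PowerShell (no vCenter connection needed). The data is synthetic, and the names `$items`, `$index` and the `id-N` keys are made up for illustration; it simply contrasts scanning a list with a direct key lookup.

```powershell
# Synthetic data: 10,000 objects with a unique Id (hypothetical, for illustration only)
$items = foreach ($i in 1..10000) {
    New-Object PSObject -Property @{ Id = "id-$i"; Name = "item-$i" }
}

# Build the index once: key = Id, value = Name
$index = @{}
foreach ($item in $items) { $index[$item.Id] = $item.Name }

# Linear search: the filter is evaluated against every element of the list
$byScan = ($items | Where-Object { $_.Id -eq 'id-9999' }).Name

# Hash table lookup: direct access by key, no scan
$byHash = $index['id-9999']

Write-Host "Scan: $byScan / Hash: $byHash"
```

Both variables end up holding the same name; the difference is that the scan costs more as the list grows, while the keyed lookup stays about the same.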

For instance, we had a relatively simple OneLiner that listed the virtual machines not placed in a resource pool (i.e. with the ‘Resources’ resource pool as parent; you can find more information on page 43 of the VMware vSphere Resource Management document), along with their cluster:

get-view -ViewType virtualmachine -Property ResourcePool, Name | ?{(Get-View $_.ResourcePool -Property Name).Name -eq "Resources"} | Select @{n="Cluster";e={(get-view (Get-View $_.ResourcePool -Property Parent).Parent -Property Name).Name}}, Name

This OneLiner already restricts every Get-View call with -Property to minimize execution time, but it still takes a long time…

So we looked into hash tables to see what could be done with them. We created several hash tables to get rid of the Get-View calls (except, of course, the first one, which retrieves the virtual machines).

$htabResourcePool = @{}
$htabResourcePoolParent = @{}
$htabCluster = @{}

Once the hash tables were declared, we filled them with the data we needed (using the MoRef as key):

get-view -viewtype resourcepool -property name -Filter @{"name"="Resources"} | %{$htabResourcePool.Add($_.MoRef,$_.Name)}
get-view -viewtype resourcepool -property parent -Filter @{"name"="Resources"} | %{$htabResourcePoolParent.Add($_.MoRef,$_.Parent)}
get-view -viewtype clustercomputeresource -property name | %{$htabCluster.Add($_.MoRef,$_.Name)}

Finally, we use these tables in the OneLiner, replacing the Get-View calls with hash table lookups:

get-view -ViewType virtualmachine -Property ResourcePool, Name | ?{$htabResourcePool[$_.ResourcePool] -eq "Resources"} | Select @{n="Cluster";e={$htabCluster[$htabResourcePoolParent[$_.ResourcePool]]}}, Name
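The chained lookup can be tried without a vCenter. In this sketch the MoRefs are replaced by plain strings and every value (`resgroup-8`, `resgroup-42`, `domain-c7`, `Cluster-A`, `vm01`, `vm02`) is invented; only the shape of the three tables mirrors the ones above.

```powershell
# Mock inventory: plain strings stand in for MoRef objects (assumption for illustration)
$htabResourcePool       = @{ 'resgroup-8' = 'Resources' }   # pool MoRef    -> pool name
$htabResourcePoolParent = @{ 'resgroup-8' = 'domain-c7' }   # pool MoRef    -> parent MoRef
$htabCluster            = @{ 'domain-c7'  = 'Cluster-A' }   # cluster MoRef -> cluster name

# Two fake VMs: one in the root 'Resources' pool, one in a pool that was not indexed
$vms = @(
    New-Object PSObject -Property @{ Name = 'vm01'; ResourcePool = 'resgroup-8' }
    New-Object PSObject -Property @{ Name = 'vm02'; ResourcePool = 'resgroup-42' }
)

# Same filter and calculated property as the OneLiner, fed with the mock data
$result = $vms |
    Where-Object { $htabResourcePool[$_.ResourcePool] -eq 'Resources' } |
    Select-Object @{n='Cluster'; e={ $htabCluster[$htabResourcePoolParent[$_.ResourcePool]] }}, Name

$result
```

A key that is missing from a hash table simply returns `$null`, which is why `vm02` is filtered out without any error.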

At this point we get the same results as with the method without hash tables, but we still have to look at the execution time :p

So we created a sample script that runs the process using both methods:

  1. the first method uses Get-View calls only
  2. the second method populates the hash tables and then uses them (filling the hash tables is counted as part of the process, so that both methods cover the exact same scope).

The script displays the number of results from each method (to check that both return the same content) along with the duration. In our example, the platform had a little less than 2,200 VMs, and here are the results:

So as not to upset hypervisor.fr (we can already hear him shouting “Don’t touch my OneLiner!”), we should point out that the chosen example is a particularly favorable one: its Get-View calls are nested, so the switch to hash tables was very effective :p

Depending on the script, though, the gain from using hash tables may not be as obvious as it was in this example. Either way, we invite you to test and compare the execution times.

Here is the script used to run the benchmark with both methods:

Write-Host -ForegroundColor Yellow "Benchmarking starting..."

Write-Host "`nMethod 1 (with regular filtered Get-View)"
$startMethod1 = Get-Date
$resultMethod1 = get-view -ViewType virtualmachine -Property ResourcePool, Name | ?{(Get-View $_.ResourcePool -Property Name).Name -eq "Resources"} | Select @{n="Cluster";e={(get-view (Get-View $_.ResourcePool -Property Parent).Parent -Property Name).Name}}, Name
$endMethod1 = Get-Date

Write-Host -ForegroundColor Green "Found"(($resultMethod1 | Measure-Object).Count)"records in"(($endMethod1 - $startMethod1).TotalSeconds)"seconds"

Write-Host "`nMethod 2 (with hashtable)"
$startMethod2 = Get-Date
$htabResourcePool = @{}
$htabResourcePoolParent = @{}
$htabCluster = @{}

get-view -viewtype resourcepool -property name -Filter @{"name"="Resources"} | %{$htabResourcePool.Add($_.MoRef,$_.Name)}
get-view -viewtype resourcepool -property parent -Filter @{"name"="Resources"} | %{$htabResourcePoolParent.Add($_.MoRef,$_.Parent)}
get-view -viewtype clustercomputeresource -property name | %{$htabCluster.Add($_.MoRef,$_.Name)}

$resultMethod2 = get-view -ViewType virtualmachine -Property ResourcePool, Name | ?{$htabResourcePool[$_.ResourcePool] -eq "Resources"} | Select @{n="Cluster";e={$htabCluster[$htabResourcePoolParent[$_.ResourcePool]]}}, Name
$endMethod2 = Get-Date

Write-Host -ForegroundColor Green "Found"(($resultMethod2 | Measure-Object).Count)"records in"(($endMethod2 - $startMethod2).TotalSeconds)"seconds"
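As a side note, the pairs of Get-Date calls can be replaced by a stopwatch. The helper below is only a sketch: the name `Measure-Method` and its parameters are ours (not part of PowerCLI), and the workload in the usage line is a stand-in to be swapped for each method's pipeline.

```powershell
# Hypothetical helper: runs a script block, returns its result count and duration
function Measure-Method {
    param(
        [string]$Label,
        [scriptblock]$Script
    )
    $watch  = [System.Diagnostics.Stopwatch]::StartNew()
    $result = @(& $Script)          # force an array so Count is always available
    $watch.Stop()
    New-Object PSObject -Property @{
        Label   = $Label
        Count   = $result.Count
        Seconds = $watch.Elapsed.TotalSeconds
    }
}

# Usage with a stand-in workload (even numbers between 1 and 1000)
$m = Measure-Method -Label 'demo' -Script { 1..1000 | Where-Object { $_ % 2 -eq 0 } }
Write-Host -ForegroundColor Green "Found $($m.Count) records in $($m.Seconds) seconds"
```

Wrapping each method in its own script block keeps the timed scope explicit, which matters here because the hash table filling has to be counted in the second method.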

vExpert 2013

Early this morning John Troyer released the vExpert 2013 list, and we have the pleasure of being on it!

We would like to thank all our readers once again for everything, and give a big hurray to John Troyer for the vExpert program and what it means!

Hail to the vOpenData baby!

Here is a quick post about the vOpenData project, to which we had the chance to offer our assistance together with hypervisor.fr.

I will not go through the whole project history (you can find it in the must-read posts from William Lam, vOpenData: An Open Virtualization Community Database, and from Ben Thomas, vOpenData – Crunching Everyone’s Data For Fun And Knowledge), but basically vOpenData is a community-based statistics database, giving us answers to eternal questions such as “What is the average size of a VMDK?” or “What is the average consolidation ratio?”.

The project consists of a small script (Perl or PowerCLI, pick one!) available on GitHub (https://github.com/vopendata/scripts), which gathers statistics from your environment (fully anonymously: no names, only UUIDs are reported), and the www.vopendata.org website, where you upload the gathered data.

In the end, the result is just amazing: in just a few minutes you will have uploaded your stats, which are then displayed on a huge dashboard (combining all the data gathered from the community) available at http://dash.vopendata.org/public

As you have guessed, the whole project relies on community effort, so we are counting on you to participate: come visit and support www.vopendata.org!

List of ESXi servers with their IP

Here is a little memento we forgot to post a while ago, to get the name and IP of all ESXi servers. There are several ways of getting this data. You can use vSphere SDK properties only:

Get-View -ViewType HostSystem -Property Name,Config | Select Name, @{n="IP";e={$_.config.network.vnic.spec.ip.ipaddress}}

Or you can use DNS resolution:

Get-View -ViewType HostSystem -Property Name | Select Name, @{n="IP";e={[System.Net.Dns]::GetHostAddresses($_.Name)}}

Even if the results are the same, execution times differ widely. With ~200 ESXi servers, the vSphere SDK method takes around 26 s to complete (even with the Property filter on the Get-View cmdlet), versus 0.3 s for the DNS one.
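One caveat with the DNS variant: `GetHostAddresses` returns IPAddress objects (possibly IPv6 ones too) and throws when a name cannot be resolved. Below is a small sketch of a wrapper that could be used in the calculated property instead; the function name `Resolve-HostIPv4` is hypothetical, and keeping only IPv4 addresses is our assumption about what is wanted.

```powershell
# Hypothetical helper: IPv4 addresses of a host as a comma-separated string,
# or $null when the name cannot be resolved
function Resolve-HostIPv4 {
    param([string]$HostName)
    try {
        $addresses = [System.Net.Dns]::GetHostAddresses($HostName) |
            Where-Object { $_.AddressFamily -eq [System.Net.Sockets.AddressFamily]::InterNetwork } |
            ForEach-Object { $_.ToString() }
        return ($addresses -join ', ')
    }
    catch {
        return $null
    }
}

# An IP literal is returned as-is, without querying the DNS server
Resolve-HostIPv4 '127.0.0.1'
```

In the OneLiner it would be called as `@{n="IP";e={Resolve-HostIPv4 $_.Name}}`, so that a single unresolvable host does not fill the output with errors.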

PowerShell v3 and PowerCLI

Yesterday evening we heard some great news: a new PowerCLI release (5.1 Release 2).

Alan Renouf published a post about it, explaining what’s new with Distributed Switches cmdlets: PowerCLI 5.1 R2 Released

As we were reading the release notes (available here, there is always good information in them!), we finally saw PowerShell v3 support for PowerCLI!

This will allow a lot of good stuff, including the performance improvements to the Get-ChildItem cmdlet. We will run further tests to retry using a PSDrive to fetch some VM files (like .vmx).

We also saw a new way to connect to a server (much the same method the VIClient plug-ins used):

vSphere PowerCLI introduces an improvement in PowerCLI views.
With the VimClient.Connect() method, you can now connect to a server by server session ID.

All in all, this release is awesome, let’s play ^^ The update is available on the VMware website: http://blogs.vmware.com/vipowershell/2013/02/powercli-5-1-release-2-now-available.html
