NetApp OnCommand Plug-in for SCOM 2019

This week I had the pleasure of installing the OnCommand Plug-in to add our NetApp storage systems to our SCOM 2019 environment. It was a rather interesting installation that took more time than I expected, and I had to work around some issues. The software clearly isn’t ready for SCOM 2019, so I decided to write a post about it to help others who run into the same installation problems.

It helps to know the environment I’m installing into:

  • SCOM 2019 on Windows Server 2016
  • Programs will all be installed on D:\
  • SQL 2016 on Windows Server 2016
  • Two dedicated SCOM servers for non-server systems (like NetApp, Nimble, network devices, etc.)
  • SCOM Console is installed on the two management servers.

First of all, you have to download the software from the NetApp support site using your NetApp account credentials. The installation guide and administration guide are also very useful and can be downloaded from the same site.

Now, I’m not going to cover all the prerequisites; these are described well in the setup documentation. I will go through the whole setup and show how I worked around each problem I came across.

Step 1: start setup. As we use UAC, we must confirm everything. When you open the setup, click Run to start.

Click Next

Since we need to change the default location to D:\ we click Change.

Change C:\ to D:\ and leave the rest

Click Next

OK, I would expect everything to be selected by default… that’s odd.

Let’s select the Integration Feature, since we need it to administer the NetApp systems in SCOM.

OK… what the heck is this? I have the SCOM 2019 console installed on this system, and that is definitely a later version than SCOM Console 2012 SP1. I tried several things to work around this problem in the current release, with no luck. So, if you click OK:

The feature selection will revert to the default and remove the SCOM Console Integration.

OK, this is a blocking issue for me: the SCOM 2019 console isn’t recognized, but I need this console to add the storage systems to SCOM. What I did from here was cancel the install, modify the SCOM installation, and remove the 2019 console.

After that I installed the SCOM 1801 console beside the SCOM 1901 Management Server (yes, this works if you install the plain Console MSI located in the directory %drive%\setup\AMD64\Console).

Now if we rerun the installer, we will see that the SCOM Console Integration is preselected. So now we can install the integration plug-in and then upgrade the SCOM console afterwards.

Now we need a service account for the plug-in. Enter the credentials and then click Next.

Oops, I forgot to add the service account to the local Administrators group. Just do it on the spot and continue. After that you can click Next.

Enter the database credentials and click Next. For us, this is a remote SQL server.

If everything is working, we can click Next.

OK, next problem: the database won’t install. If you google LaunchResult 4, you will find a few things, but I will first look at the log files.

First I note the location of the log file, then I click Next and after that Cancel to cancel the installation. A rollback will take place.

Searching for the first line of the log file will bring you to a NetApp article:
https://kb.netapp.com/app/answers/answer_view/a_id/1005355/~/create-failed-for-database-ocpmdb-%28oncommand-plug-in-for-microsoft-data-base%29-

After looking at all the requirements and several installation attempts, I decided to just pre-create the database on the SQL server using the following SQL query:

Create DB
USE [master]
GO

CREATE DATABASE [OCPMDB]
CONTAINMENT = NONE
ON  PRIMARY 
( NAME = N'OCPMDBData', FILENAME = N'E:\Data\MSSQL13.SQLAMS\MSSQL\DATA\OCPMDBData.mdf' , SIZE = 1048576KB , MAXSIZE = 21474304KB , FILEGROWTH = 20%)
LOG ON 
( NAME = N'OCPMDBLog', FILENAME = N'F:\Logs\MSSQL12.SQLAMS\MSSQL\Data\OCPMDBLog.ldf' , SIZE = 1048576KB , MAXSIZE = 21474304KB , FILEGROWTH = 10%)
GO

ALTER DATABASE [OCPMDB] SET COMPATIBILITY_LEVEL = 130
GO

IF (1 = FULLTEXTSERVICEPROPERTY('IsFullTextInstalled'))
begin
EXEC [OCPMDB].[dbo].[sp_fulltext_database] @action = 'enable'
end
GO
ALTER DATABASE [OCPMDB] SET ANSI_NULL_DEFAULT OFF 
GO
ALTER DATABASE [OCPMDB] SET ANSI_NULLS OFF 
GO
ALTER DATABASE [OCPMDB] SET ANSI_PADDING OFF 
GO
ALTER DATABASE [OCPMDB] SET ANSI_WARNINGS OFF 
GO
ALTER DATABASE [OCPMDB] SET ARITHABORT OFF 
GO
ALTER DATABASE [OCPMDB] SET AUTO_CLOSE OFF 
GO
ALTER DATABASE [OCPMDB] SET AUTO_SHRINK OFF 
GO
ALTER DATABASE [OCPMDB] SET AUTO_UPDATE_STATISTICS ON 
GO
ALTER DATABASE [OCPMDB] SET CURSOR_CLOSE_ON_COMMIT OFF 
GO
ALTER DATABASE [OCPMDB] SET CURSOR_DEFAULT  GLOBAL 
GO
ALTER DATABASE [OCPMDB] SET CONCAT_NULL_YIELDS_NULL OFF 
GO
ALTER DATABASE [OCPMDB] SET NUMERIC_ROUNDABORT OFF 
GO
ALTER DATABASE [OCPMDB] SET QUOTED_IDENTIFIER OFF 
GO
ALTER DATABASE [OCPMDB] SET RECURSIVE_TRIGGERS OFF 
GO
ALTER DATABASE [OCPMDB] SET  ENABLE_BROKER 
GO
ALTER DATABASE [OCPMDB] SET AUTO_UPDATE_STATISTICS_ASYNC OFF 
GO
ALTER DATABASE [OCPMDB] SET DATE_CORRELATION_OPTIMIZATION OFF 
GO
ALTER DATABASE [OCPMDB] SET TRUSTWORTHY OFF 
GO
ALTER DATABASE [OCPMDB] SET ALLOW_SNAPSHOT_ISOLATION OFF 
GO
ALTER DATABASE [OCPMDB] SET PARAMETERIZATION SIMPLE 
GO
ALTER DATABASE [OCPMDB] SET READ_COMMITTED_SNAPSHOT OFF 
GO
ALTER DATABASE [OCPMDB] SET HONOR_BROKER_PRIORITY OFF 
GO
ALTER DATABASE [OCPMDB] SET RECOVERY SIMPLE 
GO
ALTER DATABASE [OCPMDB] SET  MULTI_USER 
GO
ALTER DATABASE [OCPMDB] SET PAGE_VERIFY CHECKSUM  
GO
ALTER DATABASE [OCPMDB] SET DB_CHAINING OFF 
GO
ALTER DATABASE [OCPMDB] SET FILESTREAM( NON_TRANSACTED_ACCESS = OFF ) 
GO
ALTER DATABASE [OCPMDB] SET TARGET_RECOVERY_TIME = 60 SECONDS 
GO
ALTER DATABASE [OCPMDB] SET DELAYED_DURABILITY = DISABLED 
GO
ALTER DATABASE [OCPMDB] SET QUERY_STORE = OFF
GO
USE [OCPMDB]
GO
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = OFF;
GO
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 0;
GO
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = ON;
GO
ALTER DATABASE SCOPED CONFIGURATION SET QUERY_OPTIMIZER_HOTFIXES = OFF;
GO
ALTER DATABASE [OCPMDB] SET  READ_WRITE 
GO

After this the installation completed successfully. Don’t forget to upgrade your SCOM console back to 1901!

HP Virtual Connect: Multiple tunneled VLANs with Active/Active Uplinks and 802.3ad (LACP), Ethernet and FCoE

Overview

This scenario will implement two VLAN-Tunnels per Virtual Connect module to provide support for multiple VLANs. The upstream network switches connect the VLAN-Tunnels to two ports on each FlexFabric 20/40 F8 module, and LACP will be used to aggregate those links. By using VLAN tunnels for MGMT and Production we remove the need for separate VLAN administration on the VC Domain.

One VLAN-Tunnel will be used to provide connectivity for the Management and vMotion networks. The other VLAN-Tunnel will be used to service the Virtual Machines on the blades. As multiple VLANs will be supported in this configuration, the upstream switch ports will be configured for VLAN trunking/tagging. The upstream switches will also provide a native VLAN to support PXE boot for the ESXi hosts, which will be deployed with Auto Deploy from VMware.

When configuring Virtual Connect, there are several ways to implement network fail-over or redundancy. One option would be to connect TWO uplinks to a single Virtual Connect network; those two uplinks would connect from different Virtual Connect modules within the enclosure and could then connect to the same upstream switch or two different upstream switches, depending on your redundancy needs. An alternative would be to configure TWO separate Virtual Connect networks, each with a single uplink or multiple uplinks configured. Each option has its advantages and disadvantages. For example, an Active/Standby configuration places the redundancy at the VC level, whereas Active/Active places it at the OS NIC teaming or bonding level.

We will review the second option in this scenario and build a configuration with four tunneled vNet links to the individual blades. In addition, several Virtual Connect networks can be configured to support the required networks to the servers within the BladeSystem enclosure. These networks could be used to separate the various network traffic types, such as iSCSI, backup and vMotion, from production network traffic. This scenario will also leverage the Fibre Channel over Ethernet (FCoE) capabilities of the FlexFabric 20/40 F8 modules and will connect two fabrics, one to each of the FlexFabric 20/40 F8 modules, using two uplinks per fabric.

Requirements

This scenario must support both Ethernet and Fibre Channel connectivity. To implement this scenario, an HP BladeSystem c7000 enclosure with one or more server blades and TWO Virtual Connect FlexFabric 20/40 F8 modules, installed in I/O Bays 1 & 2, is required. In addition, we require ONE or TWO external network switches, which in our case are two Cisco Nexus 9k switches configured as a single vPC domain. The Fibre Channel uplinks will connect to the existing FC SAN fabrics. The SAN switch ports will need to be configured to support NPIV logins. Two uplinks from each FlexFabric 20/40 F8 module will be connected to the existing SAN fabrics.

Figure 1 – Physical View: The image shows two Ethernet uplinks from ports X5 and X6 on Module 1 to port 1 on the Nexus switches and two Ethernet uplinks from ports X5 and X6 on Module 2 to port 2 on the Nexus switches. It also shows two Ethernet uplinks from ports X7 and X8 on Module 1 to port 3 on the Nexus switches and two Ethernet uplinks from ports X7 and X8 on Module 2 to port 4 on the Nexus switches. All Ethernet uplinks use 10 Gb SFPs to connect to the Cisco network. The SAN fabrics are also connected redundantly, with TWO uplinks per fabric, from ports X1 and X2 on Module 1 to Fabric A and ports X1 and X2 on Module 2 to Fabric B. All FC uplinks use 8 Gb SFPs to connect to the SAN fabrics.

Figure 196 – Logical View: The server blade profile is configured with four FlexNICs and two FlexHBAs. NICs 1 and 2 are connected to MGMT, which is part of VLAN Tunnel 1, to support ESXi management and vMotion. The VLAN-Trunks are configured to support 1-10Gb port speeds. NICs 3 and 4 are connected to VLAN-Tunnel-2, which supports the VM guest VLANs. The VLAN-Tunnels are configured to support 10-20Gb port speeds. Two FlexHBAs provide access to the storage platform and are configured to support 4-8Gb port speeds.

Installation and configuration

Nexus Switch configuration
As the Virtual Connect module acts as an edge switch, Virtual Connect can connect to the network at either the distribution level or directly to the core switch. In this situation the Virtual Connect is connected at the distribution level. For more information on how to configure a vPC domain on the Nexus 9k click this link:

Whether connecting to a Shared Uplink Set or Tunnel, the switch ports are configured as VLAN TRUNK ports (tagging) to support several VLANs. All frames will be forwarded to Virtual Connect with VLAN tags. One VLAN on the MGMT Tunnels will be configured as Native on the Nexus switch.

Note: When adding additional uplinks to the Tunnel, if the additional uplinks connect from the same FlexFabric 20/40 F8 module to the same switch, the switch ports connected to each Tunnel need to be configured for LACP within the same Link Aggregation Group to ensure all uplinks are active.

The network switch port should be configured for Spanning Tree Edge, as Virtual Connect appears to the switch as an access device and not another switch. Configuring the port as Spanning Tree Edge allows the switch to place the port into a forwarding state much sooner, so a newly connected port comes online and begins forwarding more quickly.

The following port configurations are an example. On the MGMT ports VLAN 6 will be used for ESXi MGMT and VLAN 7 will be used for vMotion. For the production network the following VLANs will be defined: VLAN 2-11, VLAN 16-43, VLAN 46, VLAN 76, VLAN 100-220 and VLAN 360-425.

interface Ethernet1/1
description VC1_Bay1_MGMT_X5
switchport mode trunk
switchport trunk native vlan 6
switchport trunk allowed vlan 6,7
spanning-tree port type edge trunk
storm-control broadcast level 1.00
channel-group 1001 mode active

interface Ethernet1/3
description VC1_Bay1_PROD_X7
switchport mode trunk
switchport trunk allowed vlan 2-11,16-43,46,76,100-220,360-425
spanning-tree port type edge trunk
storm-control broadcast level 1.00
channel-group 1003 mode active
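
A long allowed-VLAN list like the one on the PROD trunk is easy to mistype, so it can help to expand it and sanity check it before pasting it into the switch. The following is only a minimal sketch in Python; the function name expand_vlans and the 1-4094 validity check are my own assumptions for illustration and are not part of any Cisco or HP tooling.

```python
from __future__ import annotations


def expand_vlans(spec: str) -> list[int]:
    """Expand an allowed-VLAN string such as "2-11,46" into sorted VLAN IDs."""
    vlans: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"descending range: {part}")
            vlans.update(range(start, end + 1))
        else:
            vlans.add(int(part))
    # Assumption: only the usual 802.1Q VLAN ID range 1-4094 is accepted.
    invalid = sorted(v for v in vlans if not 1 <= v <= 4094)
    if invalid:
        raise ValueError(f"VLAN IDs out of range: {invalid}")
    return sorted(vlans)


# Strings copied from the example port configurations above.
MGMT_ALLOWED = "6,7"
PROD_ALLOWED = "2-11,16-43,46,76,100-220,360-425"

mgmt = expand_vlans(MGMT_ALLOWED)
prod = expand_vlans(PROD_ALLOWED)

print(f"MGMT tunnel carries {len(mgmt)} VLANs")
print(f"PROD tunnel carries {len(prod)} VLANs")
print(f"VLAN IDs present on both trunks: {sorted(set(mgmt) & set(prod))}")
```

Running this against the two strings above also shows that VLAN 6 and 7 fall inside the 2-11 production range, which is worth knowing when you review the trunks.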

Configuring the VC module

– Physically connect Port 1 of Nexus 1 to Port X5 of the VC module in Bay 1
– Physically connect Port 1 of Nexus 2 to Port X6 of the VC module in Bay 1
– Physically connect Port 3 of Nexus 1 to Port X7 of the VC module in Bay 1
– Physically connect Port 3 of Nexus 2 to Port X8 of the VC module in Bay 1
– Physically connect Port 2 of Nexus 1 to Port X5 of the VC module in Bay 2
– Physically connect Port 2 of Nexus 2 to Port X6 of the VC module in Bay 2
– Physically connect Port 4 of Nexus 1 to Port X7 of the VC module in Bay 2
– Physically connect Port 4 of Nexus 2 to Port X8 of the VC module in Bay 2

Note: If you have only one network switch, connect VC ports X5, X6, X7 & X8 (Bay 2) to an alternate port on the same switch. This will NOT create a network loop and Spanning Tree is not required.
– Physically connect Ports X1/X2 on the FlexFabric in module Bay 1 to switch ports in SAN Fabric A
– Physically connect Ports X1/X2 on the FlexFabric in module Bay 2 to switch ports in SAN Fabric B
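
The Ethernet cabling above follows a simple rule: per bay, X5/X6 carry MGMT and X7/X8 carry PROD, and the first port of each pair goes to Nexus 1 while the second goes to Nexus 2. As a rough sketch (the names Link, cabling_plan and check_redundancy are hypothetical helpers I made up for illustration), that rule can be written down in Python and checked, so that every uplink pair lands on both Nexus switches:

```python
from collections import namedtuple

Link = namedtuple("Link", "bay vc_port nexus nexus_port tunnel")

# Nexus port number used per (bay, tunnel), taken from the cabling list above.
NEXUS_PORT = {(1, "MGMT"): 1, (1, "PROD"): 3, (2, "MGMT"): 2, (2, "PROD"): 4}
# VC uplink ports per tunnel; the first goes to Nexus 1, the second to Nexus 2.
VC_PORTS = {"MGMT": ("X5", "X6"), "PROD": ("X7", "X8")}


def cabling_plan():
    """Build the list of Ethernet uplinks for both VC modules."""
    links = []
    for bay in (1, 2):
        for tunnel in ("MGMT", "PROD"):
            for nexus, vc_port in enumerate(VC_PORTS[tunnel], start=1):
                port = NEXUS_PORT[(bay, tunnel)]
                links.append(Link(bay, vc_port, nexus, port, tunnel))
    return links


def check_redundancy(links):
    """Return True if every (bay, tunnel) pair reaches both Nexus switches."""
    reached = {}
    for link in links:
        reached.setdefault((link.bay, link.tunnel), set()).add(link.nexus)
    return all(switches == {1, 2} for switches in reached.values())


plan = cabling_plan()
for link in plan:
    print(f"Nexus {link.nexus} port {link.nexus_port} -> "
          f"Bay {link.bay} {link.vc_port} ({link.tunnel})")
print("Redundant across both Nexus switches:", check_redundancy(plan))
```

The FC uplinks (X1/X2 to Fabric A and Fabric B) are deliberately left out of this sketch.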

Since we only use the FlexFabric 20/40 F8 modules and no 1Gb Virtual Connect modules are present, there is no need to expand the VLAN capacity in the Virtual Connect domain. A total of 4096 VLANs can be handled by the Virtual Connect modules, which is the default for new VC Domains as of version 4.30.

We will create four Ethernet networks MGMT_IB1, MGMT_IB2, Prod_IB1 and Prod_IB2 using the connected interfaces as described below.

After creating the four networks the list of networks will look like this.

The SAN connection will be made with redundant connections to each Fabric. SAN switch ports connecting to the FlexFabric 20/40 F8 module must be configured to accept NPIV logins.

Now that the enclosure is connected to the rest of the infrastructure, we must create server profiles.

All server profiles will look somewhat like this. The values for MGMT_IB1 and MGMT_IB2 in the picture are set to 1Gb-6Gb; in the real-life scenario this is set to 1Gb-10Gb. SAN connections are set to a static 8Gb.

When we review the server bay connections you can see there is an additional adapter available on each Ethernet Adapter port.

 

 

PSOD on a non-existent piece of hardware

Over the last few weeks at a customer site we had some weird problems with random PSODs on HP BL460c Gen8 servers.

The customer is migrating from ESXi 5.5 to ESXi 6.0, and after a test period of a few months we started to upgrade all hosts using VMware Update Manager. After a few days some hosts suddenly started to give PSODs as shown below.

All PSODs happened on HP BL460c Gen8 blades equipped with QLogic and Emulex adapters. The other thing they have in common is that all PSODs were caused by Red Hat Linux servers with Oracle installed on them.

If we look at the PSOD and distill the cause, we see mlx4_core, which is required by mlx4_en. These are Mellanox drivers. The fun part is that we don’t have any Mellanox hardware installed in the blades. A similar post can be found here.

After consulting both HP and VMware, the conclusion was to uninstall the VIBs instead of upgrading them. The drivers should have been removed during the upgrade to ESXi 6 but remained on the host.

After removing the VIBs and rebooting the hosts there were no more PSODs.

esxcli commands:

esxcli software vib remove -n net-mlx4-en
esxcli software vib remove -n net-mlx4-core
esxcli software vib remove -n net-mst
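
Not every host will necessarily have all three VIBs installed, so when you have many hosts it can be handy to generate only the commands a given host needs. This is a small sketch in Python under some assumptions: the function removal_commands is my own helper, and the list of installed VIB names is assumed to have been collected beforehand (for example from the output of esxcli software vib list). The inventory shown is a made-up placeholder, not real output.

```python
# VIB names and their removal order are taken from the esxcli commands above.
MELLANOX_VIBS = ["net-mlx4-en", "net-mlx4-core", "net-mst"]


def removal_commands(installed):
    """Return esxcli remove commands for the Mellanox VIBs present on a host."""
    present = set(installed)
    return [
        f"esxcli software vib remove -n {name}"
        for name in MELLANOX_VIBS
        if name in present
    ]


# Hypothetical inventory of one host; "example-other-vib" is a placeholder name.
installed_on_host = ["net-mlx4-core", "example-other-vib", "net-mlx4-en"]

for command in removal_commands(installed_on_host):
    print(command)
```

The commands are emitted in the same order as above, so net-mlx4-en is removed before the net-mlx4-core VIB it depends on. The hosts still need a reboot afterwards.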

The only remaining question for me is why only Linux machines with Oracle can stress the host and cause a PSOD.

 

 

 

My learning path for the VCAP-DCD exam

At the beginning of this year I really wanted to challenge myself for the upcoming year. Completing MCSE 2012 last year once again showed me that if you understand the way Microsoft exams ask their questions, you can take virtually any exam without any preparation.

Even though Microsoft’s Hyper-V has really evolved over the last few years, I’m still a huge fan of the VMware virtualization platform, which has evolved even more. After being a VCP for some years now, I feel the time is right to take the next step. At the end of this month I will take the VMware vSphere 5 design workshop to prepare for the VCAP-DCD exam, which I think will be the most difficult exam I have ever taken in my life. After reading several blogs and forums about this exam I might even say that I am a little bit afraid of it. But, as with all my exams, I will prepare for the exam and take the test with full confidence. I hope to be fully prepared before the start of the summer, so that in the autumn I can start preparing for the VCAP-DCA exam.

There is so much material available that it is a real puzzle to find the right exam preparation tools. For a start I will take the design workshop and read Paul McSharry’s book, the official CertGuide for the VCAP-DCD (ISBN-13: 978-0-7897-5018-1), and of course I’ve downloaded the exam blueprint from VMware.

I wonder if anyone has any tips for me.