Tuesday, February 21, 2012

IMC trap adjustments

IMC is a great piece of code, but it does require some adjustments from the defaults: some traps have poor descriptions, and the default trap-to-alarm rules do not always escalate what requires attention.

This post is a draft, but I am using it to document the things I would change on a factory-default IMC with respect to the above. You will notice that my adjustments tend to revolve around making use of the parameters passed with the trap and highlighting in bold the name of the feature or process raising the trap, so that glancing through alarms lets you easily see what to look for when analyzing on the CLI.

Loopback-detection: broadcast probing for loops when STP isn't.

Often enough I find myself recommending this function on edge devices as an additional peace-of-mind loop prevention measure. Loopback-detection sends a broadcast probe on the port, and if it hears itself (by default on the same port; you can enable multi-port) it will block inbound packets until it stops hearing itself for a duration equal to 2 or 3 times the probe timer. It's a low-maintenance feature which does its thing and auto-recovers from incidents by default (you can have it shut the port down instead). IMC's descriptions for the related traps are poor, and the traps are not escalated to alarms by default.


OID: 1.3.6.1.4.1.25506.2.95.1.6.1
Original description: Loopback detected on an interface.
New description: LOOPBACK-DETECTION: Loop detected on interface $2.
Comment: We are being passed ifDescr (param 2) as a trap parameter - how about using it?

OID: 1.3.6.1.4.1.25506.2.95.1.6.2
Original description: Trap message is generated when the loops on the interface are eliminated.
New description: LOOPBACK-DETECTION: Loop no longer exists on interface $2.
Comment: We are being passed ifDescr (param 2) as a trap parameter - how about using it?


Add the above to a Trap to Alarm object titled "Loopback-detection". You can also program OID.2 as a recovery for OID.1 if desirable.
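As a rough illustration of the recovery pairing described above, here is a small Python sketch that tracks active alarms per device and interface: the .1 trap raises an alarm and the .2 trap clears it. `AlarmTable` and `on_trap` are hypothetical names for illustration only and are not part of IMC; the first trap parameter in the example is a made-up placeholder.

```python
LOOP_DETECTED = "1.3.6.1.4.1.25506.2.95.1.6.1"
LOOP_CLEARED = "1.3.6.1.4.1.25506.2.95.1.6.2"


class AlarmTable:
    """Hypothetical in-memory alarm table; this is not an IMC API."""

    def __init__(self):
        self.active = {}  # (device, interface) -> alarm text

    def on_trap(self, device, oid, params):
        # params[1] is trap parameter 2 (ifDescr), as noted in the comments above.
        interface = params[1]
        key = (device, interface)
        if oid == LOOP_DETECTED:
            self.active[key] = (
                f"LOOPBACK-DETECTION: Loop detected on interface {interface}."
            )
        elif oid == LOOP_CLEARED:
            # The recovery trap clears the matching alarm, if any.
            self.active.pop(key, None)


table = AlarmTable()
# "17" stands in for parameter 1, whose meaning is not covered here.
table.on_trap("sw1", LOOP_DETECTED, ["17", "GigabitEthernet1/0/17"])
print(table.active)
table.on_trap("sw1", LOOP_CLEARED, ["17", "GigabitEthernet1/0/17"])
print(table.active)
```

Keying on the interface name means a loop clearing on one port never clears an alarm raised for another port of the same device.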


Storm-constrain: metered broadcast/multicast/unicast with rising and falling thresholds and actions

OID: 1.3.6.1.4.1.25506.2.66.3.6.1
Original description: Any type of the flux "$2" exceeds its upper limit "$3" on a port of Device "$R($a)".
New description: STORM-CONSTRAIN: $2(Broadcast:1,Multicast:2,Unicast:3) storm rising on port $c of device "$R($a)". Port status is $4(controlled:1,normal:2).
Comment: Let's use the parameter switch with parentheses, which the factory description of the related falling trap below was making use of. I wish HP reported ifDescr as a parameter for this trap: tracking by port index in a 600-port chassis is a pleasure that only a select few seem to appreciate, myself excluded.


OID: 1.3.6.1.4.1.25506.2.66.3.6.1
Original description: A flux which used to overflow its upper limit, falls below its lower limit "$3" on a port(Index:$c) of Device "$R($a)". Trap type is $2. The port status is $4.
New description: STORM-CONSTRAIN: $2(Broadcast:1,Multicast:2,Unicast:3) storm falling on port $c of device "$R($a)". Port status is $4(controlled:1,normal:2).
Comment: Same parameter switch with parentheses as for the rising trap. Here again, I wish HP reported ifDescr as a parameter for this trap. Or we could have a function in IMC which returns ifDescr (Gig4/0/34) from the index (167, an arbitrary example of the kind of madness you get trying to locate what the hell port 167 is in your 10-slot 7500).
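To make the parameter switch less abstract, here is a minimal Python sketch that mimics how a description such as the ones above could be expanded. It handles only `$N` and `$N(label:value,...)`; the `render` function is my own illustration of the idea, not IMC's actual implementation, and the sample parameter values are made up.

```python
import re

# Matches $2 as well as $2(Broadcast:1,Multicast:2,Unicast:3).
_PLACEHOLDER = re.compile(r"\$(\d+)(?:\(([^)]*)\))?")


def render(template, params):
    """Expand $N and $N(label:value,...) using a 1-based parameter list."""

    def expand(match):
        index = int(match.group(1))
        raw = str(params[index - 1])
        mapping = match.group(2)
        if mapping is None:
            return raw
        # "Broadcast:1,Multicast:2" becomes {"1": "Broadcast", "2": "Multicast"}.
        labels = {}
        for pair in mapping.split(","):
            label, _, value = pair.partition(":")
            labels[value.strip()] = label.strip()
        # Fall back to the raw value when no label matches.
        return labels.get(raw, raw)

    return _PLACEHOLDER.sub(expand, template)


template = (
    "STORM-CONSTRAIN: $2(Broadcast:1,Multicast:2,Unicast:3) storm rising. "
    "Port status is $4(controlled:1,normal:2)."
)
# Hypothetical trap parameters: only positions 2 and 4 are used here.
print(render(template, ["167", "2", "80", "1"]))
```

Placeholders such as `$c` or `$R($a)` are left untouched by this sketch, since only numbered parameters are matched.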

Sunday, February 19, 2012

HP A-Series / H3C / Comware HTTPS howto with Microsoft CA 2008

Like most leading switching vendors' platforms, Comware has an HTTPS management interface available. Unlike some of the leading switching vendors, Comware's web interface actually lets you do a whole lot of core stuff besides applying macros to interfaces.
With that said, Comware-based switches have no provision for creating self-signed certificates. Some are flustered by this shortcoming - usually the same folks who get a false feeling of security just from typing https:// instead of http:// and who don't understand that it takes 5 seconds to fire up a MITM tool with HTTPS support to intercept the credentials.
This article is a short howto on using a Microsoft Windows 2008 CA to automatically (SCEP) generate certificates for HP Comware-based switches, including Comware configuration to get this going. 
Before we begin, let's make sure NTP is correctly set up and that your switches are reasonably in sync with the CA's time. This isn't a requirement per se, but the switch will refuse CA certificates with issuance times in the future.
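The clock issue boils down to a simple comparison between the certificate's issuance time and the switch's idea of "now". The sketch below shows that check in Python; the function name, the tolerance parameter and the timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def issued_in_future(not_before, device_time, tolerance=timedelta(0)):
    """Return True if the certificate would look 'not yet valid' to the device."""
    return not_before > device_time + tolerance


# Hypothetical values: the CA issued its certificate at 21:40 UTC,
# while the switch clock is running five minutes behind.
ca_not_before = datetime(2012, 2, 19, 21, 40, tzinfo=timezone.utc)
switch_clock = datetime(2012, 2, 19, 21, 35, tzinfo=timezone.utc)

print(issued_in_future(ca_not_before, switch_clock))  # True
```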
1) Configure the PKI entity, which defines parameters for the switch itself.
#
pki entity a5120
  common-name a5120.mforelab.com
  country CA
#

2) Configure the PKI domain, which defines parameters for your CA.
#
pki domain mforelab
  ca identifier win2k8
  certificate request url http://10.1.4.65/certsrv/mscep/mscep.dll
  certificate request from ra
  certificate request entity a5120
  crl check disable
#
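If you have more than a handful of switches to enroll, the two stanzas above lend themselves to templating. The following Python sketch simply fills in the commands already shown; the function name and its parameters are my own, and nothing here talks to a switch.

```python
def pki_config(entity, fqdn, country, domain, ca_identifier, scep_url):
    """Build the PKI entity and PKI domain stanzas for one switch."""
    lines = [
        "#",
        f"pki entity {entity}",
        f"  common-name {fqdn}",
        f"  country {country}",
        "#",
        f"pki domain {domain}",
        f"  ca identifier {ca_identifier}",
        f"  certificate request url {scep_url}",
        "  certificate request from ra",
        f"  certificate request entity {entity}",
        "  crl check disable",
        "#",
    ]
    return "\n".join(lines)


print(
    pki_config(
        entity="a5120",
        fqdn="a5120.mforelab.com",
        country="CA",
        domain="mforelab",
        ca_identifier="win2k8",
        scep_url="http://10.1.4.65/certsrv/mscep/mscep.dll",
    )
)
```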


3) Request the CA certificate through SCEP.
[A5120-24G-PoE+]pki retrieval-certificate ca domain mforelab
The trusted CA's finger print is:
    MD5  fingerprint:E27E 2F32 9ADF B410 C5C1 12B9 2A45 5DA7
    SHA1 fingerprint:4AD6 5188 2394 441F 66F7 65B8 0D41 EB89 1CB8 7FB8

Is the finger print correct?(Y/N):Y

Saving CA/RA certificates chain, please wait a moment......
%Feb 19 21:46:44:336 2012 A5120-24G-PoE+ PKI/6/PKI_CA_CERT_TRUSTED: Root CA certificate of the domain mforelab is trusted.....
CA certificates retrieval success.
%Feb 19 21:46:49:064 2012 A5120-24G-PoE+ PKI/6/PKI_RETRIEVAL_CA_SUCC: Retrieved the CA certificates of domain mforelab successfully.
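Before answering Y to the fingerprint prompt, it is worth comparing against a value computed independently from the CA certificate. The sketch below formats a digest the way the switch prints it (uppercase hex in groups of four). It assumes the fingerprint is the plain MD5 or SHA-1 hash of the DER-encoded certificate, which is the usual convention but is not confirmed by the output above.

```python
import hashlib


def comware_fingerprint(der_bytes, algorithm):
    """Format a certificate digest as 'E27E 2F32 ...' for easy comparison."""
    digest = hashlib.new(algorithm, der_bytes).hexdigest().upper()
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))


# Usage with a DER-encoded CA certificate exported from the CA
# ("ca.cer" is a hypothetical file name):
#
#   with open("ca.cer", "rb") as handle:
#       der = handle.read()
#   print("MD5 ", comware_fingerprint(der, "md5"))
#   print("SHA1", comware_fingerprint(der, "sha1"))
```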

4) Comware does not support the SCEP challenge password, so on Windows 2008 you have to set the following registry value to 0 (it's an actual configuration option in 2003):
HKLM\Software\Microsoft\Cryptography\MSCEP\EnforcePassword\EnforcePassword

5) Let's request the certificate. The first attempt shows what happens if your Windows CA is still enforcing the challenge password. The second, successful one is with the SCEP password disabled through the registry entry above (don't forget to restart the services; skipping that is why I got the failure below on the first try).

[A5120-24G-PoE+]pki request-certificate domain mforelab
Certificate is being requested, please wait......
[A5120-24G-PoE+]
Enrolling the local certificate,please wait a while......
Certificate request failed.

### ... forgot to restart the darn CA service after regedit - here we go again ###
[A5120-24G-PoE+]pki request-certificate domain mforelab
Certificate is being requested, please wait......
[A5120-24G-PoE+]
Enrolling the local certificate,please wait a while......
Certificate request Successfully!
Saving the local certificate to device......
Done!

%Feb 19 21:53:26:224 2012 A5120-24G-PoE+ PKI/6/PKI_REQUEST_CERT_SUCC: Requested the local certificate of domain mforelab successfully.
[A5120-24G-PoE+]

6) Configure the SSL policy, bind it to the HTTPS service and enable HTTPS
#
ssl server-policy sslswitch
 pki-domain mforelab
#
 ip https ssl-server-policy sslswitch
 ip https enable
#

Don't forget that you must trust the issuing CA's certificate on the machines from which you plan to manage your Comware environment, in addition to creating A records for all the managed devices in DNS and/or a static hosts file.

Saturday, June 4, 2011

Comware RRPP usage in layer 2 bridging - notes and configuration

Most switching vendors nowadays have an implementation of a ring protocol with fast convergence to replace slower STP and all its associated pitfalls.

Comware has RRPP (Rapid Ring Protection Protocol - based on Extreme EAPS or IETF RFC 3619), Cisco has REP (Resilient Ethernet Protocol), Brocade has MRP (Metro Ring Protocol), Juniper has ERP (Ethernet Ring Protocol) and other vendors likely all have their own approach to ring-type topologies.

Most of these protocols define a more or less similar technique in which one device along the ring is designated as the master and blocks an interface from transmitting traffic unless a failure is detected in the ring. Most of the ring protocols offer support for intersecting / overlapping rings. Cisco's REP offers support for the ring to have one of its segments shared with STP.

This post centers around implementing a single Comware RRPP ring on HP A-series switches. While ring protocols allow for extremely fast convergence, another very useful function is their capability to isolate STP domains. This is due to ring protocols replacing STP on the links over which they run (with the notable exception of REP's capability to share a single segment with STP). Isolating STP fault domains is a worthwhile feature which I tend to recommend in datacenter bridging topologies in which a remote datacenter uses a pair of layer 2 links for connectivity and redundancy (a necessary evil in today's "vmotion" world). Running a ring protocol over such redundant links not only means fast convergence, but also isolation of the two sites' STP domains, a clear security improvement over having the whole thing converge as one.

The following schema presents a typical topology associated with implementing RRPP for bridging two sites at layer 2 redundantly and the biggest pitfall associated with ring protocols:


This topology makes use of redundant switches in each site to terminate both links, resulting in 4 devices participating in the ring. As the ring devices are themselves connected to an existing STP domain, one can likely foresee the two network loops that will result from this topology being implemented. The fact that RRPP would only block the backup link and that BPDUs would not be exchanged over the ring would result in a loop being created between all 4 of each site's switches. Cisco's REP's ability to share a single link with STP in an open ring topology would only solve one of the above loops, and the second site would remain with a loop condition.

There are three solutions that can alleviate this issue:
  1. Virtualize the STP switches and use a technology like HP Smart Links or Cisco Flex Links to ensure that a link going down due to a defective ring switch results in an alternate link to the second ring switch being brought up. Cisco users would have to resort to using Stackwise-equipped switches or a VSS pair, limiting product selection to 3750s and 6500s, and ME-series switches for the ring (Cisco does not support REP on anything other than SP gear). HP users could pick just about any device that supports IRF and Smart Links, which covers 90% of the A-series product line, and RRPP-supporting switches, which means the 5120 and above.
  2. Virtualize the ring switches, and talk STP with the site's existing STP domain. A clearly simpler approach which greatly simplifies configuration of both the ring and each site's implementation. While at this point Cisco users are limited to using ME3750s to achieve this (only Metro Ethernet products support REP), HP users can once again pick any A-series device that supports IRF and RRPP, which means the 5120, 5500, 5800 and above.
  3. Virtualize the ring switches and the site's STP domain switches: STP-free, link-aggregation, all-links-active network utopia! The HP solution will involve either doing it all within the same IRFed 5500s, or using IRFed 5120s for the ring and IRFed 5500s or any chassis solution for your site's layer 3 core. Cisco users will find themselves implementing that with stackwised ME3750s at the ring and stackwised 3750s or VSSed 6500s at the core, all of which will carry a high price tag.

RRPP is implemented on A-series through the definition of an MST instance (it only uses MST's instance constructs - it does not rely on MST in any other way) and the association of a ring to protect the created instance. Take heed that any misconfiguration of the MST instance, including forgetting to activate the region config, can result in a loop being created. Make sure you adjust your region configuration before allowing any new VLANs on your ring-enabled ports' trunking configs. Additionally, each RRPP domain uses 2 control VLANs to exchange ring state over - the number provided in the configuration will actually end up also reserving the next VLAN (4092 being configured below also uses 4093).
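The control VLAN arithmetic is easy to get wrong when planning VLAN ranges, so here is a small Python sketch of it. The helper names are hypothetical, and the idea that protected VLANs should stay clear of the two control VLANs is my assumption rather than something stated above.

```python
def control_vlans(configured):
    """The configured control VLAN also reserves the next VLAN ID."""
    return (configured, configured + 1)


def overlapping(protected, configured_control):
    """Return protected VLANs that collide with the two control VLANs."""
    return sorted(set(protected) & set(control_vlans(configured_control)))


protected = range(1, 4001)  # matches "instance 1 vlan 1 to 4000"

print(control_vlans(4092))          # (4092, 4093)
print(overlapping(protected, 4092))  # []
print(overlapping(protected, 4000))  # [4000]
```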


stp region-configuration
 region-name TEST
 instance 1 vlan 1 to 4000
 active region-configuration
#
interface GigabitEthernet1/0/24
 port link-mode bridge
 port link-type trunk
 port trunk permit vlan 1
 stp disable
#
rrpp domain 1
 control-vlan 4092
 protected-vlan reference-instance 1
 timer hello-timer 2 fail-timer 7
 timer fast-hello-timer 100
 ring 1 node-mode master primary-port GigabitEthernet1/0/21 secondary-port GigabitEthernet1/0/22 level 0
 ring 1 enable
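To illustrate the master behaviour described earlier (the secondary port stays blocked unless a ring failure is detected), here is a deliberately simplified Python model. It only considers hello packets and the fail timer from the config above; real RRPP has additional mechanisms that this sketch ignores, and the class and method names are hypothetical.

```python
class RingMaster:
    """Toy model of a ring master node; not the actual RRPP state machine."""

    def __init__(self, hello_timer=2, fail_timer=7):
        self.hello_timer = hello_timer  # seconds between hellos sent on the primary port
        self.fail_timer = fail_timer    # seconds without a hello before declaring failure
        self.last_hello_seen = 0.0
        self.secondary_blocked = True   # a healthy ring keeps the secondary port blocked

    def hello_received(self, now):
        # Our own hello came back on the secondary port: the ring is complete.
        self.last_hello_seen = now
        self.secondary_blocked = True

    def tick(self, now):
        # No hello for fail_timer seconds: assume the ring is broken and unblock.
        if now - self.last_hello_seen >= self.fail_timer:
            self.secondary_blocked = False
        return self.secondary_blocked


master = RingMaster()
master.hello_received(now=2)
print(master.tick(now=4))   # True: ring healthy, secondary port blocked
print(master.tick(now=9))   # False: 7 seconds without a hello, port unblocked
master.hello_received(now=10)
print(master.tick(now=11))  # True: ring restored, secondary port blocked again
```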

Saturday, May 14, 2011

HP A-Series / H3C / Comware RADIUS Administrative Login HOWTO

Most of the larger networks I work on use central authentication to keep credential management from becoming a nightmare.

Comware-based devices require some specific attributes to be returned by the RADIUS server in order to allow for administrative login.

Vendor ID 2011, attribute ID 29 will let you specify the user level to apply, using the following values:

0 H3C-Visitor
1 H3C-Monitor
2 H3C-Manager
3 H3C-Administrator

Additionally, you will want to return the standard attribute Login-Service (AVP Type 15) with a value of "telnet" (0) if you want to grant telnet access, 50 for SSH and 52 for console access. Comware gear is picky about having the RADIUS server return the exact login-service along with the right exec privilege.
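For those who like to see the bytes, the sketch below encodes the two reply attributes following the standard RADIUS attribute layout (type, length, value, with vendor-specific data wrapped in attribute 26). It assumes both values are sent as 32-bit integers, and the constant and function names are mine, not taken from any RADIUS server.

```python
import struct

H3C_VENDOR_ID = 2011
H3C_EXEC_PRIVILEGE = 29   # vendor attribute ID carrying the user level
LOGIN_SERVICE = 15        # standard Login-Service attribute
VENDOR_SPECIFIC = 26      # standard Vendor-Specific attribute

LOGIN_SERVICE_VALUES = {"telnet": 0, "ssh": 50, "console": 52}


def login_service_avp(login_type):
    """Type 15, length 6, 32-bit value."""
    return struct.pack("!BBI", LOGIN_SERVICE, 6, LOGIN_SERVICE_VALUES[login_type])


def exec_privilege_avp(level):
    """Type 26, length 12, vendor ID, then vendor type 29, length 6, 32-bit level."""
    if level not in (0, 1, 2, 3):
        raise ValueError("level must be 0 (Visitor) to 3 (Administrator)")
    return struct.pack(
        "!BBIBBI", VENDOR_SPECIFIC, 12, H3C_VENDOR_ID, H3C_EXEC_PRIVILEGE, 6, level
    )


print(login_service_avp("ssh").hex())  # 0f0600000032
print(exec_privilege_avp(3).hex())     # 1a0c000007db1d0600000003
```

Comparing these bytes with a packet capture of the Access-Accept is a quick way to confirm that the server is really returning both attributes.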

For those of you with Microsoft RADIUS servers, you must alter the following file and add the above login-service AVPs to the right section:
c:\windows\system32\ias\dnary.xml

The section looks like this when properly filled in. Reboot the server after editing the file (no, a service restart is not sufficient).
 

<StandardValues>
   <StandardValue>
    <Name>Telnet</Name>
    <Value>0</Value>
   </StandardValue>
   <StandardValue>
    <Name>Rlogin</Name>
    <Value>1</Value>
   </StandardValue>
   <StandardValue>
    <Name>TCP Clear</Name>
    <Value>2</Value>
   </StandardValue>
   <StandardValue>
    <Name>Portmaster (proprietary)</Name>
    <Value>3</Value>
   </StandardValue>
   <StandardValue>
    <Name>LAT</Name>
    <Value>4</Value>
   </StandardValue>
   <StandardValue>
    <Name>X25-PAD</Name>
    <Value>5</Value>
   </StandardValue>
   <StandardValue>
    <Name>X25-T3POS</Name>
    <Value>6</Value>
   </StandardValue>
   <StandardValue>
    <Name>ssh</Name>
    <Value>50</Value>
   </StandardValue>
   <StandardValue>
    <Name>console</Name>
    <Value>52</Value>
   </StandardValue>
   <StandardValue>
    <Name>TCP Clear Quiet (suppresses any NAS-generated connect string)</Name>
    <Value>8</Value>
   </StandardValue>
  </StandardValues>
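After editing, a quick sanity check can confirm that the section still parses and that the two new values are present. The Python sketch below works on a `<StandardValues>` fragment like the one above, copied out of the file; how the fragment is nested in the real file is not shown here, so extracting it is left to you. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

REQUIRED_VALUES = {50, 52}  # ssh and console, as added above


def login_service_values(xml_text):
    """Map each <Name> to its integer <Value> within a StandardValues fragment."""
    root = ET.fromstring(xml_text)
    return {
        entry.findtext("Name"): int(entry.findtext("Value"))
        for entry in root.iter("StandardValue")
    }


def missing_values(xml_text):
    """Return the required Login-Service values that are absent from the fragment."""
    return REQUIRED_VALUES - set(login_service_values(xml_text).values())


# Shortened sample fragment where the console entry was forgotten.
sample = """
<StandardValues>
  <StandardValue><Name>Telnet</Name><Value>0</Value></StandardValue>
  <StandardValue><Name>ssh</Name><Value>50</Value></StandardValue>
</StandardValues>
"""
print(login_service_values(sample))
print(missing_values(sample))
```

A parse error from `ET.fromstring` is itself useful here, since a malformed edit is an easy mistake to make in a hand-edited XML file.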

The message "Admin user's login type mismatches the radius server assigned" when debugging RADIUS means that you are trying to log in through telnet, ssh or the console and the RADIUS server has either not returned the Login-Service attribute or has returned a different one.

If you are reading this right, this means that you will need a RADIUS policy that matches the NAS port type at login, and you will usually pick either telnet or ssh for your remote shell access. Console access will see the switch passing a different NAS port type than vty access, which allows for differentiation at the RADIUS policy level.
Looking at a Wireshark trace, the returned attributes will look like this:


The following config accomplishes RADIUS authentication (tested on an A5800 running 5.20 R1206):


#
 domain default enable RADLAB
#
radius scheme SCHEME-LAB
 server-type extended
 primary authentication 10.1.1.1
 primary accounting 10.1.1.1
 key authentication RADKEY
 key accounting RADKEY
 user-name-format without-domain
#
domain RADLAB
 authentication login radius-scheme SCHEME-LAB
 authorization login radius-scheme SCHEME-LAB
 accounting login radius-scheme SCHEME-LAB
 access-limit disable
 state active
 idle-cut disable
 self-service-url disable
#
user-interface vty 0 15
 authentication-mode scheme
#

If you want to differentiate the admin login requests coming from Comware switches, you may do so using a number of attributes that are attached to the Access-Request sent by Comware: