Posts

Troubleshooting Slow Drain Devices on Broadcom Switches

Slow drain devices are one of the more common problems on storage networks. They can occur for a variety of reasons. For a refresher on how they can affect your storage network, you should watch this video. In this blog post I will go through the basic steps to troubleshoot a slow drain device on a Broadcom fabric. I will be using command line output from switches. The CLI format lends itself to a blog post more readily than screenshots from a GUI, and the commands are consistent across different versions of FOS. SANnav is a huge change from Brocade or IBM Network Advisor, and the screens would look quite different between the two. The first command we will be using is porterrshow. The above output has been truncated for the ports we are interested in. The counters of interest are in the c3timeout column. You can see that there are 2 sub-columns, 'tx' and 'rx'. 'tx' means the switch is trying to send frames to the device attached to that p

Spectrum Virtualize NPIV and Host Connectivity

A while ago I wrote this post as an introduction to the Spectrum Virtualize NPIV feature. In this follow-up post I thought I would focus more on host connectivity and the effects of NPIV. You can watch a quick review of the NPIV feature in this IBM Systems Rockstar video. NPIV has 3 modes:
1. Disabled - this mode means that hosts cannot connect to the virtual World Wide Port Names (WWPNs) on the Spectrum Virtualize cluster, regardless of the fabric zoning.
2. Transitional - this mode means hosts can connect to either the physical or virtual WWPNs on the cluster. If a host is zoned to both, it will connect to both. Transitional mode is meant to be used only while you are migrating to NPIV mode and rezoning your hosts to the virtual WWPNs. It is not meant to be used permanently or even long-term.
3. Enabled - this means hosts can only connect to the virtual WWPNs. If they are zoned to the physical WWPNs the connection will be listed as 'blocked' in the devic

IBM Spectrum Virtualize Safeguarded Copy

Several months ago a local organization asked me if I could recover files from a system that had been encrypted by a ransomware attack. After looking at the hard drive in the system and doing some research, I told the organization that I could not. The organization did not have a backup of the files, at least not a recent one. Its most critical data loss was financial records. It took a few months and a lot of work to recover most of the missing records. Had the organization done something as simple as periodically plugging in a USB drive, running a backup and then removing the drive, it would have saved a lot of work. The USB drive is somewhat of an immutable copy of the data, at least as long as it is not plugged into the computer while the computer is still infected. However, a USB-attached drive doesn't really scale well at the enterprise level, and it is not a true immutable copy, since if it is plugged back into a computer that is infected, i

Brocade Fabric Performance Impact Notification

Brocade Fabric Performance Impact Notification (FPIN) was released in Broadcom FOS v9.0. It is available on Brocade Gen6 and Gen7 switches. This feature enables the switch to detect issues on a fabric, such as congestion or physical link issues, and then notify the affected devices that have registered for these notifications. FPIN functions in a similar way to RSCN. RSCN enables the fabric to send notifications to devices when a device they are zoned to is going offline. The devices that receive these notifications can then proactively take steps such as path failover rather than having to react to a path being down. FPIN provides a means to notify devices of link or other issues with a connection to a fabric or a path through it. For both RSCN and FPIN, a device must register with fabric services to receive these notifications. The new Brocade Gen7 hardware can send hardware or software signal notifications. Gen6 can only send software notificati

Long Distance Fibre Channel Link Tuning

In this video I talk about some of the variables involved in tuning long distance fibre-channel links. In this blog post I'll detail some of the tools that are available. I will also provide an example of estimating the number of buffer credits you will need. Note that this tuning is only for fibre-channel links. It does not apply to FCIP tunnels or circuits. One critical piece of information that you will need to calculate buffer credits is the frame size. Smaller frames mean more of them can fit in the link, so you would need more buffer credits. Of the variables that go into the formula, this is the only unknown. Everything else is either known or is a constant. Brocade has the 'portbuffershow' command that can tell you the average frame size for a link. You would look at the Framesize columns for TX and RX in the portbuffershow output to get the frame size. The portbuffershow output is organized by logical switch and then by port. O
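
As a rough illustration of why the frame size matters, here is a minimal Python sketch of a buffer credit estimate. It is not the formula from the post or from Brocade. It assumes light takes about 5 microseconds per kilometre in fibre, that each 1 Gbps of link speed carries roughly 100 MB/s in one direction, and that the sender needs enough credits to cover one full round trip. The function name and constants are hypothetical.

```python
import math

# Assumptions (not from the post): ~5 microseconds of propagation delay per
# km of fibre, and ~100 MB/s of one-way throughput per 1 Gbps of link speed.
FIBRE_DELAY_US_PER_KM = 5.0
MB_PER_SEC_PER_GBPS = 100.0


def estimate_buffer_credits(distance_km, speed_gbps, avg_frame_bytes):
    """Estimate the credits needed to keep a link full for one round trip."""
    # A credit is only returned after the frame arrives and the
    # acknowledgement travels back, so use the round-trip time.
    round_trip_us = 2 * distance_km * FIBRE_DELAY_US_PER_KM
    # 100 MB/s is the same as 100 bytes per microsecond.
    bytes_per_us = speed_gbps * MB_PER_SEC_PER_GBPS
    # Time one average-sized frame occupies the link.
    frame_time_us = avg_frame_bytes / bytes_per_us
    # Number of frames that can be in flight during one round trip.
    return math.ceil(round_trip_us / frame_time_us)


if __name__ == "__main__":
    # Hypothetical 50 km link at 8 Gbps, full-size vs. smaller frames.
    print(estimate_buffer_credits(50, 8, 2148))
    print(estimate_buffer_credits(50, 8, 1024))
```

Under these assumptions, halving the average frame size roughly doubles the estimate, which is why the Framesize value from portbuffershow is the input worth measuring rather than guessing.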

Using the IBM Storage Insights Pro Grouping Features

I recently wrote this post on how you can help IBM Storage Support help you by ensuring you are utilizing the full monitoring features available on your storage systems and switches. You should also have at least the free version of IBM Storage Insights installed. If you have Storage Insights Pro or Storage Insights for Spectrum Control, there are some additional steps you should take that will help both you and the IBM Support team resolve your problems as quickly as possible. IBM Storage Insights Pro and Storage Insights for Spectrum Control come with some powerful features for grouping and organizing storage resources. These features are found under the Groups menu. You can organize your storage resources into Applications, Departments and General Groups. There is a hierarchy to the organization of resources. Departments can contain sub-departments, Applications or General Groups. Applications can contain hosts or other applications. General Groups can

Help IBM Storage Support Help You

A client recently asked me what the most effective thing was that his company could do to get me the data that would be most helpful in troubleshooting problems in his solution. This was after we were unable to provide a definitive root cause for a problem that occurred intermittently in his solution. He had a fairly simple fabric that consisted of two 96-port switches, a few IBM Storage Systems and 30 or so hosts. His problem was an issue with performance on the hosts. At the time, the best I was able to tell him was that the data indicated a slight correlation between host read activity and a performance problem, but I was not able to confirm anything with certainty. My answer was simple: configure better event detection and system logging. This is something I teach as a best practice at IBM Technical University. I also suggested that his company install at least the free version of IBM Storage Insights. Without a performance monitoring tool, troubleshooting performa