Configure New Block Storage for NeVe

  1. Connect the Large Memory nodes to the fabric switches  *done*

The new DS3500 for Large Memory node block storage will be installed into rack H26

Tasks required are:

  1. Install DS3500  *done*
  2. Cable and zone onto the SAN fabric  (still tracing faults)
  3. Configure and create arrays  *underway*
  4. Create LUNs and map them to the Large Memory nodes (see the sketch below)  *underway*
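A minimal sketch of step 4 using the DS Storage Manager CLI; the controller IP, array name, LUN label, capacity, and host-group name below are all placeholders, and the script syntax should be verified against the installed SMcli version:

    # Create a logical drive on an existing array (names/sizes are examples)
    SMcli 10.1.1.10 -c "create logicalDrive array=\"lmn_array\" userLabel=\"lmn_lun1\" capacity=2TB;"
    # Map it to the Large Memory node host group at LUN 0
    SMcli 10.1.1.10 -c "set logicalDrive [\"lmn_lun1\"] logicalUnitNumber=0 hostGroup=\"LargeMemoryNodes\";"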

The addition of block storage for the large memory nodes will require:

Prerequisite: 8 x FC cables are required from rack M27 to rack M26  *done*

  1. HBA card install and configuration  *done*
  2. FC cabling to the SAN fabric switch and subsequent zoning  TBD
  3. Install HBA drivers and the Linux RDAC driver (see the checks below)  *to be checked*
  4. Configure block storage for Large Memory node (LMN) use  *underway*
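For step 3, a quick way to confirm the HBA and Linux RDAC (MPP) drivers are in place on a RHEL6 host; the paths assume the standard LSI linuxrdac package layout:

    # Check that the RDAC/MPP kernel modules are loaded
    lsmod | egrep 'mppUpper|mppVhba'
    # List the virtual RDAC devices presented to the host
    /opt/mpp/lsvdev
    # Confirm the FC HBAs are visible and their ports are online
    cat /sys/class/fc_host/host*/port_state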

See Aaron's documentation for the Large Memory node setup requirements: NeVE Block Storage RHEL6 Installation.

Config & Expand Meta Data Store for GPFS

DS3500 Metadata Subsystem

  1. Additional metadata LUNs to be configured for each file system (see the sketch below)  *done*
  2. Copy metadata  *done*

Note: the new metadata storage does not use the same size drives as the existing storage.
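A sketch of how the additional metadata LUNs would typically be brought into GPFS, assuming a GPFS 3.5-style stanza file and the placeholder file system name "gpfs1" (older releases use colon-separated descriptor files instead):

    # newdisks.stanza (device, servers and failure group are placeholders):
    #   %nsd: device=/dev/mapper/meta_lun1 nsd=meta_nsd1
    #         servers=gpfs1,gpfs2 usage=metadataOnly failureGroup=2
    mmcrnsd -F newdisks.stanza          # create the NSDs
    mmadddisk gpfs1 -F newdisks.stanza  # add them to the file system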

Meta Data Replication

  1. Replicate metadata and change the GPFS configuration appropriately (see the sketch below)  *pending*
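A sketch of the commands this step would involve, assuming the placeholder file system name "gpfs1" and a goal of two metadata replicas across two failure groups; note that the maximum metadata replication factor (-M) is fixed at file system creation and must already be at least 2:

    mmlsfs gpfs1 -m -M     # current default and maximum metadata replication
    mmchfs gpfs1 -m 2      # set the default number of metadata replicas to 2
    mmrestripefs gpfs1 -R  # re-replicate existing metadata to match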

TSM Backups

mmbackup is currently failing.

  1. Investigate, configure, and test the TSM backup configuration (see the checks below)

Prerequisite: Install 10G Ethernet cards and connect to the ITS TSM subnet  *done*
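Some starting checks for the failing mmbackup, assuming the TSM client is configured on the node running the backup and "gpfs1" is a placeholder file system name:

    dsmc query session             # verify the TSM client can reach the server
    mmbackup gpfs1 -t incremental  # re-run the backup and capture the errors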

Validate GPFS Configuration

  1. Verify that the existing GPFS parameters are optimal (see the inspection commands below)
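Commands for gathering the current parameters to review:

    mmlsconfig       # cluster-wide configuration as set
    mmdiag --config  # values actually in effect on the local node
    mmlsfs all       # per-file-system parameters (block size, replication, ...)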

Validate IB / IPoIB configuration

A standard configuration is wanted for the servers and compute nodes.

  1. See Gerard and Yuriy's notes: PAN_Infiniband_Tuning Tips and Configuration
  2. Validate the IS5200 configuration (see the checks below)
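Host-side checks that help validate the IB/IPoIB setup; "ib0" as the IPoIB interface name is an assumption:

    ibstat                       # HCA state, link rate and port status
    iblinkinfo                   # link speed/width for every port in the fabric
    cat /sys/class/net/ib0/mode  # datagram vs connected mode
    ip link show ib0             # check the MTU (65520 is typical for connected mode)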

Validate FC configuration

The DS3500 Storage Subsystems should be tuned for optimal performance in a GPFS environment.

  1. For each DS3500, check and modify the controllers for optimum settings  *done*

NB: The changes were made to values from the original "As Built" document, so the "As Built" does not need modifying.

Storage Subsystem | Usage | Rack Location | Start cache flushing (%) | Stop cache flushing (%) | Cache block size (KB) | Read cache | Write cache | Write cache without batteries | Write cache with mirroring | Flush write cache after (s) | Dynamic cache read prefetch
ds_meta | GPFS Metadata | New Rack | 80 | 80 | 4 | Enabled | Enabled | Disabled | Enabled | 10 | ???
ds_meta (change to) |  |  | 50 | 50 | 16 |  |  |  |  |  |
ds_data1 | GPFS Data | A3-U11 | 80 | 80 | 4 | Enabled | Enabled | Disabled | Enabled | 10 | Enabled
ds_data2 | GPFS Data | A3-U21 | 80 | 80 | 4 | Enabled | Enabled | Disabled | Enabled | 10 | Enabled
ds_data3 | GPFS Data | A3-U31 | 80 | 80 | 4 | Enabled | Enabled | Disabled | Enabled | 10 | Enabled
ds_data1-3 (change to) |  |  | 50 | 50 | 16 | Disabled | Disabled | Disabled | Disabled | 10 | Disabled
ds_neve1 | Block Storage | New Rack | 80 | 80 | 4 | Enabled | Enabled | Disabled | Enabled | 10 | Enabled
ds_neve1 (change to) |  |  | 70 | 70 | 16 |  |  |  |  |  |

Blank cells in the "change to" rows indicate settings that are left unchanged.
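A sketch of applying the "change to" values with the DS Storage Manager CLI, using the ds_data subsystems as the example; the controller IP is a placeholder and the parameter names follow the DS script language, so they should be verified against the installed SMcli version:

    # Subsystem-wide cache settings
    SMcli 10.1.1.20 -c "set storageSubsystem cacheBlockSize=16;"
    SMcli 10.1.1.20 -c "set storageSubsystem cacheFlushStart=50 cacheFlushStop=50;"
    # Per-logical-drive cache flags, applied to every logical drive on the subsystem
    SMcli 10.1.1.20 -c "set allLogicalDrives readCacheEnabled=FALSE writeCacheEnabled=FALSE mirrorCacheEnabled=FALSE cacheReadPrefetch=FALSE;"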

DS3500 Fibre Cabling

The original installation only connected one FC port from each DS3500 Storage Subsystem controller to the SAN Fabric switch.

  1. Connect the extra FC ports to provide increased bandwidth (see the zoning sketch below)  status?

Prerequisite: 6 x FC cables are required for intra-rack connections  *done*

Note: several short cables are required for the intra-rack connections from the DS3512 units to the SAN switches.
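A minimal zoning sketch for the newly connected ports, assuming Brocade fabric switches; the alias, zone, and config names and the WWPN are all placeholders:

    alicreate "ds_data1_ctlA_p2", "20:24:00:80:e5:aa:bb:cc"   # new DS3500 port WWPN
    zonecreate "gpfs1_ds_data1_p2", "gpfs1_hba1; ds_data1_ctlA_p2"
    cfgadd "pan_cfg", "gpfs1_ds_data1_p2"
    cfgenable "pan_cfg"                                       # activate and save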

Mellanox 4036E Configuration (IP-to-IB Gateways)

  1. Configure bridging of the private IB IP network through to the NeVe hosts
  2. Configure 10G VLANs on the IP switches to carry data traffic
  3. Update the firmware and complete the configuration

NB: The longer-term goal is to use dedicated 10G interfaces in the NeVe hosts for data traffic.

Performance Testing

  1. Specifically, run the Landcare Possum code to find why it is 20x slower than on local disk (and on other clusters); see the comparison sketch below.
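A crude first comparison, assuming "/gpfs/scratch" is a placeholder for a real GPFS path: stream the same write to GPFS and to local disk, then profile the application's I/O pattern if the streaming rates turn out to be comparable.

    # Streaming write to GPFS, bypassing the page cache
    dd if=/dev/zero of=/gpfs/scratch/ddtest bs=1M count=4096 oflag=direct
    # The same write to local disk for comparison
    dd if=/dev/zero of=/tmp/ddtest bs=1M count=4096 oflag=direct
    # If streaming rates are similar, suspect small or random I/O:
    # summarise the syscalls the Possum run actually makes
    strace -c -f <possum_command>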

Topology diagram

  1. Update the network, FC, and IB "As Built" documents

Assistance setting up monitoring

  1. Monitoring and traffic logs for the FC/IB networks (see the mmpmon sketch below for GPFS-level I/O counters)
  2. Monitoring and logging of the disk arrays
  3. Notifications
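On the GPFS side, mmpmon can feed I/O counters into whatever logging or graphing is chosen; a minimal one-shot example (run as root on a cluster node):

    # Machine-parsable file system I/O statistics
    echo fs_io_s | /usr/lpp/mmfs/bin/mmpmon -p -s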

Make GPFS1 the Same as GPFS2-4

  1. GPFS1 is running Multicast DM while the other nodes are not; align them  *done*

Investigate File Corruption Event

  1. Determine whether the event is RDMA related