IBM Spectrum Scale (formerly GPFS) is a scale-out, high-performance global parallel file system (cluster file system) that provides concurrent access to a single file system or set of file systems from multiple nodes. Enterprises and organizations are creating, analyzing, and keeping more data than ever before. Islands of data are forming across the organization and in the cloud, creating complexity, hard-to-manage systems, and rising costs. Those that can deliver insights faster while managing rapid infrastructure growth lead their industries. To deliver those insights, an organization's underlying information architecture must support hybrid cloud, big data, and artificial intelligence (AI) workloads alongside traditional applications while ensuring security, reliability, data efficiency, and high performance. IBM Spectrum Scale™ meets these challenges as a high-performance parallel solution with global file and object data access for managing data at scale, with the distinctive ability to perform archive and analytics in place.
Manually installing the IBM Spectrum Scale software packages on POWER nodes myhost1, myhost2 and myhost3
The following packages are required for IBM Spectrum Scale Standard Edition on Red Hat Enterprise Linux:
- gpfs.base*.rpm
- gpfs.gpl*.noarch.rpm
- gpfs.msg.en_US*.noarch.rpm
- gpfs.gskit*.rpm
- gpfs.license*.rpm
Step 1: Download the IBM Spectrum Scale 5.1.1.1 Standard Edition package from IBM Fix Central and install the RPM packages on all nodes:
rpm -ivh gpfs.base*.rpm gpfs.gpl*.rpm gpfs.license.std*.rpm gpfs.gskit*.rpm gpfs.msg*.rpm gpfs.docs*.rpm
Step 2: Verify the installed GPFS packages:
[root@myhost1 ]# rpm -qa | grep gpfs
gpfs.docs-5.1.1-1.noarch
gpfs.license.std-5.1.1-1.ppc64le
gpfs.bda-integration-1.0.3-1.noarch
gpfs.base-5.1.1-1.ppc64le
gpfs.gplbin-4.18.0-305.el8.ppc64le-5.1.1-1.ppc64le
gpfs.gskit-8.0.55-19.ppc64le
gpfs.msg.en_US-5.1.1-1.noarch
gpfs.gpl-5.1.1-1.noarch
Step 3: Build the GPL portability layer module by running the mmbuildgpl command (/usr/lpp/mmfs/bin/mmbuildgpl) on every node in the cluster.
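The build has to happen on each node. A dry-run loop like the following only prints the per-node commands; if passwordless root ssh is configured between the nodes, the `echo` can be dropped to actually run them:

```shell
# Print the per-node build commands for the GPL portability layer.
# Remove `echo` to execute, assuming passwordless root ssh between nodes.
for node in myhost1 myhost2 myhost3; do
  echo ssh "$node" /usr/lpp/mmfs/bin/mmbuildgpl
done
```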
Step 4: Verify that the GPFS packages are installed on all nodes and that the GPL module built correctly, then export the path for the GPFS commands:
export PATH=$PATH:/usr/lpp/mmfs/bin
Step 5: Use the mmcrcluster command to create a GPFS cluster:
mmcrcluster -N NodeFile -C smpi_gpfs_power8
where NodeFile has the following entries:
#cat NodeFile
myhost2:quorum
myhost1:quorum-manager
myhost3:quorum-manager
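One quick way to create this file, using the same hostnames and designations as above, is a heredoc:

```shell
# Create the node file consumed by mmcrcluster: one node:designation per line.
cat > NodeFile <<'EOF'
myhost2:quorum
myhost1:quorum-manager
myhost3:quorum-manager
EOF
```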
Step 6: Use the mmchlicense command to designate licenses as needed. This command controls the type of GPFS license associated with the nodes in the cluster; --accept indicates that you accept the applicable licensing terms.
mmchlicense server --accept -N serverLicense
Step 7: Use the mmgetstate command to display the state of the GPFS daemon on one or more nodes:
mmgetstate -a
Step 8: Use the mmlslicense command to display information about the IBM Spectrum Scale node licensing designation or about disk and cluster capacity:
mmlslicense -L
Step 9: Use the mmcrnsd command to create cluster-wide names for the NSDs used by GPFS. This is the first GPFS step in preparing disks for use by a GPFS file system:
mmcrnsd -F NSD_Stanza_smpi_gpfs_power -v no
where NSD_Stanza_smpi_gpfs_power contains:
#cat NSD_Stanza_smpi_gpfs_power
%nsd:
device=/dev/sda
nsd=nsd1
servers=myhost2
usage=dataAndMetadata
failureGroup=-1
pool=system
%nsd:
device=/dev/sdb
nsd=nsd2
servers=myhost1
usage=dataAndMetadata
failureGroup=-1
pool=system
%nsd:
device=/dev/sda
nsd=nsd3
servers=myhost3
usage=dataAndMetadata
failureGroup=-1
pool=system
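Since the three stanzas differ only in device, NSD name, and server, the file can also be generated from a short list of name:server:device triples; a minimal sketch using the values above:

```shell
# Generate the NSD stanza file from "nsdname:server:device" triples.
# Names and devices mirror the example cluster; adjust for your environment.
for entry in nsd1:myhost2:/dev/sda nsd2:myhost1:/dev/sdb nsd3:myhost3:/dev/sda; do
  nsd=${entry%%:*}
  rest=${entry#*:}
  server=${rest%%:*}
  device=${rest#*:}
  printf '%%nsd:\n  device=%s\n  nsd=%s\n  servers=%s\n  usage=dataAndMetadata\n  failureGroup=-1\n  pool=system\n' \
    "$device" "$nsd" "$server"
done > NSD_Stanza_smpi_gpfs_power
```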
Step 10: Use the mmlsnsd command to display the current information for the NSDs belonging to the GPFS cluster.
mmlsnsd -X
Step 11: Use the mmcrfs command to create a GPFS file system
mmcrfs smpi_gpfs -F NSD_Stanza_smpi_gpfs_power
Step 12: The mmmount command mounts the specified GPFS file system on one or more nodes in the cluster.
mmmount smpi_gpfs -a
Step 13: Use the mmlsfs command to list the attributes of a file system.
mmlsfs all
Step 14: The mmlsmount command reports if a file system is in use at the time the command is issued.
mmlsmount all
Step 15: To change the mount point of the smpi_gpfs file system from /gpfs to /my_gpfs:
mmchfs smpi_gpfs -T /my_gpfs
Step 16: Set up GPFS autostart and automatic mounting.
[root@myhost1 ~]# systemctl status gpfs.service
● gpfs.service - General Parallel File System
Loaded: loaded (/usr/lib/systemd/system/gpfs.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2021-07-20 03:27:04 EDT; 3 days ago
Process: 96622 ExecStart=/usr/lpp/mmfs/bin/mmremote startSubsys systemd $STARTSUBSYS_ARGS (code=exited, status=0/SUCCESS)
Main PID: 96656 (runmmfs)
CGroup: /system.slice/gpfs.service
├─96656 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/runmmfs
└─97093 /usr/lpp/mmfs/bin/mmfsd
[root@myhost1 ~]# systemctl is-active gpfs.service
active
[root@myhost1 ~]# systemctl is-enabled gpfs.service
disabled
[root@myhost1 ~]# systemctl is-failed gpfs.service
active
[root@myhost1 ~]# systemctl enable gpfs.service
Created symlink from /etc/systemd/system/multi-user.target.wants/gpfs.service to /usr/lib/systemd/system/gpfs.service.
[root@myhost1 ~]# systemctl is-enabled gpfs.service
enabled
[root@myhost1 ~]# ls -alsrt /etc/systemd/system/multi-user.target.wants/gpfs.service
0 lrwxrwxrwx 1 root root 36 Jul 23 05:43 /etc/systemd/system/multi-user.target.wants/gpfs.service -> /usr/lib/systemd/system/gpfs.service
[root@myhost1 ~]# mmgetstate -a
Node number Node name GPFS state
-------------------------------------------
1 myhost2 active
2 myhost1 active
3 myhost3 active
[root@myhost1 ~]# mmchfs smpi_gpfs -A yes
mmchfs: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@myhost1 ~]# mmlsfs smpi_gpfs -A
flag value description
------------------- ------------------------ -----------------------------------
-A yes Automatic mount option
[root@myhost1 ~]# mmchconfig autoload=yes
mmchconfig: Command successfully completed
mmchconfig: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@myhost1 ~]#
Step 17: Troubleshooting when a GPFS node goes inactive or a disk goes down.
[root@myhost1 ~]# mmlscluster
GPFS cluster information
========================
GPFS cluster name: my_spectrumScale_cluster
GPFS cluster id: 9784093264651231821
GPFS UID domain: my_spectrumScale_cluster
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
Repository type: CCR
Node Daemon node name IP address Admin node name Designation
---------------------------------------------------------------------
1 myhost2 10.x.y.1 myhost2 quorum
2 myhost1 10.x.y.2 myhost1 quorum-manager
3 myhost3 10.x.y.3 myhost3 quorum-manager
[root@myhost1 ~]# mmgetstate -a
Node number Node name GPFS state
-------------------------------------------
1 myhost1 active
2 myhost2 down
3 myhost3 active
[root@myhost1 ~]# mmstartup -a
Tue Jul 20 03:27:03 EDT 2021: mmstartup: Starting GPFS ...
myhost1: The GPFS subsystem is already active.
myhost3: The GPFS subsystem is already active.
[root@myhost1 ~]# mmgetstate -a
Node number Node name GPFS state
-------------------------------------------
1 myhost1 active
2 myhost2 active
3 myhost3 active
[root@myhost1 ~]#
[root@myhost1 ~]# mmunmount smpi_gpfs -a
Tue Jul 20 04:12:04 EDT 2021: mmunmount: Unmounting file systems ...
[root@myhost1 ~]#
[root@myhost1 ~]# mmlsdisk smpi_gpfs
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
nsd1 nsd 512 -1 Yes Yes ready up system
nsd2 nsd 512 -1 Yes Yes ready down system
nsd3 nsd 512 -1 Yes Yes ready up system
[root@myhost1 ~]#
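The availability column of the mmlsdisk output above can be filtered to find disks that need restarting. A minimal sketch using a copy of the sample output (in practice, pipe `mmlsdisk smpi_gpfs` directly into the awk filter and feed the result to `mmchdisk smpi_gpfs start -d`):

```shell
# Print the names of disks whose availability column (field 8) reads "down".
# $sample stands in for live `mmlsdisk smpi_gpfs` data rows.
sample='nsd1 nsd 512 -1 Yes Yes ready up system
nsd2 nsd 512 -1 Yes Yes ready down system
nsd3 nsd 512 -1 Yes Yes ready up system'
echo "$sample" | awk '$8 == "down" { print $1 }'
```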
[root@myhost1 ~]# mmchdisk smpi_gpfs start -d nsd2
mmnsddiscover: Attempting to rediscover the disks. This may take a while ...
mmnsddiscover: Finished.
myhost1: Rediscovered nsd server access to nsd2.
Scanning file system metadata, phase 1 ...
100 % complete on Tue Jul 20 04:24:14 2021
Scan completed successfully.
Scanning file system metadata, phase 2 ...
100 % complete on Tue Jul 20 04:24:14 2021
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
100 % complete on Tue Jul 20 04:24:14 2021
Scan completed successfully.
Scanning file system metadata, phase 5 ...
100 % complete on Tue Jul 20 04:24:14 2021
Scan completed successfully.
Scanning user file metadata ...
100.00 % complete on Tue Jul 20 04:24:25 2021 ( 500736 inodes with total 26921 MB data processed)
Scan completed successfully.
[root@myhost1 ~]# mmmount smpi_gpfs -a
Tue Jul 20 04:24:42 EDT 2021: mmmount: Mounting file systems ...
[root@myhost1 ~]#
[root@myhost1 ~]# mmlsdisk smpi_gpfs
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
nsd1 nsd 512 -1 Yes Yes ready up system
nsd2 nsd 512 -1 Yes Yes ready up system
nsd3 nsd 512 -1 Yes Yes ready up system
[root@myhost1 ~]#
[root@myhost1 ~]# mmgetstate -a
Node number Node name GPFS state
-------------------------------------------
1 myhost1 active
2 myhost2 active
3 myhost3 active
[root@myhost1 ~]#
Step 18: Steps to permanently uninstall GPFS
- Unmount all GPFS file systems on all nodes by issuing the mmumount all -a command.
- Issue the mmdelfs command for each file system in the cluster to remove GPFS file systems.
- Issue the mmdelnsd command for each NSD in the cluster to remove the NSD volume ID from the device.
mmdelfs smpi_gpfs
mmdelnsd nsd1
mmdelnsd nsd2
mmdelnsd nsd3
- Issue the mmshutdown -a command to shut down GPFS on all nodes.
- Uninstall the GPFS packages from each node:
rpm -qa | grep gpfs | xargs rpm -e --nodeps
- Remove the /var/mmfs and /usr/lpp/mmfs directories:
rm -rf /var/mmfs
rm -rf /usr/lpp/mmfs
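The per-node package removal and directory cleanup can be scripted across the cluster. This dry-run sketch only prints the commands; with passwordless root ssh in place, replace `echo` with `ssh "$node"` to execute them:

```shell
# Print the cleanup command for each node (dry run; remove `echo` to execute
# via ssh, assuming passwordless root ssh between the nodes).
for node in myhost1 myhost2 myhost3; do
  echo "ssh $node \"rpm -qa | grep gpfs | xargs rpm -e --nodeps; rm -rf /var/mmfs /usr/lpp/mmfs\""
done
```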
------------------------------------------------------------------------------------------
The Quick Start automatically deploys a highly available IBM Spectrum Scale cluster on the Amazon Web Services (AWS) Cloud. This Quick Start deploys IBM Spectrum Scale into a virtual private cloud (VPC) that spans two Availability Zones in your AWS account. You can build a new VPC for IBM Spectrum Scale, or deploy the software into your existing VPC. The deployment and configuration tasks are automated by AWS CloudFormation templates that you can customize during launch.
IBM's container-native storage solution for OpenShift is designed for enterprise customers who need global hybrid-cloud data access. These storage services meet the strict requirements of mission-critical data. IBM Spectrum® Fusion provides a streamlined way for organizations to discover, secure, protect, and manage data from the edge, to the core data center, to the public cloud.
Spectrum Fusion
IBM launched a containerized derivative of its Spectrum Scale parallel file system called Spectrum Fusion. The rationale is that customers need to store and analyze more data at edge sites while operating in a hybrid, multi-cloud world that requires data availability across all of these locations. IBM's Elastic Storage System (ESS) arrays provide edge storage capacity, and containerized Spectrum Fusion can run in any of the locations mentioned. Clearly, building, deploying, and managing applications requires advanced capabilities that provide rapid access to data across the entire enterprise, from the edge to the data center to the cloud.
Spectrum Fusion combines Spectrum Scale functionality with as-yet-unspecified IBM data protection software. It will appear first in a hyperconverged infrastructure (HCI) system that integrates compute, storage, and networking. This system will be equipped with Red Hat OpenShift to support virtual machine and containerized workloads for cloud, edge, and containerized data centers.
Spectrum Fusion will integrate with Red Hat Advanced Cluster Management (ACM) for managing multiple Red Hat OpenShift clusters, and it will support tiering. Spectrum Fusion gives customers a streamlined way to discover data from across the enterprise, since it maintains a global index of the data it stores. It will manage only a single copy of data; there is no need to create duplicates when moving application workloads across the enterprise. Spectrum Fusion will also integrate with IBM Cloud Satellite, a managed distributed-cloud offering that deploys and runs applications across on-premises, edge, and cloud environments.
References:
https://www.ibm.com/in-en/products/spectrum-scale
https://aws.amazon.com/quickstart/architecture/ibm-spectrum-scale
https://www.ibm.com/in-en/products/spectrum-fusion
https://www.ibm.com/docs/en/spectrum-scale/5.0.4?topic=installing-spectrum-scale-linux-nodes-deploying-protocols