HPC2N Backup Service

Introduction

This is the client-oriented documentation for the HPC2N Backup Service which is based on IBM Storage Protect, formerly known as IBM Spectrum Protect and Tivoli Storage Manager (TSM). Most of this documentation will continue to use the TSM acronym to refer to the product.

Vendor documentation

The IBM support portal is located at https://www.ibm.com/mysupport/

However, the good documentation/manuals are hard to find from the portal page. The direct link to the TSM knowledge center is listed here (select TSM version in the "Select" dropdown to show the documentation!).

Obtaining the backup client binaries

IBM documents the current client versions and downloads on this page: https://www.ibm.com/support/pages/ibm-storage-protect-downloads-latest-fix-packs-and-interim-fixes

However, as that page can be cumbersome to use we also provide direct download links. As of this writing the IBM Storage Protect (Spectrum Protect, TSM) client can be downloaded from:

HPC2N also provides a package repository for Ubuntu LTS releases, see Ubuntu/Debian using the HPC2N package repository.

Note that only 64-bit Unix/Linux machines are supported by the current TSM client version.

There is also a technote page at https://www.ibm.com/support/pages/ibm-storage-protect-downloads-latest-fix-packs-and-interim-fixes that collects all IBM Storage Protect / Spectrum Protect / TSM downloads. Note that you should not use the Passport Advantage (PPA) download pages, but instead use the public download pages marked FTP.

Installing the backup client binaries

Supported platforms

For supported platforms, it's easiest to follow the IBM documentation found at https://www.ibm.com/docs/en/storage-protect

Note that the Common Inventory Technology component TIVsm-BAcit must be installed for the query pvuestimate server command to work as intended.

Version 8.1

https://www.ibm.com/docs/en/storage-protect/8.1.22?topic=windows-install-unix-linux-backup-archive-clients (Unix/Linux/Mac)

https://www.ibm.com/docs/en/storage-protect/8.1.22?topic=windows-client-installation-overview (Windows)

Unsupported platforms

Some platforms are supported on a best-effort basis, or not supported at all. See the technote https://www.ibm.com/support/pages/node/397693 for more details.

Ubuntu/Debian using the IBM packages

IBM now provides best-effort packages for Ubuntu, and they are reported to work on Debian as well. Download the packages from the FTP site and follow the instructions on https://www.ibm.com/docs/en/storage-protect/8.1.22?topic=clients-installing-ubuntu-linux-x86-64-client

Ubuntu/Debian using the HPC2N package repository

We provide a package repository with the IBM provided Ubuntu packages for use by our customers.

Enabling the HPC2N package repository

The packages are available at http://packages.hpc2n.umu.se/

In order to use the repository you need to:

  • Add the HPC2N archive key to your APT configuration
  • Enable the HPC2N package repository

To add the HPC2N archive key first download it:

wget -O /tmp/hpc2n.asc https://packages.hpc2n.umu.se/hpc2n.asc

Then verify that the key fingerprints [1] match. The command for displaying the fingerprints differs between gpg versions:

gpg --with-fingerprint --import --import-options show-only /tmp/hpc2n.asc
OR
gpg --with-fingerprint /tmp/hpc2n.asc

[1]: The fingerprints of the key are:

7993 55A4 C770 4A4B 92F9  5F80 276D 295E 7646 A0C2
C43C 1CE7 63DD F2A1 86D8  4EE4 360B 6ED5 E7BB 1FC4
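
Comparing 40-digit fingerprints by eye is error-prone. As a minimal sketch (the helper names are illustrative assumptions and not part of any HPC2N or gpg tooling), the following Python snippet ignores spacing and letter case when comparing a fingerprint printed by gpg against the two values listed above:

```python
# Illustrative helper, not part of HPC2N tooling: compare a fingerprint
# against the published values, ignoring whitespace and letter case.
EXPECTED = (
    "7993 55A4 C770 4A4B 92F9  5F80 276D 295E 7646 A0C2",
    "C43C 1CE7 63DD F2A1 86D8  4EE4 360B 6ED5 E7BB 1FC4",
)


def normalize(fingerprint: str) -> str:
    """Remove all whitespace and upper-case the hex digits."""
    return "".join(fingerprint.split()).upper()


def is_expected(fingerprint: str) -> bool:
    """Return True if the fingerprint equals one of the published values."""
    return normalize(fingerprint) in {normalize(f) for f in EXPECTED}
```

Paste each fingerprint shown by gpg into is_expected(); every one of them should return True before you continue.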

If the fingerprints are correct, add the key:

sudo apt-key add /tmp/hpc2n.asc

To enable the repository, create /etc/apt/sources.list.d/hpc2n.list using your favorite text editor and with the appropriate contents as shown below:

For Ubuntu Focal (20.04 LTS):

deb http://packages.hpc2n.umu.se/ubuntu/hpc2n focal hpc2n

For Ubuntu Jammy (22.04 LTS):

deb http://packages.hpc2n.umu.se/ubuntu/hpc2n jammy hpc2n

For Debian, pick the Ubuntu repository above that approximately matches your Debian version. All repositories currently contain the same client packages as shipped by IBM.

Installing the client packages

Update the list of available packages:

sudo apt-get update

Install the IBM Storage Protect (Spectrum Protect, TSM) client packages:

sudo apt-get install tivsm-ba tivsm-bacit

CentOS

Follow the instructions for the corresponding RHEL release in the IBM documentation.

Configuring the backup client

We provide example configurations hosted in GIT repositories. This simplifies merging local changes with any updates/enhancements that we might provide, for example by committing your local changes and updating from our example repository using git pull --rebase, or by creating a local branch with your changes and rebasing it against our example repository.

Explaining all features of GIT is outside the scope of this manual. To familiarize yourself with git, we recommend the reference documentation and videos at https://git-scm.com/doc and interactive guides such as https://try.github.io/.

A few items to note regarding the example configuration:

  • All files are encrypted by default, using the medium level of security where the encryption keys are stored in the server database.
    • Note that encryption prevents all forms of compression and deduplication from reducing the storage space used.
    • For the highest security, use a local encryption key. However, if that key is lost there is no way the backup can be restored.

Linux

To configure the IBM Storage Protect (Spectrum Protect, TSM) backup client on Linux you need to:

  • Obtain the example configuration
  • Set up the IBM Storage Protect CA certificate DB, ensure that the en_US locale is available, and add symbolic links so our configuration is found by default.
  • Apply any needed local configuration.
  • Ensure that the scheduler dsmcad gets started on boot.

Obtain the example configuration

Use GIT to clone the example config from https://git.hpc2n.umu.se/HPC2N-Public/tsmconfig-linux onto your system:

sudo git clone https://git.hpc2n.umu.se/HPC2N-Public/tsmconfig-linux.git /etc/tsm

Setup client defaults

To do all the setup steps, run the provided preparation script (review the script for a complete list of all tasks performed):

sudo /etc/tsm/scripts/tsm-prepare.sh

Pay attention to the script output, as it prints additional informative messages.

Local configuration

Now your system should be able to communicate with the backup server. Your backup administrator should have provided you with a node name and a password.

Verify that your provided node name matches what your system thinks it's called:

uname -n

If the node name is not an exact match you need to explicitly set the nodename in dsm.sys. tsm-prepare.sh can help you with this. Run:

sudo /etc/tsm/scripts/tsm-prepare.sh --nodename=yournodename.example.com

Now you can initialize communication with the backup server by running any command that communicates with the backup server, for example query the backup schedule by:

sudo dsmc query schedule

Press Enter when asked for a node name (the suggested value should be correct if configuration is OK) and provide the password when asked.

Starting the scheduler on boot

Activate the dsmcad scheduler to start on boot, and start it now.

On modern distributions using systemd (Ubuntu 16.04, Debian 9, RHEL/CentOS 7 and newer) IBM ships a dsmcad.service systemd unit, but whether it gets installed by default varies from version to version:

sudo systemctl --quiet stop dsmcad
sudo sh -c "test -f /etc/systemd/system/dsmcad.service || cp -v /opt/tivoli/tsm/client/ba/bin/dsmcad.service /etc/systemd/system/dsmcad.service"
sudo mkdir -p /etc/systemd/system/dsmcad.service.d
sudo cp /etc/tsm/scripts/dsmcad-overrides.conf /etc/systemd/system/dsmcad.service.d/
sudo systemctl daemon-reload
sudo systemctl enable dsmcad
sudo systemctl start dsmcad

On older Debian/Ubuntu based systems:

sudo update-rc.d dsmcad defaults
sudo service dsmcad start

On older RHEL/CentOS based systems:

sudo chkconfig --add dsmcad
sudo service dsmcad start

Review the /var/log/dsmwebcl.log and /var/log/dsmsched.log to see if the scheduler starts and is able to get the backup schedule from the server.

macOS

To configure the IBM Storage Protect (Spectrum Protect, TSM) backup client on macOS you need to:

  • Obtain the example configuration
  • Configure the NodeName in dsm.sys and set up the IBM Storage Protect CA certificate DB.
  • Ensure that the scheduler dsmcad gets started on boot.

These instructions are designed to be cut&paste friendly and used in a Terminal window.

Obtain the example configuration

Use GIT to clone the example config from https://git.hpc2n.umu.se/HPC2N-Public/tsmconfig-macos onto your system:

sudo git clone https://git.hpc2n.umu.se/HPC2N-Public/tsmconfig-macos.git "/Library/Preferences/Tivoli Storage Manager"

If git complains that the target directory is not empty this means that you have a preexisting configuration. Rename the target directory and try again.

If the git command is not found, install Xcode Command Line Tools:

xcode-select --install

Configure client

To configure the client, run the provided preparation script. This script will: 1) ask for the NodeName (provided by your backup administrator), 2) record the NodeName in dsm.sys, 3) commit the local change using git, and 4) set up the CA certificate DB.

sudo "/Library/Preferences/Tivoli Storage Manager/scripts/tsm-prepare.sh"

Now you can initialize communication with the backup server by running any command that communicates with the backup server, for example query the backup schedule by:

sudo dsmc query schedule

Press Enter when asked for a node name (the suggested value should be correct if configuration is OK) and provide the password when asked.

Starting the scheduler on boot

IBM provides a helper script that ensures that dsmcad runs:

sudo "/Library/Application Support/tivoli/tsm/client/ba/bin/StartCad.sh"

As an alternative, you can start IBM Storage Protect Tools for Administrators and select Start the Client Acceptor Daemon.

Review the /Library/Logs/tivoli/tsm/dsmwebcl.log and /Library/Logs/tivoli/tsm/dsmsched.log to see if the scheduler starts and is able to get the backup schedule from the server.

Configuring the TSM client

FIXME: This entire section is to be removed and replaced with OS-specific example config repositories

Using SSL/TLS

SSL/TLS is used when you want to protect your TSM sessions from eavesdropping, for example when you are doing backups on a public wireless network. The data stored on the TSM server is not encrypted; see Using client side encryption if storing sensitive data.

As of version 8.1.2 SSL/TLS is used during authentication by default, but NOT for data transfer.

The clients need to have a trusted root certificate installed. As of this writing the HPC2N TSM server certificate is issued by Sunet TCS, see their FAQ at https://wiki.sunet.se/display/TCS/SUNET+TCS+2020-+Information+for+administrators for details.

See also https://www.ibm.com/docs/en/storage-protect/8.1.22?topic=cspc-configuring-storage-protect-clientserver-communication-secure-sockets-layer for more information.

Configuration

Add the following to dsm.sys in order to enable encryption of transferred data:

SSL YES

Windows setup

Start by downloading the root certificate https://git.hpc2n.umu.se/HPC2N-Public/tsmconfig-linux/raw/branch/master/cacerts/AAA_Certificate_Services.pem and store it into the directory C:\Program Files\Tivoli\TSM\baclient

Open a command-line window (cmd.exe) as administrator.

Initialize the TSM client certificate store with the password notsecret, add our downloaded root certificate, and verify by listing the contents of the certificate store:

cd \Program Files\Tivoli\TSM\baclient
set PATH=C:\Program Files\IBM\gsk8\bin;C:\Program Files\IBM\gsk8\lib64;%PATH%
gsk8capicmd_64 -keydb -create -db dsmcert.kdb -pw notsecret -stash
gsk8capicmd_64 -cert -add -db dsmcert.kdb -stashed -label "AAA_Certificate_Services" -file AAA_Certificate_Services.pem
gsk8capicmd_64 -cert -list all -db dsmcert.kdb -stashed

Instead of gsk8capicmd_64 you might also use the dsmcert utility. It does the job, but the description it adds in the certificate store is confusing:

cd \Program Files\Tivoli\TSM\baclient
set PATH=C:\Program Files\IBM\gsk8\bin;C:\Program Files\IBM\gsk8\lib64;%PATH%
dsmcert -add -server AAA_Certificate_Services -file AAA_Certificate_Services.pem

Linux setup

sudo -sH
cd /opt/tivoli/tsm/client/ba/bin
wget https://git.hpc2n.umu.se/HPC2N-Public/tsmconfig-linux/raw/branch/master/cacerts/AAA_Certificate_Services.pem
gsk8capicmd_64 -keydb -create -db dsmcert.kdb -pw notsecret -stash
gsk8capicmd_64 -cert -add -db dsmcert.kdb -stashed -label "AAA_Certificate_Services" -file AAA_Certificate_Services.pem
gsk8capicmd_64 -cert -list all -db dsmcert.kdb -stashed

macOS setup

sudo -sH
cd "/Library/Application Support/tivoli/tsm/client/ba/bin"
curl -O https://git.hpc2n.umu.se/HPC2N-Public/tsmconfig-macos/raw/branch/master/cacerts/AAA_Certificate_Services.pem
PATH=$PATH:/Library/ibm/gsk8/bin
gsk8capicmd -keydb -create -db dsmcert.kdb -pw notsecret -stash
gsk8capicmd -cert -add -db dsmcert.kdb -stashed -label "AAA_Certificate_Services" -file AAA_Certificate_Services.pem
gsk8capicmd -cert -list all -db dsmcert.kdb -stashed

Verification

To verify that SSL/TLS is in use, run dsmc query session and verify that there is SSL information provided. If there is no mention of SSL at all, then the session is NOT using SSL/TLS.

dsmc query session | grep SSL

Should output something similar to:

SSL Information.........: TLSv1.3 TLS_AES_256_GCM_SHA384

Using client side encryption

Use client side encryption to protect user data. TSM transfers/stores data in a plain-text format, so any sensitive data should be encrypted.

For medium security use a per session generated key. This protects from eavesdropping and a third party accessing the data from TSM server related storage media. However, as the generated key is stored in the TSM server database (separate from the data storage media) you can retrieve/restore the data as long as you have proper access.

For sensitive data choose high security, which uses a pregenerated fixed encryption key. You need to store the key in a safe location. There is no way to retrieve the data if the key is lost.

Medium security

Add the following to dsm.sys:

Encryptkey        generate

Add the following to the TSM client exclude-include file:

include.encrypt   /.../*

This will encrypt all backups and archives with the default AES128 encryption type.

The generated encryption keys are stored on the backup server in the database separate from the stored data.

High security

Instead of generating the key automatically, use a pregenerated encryption key only kept on the machine and in a safe location. There is no way to retrieve the data if the key is lost.

Add the following to dsm.sys:

Encryptiontype    AES256
Encryptkey        save

Generate an encryption key (i.e. encryption password) with random characters. We recommend at least 30 characters (63 maximum). We also recommend excluding national characters and characters that can be mistaken for others, like O0, 1lI etc. This is to ensure that you can successfully enter it from a printed copy later on (i.e. worst-case recovery).
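
One way to produce such a key is sketched below using Python's standard secrets module. The function name and the default length of 40 are illustrative assumptions; only the 30 to 63 character range and the excluded look-alike characters come from the recommendation above.

```python
import secrets
import string

# Look-alike characters named in the recommendation above.
AMBIGUOUS = set("O01lI")
# Plain ASCII letters and digits only, so no national characters are included.
ALPHABET = [c for c in string.ascii_letters + string.digits if c not in AMBIGUOUS]


def generate_key(length: int = 40) -> str:
    """Return a random key built from unambiguous ASCII letters and digits."""
    if not 30 <= length <= 63:
        raise ValueError("length should be between 30 and 63 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Calling print(generate_key()) shows a fresh key. Take care that it does not linger in shell history or a terminal scrollback buffer once you have written it down.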

Enter the encryption key when asked during the setup process. It is then saved in a local encryption key file.

It is the responsibility of the machine owner to store the encryption key in a safe location. Choose a location that ensures safety against theft, fire and flooding amongst other things. In particular store it separately from the computer in case of theft and fire.

We recommend that the encryption key (i.e. encryption password) is kept both on a dedicated USB key and as a printed copy in a fire-proof safe or similar. Experience shows that plain old paper is more heat resistant than USB keys, and thus a cheap last resort.

Do NOT store the encryption key in any on-line form of storage (file on computer, internet-connected password manager, cloud storage, etc).

There is no way to retrieve the data if the key is lost.

Upgrading the backup client binaries

Official IBM documentation on how to install/upgrade is available at https://www.ibm.com/docs/en/storage-protect/8.1.22?topic=clients-installing-storage-protect-backup-archive-unix-linux-windows (look in the Installation section relevant to the OS you are using).

In summary, it usually works to just install the updated packages using the normal tools and methods for your OS.

NOTE: In recent IBM packaging for Linux the dsmcad scheduler is stopped on upgrade, but not restarted afterwards! Restart it by issuing sudo systemctl start dsmcad (or whatever is appropriate for your system), alternatively reboot the machine.

See Obtaining the backup client binaries and Installing the backup client binaries for additional details.

Administration tasks

Starting the administrative interface

The interface is called dsmadmc. Upon startup it will ask for a username and a password. If Multifactor Authentication is enabled, the authentication token is appended to the password (there is no separate prompt).

Setting up Multifactor Authentication

RFC 6238 TOTP is used for Multifactor Authentication (MFA). Use your preferred smartphone application (probably the same as you use for other MFA/TOTP logins). See https://www.ibm.com/docs/en/storage-protect/8.1.22?topic=sumaa-setting-up-multifactor-authentication-administrators-using-command-line-administrative-client for the official IBM documentation.

This is a short summary of the setup steps:

  • HPC2N staff enables MFA (or resets it) on your admin user. Your admin user is now in an MFA Transitional state, and MFA setup must be completed before you can perform admin tasks.
  • You start dsmadmc and log in using your username and password.
  • You issue the command GENERATE SECRET to generate the MFA shared secret.
  • You add the shared secret to your favorite MFA/TOTP app.
  • You log out from dsmadmc using the QUIT command.
  • You start dsmadmc again and log in using your username together with password and TOTP token, where the TOTP token is simply appended to your password.
  • If login is successful, your admin user will proceed into MFA enabled state and you can now issue admin commands.
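
To illustrate what the MFA/TOTP app computes and how the token is combined with the password, here is a minimal Python sketch of RFC 6238. It assumes the common parameters (HMAC-SHA1, 30 second time step, 6 digits); this document does not state which parameters the server uses, so treat them as assumptions. The function names are illustrative, and the secret is passed as raw bytes rather than in the encoded form shown by GENERATE SECRET.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP value using HMAC-SHA1."""
    current = time.time() if now is None else now
    counter = int(current // step)                 # number of elapsed time steps
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                        # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def login_password(password: str, token: str) -> str:
    """dsmadmc has no separate token prompt: the token is appended to the password."""
    return password + token
```

With the RFC 6238 test secret b"12345678901234567890" and now=59, totp() returns "287082", so a password of "example" would be entered as "example287082".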

Generating MFA QR code

Since dsmadmc outputs the shared secret in text format it's cumbersome and error-prone to enter into smartphone apps. It's even hard to cut&paste, since dsmadmc tends to introduce line breaks.

We provide the following recipes for displaying a QR code from the OTPAUTH-encoded secret value for easy scanning into MFA TOTP apps.

When doing this, take care to not save the shared secret unintentionally (shell history, scrollback buffer, image file, cut&paste buffer, etc).

Terminal variant, Dependencies: BIG terminal window (min 105x55), bash, qrencode:

bash -c 'read -p "Enter otpauth:// single-line string: " OTP; qrencode -o- -t ANSI -m 1 "$OTP"'

Graphic variant, Dependencies: bash, qrencode, display/ImageMagick:

bash -c 'read -p "Enter otpauth:// single-line string: " OTP; qrencode -o- -d 300 -s 10 "$OTP" | display'
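
Both recipes expect the otpauth:// value as a single line. If your terminal copy contains line breaks introduced by dsmadmc, a small helper like the following Python sketch can rejoin it first. The function name is an illustrative assumption, and the sketch assumes the value itself contains no literal whitespace.

```python
def join_wrapped(pasted: str) -> str:
    """Remove line breaks and stray whitespace from a pasted otpauth:// string."""
    uri = "".join(pasted.split())
    if not uri.startswith("otpauth://"):
        raise ValueError("expected a string starting with otpauth://")
    return uri
```

The same caution applies here: avoid leaving the shared secret behind in files, an interpreter history, or the clipboard.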

Getting help

After you have logged in you can start using the help. help by itself gives a help screen of sorts, while help followed by a command name gives help on that command.

How to get information from the server

The query command is used for (almost) all queries for information. The most common ones are listed here, for a complete list use help query.

Most query commands accept the flag format=detailed, or f=d for short, to give a verbose display of information. All commands and parameters can usually be shortened to the shortest unique name, for example q for query.

Wildcards, i.e. *, are usually accepted as parameters.

For more information on each command, use the help:

help query actlog

query node

query node by itself gives a list of all nodes. To obtain verbose information about a node you could for example use q node cws.* f=d.

Examples:

query node NODE-NAME
query node *.domain
query node domain=DOMAIN-NAME

query process

This gives a list of all currently running processes.

query session

Lists all sessions of all session types.

query volume

This lists all volumes known to the server. To list all volumes which are disks you could use q vol devcl=disk.

query auditoccupancy

Gives you a list of the space usage of all nodes. Supply nodename to shorten the list.

query occupancy

Returns a verbose list of the space usage of all nodes. Supply nodename to shorten the list.

query admin

Lists all administrative users.

query association

Lists the association between Policy Domains, Schedules and nodes.

query mount

Lists all mounted tapes, if any.

query copygroup

Lists information on how many file versions are kept in the different management classes.

query drive

Lists information on tape drives. q dr f=d lists detailed info.

query actlog

Lists the activity log.

A few examples, see the help for detailed info:

q actlog begind=-1
q actlog begint=03:00
q actlog search="PROCESS: 1234"
q actlog begind=-7 msgno=8944
q actlog begind=-7 begint=00:00 search=tapealert

query pvuestimate

Lists license requirement summary for nodes. Add format=detailed for a verbose list.

NOTE: The PVU numbers provided tend to be too high, as there are multiple bugs causing the per-core value to be 100 PVU instead of 70 PVU for a number of CPUs.

For more detailed reporting requirements you usually have to run a custom query against the TSM database, start with the following expression and tune it to your needs:

SELECT * FROM PVUESTIMATE_DETAILS

As an example, save the following file as cpuvpu_stats and execute it using macro cpuvpu_stats 'YourDomain' in dsmadmc. The listed EstPVU values are those provided by the IBM estimate function, while the OurPVU values assume that entries with a per-core value of 100 PVU really should be 70 PVU.

SELECT \
CAST(n.node_name AS CHAR(30)) AS "NodeName", \
CAST(p.proc_count||'x '||p.proc_type||'core '||p.proc_vendor||' '||p.proc_brand||' '||p.proc_model AS CHAR(40)) AS "CPU Info", \
p.value_from_table AS "Known", \
CAST(p.pvu AS CHAR(5)) AS "EstPVU", \
CAST(CASE \
WHEN p.value_units<>100 OR (p.proc_count=1 AND p.proc_type=1) THEN p.pvu \
ELSE p.proc_count*p.proc_type*70 \
END AS CHAR(5)) AS "OurPVU" \
FROM nodes n,pvuestimate_details p \
WHERE n.node_name=p.node_name \
AND p.role_effective='SERVER' \
AND n.locked='NO' \
AND domain_name LIKE %1 \
ORDER BY n.node_name

SELECT \
CAST(n.domain_name AS CHAR(16)) AS "Domain", \
COUNT(n.node_name) as "Servers", \
CAST(SUM(CASE \
WHEN p.value_units<>100 OR (p.proc_count=1 AND p.proc_type=1) THEN p.pvu \
ELSE p.proc_count*p.proc_type*70 \
END) AS CHAR(6)) AS "PVU" \
FROM nodes n,pvuestimate_details p \
WHERE n.node_name=p.node_name \
AND p.role_effective='SERVER' \
AND n.locked='NO' \
AND domain_name LIKE %1 \
GROUP BY n.domain_name
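
The CASE expression used for OurPVU in both queries can be easier to follow outside SQL. The following Python sketch mirrors it one to one; the function name is illustrative, and reading proc_type as the number of cores per processor is an assumption based on how the macro labels that column.

```python
def our_pvu(pvu: int, value_units: int, proc_count: int, proc_type: int) -> int:
    """Mirror of the CASE expression in the macro above.

    Keep the IBM estimate unless the per-core value is 100 PVU on something
    other than a single single-core processor; in that case recompute the
    figure using 70 PVU per core.
    """
    if value_units != 100 or (proc_count == 1 and proc_type == 1):
        return pvu
    return proc_count * proc_type * 70
```

For example, a node reported as 2 processors of 4 cores at 100 PVU per core (an estimate of 800) would be listed with an OurPVU of 2 * 4 * 70 = 560.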

query event

Display scheduled/completed events.

q event DOMAINNAME *

select

The TSM server exposes an SQL interface via the select command to enable more complex queries, usually used in macros or scripts for custom tasks.

See the IBM documentation https://www.ibm.com/docs/en/storage-protect/8.1.22?topic=commands-select-perform-sql-query-storage-protect-database for details and examples.

List backup client versions

The SQL interface can be used to give a nice overview of backup client versions used.

Replace HPC2N in the example below with your domain name (or a valid LIKE wildcard such as CS%).

SELECT CAST(node_name AS CHAR(40)) AS "NodeName",\
CAST(client_version||'.'||client_release||'.'||client_level||'.'||client_sublevel AS CHAR(10)) AS "ClientVersion" \
FROM nodes WHERE client_version IS NOT NULL AND locked='NO' AND \
domain_name LIKE 'HPC2N' \
ORDER BY client_version,client_release,client_level,client_sublevel,node_name

Adding stuff

There are a number of concepts that you need to know when adding a new node, and some are specific to the implementation on HPC2N. Below we list the HPC2N specifics and how we expect the options to be set/used.

Client option sets

In order to achieve good tape performance the client option TXNBYTELIMIT needs to be tuned on every client. To facilitate this the HPC2N TSM server provides the following client option sets:

  • NET_100MBIT - For a client with 100Mbit/s class networking (or slower).
  • NET_GIGE - For a client with 1000Mbit/s, ie Gigabit Ethernet, class networking.
  • NET_10GIGE - For a client with 10000Mbit/s, ie 10GigE, class networking.

These client option sets tune the TXNBYTELIMIT to achieve approx one aggregate every 20 seconds of data transmission. Larger is better, but on clients with slow networks the cost of file retransmission gets too high due to the fact that the entire aggregate has to be resent.
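
To get a feel for the sizes involved, the following Python sketch computes how much data fits in 20 seconds at the nominal link speed of each class. These are illustrative ceilings only: the actual TXNBYTELIMIT values in the option sets are not listed here, real throughput is lower than line rate, and the result must be converted to the unit the option expects.

```python
def bytes_per_aggregate(link_mbit_per_s: float, seconds: float = 20.0) -> int:
    """Upper bound on the bytes transferred in `seconds` at nominal link speed."""
    return int(link_mbit_per_s * 1_000_000 / 8 * seconds)


# Nominal speeds of the three network classes named above.
for name, mbit in (("NET_100MBIT", 100), ("NET_GIGE", 1000), ("NET_10GIGE", 10000)):
    gigabytes = bytes_per_aggregate(mbit) / 1e9
    print(f"{name}: at most about {gigabytes:.2f} GB per 20 s")
```

The tenfold steps between the classes show why a single limit cannot suit both a slow client, where resending a large aggregate is costly, and a fast one.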

To update the client option set for a node, do something like:

update node NODENAME cloptset=NET_GIGE

The default role (with respect to licensing) for a TSM node varies. It is often client for Windows/macOS and server for Unix/Linux, but there are exceptions.

In order to make the query pvuestimate command return what you're expecting you'll have to override the role in the cases where the default doesn't match.

To minimize confusion we recommend always setting roleoverride!

To do this, override the default role by doing something like:

update node NODENAME roleoverride=server
update node OTHERNODENAME roleoverride=client

When using virtualization you still have to license the hardware running the virtualization. We handle this by always installing a TSM client on the bare-metal servers in order to get the built-in license metrics to report sane numbers. For example, on a KVM/Ganeti virtualization with three physical servers we install the TSM client and back up the host OS of those three servers. Virtualization guests are then backed up by installing the TSM client and flagging the node with roleoverride=other to avoid double-counting licenses.

Virtualization guests, proxynodes and decommissioned/unused nodes can be flagged as such with:

update node NODENAME roleoverride=other

Notification preferences

On HPC2N we have a custom notify functionality (aka the tsmdude mails) that will send annoying emails when backup for a node isn't working as expected. The target email and notify timeout is mined from the comment field on the node, and it's expected to be present.

The following rules apply for the contact string:

  • Each item is separated by a semicolon ; followed by a space.
  • Each item has the form name=value.
  • Required items are:
    • admin - Where to send primary notifications, either in the form admin@example.com or The admin, admin@example.com to include a name.
    • notify - If backup hasn't run successfully for this many days, a notification email is sent.
  • Optional items are:
    • fallback - Where to send fallback notifications, either in the form superadmin@example.com or The Super admin, superadmin@example.com to include a name.
    • fbnotify - If backup hasn't run successfully for this many days, a fallback notification email is sent. It is expected for this value to be higher than the notify value.

A full example of a contact string that will mail the primary admin after being broken for 2 days and a fallback admin after being broken for 30 days is:

admin=The admin, admin@example.com; notify=2; fallback=The Super admin, superadmin@example.com; fbnotify=30
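
The rules above can be checked before a contact string is set on a node. The following Python sketch is not the actual tsmdude implementation; it is an illustrative parser (the function name is an assumption) that splits the string on "; " and verifies that the required items are present.

```python
def parse_contact(contact: str) -> dict:
    """Split a contact string into its name=value items and check required ones."""
    items = {}
    for item in contact.split("; "):
        # Split on the first "=" only, so values may contain "=" themselves.
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"item without '=': {item!r}")
        items[name.strip()] = value.strip()
    for required in ("admin", "notify"):
        if required not in items:
            raise ValueError(f"missing required item: {required}")
    return items
```

Applied to the full example above, this yields the four items admin, notify, fallback and fbnotify, with the comma inside the admin value left intact.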

Registering a new client node

It's pretty easy to do something wrong when adding a node since you have to override most defaults to get them right ;)

In the following examples we will add a node and register a schedule to it for the main domains served by our backup service.

HPC2N

Add the node example.hpc2n.umu.se with the password somekindofpassword to the HPC2N policy domain:

reg node example.hpc2n.umu.se somekindofpassword contact="admin=Sys admins, sysop@hpc2n.umu.se; notify=2" domain=HPC2N forcepwreset=yes maxnummp=10 clopt=NET_100MBIT/NET_GIGE/NET_10GIGE

Note: clopt should always be specified regardless of type of system, choose one of the NET_xxx types. Only use NET_100MBIT when absolutely necessary.

For laptops/desktops (personal computers) add:

contact="admin=Sys admins, <youruser>@hpc2n.umu.se; notify=2" roleoverride=client

or (if you want notifications to be sent to sysop if the backups have not worked for a long time)

contact="admin=Sys admins, <youruser>@hpc2n.umu.se; notify=2; fallback=sysop@hpc2n.umu.se; fbnotify=30" roleoverride=client

For KVM/Ganeti/Proxynode/User instances the server nodes are backed up; avoid double-counting licenses by adding:

roleoverride=other

After registering the node you need to define an association with a schedule.

define association HPC2N SERVERSCHED example.hpc2n.umu.se

or

define association HPC2N LAPTOPSCHED example.hpc2n.umu.se

ACC

Add the node example.ac2.se with the password somekindofpassword to the ACC policy domain:

reg node example.ac2.se somekindofpassword contact="admin=Sys admins, sysadm@accum.se; notify=2" domain=ACC forcepwreset=yes maxnummp=10 clopt=NET_10GIGE

For KVM/Ganeti instances the server nodes are backed up; avoid double-counting licenses by adding:

roleoverride=other

After registering the node you need to define an association with a schedule.

define association acc acc_sched example.ac2.se

TP

Add the node example.tp.umu.se with the password somekindofpassword to the TP policy domain:

reg node example.tp.umu.se somekindofpassword contact="admin=Sys admins, backupadm@tp.umu.se; notify=2" domain=TP forcepwreset=yes clopt=NET_GIGE maxnummp=10 roleoverride=client/server

After registering the node you need to define an association with a schedule.

define association tp tp_sched example.tp.umu.se

NDGF

Add the node example.ndgf.org with the password somekindofpassword to the NDGF policy domain:

reg node example.ndgf.org somekindofpassword contact="admin=OoD, support@ndgf.org; notify=2" domain=NDGF forcepwreset=yes maxnummp=10 clopt=NET_10GIGE

For KVM/Ganeti instances the server nodes are backed up; avoid double-counting licenses by adding:

roleoverride=other

After registering the node you need to define an association with a schedule.

define association ndgf ndgf_sched example.ndgf.org

C3SE

Add the node example.c3se.chalmers.se with the password somekindofpassword to the C3SE policy domain:

reg node example.c3se.chalmers.se somekindofpassword contact="admin=Backup Admin, tekniker@C3SE.Chalmers.se; notify=2" domain=C3SE forcepwreset=yes maxnummp=10 clopt=NET_GIGE/NET_10GIGE

For KVM/Ganeti instances the server nodes are backed up; avoid double-counting licenses by adding:

roleoverride=other

After registering the node you need to define an association with a schedule.

define association c3se c3se_sched example.c3se.chalmers.se

Informatik

We separate the clients (workstations, laptops, whatnot) and the servers in different domains, simply because we want no collocation for the clients but some collocation for the servers.

This explicit separation wouldn't be needed if we could grant authority to modify collocation groups by domain.

Informatik clients

Add the node example.informatik.umu.se with the password somekindofpassword to the ITIK policy domain:

reg node example.informatik.umu.se somekindofpassword contact="admin=User Name, user.name@informatik.umu.se; notify=2; fallback=sysadm@informatik.umu.se; fbnotify=30" domain=ITIK forcepwreset=yes maxnummp=10 clopt=NET_100MBIT roleoverride=client

After registering the node you need to define an association with a schedule.

define association itik itik_sched example.informatik.umu.se

Changing stuff

update admin

You need to use this command to change your admin user password.

Choose a unique random password between 12 and 63 characters.

NOTE: The password is shown in clear-text when entered, so take care to not accidentally show it to others (screenshots, terminal logs, etc).

update admin youradminuser yournewadminpassword
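A unique random password is easier to generate than to invent. The following is a minimal Python sketch, not an HPC2N tool: the function name generate_admin_password, the default length of 32 and the restriction to letters and digits are all assumptions, chosen to stay inside the 12 to 63 character range and to avoid characters that might need quoting.

```python
import secrets
import string

# Assumption: letters and digits only, to avoid characters that may need quoting.
ALPHABET = string.ascii_letters + string.digits


def generate_admin_password(length: int = 32) -> str:
    """Return a random password of the given length (12 to 63 characters)."""
    if not 12 <= length <= 63:
        raise ValueError("length must be between 12 and 63 characters")
    # secrets uses the operating system's cryptographic random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    # The password is printed in clear text, so mind screenshots and terminal logs.
    print(generate_admin_password())
```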

update node

This command updates any information about a node. The most common operation is probably setting a new password:

update node example-node.hpc2n.umu.se newpassword forcepwreset=yes

This sets the password to newpassword, which is then automatically replaced by a generated password upon next access.

unlock node

Use this command when someone has tried to guess a password too many times and the node has been locked as a result.

unlock node NODE-NAME

rename node

Use this command to rename a node.

rename node OLD-NAME NEW-NAME

It's recommended to use a node name that matches the machine hostname; this is the default, at least for Unix/Linux clients. If the node name is hard-coded in the client config, update the config after renaming the node and restart any running TSM services.

Windows - additional steps required

On Windows the TSM client stores the password in a registry key named after the target TSM server, in a path containing the node name. Either rename this path, or set a new password for the node and enter it when the TSM client starts.

In short, use the regedit tool to find any keys of the following form (you most likely only have one of these):

HKEY_LOCAL_MACHINE\SOFTWARE\IBM\ADSM\CurrentVersion\Nodes\OLD-NAME\BYTEGRINDER
HKEY_LOCAL_MACHINE\SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes\OLD-NAME\BYTEGRINDER

and rename them, replacing OLD-NAME with NEW-NAME.

For more information on how to use the Windows Registry Editor regedit, see the Microsoft Support pages at http://support.microsoft.com/default.aspx?scid=kb;en-us;136393

Move node to another backup domain

First, move the node:

update node NODENAME domain=NEW-DOMAIN

Then, associate the moved node with a backup schedule in the new domain:

def assoc NEW-DOMAIN SCHEDULENAME NODENAME

Proceed with moving the data of the node. For this you must know which primary storage pools hold data for the node. Find this out by checking the occupancy:

q occupancy NODENAME

Then move all node data from the primary sequential storage pools, usually only one tape pool.

move nodedata ITCHY.CS.UMU.SE from=CST to=CSSRVD

If moving multiple nodes, specify them all in the same move nodedata command separated by commas, or use the collocgroup argument (see the help).
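As an illustration of the comma-separated form, the Python sketch below builds a single move nodedata command for several nodes. The function name move_nodedata_command is hypothetical, and SCRATCHY.CS.UMU.SE is a made-up second node name used only for the example; the pool names are taken from the command above.

```python
from typing import Sequence


def move_nodedata_command(nodes: Sequence[str], from_pool: str, to_pool: str) -> str:
    """Build one 'move nodedata' command covering all the given nodes."""
    if not nodes:
        raise ValueError("at least one node name is required")
    # Node names are joined with commas and no spaces.
    return f"move nodedata {','.join(nodes)} from={from_pool} to={to_pool}"


if __name__ == "__main__":
    # SCRATCHY.CS.UMU.SE is a hypothetical node name.
    print(move_nodedata_command(
        ["ITCHY.CS.UMU.SE", "SCRATCHY.CS.UMU.SE"], "CST", "CSSRVD"))
```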

Creating/changing collocation groups

The relevant commands are

define collocg examplegroup desc="Example colloc group"
define collocmem examplegroup NODENAME
delete collocmem examplegroup NODENAME

Removing stuff

Implementing a grace period before node removal

To keep an old backup node for a while before removing it, leverage the HPC2N-specific notify functionality (aka the tsmdude mails) and set it to notify you at a suitable time in the future.

Perform the following steps:

  • Lock the backup node: lock node example-node
  • Increase the notify subfield in the node contact field:
    • List the current contact field for the node: q node example-node f=d
    • Cut-and-paste the current contact field as one line, change the notify= number to the number of days it should delay notifications, and update the node contact info. Pay attention to get the quotes and delimiters correct:
      • Example: update node example-node contact="admin=Sys admins, sysop@hpc2n.umu.se; notify=180"
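Editing the contact field by hand makes it easy to break the quoting or to change the wrong number. The steps above can be sketched in Python as follows. This is only an illustration: the function names set_notify and update_contact_command are hypothetical, and the sketch assumes the contact field uses the "key=value; key=value" layout shown in the registration examples.

```python
import re


def set_notify(contact: str, days: int) -> str:
    """Return the contact string with its notify= subfield set to days."""
    if days < 0:
        raise ValueError("days must not be negative")
    # \b stops the pattern from matching inside fbnotify=.
    updated, count = re.subn(r"\bnotify=\d+", f"notify={days}", contact)
    if count != 1:
        raise ValueError("expected exactly one notify= subfield")
    return updated


def update_contact_command(node: str, contact: str, days: int) -> str:
    """Build the 'update node' command with the rewritten contact field."""
    return f'update node {node} contact="{set_notify(contact, days)}"'


if __name__ == "__main__":
    current = "admin=Sys admins, sysop@hpc2n.umu.se; notify=2"
    print(update_contact_command("example-node", current, 180))
```

The current contact string still has to be copied from the q node example-node f=d output, and the resulting command checked before it is entered.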

Removing a client node

This procedure immediately removes a client node; there is no way to recover the data afterwards.

If the machine still has a backup client running, ensure the node is locked to avoid backups starting while you are removing file spaces:

lock node example-node

To remove all filespaces of all types (i.e. backup/archive) related to the node:

delete filespace example-node *

Wait for the delete filespace process to finish.

Finally, remove the node itself. Any remaining schedule associations, proxynode definitions, etc. tied to the node will also be removed.

remove node example-node
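The order of the removal steps matters, so it can help to print them as a checklist for a given node. The Python sketch below does only that. The function name removal_commands is hypothetical, and nothing is sent to the server: the wait for the delete filespace process between the second and third command still has to be done by hand.

```python
from typing import List


def removal_commands(node: str) -> List[str]:
    """Return the removal commands for a node, in the order they must be run."""
    return [
        f"lock node {node}",           # stop new backups from starting
        f"delete filespace {node} *",  # remove all filespaces of all types
        f"remove node {node}",         # only after delete filespace has finished
    ]


if __name__ == "__main__":
    for step, command in enumerate(removal_commands("example-node"), start=1):
        print(f"{step}. {command}")
```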