Installing → Configuration Guide

Prerequisites

Before running the interactive xdmod-setup script, ensure the following:

The base XDMoD software and any desired XDMoD submodules have been installed following the Installation Guide.
The MariaDB server is configured per the instructions below and is running.

MariaDB Configuration

Open XDMoD does not support any of the strict Server SQL Modes. You must set sql_mode = '' in your MySQL server configuration.

Open XDMoD uses the GROUP_CONCAT() SQL function. The group_concat_max_len server system variable must be changed to 16MB from its default value of 1024 bytes.

The max_allowed_packet setting must be set to at least 16MB.

Some versions of MySQL have binary logging enabled by default. This can be an issue during the setup process if the user specified to create the databases does not have the SUPER privilege. If binary logging is not required you should disable it in your MySQL configuration. If that is not an option you can use the less safe log_bin_trust_function_creators variable. You may also grant the SUPER privilege to the user that is used to create the Open XDMoD database.

We recommend setting innodb_buffer_pool_size to around 50% of the memory on your server.

Whatever your set for innodb_buffer_pool_size, make sure innodb_log_file_size is 25% of innodb_buffer_pool_size.

The recommended settings in the MySQL server configuration file are as follows:

[mysqld]
sql_mode = ''
max_allowed_packet = 1G
group_concat_max_len = 16M
innodb_stats_on_metadata = off
innodb_file_per_table = On

Enabling InnoDB File Per Table setting

We recommend setting innodb_file_per_table = On for your Open XDMoD instance but it is not required. This setting helps to control the size of the database files and provides a minor speed up for InnoDB tables. It is important to note that setting innodb_file_per_table to On is a global setting that will affect all databases on the database server not just Open XDMoD related databases.

While not mandatory, when changing the innodb_file_per_table to innodb_file_per_table = On we recommend that you export, drop, and re-import all Open XDMoD InnoDB tables in order to make sure existing InnoDB data is moved to one file per table.

A script, /bin/xdmod-convert-innodb-fpt, is provided to help with this process. This script will only convert Open XDMoD related databases. For any non-Open XDMoD databases with InnoDB tables on your server you will need to export the tables manually, drop the tables and then load them back in.

The steps to enable the innodb_file_per_table MySQL option and making sure existing InnoDB data is moved to the appropriate file by using the xdmod-convert-innodb-fpt script is listed below.

Export all Open XDMoD InnoDB tables. This can be done with the following command xdmod-convert-innodb-fpt --export-tables --dir=path/to/dir
Drop all Open XDMoD InnoDB tables. This can be done using the --drop-tables flag, xdmod-convert-innodb-fpt --drop-tables. This command will drop your Open XDMoD InnoDB tables! Please make sure you have either run the command xdmod-convert-innodb-fpt --export-tables --dir=path/to/dir or have manually exported the InnoDB tables before running it.
Shutdown the MySQL server
Add the following line to /etc/my.cnf file.
```
innodb_file_per_table = On
```
Delete the ibdata1, ib_logfile0 and ib_logfile1 files from the MySQL data directory. The default location for this is /var/lib/mysql
Restart MySQL
Import Open XDMoD InnoDB data previously exported. This can be done with the following command: xdmod-convert-innodb-fpt --import-tables --dir=path/to/dir

Setup Script

Open XDMoD includes a setup script to help you configure your installation. This script will prompt you for information needed to configure Open XDMoD and update your configuration files accordingly. If you have modified your configuration files manually, be sure to make backups before running this command:

# xdmod-setup

General Settings

The general settings include:

Site address (The URL you will use to access Open XDMoD)
Email address (The email address Open XDMoD will use when sending emails)
Chromium path
Header logo (see Logo Image Guide for details)
Whether to enable the Dashboard tab (see the Dashboard Guide for details)

These settings are stored in portal_settings.ini.

Database Settings

Will create and initialize database as well as storing these settings:

Database hostname
Database port number
Database username
Database password

These settings are stored in portal_settings.ini.

You will be required to supply a username and password for a user that has privileges to create databases and users.

ACL Database Setup / Population

This step will run immediately after you have set up the database that Open XDMoD will be using and does not require any additional input. It is responsible for creating and populating the tables required by the ACL framework.

If your Open XDMoD Installation requires modifications to the ACL tables (/etc/xdmod/etl/etl_tables.d/acls/<table>.json) then running this step again or the acl-config bin script is required.

Organization Settings

The organization settings require a name and abbreviation for your organization. These will be used in the portal to refer to anything relating to your organization’s data.

Resources

For each resource you will need this information:

Resource name - A short name or abbreviation that will be used when displaying data about specific resources.
Formal name - A possibly longer, more descriptive name for the resource.
Resource type - The type that best describes this resource.
Node count - The current number of nodes in the resource.
Processor count - The total sum of all the processors (CPU cores) in the resource.

For example, if you have a resource dedicated to your physics department with 100 nodes that have 16 cores in each node, you could use these values:

Resource name: physics
Formal name: Physics Department Cluster
Resource type: hpc
Node count: 100
Processor count: 1600

The resource name supplied here must be specified during the shredding process. If you are using the Slurm helper script, this name must match the cluster name used by Slurm.

The resource type defines metadata that can be used to group and filter resources in the XDMoD user interface.

The number of nodes and cores in your resource are used to display the utilization charts (the percentage of your cluster that is being used). If these numbers are not accurate, these charts will likewise be inaccurate. If the number of nodes or processors in any of your resources changes, you will need to update your configuration. Refer to the resource_specs.json section below for details.

Create Admin User

This will allow you to create an administrative user that can log into the Open XDMoD portal and create other users. You will need to supply a username and password for this user along with the first name, last name and email address of your admin.

Hierarchy

Open XDMoD allows you to define a three level hierarchy that can be used to define various entities or groups and associate users with a group in the hierarchy. These can be decanal units and their associated departments or any hierarchy that is desired. If defined, this hierarchy is used to generate charts that aggregate usage metrics into groups based on users assigned to one of the groups.

See the Hierarchy Guide for more details.

Apache Configuration

A template Apache configuration file is provided. The path is /usr/share/xdmod/templates/apache.conf in the RPM install and share/templates/apache.conf in the source code install. This template file must be copied to the Apache configuration directory and edited to update site specific configuration settings.

For Rocky 8 and RHEL 8 the template file should be copied to /etc/httpd/conf.d/xdmod.conf. For other Linux distributions consult the distribution documentation to determine the path to the webserver configuration files.

This template file must be modified to update site specific settings:

Valid SSL certificates will need to be installed and configured. The template configuration file must be edited to specify the path to the SSL certificate file and SSL certificate key file. Refer to the Apache SSL documentation for SSL configuration information.

The ServerName setting should be updated to match the server name in the SSL certificate.

The name and port of the server must match with the site_address configuration setting in portal_settings.ini.

The template configuration file also configures the webserver to send the Strict-Transport-Security HTTP Header to indicate to web browsers that the Open XDMoD instance should only be accessed using HTTPS.

<VirtualHost *:443>
    # The ServerName and ServerAdmin parameters should be updated.
    ServerName localhost
    ServerAdmin postmaster@localhost

    # Production Open XDMoD instances should use HTTPS
    SSLEngine on

    # Update the SSLCertificateFile and SSLCertificateKeyFile parameters
    # to the correct paths to your SSL certificate.
    SSLCertificateFile /etc/pki/tls/certs/localhost.crt
    SSLCertificateKeyFile /etc/pki/tls/private/localhost.key

    <FilesMatch "\.(cgi|shtml|phtml|php)$">
        SSLOptions +StdEnvVars
    </FilesMatch>

    # Use HTTP Strict Transport Security to force client to use secure connections only
    Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains"

    DocumentRoot /usr/share/xdmod/html

    <Directory /usr/share/xdmod/html>
        Options FollowSymLinks
        AllowOverride All
        DirectoryIndex index.php

        <IfModule mod_authz_core.c>
            Require all granted
        </IfModule>
    </Directory>

    <Directory /usr/share/xdmod/html/rest>
        RewriteEngine On
        RewriteRule (.*) index.php [L]
    </Directory>

    ## SimpleSAML Single Sign On authentication.
    #SetEnv SIMPLESAMLPHP_CONFIG_DIR /etc/xdmod/simplesamlphp/config
    #Alias /simplesaml /usr/share/xdmod/vendor/simplesamlphp/simplesamlphp/www
    #<Directory /usr/share/xdmod/vendor/simplesamlphp/simplesamlphp/www>
    #    Options FollowSymLinks
    #    AllowOverride All
    #    <IfModule mod_authz_core.c>
    #        Require all granted
    #    </IfModule>
    #</Directory>

    # Update the path to rotatelogs if it is different on your system.
    ErrorLog "|/usr/sbin/rotatelogs -n 5 /var/log/xdmod/apache-error.log 1M"
    CustomLog "|/usr/sbin/rotatelogs -n 5 /var/log/xdmod/apache-access.log 1M" combined
</VirtualHost>

Logrotate Configuration

A logrotate config file is included for the Open XDMoD log files.

Cron Configuration

A cron config file is included that runs the script that sends out scheduled reports. You can also use this file to schedule shredding and ingestion.

# Every morning at 3:00 AM -- run the report scheduler
0 3 * * * xdmod /usr/bin/php /usr/lib/xdmod/report_schedule_manager.php >/dev/null

# Shred and ingest PBS logs
0 1 * * * xdmod /usr/bin/xdmod-shredder -q -r resource-name -f pbs -d /var/spool/pbs/server_priv/accounting && /usr/bin/xdmod-ingestor -q

Location of Configuration Files

The Open XDMoD config files (excluding the apache, logrotate and cron files) are located in the etc directory of the installation prefix or /etc/xdmod for the RPM distribution.

portal_settings.ini

Primary configuration file. Contains:

Site address (The URL you will use to access Open XDMoD)
Email address (The email address Open XDMoD will use when sending emails)
Chromium path
Header logo (see Logo Image Guide for details)
Database configuration
Integration settings (see Integrations for details)
Data Analytics Framework settings

datawarehouse.json

Defines realms, group bys, statistics.

etl/etl_data.d/jobs/xdw/processor-buckets.json

Defines the ranges used for number of processors/cores in “Job Size” charts. Sites may want to align the bucket sizes with the number of cores per node on their resources.

[
    ["id", "min_processors", "max_processors", "description"],
    [1,       1,          1, "1"],
    [2,       2,          2, "2"],
    [3,       3,          4, "3 - 4"],
    [4,       5,          8, "5 - 8"],
    [5,       9,         16, "9 - 16"],
    [6,      17,         32, "17 - 32"],
    [7,      33,         64, "33 - 64"],
    [8,      65,        128, "65 - 128"],
    [9,     129,        256, "129 - 256"],
    [10,    257,        512, "257 - 512"],
    [11,    513,       1024, "513 - 1024"],
    [12,   1025,       2048, "1k - 2k"],
    [13,   2049,       4096, "2k - 4k"],
    [14,   4097, 2147483647, "> 4k"]
]

After changing this file it must be re-ingested and all job data must be re-aggregated. If the job data are not re-aggregated the new labels will be displayed, but will not be accurate if the corresponding bucket has changed.

/usr/share/xdmod/tools/etl/etl_overseer.php -a xdmod.jobs-xdw-bootstrap.processorbuckets
xdmod-ingestor --aggregate=job --last-modified-start-date 1970-01-01

etl/etl_data.d/jobs/xdw/gpu-buckets.json

Defines the ranges used for the number of GPUs in “GPU Count” charts. Sites may want to align the bucket sizes with the number of GPUs per node on their resources.

[
    ["id", "min_gpus", "max_gpus", "description"],
    [1,       0,           0, "0"],
    [2,       1,           1, "1"],
    [3,       2,           2, "2"],
    [4,       3,           3, "3"],
    [5,       4,           4, "4"],
    [6,       5,           5, "5"],
    [7,       6,           6, "6"],
    [8,       7,           7, "7"],
    [9,       8,           8, "8"],
    [10,      9,          16, "9 - 16"],
    [11,      17,         32, "17 - 32"],
    [12,      33,         64, "33 - 64"],
    [13,      65,        128, "65 - 128"],
    [14,     129,        256, "129 - 256"],
    [15,     257,        512, "257 - 512"],
    [16,     513,       1024, "513 - 1024"],
    [17,    1025,       2048, "1k - 2k"],
    [18,    2049,       4096, "2k - 4k"],
    [19,    4097, 2147483647, "> 4k"]
]

See the section above for commands that can be used to re-ingest and re-aggregate the data.

roles.json

Defines roles and the modules and statistics that each role grants access to. The dimensions roles are associated with are also defined here.

By default, there is a public role (pub) for users that have not signed in and several other roles that apply to authenticated users. There is also a default role that is used as a basis of all the other roles. The other roles are user (usr), center director (cd), principal investigator (pi), center staff (cs) and manager (mgr).

{
    "roles": {
        "default": {
            "permitted_modules": [
                {
                    "name": "tg_summary",
                    "default": true,
                    "title": "Summary",
                    "position": 100,
                    "javascriptClass": "XDMoD.Module.Summary",
                    "javascriptReference": "CCR.xdmod.ui.tgSummaryViewer",
                    "tooltip": "Displays Summary Information",
                    "userManualSectionName": "Summary Tab"
                },
                ...
            ],
            "query_descripters": [
                {
                    "realm": "Jobs",
                    "group_by": "none"
                },
                ...
            ],
            "summary_charts": [
                ...
            ],
        },
        "usr": {
            "extends": "default",
            "dimensions": [
                "person"
            ]
        },
        "cd": {
            "extends": "default",
            "dimensions": [
                "provider"
            ]
        },
        "pi": {
            "extends": "default",
            "dimensions": [
                "pi"
            ]
        },
        "cs": {
            "extends": "default",
            "dimensions": [
                "provider"
            ]
        },
        "mgr": {
            "extends": "default",
            "dimensions": [
                "person"
            ]
        }
    }
}

organization.json

Defines the organization name and abbreviation.

{
    "name": "Example Organization",
    "abbrev": "EO"
}

resources.json

Defines resource names and types. Each object in the array represents the configuration for a single resource.

Optionally, allows specifying a column in the resource specific job table to identify the PI. The column names that may be used with this feature must exist in the corresponding shredded_job_* table (e.g. shredded_job_pbs, shredded_job_slurm) of the mod_shredder database for the resource manager you are using.

For example, to use accounts from PBS/TORQUE you must use "pi_column": "account", but to use accounts from Slurm you must use "pi_column": "account_name".

The "shared_jobs" option indicates that the resource allows multiple to share compute nodes. This information is used by the Job Performance Data (SUPReMM) module to determine which HPC jobs shared compute nodes. The default is that resources are assumed to not allow node sharing. If the SUPReMM module is in use and a resource does allow node sharing then this should be set to true.

For cloud resources the timezone is not used and times are converted to the local timezone that the server is in.

The resource_allocation_type option indicates how this resource is allocated to users, such as by CPU, GPU or Node. By default, there are 4 possible values, CPU, CPUNode, GPU, and GPUNode. CPUNode denotes a resource that allocates nodes of CPUs to users, whereas CPU denotes a resource that allocates individual CPUs to users.

[
    {
        "resource": "resource1",
        "name": "Resource 1",
        "description": "Our first HPC resource",
        "resource_type": "HPC",
        "resource_allocation_type": "CPUNode"
    },
    {
        "resource": "resource2",
        "name": "Resource 2",
        "resource_type": "HPC",
        "pi_column": "account_name",
        "resource_allocation_type": "GPU"
    },
    {
        "resource": "resource3",
        "name": "Resource 3",
        "resource_type": "HPC",
        "timezone": "US/Eastern",
        "resource_allocation_type": "CPU",
        "shared_jobs": true
    },
    {
        "resource": "resource4",
        "name": "Resource 4",
        "resource_type": "Cloud",
        "resource_allocation_type": "CPU",
    }
]

resource_specs.json

Defines resource node and processor counts. Each object in the array represents a resource’s specifications for a given time interval. If the number of nodes and processors in a resource have changed over time, multiple entries are required for that resource to calculate an accurate utilization metric.

Note that the end_date is necessary for resources with multiple entries and still active. The end_date should not be included for the last entry for a resource. A start_date is necessary for all entries.

It is also possible to change the utilization metric by specifying a percent allocated (see percent_allocated below). The utilization will then be normalized against this percentage. This allows you to specify the total number of nodes and processors in a resource, but force the utilization percentage to be displayed as if only a fraction of those processors are allocated to the jobs stored in the Open XDMoD data warehouse. If this data is omitted, it is assumed that the resource is 100% allocated.

[
    {
        "resource": "resource1",
        "start_date": "2016-12-27",
        "cpu_node_count": 400,
        "cpu_processor_count": 4000,
        "cpu_ppn": 10,
        "gpu_node_count": 0,
        "gpu_processor_count": 0,
        "gpu_ppn": 0,
        "end_date": "2017-12-01"
    },
    {
        "resource": "frearson",
        "start_date": "2017-12-02",
        "cpu_node_count": 800,
        "cpu_processor_count": 8000,
        "cpu_ppn": 10,
        "gpu_node_count": 0,
        "gpu_processor_count": 0,
        "gpu_ppn": 0,
        "end_date": "2018-01-01"
    },
    {
        "resource": "frearson",
        "start_date": "2018-01-02",
        "cpu_node_count": 800,
        "cpu_processor_count": 8000,
        "cpu_ppn": 10,
        "gpu_node_count": 10,
        "gpu_processor_count": 100,
        "gpu_ppn": 10,
        "end_date": "2019-01-01",
        "percent_allocated": 100
    },
    {
        "resource": "frearson",
        "start_date": "2019-01-02",
        "cpu_node_count": 800,
        "cpu_processor_count": 8000,
        "cpu_ppn": 10,
        "gpu_node_count": 10,
        "gpu_processor_count": 100,
        "gpu_ppn": 10,
        "percent_allocated": 90
    },
]

resource_types.json

Defines resource types and associates resource types with realms. Each resource in resources.json should reference a resource type from this file. This file typically should not be changed.

resource_allocation_types.json

Defines how a resource is allocated to users. Each resource in resources.json should reference a resource allocation type from this file. This file typically should not be changed.

update_check.json

Determines if Open XDMoD will automatically check for updates. Set "enabled": false to disable.

{
    "enabled": true,
    "name": "John Doe",
    "organization": "Acme Widgets",
    "email": "j.doe@example.com"
}

About

Support

Download

Installing

Using

Resource Manager Notes

Optional modules

Developing

Unsupported versions