Archivematica 1.16.0 is our latest release.

Upgrade from Archivematica 1.15.x to 1.16.0

Note

While it is possible to upgrade a GitHub-based source install using ansible, these instructions do not cover that scenario.

Clean up completed transfers watched directory

Note

Ignore this section if you are upgrading from Archivematica 1.11 or newer.

Upgrading from Archivematica 1.10.x or older to Archivematica 1.16.0 can result in a number of completed transfers appearing as failed in the Archivematica dashboard, as well as corresponding failure notification emails being sent. These are not actual failures, but are unintentional side effects of changes made in Archivematica 1.11 to the workflow and to how metadata files are stored and copied into the SIP.

To prevent these failures from occurring during an upgrade from Archivematica 1.10 or earlier:

  1. Confirm that all transfers and ingests are complete.

    Check that there are no transfers or SIPs that are still being processed or awaiting decisions in the Transfer and Ingest tabs. If there are, finish processing the transfers/ingests before proceeding.

  2. Delete all contents of the completedTransfers watched directory.

    sudo rm -rf /var/archivematica/sharedDirectory/watchedDirectories/SIPCreation/completedTransfers/*
    
  3. Perform the upgrade as described below.
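
Before running the rm command in step 2, it can help to see how many entries would be removed. The following is a minimal sketch, not part of the official procedure; count_entries is a hypothetical helper name, and the path in the usage comment is the one from step 2.

```shell
# Hypothetical helper: count the top-level entries in a directory so they
# can be reviewed before anything is deleted.
count_entries() {
    local dir="$1"
    # -mindepth/-maxdepth restrict the listing to direct children of $dir.
    find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}

# Usage (may need root to read the directory):
#   count_entries /var/archivematica/sharedDirectory/watchedDirectories/SIPCreation/completedTransfers
```

A result of 0 means the directory is already empty and step 2 has nothing to delete.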

Create a backup

Before starting any upgrade procedure on a production system, we strongly recommend backing up your system. If you are using a virtual machine, take a snapshot of it before making any changes. Alternatively, back up the file systems being used by your system. Exact procedures for updating will depend on your local installation. At a minimum you should make backups of:

  • The Storage Service SQLite (or MySQL) database
  • The dashboard MySQL database

This is a simple example of backing up these two databases:

sudo cp /var/archivematica/storage-service/storage.db ~/storage_db_backup.db
mysqldump -u root -p MCP > ~/am_backup.sql

If you do not have a password set for the root user in MySQL, you can take out the ‘-p’ portion of that command. If there is a problem during the upgrade process, you can restore your MySQL database from this backup and try the upgrade again.
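
The restore mentioned above could look like the following sketch. It assumes the dump was created with the mysqldump command shown earlier and that the MCP database still exists, since a single-database dump does not recreate the database itself. restore_mcp_db is a hypothetical helper name.

```shell
# Hypothetical helper: reload a mysqldump file into the MCP database.
# Assumes the MCP database exists and that root has a password
# (drop -p otherwise, as with the backup command).
restore_mcp_db() {
    local dump_file="$1"
    # Refuse to run against a missing or empty backup file.
    if [ ! -s "$dump_file" ]; then
        echo "Backup file is missing or empty: $dump_file" >&2
        return 1
    fi
    mysql -u root -p MCP < "$dump_file"
}

# Usage:
#   restore_mcp_db ~/am_backup.sql
```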

If you’re upgrading from Archivematica 1.8 or lower to version 1.9 or higher, note that the supported Elasticsearch version changed from 1.x to 6.x. We recommend that you also back up your Elasticsearch data, especially if you don’t have access to the AIP storage locations in the local filesystem.

You can follow these steps in order to create a backup of Elasticsearch:

# Remove and recreate the folder that stores the backup
sudo rm -rf /var/lib/elasticsearch/backup-repo/
sudo mkdir -p /var/lib/elasticsearch/backup-repo/
sudo chown elasticsearch:elasticsearch /var/lib/elasticsearch/backup-repo/
# Allow elasticsearch to write files to the backup
echo 'path.repo: ["/var/lib/elasticsearch/backup-repo"]' |sudo tee -a /etc/elasticsearch/elasticsearch.yml
# Restart Elasticsearch and wait for it to start
sudo service elasticsearch restart
sleep 60s
# Configure the ES backup
curl -XPUT "localhost:9200/_snapshot/backup-repo" -H 'Content-Type: application/json' -d \
'{
     "type": "fs",
     "settings": {
     "location": "./",
     "compress": true
     }
 }'
# Take the actual backup, and copy it to a safe place
curl -X PUT "localhost:9200/_snapshot/backup-repo/am_indexes_backup?wait_for_completion=true"
cp -rf /var/lib/elasticsearch/backup-repo elasticsearch-backup

For more info, refer to the Elasticsearch 6.8 docs.
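
To confirm that the snapshot completed, you can query it and check its state. The sketch below assumes the repository and snapshot names used in the commands above. snapshot_state is a hypothetical helper that pulls the state out of the response with grep and sed instead of a JSON parser, so treat it as a quick check rather than a robust tool.

```shell
# Hypothetical helper: read a "get snapshot" JSON response on stdin and
# print the value of its first "state" field (for example SUCCESS).
snapshot_state() {
    grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Usage against the snapshot created above:
#   curl -s "localhost:9200/_snapshot/backup-repo/am_indexes_backup" | snapshot_state
```

Any state other than SUCCESS suggests the snapshot should be inspected before you rely on it.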

Upgrade on Ubuntu packages

  1. Update the operating system.

    sudo apt-get update && sudo apt-get upgrade
    
  2. Update package sources.

    echo 'deb [arch=amd64] http://packages.archivematica.org/1.16.x/ubuntu jammy main' | sudo tee -a /etc/apt/sources.list
    echo 'deb [arch=amd64] http://packages.archivematica.org/1.16.x/ubuntu-externals jammy main' | sudo tee -a /etc/apt/sources.list
    

    Optionally you can remove the lines referencing packages.archivematica.org/1.15.x from /etc/apt/sources.list.

  3. Update the Storage Service.

    sudo apt-get update
    sudo apt-get install archivematica-storage-service
    
  4. Update Archivematica. During the update process you may be asked about updating configuration files. Choose to accept the maintainer’s versions. You will also be asked about updating the database; say ‘ok’ to each of those steps. If you have set a password for the root MySQL database user, enter it when prompted.

    sudo apt-get install archivematica-common
    sudo apt-get install archivematica-dashboard
    sudo apt-get install archivematica-mcp-server
    sudo apt-get install archivematica-mcp-client
    sudo apt-get install archivematica
    
  5. Restart services.

    sudo service archivematica-storage-service restart
    sudo service gearman-job-server restart
    sudo service archivematica-mcp-server restart
    sudo service archivematica-mcp-client restart
    sudo service archivematica-dashboard restart
    sudo service nginx restart
    
  6. Depending on your browser settings, you may need to clear your browser cache to make the dashboard pages load properly. For example, in Firefox or Chrome you should be able to force a reload that bypasses the cache with control-shift-R or control-F5 (command-shift-R on macOS).
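
Step 2 appends to /etc/apt/sources.list, so running it more than once leaves duplicate entries, and the old 1.15.x lines stay in place unless you remove them. The following sketch combines both parts in a repeatable way. switch_archivematica_sources is a hypothetical helper name, and the repository lines are the ones from step 2.

```shell
# Hypothetical helper: point an APT sources file at the 1.16.x repositories.
# It drops Archivematica 1.15.x lines and adds each 1.16.x line only once.
switch_archivematica_sources() {
    local file="$1"
    local line
    # Remove lines that reference the old 1.15.x repository.
    sed -i '/packages\.archivematica\.org\/1\.15\.x/d' "$file"
    for line in \
        'deb [arch=amd64] http://packages.archivematica.org/1.16.x/ubuntu jammy main' \
        'deb [arch=amd64] http://packages.archivematica.org/1.16.x/ubuntu-externals jammy main'
    do
        # -x matches whole lines, -F treats the pattern as a fixed string.
        grep -qxF -- "$line" "$file" || echo "$line" >> "$file"
    done
}

# Usage (run as root):
#   switch_archivematica_sources /etc/apt/sources.list
```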

Upgrade on Rocky Linux/Red Hat packages

  1. Upgrade the repositories for 1.16:

    sudo sed -i 's/1.15.x/1.16.x/g' /etc/yum.repos.d/archivematica*
    
  2. Remove the currently installed version of ghostscript:

    sudo rpm -e --nodeps ghostscript ghostscript-x11 \
                         ghostscript-core ghostscript-fonts
    
  3. Upgrade Archivematica packages:

    sudo yum update
    
  4. Apply the Archivematica database migrations:

    sudo -u archivematica bash -c " \
        set -a -e -x
        source /etc/default/archivematica-dashboard || \
            source /etc/sysconfig/archivematica-dashboard \
                || (echo 'Environment file not found'; exit 1)
        cd /usr/share/archivematica/dashboard
        /usr/share/archivematica/virtualenvs/archivematica/bin/python manage.py migrate --noinput
    ";
    
  5. Apply the Storage Service database migrations:

    Warning

    In Archivematica 1.13 or newer, the new default database backend is MySQL. Please follow our migration guide to move your data to a MySQL database before these migrations are applied.

    If you want to continue using SQLite, please edit the environment configuration found in /etc/sysconfig/archivematica-storage-service. Comment out SS_DB_URL and indicate the path of the SQLite database with SS_DB_NAME, e.g.: SS_DB_NAME=/var/archivematica/storage-service/storage.db.

    sudo -u archivematica bash -c " \
        set -a -e -x
        source /etc/default/archivematica-storage-service || \
            source /etc/sysconfig/archivematica-storage-service \
                || (echo 'Environment file not found'; exit 1)
        cd /usr/lib/archivematica/storage-service
        /usr/share/archivematica/virtualenvs/archivematica-storage-service/bin/python manage.py migrate
    ";
    
  6. Restart the Archivematica-related services, and continue using the system:

    sudo systemctl restart archivematica-storage-service
    sudo systemctl restart archivematica-dashboard
    sudo systemctl restart archivematica-mcp-client
    sudo systemctl restart archivematica-mcp-server
    
  7. Depending on your browser settings, you may need to clear your browser cache to make the dashboard pages load properly. For example, in Firefox or Chrome you should be able to force a reload that bypasses the cache with control-shift-R or control-F5 (command-shift-R on macOS).
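
In the sed expression from step 1, the unescaped dots in 1.15.x match any character. That is harmless for typical repository files, but a stricter variant is easy to write. The sketch below escapes the dots and suggests previewing the affected lines first; bump_repo_version is a hypothetical helper name.

```shell
# Hypothetical helper: rewrite 1.15.x to 1.16.x in the given repo files.
# The dots in the pattern are escaped so they match only literal dots.
bump_repo_version() {
    sed -i 's/1\.15\.x/1.16.x/g' "$@"
}

# Preview the lines that would change, then apply (run as root):
#   grep -H '1\.15\.x' /etc/yum.repos.d/archivematica*
#   bump_repo_version /etc/yum.repos.d/archivematica*
```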

Upgrade on Vagrant / Ansible

This upgrade method will work with Vagrant machines, but also with cloud-based virtual machines or physical servers.

  1. Connect to your Vagrant machine or server

    vagrant ssh # Or ssh <your user>@<host>
    
  2. Install Ansible

    sudo pip install ansible==2.9.10 jmespath jinja2==3.0.3
    
  3. Check out the deployment repo:

    git clone https://github.com/artefactual/deploy-pub.git
    
  4. Go into the appropriate playbook folder, and install the needed roles

    Ubuntu 22.04 (Jammy):

    cd deploy-pub/playbooks/archivematica-jammy
    ansible-galaxy install -f -p roles/ -r requirements.yml
    

    Rocky Linux 9:

    cd deploy-pub/playbooks/archivematica-rocky9
    ansible-galaxy install -f -p roles/ -r requirements.yml
    

    All the following steps should be run from the respective playbook folder for your operating system.

  5. Verify that the vars-singlenode.yml file has the appropriate contents for Elasticsearch and Archivematica, or update it with your own.

  6. Create a hosts file.

    echo 'am-local   ansible_connection=local' > hosts
    
  7. Upgrade Archivematica by running:

    ansible-playbook -i hosts singlenode.yml --tags=elasticsearch,archivematica-src
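
If you would like to see what the playbook will do before it changes anything, ansible-playbook can list the selected tasks without executing them. The sketch below reuses the inventory, playbook, and tags from step 7 and only adds the --list-tasks flag; preview_upgrade_tasks is a hypothetical helper name.

```shell
# Hypothetical helper: list the tasks the upgrade run would execute,
# without executing them. Run it from the playbook folder.
preview_upgrade_tasks() {
    ansible-playbook -i hosts singlenode.yml \
        --tags=elasticsearch,archivematica-src --list-tasks
}

# Usage:
#   preview_upgrade_tasks
```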
    

Upgrade in indexless mode

As of Archivematica 1.7, Archivematica can be run in indexless mode; that is, without Elasticsearch. Installing Archivematica without Elasticsearch, or with limited Elasticsearch functionality, means reduced consumption of compute resources and lower operational complexity. By setting the archivematica_src_search_enabled configuration attribute, administrators can define what Elasticsearch indexes, if anything. This can impact searching across several different dashboard pages.

  1. Upgrade your existing Archivematica pipeline following the instructions above.

  2. Modify the relevant systemd EnvironmentFile files by adding lines that set the relevant environment variables to false.

    If you are using Ubuntu, run the following commands.

    sudo sh -c 'echo "ARCHIVEMATICA_DASHBOARD_DASHBOARD_SEARCH_ENABLED=false" >> /etc/default/archivematica-dashboard'
    sudo sh -c 'echo "ARCHIVEMATICA_MCPSERVER_MCPSERVER_SEARCH_ENABLED=false" >> /etc/default/archivematica-mcp-server'
    sudo sh -c 'echo "ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_SEARCH_ENABLED=false" >> /etc/default/archivematica-mcp-client'
    

    If you are using Rocky Linux, run the following commands.

    sudo sh -c 'echo "ARCHIVEMATICA_DASHBOARD_DASHBOARD_SEARCH_ENABLED=false" >> /etc/sysconfig/archivematica-dashboard'
    sudo sh -c 'echo "ARCHIVEMATICA_MCPSERVER_MCPSERVER_SEARCH_ENABLED=false" >> /etc/sysconfig/archivematica-mcp-server'
    sudo sh -c 'echo "ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_SEARCH_ENABLED=false" >> /etc/sysconfig/archivematica-mcp-client'
    
  3. Restart services.

    If you are using Ubuntu, run the following commands.

    sudo service archivematica-dashboard restart
    sudo service archivematica-mcp-client restart
    sudo service archivematica-mcp-server restart
    

    If you are using Rocky Linux, run the following commands.

    sudo -u root systemctl restart archivematica-dashboard
    sudo -u root systemctl restart archivematica-mcp-client
    sudo -u root systemctl restart archivematica-mcp-server
    
  4. If you had previously installed and started the Elasticsearch service, you can turn it off now.

    sudo -u root systemctl stop elasticsearch
    sudo -u root systemctl disable elasticsearch
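
The echo commands in step 2 always append, so running them twice leaves duplicate lines, and a pre-existing assignment of the same variable is left in the file. The following sketch sets a variable in a repeatable way instead. set_env_var is a hypothetical helper name; the file path and variable in the usage comment come from step 2.

```shell
# Hypothetical helper: set KEY=VALUE in an environment file, replacing an
# existing assignment of KEY if there is one and appending otherwise.
set_env_var() {
    local file="$1"
    local key="$2"
    local value="$3"
    if grep -q "^${key}=" "$file" 2>/dev/null; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Usage on Ubuntu (run as root):
#   set_env_var /etc/default/archivematica-dashboard \
#       ARCHIVEMATICA_DASHBOARD_DASHBOARD_SEARCH_ENABLED false
```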
    

Upgrade with output capturing disabled

As of Archivematica 1.7.1, output capturing can be disabled at upgrade or at any other time. This means the stdout and stderr from preservation tasks are not captured, which can result in a performance improvement. See the Task output capturing configuration page for more details. In order to disable output capturing, set the ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CAPTURE_CLIENT_SCRIPT_OUTPUT environment variable to false and restart the MCP Client process(es). Consult the installation instructions for your deployment method for more details on how to set environment variables and restart Archivematica processes.
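
For a package-based install, one way to apply this setting is sketched below. It assumes the MCP Client reads its environment from the same files used in the indexless instructions (/etc/default/archivematica-mcp-client on Ubuntu, /etc/sysconfig/archivematica-mcp-client on Rocky Linux). disable_output_capturing is a hypothetical helper name.

```shell
# Hypothetical helper: append the output capturing setting to an
# MCP Client environment file.
disable_output_capturing() {
    local env_file="$1"
    echo "ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CAPTURE_CLIENT_SCRIPT_OUTPUT=false" >> "$env_file"
}

# Usage on Ubuntu packages (run as root), followed by a restart:
#   disable_output_capturing /etc/default/archivematica-mcp-client
#   service archivematica-mcp-client restart
```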

Update search indices

Note

Ignore this section if you are planning to run Archivematica without search indices.

Archivematica releases may introduce changes that require updating the search indices to function properly, e.g. Archivematica v1.12.0 introduced new fields to the search indices and made some changes to text field types. Please keep an eye on our release notes before you start the upgrade.

The update can be accomplished in one of two ways. Reindexing the documents is the preferred option and is usually faster, because it re-ingests the documents that you already have indexed. If reindexing does not work for you (we would love to hear about it), you can recreate the indices instead, which takes much longer to complete because it accesses the original data, e.g. your AIPs.

Reindex the documents

In Elasticsearch, it is possible to add new fields to search indices but it is not possible to update existing ones. The recommended strategy is to create new indices with the desired mapping and reindex the documents into them. This is based on the Reindex API.

It is a multi-step process that we have automated with a script: es-reindex.sh. Please follow the link and read the instructions carefully.
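
To give a sense of the core step, a single Reindex API call copies documents from one index into another. The sketch below is not taken from es-reindex.sh; reindex_body is a hypothetical helper, and the index names in the usage comment are placeholders rather than Archivematica's real index names.

```shell
# Hypothetical helper: print a Reindex API request body that copies all
# documents from one index to another.
reindex_body() {
    local source_index="$1"
    local dest_index="$2"
    printf '{"source": {"index": "%s"}, "dest": {"index": "%s"}}\n' \
        "$source_index" "$dest_index"
}

# Usage (placeholder index names; create the destination index with the
# new mapping before reindexing into it):
#   curl -X POST "localhost:9200/_reindex" -H 'Content-Type: application/json' \
#       -d "$(reindex_body old_index new_index)"
```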

Warning

Before you continue, we recommend backing up your Elasticsearch data. Please read the official docs for instructions.

Note

We may implement this script as a Django command in the future for better usability. For the time being, please download the script and tweak as needed.

Recreate the indices

This method will allow you to delete and rebuild the existing Elasticsearch indices so that all the Backlog and Archival Storage column fields are fully populated, including for transfers and AIPs ingested prior to the upgrade to Archivematica 1.16.0. Run the commands described in Rebuild the indexes to fully delete and rebuild the indices.

Execution example:

sudo -u archivematica bash -c " \
    set -a -e -x
    source /etc/default/archivematica-dashboard || \
        source /etc/sysconfig/archivematica-dashboard \
            || (echo 'Environment file not found'; exit 1)
    cd /usr/share/archivematica/dashboard
    /usr/share/archivematica/virtualenvs/archivematica/bin/python \
        manage.py rebuild_transfer_backlog --from-storage-service --no-prompt
";

sudo -u archivematica bash -c " \
    set -a -e -x
    source /etc/default/archivematica-dashboard || \
        source /etc/sysconfig/archivematica-dashboard \
            || (echo 'Environment file not found'; exit 1)
    cd /usr/share/archivematica/dashboard
    /usr/share/archivematica/virtualenvs/archivematica/bin/python \
        manage.py rebuild_aip_index_from_storage_service --delete-all
";

Note

The use of encrypted or remote Transfer Backlog and AIP Store locations may require rebuilding the indices from the Storage Service API rather than from the filesystem. At this time, it is not possible to rebuild the indices for all types of remote locations.

Note

The execution of these commands may take a long time for large AIP and Transfer Backlog storage locations, especially if the packages are stored compressed or encrypted, or if you are using a third-party service. If that is the case, you may want to reindex the Elasticsearch documents instead.
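
Once the rebuild commands have finished, a quick way to spot an index that was left empty is to compare document counts. The sketch below assumes Elasticsearch is listening on localhost:9200, as in the backup example. empty_indices is a hypothetical helper that filters the two-column output of the cat indices API.

```shell
# Hypothetical helper: read "index docs.count" pairs on stdin and print the
# names of the indices that contain no documents.
empty_indices() {
    awk 'NF == 2 && $2 == 0 { print $1 }'
}

# Usage:
#   curl -s "localhost:9200/_cat/indices?h=index,docs.count" | empty_indices
```

An empty index is not necessarily an error (for example, if you have no transfers in the backlog), so treat the output as a prompt to investigate.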

Review the processing configuration

After any Archivematica upgrade, it is recommended to perform a sanity check on your processing configurations. Look for new decision points where you want to establish a default, such as the “Scan for viruses” decision point introduced in Archivematica 1.13.

The default and automated bundled configurations can be reset to the Archivematica defaults.

Migrate from MySQL 5.x to 8.x

It is recommended that the MySQL databases for Archivematica and the Storage Service use the MySQL 8 utf8mb4 character set and its default collation utf8mb4_0900_ai_ci (or utf8mb4_general_ci in MariaDB).

If you migrate your databases from MySQL 5.x, you can check the character set and collation of their tables with:

SELECT
   t.table_schema, t.table_name, c.character_set_name, t.table_collation
FROM
   information_schema.tables t,
   information_schema.collation_character_set_applicability c
WHERE
   c.collation_name = t.table_collation
   AND t.table_type = 'BASE TABLE'
   AND (t.table_schema = 'MCP' OR t.table_schema = 'SS');

If they use the utf8mb3 character set and collation, you should update them to avoid potential migration conflicts such as the following:

Running migrations:
  Applying admin.0003_logentry_add_action_flag_choices... OK
  Applying auth.0009_alter_user_last_name_max_length... OK
  Applying auth.0010_alter_group_name_max_length... OK
  Applying auth.0011_update_proxy_permissions... OK
  Applying auth.0012_alter_user_first_name_max_length... OK
  Applying locations.0031_rclone_space...Traceback (most recent call last):
  File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/django/db/backends/mysql/base.py", line 73, in execute
    return self.cursor.execute(query, args)
  File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/MySQLdb/cursors.py", line 179, in execute
    res = self._query(mogrified_query)
  File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/MySQLdb/cursors.py", line 330, in _query
    db.query(q)
  File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/MySQLdb/connections.py", line 255, in query
    _mysql.connection.query(self, query)
MySQLdb.OperationalError: (3780, "Referencing column 'space_id' and referenced column 'uuid' in foreign key constraint 'locations_rclone_space_id_adb7fd1d_fk_locations_space_uuid' are incompatible.")

django.db.utils.OperationalError: (3780, "Referencing column 'space_id' and referenced column 'uuid' in foreign key constraint 'locations_rclone_space_id_adb7fd1d_fk_locations_space_uuid' are incompatible.")

The following script can be used as a reference to update the character set of the databases and their tables.

#!/usr/bin/env bash

set -o errexit   # abort on nonzero exitstatus
set -o nounset   # abort on unbound variable
set -o pipefail  # do not hide errors within pipes

# Array of database names
DATABASES=(
  MCP
  SS
)

# Collation and CHARSET
CHARSET="utf8mb4"
COLLATION="utf8mb4_0900_ai_ci"

# MySQL authentication (optional, default no auth)
MYSQL_USE_AUTH=False
MYSQL_USER=root
MYSQL_PASSWORD="THE_PASSWORD"

# Function to execute a query
execute_query() {
    local query="$1"
    local db_name="$2"
    local user_arg=""

    if [ "$MYSQL_USE_AUTH" = "True" ]; then
        user_arg="-u$MYSQL_USER"
        export MYSQL_PWD="$MYSQL_PASSWORD"
    fi

    mysql -N -B $user_arg -e "$query" "$db_name"
}

# Function to fix database charset and collation
fix_database_charset() {
    local query="ALTER DATABASE ${DB_NAME} CHARACTER SET $CHARSET COLLATE $COLLATION;"
    echo "Fixing database charset and collation"
    execute_query "$query" "$DB_NAME"
    echo "Fixed database charset and collation"
}

# Function to fix tables charset and collation
fix_tables_charset() {
    local query="SELECT CONCAT('ALTER TABLE \`',  table_name, '\` CHARACTER SET $CHARSET COLLATE $COLLATION;') \
    FROM information_schema.TABLES AS T, information_schema.\`COLLATION_CHARACTER_SET_APPLICABILITY\` AS C \
    WHERE C.collation_name = T.table_collation \
    AND T.table_schema = '$DB_NAME' \
    AND (C.CHARACTER_SET_NAME != '$CHARSET' OR C.COLLATION_NAME != '$COLLATION');"

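    local alter_table_queries alter_table_queries_no_foreign_key_checks
    # Assign separately from "local" so a failing query is not masked under errexit
    alter_table_queries=$(execute_query "$query" "$DB_NAME")
    alter_table_queries_no_foreign_key_checks=$(echo -e "SET FOREIGN_KEY_CHECKS=0;\n$alter_table_queries\nSET FOREIGN_KEY_CHECKS=1;")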
    # echo "$alter_table_queries_no_foreign_key_checks"
    echo "Fixing tables charset and collation"
    execute_query "$alter_table_queries_no_foreign_key_checks" "$DB_NAME"
    echo "Fixed tables charset and collation"
}

# Function to fix column collation for varchar columns
fix_varchar_columns_collation() {
    local query="SELECT CONCAT('ALTER TABLE \`', table_name, '\` MODIFY \`', column_name, '\` ', DATA_TYPE, \
    '(', CHARACTER_MAXIMUM_LENGTH, ') CHARACTER SET $CHARSET COLLATE $COLLATION', \
    (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';') \
    FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = '$DB_NAME' AND DATA_TYPE = 'varchar' AND \
    ( CHARACTER_SET_NAME != '$CHARSET' OR COLLATION_NAME != '$COLLATION');"

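    local alter_table_queries alter_table_queries_no_foreign_key_checks
    # Assign separately from "local" so a failing query is not masked under errexit
    alter_table_queries=$(execute_query "$query" "$DB_NAME")
    alter_table_queries_no_foreign_key_checks=$(echo -e "SET FOREIGN_KEY_CHECKS=0;\n$alter_table_queries\nSET FOREIGN_KEY_CHECKS=1;")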
    # echo "$alter_table_queries_no_foreign_key_checks"
    echo "Fixing column collation for varchar columns"
    execute_query "$alter_table_queries_no_foreign_key_checks" "$DB_NAME"
    echo "Fixed column collation for varchar columns"
}

# Function to fix column collation for non-varchar columns
fix_non_varchar_columns_collation() {
    local query="SELECT CONCAT('ALTER TABLE \`', table_name, '\` MODIFY \`', column_name, '\` ', DATA_TYPE, ' \
    CHARACTER SET $CHARSET COLLATE $COLLATION', (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';') \
    FROM information_schema.COLUMNS \
    WHERE TABLE_SCHEMA = '$DB_NAME' \
    AND DATA_TYPE != 'varchar' \
    AND (CHARACTER_SET_NAME != '$CHARSET' OR COLLATION_NAME != '$COLLATION');"

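    local alter_table_queries alter_table_queries_no_foreign_key_checks
    # Assign separately from "local" so a failing query is not masked under errexit
    alter_table_queries=$(execute_query "$query" "$DB_NAME")
    alter_table_queries_no_foreign_key_checks=$(echo -e "SET FOREIGN_KEY_CHECKS=0;\n$alter_table_queries\nSET FOREIGN_KEY_CHECKS=1;")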
    # echo "$alter_table_queries_no_foreign_key_checks"
    echo "Fixing column collation for non-varchar columns"
    execute_query "$alter_table_queries_no_foreign_key_checks" "$DB_NAME"
    echo "Fixed column collation for non-varchar columns"
}

# Loop through each database in the array
for DB_NAME in "${DATABASES[@]}"; do
    echo "Processing database: $DB_NAME"
    fix_database_charset
    fix_tables_charset
    fix_varchar_columns_collation
    fix_non_varchar_columns_collation
    echo "Migration completed for $DB_NAME"
done

# Unset the MYSQL_PWD environment variable after executing the queries
unset MYSQL_PWD
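
The reference script passes the password through the MYSQL_PWD environment variable, which the MySQL documentation describes as insecure and which is deprecated in MySQL 8.0. A possible alternative, sketched below, is to write the credentials to a temporary client option file and pass it with --defaults-extra-file. write_mysql_option_file is a hypothetical helper and is not part of the script above.

```shell
# Hypothetical helper: write a MySQL client option file readable only by
# the current user and print its path.
write_mysql_option_file() {
    local user="$1"
    local password="$2"
    local file
    file=$(mktemp)
    # Restrict permissions before the credentials are written.
    chmod 600 "$file"
    printf '[client]\nuser=%s\npassword=%s\n' "$user" "$password" > "$file"
    echo "$file"
}

# Usage: --defaults-extra-file must be the first option passed to mysql.
#   opts=$(write_mysql_option_file root 'THE_PASSWORD')
#   mysql --defaults-extra-file="$opts" -N -B -e 'SELECT 1;' MCP
#   rm -f "$opts"
```

Passwords containing characters such as # or quotes may need quoting inside the option file, so test the connection before relying on it.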