Archivematica 1.8 is our latest release.

Unzipped and zipped bags

Archivematica supports the ingest of materials packaged in accordance with the Library of Congress BagIt specification. Users can ingest both zipped and unzipped bags by using the appropriate transfer type.

On this page:

Unzipped bags

Unzipped bags can be created by hand or by using a BagIt tool like BagIt-python or Bagger. The bag must comply with the BagIt specification.

To ingest an unzipped bag, select Unzipped bag from the transfer type dropdown menu in the Transfer tab and then select your material from the transfer browser.

Unzipped bag transfer in dashboard

The screenshot above shows a simple bag containing three digital objects to be preserved (LICENSE, README, and TRADEMARK) as well as the accompanying files required by the BagIt specification (bag-info.txt, bagit.txt, and a manifest file, in this case for sha512 checksums.) Note that the digital objects to be preserved are within a subdirectory called data.

For more information on processing your transfer, see process transfer on the Transfer page.

Zipped bags

Zipped bags can be created by hand or by using a BagIt tool like BagIt-python or Bagger. The bag must comply with the BagIt specification.

To ingest a zipped bag, select the transfer type Zipped bag from the dropdown menu in the transfer tab of the Dashboard. When you open the transfer browser, you will notice that only materials that use the compression formats .zip, .tgz, or tar.gz can be selected for transfer. These are the only compressed formats that Archivematica accepts for zipped bag transfers.

Zipped bag transfer in dashboard

The bag itself should be structured internally like an unzipped bag, as shown above.

Note that zipped bag transfers always use the name of the bag as the transfer name.

For more information on processing your transfer, see process transfer on the Transfer page.

Adding descriptive/rights metadata and submission documentation to bags

Similar to standard transfers, it is possible to add descriptive and rights metadata to unzipped and zipped bag transfers. We recommend creating the bag first, then adding the metadata directory and the metadata.csv and/or rights.csv file manually. You can also add submission documentation to the bag by adding a submissionDocumentation directory inside the metadata directory.

Image shows a directory containing a metadata folder with CSV files and submission documentation inside

Even though the metadata directory, submissionDocumentation directory, and contents are not represented in the BagIt files (i.e. the checksum manifests), Archivematica will ignore the metadata directory during Job: Verify bag, and restructure for compliance so that the bag is verified successfully. Archivematica will then parse the metadata as usual. Note that the filename paths in the CSV files must reflect the structure of the bag.

For more information about adding metadata and submission documentation to transfers in Archivematica, please see Preparing digital objects for transfer.

Index and search bag metadata

In Archivematica 1.4 and higher, fields in the bag-info.txt file are indexed as source metadata in Elasticsearch, making their contents searchable in the Archival Storage tab after the bag transfer has been stored.

Labels in the bag-info.txt file are serialized as XML in the METS sourceMD field and linked to the objects directory of the AIP.

For example, the bag-info.txt might include the following information (sample provided via https://tools.ietf.org/html/draft-kunze-bagit-10).

Source-Organization: Spengler University
Organization-Address: 1400 Elm St., Cupertino, California, 95014
Contact-Name: Edna Janssen
Contact-Phone: +1 408-555-1212
Contact-Email: ej@spengler.edu
External-Description: Uncompressed greyscale TIFF images from the Yoshimuri papers colle...
Bagging-Date: 2008-01-15
External-Identifier: spengler_yoshimuri_001
Bag-Size: 260 GB
Payload-Oxum: 279164409832.1198
Bag-Group-Identifier: spengler_yoshimuri
Bag-Count: 1 of 15
Internal-Sender-Identifier: /storage/images/yoshimuri
Internal-Sender-Description: Uncompressed greyscale TIFFs created from microfilm and are...</pre>

When preserved in the resulting AIP’s METS XML file, the above information is represented like so:

<mets:amdSec ID="amdSec_14">
  <mets:sourceMD ID="sourceMD_1">
    <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="BagIt">
      <mets:xmlData>
        <transfer_metadata>
          <Source-Organization>Spengler University</Source-Organization>
          <Organization-Address>1400 Elm St., Cupertino, California, 95014</Organization-Address>
          <Contact-Name>Edna Janssen</Contact-Name>
          <Contact-Phone>+1 408-555-1212</Contact-Phone>
          <Contact-Email>ej@spengler.edu</Contact-Email>
          <External-Description> Uncompressed greyscale TIFF images from the Yoshimuri papers colle...</External-Description>
          <Bagging-Date>2008-01-15</Bagging-Date>
          <External-Identifier>spengler_yoshimuri_001</External-Identifier>
          <Bag-Size>260 GB</Bag-Size>
          <Payload-Oxum>279164409832.1198</Payload-Oxum>
          <Bag-Group-Identifier>spengler_yoshimuri</Bag-Group-Identifier>
          <Bag-Count>1 of 15</Bag-Count>
          <Internal-Sender-Identifier>/storage/images/yoshimuri</Internal-Sender-Identifier>
          <Internal-Sender-Description>Uncompressed greyscale TIFFs created from microfilm and are...</Internal-Sender-Description>
        </transfer_metadata>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:sourceMD>
</mets:amdSec>

Note

In order to be parsed into the METS file, bag-info.txt labels (i.e. Source-Organization) must be compliant with XML so they cannot contain spaces or forbidden characters.

The metadata contained within the <transfer_metadata> tags can now be used for searching on the Archival Storage tab.

Searching for any of the terms (i.e. Spengler University) in the bag-info.txt using the search parameter Any should display stored packages that includes the search term in any field (or in the AIP name, etc. as per Searching the AIP store).

The image shows a search carried out using the term "Spengler University" with the search parameter set to "Any" and the search type set to "Keyword"

In the above example, the AIP coyote contained the search phrase in the descriptive metadata, rather than bag-info.txt. The other two AIPs contained the search phrase in bag-info.txt.

You can narrow the search results to just search the metadata that comes from bag-info.txt by selecting Transfer metadata as the search parameter. This will search for anything within the <transfer_metadata> tags in the METS file.

The image shows a search carried out using the term "Spengler University" with the search parameter set to "Transfer metadata" and the search type set to "Keyword"

You can narrow the search results even further by using the Transfer metadata (other) search parameter, which allows you to define the specific sub-field within the <transfer_metadata> that you want to search. For example, you may want to search for AIPs where the search phrase “Spengler University” is present in the Source-Organization field, but not other fields.

The image shows a search carried out using the term "Spengler University" with the search parameter set to "Transfer metadata (other)", the field name set to "Source-Organization", and the search type set to "Keyword"

To search on a date range in <transfer_metadata> or one if its sub-fields, the user enters two dates in ISO date format separated by a colon. For example, 2015-01-03:2015-04-14.

Back to the top.