Output from tests/test_demo_using_bagit_bags.py.
Imagine that we have a BagIt bag tests/testdata/bags/uaa_v1 that represents the initial state of an object. We can use --create to make a new OCFL object tmp/obj with that content as the v1 state:
> python ocfl-object.py --create --objdir tmp/obj --srcbag tests/testdata/bags/uaa_v1 -v
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v1/data/my_content/dracula.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v1/data/my_content/poe.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v1/bagit.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v1/bag-info.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v1/manifest-sha512.txt
INFO:ocfl.object:Created OCFL object info:bb123cd4567 in tmp/obj
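The "Verifying checksum" lines above come from the bag being validated before its content is used. The following is a minimal sketch of that kind of check, not the code used by bagit or ocfl-py. It assumes the usual BagIt manifest layout of one "digest path" pair per line; the function name verify_manifest and its parameters are made up for illustration.

```python
import hashlib
from pathlib import Path


def verify_manifest(bag_dir, manifest_name="manifest-sha512.txt", algorithm="sha512"):
    """Return the paths listed in a manifest whose digest does not match.

    Hypothetical helper: assumes each manifest line has the form
    "<hex digest> <path relative to the bag root>".
    """
    bag = Path(bag_dir)
    mismatches = []
    manifest = bag / manifest_name
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel_path = line.split(maxsplit=1)
        # Hash the file on disk and compare with the recorded digest
        actual = hashlib.new(algorithm, (bag / rel_path).read_bytes()).hexdigest()
        if actual != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

Calling verify_manifest("tests/testdata/bags/uaa_v1") would check the payload files only. Passing manifest_name="tagmanifest-sha512.txt" would cover tag files such as bag-info.txt, assuming the bag includes that file. A file that is listed but missing raises FileNotFoundError in this sketch.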
Now that we have the object, we can confirm that it is valid.
> python ocfl-validate.py tmp/obj
INFO:ocfl.object:OCFL v1.1 Object at tmp/obj is VALID
Looking inside the object, we see v1 with the expected two content files.
> python ocfl-object.py --show --objdir tmp/obj
WARNING:ocfl.object:OCFL v1.1 Object at tmp/obj has VALID STRUCTURE (DIGESTS NOT CHECKED)
WARNING:ocfl.object:Object tree
[tmp/obj]
├── 0=ocfl_object_1.1
├── inventory.json
├── inventory.json.sha512
└── v1
    ├── content (2 files)
    ├── inventory.json
    └── inventory.json.sha512
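The --show output warns that digests were not checked. As a rough illustration of one such check, the sketch below compares inventory.json with its inventory.json.sha512 sidecar. It assumes the sidecar holds the hex digest followed by whitespace and the file name, as described in the OCFL specification; sidecar_matches is a hypothetical helper and not part of ocfl-py.

```python
import hashlib
from pathlib import Path


def sidecar_matches(directory, algorithm="sha512"):
    """Check inventory.json against its digest sidecar in one directory.

    Hypothetical helper: assumes the sidecar content is
    "<hex digest> inventory.json".
    """
    directory = Path(directory)
    sidecar = directory / f"inventory.json.{algorithm}"
    # The first whitespace-separated token is the recorded digest
    recorded = sidecar.read_text(encoding="utf-8").split()[0]
    inventory_bytes = (directory / "inventory.json").read_bytes()
    actual = hashlib.new(algorithm, inventory_bytes).hexdigest()
    return actual == recorded.lower()
```

For the object above, sidecar_matches("tmp/obj") and sidecar_matches("tmp/obj/v1") would each check one inventory. Full validation, as done by ocfl-validate.py, covers much more than this single comparison.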
If we have a bag tests/testdata/bags/uaa_v2 with updated content, we can --update the object to create v2.
> python ocfl-object.py --update --objdir tmp/obj --srcbag tests/testdata/bags/uaa_v2 -v
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v2/data/my_content/a_second_copy_of_dracula.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v2/data/my_content/dracula.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v2/data/my_content/poe-nevermore.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v2/data/my_content/another_directory/a_third_copy_of_dracula.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v2/bagit.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v2/bag-info.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v2/manifest-sha512.txt
INFO:ocfl.object:Will update info:bb123cd4567 v1 -> v2
INFO:ocfl.object:Updated OCFL object info:bb123cd4567 in tmp/obj by adding v2
Looking inside the object, we now see v1 and v2. There are no content files inside v2 because, although this update added two files, they have the same content (and hence the same digest) as one of the files in v1.
> python ocfl-object.py --show --objdir tmp/obj
WARNING:ocfl.object:OCFL v1.1 Object at tmp/obj has VALID STRUCTURE (DIGESTS NOT CHECKED)
WARNING:ocfl.object:Object tree
[tmp/obj]
├── 0=ocfl_object_1.1
├── inventory.json
├── inventory.json.sha512
├── v1
│   ├── content (2 files)
│   ├── inventory.json
│   └── inventory.json.sha512
└── v2
    ├── inventory.json
    └── inventory.json.sha512
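The empty v2 follows from storing content by digest: a file whose digest is already known to the object does not need a new copy. The sketch below illustrates the idea only. It is not the ocfl-py implementation, and the names file_digest and files_needing_storage are invented for this example.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the sha512 hex digest of a file's bytes."""
    return hashlib.sha512(Path(path).read_bytes()).hexdigest()


def files_needing_storage(known_digests, new_state_dir):
    """Return relative paths under new_state_dir whose content is not yet stored.

    known_digests is any iterable of digests already held by earlier versions.
    """
    root = Path(new_state_dir)
    seen = set(known_digests)
    to_store = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = file_digest(path)
        if digest not in seen:
            # First time this content appears: it needs one stored copy
            seen.add(digest)
            to_store.append(path.relative_to(root).as_posix())
    return to_store
```

If every file in the new state has a digest that already appears in an earlier version, the function returns an empty list, which corresponds to a version directory with no content files.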
Similarly, we can --update with tests/testdata/bags/uaa_v3 to create v3.
> python ocfl-object.py --update --objdir tmp/obj --srcbag tests/testdata/bags/uaa_v3 -v
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v3/data/my_content/dracula.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v3/data/my_content/poe-nevermore.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v3/data/my_content/another_directory/a_third_copy_of_dracula.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v3/bagit.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v3/bag-info.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v3/manifest-sha512.txt
INFO:ocfl.object:Will update info:bb123cd4567 v2 -> v3
INFO:ocfl.object:Updated OCFL object info:bb123cd4567 in tmp/obj by adding v3
Looking inside again, we see that v3 does add another content file.
> python ocfl-object.py --show --objdir tmp/obj
WARNING:ocfl.object:OCFL v1.1 Object at tmp/obj has VALID STRUCTURE (DIGESTS NOT CHECKED)
WARNING:ocfl.object:Object tree
[tmp/obj]
├── 0=ocfl_object_1.1
├── inventory.json
├── inventory.json.sha512
├── v1
│   ├── content (2 files)
│   ├── inventory.json
│   └── inventory.json.sha512
├── v2
│   ├── inventory.json
│   └── inventory.json.sha512
└── v3
    ├── content (1 files)
    ├── inventory.json
    └── inventory.json.sha512
Finally, we can --update again with tests/testdata/bags/uaa_v4 to create v4.
> python ocfl-object.py --update --objdir tmp/obj --srcbag tests/testdata/bags/uaa_v4 -v
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v4/data/my_content/dracula.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v4/data/my_content/dunwich.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v4/data/my_content/poe-nevermore.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v4/data/my_content/another_directory/a_third_copy_of_dracula.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v4/bagit.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v4/bag-info.txt
INFO:bagit:Verifying checksum for file /home/runner/work/ocfl-py/ocfl-py/tests/testdata/bags/uaa_v4/manifest-sha512.txt
INFO:ocfl.object:Will update info:bb123cd4567 v3 -> v4
INFO:ocfl.object:Updated OCFL object info:bb123cd4567 in tmp/obj by adding v4
Taking the newly created OCFL object tmp/obj, we can --extract the v4 content as a BagIt bag.
> python ocfl-object.py --extract v4 --objdir tmp/obj --dstbag tmp/extracted_v4 -v
INFO:ocfl.object:Extracted v4 into tmp/extracted_v4
INFO:bagit:Creating bag for directory tmp/extracted_v4
INFO:bagit:Creating data directory
INFO:bagit:Moving my_content to tmp/extracted_v4/tmpq_2ho4cy/my_content
INFO:bagit:Moving tmp/extracted_v4/tmpq_2ho4cy to data
INFO:bagit:Using 1 processes to generate manifests: sha512
INFO:bagit:Generating manifest lines for file data/my_content/dracula.txt
INFO:bagit:Generating manifest lines for file data/my_content/dunwich.txt
INFO:bagit:Generating manifest lines for file data/my_content/poe-nevermore.txt
INFO:bagit:Generating manifest lines for file data/my_content/another_directory/a_third_copy_of_dracula.txt
INFO:bagit:Creating bagit.txt
INFO:bagit:Creating bag-info.txt
INFO:bagit:Creating tmp/extracted_v4/tagmanifest-sha512.txt
Extracted content for v4 saved as Bagit bag in tmp/extracted_v4
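Extraction rebuilds a version's full state even though most of its files are stored under earlier versions. The sketch below shows how that lookup might work, assuming an OCFL-style inventory with a manifest (digest to content paths) and a per-version state (digest to logical paths). The function resolve_state is hypothetical, and the inventory is a heavily shortened, made-up example rather than the real inventory of tmp/obj.

```python
def resolve_state(inventory, version):
    """Map each logical path in a version to a content path that stores it."""
    manifest = inventory["manifest"]
    state = inventory["versions"][version]["state"]
    resolved = {}
    for digest, logical_paths in state.items():
        # Every content path listed for a digest holds the same bytes,
        # so the first one is as good as any other
        content_path = manifest[digest][0]
        for logical_path in logical_paths:
            resolved[logical_path] = content_path
    return resolved


# Hypothetical inventory fragment: the digests and content paths are invented
inventory = {
    "manifest": {
        "aaa111": ["v1/content/my_content/dracula.txt"],
        "bbb222": ["v4/content/my_content/dunwich.txt"],
    },
    "versions": {
        "v4": {
            "state": {
                "aaa111": [
                    "my_content/dracula.txt",
                    "my_content/another_directory/a_third_copy_of_dracula.txt",
                ],
                "bbb222": ["my_content/dunwich.txt"],
            }
        }
    },
}

for logical, content in sorted(resolve_state(inventory, "v4").items()):
    print(f"{logical} <- {content}")
```

In this made-up fragment, two logical paths resolve to the same stored file, which is how a version can present more files than it physically holds.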
Note that the OCFL object has only one content file in v4, but the extracted object state for v4 includes 4 files, two of which have identical content (dracula.txt and another_directory/a_third_copy_of_dracula.txt). We can now compare the extracted bag tmp/extracted_v4 with the bag we used to create v4, tests/testdata/bags/uaa_v4, using a recursive diff.
> diff -r tmp/extracted_v4 tests/testdata/bags/uaa_v4
diff -r tmp/extracted_v4/bag-info.txt tests/testdata/bags/uaa_v4/bag-info.txt
1,2c1
< Bag-Software-Agent: bagit.py v1.8.1 <https://github.com/LibraryOfCongress/bagit-python>
< Bagging-Date: 2023-03-08
---
> Bagging-Date: 2020-01-04
diff -r tmp/extracted_v4/tagmanifest-sha512.txt tests/testdata/bags/uaa_v4/tagmanifest-sha512.txt
2c2
< fb20d9575bd2a36d589c32af7118552b96b2ca8500d8a90494be8f4354c31909a70a0b2792c9b89652f59725895b5326a57e423e2065d5edfe380f47a9c82c40 bag-info.txt
---
> 10624e6d45462def7af66d1a0d977606c7b073b01809c1d42258cfab5c34a275480943cbe78044416aee1f23822cc3762f92247b8f39b5c6ddc5ae32a8f94ce5 bag-info.txt
(last command exited with return code 1)
The only differences are in the bag-info.txt file and the checksum file for that file (tagmanifest-sha512.txt). The content matches.
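The same comparison can be made from Python when diff is not available. This is a minimal sketch using only the standard library; differing_files is a name chosen for this example, and it reports which files differ rather than the line-level changes that diff prints.

```python
import filecmp
from pathlib import Path


def differing_files(left_dir, right_dir):
    """Return relative paths that are missing on one side or differ in bytes."""
    left, right = Path(left_dir), Path(right_dir)

    def relative_files(root):
        return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}

    left_files, right_files = relative_files(left), relative_files(right)
    # Files present in only one of the two trees
    differences = list(left_files ^ right_files)
    for rel in left_files & right_files:
        # shallow=False forces a byte-by-byte comparison
        if not filecmp.cmp(left / rel, right / rel, shallow=False):
            differences.append(rel)
    return sorted(differences)
```

Given the diff output above, differing_files("tmp/extracted_v4", "tests/testdata/bags/uaa_v4") should list only bag-info.txt and tagmanifest-sha512.txt.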