#18: Allow embed=[] in get_bitstreams and embed format in response #20

kshepherd · 2024-06-11T10:53:56Z

@alanorth This is one way of fixing #18 -- I am interested to hear if you think it's the right approach.

We can ask the REST API to embed linked HAL resources in the parent object by adding ?embed=... to the call. This change lets a client write bitstreams = d.get_bitstreams(bundle=my_bundle, embeds=['format']) and then, from each resulting Bitstream object, get the format with format = BitstreamFormat(bitstream.embedded['format']), without any additional API calls.
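To make the mechanics concrete, here is a minimal sketch of what the embed request and the HAL response look like, using a hand-written sample payload rather than a live server. The helper name build_embed_params, the paging defaults, and the sample JSON values are assumptions for illustration only, not part of the library. It also assumes the server accepts one embed= parameter per requested link.

```python
from urllib.parse import urlencode


def build_embed_params(embeds=None, page=0, size=20):
    """Hypothetical helper: build a query string with one embed= entry per requested link."""
    params = [('page', page), ('size', size)]
    for embed in embeds or []:
        params.append(('embed', embed))
    return urlencode(params)


# Hand-written sample of a HAL bitstream resource with its format embedded.
# All values are invented for illustration.
sample_bitstream = {
    'uuid': '00000000-0000-0000-0000-000000000000',
    'name': 'example.pdf',
    '_embedded': {
        'format': {'shortDescription': 'Adobe PDF', 'mimetype': 'application/pdf'}
    },
}

query = build_embed_params(embeds=['format'])

# The embedded resource sits under _embedded, keyed by the link name that was requested
embedded = sample_bitstream.get('_embedded', {})
mimetype = embedded['format']['mimetype']

print(query)
print(mimetype)
```

The point is that the format arrives in the same response as the bitstream, so reading it is a dict lookup rather than a second request.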

The only thing I wasn't sure about is whether we should handle that in the client library instead of making the calling code do it. This way keeps things relatively simple in the library: the Bitstream model doesn't need to know anything, because it just copies all _embedded JSON into a dict. If we wanted to make that model 'smarter', we'd tell it what kinds of embeds to expect, and we could parse and instantiate the BitstreamFormat objects before returning them from get_bitstreams().
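For comparison, the two options could be sketched roughly as follows. The classes SketchBitstream and SketchBitstreamFormat, the EMBED_TYPES mapping, and the parsed attribute are hypothetical stand-ins written for this comment. They are not the models in this PR.

```python
class SketchBitstreamFormat:
    """Hypothetical stand-in for a format model; not the library's BitstreamFormat."""

    def __init__(self, api_resource):
        self.short_description = api_resource.get('shortDescription')
        self.mimetype = api_resource.get('mimetype')


class SketchBitstream:
    """Hypothetical stand-in for a bitstream model showing both approaches."""

    # The "knowledge" a smarter model would need: embed name -> class to instantiate
    EMBED_TYPES = {'format': SketchBitstreamFormat}

    def __init__(self, api_resource):
        self.name = api_resource.get('name')
        # Simple approach: copy all _embedded JSON into a dict
        self.embedded = dict(api_resource.get('_embedded', {}))
        # "Smart" alternative: also instantiate the embeds the model knows about
        self.parsed = {
            key: self.EMBED_TYPES[key](value)
            for key, value in self.embedded.items()
            if key in self.EMBED_TYPES and value is not None
        }


# Invented sample payload; 'thumbnail' is here only to show an embed the model does not know
resource = {
    'name': 'example.pdf',
    '_embedded': {
        'format': {'shortDescription': 'Adobe PDF', 'mimetype': 'application/pdf'},
        'thumbnail': None,
    },
}

bitstream = SketchBitstream(resource)
print(bitstream.embedded['format']['mimetype'])  # simple approach: raw dict
print(bitstream.parsed['format'].mimetype)       # smart approach: parsed object
```

The trade-off is visible in EMBED_TYPES: the smart variant has to be updated whenever a new embed type should be parsed, while the plain dict passes through whatever the server returns.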

Here is my example script to test it out:

import pprint

from dspace_rest_client.client import DSpaceClient
from dspace_rest_client.models import BitstreamFormat

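# Instantiate the client (assumption: endpoint and credentials come from the library defaults or environment)
d = DSpaceClient()

# Authenticate against the DSpace client
authenticated = d.authenticate()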
if not authenticated:
    print('Error logging in! Giving up.')
    exit(1)

print('\nExample of ORIGINAL bundle output with format embedded.\n')
# Get top communities
top_communities = d.get_communities(top=True)
for top_community in top_communities:
    # Get all collections in this community
    collections = d.get_collections(community=top_community)
    for collection in collections:
        # Get all items in this collection - see that the recommended method is a search, scoped to this collection
        # (there is no collection/items endpoint, though there is a /mappedItems endpoint, not yet implemented here)
        items = d.search_objects(query='*:*', scope=collection.uuid, dso_type='item')
        for item in items:
            print(f'{item.name} ({item.uuid})')
            # Get all bundles in this item
            bundles = d.get_bundles(parent=item)
            for bundle in bundles:
                if bundle.name == 'ORIGINAL':
                    print(f'\n\nBUNDLE {bundle.name} ({bundle.uuid})')
                    # Get all bitstreams in this bundle
                    bitstreams = d.get_bitstreams(bundle=bundle, embeds=['format'])
                    for bitstream in bitstreams:
                        print(f'{bitstream.name} ({bitstream.uuid})')
                        if 'format' in bitstream.embedded:
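                            # Wrap the embedded JSON in the new model (name avoids shadowing the built-in format())
                            bitstream_format = BitstreamFormat(bitstream.embedded['format'])
                            pprint.pp(bitstream_format.as_dict())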

kshepherd · 2024-06-11T11:31:39Z

(I should note, you can also just bypass the Python object and use the dict directly, like bitstream.embedded['format']['mimetype']. Given the different ways and contexts in which we use libs like this, it's hard to predict whether the "knowledge" of how the data looks should live in the client implementation or further up in the library.)
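If client code does read the dict directly, a small guard helps when the embed was not requested or is missing from the response. This is only a sketch: embedded_value is a hypothetical helper, not something in the library, and the sample dicts are invented.

```python
def embedded_value(embedded, *keys, default=None):
    """Hypothetical helper: walk nested keys in an embedded dict, returning default if any is missing."""
    current = embedded
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Invented samples: one bitstream fetched with the format embedded, one without
with_format = {'format': {'mimetype': 'application/pdf'}}
without_format = {}

print(embedded_value(with_format, 'format', 'mimetype'))
print(embedded_value(without_format, 'format', 'mimetype', default='unknown'))
```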

Commit 27b9444: the-library-code#18: Allow embed=[] in get_bitstreams and embed format in response

_embedded in HALResource; new BitstreamFormat model; get_bitstreams takes e.g. (embeds=['format'])

kshepherd self-assigned this Jun 11, 2024
