file locations: distinguish between EOSPUBLIC and OPENDATA URIs #115

@tiborsimko

Current behaviour

The client currently exposes EOSPUBLIC locations of files, for example:

$ cernopendata-client get-file-locations --recid 5000          
http://opendata.cern.ch/eos/opendata/cms/software/2011-doubleelectron-doublemu-mueg-ttbar/2011-doubleelectron-doublemu-mueg-ttbar-1.0.0.tar.gz

This file also exists attached to the record as /record/NNN/files/FILE.EXTENSION, which would give:

http://opendata.cern.ch/record/5000/files/2011-doubleelectron-doublemu-mueg-ttbar-1.0.0.tar.gz

What is the difference? In the first case, the file is served from OPENDATA via a reverse HTTP proxy to EOSPUBLIC (and is not cached). In the second case, the file is served from OPENDATA via an XRootD proxy to EOSPUBLIC (and is cached if it is sufficiently small).
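To make the relation between the two URI forms concrete, here is a minimal Python sketch that derives the "record" form from the "eos" form for a file attached directly to a record. The function name eos_to_record_uri and the SERVER constant are hypothetical and not part of the client. The sketch assumes that the file name alone identifies the file within the record, as in the record 5000 example above.

```python
from posixpath import basename
from urllib.parse import urlsplit

# Assumption: the portal host shown in the examples of this issue.
SERVER = "http://opendata.cern.ch"


def eos_to_record_uri(eos_uri, recid, server=SERVER):
    """Build a "record"-style URI from an "eos"-style URI (hypothetical helper).

    Only meaningful for files attached directly to the record.
    """
    path = urlsplit(eos_uri).path
    if not path.startswith("/eos/opendata/"):
        raise ValueError(f"not an EOSPUBLIC-style URI: {eos_uri}")
    # /record/NNN/files/FILE.EXTENSION, where FILE.EXTENSION is the last path segment
    return f"{server}/record/{recid}/files/{basename(path)}"
```

Calling eos_to_record_uri with the record 5000 tarball URI and recid 5000 yields the /record/5000/files/... URI shown above.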

Due to several issues with the EOSPUBLIC reverse proxy, in PR #113 we have introduced file index lookups from the latter URIs, while still exposing the former URIs.

Expected behaviour

It would be good to consistently expose both kinds of URIs and allow the user to choose between them with a command-line switch.

Example: we can introduce a new command-line option --uri-style having two values, "eos" and "record":

$ cernopendata-client get-file-locations --recid 5000 --uri-style=eos
http://opendata.cern.ch/eos/opendata/cms/software/2011-doubleelectron-doublemu-mueg-ttbar/2011-doubleelectron-doublemu-mueg-ttbar-1.0.0.tar.gz
$ cernopendata-client get-file-locations --recid 5000 --uri-style record
http://opendata.cern.ch/record/5000/files/2011-doubleelectron-doublemu-mueg-ttbar-1.0.0.tar.gz

The default value could be "eos" to keep the old behaviour, but we could switch to "record" if that style proves more stable.
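The proposed option could be declared roughly as follows. This is only a sketch using Python's standard argparse module so that it runs on its own; the client's real command-line framework and internal names may differ, and build_parser is a hypothetical helper.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the proposed get-file-locations options."""
    parser = argparse.ArgumentParser(prog="cernopendata-client")
    commands = parser.add_subparsers(dest="command", required=True)
    locations = commands.add_parser("get-file-locations")
    locations.add_argument("--recid", type=int, required=True)
    locations.add_argument(
        "--uri-style",
        choices=["eos", "record"],
        default="eos",  # keeps the old behaviour unless the user opts in
        help="URI style to use for file locations",
    )
    return parser


args = build_parser().parse_args(
    ["get-file-locations", "--recid", "5000", "--uri-style", "record"]
)
print(args.uri_style)
```

Both spellings used in the examples above, --uri-style=eos and --uri-style record, are accepted by such a declaration.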

Things to be aware of:

  • The new option --uri-style would be used for everything, i.e. for exposing URI locations, for downloading files, etc.
  • All files attached to records are usually accessible under the "record" URI style, with the exception of files behind file indexes (see next point).
  • The files behind file indexes, such as for record ID 1, are a special case. The file index files themselves (example: http://opendata.cern.ch/eos/opendata/cms/Run2010B/BTau/AOD/Apr21ReReco-v1/file-indexes/CMS_Run2010B_BTau_AOD_Apr21ReReco-v1_0000_file_index.json) are also accessible under the "record" URI style, but the data files (example: http://opendata.cern.ch/eos/opendata/cms/Run2010B/BTau/AOD/Apr21ReReco-v1/0005/FE3F8388-E471-E011-9377-00E08179189B.root) are only accessible under the "eos" URI style. Hence special care will have to be taken regarding the difference between cernopendata-client get-file-locations --recid 1 --no-expand and cernopendata-client get-file-locations --recid 1. Namely, in the "expand" use case, the "eos" URI style is forced, while in the "no-expand" use case, people could use either the "eos" style or the "record" style.
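The rule from the last point could be sketched as a small helper that decides which style is actually used for a given file. The name effective_uri_style and its parameters are hypothetical. Whether the client should silently fall back to "eos" or report an error when "record" is requested for expanded data files is a design choice this issue leaves open; the sketch below falls back silently.

```python
URI_STYLES = ("eos", "record")


def effective_uri_style(requested, from_file_index_expansion):
    """Return the URI style to use for one file (hypothetical helper).

    requested: the value given via --uri-style.
    from_file_index_expansion: True when the file is a data file obtained by
        expanding a file index (the default "expand" behaviour for records
        such as record ID 1).
    """
    if requested not in URI_STYLES:
        raise ValueError(f"unknown URI style: {requested}")
    if from_file_index_expansion:
        # Data files behind file indexes are only accessible in "eos" style.
        return "eos"
    # Directly attached files, including the file index files themselves
    # in the "no-expand" case, can be exposed in either style.
    return requested
```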
