
metadata-support - Re: [Metadata-Support] Extending Metadata Query Protocol

Subject: InCommon metadata support

  • From: Ian Young <>
  • To:
  • Subject: Re: [Metadata-Support] Extending Metadata Query Protocol
  • Date: Mon, 23 Mar 2015 15:23:30 +0000


> On 18 Mar 2015, at 13:25, Jaime Perez Crespo wrote:
>
> Recently, we’ve incorporated MQP support into SimpleSAMLphp, in order to
> dynamically retrieve metadata from a trusted MDX server, both for identity
> and service providers. This is a valuable addition in my opinion, and we
> are looking forward to improving it as much as possible.

That's great news. I look forward to any feedback you might have.

> One of the issues that we’ve observed is that using the Metadata Query
> Protocol to fetch metadata for previously unknown entities introduces a
> (potentially big) delay when serving the request that originated the query
> to the MDX server.

I'm not sure where a potentially big delay would come in (or what you mean by
"big", exactly). Obviously it depends a lot on the implementation, but the
most likely thing to extend the query time would be the signature step and
those just don't take that long these days (or we would complain more often
about SAML, which requires the same).

> While this could be acceptable in certain circumstances, we would like to
> be able to avoid this delay as much as possible by periodically prefetching
> metadata from the MDX server. For that to be possible, I can imagine the
> following requisites:
>
> * To be able to query the MDX server for a list of all the entities served.

I don't see why having a list of the entities helps you in any way. Either
you then go on to pulling all of the entities (in which case, as Scott points
out, you might as well have just pulled an aggregate) or you don't (in which
case you still have the same delay when you query for each one).

> When I say “list of entities”, I mean a list of identifiers used by the MDX
> implementation that can be used to request the metadata of a particular
> entity (i.e. the entityID or its SHA-1).

As Tom says, we're currently adding something like this as an experimental
extension to my *implementation* of the specification. What we're
experimenting with would be more detailed than just a list of the entities:
more like a discovery feed, in fact. In the longer term, that's something
that would be more likely dealt with using content negotiation than a
protocol evolution.

"Just" a list of entities is something that might make sense as a protocol
evolution if there's a real use case. I'm not convinced that this is one, so
far.

> * To be able to query the MDX server for a list of all the entities
> *modified since* a specific date. This would allow us to query the server
> later only for those entities that have been modified since the last
> request.

I don't think that's likely to be a workable addition to the protocol. One of
the things I've tried to preserve is the possibility of an entirely static
implementation of MDQ, and what you're describing here would require a
dynamic web service.

Anything which requires real-time (relative to a query) dynamic assembly of
an aggregate would fall into the same category. Or put another way, if
samlbits can't cache it, it's problematic.

> I understand the first one could be easily disregarded by using the MDX as
> a standard metadata feed, that is, fetching the whole metadata set it
> serves, processing, caching, and then proceeding onwards by leveraging the
> second one. However, I see benefits on being able to iteratively retrieve
> entities instead of a huge feed, like better performance and availability
> of entities. In any case, both features would be interesting to make the
> Metadata Query Protocol even more useful for big deployments, I think

One option you might like to consider would be caching individual results
you get from MDQ and then re-querying for individual entities before their
cacheDuration / validUntil have expired, using conditional GET based on the
ETag. If you do this on a background thread for entities which have been used
since last fetched, you can hide the query latency for all entities that are
in frequent use (or even occasional use, depending on cacheDuration).

[Full disclosure: my implementation doesn't do conditional GET correctly yet,
but it's on the list:
https://github.com/iay/mdq-server/issues/7
]
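
A minimal sketch of the conditional GET step, assuming a Python client built
on the standard library. The function name conditional_get and the opener
parameter are illustrative only; they are not part of MDQ or of any
particular implementation. urllib reports 304 Not Modified as an HTTPError,
so that case is caught and treated as "cached copy still current".

```python
import urllib.error
import urllib.request


def conditional_get(url, etag=None, opener=urllib.request.urlopen):
    """Fetch url, sending If-None-Match when an ETag is already known.

    Returns (body, etag, changed). On 304 Not Modified, body is None and
    the previously known etag is returned unchanged.
    """
    request = urllib.request.Request(url)
    if etag:
        # Ask the server to skip the body if our cached copy is current.
        request.add_header("If-None-Match", etag)
    try:
        with opener(request) as response:
            return response.read(), response.headers.get("ETag"), True
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: keep using the cached metadata.
            return None, etag, False
        raise
```

The opener argument exists only so the function can be exercised without a
network connection; a real client would normally leave it at its default.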

So, for example, if you fetch an entity's metadata and it has 6 hours to
live, you might check hourly whether (a) that entity has been used since the
last fetch and (b) fewer than 3 hours remain before cacheDuration expiry, and
re-query in the background when both hold.
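
The hourly check in that example could be sketched roughly as follows. The
function name should_refresh, its parameters, and the 3-hour default are
assumptions taken from the example figures above, not anything defined by the
protocol.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed threshold, taken from the "3 hours" figure in the example.
REFRESH_WINDOW = timedelta(hours=3)


def should_refresh(
    fetched_at: datetime,
    last_used_at: Optional[datetime],
    cache_expires_at: datetime,
    now: datetime,
    window: timedelta = REFRESH_WINDOW,
) -> bool:
    """Return True when a background re-query is worthwhile.

    (a) the entity has been used since it was last fetched, and
    (b) less than `window` remains before cacheDuration expiry.
    """
    used_since_fetch = last_used_at is not None and last_used_at > fetched_at
    near_expiry = (cache_expires_at - now) < window
    return used_since_fetch and near_expiry
```

Entities that nobody has asked for since the last fetch are simply left to
expire, so the background work stays proportional to actual use.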

Of course you don't have to do anything quite that complex, as cacheDuration
expiry does not mean that the metadata is invalid. So another alternative
would be to re-query once cacheDuration has expired, but do that on a
background thread unless validUntil has also expired and make use of the
previously fetched results meanwhile. You'd still hide the latency most of
the time in this case.
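
A rough sketch of that simpler alternative, reduced to the decision a client
makes on each lookup. The Action enum and choose_action are hypothetical names
used only for illustration; how the background re-query is actually scheduled
is left out.

```python
from datetime import datetime
from enum import Enum
from typing import Optional


class Action(Enum):
    SERVE_CACHED = "serve cached copy"
    SERVE_CACHED_AND_REFRESH = "serve cached copy, re-query in background"
    FETCH_BLOCKING = "re-query before answering"


def choose_action(
    now: datetime,
    cache_expires_at: datetime,
    valid_until: Optional[datetime],
) -> Action:
    """Decide how to answer a lookup for an already cached entity."""
    if valid_until is not None and now >= valid_until:
        # The metadata is no longer valid; the caller has to wait.
        return Action.FETCH_BLOCKING
    if now >= cache_expires_at:
        # cacheDuration has expired but the metadata is still valid:
        # keep using it while a background re-query runs.
        return Action.SERVE_CACHED_AND_REFRESH
    return Action.SERVE_CACHED
```

Only the FETCH_BLOCKING case exposes the query latency to the request being
served.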

-- Ian







