
per-entity - Re: [Per-Entity] Some thoughts about availability and scalability

Subject: Per-Entity Metadata Working Group

  • From: Chris Phillips <>
  • To: Per-Entity Metadata Working Group <>
  • Subject: Re: [Per-Entity] Some thoughts about availability and scalability
  • Date: Tue, 2 Aug 2016 15:03:56 +0000
  • Accept-language: en-US

Arriving late to the conversation due to holidays, so apologies if these points have already been covered.

Can someone weigh in on the qualities REQUIRED of an MDQ consumer, and on where on the wiki they are documented?
E.g. 'Consumer MUST be able to retrieve MDQ ack/nack of a record in X ms'
'Consumer MUST respect the validUntil timestamp and cache the MDQ response until then'

If we don't have these, we need them; otherwise our answers will be fragmented by differing perceptions of what the 'right' qualities are.


For me, the current perception of MDQ delivery is that its SLA is identical to that of the current metadata delivery infrastructure: four nines (99.99%), if not better.
This can be accomplished by any and all means. It's not untrodden territory either: it's a solved problem (HA web serving, caching, DNS infrastructure cleverness, etc.) with known risks and limitations, and so is profiling or operating such an infrastructure. Federation operators around the globe do this now, and MDQ 'answers' are not very different from serving up a metadata aggregate at all, are they?
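
For a sense of scale, a four-nines target leaves a little under an hour of downtime per year (about 52.6 minutes). A minimal sketch of that arithmetic follows. The function names are illustrative only, and the redundancy formula assumes replicas fail independently of one another, which real deployments only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Downtime budget implied by an availability target such as 0.9999."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def parallel_availability(single: float, replicas: int) -> float:
    """Availability of N redundant servers, any one of which can answer.

    Assumption: failures are independent. Shared DNS, networks, or
    signing infrastructure make the real figure lower than this.
    """
    return 1.0 - (1.0 - single) ** replicas


if __name__ == "__main__":
    print(downtime_minutes_per_year(0.9999))
    print(parallel_availability(0.99, 2))
```

Under the independence assumption, two servers that are each available 99% of the time reach roughly four nines together, which is one reason the HA web-serving techniques mentioned above are attractive.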

I very much agree that it's not a good idea to burden an IdP operator with caching and resiliency aspects and expect necessary robustness from different pieces/roles. The MDQ clients should 'just work' when turned on, just like metadata aggregates.

The roles I see that contribute to an HA MDQ architecture:
  • MDQ publisher (federation operator/fed ops service provider)
  • Some optional intermediary cache service (samlbits.org, Akamai)
  • The MDQ client (IdP/SP) (federation member)
If we can characterize the service resiliency requirements above according to role, I think that would be good (and I can contribute to that).
Note that I wouldn't wrap these requirements into the MDQ spec, since this is about how to OPERATE the protocol, which in turn drives requirements for implementing features/functionality in the clients to achieve the perception of HA; things like 'The SP should cache MDQ responses until validUntil expires', etc.
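
To make the 'cache until validUntil' behaviour concrete, here is a minimal client-side sketch. It is not taken from any existing MDQ client: the class name, the injected `fetch` callable, and the injectable clock are all hypothetical, and signature verification, HTTP cache headers, and `cacheDuration` handling are deliberately left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Callable, Dict, Optional, Tuple


def parse_valid_until(xml_text: str) -> Optional[datetime]:
    """Read the validUntil attribute from the root metadata element."""
    raw = ET.fromstring(xml_text).get("validUntil")
    if raw is None:
        return None
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: a timestamp without a zone is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


class ValidUntilCache:
    """Serve a cached MDQ response until its validUntil has passed."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
    ) -> None:
        self._fetch = fetch  # hypothetical: returns metadata XML for an entityID
        self._now = now      # injectable clock, which makes testing easier
        self._entries: Dict[str, Tuple[str, datetime]] = {}

    def get(self, entity_id: str) -> str:
        cached = self._entries.get(entity_id)
        if cached is not None and cached[1] > self._now():
            return cached[0]  # still valid, so no network round trip
        # Expired or never seen: go back to the MDQ service.
        # A fetch failure propagates; expired metadata is never served.
        xml_text = self._fetch(entity_id)
        valid_until = parse_valid_until(xml_text)
        if valid_until is not None:
            self._entries[entity_id] = (xml_text, valid_until)
        else:
            self._entries.pop(entity_id, None)  # nothing safe to cache
        return xml_text
```

With a cache like this in the client, a short outage of the publisher is invisible for any entity whose metadata was fetched recently, which is the 'perception of HA' referred to above.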


C


From: <> on behalf of David Walker <>
Date: Monday, August 1, 2016 at 1:11 PM
To: Per-Entity Metadata Working Group <>
Subject: [Per-Entity] Some thoughts about availability and scalability

I've been thinking about a couple of things...

  • Setting an expectation that MDQ client software should protect itself from failures of the server infrastructure.  We've talked a good amount about the pros and cons of this.
  • Another risk to per-entity metadata distribution that we haven't discussed: the server infrastructure may not be able to handle peak loads.

A federation-provided MDQ service must be able to handle two types of load: 1) metadata updates and 2) queries from client IdPs and SPs. The first of these is slow and fairly predictable at a federation level, but the second is not. Queries from IdPs and SPs will vary rapidly and unpredictably, based on the workload demands of individual federation members, yet all federation members bear the impact. The UK approach puts the unpredictable load on the web servers, which is better than putting it on the MDQ server, but it's still unpredictable.

The next thing I realized is that the UK's approach of creating an Apache-like web server layer between the MDQ server and the client IdPs and SPs doesn't require that the web servers be run by the federation. They can be run by the member institutions.

Doing things this way lets each member institution decide what level of availability and scalability they want to provide to their community and deploy the necessary infrastructure without affecting the rest of the federation.  The federation is responsible only for the scalability and availability of the MDQ server behind the web servers.
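
One way a client might be pointed at such a member-run web layer is sketched below. The `entities/{id}` path with a URL-encoded entityID follows the MDQ protocol draft; everything else (the function names, the example hostnames, and in particular the idea of trying a list of endpoints in order) is an assumption made for illustration and is not part of the proposal above.

```python
from typing import Callable, Optional, Sequence
from urllib.parse import quote
from urllib.request import urlopen


def mdq_entity_url(base_url: str, entity_id: str) -> str:
    """Build an MDQ request URL: {base}/entities/{URL-encoded entityID}."""
    return base_url.rstrip("/") + "/entities/" + quote(entity_id, safe="")


def http_get(url: str, timeout: float = 5.0) -> bytes:
    """Plain HTTP GET; network failures surface as OSError subclasses."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_first_available(
    base_urls: Sequence[str],
    entity_id: str,
    get: Callable[[str], bytes] = http_get,
) -> bytes:
    """Try each configured web server in order and return the first answer.

    Hypothetical policy: the member-run server is listed first, so the
    member's own query load normally never reaches the other endpoints.
    """
    last_error: Optional[OSError] = None
    for base in base_urls:
        try:
            return get(mdq_entity_url(base, entity_id))
        except OSError as exc:
            last_error = exc  # remember the failure and try the next endpoint
    if last_error is None:
        raise ValueError("no MDQ endpoints configured")
    raise last_error
```

A member's SP could then be configured with something like `["https://mdq.campus.example.edu", "https://mdq.federation.example.org"]` (both hostnames invented here), so that the level of availability it sees depends mainly on infrastructure the member itself chose to deploy.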

Make sense?

David



