

Subject: Per-Entity Metadata Working Group


RE: [Per-Entity] implementing a cache on the client


  • From: "Cantor, Scott" <>
  • To: Tom Mitchell <>, "" <>
  • Subject: RE: [Per-Entity] implementing a cache on the client
  • Date: Fri, 5 Aug 2016 20:06:47 +0000

> What if we do a case study? Let's assume that every InCommon SP (3,107)
> and every InCommon IdP (434) are fully connected. That would require ~2.7
> million MDQ queries. That's every SP asking about every IdP and vice versa.

Not just every SP/IdP, but I assume you mean every day.
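The arithmetic behind the case study can be sketched as follows. The SP and IdP counts come from the message above; the cache lifetimes are the ones discussed in the thread. This is a back-of-the-envelope model, not a real load projection:

```python
# Back-of-the-envelope MDQ load model for the case study above.
# Figures from the thread: 3,107 InCommon SPs and 434 InCommon IdPs,
# fully connected, so each side queries for every peer's metadata.

SPS = 3107
IDPS = 434

# One full refresh cycle: every SP asks about every IdP, and vice versa.
queries_per_cycle = 2 * SPS * IDPS
print(f"queries per full refresh: {queries_per_cycle:,}")  # ~2.7 million

# Sustained aggregate query rate for a few candidate cache lifetimes,
# assuming refreshes are spread evenly across the cache window.
for label, seconds in [("10 minutes", 600), ("1 hour", 3600), ("1 day", 86400)]:
    qps = queries_per_cycle / seconds
    print(f"cache lifetime {label:>10}: ~{qps:,.0f} queries/sec")
```

Even the worst case (10-minute caching) works out to a few thousand queries per second in aggregate, and daily refresh is on the order of tens per second.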

> Now comes the question of how long or short the endpoints can cache that
> data. I ran numbers for a few different cache lifetimes, from what I
> consider
> to be very short (10 minutes) up to the InCommon recommendation of at
> least daily (i.e. the minimum refresh).

I think 1 hour is a good stab, though obviously if InCommon maintains the
offline signing approach, it's likely that 1 day is more accurate.

> Are these kinds of calculations useful to inform requirements related to
> potential load?

I think the risk with them is that they're so far on the high side that they
could lead to over-engineering the requirements. It's always good to be
cautious and shoot high, but, somewhat like SSO systems, when you benchmark
against totally inflated requirements you come away thinking you need about
ten times the capacity you actually do, so it can be counter-productive too.

> Taking Shibboleth as an exemplar, how often would the MDQ entries expire,
> causing a new query to go out? Or is that determined by the HTTP headers
> received from the MDQ server?

I can speak to the SP. Brent would have to address the IdP.

The HTTP layer doesn't impact it, apart from conditional GETs, and I'm not
100% sure I even implemented those, given the document sizes involved here. I
probably did, but I can't recall.

The caching is based on combining cacheDuration with a min/max setting that
controls how aggressively it tries to fetch new copies. It also has an option
to keep using expired metadata if you want it to (expired meaning no longer
valid). It delays retries of failed requests, but that delay is per-entity, so
it will still bang on a failed MDQ server a bit while it's under load making
requests for different entities. I'll make a note that we should probably look
at ways to limit that per server rather than per entity.
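The behavior described above can be sketched roughly as follows. This is an illustrative model of the logic, not the Shibboleth SP's actual code; the class, method, and parameter names are all hypothetical:

```python
import time


class EntityCache:
    """Illustrative sketch (NOT the Shibboleth SP's internals) of
    per-entity metadata caching as described above: the metadata's own
    cacheDuration is clamped between a configured min and max, and a
    failed fetch is retried only after a per-entity backoff delay."""

    def __init__(self, min_cache=600, max_cache=28800, retry_delay=60):
        self.min_cache = min_cache      # floor on refresh interval (seconds)
        self.max_cache = max_cache      # ceiling on refresh interval (seconds)
        self.retry_delay = retry_delay  # backoff after a failed fetch (seconds)
        self._next_attempt = {}         # entityID -> earliest retry timestamp

    def refresh_interval(self, cache_duration):
        # Clamp the publisher's cacheDuration to [min, max] so it can't
        # force refreshes that are too aggressive or too lazy.
        return max(self.min_cache, min(cache_duration, self.max_cache))

    def record_failure(self, entity_id):
        # Back off retries for this entity only; queries for *other*
        # entities are unaffected, which is why a failed MDQ server
        # still sees traffic while the deployment is under load.
        self._next_attempt[entity_id] = time.time() + self.retry_delay

    def may_query(self, entity_id):
        return time.time() >= self._next_attempt.get(entity_id, 0)
```

The per-entity retry map is what produces the behavior noted above: after a failure for one entityID, lookups for every other entityID still go straight to the server.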

-- Scott




Archive powered by MHonArc 2.6.19.
