At a previous job I considered a system that would use synchronous API calls to the backend API until it went down (or we got rate limited). When the backend was unavailable we'd switch to filtering in our service using the data we'd previously cached.
I.E. if a cached query asked for (cpu>=3.0Ghz, cores>=2) we can also answer (cores>=4) by filtering the previous result. This wouldnt be able to find any CPUs with less than 3Ghz, unless it there were other cached responses. This works well when a "best effort" response is desirable, even when it's incomplete.