This method works out pretty well for me. I’m wondering if people have other strategies that work better?
This question is inspired by the Google Data Engineer certification exam: https://cloud.google.com/certification/guides/google-certified-professional-data-engineer.pdf
More detail on exam here: https://cloud.google.com/blog/big-data/2017/01/registration-now-open-for-google-data-engineer-certification-exam-in-beta
Imagine we're making a webservice called "Primes". It has 1 endpoint, something like this:
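
    import random

    from flask import Flask

    app = Flask("primes")

    def load_primes():
        # Placeholder: the real version loads the primes from a file.
        return [2, 3, 5, 7, 11, 13]

    # Loaded once at import time so the data stays in memory.
    # A list rather than a set: random.sample no longer accepts a set
    # (Python 3.11+), and random.choice needs an indexable sequence.
    primes = load_primes()

    @app.route("/rand_prime")
    def rand_prime():
        # Return the prime as plain text.
        return str(random.choice(primes))

    if __name__ == "__main__":
        app.run()
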
We want the fastest possible webservice for this endpoint, one that handles as many queries per second (QPS) as possible.

### Constraints
- We have to load all the primes __into memory__, there's no database at all.
- We cannot use memcache, redis, etc.
- We have to use Python.
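
For concreteness, the in-memory loading step could look roughly like the sketch below. It assumes a plain text file with one prime per line; that file format and the `path` argument are assumptions for illustration, not part of the actual setup.

```python
def load_primes(path):
    """Read one integer per line from a text file (assumed format) into a list."""
    with open(path) as f:
        # Skip blank lines; int() tolerates the trailing newline.
        return [int(line) for line in f if line.strip()]
```

A list is used here instead of a set so that `random.choice` can pick an element by index.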
### Current Solution
The code above just serves the primes via the default webserver that comes bundled with Flask. Some ideas to improve upon this:
- Use a better webserver (e.g. Tornado)
- Proxy several instances of this behind an Nginx server
Can we make this even faster? We really don't need complicated routing or security... Is it possible to do better by writing a custom webserver?
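
To make "custom webserver" concrete, here is a minimal sketch of the idea using only the standard library's `http.server`, with no routing framework at all. The `PRIMES` list, `PrimeHandler`, and `make_server` are hypothetical names for illustration, and whether this is actually faster than Flask behind a production server would have to be measured.

```python
import random
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stand-in for the primes loaded into memory at startup.
PRIMES = [2, 3, 5, 7, 11, 13]


class PrimeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Single hard-coded route; everything else is a 404.
        if self.path != "/rand_prime":
            self.send_error(404)
            return
        body = str(random.choice(PRIMES)).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def make_server(host="127.0.0.1", port=8000):
    # One thread per connection; port=0 lets the OS pick a free port.
    return ThreadingHTTPServer((host, port), PrimeHandler)

# To run: make_server().serve_forever()
```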
Thanks!
For an absolute beginner, what is the quickest way to learn enough to be productive in industrial design? Any websites, articles, or resources are greatly appreciated.
Also, any insight into what the process is would be great too. Like what kind of tools are used, what are the industry standards, what kind of file format factories accept, etc.
Thank you!
I'm having trouble understanding the importance of this - what are some specific examples of the knowledge lost?