1. Docker. I resisted learning it for the longest time because of the stories about its unreliability in production (just do a Hacker News search) and the constantly shifting APIs. But once I had built a few containers and handed them to collaborators, I realized it is a very complete toolkit for getting code to run reliably anywhere. Since research code is often a mishmash of Octave, Python, Perl, Julia, and crufty C++, and "anywhere" often includes ancient government laptops, running anywhere is a real pain point. Docker’s complete configuration language, command line tools, and collection of minimal base images together make it feel like a big step forward compared to VMs for my use cases. And it will probably buy my team another 10 years of using scruffy code bases without doing a clean modern rewrite (for better or for worse).
2. Scikit-learn. Although there haven’t been any machine learning breakthroughs in my field (power systems), it was time for me to learn what all this ML hullabaloo is about. I found it really easy to use scikit-learn to build some simple models for load forecasting, and I appreciate the library’s clean and pythonic APIs. It’s also nice how complete the library is: if there’s a model you’ve heard about, it’ll probably be in there and well-documented.
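To give a flavor of what "simple models for load forecasting" can look like, here is a minimal sketch, not my actual models. Everything in it is an assumption made for illustration: the load series is synthetic (a daily sine cycle plus noise), the helper names (make_synthetic_load, build_features, fit_and_score) are made up, and Ridge regression on time-of-day plus a 24-hour lag is just one reasonable starting point.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error


def make_synthetic_load(n_hours=24 * 60, seed=0):
    """Hypothetical hourly load: a daily cycle plus Gaussian noise (not real data)."""
    rng = np.random.default_rng(seed)
    hours = np.arange(n_hours)
    daily = 20.0 * np.sin(2 * np.pi * (hours % 24) / 24)
    load = 100.0 + daily + rng.normal(0.0, 2.0, size=n_hours)
    return hours, load


def build_features(hours, load, lag=24):
    """Features: cyclic encoding of hour of day plus the load `lag` hours earlier."""
    hour_of_day = hours % 24
    X = np.column_stack([
        np.sin(2 * np.pi * hour_of_day / 24),
        np.cos(2 * np.pi * hour_of_day / 24),
        np.roll(load, lag),  # row i holds load[i - lag]; first `lag` rows wrap around
    ])
    # Drop the first `lag` rows, whose lag feature wrapped around the array end.
    return X[lag:], load[lag:]


def fit_and_score(test_hours=24 * 7):
    """Fit on all but the last week, report mean absolute error on that week."""
    hours, load = make_synthetic_load()
    X, y = build_features(hours, load)
    # Chronological split: never shuffle a time series before holding out data.
    X_train, X_test = X[:-test_hours], X[-test_hours:]
    y_train, y_test = y[:-test_hours], y[-test_hours:]
    model = Ridge(alpha=1.0).fit(X_train, y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test))
    return model, mae


if __name__ == "__main__":
    _, mae = fit_and_score()
    print(f"MAE on held-out week: {mae:.2f}")
```

Swapping Ridge for another estimator is usually a one-line change, since scikit-learn models share the same fit/predict interface. That is a big part of why the API feels so clean.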
Things I tried and won’t end up using:
1. VSCode. It’s great, but it wasn’t a big enough improvement over Sublime for me to make the switch. I’m also more of a minimalist, and VSCode wants you to use a generous set of panes/terminals/wizards.
2. Serverless (AWS Lambda, zeit.co). One of these days I suspect the interface to a cluster will be as clean as the Unix interface to a single machine, but for scientific computing I’m finding it easier to stick with older methods (single machines, occasionally a cluster built by hand).