I'm interested to hear what other people think. Do you run code as part of code review?
- Require the author to attach screenshots of the feature in a markdown table showing the before and after comparison. This is enforced using an MR template.
- Require the author to attach screen recording for complex user interactions.
Overall effect is that it saves everyone's time.
- The author has to verify that the feature works anyway, so taking screenshots / recordings doesn't take much additional effort.
- The reviewer can focus on code itself instead of checking whether the code works.
- Easy for both parties to spot design and UX issues/regressions visually.
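For reference, an MR template enforcing this might look something like the following (a sketch; the section names are my own invention, not from the comment above):

```markdown
## What changed

<!-- short summary -->

## Before / After

| Before | After |
| ------ | ----- |
| (screenshot) | (screenshot) |

## Screen recording

<!-- required for complex user interactions -->

## Checklist

- [ ] I verified the feature works locally
- [ ] Screenshots/recordings attached where applicable
```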
I'd think that for frontend, you'd have a CI/CD pipeline that deploys the code to a staging server, where I can test it myself.
A lot of people conflate these IMHO. Code review should not really be about whether it "works" or not. That's what tests are for.
Code reviews are about checking for code complexity, good use of abstractions, readability, etc.
Wish I could remember the GitHub service that did this.
I haven't found a good solution except manually asking in every MR that I sense a potential issue. Maybe it is a good idea to have it in MR template as a checklist item.
Another problem with back button is that the expected behaviour is usually missing in the design or requirements to begin with, requiring a lot of back-and-forth on what's actually expected (especially for SPA).
For recordings, if you use Windows you can bring up the Game Bar with Win+G. It has a very easy-to-use screen-capture tool meant for gaming, but it's a perfect fit for small software demos. It even captures sound, which is sometimes useful.
- The author might have checked that the code works fine (A) or haven't done it (B).
- The reviewer might decide to run the code locally (1) or not (2).
Combination A1 results in duplicate effort. Combinations A2 and B1 are perfect. Combination B2 results in potential bugs.
The MR standard merely eliminates the combinatorial possibilities by sharing information between the author and the reviewer automatically. The end result is that both parties know A2 is the process the other person follows.
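The four combinations above can be spelled out in a few lines (a toy sketch; the labels are mine):

```python
from itertools import product

# Outcome of each (author, reviewer) combination described above.
outcomes = {
    ("A", "1"): "duplicate effort",   # author tested, reviewer also runs it
    ("A", "2"): "perfect",            # author tested, reviewer skips running
    ("B", "1"): "perfect",            # author skipped, reviewer runs it
    ("B", "2"): "potential bugs",     # nobody ran the code
}

for author, reviewer in product("AB", "12"):
    print(f"{author}{reviewer}: {outcomes[(author, reviewer)]}")
```

The point of the MR template is that it collapses this table: both sides know up front that A2 is the default.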
I didn't use to, but that was a very big mistake.
Just wait until something you signed off on doesn't run and folks ask, "But didn't you review this code and say that everything 'looked good'? It doesn't even run!"
It's embarrassing to say the least.
If you're doing a code review, run the darn code. What would you think of a mechanic who didn't make sure the car was able to turn on after claiming it was fixed?
If someone pushes a change without anyone reviewing it, I sometimes find their fear of making a mistake causes them to scrutinize the change more than if it had been reviewed. Not always of course, depends on the context and people involved.
As a reviewer (in the places I've worked so far) for me it depends what type of change it is. E.g., for a frontend change I might need to run it to be able to actually try it out and check what it looks like, whereas for a backend change I might rely more on reviewing the tests and if the tests look good then I probably wouldn't bother running locally to poke at the thing myself. Of course there are also just general issues of how big and how complicated the change is.
Anyway, wherever possible mechanical stuff like "does it build and run" should be covered by a CI system or other automation, not by spending reviewer time on it.
If you can't trust the person who created the PR to have tried to compile the code... what are they even doing?
I'd say much of the value of a code review on senior devs is making sure that it's just all properly committed and nothing breaking.
Which is to say 'it doesn't work' is nitpicking that reviewers shouldn't be wasting their time checking.
An additional benefit is that this shows devs and management what a time saver TDD and testing in general are. It shows immediately that lacking tests costs valuable senior time, rather than saving time by not writing tests.
It also is simply needed: when there are no tests, we'll need to manually verify and check for regressions.
In my previous team, the reviewer was primarily responsible for the code they were approving. They were expected to test locally and to actively hunt for potential issues, such as checking the logs to verify that the ORM was building correct SQL.
In my current team, the developer is primarily responsible for the code they're pushing, and is expected to do all the necessary testing. Reviews are more about ensuring code standards, finding possible oversights, requirements mismatches, and so on.
Neither strategy is right or wrong, as long as expectations are set and everyone knows their responsibilities.
Why? It feels like individual preference.
>such as checking in the logs that the ORM was building correct SQL.
Couldn't the person who creates the PR copy/paste samples of the generated SQL into the PR?
If correct SQL is a priority (which seems reasonable), but it's not clear who should check it, it's likely not to be checked at least some of the time. Same with whatever other things have diffuse responsibility.
There's still some room for individual preference. If it's the author's responsibility to do X, they can ask the reviewer to do it in the review request; or if it's the reviewer's responsibility, they can approve but say they didn't do X and are relying on the author to do it; or the author can assert they've done X in the request (perhaps it's difficult to do for this specific request) and the reviewer can note they've relied on that assertion. But having a clear expectation strongly reduces the cases where the author was relying on the reviewer to check X, the reviewer was relying on the author, and X wasn't checked even though the check would have found an issue before production.
Some things are a much bigger problem when found after production, and some things aren't; diffuse responsibility is ok for things that aren't a big deal, IMHO.
To your second point, sure, if that's what the team agrees is better.
The developer can definitely do work like providing samples of generated SQL, UI screenshots and so on. Again, knowing who will do that deep due diligence is more important than the actual decision.
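As a sketch of how cheap capturing generated SQL can be: with Python's stdlib sqlite3, a trace callback records every statement the connection actually executes, which the author could paste straight into the PR description (the schema here is made up; most ORMs have an equivalent echo/logging switch):

```python
import sqlite3

captured = []
conn = sqlite3.connect(":memory:")
# The callback is invoked with the text of each SQL statement executed.
conn.set_trace_callback(captured.append)

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
conn.execute("SELECT name FROM users WHERE id = ?", (1,))

for stmt in captured:
    print(stmt)
```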
Sometimes, if it adds a new feature or something 'interesting', I've checked it out locally to see what the user-facing behaviour is, but this is very rare.
For straightforward regressions and minor tweaks I am usually satisfied to see a video and test automation, but for any kind of new functionality I strongly advocate pulling the code and actually playing with it.
Depending on your company structure, product managers can help with this functional review as well, although this requires working tooling and org structure in a way that doesn't exist at some companies.
I've had companies host release parties before, where everybody downloads the product and plays with it to identify glaring issues. It's a decent approach, but you'd get far better results with a dedicated testing team whose job is to test the software's UX, UI, and contract functionality, if that's what you care about.
- the author to test and have a succinct PR description that includes before and after screenshots/video as appropriate
- the reviewer to test the code locally
The reviewer test is particularly important for front-end code which is where automated tests are more likely to be poorly written or absent.
It goes without saying that there should be targeted automated tests, but I find that in practice the front-end is where this tends to fall short.
I should probably learn from him.
- what’s in the change; if I’m reasonably sure I understand it, and that it won’t have side effects, and that it’s the right thing to do, I may skip running it
- how familiar I am with the area of the codebase; even if the above is satisfied, I’ll pull it and explore around the change to look for potential side effects I’ve missed, or other approaches that might be better for maintenance… and generally to improve my familiarity
- how confident I am in the other contributor’s (and reviewers’) thoroughness; I generally scrutinize even trivial changes if I don’t think others have, because surprising things sometimes reveal themselves
I am always amazed at how many repetitive tasks folks want to load up developers with. I think there’s a tendency among developers-turned-managers to see their job as crafting the perfect “program” of processes for their developers to follow, instead just automating these tasks. Like they say, the cobbler’s children have no shoes.
Code compilation (and checks) & tests must be a requirement for each PR.
If it's something simple I can grok by reading the code and tests I usually don't.
If it's a complex piece of code, I find it valuable to run it locally, to set up some breakpoints, debug it, and step through it to understand it better.
Though sometimes I'll check out their branch, open up a repl and run a few functions or something on my own.
Only if something really draws my attention do I pull and run a particular PR (CR, CL, whatever you call it). Once I did that to illustrate that the change introduced a security issue, so I had to write an exploit against it and demonstrate how it worked. (It helped.)
- We are finding deficiencies in our testing/development methods. The more people have to test/run code the more we are able to find bottlenecks and eliminate them.
- Knowledge sharing. We found that team members had knowledge/tricks in their heads which are being shared more.
- The overall reliability of our systems (measured via SLOs) has greatly increased.
Other than that, no. I am reviewing the coding standard, architecture, and algorithms used. It's up to the dev to ensure that the code works and runs. The CI builds as well as the QA testers will test the actual functionality works as expected (obviously if I can't find the code that is supposed to fix/implement the PR item, I will question it).
Note that this is quite rare, maybe a one-in-a-hundred-code-reviews occurrence. Usually it's quite obvious that something is wrong and you can just point it out or start a discussion about it.
In most of my companies - no. The reason? Not practical given time constraints.
To be honest - this is mostly how you find out if a company values quality or speed. Most I’ve been at valued speed more than anything.
Bugs introduced will be caught and fixed in general testing usually.
As for my own process, I have no hard and fast rule, but generally for user-impacting and functional UI changes I strongly encourage this on my teams. At companies with a functioning tooling setup, pushing to a remote staging environment is also useful because other stakeholders like Product and Design can make changes before the feature lands.
I do also advocate a fairly strict rule around screenshots before/after and a regression test for bug fixes. This prevents engineers from pushing speculative "I totally know what's going on" fixes which seem intuitively correct to a reviewer as well and then simply don't work. I'm tempted to do this all the time and even some really basic process does a good job at dissuading it.
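A regression test for a bug fix can be as small as pinning down the exact failing input from the report (everything here, including the function and the bug, is a made-up example):

```python
def normalize_username(name: str) -> str:
    # The fix under review: strip whitespace before lowercasing.
    return name.strip().lower()

def test_regression_whitespace_username():
    # Previously " Alice " was stored verbatim and broke lookups.
    assert normalize_username(" Alice ") == "alice"

test_regression_whitespace_username()
print("regression test passed")
```

The value isn't the test itself so much as forcing the author to reproduce the bug before claiming the fix works.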
Now that I’m in startup land, we use Heroku’s Review App feature, which is by far my favorite Heroku feature, and helps us catch bugs all the time
If you're working on some legacy codebase and don't have these luxuries, I totally get running it locally first. I am lucky to work with people who I do trust deeply to not waste others' precious time by testing first so there is probably also a human element for me.
Risk also needs to be taken into account. It's not necessary to spin up the application locally for tweaking error message wording or a small CSS change with no real responsiveness or accessibility implications. But for most fixes and features, I think at least a quick check is worth it. And if you're mentoring an intern or very junior developer, you should probably check every change, at least for the first few months.
Finally, the rest of your testing makes a big difference. If you have a formal QAer who tests your application before release or a community of people running nightlies, it's less urgent.
https://app-staging.foo.com/feature/add-sorting/
compared to the original
given you have created a PR from a branch named feature/add-sorting to master.
This also makes it easier for our QA engineers to test things before they get merged to master.
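The branch-to-URL convention described above is simple enough to express directly (a sketch; the base URL is taken from the example, everything else is my guess):

```python
def staging_url(branch: str, base: str = "https://app-staging.foo.com") -> str:
    """Map a PR's source branch name to its staging deployment URL."""
    return f"{base}/{branch}/"

print(staging_url("feature/add-sorting"))
# https://app-staging.foo.com/feature/add-sorting/
```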
No one else in the team was doing the same, including myself. I'm not exactly sure, but maybe part of the reason was that he was not very familiar with the code base, and checking whether it worked was his way of testing whether the code was correct.
Think of an rdbms, and I'm reviewing changes that optimize or allow new schema options. It would not be productive for me to design new schemas each time I review code. Instead, the person sending the review should already have done that work in tests.
I'm a big proponent of "trial by fire" reviews. The idea is to set up a dumpster environment, deploy the change there, and hand it to someone who's keenly interested in seeing some progress to give it a spin - sometimes it's the PM, other times the sales guy - anyone who has some time between meetings and would appreciate a little variety in their work life.
Such people are amazingly talented at finding catastrophic edge cases.
Currently at a place with an extremely large code base and dedicated testers, I don't run the code.
Sometimes even with dedicated testers etc. I will run the code if I think something might be problematic and I want to double check. I tend to only comment on code though if I think it could really be improved, if you have named a variable something I wouldn't I probably won't care.
1. I need to check out ~8 branches
2. I then need to reinstall all dependencies for those branches.
3. I then need to run those branches locally on a different set of ports to my standard dev environment to avoid clashes (or shut down my main one temporarily)
4. If it involves the mobile app I may need to wait some time for a clean build
It's a lot of administrative overhead compared to what I tend to default to in most cases which is just merge the branch after reading through it and then test in our staging environment once CI has run.
I do however check out code locally. Online code review only shows what's been changed and I often find subtle issues in the code that wasn't changed. In short I'm trying to ask myself: What's missing in implementation?
I certainly wouldn’t make it a requirement of reviewers. I leave it to them to make their own determination on whether a patch is satisfactory.
If you are missing stuff like this, then yes. I'm surprised at the number of times developers deliver something that doesn't even boot.
So people doing code review can poke at the resulting running program if they like. But generally code review is used for the "how" instead of the "does it work?"
If it's code that is really changing behavior significantly then I'll run it on my device and just see if I notice anything. I've found some surprisingly gnarly bugs this way.
However, if the code is a large enough change, and it needs testing anyways, I will run it.
1. PRs should target development, which is an *unstable* branch, so no biggie if something breaks
2. CI should catch things like syntax errors, failing tests, or poor test coverage
3. I trust my colleagues to do their work properly and fix things if they've made a mistake.
- how can we even say it works?
- whose responsibility is it to fix it?
- and when does it get fixed?
* Tests
* The engineer making the PR
* Before they're allowed to merge
Your CI should be robust enough to catch "does it run" type issues. That's not always possible, for one reason or another. If there are things that your CI cannot currently check, you should be verifying locally. You should also be striving to close those gaps in your CI.
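A minimal "does it build and run" gate in CI might be no more than this (a GitHub Actions sketch; the job, step names, and Makefile targets are my own assumptions):

```yaml
name: does-it-run
on: [pull_request]

jobs:
  build-and-smoke-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make build        # assumes a Makefile with a build target
      - name: Smoke test
        run: make smoke-test   # e.g. boot the app and hit a health endpoint
```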
In those cases I’ll pull the code
or, as in the latest case, when the developer has the newest MSVC compiler that accepts ' as a digit separator, but the CI does not.
But... our system is a large pile of legacy crap, and too often the only way to check something properly is to step through it in a debugger.
Then we have preflight.com UI tests that run on those environments. Would love to tell you about my experience.
Most of the time I pull the code to easily navigate it in context. The diff viewer will only get you so far.
So it might be useful to run the thing before AND after merging, but that is very, very tedious.
Better I find out about edge cases that weren’t considered or unit tested than our customers.