I'm interested to hear what other people think. Do you run code as part of code review?
- Require the author to attach screenshots of the feature in a markdown table showing the before and after comparison. This is enforced using an MR template.
- Require the author to attach screen recording for complex user interactions.
Overall effect is that it saves everyone's time.
- The author has to verify that the feature works anyway, so taking screenshots / recordings doesn't take much additional effort.
- The reviewer can focus on code itself instead of checking whether the code works.
- Easy for both parties to spot design and UX issues/regressions visually.
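For reference, an MR template enforcing this might look something like the following (a sketch; the section names are my own invention, not from the comment above):

```markdown
## What changed

<!-- short summary -->

## Before / After

| Before | After |
| ------ | ----- |
| (screenshot) | (screenshot) |

## Screen recording

<!-- required for complex user interactions -->

## Checklist

- [ ] I verified the feature works locally
- [ ] Screenshots/recordings attached where applicable
```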
I'd think that for frontend, you'd have a CI/CD pipeline that deploys the code to a staging server, where I can test it myself.
A lot of people conflate these IMHO. Code review should not really be about whether it "works" or not. That's what tests are for.
Code reviews are about checking for code complexity, good use of abstractions, readability, etc.
Wish I could remember the GitHub service that did this.
I haven't found a good solution except manually asking in every MR that I sense a potential issue. Maybe it is a good idea to have it in MR template as a checklist item.
Another problem with back button is that the expected behaviour is usually missing in the design or requirements to begin with, requiring a lot of back-and-forth on what's actually expected (especially for SPA).
For recordings, if you use Windows you can bring up the Game Bar with Win+G. It has a very easy-to-use screen-capture tool meant for gaming, but it's a perfect fit for small software demos. It even captures sound, which is sometimes useful.
- The author might have checked that the code works fine (A) or haven't done it (B).
- The reviewer might decide to run the code locally (1) or not (2).
Combination A1 results in duplicate effort. Combinations A2 and B1 are perfect. Combination B2 results in potential bugs.
The MR standard merely eliminates the combinatorial possibilities by sharing information between the author and the reviewer automatically. The end result is that both parties know A2 is the process the other person follows.
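The four combinations above can be spelled out in a few lines (a toy sketch; the labels are mine):

```python
from itertools import product

# Outcome of each (author, reviewer) combination described above.
outcomes = {
    ("A", "1"): "duplicate effort",   # author tested, reviewer also runs it
    ("A", "2"): "perfect",            # author tested, reviewer skips running
    ("B", "1"): "perfect",            # author skipped, reviewer runs it
    ("B", "2"): "potential bugs",     # nobody ran the code
}

for author, reviewer in product("AB", "12"):
    print(f"{author}{reviewer}: {outcomes[(author, reviewer)]}")
```

The point of the MR template is that it collapses this table: both sides know up front that A2 is the default.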
I didn't use to, but that was a very big mistake.
Just wait until something you signed off on doesn't run and folks ask, "But didn't you review this code and say that everything 'looked good'? It doesn't even run!"
It's embarrassing to say the least.
If you're doing a code review, run the darn code. What would you think of a mechanic who didn't make sure the car was able to turn on after claiming it was fixed?
If someone pushes a change without anyone reviewing it, I sometimes find their fear of making a mistake causes them to scrutinize the change more than if it had been reviewed. Not always of course, depends on the context and people involved.
As a reviewer (in the places I've worked so far) for me it depends what type of change it is. E.g., for a frontend change I might need to run it to be able to actually try it out and check what it looks like, whereas for a backend change I might rely more on reviewing the tests and if the tests look good then I probably wouldn't bother running locally to poke at the thing myself. Of course there are also just general issues of how big and how complicated the change is.
Anyway, wherever possible mechanical stuff like "does it build and run" should be covered by a CI system or other automation, not by spending reviewer time on it.
If you can't trust the person who created the PR to have tried to compile the code... what are they even doing?
I'd say much of the value of a code review on senior devs is making sure that it's just all properly committed and nothing breaking.
Which is to say 'it doesn't work' is nitpicking that reviewers shouldn't be wasting their time checking.
An additional benefit is that this shows devs and management what a time saver TDD and testing in general are. It shows immediately that lacking tests costs valuable senior time, rather than saving time by not writing tests.
It also is simply needed: when there are no tests, we'll need to manually verify and check for regressions.
In my previous team, the reviewer was primarily responsible for the code they were approving. They were expected to test locally and to actively hunt for potential issues, such as checking the logs to verify that the ORM was building correct SQL.
In my current team, the developer is primarily responsible for the code they're pushing, and is expected to do all the necessary testing. Reviews are more about ensuring code standards, finding possible oversights, requirements mismatches, and so on.
Neither strategy is right or wrong, as long as expectations are set and everyone knows their responsibilities.
Why? It feels like individual preference.
>such as checking in the logs that the ORM was building correct SQL.
Couldn't the person who creates the PR copy/paste samples of the generated SQL into the PR?
If correct SQL is a priority (which seems reasonable), but it's not clear who should check it, it's likely not to be checked at least some of the time. Same with whatever other things have diffuse responsibility.
There's still some room for individual preference. If it's the author's responsibility to do X, they can ask the reviewer to do it in the review request; or if it's the reviewer's responsibility, they can approve but say they didn't do X and are relying on the author to do it; or the author can assert they've done X in the request (perhaps it's difficult to do for this specific request) and the reviewer can note they've relied on that assertion. But having a clear expectation strongly reduces the cases where the author was relying on the reviewer to check X, the reviewer was relying on the author, and X wasn't checked even though the check would have found an issue before production.
Some things are a much bigger problem when found after production, and some things aren't; diffuse responsibility is ok for things that aren't a big deal, IMHO.
To your second point, sure, if that's what the team agrees is better.
The developer can definitely do work like providing samples of generated SQL, UI screenshots and so on. Again, knowing who will do that deep due diligence is more important than the actual decision.
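As a sketch of how cheap capturing generated SQL can be: with Python's stdlib sqlite3, a trace callback records every statement the connection actually executes, which the author could paste straight into the PR description (the schema here is made up; most ORMs have an equivalent echo/logging switch):

```python
import sqlite3

captured = []
conn = sqlite3.connect(":memory:")
# The callback is invoked with the text of each SQL statement executed.
conn.set_trace_callback(captured.append)

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
conn.execute("SELECT name FROM users WHERE id = ?", (1,))

for stmt in captured:
    print(stmt)
```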
Sometimes, if it adds a new feature or something 'interesting', I've checked it out locally to see what the user-facing behaviour is, but this is very rare.
For straightforward regressions and minor tweaks I am usually satisfied to see a video and test automation, but for any kind of new functionality I strongly advocate pulling the code and actually playing with it.
Depending on your company structure, product managers can help with this functional review as well, although this requires working tooling and org structure in a way that doesn't exist at some companies.
I've had companies host release parties before, where everybody downloads the product and plays with it to identify glaring issues. It's a decent approach, but you'd get far better results with a dedicated testing team whose job is to test the software's UX, UI, and contract functionality, if that's what you care about.
- the author to test and have a succinct PR description that includes before and after screenshots/video as appropriate
- the reviewer to test the code locally
The reviewer test is particularly important for front-end code which is where automated tests are more likely to be poorly written or absent.
It goes without saying that there should be targeted automated tests, but I find that in practice the front-end is where this tends to fall short.
I should probably learn from him.
- what’s in the change; if I’m reasonably sure I understand it, and that it won’t have side effects, and that it’s the right thing to do, I may skip running it
- how familiar I am with the area of the codebase; even if the above is satisfied, I’ll pull it and explore around the change to look for potential side effects I’ve missed, or other approaches that might be better for maintenance… and generally to improve my familiarity
- how confident I am in the other contributor’s (and reviewers’) thoroughness; I generally scrutinize even trivial changes if I don’t think others have, because surprising things sometimes reveal themselves
I am always amazed at how many repetitive tasks folks want to load up developers with. I think there’s a tendency among developers-turned-managers to see their job as crafting the perfect “program” of processes for their developers to follow, instead just automating these tasks. Like they say, the cobbler’s children have no shoes.
Code compilation (and checks) & tests must be a requirement for each PR.
If it's something simple I can grok by reading the code and tests I usually don't.
If it's a complex piece of code, I find it valuable to run it locally, to set up some breakpoints, debug it, and step through it to understand it better.
Though sometimes I'll check out their branch, open up a repl and run a few functions or something on my own.
Only if something really draws my attention do I pull and run a particular PR (CR, CL, whatever you call it). Once I did that to illustrate that the change introduced a security issue, so I had to write an exploit against it and demonstrate how it worked. (It helped.)
- We are finding deficiencies in our testing/development methods. The more people have to test/run code the more we are able to find bottlenecks and eliminate them.
- Knowledge sharing. We found that team members had knowledge/tricks in their heads which are being shared more.
- The overall reliability of our systems (measured via SLOs) has greatly increased.
Other than that, no. I am reviewing the coding standard, architecture, and algorithms used. It's up to the dev to ensure that the code works and runs. The CI builds as well as the QA testers will test the actual functionality works as expected (obviously if I can't find the code that is supposed to fix/implement the PR item, I will question it).
Note that this is quite rare, maybe a one-in-a-hundred-code-reviews occurrence. Usually it's quite obvious that something is wrong and you can just point it out or start a discussion about it.
In most of my companies - no. The reason? Not practical given time constraints.
To be honest - this is mostly how you find out if a company values quality or speed. Most I’ve been at valued speed more than anything.
Bugs introduced will be caught and fixed in general testing usually.
As for my own process, I have no hard and fast rule, but generally for user-impacting and functional UI changes I strongly encourage this on my teams. At companies with a functioning tooling setup, pushing to a remote staging environment is also useful because other stakeholders like Product and Design can make changes before the feature lands.
I do also advocate a fairly strict rule around screenshots before/after and a regression test for bug fixes. This prevents engineers from pushing speculative "I totally know what's going on" fixes which seem intuitively correct to a reviewer as well and then simply don't work. I'm tempted to do this all the time and even some really basic process does a good job at dissuading it.
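A regression test for a bug fix can be as small as pinning down the exact failing input from the report (everything here, including the function and the bug, is a made-up example):

```python
def normalize_username(name: str) -> str:
    # The fix under review: strip whitespace before lowercasing.
    return name.strip().lower()

def test_regression_whitespace_username():
    # Previously " Alice " was stored verbatim and broke lookups.
    assert normalize_username(" Alice ") == "alice"

test_regression_whitespace_username()
print("regression test passed")
```

The value isn't the test itself so much as forcing the author to reproduce the bug before claiming the fix works.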
Now that I’m in startup land, we use Heroku’s Review App feature, which is by far my favorite Heroku feature, and helps us catch bugs all the time
If you're working on some legacy codebase and don't have these luxuries, I totally get running it locally first. I am lucky to work with people who I do trust deeply to not waste others' precious time by testing first so there is probably also a human element for me.
Risk also needs to be taken into account. It's not necessary to spin up the application locally for tweaking error message wording or a small CSS change with no real responsiveness or accessibility implications. But for most fixes and features, I think at least a quick check is worth it. And if you're mentoring an intern or very junior developer, you should probably check every change, at least for the first few months.
Finally, the rest of your testing makes a big difference. If you have a formal QAer who tests your application before release or a community of people running nightlies, it's less urgent.
https://app-staging.foo.com/feature/add-sorting/
compared to the original
given you have created a PR from a branch named feature/add-sorting to master.
This also makes it easier for our QA engineers to test things before they get merged to master.
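The branch-to-URL convention described above is simple enough to express directly (a sketch; the base URL is taken from the example, everything else is my guess):

```python
def staging_url(branch: str, base: str = "https://app-staging.foo.com") -> str:
    """Map a PR's source branch name to its staging deployment URL."""
    return f"{base}/{branch}/"

print(staging_url("feature/add-sorting"))
# https://app-staging.foo.com/feature/add-sorting/
```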
No one else in the team was doing the same, including myself. I'm not exactly sure, but maybe part of the reason was that he was not very familiar with the code base, and checking whether it worked was his way of testing whether the code was correct.
Think of an rdbms, and I'm reviewing changes that optimize or allow new schema options. It would not be productive for me to design new schemas each time I review code. Instead, the person sending the review should already have done that work in tests.
I'm a big proponent of "trial by fire" reviews. The idea is to set up a dumpster environment, deploy the change there, and hand it to someone who's keenly interested in seeing some progress to give it a spin - sometimes it's the PM, other times the sales guy - anyone who has some time between meetings and would appreciate a little variety in their work life.
Such people are amazingly talented at finding catastrophic edge cases.
Currently at a place with an extremely large code base and dedicated testers, I don't run the code.
Sometimes even with dedicated testers etc. I will run the code if I think something might be problematic and I want to double check. I tend to only comment on code though if I think it could really be improved, if you have named a variable something I wouldn't I probably won't care.
1. I need to check out ~8 branches
2. I then need to reinstall all dependencies for those branches.
3. I then need to run those branches locally on a different set of ports to my standard dev environment to avoid clashes (or shut down my main one temporarily)
4. If it involves the mobile app I may need to wait some time for a clean build
It's a lot of administrative overhead compared to what I tend to default to in most cases which is just merge the branch after reading through it and then test in our staging environment once CI has run.
I do however check out code locally. Online code review only shows what's been changed and I often find subtle issues in the code that wasn't changed. In short I'm trying to ask myself: What's missing in implementation?
I certainly wouldn’t make it a requirement of reviewers. I leave it to them to make their own determination on whether a patch is satisfactory.
If you are missing stuff like this, then yes. I'm surprised at the number of times developers deliver something that doesn't even boot.
So people doing code review can poke at the resulting running program if they like. But generally code review is used for the "how" instead of the "does it work?"
If it's code that is really changing behavior significantly then I'll run it on my device and just see if I notice anything. I've found some surprisingly gnarly bugs this way.
However, if the code is a large enough change, and it needs testing anyways, I will run it.
1. PRs should target development, which is an *unstable* branch, so no biggie if something breaks
2. CI should catch things like syntax errors, failing tests, or poor test coverage
3. I trust my colleagues to do their work properly and fix things if they've made a mistake.
- how can we even say it works?
- whose responsibility is it to fix it?
- and when does it get fixed?
* Tests
* The engineer making the PR
* Before they're allowed to merge
Your CI should be robust enough to catch "does it run" type issues. That's not always possible, for one reason or another. If there are things that your CI cannot currently check, you should be verifying locally. You should also be striving to close those gaps in your CI.
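A minimal "does it build and run" gate in CI might be no more than this (a GitHub Actions sketch; the job, step names, and Makefile targets are my own assumptions):

```yaml
name: does-it-run
on: [pull_request]

jobs:
  build-and-smoke-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make build        # assumes a Makefile with a build target
      - name: Smoke test
        run: make smoke-test   # e.g. boot the app and hit a health endpoint
```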
In those cases I’ll pull the code
or, as in the latest case, when the developer has the newest MSVC compiler that accepts ' as a digit separator, but the CI does not.
But... our system is a large pile of legacy crap, and too often the only way to check something properly is to step through it in a debugger.
Then we have preflight.com UI tests that run on those environments. Would love to tell you about my experience.
Most of the time I pull the code to easily navigate it in context. The diff viewer will only get you so far.
So it might be useful to run the thing before AND after merging, but that is very, very tedious.
Better I find out about edge cases that weren’t considered or unit tested than our customers.