Wouldn't the fact that a location was fixed recently imply that it now has fewer bugs? And wouldn't a location that hasn't been touched recently be likely to be problematic?
- if a file introduced a bug recently, it will tend to introduce bugs again
- new files added with the bug-introducing file will tend to introduce bugs
- other files changed with the bug-introducing file will tend to introduce bugs
- files often changed together with the bug-introducing file will tend to introduce bugs soon
FixCache maintains a fixed-size cache of these bug-prone files based on bug-fix commits. This helps in prioritizing verification and testing resources (right now it only updates a pull request with a comment and a label). If a file stops introducing bugs, it will eventually be evicted from the cache.
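To make the "fixed-size cache" idea concrete, here is a minimal sketch. It assumes least-recently-used replacement, which is only one possible policy; the class and method names are made up for illustration and are not taken from the app.

```python
from collections import OrderedDict


class BugProneFileCache:
    """Fixed-size cache of file paths; evicts the least recently fixed file.

    Assumption: LRU replacement. The real app may use a different policy.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._files = OrderedDict()  # path -> number of fix commits seen

    def record_fix(self, path):
        """Register that `path` was touched by a bug-fix commit."""
        self._files[path] = self._files.get(path, 0) + 1
        self._files.move_to_end(path)  # mark as most recently fixed
        while len(self._files) > self.capacity:
            self._files.popitem(last=False)  # drop the least recently fixed

    def __contains__(self, path):
        return path in self._files

    def files(self):
        """Paths currently in the cache, least recently fixed first."""
        return list(self._files)


cache = BugProneFileCache(capacity=2)
for fixed_path in ["a.py", "b.py", "a.py", "c.py"]:
    cache.record_fix(fixed_path)
print(cache.files())
```

With a capacity of 2, `b.py` falls out once `c.py` is fixed, because `a.py` was fixed more recently than `b.py`.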
Maybe critical areas, e.g. that have the same amount of bugs as the rest but are complained about more? (since the algorithm can only consider bugs that have been reported, so biased to areas important to users) Or maybe that are prioritized by management? (since it considers fixed bugs, so bias towards bugs that were fixed first)
Hopefully increased scrutiny on new patches to those areas leads to fewer bugs getting in, which breaks the feedback loop. But if bugs are fixed in separate commits, this sounds like it could have negative effects (specific developers/areas getting all the attention, leading to the discovery of more bugs/nitpicks there, reinforcing the bias...).
Then users could write a custom hook that looked up the given commit using github's APIs or whatever other crazy scheme your team uses for bug tracking (e.g. cross reference JIRA ticket number baked into commit message with JIRA and look at the type of the JIRA ticket), but FixCache itself could be kept clean and pure from these integrations, which most users wouldn't want.
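Roughly what I have in mind, as a sketch only: the core takes a callable that decides whether a commit is a bug fix, and anything tracker-specific lives in the hook. Every name here is hypothetical, and the `ticket_types` dict stands in for whatever lookup (JIRA, GitHub issues, ...) a team would actually do.

```python
import re
from typing import Callable, Dict, Iterable, List

# Hypothetical commit shape: {"message": str, "files": [str, ...]}
Commit = Dict[str, object]
BugFixHook = Callable[[Commit], bool]

# Matches ticket keys such as "PROJ-123" in a commit message.
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def make_ticket_hook(ticket_types: Dict[str, str]) -> BugFixHook:
    """Build a hook that treats a commit as a fix if it cites a 'Bug' ticket.

    `ticket_types` maps ticket key -> ticket type and is a stand-in for a
    real tracker query.
    """
    def hook(commit: Commit) -> bool:
        keys = TICKET_RE.findall(str(commit["message"]))
        return any(ticket_types.get(key) == "Bug" for key in keys)
    return hook


def files_from_fixes(commits: Iterable[Commit], is_bug_fix: BugFixHook) -> List[str]:
    """Core logic: collect files touched by commits the hook flags as fixes."""
    files: List[str] = []
    for commit in commits:
        if is_bug_fix(commit):
            files.extend(commit["files"])
    return files


hook = make_ticket_hook({"PROJ-1": "Bug", "PROJ-2": "Story"})
history = [
    {"message": "PROJ-1 handle empty input", "files": ["parser.py"]},
    {"message": "PROJ-2 add export button", "files": ["ui.py"]},
]
print(files_from_fixes(history, hook))
```

The core never learns what a ticket is; swapping trackers means swapping the hook.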
GitHub has lots of APIs; I'd bet it is possible to do this with GitHub, provided the data defining the relationship between the commit and the bug is encoded somehow -- either in the commit message or in GitHub issue or PR metadata or comment text.
You set up a branch to track commits on ("TRACKED_BRANCH"; it's 'master' by default). A history of commits, from the last 30 days by default, is extracted from that branch on installation, and bug fixes are detected from commit messages.
After installation, the app subscribes to push events (pull request merges are also push events) on the "tracked_branch". On every push to the tracked branch, all the commits in the push are checked for bug-fix commits, and the file cache is updated accordingly if there are any.
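As a rough illustration of that push-handling step, here is a sketch that takes a GitHub push event payload (which carries a `ref` and a `commits` list with `message` and `modified` fields) and returns the files that would go into the cache. The keyword regex is an assumption for illustration, not the app's actual detection rule, and `files_to_cache` is a made-up name.

```python
import re

TRACKED_BRANCH = "master"

# Assumed heuristic: a commit message mentioning "fix", "fixes", "fixed",
# "bug", "bugs" or "bugfix" marks a bug-fix commit.
FIX_RE = re.compile(r"\b(bug\s?fix(es)?|fix(e[sd])?|bugs?)\b", re.IGNORECASE)


def files_to_cache(push_event, tracked_branch=TRACKED_BRANCH):
    """Return files modified by bug-fix commits in a push to the tracked branch."""
    if push_event.get("ref") != f"refs/heads/{tracked_branch}":
        return []  # push to some other branch: ignore it
    files = []
    for commit in push_event.get("commits", []):
        if not FIX_RE.search(commit.get("message", "")):
            continue  # not a bug-fix commit under the assumed heuristic
        for path in commit.get("modified", []):
            if path not in files:  # keep first-seen order, no duplicates
                files.append(path)
    return files


event = {
    "ref": "refs/heads/master",
    "commits": [
        {"message": "Fix crash on empty config", "modified": ["config.py"]},
        {"message": "Add prefix option", "modified": ["cli.py"]},
    ],
}
print(files_to_cache(event))
```

Note that the word boundary in the regex keeps "prefix" from counting as a fix.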
The paper's introduction section is more useful, but I wonder if people will get that far. Likewise the repo screenshot is useful but it's way down the README page.
Anyway, these are just well intentioned comments. It's easy to lose track that potential users won't get what you've been building until you pitch it in their terms.
Thanks for the input!
distance(e1, e2) = 1/c(e1, e2) if c(e1, e2) > 0, otherwise infinity, where c(e1, e2) is the number of times e1 and e2 have been changed together.
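In code, that distance could look something like the sketch below. It assumes each commit is given simply as a list of changed file paths; `co_change_counts` and the data layout are illustrative choices, not something from the app.

```python
import math
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count, for every pair of files, how many commits changed both.

    `commits` is an iterable of lists of file paths (assumed layout).
    """
    counts = Counter()
    for files in commits:
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(files)), 2):
            counts[(a, b)] += 1
    return counts


def distance(counts, e1, e2):
    """1 / c(e1, e2) if the two files ever changed together, else infinity."""
    c = counts.get(tuple(sorted((e1, e2))), 0)
    return 1 / c if c > 0 else math.inf


history = [["a.py", "b.py"], ["a.py", "b.py", "c.py"], ["c.py"]]
counts = co_change_counts(history)
print(distance(counts, "a.py", "b.py"))  # changed together twice
print(distance(counts, "a.py", "d.py"))  # never changed together
```

Files that change together often end up "close", so they are the natural candidates to pull into the cache alongside a file that was just fixed.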
I have not implemented the spatial entity in v0, as it is a bit tricky to identify the exact file that introduced a bug from a fix commit. For now, only the files modified in a bug-fix commit are put in the cache.
The app maintains a fixed-size cache of bug-prone files from fix commits and updates a pull request with information about the cache if the pull request touches any of these bug-prone files.