Replay started off as a simple experiment in what would happen if we added a step back button and rewind button to the Debugger. We quickly realized two things. First, nobody uses breakpoints. Second, being able to share is so much more powerful than being able to rewind.
Here’s how Replay works today. Somebody on the team records a bug with the Replay Browser and shares the replay URL with the team. From there, developers jump in and add print statements. The logs appear in the Console immediately, so you don’t need to refresh and reproduce a thing.
Over the past year we’ve talked to hundreds of users, recorded 2.5 million replays, and worked incredibly hard to ensure Replay would be fast, secure, and robust from the get-go.
Want to check it out? You can download Replay today. Can’t wait to hear what you think!
Interested in learning more? Here is our announcement blog post: https://medium.com/replay-io/launching-replay-the-time-trave...
That works well for Ruby and JavaScript, and probably lots of others.
e.g. something like the equivalent of
def foo
  bar.map do |b|
    b.do_something!
    # break only on the elements you care about
    debugger if b.state == :something_youre_interested_in
  end
rescue => e
  # break on any failure, with the exception in scope
  debugger
end
I'm really excited by Replay. I think it will be invaluable. Also, the GP claims "nobody" uses breakpoints? Everyone in my team uses them all the time.
I suppose it all depends on your language/toolchain; in my experience, breakpoints are very commonly used by developers in Visual Studio.
It's really a poor man's replay, though. That tool looks really slick, I'll definitely give the Python version a go if/when it comes!
Also, "nobody" uses breakpoints? Everyone in my team uses them all the time.
It saddens me that a lot of people don't use debuggers and default to adding print statements. As far as I can tell, it's for several reasons:
1. The debugger is primitive (e.g. Godot GDScript - no conditional breakpoints or watches).
2. The debugger is unstable (e.g. Android Studio - frequently hangs, or needlessly takes a long time to populate data).
3. The debugger's UI is not friendly (e.g. Android Studio - hitting a breakpoint in multiple threads causes unexpected jumps or loss of current state; VSCode C++ debugger - doesn't display information properly or easily (arrays of objects) or displays too much information (CPU registers, flags, memory addresses); C++ debugger for D - doesn't display D data types).
4. The debugger is not properly integrated into the environment - can't find symbols, libraries or source files, or finds the wrong source files, etc. Need to jump through hoops to configure those.
5. Platforms don't support debuggers properly (e.g. again Android - ANRs when debugging the main thread, can't leave a debugging session overnight without some timer killing the process)
6. Developers got used to the workflow of "add a print statement, rerun and check the console" since high school and nobody taught them a more powerful tool
7. Developers code all day, so adding print statements by coding feels more natural than switching to the debugger's UI and way of doing things. (e.g. "if (i == 100) console.log(value)" allows you to stay in the same code, as opposed to setting a breakpoint, finding out how to add the 'i == 100' condition and pray that there's no issue with variables being optimized out at runtime).
I like Replay's features and that it's improving the state of the current tools. At the end of the day, adding print statements in Replay doesn't seem to affect the state of the application, so in that sense it's similar to gdb commands in that it's just a UI choice, but I wouldn't go as far as encouraging print-based debugging.
Outside of Replay, print-based debugging is still a primitive way of analyzing the state of the app and promoting this state of affairs reduces the pool of people who use and would hopefully improve the existing debuggers.
We all appreciated Firebug and the Chrome DevTools because of the powerful features they give us to inspect the state of the application. Imagine a person who adds print statements to their code every time they want to inspect the DOM or check the current CSS attributes. It works, but we have better tools, and we should make them even better.
It is an amazing way to discover how a codebase works. You pick a point of interest, and then you get the entire path from the beginning of the app's execution to that point as your stack trace, and every variable along the way too. Watches are great too for tracking a value changing over time.
Micro-services and Docker also took debugging many steps backwards - one advantage of a monolith is that you can easily step-through the entire execution, whereas if you have to cross process-boundaries it becomes a lot more complex to properly debug.
I'm working on a monorepo template at the moment where everything is debuggable with a single click. This includes debugging native addons in C++ and Rust for Node.js projects. It's not easy - which is why people avoid debuggers so much.
I recently set up debugging in a Rust project for IntelliJ, where the alternative was adding `dbg!()` statements, which involved tens of seconds of recompilation. The difficulty was implementing the right pretty-printers in LLDB so you could see the values of various types, because support is quite immature at the moment.
But if I don't have VS or PyCharm available, I'll switch to printf debugging.
Though there are some cases where even with a good debugger I'll end up debugging by modifying the code. Sometimes it's necessary for performance reasons. Conditional breakpoints when debugging C# are extremely expensive so tossing one on a line that's executed many times may make the process far too slow. In that case it's better to compile in an if statement and then drop the breakpoint inside there. Other times the debugger is just limited in what information it can provide. Pointers to arrays in C++ are a common annoyance since the debugger has no length information.
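The workaround described above - compiling the condition into the code and parking a plain, unconditional breakpoint inside the branch - can be sketched in a few lines. This is Python purely for illustration (the comment is about C#; `find_hit` and its arguments are hypothetical names), but the pattern is the same in any language:

```python
def find_hit(items, suspect_index):
    """Hot loop: a debugger-evaluated conditional breakpoint on the loop
    body would pay the condition-check cost on every iteration. Compiling
    the condition into the code keeps the loop at full speed."""
    for i, item in enumerate(items):
        if i == suspect_index:  # condition lives in the code, not the debugger
            # Park a plain breakpoint on the next line; it is only ever
            # reached when the condition actually holds.
            return item
    return None
```

The debugger then does no per-iteration work at all; the branch is just ordinary compiled code until the interesting case arrives.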
Replay allows you to go back in time which is to me the biggest breakthrough. This actually makes them useful!
Same goes with automated QA. Record the UI tests using this, and if one fails, store the state + stack.
There are a LOT of hard problems in that workflow... Good luck!
Imagine a world where instead of ignoring, skipping, or marking "known failure" on all those flaky tests your CI hits (we all have them) you could capture recordings of them, and then actually investigate and fix them! That world is possible!
We don't currently have any plans to add additional authentication mechanisms but we've heard this feedback from a couple folks and we'll sit down to prioritize it after the excitement of launch has died down. Sorry about that!
How are side effects (mutating HTTP calls, cookie creation, other I/O) handled?
If you're interested you can learn more about how Replay works here: https://medium.com/replay-io/how-replay-works-5c9c29580c58
Lost me here; breakpoints are an invaluable tool for people who actually take the time to use their tools correctly.
It can be a bit cumbersome to set up and it can be a little buggy, especially when you're working with transpiled code, but to say no one uses breakpoints is a bit disingenuous.
Thanks for making this. It's so cool!
Replay does look interesting. I'm wondering about an itch I can't yet scratch. If someone on my team knows how to repro, we don't really need Replay: I have repro steps and control of the environment. But if a customer in the field can reproduce a bug that we can't reproduce in house - that's my itch. What is the least-friction, least-invasive way to get a Replay recording in that scenario?
Are you aware of that? If yes how is it different?
In fact: we've been collaborating with them on it! :)
https://app.replay.io/recording/cf2da81f-3a74-45ba-9241-f6d4...
Brian walks through our approach here https://medium.com/replay-io/how-replay-works-5c9c29580c58
- https://undo.io/ (It can also support Golang https://docs.undo.io/GoDelve.html)
- Mozilla RR https://rr-project.org/
- GDB https://www.gnu.org/software/gdb/news/reversible.html
Unfortunately, it works only on Linux.
Of course, it doesn't roll back the heap or any shared state, and playing forward it will redo things again, so beware of side effects. But in codebases with lots of immutable data structures (Kotlin ftw) it works great.
Combined with hot swapping, one can even drop the frame, change the implementation of the function, and then re-enter, making it possible to test code changes without spending a long time getting back into the same state/context.
One thing I wish Java debuggers supported was the ability to move the instruction pointer to a different line, as has been possible in other debuggers for ages. Is it a JVM limitation maybe? I remember being able to drag the "current line" pointer forwards or backwards in languages like C, C++, and C# in maybe 2003. I wish I could do this with Java; dropping the whole frame is useful but this feature lets you do a lot more, like break out of a loop or skip a block of code you _just_ realized shouldn't execute.
All the debuggers mentioned above for the backend work only under Linux because, from what I understand, they use the `ptrace` syscall, and Mac has a completely different format and different capabilities.
Do you plan to support Golang, especially on Mac, maybe with a custom fork or similar?
Thank you!
Sometimes things are complicated. Often there is a need to do digging to uncover the issue. Being able to move forward and backwards and even jumping between seemingly disjoint parts of the timeline are all at your disposal with Replay.
Replay has saved me hours of time. And that isn't hyperbole. On a couple of occasions, due to laziness and familiarity, I'd do stuff the traditional way and still be stuck after hours (sometimes days) on the same bug. With Replay I was able to shorten that time to about an hour on even the trickiest of bugs.
So stoked to now have Replay available to others to help record reproductions of their bugs.
How exactly? What is it that Replay gives you that you normally wouldn't have?
IMO the most time consuming issue in triaging bug reports is reproducing them. Replay definitively solves the "can't repro this" problem.
Bingo! We’ve been iterating on our story for a year, and the “let’s refer to Learnable Programming” angle felt like a huge breakthrough.
So glad it resonated with you, thanks for letting us know!
You can click on the line number next to a line of code to add a print at that location. If you hover over the line number, it'll show you a count (exact and correct) of how many times that line of code was hit during execution. Clicking on it adds a print log at that line with a default string.
At that point, the console log on the left should change immediately to include a bunch of new entries with "Loading.." text that resolves to the text of each print statement.
Clicking on the string allows you to replace it with an expression to be evaluated. The expression can include references to local scope variables, etc.
If you edit the expression, the console entries for the prints go back to "Loading.." and the new values are resolved.
The prints in the console are ordered in time sequence along with all the other events that can be shown there (regular console logs, mouse and keyboard and other events, etc.)
Pernosco's tool is described pretty well on their website, but basically it allows you to view a program inside and out, forwards /and/ backwards, with zero replay lag. Everything from stack traces to variable displays (at any point in time in your code execution) is extremely easy to view and understand. The best part is the lightning fast search functionality (again: zero lag).
On top of this: extraordinary customer service if anything breaks (in my experience, they fix bugs within 24 hours and are highly communicative).
After prying for technical details, Replay came up, and I asked to see it out of curiosity.
Really blew my mind. Every once in a while a piece of technology comes around that doesn't quite have an equivalent.
I could immediately see where being able to have your users or coworkers record bug reproductions or features and submit them in issues or PR's would save monumental amounts of time.
Wishing Replay team best of luck, I was thoroughly impressed.
I dunno, https://www.rrweb.io/ comes close (closer?) and it's open source.
There's also other paid solutions like FullStory and LogRocket.
From: https://www.notion.so/How-Replay-works-cc65abf5eb11443586abb...
So if you can’t capture API calls directly, what do you do? You drop down one level and record the browser system calls. This probably sounds like a terrible idea. We started off with a simple 3-line JS program with one API call and one clock, and instead of just recording those two calls, we’re now recording a program with millions of lines of C++ and the complexity of an operating system. Yep! It’s crazy, but it works, and it’s pretty awesome.
So the nice thing about system calls is there are not too many of them and they don’t change that often. This means that instead of recording an API call directly, we can record the network engine’s system calls to open sockets and process packets from the server. And by recording at this level, it’s possible to replay the website exactly as it ran before, with the same performance characteristics and everything else. It’s a little bit like putting your browser into “the Matrix” and tricking it into believing everything is normal, when in fact it is just running in a simulation.
The biggest difference is that when you're viewing a replay, we're re-running the identical browser on our backend. This way you can debug the real thing.
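To make the record-at-the-boundary idea concrete, here is a toy sketch - entirely hypothetical Python, nothing like Replay's real C++ recorder - of logging a nondeterministic call's results during recording and feeding the same results back on replay:

```python
import time

class Recorder:
    """Recording mode: run the real nondeterministic call and log its result."""
    def __init__(self):
        self.log = []

    def call(self, fn, *args):
        result = fn(*args)
        self.log.append(result)
        return result

class Replayer:
    """Replay mode: never run the real call; return what happened last time."""
    def __init__(self, log):
        self._log = iter(log)

    def call(self, fn, *args):
        return next(self._log)

def program(session):
    # A "program" mixing deterministic logic with one nondeterministic input.
    t = session.call(time.time)
    return "started at %.0f" % t

recording = Recorder()
original_run = program(recording)
replayed_run = program(Replayer(recording.log))
assert original_run == replayed_run  # the replay behaves identically
```

The point of recording at a narrow, stable boundary is exactly this: everything inside the boundary can be re-executed for real, because everything crossing it is played back from the log.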
The good: I checked out the tool and seems to work as advertised. Also nice to see Replay browser based on a Firefox fork.
The bad (and this is more your marketing/PR/branding, not product):
- You require an account signup, OK. It's a Google-only signup, OK, step over that. But it did not clearly mention that this put me on a mailing list, and sure enough, 5 minutes after I signed up, I got a random email asking me to support a launch on Product Hunt.
- With the amount of engineering that went into it, I would expect you to be proud of the craftsmanship and your team. Instead the top of your website states you are proud of getting money from investors. This is more a vote against this trend, than your particular behavior.
- I was able to find the post "How Replay works" [1], which is the actual content addressing your target market. The post conveys 2000 characters of information and uses 4.3MB of data to do it, for a signal-to-noise ratio of 0.04%. It is the type of web obesity [2] that we are used to nowadays, so nothing new. I mention this only because you are a web-engineering-centric company. Promoting the right values of web performance and engineering attention to detail is IMO important for a product talking to web engineers.
I realize this may come as unpopular/beyond conventional wisdom but getting a different perspective is what HN is good for. Use the feedback at your discretion.
Props for making an innovative product and good luck!
[1] https://medium.com/replay-io/how-replay-works-5c9c29580c58
Your post feels smugly opinionated. If that wasn’t your intention, I don’t know what to say. Just look at your first two bullets. Passive aggressive and more.
1) How does the step backward functionality work? Do you take snapshots every so often of the Javascript environment? How do you handle destructive assignments?
2) Does Replay record actual syscalls made by the browser, or is it recording calls to the browser APIs by the javascript code (which I guess are effectively syscalls from the javascript code's perspective)?
3) The ordered lock technique described in https://medium.com/replay-io/recording-and-replaying-d6102af... makes sure that threads access a given resource in the same order, but what about threads accessing different resources in the same order? e.g. when recording, thread 1 accesses resource A before thread 2 accesses resource B. It seems like the ordered lock technique doesn't help you maintain that ordering in the replay. Is maintaining that kind of ordering across resources not actually necessary most of the time?
1. Rather than having to restore state to the point at the previous step, we can step backwards by replaying a separate process to the point before the step, and looking at the state there (this post talks about how that works: https://medium.com/replay-io/inspecting-runtimes-caeca007a4b...). Because everything is deterministic it doesn't matter if we step around 10 times and use 10 different processes to look at the state at those points.
2. We record the calls made by the browser, though it is the calls into the system libraries rather than the syscalls themselves (the syscall interfaces aren't stable/documented on mac or windows).
3. Maintaining ordering like this isn't normally necessary for ensuring that behavior is the same when replaying. In the case of memory locations, the access made by thread 2 to location B will behave the same regardless of accesses made by thread 1 to location A, because the values stored in locations A and B are independent from one another.
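The reverse-stepping approach in answer 1 boils down to something like this toy sketch (hypothetical code, not Replay's): a "backwards step" never restores state; it deterministically re-executes to an earlier point, conceptually in a fresh throwaway process.

```python
def run_to(step, n, state):
    # Deterministic re-execution: the same steps always yield the same state.
    for i in range(n):
        state = step(state, i)
    return state

def reverse_step(step, current, initial):
    # Stepping back = replaying one step fewer from the beginning.
    return run_to(step, current - 1, initial)
```

Because replay is deterministic, ten backwards steps can use ten independent replays and still agree on every value, which is why it doesn't matter how many separate processes are used to inspect the timeline.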
For question 3 on the ordering, I was imagining the following kind of scenario: one thread maybe calls a system library function to read a cursor position and another calls a system library function to write a cursor position. So even though they're separate functions, they interact with the same state. Do you require users to manually call to the recorder library to give the recorder runtime extra info in this kind of scenario? Sorry if this is a dumb question, I haven't really done any programming at this level.
> The interface which Replay uses for the recording boundary is the API between an executable and the system libraries it is dynamically linked to.
I assume the ordered locks use a global order.
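For what it's worth, a toy sketch of an ordered lock (this `OrderedLock` is hypothetical and far simpler than whatever the linked post actually implements) might look like: record the order in which threads took the lock, then force that same order during replay.

```python
import threading

class OrderedLock:
    """Recording: log the order threads acquire the lock.
    Replaying: make each thread wait until it's its logged turn."""
    def __init__(self, replay_order=None):
        self._lock = threading.Lock()
        self._cond = threading.Condition()
        self._next = 0
        self.acquired_by = []             # filled in while recording
        self.replay_order = replay_order  # consulted while replaying

    def acquire(self, thread_id):
        if self.replay_order is None:          # recording mode
            self._lock.acquire()
            self.acquired_by.append(thread_id)
        else:                                  # replay mode: wait our turn
            with self._cond:
                while self.replay_order[self._next] != thread_id:
                    self._cond.wait()
            self._lock.acquire()

    def release(self):
        self._lock.release()
        if self.replay_order is not None:
            with self._cond:
                self._next += 1
                self._cond.notify_all()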
Partly because I want to see the product do well, but also selfishly as an engineer, because we've been dogfooding replay and it's made squashing bugs 10x easier. Having somebody attach a replay to an issue makes that issue immediately better than an expertly-written one, which as an engineer, I can start debugging in seconds with minimal back-and-forth.
Really impressive. Will keep it around for sure to try debugging a real issue next time. Congrats on the launch and the great app!
p.s. On some pages (e.g. https://www.replay.io/pricing), I only see the Mac download button, even though I'm running Edge on Windows.
Any advice for leveling up WinDbg skills, especially as they relate to post mortem analysis? I suspect I also need to develop better assembly (or is it machine?) language skills. I'd like to learn a lot more about this stuff but resources (free or paid) are hard to find.
Replay also makes it easier to jump into a new codebase, I can see how things work.
Also, a nice side effect is that it's amazing for exploring other codebases - just being able to put a console.log somewhere to see how often it runs when using an application is a lot of fun.
I want this to succeed, so I want the company to succeed. On that note I think you guys should change up your pricing.
Seems like there's too big a gap between the free forever (individual) plan and $20/mo/user for teams.
I'd love to pay for this as an individual at a smaller amount - like $10/mo - for a few extra features. Or maybe reduce the functionality of free forever.
The great thing about the individual plan is that you have access to the complete feature set of recording and replaying and can invite collaborators to work with others.
I'd encourage you to jump in, start using the product and sharing feedback with us and that'll help Replay immensely.
Replay does this at the system API level, catching network IO, disk IO, IPC, and any other system interaction done by the recorded browser.
Async dependencies are a tough nut to crack. The main query of importance there seems to be "when was this currently executing code first added to the scheduler". Time travel debugging gives us the infrastructure to answer that question in a single click (and transitively the entire chain back to the initial program execution).
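That scheduler query can be illustrated with a minimal sketch (hypothetical names, stdlib-only, nothing like Replay's actual implementation): capture the call stack at the moment a callback is enqueued, so you can later answer "who scheduled the code that's running now?".

```python
import traceback

class TracingScheduler:
    """Toy scheduler that remembers, for each callback, the stack that
    enqueued it -- the 'async dependency' back to the scheduling site."""
    def __init__(self):
        self.queue = []

    def schedule(self, callback):
        origin = "".join(traceback.format_stack()[:-1])  # stack at enqueue time
        self.queue.append((callback, origin))

    def run_all(self):
        results = []
        for callback, origin in self.queue:
            results.append((callback(), origin))
        self.queue = []
        return results
```

Time-travel debugging gets the same answer without instrumentation, since it can simply jump back to the moment the callback was added.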
However we haven't implemented specific support for these use cases yet. Our initial public beta feature set is "reversible-execution in debugging", "print statements that work in the past", and "cloud collaboration on debugging through shared discussions on individual replays". I'm oversimplifying but that's the gist.
We'll be prioritizing features to implement and user feedback is of course critical in directing that effort. Features for async debugging, more frameworks support, network monitoring, more runtimes and execution environments, time-travel watchpoints on variables and objects, and different domains such as CI-integration (having replays automatically made for CI test runs) and serverside (easily recording and replaying backend code - starting with nodejs).
The current MVP is pretty breathtaking, and there's a ton that can be done on top of it going forward, and we're excited to deliver more :)
I mentioned in my other comment about Windows support, but even better if you could do something like browserstack, where I could just direct users to a URL where in the backend you guys are running the replay browser, but from their perspective they're just "using the website", that would be a killer feature. "Here, go to this URL and make the bug happen again, as soon as it happens click the little bug icon" - wouldn't have to convince an IT department to allow custom software on their COE, I could foot the bill and pass it on in my invoices so don't need to convince their accounting to approve licensing, etc, and you wouldn't need to compile OS-specific clients...
Anyway I digress, really cool stuff and thanks for expanding a bit on how it works, taking something so low-level as syscalls and wrapping it up in a user-friendly interface is no mean feat - good luck!
My angle on this thread is that I don't have much control over the frontend frameworks and usually when I land in an enterprise integration job a lot of the broader architecture is already in place - longer term I can have some influence but generally debugging involves numerous stakeholders. A tool like replay seems useful because most testers / users are reporting issues from their direct experience using the frontend, and being able to record and scrub through their interactions would be a massive timesaver, in lieu of setting up custom testing frameworks etc.
I get what you're saying though, and I'll definitely check out cypress, just wanted to add some context.
Being able to get them to record the exact process and scrub through it at my leisure without having to worry about hammering APIs would be a massive timesaver, replication is about 90% of the time taken to fix a bug, while the fixes themselves are usually trivial. Not having to worry about accidentally replaying a bugged-out API call is a huge plus.
In my case the major hurdle I can foresee is the majority of my enterprise-level clients use Windows and a brief look at the website says currently there's only Mac / Linux support for the replay browser. What's the timeline on Windows support?
The only other thing I can think may be an issue is that the recordings are cloud-based and a lot of the clients I deal with are finnicky about exfiltration and governance, and would be a lot more comfortable self-hosting where possible. Is that possible or on the roadmap?
This looks like a super useful tool to add to the toolbelt for sure, I would love to have all my clients using something like this to report issues. Nice work all!
I’ve got tons of crash reports in Sentry that I have no idea what to do about.
On things like:
CFRelease() called with NULL
Replay could help me find where the heck that NULL came from, as I don't have any CFRelease call in my code.
Or the more annoying:
BUG IN CLIENT OF LIBDISPATCH: Assertion failed: Block was expected to execute on queue [com.apple.main-thread]
Like, how does that specific line run perfectly fine thousands of times and then every once in a while decide it needs the main thread? I wish I could replay that traceback to see why the main thread is suddenly needed.
For what it's worth, if it helps with marketing: I am immediately skeptical that it's going to be some mix of tedious to get working with my specific webpack dev server setup, slow or unreliable, or never actually keeping the state I need.
I’m hoping to be wrong on all of that. But when I scrolled the (beautiful) website, I would have loved to see some really nasty example in addition to the very elegant introductory “show don’t tell” example.
I want to see the first and think “okay I get it” and then be shown something really complicated to illustrate just how durable the tool is.
The usual bug workflow is for one person to file an issue with steps to reproduce, which the engineer will use to reproduce the bug and try to debug the problem. Replay replaces that workflow by allowing a person to record a replay, and send that link (which is immediately debuggable) to the engineer.
What I'd love to see is a logging framework which records the values of program specified variables while running as well as the current stack trace plus a monotonic time so you can piece together what happens through a thread of execution over time. However, unlike traditional logging, it would be connected to the source code like how a debugger works so you could mouse over a variable to see its state over time.
Honestly, it'd be really cool for a tracing system like this to be configurable without modifying source as well. Maybe trace specific functions by noting the stack trace and arguments when invoked, as well as the return value.
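A tiny sketch of that idea (hypothetical `traced` decorator, stdlib-only): record arguments, return value, a monotonic timestamp, and the call stack per invocation, so tooling could later map the log back to source the way a debugger does.

```python
import functools
import time
import traceback

TRACE_LOG = []  # in a real tool this would go to a queryable store

def traced(fn):
    """Record args, return value, a monotonic time, and the call stack
    for each invocation, without touching the function body."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        TRACE_LOG.append({
            "fn": fn.__name__,
            "args": args,
            "kwargs": kwargs,
            "return": result,
            "t": time.monotonic(),                   # monotonic, for ordering
            "stack": traceback.format_stack()[:-1],  # where we were called from
        })
        return result
    return wrapper

@traced
def add(a, b):
    return a + b

add(2, 3)
```

The stack-plus-timestamp pair is what lets you piece together a thread of execution over time, and the per-call records are what an editor could surface on mouse-over.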
Years ago I used Adobe LiveCycle, a horrible "low code" enterprise framework foisted on us from far above.
One thing I always did like about it though was its replay tool which this reminds me of. I was always surprised it wasn't more of a thing in the dev space, it seems very useful.
Would it be possible to communicate this by disabling the button and adding some kind of "coming soon" messaging straight to its right?
But there was one earlier presentation I cannot find, where a guy was showing live debugging of a video game. Not sure if it was TED or one of the conferences...
Edit:
This is it:
Bret Victor: https://youtu.be/EGqwXt90ZqA?t=1006
2013 Future of programming https://www.youtube.com/watch?v=8pTEmbeENF4
Many past threads on HN: https://hn.algolia.com/?q=worrydream
Replay debugging is a huge step forward, and gives console.log() much more power.
Thank you!
There used to be a product called "Chronon" back 10-12 years ago. Their blog spammed DZone and a lot of other websites, and they refused to pay for advertising. Their CEO encouraged people to stop writing log statements and just run their debugger all the time. Looks like it's defunct now: http://www.chrononsystems.com
We knew early on that a cool debugger wasn't going to cut it. Replay had to be super easy to adopt, fast, secure, and stable. That's our goal!
We're really grateful for the support we had within Mozilla in the early days, but what we've come to learn is that projects like Replay really benefit from being able to be nimble and solely focused on a great experience which is difficult when you're a small feature in a larger product.
Also you can find our runtime forks and entire frontend on github. http://github.com/RecordReplay/
Happy to answer any additional questions.
Better to spin out desirable features (like this) and buy in undesirable features (e.g. Pocket).
IMHO having Firefox as a base is not a great idea, Chrome would be better and closer to what users use.
I just visited the recording and the upload seems to have completed - I can view the recording from your link and debug it. I assume you shared it publicly?
Edit: on Ubuntu 18.04 I get the following error:
$ ./replay
./replay: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by ./replay)
> What are the limits for the recordings in terms of time and input data size?
The best answer right now is that it varies a lot depending on overall CPU usage, the amount of memory used, and the length, so it's somewhat difficult to nail down. Right now we recommend less than 2 minutes long as an attempt to keep things reasonable, but if it's 2 minutes at full CPU usage it may still not load.
> on Ubuntu 18.04 I get the following error:
For the glibc error, I'll ask around, but I'm not sure off the top of my head.
How do I know it's safe? Is there any other way to try it out?
However, we do need you to sign up for an account to record a replay.
It turned out to be multiple cases: one was that a colleague changed a default export to a named one, which excluded my screen. Another was in an unrelated test which missed a React context wrapper, so I needed to refactor her tests.
I don't know what kind of magic debugging tool would help with these kinds of things.
Genuinely excited!
That shouldn't be the case for you though so perhaps there's a network issue? Clearing your cache and trying again might resolve it. Feel free to jump on our discord (https://replay.io/discord) and we can help troubleshoot more together.
Your payload appears to be compiled down for older browsers (as evidenced by the use of `var` - unless you actually write code using `var`), but it misses the `||=` operator transformation, it seems.
Thus, the browser throws a syntax error, which your app interprets as "new version" for some reason.
By the way, please don't disable right clicking. It doesn't actually solve any problems and only annoys users.