How to Find and Fix Bugs in Commercial Software on Windows
According to Wikipedia, “A software bug is an error, flaw, failure or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways, eventually crashing the application. The process of fixing bugs is termed ‘debugging’ and often uses formal techniques or tools to pinpoint bugs […].”
Most of the time bugs can be reproduced in a developer environment and fixed before sending a version to production. But sometimes bugs depend on users’ particular environment or profile—for example, using special characters in a folder name—making it harder to understand a bug’s root cause. The worst bugs will require user support to set up a screenshare session in order to watch a user reproduce a bug.
At Dashlane, the Windows team uses several tools to find and reproduce bugs:
- Technical logs to spot where issues arise and help fix them, sometimes requiring several iterations and interaction with the user experiencing an issue
- Microsoft Process Monitor and Process Explorer, two free tools that pinpoint bad installations or a corrupted environment
Recently, Microsoft introduced a new version of another developer tool, WinDbg. The application was ported to a UWP app, is freely available on the Microsoft Store, and comes with a very useful feature: Time Travel Debugging (TTD). Think about a time you had to fix a very hard-to-reproduce bug that happened only on one specific machine. Or think about the time you spent installing a developer environment on that machine, if you could at all. And then think about how many times you had to launch your application in debug mode to reproduce the bug, only to miss the faulty line by accidentally hitting F5 instead of F11. TTD makes this far easier: you record the bug once, then replay it over and over, both forward and backward.
Time Travel Debugging with WinDbg
TTD is a reverse debugging solution. It consists of three steps:
- Record the app or process on the machine that can reproduce the bug. Run WinDbg with admin rights and launch an executable, or attach to one that is already running, to record its trace. The result is a trace file (.run extension) containing all of the information needed to reproduce the bug, which can include anything the process held in memory. For this reason, at Dashlane we use it only for bugs that happen before login, so as not to collect users’ sensitive information.
- Replay the recorded trace forward and backward as many times as necessary to understand the problem. Once the trace is loaded, WinDbg creates an index of it, which provides complete and fast memory lookup. The recorded file can be loaded on any machine; we usually open it on a developer machine, where we can load source code and PDB symbols. This way, we can replay the trace as if we were running the code in our IDE, with the process and memory state from the recorded machine. (Make sure that the source code and symbols match the recorded process version.)
- Analyze by adding breakpoints, running queries to identify common code issues, and inspecting memory and locals to understand what is going on.
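To make the record, replay, and analyze cycle concrete, here is a minimal toy model in Python. It is only a sketch of the idea and says nothing about how WinDbg stores or indexes a real trace: the `ToyTrace` and `Snapshot` names, the buggy loop, and the `balance` variable are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """State captured after one executed step (hypothetical structure)."""
    position: int
    variables: Dict[str, int]


class ToyTrace:
    """Record a run once, then move through it without re-running the program."""

    def __init__(self) -> None:
        self._snapshots: List[Snapshot] = []
        self._cursor = 0

    def record(self, variables: Dict[str, int]) -> None:
        # Copy the state so later mutations do not alter the recording.
        self._snapshots.append(Snapshot(len(self._snapshots), dict(variables)))

    def seek(self, position: int) -> Snapshot:
        # Clamp to the recorded range; assumes at least one snapshot exists.
        self._cursor = max(0, min(position, len(self._snapshots) - 1))
        return self._snapshots[self._cursor]

    def step_forward(self) -> Snapshot:
        return self.seek(self._cursor + 1)

    def step_back(self) -> Snapshot:
        return self.seek(self._cursor - 1)

    def find_first(self, predicate: Callable[[Snapshot], bool]) -> Optional[Snapshot]:
        # A "query" over the whole recording, independent of the cursor.
        return next((s for s in self._snapshots if predicate(s)), None)


def run_and_record() -> ToyTrace:
    """Record step: run the (deliberately buggy) program once."""
    trace = ToyTrace()
    state = {"i": 0, "balance": 10}
    for i in range(5):
        state["i"] = i
        state["balance"] -= 4  # bug: the balance is allowed to go negative
        trace.record(state)
    return trace


# Replay and analyze: query for the first bad state, then step backward from it.
trace = run_and_record()
bad = trace.find_first(lambda s: s.variables["balance"] < 0)
trace.seek(bad.position)
before = trace.step_back()
print("first bad state:", bad.position, bad.variables)
print("one step earlier:", before.position, before.variables)
```

The point of the sketch is that the program runs only once. Every later question, such as when a value first went wrong or what the state was just before, is answered from the recording.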
Suppose you have written and built a simple console application.
Let’s assume the built application (TestTTD.exe) and the symbol file (TestTTD.pdb) are in C:\TestTTD\bin, and the source code is in C:\TestTTD\src.
- Open WinDbg with elevated rights.
- On the File tab, choose Launch executable (advanced). In the Executable field, enter C:\TestTTD\bin\TestTTD.exe. Check Record process with Time Travel Debugging, set the Output directory to an existing, writable path, then click OK.
- When the application crashes, WinDbg saves a trace file (.run extension) and logs its location as it loads the trace.
- Stop the debugger, and open the trace file location.
- Open WinDbg.
- On the File tab, select Settings from the left sidebar. Select Debugging settings, add C:\TestTTD\src to the Source path field and C:\TestTTD\bin to the Symbol path field. Click OK to save changes.
- On the File tab, select Open trace file, and browse to the trace file created at the Record step.
- Once the recorded trace file is loaded in WinDbg, add a breakpoint to stop and analyze the state of the application. Load a source file from the Source tab, then click in the left margin of the line where you want the breakpoint. Hit F5 to run to the breakpoint.
- Now hit F10 to step over, as in a Visual Studio debugging session. Click on Step Over Back to step backward.
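If the output directory fills up with recordings, a small script can pick out the trace to open in the replay steps. The following Python sketch assumes only what is stated above, namely that traces carry a .run extension and sit directly in the output directory. The `latest_trace` helper and the directory in the usage note are hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_trace(output_dir: str) -> Optional[Path]:
    """Return the most recently modified .run file in output_dir, or None."""
    traces = list(Path(output_dir).glob("*.run"))
    if not traces:
        return None
    # The newest modification time is taken to be the latest recording.
    return max(traces, key=lambda p: p.stat().st_mtime)
```

For example, `latest_trace(r"C:\TestTTD\traces")` would return the newest trace in that (hypothetical) directory, or `None` if nothing has been recorded there yet.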
When you are debugging production code and hard-to-reproduce bugs, WinDbg and its Time Travel Debugging feature are a great tool: you save time and resources by recording the faulty application once, on the affected environment, and then replaying the trace anywhere. For more information, please visit the Microsoft blog, or watch their demo on MSDN Channel 9’s website.