Oct 13, 2021 3:00 AM

Optimizing C and C++ Windows binaries with SizeBench

More of Microsoft’s internal tools get released to the outside world.

One key to the modern application development life cycle is the toolchain: all the various applications and services that come together to deliver our own custom tools. They range from code generators that build scaffolding and deliver application frameworks, through IDEs and editors, to testing tools and continuous integration and continuous delivery (CI/CD) pipelines. It’s often easy to get used to one set of tools, using them to the exclusion of everything else.

That’s not a bad thing; it pays to get familiar with one set of keystrokes, building the muscle memory needed to get peak productivity. But if we get fixated on the tools we use, it’s possible to miss out on new ones that might go a long way to helping us deliver better code more quickly. So how do we find out about them?

One source of useful information is Microsoft’s various development blogs, hosted on its DevBlogs site. I’d recommend subscribing to the site’s RSS feed to keep up to date with posts from across the developer platform, with a focus on languages, Visual Studio, and working with Azure’s tools. One recent post from the Performance and Diagnostics team talked about the public release of what has long been an internal tool: SizeBench.

Microsoft’s own tooling is always interesting to explore. We tend to forget that at heart, the company is a development organization that uses its own tools for much of what it does, and the lessons it learns often drive the next generation of tools. Microsoft’s shift to using Git for Windows development led to the development of the Git Virtual File System (GVFS), allowing developers to work with tens of thousands of objects hosted in Git without having to download an entire repository every time they needed to work on a single file.

Introducing SizeBench

SizeBench is one of those tools that clearly started out scratching a Microsoft itch, providing a way to understand how big your binaries will be on users’ devices. More specifically, it lets you examine how an application came to be that size, giving you the opportunity to quickly make source code changes that could have a significant effect. It was originally used by the Windows Phone team but is now used across the company to help manage C and C++ code.

Why is application size still important? We’re no longer trying to squeeze code onto 1.44MB floppies or install it onto 20GB hard drives. Most machines now have at least 128GB of storage and often more than a terabyte. That often leads us to assume that binary size doesn’t matter. But there’s still one big limitation on our code: bandwidth. Not everyone has access to fast broadband, and where remote work is increasingly important, download speeds remain an issue. If we can make downloads smaller, it’ll be easier to push out updates, ensuring that important security updates aren’t skipped.

There’s also the issue of reading data from storage into memory. It’s less of a problem on PCs with faster disks and CPUs, but things become more of an issue when we move to working with software-defined storage in public and private clouds. Suddenly we’re shifting data over the network every time we launch an application, either loading containers or starting up virtual machines. Add the startup times for serverless application instances, where we’re launching full virtual environments to host our functions, and there’s distinct value in keeping cloud code small.

Analyzing binaries with SizeBench

SizeBench is perhaps best thought of as a specialized form of static analysis. The files it analyzes need to be in the Portable Executable (PE) format. PE files are the most common form of Windows executable; they’re not specific to any one architecture, so you can have PE binaries for ARM, x86, and x64. It’s a long-lived format, with elements that date back to DOS. Both DLLs and EXEs use the PE format, so SizeBench can work with most of your compiled binaries.
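To make the PE format a little more concrete, here is a minimal sketch of how a tool might recognize a PE file and report its target architecture. It relies only on well-documented parts of the format: the DOS-era "MZ" header, the offset stored at 0x3C, the "PE\0\0" signature, and the Machine field that follows it. The function names are made up for illustration, and this is not how SizeBench itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reads a little-endian integer of `width` bytes (at most 4) starting at `offset`.
static std::uint32_t read_le(const std::vector<std::uint8_t>& bytes,
                             std::size_t offset, std::size_t width) {
    assert(width <= 4 && offset + width <= bytes.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value |= static_cast<std::uint32_t>(bytes[offset + i]) << (8 * i);
    }
    return value;
}

// Hypothetical helper: names the architecture of a PE image held in memory,
// or returns "not a PE file" if the headers do not match.
std::string describe_pe_machine(const std::vector<std::uint8_t>& image) {
    // The 64-byte DOS header starts with "MZ" and stores the offset of the
    // PE signature at 0x3C. This is the part that dates back to DOS.
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z') {
        return "not a PE file";
    }
    const std::size_t pe_offset = read_le(image, 0x3C, 4);

    // We need the 4-byte signature plus the 2-byte Machine field after it.
    if (pe_offset > image.size() || image.size() - pe_offset < 6) {
        return "not a PE file";
    }
    if (image[pe_offset] != 'P' || image[pe_offset + 1] != 'E' ||
        image[pe_offset + 2] != 0 || image[pe_offset + 3] != 0) {
        return "not a PE file";
    }

    // Machine values as listed in Microsoft's PE format documentation.
    switch (read_le(image, pe_offset + 4, 2)) {
        case 0x014c: return "x86";
        case 0x8664: return "x64";
        case 0x01c4: return "ARM";
        case 0xAA64: return "ARM64";
        default:     return "unknown machine";
    }
}
```

Reading a DLL or EXE into a byte vector and passing it to `describe_pe_machine` is enough to tell whether you are looking at the kind of file SizeBench expects.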

SizeBench shows you what code produces the most bytes. As the announcement blog post notes, that can be counterintuitive, as much of what modern compilers do to optimize your code is hidden from you. You hit “build” and get a binary. One example given is C++ templates. They may be small in your editor, but when they’re used across many different types, suddenly there can be hundreds if not thousands of copies of the template in your binary. A few kilobytes of text end up as megabytes of code, all due to how the compiler works with it.
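A small sketch shows why templates multiply. The helper below is hypothetical and deliberately tiny; a real template body is usually far larger. Because `sizeof(T)` is baked into each instantiation, the compiler has to generate a separate function for every type the template is used with, although an optimizer may later inline or merge some of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: bytes needed to store `count` elements of type T.
// It is a few lines of source, but every distinct T gets its own copy.
template <typename T>
std::size_t storage_bytes(std::size_t count) {
    // Guard against overflow in the multiplication below.
    assert(count <= std::numeric_limits<std::size_t>::max() / sizeof(T));
    return count * sizeof(T);
}

// Three element types, so three instantiations of the same source lines.
std::size_t bytes_for_mixed_buffers(std::size_t count) {
    return storage_bytes<std::uint8_t>(count)
         + storage_bytes<std::uint16_t>(count)
         + storage_bytes<std::uint32_t>(count);
}
```

Scale that up to a large template used with hundreds of types and it becomes clear how a little source text can turn into a lot of machine code.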

Once you’ve installed SizeBench from the Microsoft Store, you can start to use it on your code. You’ll need both the compiled executable and the PDB file used to store debugging symbols. With the two files selected, you can start to drill down into a binary.

The application gives you two main areas to explore. One lets you explore different areas of your binary, and the other shows known sources of waste. The first option, looking at binary sections, will show you how code is loaded into memory. Data is rounded up to page sizes so you can see how it’s actually allocated when loaded. Each section can be explored still further, for example, showing what libraries contribute the most to binary size.
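The page rounding is simple arithmetic. As a rough sketch, assuming 4KB pages (the usual size on x86 and x64 Windows) and using made-up function names, the size a section occupies once loaded can be worked out like this:

```cpp
#include <cstdint>

// Assumption for illustration: 4KB pages, as is usual on x86 and x64 Windows.
constexpr std::uint64_t kAssumedPageSize = 4096;

// Rounds a section's raw size up to the next whole page.
constexpr std::uint64_t size_in_memory(std::uint64_t raw_bytes,
                                       std::uint64_t page_size = kAssumedPageSize) {
    return (raw_bytes + page_size - 1) / page_size * page_size;
}

// Bytes of padding added by the rounding: memory that is allocated but unused.
constexpr std::uint64_t padding_bytes(std::uint64_t raw_bytes,
                                      std::uint64_t page_size = kAssumedPageSize) {
    return size_in_memory(raw_bytes, page_size) - raw_bytes;
}
```

With these assumptions, a section holding 5,000 bytes of data occupies two pages (8,192 bytes) in memory, so trimming it below 4,096 bytes would save a whole page.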

Exploring code can feel random at first, as each link you click on takes you deeper into your binary. But after a while you can start to track down functions and other elements that might be managed differently, using symbol links to show the original source files used. Links from inside the application can be shared with other developers, allowing you to quickly send an interesting report to a colleague. All you need is access to the same binaries and symbols to share an analysis.

Using prepackaged analysis for quick results

There’s a lot here, so it’s often easier to jump straight to the prepackaged waste analysis that Microsoft engineers have built into the application. These include tools to find wasteful virtuals, which can affect large codebases with lots of virtual functions, as each virtual function requires space in the binary for every type and that space can’t be shared between types. Replacing these with ordinary functions that can be called directly saves space, makes code more compatible with modern security measures, and makes it easier for a compiler to optimize.
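Here is a minimal sketch of the kind of change involved, assuming the wasteful case is a virtual function that nothing actually overrides. The class names are hypothetical. In the "before" version, `border()` takes up a slot in the virtual table of every concrete type; in the "after" version it is an ordinary member function that can be called directly.

```cpp
#include <cassert>

// Before: border() is virtual, so each concrete type's virtual table carries
// a slot for it, even though no derived class ever overrides it.
struct WidgetBefore {
    virtual ~WidgetBefore() = default;
    virtual int width() const = 0;
    virtual int border() const { return 1; }
};

// After: only the function that genuinely varies by type stays virtual.
struct Widget {
    virtual ~Widget() = default;
    virtual int width() const = 0;      // overridden by every concrete type
    int border() const { return 1; }    // direct call, no virtual table slot

    int outer_width() const {
        const int inner = width();
        assert(inner >= 0);             // widths are expected to be non-negative
        return inner + 2 * border();
    }
};

struct Button final : Widget {
    int width() const override { return 80; }
};

struct Label final : Widget {
    int width() const override { return 40; }
};
```

The behavior is unchanged for callers, but the direct call to `border()` no longer goes through an indirect jump, which is what gives the compiler room to inline it.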

The results of using this tool to work through functions in an older codebase can be significant: Microsoft was able to reduce the size of the Windows.UI.Xaml DLL by more than a megabyte, identifying bad patterns and rewriting sections of the code.

Other tools help track down duplicated data, usually the result of static declarations, which can leave a separate copy of the same data in the binary for every place that references it. Finding duplicates can show that you’re not using the right compiler options for your code; changing them might enable optimizations that reduce the amount of data stored in your binary. The other analytical tool in SizeBench looks at template foldability, looking for near-identical copies of the same base template in your code. Here, a few changes in code to make your templates more similar can reduce code size substantially.
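One common way to act on a template finding is to move the type-independent work into a single non-template function and leave only a thin typed wrapper behind. This is a general C++ technique rather than anything SizeBench prescribes, and the names below are hypothetical. The sketch restricts itself to integer types, where comparing raw bytes is safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Non-template core: compiled once, whatever the element type.
// Returns the index of the first element whose bytes match `needle`,
// or `count` if nothing matches.
inline std::size_t find_bytes(const void* items, std::size_t count,
                              std::size_t item_size, const void* needle) {
    assert(item_size > 0);
    const unsigned char* cursor = static_cast<const unsigned char*>(items);
    for (std::size_t i = 0; i < count; ++i, cursor += item_size) {
        if (std::memcmp(cursor, needle, item_size) == 0) {
            return i;
        }
    }
    return count;
}

// Thin typed wrapper: each instantiation is just a forwarding call, so the
// per-type cost in the binary is small.
template <typename T>
std::size_t find_item(const T* items, std::size_t count, const T& needle) {
    static_assert(std::is_integral<T>::value,
                  "byte-wise comparison is only used here for integer types");
    return find_bytes(items, count, sizeof(T), &needle);
}
```

The trade-off is that the shared loop works with a size known only at run time, so it may be slower than a fully typed version; it is worth measuring before applying this to hot paths.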

One thing to note: This is only for compiled C and C++ code, built using Microsoft’s own compiler and linker. You can’t use it with managed code or with code linked using Clang’s linker (although Clang-compiled code linked with Microsoft’s linker can be analyzed). There’s also no support for other languages, such as Java.

Using a tool like SizeBench helps you get closer to your code. Too often we build it and pass it straight on to users, not thinking about how our compilers work or what they do to take source and make a binary. Once you understand how a modern compiler works with your code, you can start to pre-optimize your own code, choosing design patterns and coding techniques that deliver smaller, more efficient binaries. Adding tools like SizeBench to our toolchains will not only make users’ lives easier, it will also make applications easier to debug and monitor.