Native Memory Profiling with Android Studio 4.1

Joshua Gilpatrick · Android Developers · Jul 31, 2020


This is the second post in a two-part series on What’s New in Profilers in Android Studio 4.1. Our previous post focused on What’s New in System Trace.

We’ve heard from those of you using C++ that debugging native memory can be fairly difficult, particularly in games. With Android Studio 4.1, we’ve implemented the ability to record call stacks of native memory allocations in our Memory Profiler. The native memory recording is built on top of the Perfetto backend, the next-generation performance instrumentation and tracing solution for Android.

A common technique when trying to debug memory issues is to understand what is allocating memory and what is freeing it. The rest of this article walks through how to use the Native Memory Profiler to track down a leak, using the GPU Emulation Stress Test as an example project.
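
To make that idea concrete, here is a minimal, hypothetical C++ sketch (none of these functions come from the sample project) of the kind of mismatch the profiler helps surface: the allocation happens inside a helper, while the matching free is the caller’s responsibility and is easy to forget.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: allocates with malloc and hands ownership to the
// caller. In the profiler, the allocation is attributed to this call stack.
char* copyName(const char* name) {
    char* buffer = static_cast<char*>(std::malloc(std::strlen(name) + 1));
    std::strcpy(buffer, name);
    return buffer;
}

void frame() {
    char* label = copyName("gpu-emulation");
    // ... use label ...
    // If this free is forgotten, every call to frame() leaks a little more
    // native memory and the remaining size keeps growing run after run.
    std::free(label);
}

int main() {
    frame();
    return 0;
}
```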

Getting Started

To follow along, clone or download the sample from https://github.com/google/gpu-emulation-stress-test.

When a memory leak is suspected, it’s often a good idea to start at a high level and watch for patterns in the system memory. To do this, click the Profile button in Android Studio and open the Memory Profiler for more detailed memory tracking information.

Top-level view of the Memory Profiler, showing a gradual increase in native memory with each run of the “Gpu emulation stress test”

After running the simulation a few times, we can see a few interesting patterns.

  1. The GPU memory increases as one might expect from a GPU emulation app; however, this memory also appears to be properly cleaned up after the Activity finishes.
  2. The native memory grows each time we enter the GpuEmulationStressTestActivity, but it does not seem to be released after each run, which might indicate a leak.

Native Memory Table View

Starting with Android Studio 4.1 Canary 6, we can grab a recording of native memory allocations to analyze why memory isn’t being released. To do this with the GPU emulation app, I stopped the running app and started profiling a fresh instance. Starting from a clean state, especially when looking at an unfamiliar codebase, can help narrow our focus. From the Memory Profiler, I captured a native allocation recording for the duration of the GPU emulation demo. To do this, restart the app by selecting Run > Profile ‘app’. After the application starts and the profiling window opens, click on the Memory Profiler and select “record native allocation”.

First look at a native memory capture when it is loaded in Android Studio.

The table view is useful for games and applications that use libraries implementing their own allocators, since it highlights malloc calls that are made outside of new.

When a recording is loaded, the data is first presented in a table. The table shows the leaf functions calling malloc. In addition to the function name, the table shows module, count, size, and delta. This information is sampled so it is possible not all malloc / free calls will be captured. This largely depends on the sampling rate, which will be discussed a bit later.
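
For example, many game engines route allocations through their own allocator rather than operator new; those allocations still bottom out in malloc, so they show up in the table under the allocator’s functions. A hypothetical sketch (EngineAllocator is not from the sample project):

```cpp
#include <cstdlib>

// Hypothetical engine-style allocator that bypasses operator new.
// In the table view, the leaf caller of malloc is EngineAllocator::allocate;
// "Arrange by call stack" then shows which systems allocated through it.
class EngineAllocator {
public:
    void* allocate(std::size_t bytes) {
        return std::malloc(bytes);   // recorded as a native allocation
    }
    void release(void* ptr) {
        std::free(ptr);              // recorded as the matching free
    }
};

struct Particle { float x, y, z, life; };

int main() {
    EngineAllocator allocator;
    auto* particles =
        static_cast<Particle*>(allocator.allocate(1024 * sizeof(Particle)));
    // ... simulate ...
    allocator.release(particles);
    return 0;
}
```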

It is also useful to know where these memory-allocating functions are being called from. There are two ways to visualize this information. The first is by changing the “Arrange by allocation method” dropdown to “Arrange by call stack”. The table then shows a tree of call stacks, similar to what you may expect from a CPU recording. If the current project has symbols (which is usually the case for debuggable builds; if you’re profiling an external APK, check out the guide here), they will automatically be picked up and used. This allows you to right-click on a function and select “Jump to Source”.

Within the table view, right-clicking an element shows a “Jump to Source” context menu

Memory Visualization (Native and non-native)

We’ve also added a new flame chart visualization to the memory profilers, allowing you to quickly see which call stacks are responsible for allocating the most memory. This is especially useful when a call stack is very deep.

There are four ways you can sort this data along the X axis:

  • “Allocation Size” is the default, showing the total amount of memory tracked.
  • “Allocation Count” shows the total number of objects allocated.
  • “Total Remaining Size” is the size of memory sampled throughout the capture that was not freed before the end of the capture.
  • “Total Remaining Count”, like the remaining size, is the count of objects captured but not freed before the end of the capture. (The short sketch after this list illustrates how the remaining values differ from the allocation totals.)
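
As a concrete, hypothetical illustration of the difference: the loop below allocates 100 buffers during a capture but frees only 90 of them before it ends, so “Allocation Count” would report about 100 while “Total Remaining Count” would report roughly 10 (both subject to sampling).

```cpp
#include <cstdlib>
#include <vector>

int main() {
    std::vector<void*> buffers;
    for (int i = 0; i < 100; ++i) {
        buffers.push_back(std::malloc(1024));  // 100 allocations of 1 KB each
    }
    for (int i = 0; i < 90; ++i) {
        std::free(buffers[i]);                 // only 90 are freed before the capture ends
    }
    // "Allocation Size" ~ 100 KB, "Allocation Count" ~ 100
    // "Total Remaining Size" ~ 10 KB, "Total Remaining Count" ~ 10
    return 0;
}
```
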
With this capture loaded, in the “Total Remaining Size” view, it is easy to see that “lodepng” is responsible for allocating a lot of memory.

From here we can right-click on the call stacks and select “Jump to Source” to go to the line of code responsible for the allocation. However, taking a second glance at the visualization, we notice that the common parent, WorldState, is responsible for multiple leaks. To validate this, it can help to filter the results.

Filtering / Navigation

As with the table view, the chart can be filtered using the filter bar. When a filter is applied, the data in the chart automatically updates to show only call stacks that contain functions matching the searched word or regex.

After applying a filter, it seems clear that WorldState is responsible for leaking roughly 70 MB of our assumed total leak of about 72 MB.

Sometimes call stacks can get fairly long, or there just isn’t enough room to display the function names on screen. To assist with this, Ctrl + mouse wheel zooms in and out, or you can click on the chart and use the W, A, S, D keys to navigate.

Verifying the findings

Adding a breakpoint and running the emulation twice quickly reveals that, on the second run, we cause the leak by overwriting the pointer from our first run.

Quick view of the debugger showing that “sWorld” already has a value the second time around
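
A minimal sketch of that pattern is below. The sWorld and WorldState names are taken from the screenshots above, but the rest (startSimulation, the struct contents) is hypothetical rather than the sample’s actual code: the global pointer from the previous run is overwritten without being deleted, so the first run’s WorldState is never freed.

```cpp
// Hypothetical reconstruction of the leak pattern, not the sample's exact code.
struct WorldState {
    // ... buffers decoded from PNGs, simulation state, etc. ...
};

static WorldState* sWorld = nullptr;

void startSimulation() {
    // On the second run, sWorld still points at the first run's WorldState.
    // Overwriting it here drops the only pointer to that memory -> leak.
    sWorld = new WorldState();
}

int main() {
    startSimulation();  // first run
    startSimulation();  // second run: the first WorldState is now unreachable
    return 0;
}
```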

As a quick fix to the sample, we can delete the world after it is marked done, then profile the application again to validate the fix.
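
A sketch of the corresponding fix, under the same assumptions as the snippet above (onSimulationDone is a hypothetical name): release the previous WorldState once the simulation is marked done, so re-entering the activity starts from a clean slate.

```cpp
// Continuing the hypothetical sketch above.
struct WorldState {
    // ... buffers decoded from PNGs, simulation state, etc. ...
};

static WorldState* sWorld = nullptr;

void onSimulationDone() {
    delete sWorld;     // releases the memory held since the last run
    sWorld = nullptr;  // the next run allocates a fresh WorldState
}

int main() {
    sWorld = new WorldState();  // a run allocates the world...
    onSimulationDone();         // ...and now frees it once marked done
    return 0;
}
```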

Memory view after running the demo two times

We end where we started, by looking at the high-level memory stats and validating that deleting sWorld at the end of the simulation frees up the ~70 MB held by our first run.

Startup profiling and sample rate setting

The sample above shows how native memory tracking can be used to find and fix memory leaks. Another common use for native memory tracking is understanding where memory is going during startup of the application. In Android Studio 4.1, we also added the ability to capture native memory recordings from the startup of the application. This is available in the “Run/Debug Configurations” dialog under the “Profiling” tab.

Profiling tab located in the Run Configuration dialog.

You can customize the sampling interval or record memory at startup in the Run configuration dialog.

Here you can also change the sampling interval for new captures. A smaller interval captures more allocations but can have a large impact on overall performance, while a larger interval can miss some allocations. Different intervals work for different types of memory problems.

Wrapping up

With the new native memory profiler, finding memory leaks and understanding where memory is being held onto just got a little bit easier. Give the native memory profiler a try in Android Studio 4.1, and leave any feedback on our bug tracker. For additional tips and tricks, be sure to also check out our talk from earlier this year at the Google for Games summit, Android memory tools and best practices.
