Native Memory Profiling with Android Studio 4.1
This is second in a two part series on What’s New in Profilers in Android Studio 4.1. Our previous post focused on What’s New in System Trace.
We’ve heard from those of you using C++ that debugging native memory can be fairly difficult, particularly in games. With Android Studio 4.1, we’ve implemented the ability to record call stacks of native memory allocations in our Memory Profiler. The native memory recording is built on top of the Perfetto backend, the next generation performance instrumentation and tracing solution for Android.
A common technique when trying to debug memory issues is to understand what is allocating memory and what is freeing memory. The rest of this article will walk you through how to use the Native Memory Profiler to help track down a leak, using the Gpu Emulation Stress Test as an example project.
Getting Started
To follow along, clone or download the sample from https://github.com/google/gpu-emulation-stress-test.
When a memory leak is suspected, it’s often a good idea to start at a high level and watch for patterns in the system memory. To do this click the profile button in Android Studio, and enter the memory profiler for more detailed memory tracking information.
After running the simulation a few times we can see a few interesting patterns.
- The GPU memory increases as one may expect from a GPU emulation app, however it also looks like this memory gets properly cleaned up after the Activity is finished.
- The Native memory grows each time we enter the GpuEmulationStressTestActivity, however this memory does not seem to reset after each run, which might be indicative of a leak.
Native Memory Table View
Starting with Android Studio 4.1 Canary 6, we can grab a recording of native memory allocations to analyze why memory isn’t being released. To do this with the GPU emulation app, I stopped the running app and started profiling a fresh instance. Starting from a clean state, especially when looking at an unfamiliar codebase, can help narrow our focus. From the memory profiler I captured a native allocation recording throughout the duration of the GPU emulation demo. To do this restart the app by selecting Run-> Profile ‘app’. After the application starts and the profile window opens, click on the memory profiler and select “record native allocation”
The table view is useful for games/applications that use libraries implementing their own allocators highlighting malloc calls that are made outside of new.
When a recording is loaded, the data is first presented in a table. The table shows the leaf functions calling malloc. In addition to the function name, the table shows module, count, size, and delta. This information is sampled so it is possible not all malloc / free calls will be captured. This largely depends on the sampling rate, which will be discussed a bit later.
It is also useful to know where these functions that allocate memory are being called from. There are two ways to visualize this information. The first is by changing the “Arrange by allocation method” dropdown to “Arrange by call stack”. The table shows a tree of callstacks, similar to what you may expect from a CPU recording. If the current project has symbols (which is usually the case for debuggable builds; if you’re profiling an external APK check out the guide here) they will automatically be picked up and used. This allows you to right click on a function and “Jump to source”.
Memory Visualization (Native and non-native)
We’ve also added a new flame chart visualization to the memory profilers, allowing you to quickly see what callstacks are responsible for allocating the most memory. This is especially useful when a call stack is really deep.
There are four ways you can sort this data along the X axis:
- “Allocation Size” is the default, showing the total amount of memory tracked.
- “Allocation Count” shows the total number of objects allocated.
- “Total Remaining Size” is the size of memory sampled throughout the capture that was not freed before the end of the capture.
- “Total Remaining Count”, like the remaining size, is the count of objects captured but not freed before the end of the capture.
From here we can right click on the call stacks and select “Jump to Source” to take us to the line of code responsible for the allocation. However, taking a second glance at the visualization, we notice that the common parent, WorldState, is responsible for multiple leaks. To validate this, it can help to filter the results.
Filtering / Navigation
Like with the table view, the chart can be filtered using the filter bar. When the filter is used, the data in the chart is automatically updated to show only call stacks that have functions matching the word/regex searched.
Sometimes call stacks can get fairly long, or there just isn’t enough room to display the function name on screen. To assist with this, ctrl + mouse wheel will zoom in/out, or you can click on the chart to use W,A,S,D to navigate.
Verifying the findings
Adding a breakpoint and running the Emulation twice quickly reveals that on the second run we cause the leak by overwrite the pointer from our first run.
As a quick fix to the sample we can delete the world after it is marked done, profiling the application again to validate the fix.
Ending where we started by looking at the high level memory stats. Validating that deleting sWorld at the end of the simulation frees up the 70mb held by our first run.
Startup profiling and sample rate setting.
The sample above shows how native memory tracking can be used to find and fix memory leaks. Another common use for native memory tracking is understanding where memory is going during startup of the application. In Android Studio 4.1, we also added the ability to capture native memory recordings from the startup of the application. This is available in the “Run/Debug Configurations” dialog under the “Profiling” tab.
You can customize the sampling interval or record memory at startup in the Run configuration dialog.
Here you can also change the sampling rate for new captures. A smaller sampling rate can have a large impact on overall performance, while a larger sampling rate can miss some allocations. Different sampling rates work for different types of memory problems.
Wrapping up
With the new native memory profiler finding memory leaks and understanding where memory is being held on to just got a little bit easier. Give the native memory profiler a try in Android Studio 4.1, and leave any feedback on our bug tracker. For additional tips and tricks be sure to also check out our talk earlier this year at the Google for Games summit, Android memory tools and best practices.