Corporate Proxy SSL inspection and Pain (Zscaler + Android Studio emulator)

December 12, 2024

I’m Just the Man in the Middle!

We recently rolled out Zscaler at our org, which has been real “fun” to do. It has definitely come with quite a number of pain points. For those who don’t know, one of Zscaler’s core features is proxying web traffic, largely to perform content filtering, threat analysis, and more. One of the ways this occurs is through SSL inspection, where the proxy swaps the site’s SSL certificate for a new one issued by a CA your devices trust (by default, a Zscaler CA cert for your org). Funnily enough, this is effectively a Man-in-the-Middle (MitM). For most things, this is fine, because you will have deployed the Zscaler certificate to your device’s trust store (Keychain on macOS), which most applications and tools use. However, many applications are more sensitive to MitM because they have their own trust stores (fun fact, even Firefox has its own trust store!).
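One quick way to see the swap for yourself is to look at who issued the certificate a site presents. The sketch below is only an illustration, not anything Zscaler provides: the function names are made up for this post, and it assumes Python can complete the TLS handshake at all (if Python does not trust the proxy CA, the connection fails with a verification error, which is a symptom in itself).

```python
import socket
import ssl


def issuer_fields(peer_cert: dict) -> dict:
    """Flatten the 'issuer' entry of SSLSocket.getpeercert() into a plain dict."""
    fields = {}
    # 'issuer' is a tuple of RDNs; each RDN is a tuple of (name, value) pairs.
    for rdn in peer_cert.get("issuer", ()):
        for key, value in rdn:
            fields[key] = value
    return fields


def fetch_issuer(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host and return the issuer fields of the cert it presents."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return issuer_fields(tls.getpeercert())


# Hypothetical usage: compare the issuer with inspection on and off.
# print(fetch_issuer("example.com").get("organizationName"))
```

If the organization name changes to your proxy vendor's CA when inspection is on, the traffic is being inspected.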

To restore trust for those applications or tools, you either have to tell them to use the system trust store (which many applications don’t have the option to do), or you have to tell the tool to trust the Zscaler cert, typically by pointing to it or adding it to the tool’s own trust store. Otherwise, you have to bypass SSL inspection entirely for the particular FQDNs/IPs (that is, not inspect that traffic, so the certificate is never swapped at the proxy and the site keeps its natural cert). In particular, developer tools and applications such as Docker, NPM, Python pip and requests, and so on, are all highly sensitive and have their own trust stores. Android Studio is on that list too.
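As a rough illustration of the "point the tool at the cert" option, here is a minimal Python sketch that builds a TLS context trusting the default CA set plus one extra root. The function name and the example path are assumptions for illustration, not part of any Zscaler tooling.

```python
import ssl
from pathlib import Path
from typing import Optional


def build_context(extra_ca_pem: Optional[str] = None) -> ssl.SSLContext:
    """Return a client TLS context that trusts the defaults plus one extra CA."""
    # create_default_context() loads the default CA certificates available to
    # this Python build and enables certificate and hostname verification.
    context = ssl.create_default_context()
    if extra_ca_pem is not None:
        # extra_ca_pem is a hypothetical path to the exported proxy root (PEM).
        context.load_verify_locations(cafile=str(Path(extra_ca_pem).expanduser()))
    return context


# Hypothetical usage (the path is an assumption):
# ctx = build_context("~/certs/zscalerroot.pem")
```

The extra root is added on top of the defaults instead of replacing them, so connections that are not inspected still validate.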

Zscaler “helpfully” (throw in another 6 or so air quotes for me?) provides a page that details how to add custom certificates to application-specific trust stores, but its helpfulness is mixed at best (particularly because some of the steps are unclear, outdated, or wrong). Android Studio is listed, which is cool and good, but the steps there only help Android Studio itself trust the cert. What’s missing is how to get the emulator to trust the certificate. Along with the multi-month game of whack-a-mole dealing with SSL inspection woes, Android Studio in particular was pure pain, leading to over a week of research, trial and error, and blocking far too many teams. This issue isn’t unique to Android Studio either: Xcode has the same problem, where the iOS simulator also does not trust the certificate. In both cases, it’s not obvious how to get the emulator/simulator to trust the certificate. For the iOS simulator in Xcode though, the answer was bizarrely simple: drag-and-drop the cert onto the simulator, and then verify the cert is installed. Far too easy. For Android Studio however, it was not so simple.

Symptom – Blank Page (it no worky)

One symptom of the issue is that any WebView shows up as a blank page. Temporarily disabling Internet Security in Zscaler so that the traffic is not proxied (and therefore not inspected) results in the WebView working as desired.

A screenshot showing Android Studio's android emulator with a blank page

Trust but Verify – do it myself

Because this issue was taking up so much time, and to avoid having to pester our app devs, I set out to set up my own Android Studio. I had not touched anything Android development related in many, many years, so of course all my knowledge had gone out the window and it took some doing to reacquaint myself. I created a simple application from one of the example apps and added a WebView to test with.

A screenshot of Android Studio depicting code that adds a WebView

This allowed me to experiment and see the results for myself as I furthered my research.

Going Nowhere (fast) – the light(speed) at the end of the tunnel

The days of research were getting quite long, involving reading dozens of Stack Overflow and Google developer community posts, joining Discord and Slack communities, and making forum and Reddit posts, none of it yielding any results. That is, until late one night (yeah, after midnight), when I was continuing my research in hopes of an answer and stumbled across a random one-off Stack Overflow response with a link to Google’s recommendation to use debug builds to trust the CA.

A screenshot of a Stack Overflow post with a response by a user named robert giving the answer.

This got me pretty curious, because everything it was saying made technical sense. Equipped with my extremely basic test app where I could then replicate any failures or successes, and verify with the blank page symptom and SSL errors, I decided to give it a shot. This was probably going to be my final attempt before giving up.

Steps to Resolve – 🎵every step I make

Placed the Zscaler Root CA cert (.pem) at res/raw/zscalerroot.pem
Created res/xml/network_security_config.xml:

<?xml version="1.0" encoding="utf-8"?>
<network-security-config>
    <debug-overrides>
        <trust-anchors>
            <certificates src="@raw/zscalerroot"/>
        </trust-anchors>
    </debug-overrides>
</network-security-config>

A screenshot of Android Studio showing the path to the Zscaler certificate and the network security XML

Added to the manifest the following:

        android:networkSecurityConfig="@xml/network_security_config"
        android:debuggable="true"

A screenshot of Android Studio showing the addition to the manifest

Success! (it worky)

To my surprise, this worked! After building the app and running it, the WebView loaded the page as desired.

The common "It's happening" animated gif/meme

A screenshot of the Android Studio emulator showing the page loading correctly

At that point, it was 1:19am when I sent the message in Slack to explain the steps and share the results. I then proceeded to stay awake a bit longer fueled by the adrenaline of finally squashing that issue (and being confident that it will work for the impacted teams), and finally went to bed.

And yep, it does work for the teams as expected!

(Thank you Robert, you Stack Overflow mad lad)

Bonus Thoughts (thinking is literally in the name)

Don’t ship the cert; Debug mode instead?

Earlier on in the research, we did have the idea to simply put the cert in the app during development. However, nobody should ship their Zscaler cert with their app. That just seems like a bad idea. Which makes it more interesting that a similar approach is exactly what is going on here. By using debug-overrides, we are able to specify trust-anchors in a relatively safe way, because the override is only applied when the app is debuggable (android:debuggable is true), such as when it is being run in debug on the emulator. Don’t quote me on this part, but from what I read, this is also safeguarded by app stores (the Google Play Store) automatically rejecting applications that include a debug configuration. Android Studio also helpfully cautions against hardcoding the debug flag in the manifest, with “Avoid hardcoding the debug mode; leaving it out allows debug and release builds to automatically assign one”. It offers the option to suppress that warning with tools:ignore="HardcodedDebugMode". However, hardcoding the flag should be unnecessary anyway, since Android Studio also has, well, a debug button right there.
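If you want a guard against the custom CA drifting out of the debug-only section, a small check like the one below could run in CI. It is a sketch under assumptions: the function name is made up, it only looks at certificates whose src starts with @raw/, and it reads the config as a plain string.

```python
import xml.etree.ElementTree as ET


def custom_ca_is_debug_only(config_xml: str) -> bool:
    """True if every @raw/ trust anchor sits under <debug-overrides>."""
    root = ET.fromstring(config_xml.strip())
    # All bundled (@raw/...) certificates anywhere in the config.
    all_raw = [
        c.get("src")
        for c in root.iter("certificates")
        if (c.get("src") or "").startswith("@raw/")
    ]
    # Only those declared under debug-overrides/trust-anchors.
    debug_raw = [
        c.get("src")
        for c in root.findall("./debug-overrides/trust-anchors/certificates")
        if (c.get("src") or "").startswith("@raw/")
    ]
    # At least one custom CA exists, and none live outside debug-overrides.
    return bool(debug_raw) and len(all_raw) == len(debug_raw)


# Hypothetical usage (the path is an assumption):
# text = open("app/src/main/res/xml/network_security_config.xml").read()
# print(custom_ca_is_debug_only(text))
```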

This isn’t just for Zscaler; Why not just bypass?

SSL inspection happens with many other proxy tools (Netskope for example). So if you’re running into the same thing when implementing your own proxied SSL inspection but you don’t use Zscaler, that’s okay! The concepts are exactly the same. And while it didn’t directly lead to the answer, some of my research did involve using other proxy tools such as Netskope in my search keywords.

One of the common approaches that many recommended, including some fairly knowledgeable Zscaler experts (whose team’s sole role is working with Zscaler implementation) and people who use other proxy tools, is to simply bypass it and move on. Which, if all else fails, is basically the only way to go, accepting the risk that the traffic cannot be inspected. This is often the case for applications or tools that use SSL pinning. However, it’s generally good practice to not bypass things unless you have to.

We've generally taken this approach to remediating issues:

Solve the issue through other means first (adding certs to trust stores, making sure related traffic isn't blocked): This can be painful particularly for developer tools, but it's better to keep traffic proxied and inspected than to just yolo stuff on through
Bypass SSL inspection: If something is particularly sensitive (MDM related things for example), we will bypass it for SSL but keep it proxied
Bypass in the PAC: Some things are just pain and cannot be resolved through other means, so we bypass FQDNs/IPs in PACs if no other option works (because Zscaler does not have application-specific bypasses for macOS, since packet-based filtering is not yet implemented there and has been delayed again and again…).

There’s also the aspect where some things need to be bypassed for one application, but not others, because of the exact same work you’re doing elsewhere. For example, the app devs might be working on an app that involves a WebView for, say, app.thethinkingsir.com. If all else had failed, we could have simply bypassed that and any other FQDN. However, another team could be interacting with an API on the same FQDN (say, app.thethinkingsir.com/api/v1) through Python. Because they’re making requests in Python to many other FQDNs including that one, they will have ended up setting the env variables to point Python’s requests and pip at the certificate to have it trusted. The issue shows up when bypassing the FQDN for the app devs winds up breaking the environment for the devs using Python, because Python is even more sensitive: when you tell it to expect a cert, it really really expects it and no longer relies on the system trust store. And because the variable gets set globally in their environment, you can’t just remove it for that one FQDN; it applies to everything.

You then end up in an endless chain of deciding what to bypass and what not to bypass, until by the time you’re done, you’ve bypassed everything and the entire exercise was pointless. Similarly, every single FQDN would have to be bypassed, and you might end up playing an excessive game of whack-a-mole bypassing each and every FQDN, only to find out more have to be bypassed because you’ve gotten one step further and discovered more things aren’t loading. It can truly never end. So ideally, we do our best to do what needs to be done to get the application or tool to work without simply going around it.
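To make the "it applies to everything" point concrete, here is a rough Python sketch of how those variables usually get set. The helper name and the bundle path are assumptions; the variable names are the ones commonly used for requests (REQUESTS_CA_BUNDLE), pip (PIP_CERT), and OpenSSL-backed defaults (SSL_CERT_FILE).

```python
import os
from typing import Dict, Mapping, Optional

# Variables commonly used to point Python tooling at a CA bundle.
CA_ENV_VARS = ("REQUESTS_CA_BUNDLE", "SSL_CERT_FILE", "PIP_CERT")


def env_with_ca(bundle_path: str,
                base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the CA variables set."""
    env = dict(os.environ if base is None else base)
    for name in CA_ENV_VARS:
        # Every request made by a process using this env validates against
        # this one bundle; there is no per-FQDN switch.
        env[name] = bundle_path
    return env


# Hypothetical usage (the path is an assumption):
# import subprocess
# subprocess.run(["pip", "--version"], env=env_with_ca("/etc/ssl/zscalerroot.pem"))
```

Nothing in there knows about hostnames, which is why bypassing a single FQDN at the proxy can break a Python environment that was set up to expect the inspected cert.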

You probably know more than me

This entire journey of research and headaches was largely due to not having the answers up front, and probably not knowing enough context to reach the conclusion as quickly as I could have. I am not an Android developer, nor do I live in a dev world. It just so happens that the org has a few hundred devs, some of whom are app devs. But despite all of the headaches, this was a fun learning experience and resulted in a quality hit of dopamine when the solution finally presented itself.

tl;dr: it worky now

-Mark