Module 6: JavaScript Deobfuscation
Introduction
This module teaches locating and deobfuscating JavaScript in web pages to analyze hidden or malicious behavior.
Code deobfuscation is a core skill for code analysis and reverse engineering. Obfuscated JavaScript often conceals functionality (for example, malware that fetches payloads). Understanding and reversing that obfuscation reveals intent and enables manual replication or mitigation.
The module workflow:
examine the HTML page structure and locate embedded/linked JavaScript;
define obfuscation, common techniques, and typical use-cases;
apply deobfuscation methods to recover readable code;
decode encoded messages found in scripts;
perform basic code analysis to determine behavior;
send simple HTTP requests to reproduce or observe network interactions.
Source Code
Client-side web applications commonly split responsibilities: HTML defines structure and semantics, CSS defines presentation, and JavaScript implements behavior. Browsers render these artifacts so users rarely inspect the raw source, but viewing the source reveals client-side logic and comments that can contain useful (or sensitive) information.
HTML
Open the page source (for example, via Ctrl+U) to inspect structure, inline comments, and references to external assets. Source often contains developer comments and markup that clarify how the page should behave; attackers or analysts can leverage that information.
Example (fictionalized) minimal page:
CSS
Styles may be embedded inside <style> blocks or provided by external .css files referenced with <link>. Inline styles appear in the HTML; external styles are useful to locate for additional context.
Example inline snippet:
JavaScript
Behavioral logic can be inline in <script> tags or kept in external .js files. To inspect an external script, follow its reference from the HTML source (in the browser's source viewer, click the filename). Obfuscated code commonly uses eval or generated code to hide intent, so opening such a file may show a dense eval(...) wrapper rather than readable logic. Determining whether a script is inline or external, and fetching the file if it is external, is the first step toward deobfuscation.
Example (fictionalized) observed pattern:
Obfuscation
Code Obfuscation
Obfuscation transforms readable source into a functionally equivalent form that is intentionally hard for humans to understand. Tools perform this automatically by rewriting identifiers, restructuring control flow, and replacing literal tokens with lookups or computed values. The transformed code runs normally (sometimes with reduced performance) but resists manual inspection and simple signature detection.
What is obfuscation
Obfuscators commonly replace identifiers and literals with references into generated dictionaries, encode strings, and wrap logic in evaluators (e.g., eval) or decoder functions that reconstruct behavior at runtime. Interpreted client-side languages, especially JavaScript, are frequent targets because their source is delivered to users' browsers in cleartext, whereas server-side code normally stays on the server.
Use Cases
Intellectual property protection: make reuse or copying harder by hiding original structure and names.
Attempted protection of client-side secrets: obfuscation can raise the effort to extract keys or algorithms (note: performing authentication or encryption in client-side code is not recommended).
Evasion and malicious use: attackers obfuscate payloads to bypass signature-based detectors and to slow analysis.
Anti-reverse-engineering: increase cost and time for analysts attempting to understand the code.
Intrusion Detection and Prevention systems (IDPS) are commonly targeted by obfuscation-based evasions.
Basic Obfuscation
Obfuscation is typically performed using automated tools that rewrite source code into a less readable but functionally identical form. Developers and attackers alike use such tools to hinder analysis or protect intellectual property.
Running JavaScript code
Start with a simple snippet and verify its behavior before any transformations.
When executed in a browser console, this prints:
Minifying JavaScript code
Minification removes whitespace and formatting without changing functionality. For a one-line snippet, the effect is minimal but demonstrates the process used on larger scripts.
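As a rough illustration, the sketch below compares a readable function with a hand-minified equivalent. The function is hypothetical and is not the module's snippet. Many real minifiers also shorten local names, so that step is shown here too.

```javascript
// Readable version: descriptive names, whitespace, and a comment.
function addNumbers(first, second) {
  // Sum the two operands and return the result.
  const total = first + second;
  return total;
}

// Hand-minified equivalent: whitespace and comments removed, names shortened.
function a(b,c){return b+c}

// Both versions behave identically.
console.log(addNumbers(2, 3) === a(2, 3)); // true
```

The minified form is harder to read but nothing is hidden: a beautifier restores the layout, and only the original names are lost.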
Packing JavaScript code
Packing compresses and obfuscates by encoding tokens into a dictionary and reconstructing them at runtime. This technique often wraps the code inside an eval() call that decodes and executes the script dynamically.
Working packed example:
When executed, this produces the same output:
The function(p,a,c,k,e,d) wrapper pattern is characteristic of packer-style obfuscation, where identifiers and literals are stored in a lookup table and reassembled during runtime to conceal original logic.
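The dictionary idea can be sketched with a toy packer. This is a simplified illustration under assumed names (pack, unpack), not the algorithm of any real packing tool: every word token is replaced by a base-36 index into a word list, and unpacking reverses the substitution.

```javascript
// Toy dictionary packer (hypothetical; far simpler than real packers).
function pack(source) {
  const dictionary = [];
  const indexOf = new Map();
  // Replace every word token with its base-36 index in the dictionary.
  const payload = source.replace(/\b\w+\b/g, (word) => {
    if (!indexOf.has(word)) {
      indexOf.set(word, dictionary.length);
      dictionary.push(word);
    }
    return indexOf.get(word).toString(36);
  });
  return { payload, dictionary };
}

// Reverse the substitution: each token is looked up in the dictionary.
function unpack({ payload, dictionary }) {
  return payload.replace(/\b\w+\b/g, (token) => dictionary[parseInt(token, 36)]);
}

const original = 'console.log("hello packed world");';
const packed = pack(original);
console.log(packed.payload);              // 0.1("2 3 4");
console.log(packed.dictionary.join("|")); // console|log|hello|packed|world
console.log(unpack(packed) === original); // true
```

A real packed script ships the payload, the dictionary, and the unpacking function together, and passes the unpacked result to eval.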
Advanced Obfuscation
Quick Notes
Hide cleartext strings and semantic hints
Base64 string-array obfuscation
Runtime decoding and indirection
Verification via JS consoles
Performance trade-offs of extreme encodings
Why Advanced Obfuscation Is Needed
Simple obfuscation techniques still expose string literals, which often reveal program intent. Advanced obfuscation focuses on eliminating all readable strings so functionality cannot be inferred through static inspection.
Base64 String Array Obfuscation
A common technique is relocating all string literals into an encoded array that is decoded at runtime. A public tool that supports this approach is:
Typical usage:
Paste the JavaScript source.
Enable string array handling with Base64 encoding.
Generate the obfuscated output.
Execute the result to confirm behavior is unchanged.
Common characteristics of the output:
Encoded string array
Array index shuffling
Runtime Base64 decoder
Decoded-string caching
Dynamic property and function access
All meaningful identifiers and strings are resolved only during execution.
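A miniature version of the pattern might look like the sketch below. The names (strings, cache, s) and the two encoded values are made up for illustration; real tool output adds index shuffling and far less readable identifiers.

```javascript
// Hypothetical miniature of Base64 string-array obfuscation.
const strings = ["SGVsbG8=", "bG9n"]; // Base64 for "Hello" and "log"
const cache = {};

// Decode a Base64 string in either a browser or Node.js.
function decodeBase64(text) {
  return typeof atob === "function"
    ? atob(text)
    : Buffer.from(text, "base64").toString("binary");
}

// Runtime decoder with caching: each index is decoded at most once.
function s(index) {
  if (!(index in cache)) {
    cache[index] = decodeBase64(strings[index]);
  }
  return cache[index];
}

// Equivalent to console.log("Hello"), but with no readable literals
// and with the method name resolved through dynamic property access.
console[s(1)](s(0));
```

Static inspection shows only the encoded array and index lookups; the method name and the message appear only once the decoder runs.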
Verifying Behavior
Obfuscated output should still perform identically. This can be verified by running it in a JavaScript execution environment such as:
Observable output (e.g., logging or alerts) should match the original script.
Obfuscation Strength vs Performance
More aggressive settings increase resistance to analysis but introduce overhead:
Larger script size
Slower initialization
Increased runtime cost
Each added layer trades performance for opacity.
Extreme Expression-Based Obfuscation
Some obfuscators encode entire scripts using only booleans, arrays, coercion, and indexing. These rely on predictable JavaScript type conversions to rebuild identifiers at runtime.
Typical traits:
Expressions like ![], !![], and [] + []
Character extraction from coerced strings
Deeply nested runtime reconstruction
These scripts execute correctly but can be very slow for non-trivial payloads.
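The coercion rules these encoders depend on can be checked directly in a console. The sketch below builds the string "alert" from nothing but arrays, booleans, and indexing; the variable names are added only for readability and would not appear in real encoder output.

```javascript
// Building blocks: type coercion turns arrays and booleans into strings and numbers.
const falseText = ![] + [];  // "false"  (false + "" is string concatenation)
const trueText = !![] + [];  // "true"
const zero = +[];            // 0
const one = +!![];           // 1
const two = !![] + !![];     // 2  (true + true)
const four = two + two;      // 4

// Pick individual characters out of "false" and "true".
const word =
  falseText[one] + falseText[two] + falseText[four] + trueText[one] + trueText[zero];
console.log(word); // alert

// The same expression with every helper inlined, as an encoder would emit it.
const inlined =
  (![]+[])[+!![]] + (![]+[])[!![]+!![]] + (![]+[])[!![]+!![]+!![]+!![]] +
  (!![]+[])[+!![]] + (!![]+[])[+[]];
console.log(inlined === word); // true
```

Five characters already take a long expression, which is why whole scripts encoded this way become large and slow.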
Practical Considerations
Highly aggressive obfuscation is useful for:
Discouraging casual analysis
Demonstrations
Bypassing simplistic filters
It is generally unsuitable for maintainable production code due to performance impact and loss of debuggability. Tools such as JJ Encode and AA Encode fall into this category and should be used selectively.
Deobfuscation
Quick Notes
Minification vs obfuscation
Beautifying JavaScript for readability
Automated unpacking of common obfuscation patterns
Limits of deobfuscation tools
When manual reverse engineering is required
Purpose of Deobfuscation
Deobfuscation focuses on restoring readability and understanding behavior in obfuscated code. Automated tools can reverse common patterns, but success depends on how the code was originally transformed.
Beautifying Minified JavaScript
Obfuscated scripts are often minified into a single line. The first step in analysis is formatting the code.
Common options:
Browser developer tools (pretty-print)
Code editor plugins (Prettier, Beautifier)
Online JavaScript beautifiers
Beautification restores indentation and line structure, making control flow easier to follow. It does not remove obfuscation logic.
Why Formatting Is Insufficient
Even after beautifying:
Identifiers remain meaningless
Strings may be reconstructed at runtime
Logic may rely on eval or regex-based replacement
Deobfuscation is required to expose intent.
Automated Deobfuscation (Unpacking)
A frequent obfuscation method is packing, where code is stored as an encoded string and rebuilt at runtime.
A reliable public unpacking tool is:
Typical workflow:
Copy the obfuscated JavaScript.
Paste it into the unpacker.
Run the unpack operation.
Tip: Do not include empty lines before the script, as this can cause incorrect results.
Successful unpacking often reveals:
Clear function definitions
Restored string literals
Expanded control flow
Visible network or DOM interactions
Example result (simplified):
Manual Unpacking Technique
For simple packing schemes:
Locate the function return value.
Replace the eval execution with console.log.
Output the reconstructed source instead of executing it.
This exposes the decoded code without running the payload; only the unpacking function itself executes.
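A minimal sketch of the technique, assuming a toy packer whose payload and dictionary are invented here as stand-ins for a real sample:

```javascript
// Hypothetical packed payload and dictionary, standing in for a real sample.
const payload = '0.1("2 3")';
const dictionary = "console|log|hello|world".split("|");

// Unpacking function: swaps each base-36 token for its dictionary word.
function unpack(p, k) {
  return p.replace(/\b\w+\b/g, (token) => k[parseInt(token, 36)]);
}

// The obfuscated script would run: eval(unpack(payload, dictionary));
// For analysis, print the reconstructed source instead of executing it.
const recovered = unpack(payload, dictionary);
console.log(recovered); // console.log("hello world")
```

Because the unpacking function still runs, it is worth reading it first to confirm that it only builds a string and has no side effects of its own.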
Limits of Automated Tools
Automated unpackers are pattern-based and may fail when:
Custom obfuscation is used
Multiple encoding layers exist
Runtime state influences decoding
Partial output or failure is common in advanced cases.
Manual Reverse Engineering
When tools fail, manual analysis is required:
Step through decoding logic
Resolve string transformations
Track control-flow manipulation
Reconstruct behavior incrementally
Advanced JavaScript deobfuscation typically combines static inspection with runtime debugging.
Deobfuscation Examples
Code Analysis
Quick Notes
Identify exposed functions after deobfuscation
Analyze network-related code behavior
Understand client-side vs server-side responsibility
Spot unused or unreleased functionality
Reviewing the Deobfuscated Code
After deobfuscation, the script is readable and minimal. The secret.js file defines a single function:
The only executable logic present is the generateSerial function.
HTTP Request Construction
The function begins by creating an XMLHttpRequest object. This object is used in JavaScript to send HTTP requests and handle responses asynchronously.
A second variable defines the request target:
"/serial.php"
No domain is specified, so the request is sent to the same origin as the current page.
Request Execution
The following calls define and execute the request:
xhr.open("POST", url, true): initializes an asynchronous HTTP POST request to /serial.php.
xhr.send(null): sends the request with no body data and without processing any response.
No headers are set, no payload is included, and no callback is defined to handle a response.
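Put together, the behavior described above corresponds to something like the following. This is a reconstruction from the description, not the original text of secret.js; the names xhr and url follow the calls quoted above, and XMLHttpRequest is a browser API, so the function is defined here without being called.

```javascript
// Reconstructed from the description; not the original secret.js source.
function generateSerial() {
  var xhr = new XMLHttpRequest(); // request object provided by the browser
  var url = "/serial.php";        // relative path, so same origin as the page
  xhr.open("POST", url, true);    // asynchronous POST request
  xhr.send(null);                 // no body, no headers, no response handler
}
```

Nothing in the function reads xhr.responseText or registers an onload handler, which is why the response, whatever it contains, never reaches the page.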
Functional Interpretation
The function’s behavior is limited to issuing a POST request to a server-side endpoint. It does not:
Generate data client-side
Process server responses
Modify the DOM
This suggests the actual serial-generation logic, if it exists, would be implemented server-side in serial.php.
Contextual Implications
The absence of any visible trigger (e.g., a button or event handler calling generateSerial) implies the function may be:
Unused
Incomplete
Reserved for future functionality
Such dormant or unreleased features are often poorly tested and may expose unintended behavior.
Next Analysis Step
With the client-side behavior understood, the logical next step is to manually replicate the request. Sending a POST request to /serial.php directly allows verification of whether:
The endpoint is active
It performs any action without authentication
It returns sensitive data or errors
Unimplemented or hidden functionality frequently contains logic flaws or security issues.
HTTP Requests
Quick Notes
Reproduce an empty POST request to /serial.php
Use curl for GET and POST requests
Reduce output noise with -s
Include POST body parameters with -d
Using cURL for Web Requests
curl is a command-line utility available on Linux, macOS, and modern Windows environments. Given only a URL, it sends a GET request and prints the response body to standard output, which is useful for inspecting HTML or endpoint behavior.
The response should match the same HTML content previously observed when reviewing the page source.
Sending a POST Request
To explicitly send a POST request, specify the request method with -X POST. The -s flag suppresses progress and status output to keep the response clean.
Tip: Use -s to avoid cluttering the output with transfer statistics.
Sending POST Data
POST requests commonly include body data. The -d flag is used to include parameters in the request body.
Multiple parameters can be included by separating them with & or by repeating the -d flag.
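For comparison, the form-encoded body that -d sends by default (application/x-www-form-urlencoded) can be built in JavaScript with URLSearchParams. The parameter names and both helpers (buildBody, postForm) are hypothetical; postForm is defined as a sketch but not called, since it needs a reachable server.

```javascript
// Build a form-encoded body; parameter names are made up for illustration.
function buildBody(params) {
  return new URLSearchParams(params).toString();
}

// Hypothetical helper: POST the form-encoded body with fetch and return the text.
async function postForm(url, params) {
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: buildBody(params),
  });
  return response.text();
}

// Parameters are joined with & and special characters are escaped.
console.log(buildBody({ param1: "sample", param2: "a b" })); // param1=sample&param2=a+b
```

Note that curl's plain -d sends the data as given, so values containing characters such as & or = need to be encoded before they are passed on the command line.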
Decoding
Quick Notes
Encoded output returned from server-side logic
Common encodings used in obfuscated code
Base64, hex, and rot13 identification and decoding
Manual and automated decoding approaches
Encoded Server Response
After issuing a POST request to /serial.php, the server returns an encoded string:
Encoded data is frequently used in obfuscated workflows to hide meaningful output until runtime. Scripts often decode such values dynamically before using them.
Common Encoding Techniques
The following encodings are frequently encountered during JavaScript deobfuscation.
Base64
Base64 represents data using:
Uppercase and lowercase letters
Numbers
+ and /
Optional = padding
With padding, the encoded length is always a multiple of 4 characters; = fills out the final group when the input length is not a multiple of 3 bytes.
Base64 Encode
Base64 Decode
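A round trip can be tried in a JavaScript console with btoa (encode) and atob (decode), which exist in browsers and in recent Node.js versions. The sample text is arbitrary, and both functions assume Latin-1 input, so this sketch sticks to ASCII.

```javascript
const plain = "hello world";

// Encode: every 3 input bytes become 4 output characters.
const encoded = btoa(plain);
console.log(encoded); // aGVsbG8gd29ybGQ=

// Decode: recover the original text.
const decoded = atob(encoded);
console.log(decoded); // hello world

// Padded Base64 output always has a length divisible by 4.
console.log(encoded.length % 4 === 0); // true
```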
Hex Encoding
Hex encoding converts each character to the two-digit hexadecimal value of its ASCII code.
Spotting Hex
Only characters 0-9 and a-f
Even-length strings
Hex Encode
Hex Decode
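Both directions can be sketched in a few lines of JavaScript. The helper names hexEncode and hexDecode are made up for this example, and the code assumes ASCII input.

```javascript
// Encode: each character becomes the two-digit hex value of its character code.
function hexEncode(text) {
  return Array.from(text, (ch) => ch.charCodeAt(0).toString(16).padStart(2, "0")).join("");
}

// Decode: split into pairs of hex digits and convert each pair back to a character.
function hexDecode(hex) {
  return (hex.match(/../g) || [])
    .map((pair) => String.fromCharCode(parseInt(pair, 16)))
    .join("");
}

console.log(hexEncode("hello"));      // 68656c6c6f
console.log(hexDecode("68656c6c6f")); // hello
```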
Caesar Cipher / Rot13
A Caesar cipher shifts characters by a fixed number. The most common variant is rot13, which shifts letters by 13 positions.
Spotting Rot13
Output appears scrambled but retains recognizable structure: spaces, punctuation, digits, and letter case are unchanged
Character-to-character substitution is consistent
Rot13 Encode
Rot13 Decode
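Because the alphabet has 26 letters, shifting by 13 twice returns the original text, so one function serves for both encoding and decoding. A minimal sketch (the function name rot13 is chosen for this example):

```javascript
// Shift each ASCII letter by 13 positions, wrapping around within its own case.
function rot13(text) {
  return text.replace(/[a-zA-Z]/g, (ch) => {
    const base = ch <= "Z" ? 65 : 97; // 65 is "A", 97 is "a"
    return String.fromCharCode(((ch.charCodeAt(0) - base + 13) % 26) + base);
  });
}

console.log(rot13("Hello, World!")); // Uryyb, Jbeyq!
console.log(rot13("Uryyb, Jbeyq!")); // Hello, World!
```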
Identifying Unknown Encodings
Not all encoded data uses common formats. When the encoding is unclear:
Examine the character set and length
Look for padding or structural patterns
Use automated identifier tools to guess the encoding type
Some tools can analyze encoded strings and suggest likely encodings automatically.
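The character-set and length checks can be approximated with a small heuristic. The function below is an illustrative sketch, not a replacement for a dedicated identifier tool: it reports every encoding whose shape the input matches, and it cannot recognize rot13, which has no distinctive character set.

```javascript
// Return the encodings whose character set and length rules the text satisfies.
function guessEncodings(text) {
  const candidates = [];
  // Hex: only hex digits, and an even number of them.
  if (text.length > 0 && text.length % 2 === 0 && /^[0-9a-fA-F]+$/.test(text)) {
    candidates.push("hex");
  }
  // Padded Base64: Base64 alphabet, up to two trailing "=", length divisible by 4.
  if (text.length > 0 && text.length % 4 === 0 && /^[A-Za-z0-9+/]+={0,2}$/.test(text)) {
    candidates.push("base64");
  }
  return candidates;
}

console.log(guessEncodings("68656c6c6f"));       // [ 'hex' ]
console.log(guessEncodings("aGVsbG8gd29ybGQ=")); // [ 'base64' ]
console.log(guessEncodings("deadbeef"));         // [ 'hex', 'base64' ]
console.log(guessEncodings("Uryyb, Jbeyq!"));    // []
```

The third case shows why such checks only narrow the options: a short hex string can also be valid Base64, so each candidate still has to be decoded and inspected.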
Encoding vs Encryption
Encoding is reversible without a key. Encryption requires a key to recover the original data.
Obfuscation tools may use encryption instead of encoding. If the decryption key is not present in the client-side code, reversing the logic becomes significantly more difficult.