ESI Finals 2025 - CanYouHear (Steganography)

Writeup for CanYouHear audio steganography challenge from ESI Finals 2025

CanYouHear - Audio Steganography Challenge

Challenge Description

It sounds like just another audio file, but listen closer. The real message isn’t in the music, it’s buried deep within the data. Can you tune in and extract what’s hidden beneath the surface?

We’re given a WAV audio file: CanYouHear.wav

Initial Analysis

First, let’s examine the file properties:

┌──(chida㉿kali)-[~/wara]
└─$ file CanYouHear.wav
CanYouHear.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 44100 Hz

Using exiftool to get more details:

┌──(chida㉿kali)-[~/wara]
└─$ exiftool CanYouHear.wav
ExifTool Version Number         : 13.25
File Name                       : CanYouHear.wav
Directory                       : .
File Size                       : 1866 kB
File Modification Date/Time     : 2025:04:20 15:11:38+01:00
File Access Date/Time           : 2025:05:17 14:40:32+01:00
File Inode Change Date/Time     : 2025:05:17 14:40:22+01:00
File Permissions                : -rw-rw-r--
File Type                       : WAV
File Type Extension             : wav
MIME Type                       : audio/x-wav
Encoding                        : Microsoft PCM
Num Channels                    : 2
Sample Rate                     : 44100
Avg Bytes Per Sec               : 176400
Bits Per Sample                 : 16
Duration                        : 10.58 s

Key observations:

  • 16-bit PCM WAV file
  • Stereo (2 channels)
  • 44100 Hz sample rate
  • ~10.58 seconds duration
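
These numbers also give a rough upper bound on how much data the file could hide. The sketch below is a back-of-the-envelope estimate, assuming one hidden bit per sample across both channels; the helper name lsb_capacity_bytes is made up for illustration, and the duration reported by exiftool is rounded, so the result is approximate.

```python
def lsb_capacity_bytes(sample_rate, channels, duration_s, bits_used_per_sample=1):
    """Approximate payload size (in bytes) that fits in the low bits of a PCM stream."""
    # One frame holds one sample per channel, so total samples = frames * channels.
    total_samples = round(sample_rate * duration_s) * channels
    return total_samples * bits_used_per_sample // 8

# Values taken from the exiftool output above (duration is rounded).
capacity = lsb_capacity_bytes(44100, 2, 10.58)
print(f"Roughly {capacity} bytes of LSB capacity")
```

That is on the order of a hundred kilobytes, far more than a short text flag needs, so the payload may occupy only the first few hundred samples.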

The Approach: LSB Steganography

The challenge description says the message is not in the music but “buried deep within” the data. That wording points toward LSB (Least Significant Bit) steganography, a common way of hiding data in uncompressed audio such as PCM WAV files.

In LSB steganography:

  • Each audio sample in this file is a 16-bit signed integer
  • The least significant bit (LSB) of each sample can be modified without noticeably affecting the audio
  • Hidden data is encoded by setting these LSBs to form a bitstream
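
To make the idea concrete, here is a minimal sketch of the embedding side, which is not part of the challenge files. It assumes one payload bit per sample, MSB first within each byte; the helpers text_to_bits and embed_bits and the sample values are hypothetical.

```python
def text_to_bits(text):
    """Turn a string into a list of bits, MSB first within each byte."""
    bits = []
    for byte in text.encode("utf-8"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits

def embed_bits(samples, bits):
    """Return a copy of samples whose LSBs are overwritten by the payload bits."""
    if len(bits) > len(samples):
        raise ValueError("Payload is larger than the cover audio.")
    out = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & ~1) | bit
    return out

# Eight made-up 16-bit sample values are enough to carry one byte.
cover = [1000, -1001, 32767, -32768, 0, 7, -8, 123]
stego = embed_bits(cover, text_to_bits("A"))

# Reading the LSBs back recovers the payload bits.
recovered = [s & 1 for s in stego]
print(recovered)
```

Each sample changes by at most 1 out of a range of 65536 values, which is why the modification is effectively inaudible.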

Solution

I wrote a Python script to extract the LSBs from each audio sample and reconstruct the hidden message:

#!/usr/bin/env python3
import wave
import struct
import sys

def extract_lsb_from_wav(wav_path, num_samples_to_skip=0):
    """
    Opens a 16-bit PCM WAV file, extracts the LSB from each sample,
    and returns the resulting bitstream as a list of bits.
    """
    bits = []
    with wave.open(wav_path, 'rb') as wav:
        nchannels, sampwidth, framerate, nframes, comptype, compname = wav.getparams()
        if sampwidth != 2:
            raise ValueError("This script only supports 16-bit PCM WAV files.")
        raw = wav.readframes(nframes)
        # Unpack all samples (little-endian signed 16-bit). The count is derived
        # from the bytes actually read, in case the header overstates nframes.
        total_samples = len(raw) // sampwidth
        fmt = "<{}h".format(total_samples)
        samples = struct.unpack(fmt, raw)

        # Extract the LSB from each sample, skipping initial ones if desired.
        # Stereo frames are interleaved (L, R, L, R, ...), so both channels
        # contribute bits in the order they appear in the file.
        for sample in samples[num_samples_to_skip:]:
            bits.append(sample & 1)
    return bits

def bits_to_bytes(bits):
    """
    Groups bits into bytes (8 bits per byte), MSB first in each byte.
    Returns a bytes object.
    """
    data = bytearray()
    for i in range(0, len(bits), 8):
        byte_bits = bits[i:i+8]
        if len(byte_bits) < 8:
            break
        value = 0
        for bit in byte_bits:
            value = (value << 1) | bit
        data.append(value)
    return bytes(data)

def main():
    if len(sys.argv) < 2:
        print(f"Usage: {sys.argv[0]} CanYouHear.wav [skip_samples]")
        sys.exit(1)

    wav_path = sys.argv[1]
    skip = int(sys.argv[2]) if len(sys.argv) >= 3 else 0

    # 1) Extract LSBs from the WAV
    bits = extract_lsb_from_wav(wav_path, num_samples_to_skip=skip)

    # 2) Convert bits to raw bytes
    hidden = bits_to_bytes(bits)

    # 3) Try to decode as UTF-8 text; if that fails, dump raw bytes as hex
    try:
        message = hidden.decode('utf-8', errors='strict')
        print("Decoded hidden message:")
        print(message)
    except UnicodeDecodeError:
        print("Non-textual data found. Hex dump of first 256 bytes:")
        print(hidden[:256].hex())

if __name__ == "__main__":
    main()

How It Works

  1. Extract LSBs: Read all audio samples from the WAV file and extract the least significant bit from each sample
  2. Reconstruct Bytes: Group the extracted bits into bytes (8 bits per byte, MSB first)
  3. Decode: Attempt to decode the resulting bytes as UTF-8 text
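
If the payload is followed by unrelated audio bits, a strict UTF-8 decode of the whole stream can fail and the script falls back to a hex dump. One possible workaround, sketched here as an assumption rather than part of the original solution, is to scan the extracted bytes for runs of printable ASCII, much like the strings utility does. The helper name printable_runs and the sample bytes are made up for illustration.

```python
import re

def printable_runs(data, min_len=6):
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

# Made-up bytes: a readable message surrounded by non-text noise.
sample = b"\x00\x01hello world\xff\xfeabc\x00"
print(printable_runs(sample))
```

Applied to the output of bits_to_bytes, this would list any readable strings even when the rest of the stream is noise.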

Running the Script

python3 solve.py CanYouHear.wav

The script successfully extracts and decodes the hidden message embedded in the audio file’s LSBs, revealing the flag!

Flag

The flag was successfully extracted from the LSB-encoded data in the audio file.

This post is licensed under CC BY 4.0 by the author.