HexDump: A Beginner’s Guide to Reading Binary Data
A hexdump is a representation of binary data in a human-readable hexadecimal (base-16) format, often accompanied by an ASCII interpretation. Hexdumps are indispensable for debugging, reverse engineering, forensic analysis, and learning how data is structured on disk or in memory. This guide walks you through the fundamentals: what a hexdump shows, common tools to create one, how to interpret the output, and practical examples and exercises to build your skills.
What is a HexDump?
A hexdump displays raw bytes as two-digit hexadecimal numbers (00 through FF), typically grouped into 8, 16, or another convenient number of bytes per line. Each line commonly begins with an offset — the byte index from the start of the file — shown in hexadecimal. Many hexdump outputs include an ASCII column showing printable characters for those bytes; non-printable bytes are usually shown as dots (.) or another placeholder.
Example layout (conceptual):
00000000 48 65 6c 6c 6f 2c 20 57 6f 72 6c 64 21 0a Hello, World!.
- Offset: 00000000
- Hex bytes: 48 65 6c … 0a
- ASCII: Hello, World!.
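The layout above can be reproduced in a few lines of Python. This is a minimal sketch, not how any particular tool is implemented; the function name `hexdump_lines` and the `|...|` delimiters around the ASCII column are illustrative choices.

```python
def hexdump_lines(data: bytes, width: int = 16):
    """Yield one formatted line per `width` bytes: offset, hex bytes, ASCII."""
    hex_width = width * 3 - 1  # two hex digits per byte plus separating spaces
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Two-digit lowercase hex for each byte, separated by spaces.
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII (0x20-0x7E) is shown as-is; everything else becomes '.'.
        ascii_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on a short final line.
        yield f"{offset:08x}  {hex_part:<{hex_width}}  |{ascii_part}|"


if __name__ == "__main__":
    for line in hexdump_lines(b"Hello, World!\n"):
        print(line)
```

Calling it on a longer input shows the offset advancing by 16 (0x10) on each line.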
Why HexDumps Matter
- Debugging: Inspect binary file contents, check for corrupted or unexpected bytes.
- Reverse engineering: Understand file formats, protocols, or executable internals.
- Forensics: Recover evidence from raw disk images or memory dumps.
- Education: Learn how text, numbers, and structures are encoded at the byte level.
- Interoperability checks: Confirm endianness, padding, and field alignment.
Common Tools to Create HexDumps
- hexdump (Unix-like): Flexible, scriptable, good for basic needs.
- Example: hexdump -C file.bin
- xxd (Vim suite): Creates hexdumps and can convert back to binary.
- Example: xxd file.bin
- od (octal dump): Powerful, supports multiple formats including hex.
- Example: od -An -t x1 -v file.bin
- HxD (Windows GUI): Visual editor with hex/ASCII panes, useful for manual editing.
- bless / wxHexEditor / 010 Editor: GUI hex editors with advanced features for large files.
- Python: Custom scripts using binascii, struct, or hexdump libraries.
- Example: python3 -c "import sys, binascii; print(binascii.hexlify(open(sys.argv[1], 'rb').read()).decode())" file.bin
Understanding the Output
Offsets
- Offsets show the address of the first byte on the line measured from file start.
- Usually displayed in hexadecimal. Large files need more than eight hex digits, so some tools widen the offset column.
Byte grouping
- Grouping (8, 16, etc.) improves readability.
- Some tools insert an extra space between groups to highlight boundaries.
ASCII column
- Printable ASCII (0x20–0x7E) is shown as characters.
- Non-printable bytes are typically represented as ‘.’.
Endianness
- Hexdump shows raw byte order. Interpreting multi-byte integers depends on endianness (little vs big).
- Example: bytes 0x01 0x00 represent 1 in little-endian 16-bit, 256 in big-endian.
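The two readings of the same byte pair can be checked with Python's built-in `int.from_bytes`. A small sketch; the variable names are illustrative.

```python
raw = bytes([0x01, 0x00])  # the two bytes exactly as they appear in the dump

# Little-endian: the first byte is the least significant (0x0001).
little = int.from_bytes(raw, byteorder="little")
# Big-endian: the first byte is the most significant (0x0100).
big = int.from_bytes(raw, byteorder="big")

print(little, big)
```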
Character encodings
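- Hexdump doesn’t interpret encodings beyond raw bytes. In UTF-8 text, each non-ASCII character appears as a sequence of two to four hex bytes, and the ASCII column usually shows those bytes as dots because values of 0x80 and above are not printable ASCII.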
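To see how encoded text looks at the byte level, one can encode a string in Python and print its bytes. This sketch assumes Python 3.8 or later (for the separator argument of `bytes.hex`); the helper `ascii_column` is a hypothetical name that mimics the ASCII column of a hexdump.

```python
def ascii_column(data: bytes) -> str:
    """Render bytes the way a hexdump's ASCII column typically does."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)


# 'A' is one byte, U+00E9 (e with acute) is two, U+1F600 (an emoji) is four.
text = "A\u00e9\U0001F600"
encoded = text.encode("utf-8")

print(encoded.hex(" "))       # hex bytes, space-separated
print(ascii_column(encoded))  # only 'A' is printable ASCII
```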
Practical Examples
1) Small text file
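Create a file containing “Hello” followed by a newline, then hexdump it.
Command:
printf 'Hello\n' > hello.txt
hexdump -C hello.txt
Output (example):
00000000  48 65 6c 6c 6f 0a  |Hello.|
00000006
Interpretation:
- 0x48=‘H’, 0x65=‘e’, 0x6c=‘l’, 0x6f=‘o’, 0x0a=newline.
- The final line, 00000006, is the offset just past the last byte, which equals the file length (6 bytes).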
2) Inspecting binary headers (PNG)
PNG files start with an 8-byte signature: 89 50 4E 47 0D 0A 1A 0A.
Command:
hexdump -C image.png | head -n 4
You’ll see the PNG signature, then chunk headers like IHDR in ASCII.
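The same check can be scripted. The sketch below verifies the 8-byte signature and reads the first chunk header, relying on the PNG layout in which each chunk begins with a big-endian 4-byte length and a 4-byte type, and IHDR data begins with the image width and height. The function name `read_png_header` and the synthetic sample bytes are assumptions for illustration.

```python
import struct

PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")


def read_png_header(data: bytes):
    """Return (chunk_type, width, height) read from the start of PNG data."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG: signature mismatch")
    # Chunk header: big-endian 4-byte data length, then 4-byte ASCII type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a 13-byte IHDR")
    # IHDR data starts with big-endian 4-byte width and height.
    width, height = struct.unpack(">II", data[16:24])
    return chunk_type.decode("ascii"), width, height


# Synthetic header bytes for demonstration only (not a complete PNG file).
sample = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 640, 480)
print(read_png_header(sample))
```

For a real file, pass the first 24 bytes read with `open("image.png", "rb").read(24)`.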
3) Reading integers with endianness
Suppose bytes are: 78 56 34 12
- Little-endian 32-bit integer: 0x12345678 → 305419896 decimal.
- Big-endian 32-bit integer: 0x78563412 → 2018915346 decimal.
Use Python to parse:
import struct
data = bytes.fromhex('78563412')
print(struct.unpack('<I', data)[0])  # little-endian: 305419896
print(struct.unpack('>I', data)[0])  # big-endian: 2018915346
Tips for Faster Interpretation
- Memorize hex for common ASCII: 0x30–0x39 = ‘0’–‘9’, 0x41–0x5A = ‘A’–‘Z’, 0x61–0x7A = ‘a’–‘z’.
- Use tools that annotate known file formats (e.g., binwalk, 010 Editor templates).
- Convert frequently: hex → decimal for sizes/lengths, hex → ASCII for strings.
- Search for known signatures (magic numbers) to quickly identify file types.
- Use scripting to extract ranges (dd, tail/head with -c, Python).
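Signature searching and range extraction are easy to script. The following sketch finds every offset of a magic number in a byte string and slices out a range starting there; `find_all` and the sample `blob` are hypothetical names and data, not part of any tool mentioned above.

```python
def find_all(data: bytes, signature: bytes):
    """Return every offset at which `signature` occurs in `data`."""
    offsets = []
    start = data.find(signature)
    while start != -1:
        offsets.append(start)
        # Resume one byte later so overlapping matches are not missed.
        start = data.find(signature, start + 1)
    return offsets


PNG_MAGIC = bytes.fromhex("89504e470d0a1a0a")

# Made-up container bytes with the PNG signature embedded twice.
blob = b"junk" + PNG_MAGIC + b"more" + PNG_MAGIC

offsets = find_all(blob, PNG_MAGIC)
print([hex(o) for o in offsets])

# Extract a byte range by slicing: 8 bytes starting at the first match.
first = blob[offsets[0]:offsets[0] + 8]
print(first.hex(" "))
```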
Exercises to Build Skill
- Create a hexdump of /bin/ls and locate the ELF header (magic bytes 7F 45 4C 46).
- Dump a file with xxd, change one byte in the dump with a text editor, then use xxd -r to write the binary back; verify that the behavior changes.
- Take a UTF-8 text containing emoji; observe how multibyte sequences appear in hexdump.
- Find and extract an embedded PNG inside a larger file using its signature.
When to Use Structured Parsers Instead
Hexdumps are great for exploration, but for complex formats or large-scale parsing use dedicated libraries or tools:
- libpng for PNGs, struct in Python for binary layouts, Scapy for packets, and file-format-specific parsers.
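As an illustration of the struct approach, the sketch below describes a fixed binary layout once and unpacks it into named fields. The 12-byte header format here (magic, version, flags, payload length) is entirely hypothetical and does not correspond to any real file format.

```python
import struct
from collections import namedtuple

# Hypothetical little-endian header: 4-byte magic, two unsigned 16-bit
# fields, and one unsigned 32-bit payload length (12 bytes in total).
HEADER = struct.Struct("<4sHHI")
Header = namedtuple("Header", "magic version flags payload_length")


def parse_header(data: bytes) -> Header:
    """Unpack the hypothetical header from the start of `data`."""
    if len(data) < HEADER.size:
        raise ValueError(f"need {HEADER.size} bytes, got {len(data)}")
    return Header._make(HEADER.unpack_from(data))


# Bytes as they might appear in a hexdump of such a file.
sample = bytes.fromhex("44 45 4d 4f 01 00 00 00 10 00 00 00")
print(parse_header(sample))
```

Declaring the layout in one format string keeps field sizes and byte order in a single place, which is easier to audit than reading offsets off a dump by hand.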
Quick Reference Commands
- hexdump -C file.bin — canonical hex+ASCII
- xxd file.bin — Vim-style hexdump
- od -An -t x1 -v file.bin — hex bytes with od
- xxd -r file.hex > file.bin — convert hex back to binary
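For a quick round trip without external tools, Python's built-in `bytes.hex` and `bytes.fromhex` convert between bytes and a plain hex string. Note that this sketch works on a bare string of hex digits, not on the full offset-plus-ASCII layout that xxd prints.

```python
original = b"Hello\n"

# Bytes to a plain hex string (no offsets, no ASCII column).
hex_text = original.hex()
print(hex_text)

# Hex string back to bytes; the round trip should be lossless.
restored = bytes.fromhex(hex_text)
print(restored == original)
```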
Hexdumps expose the raw bytes that form everything digital — files, memory, and network traffic. With practice you’ll move from seeing columns of hex to quickly recognizing signatures, structures, and subtle corruption.