Hash Functions & Brute Force Attacks
Back to Projects
Cybersecurityhashingmd5securitycryptanalysis

Hash Functions & Brute Force Attacks

Implementation of MD5 hashing, rainbow table attacks, password cracking, and salt-based defenses.

Date
2024
Category
Cybersecurity

Understanding Cryptographic Hash Functions

Lab 3 delves into cryptographic hash functions with hands-on implementation of MD5, practical password cracking techniques, and defense mechanisms. This project explores hash function properties, collision resistance, and demonstrates why unsalted password hashes are vulnerable through participation in a hash-breaking competition.

What is Hashing?

A cryptographic hash function is a mathematical algorithm that takes an input (or "message") of any length and produces a fixed-size string of bytes. The output, typically a "digest," is unique to each unique input. Hash functions are deterministic, meaning the same input will always produce the same hash output. They are designed to be one-way functions, making it computationally infeasible to reverse the process and retrieve the original input from the hash.

  • Deterministic: Same input always produces the same hash
  • Fixed output size: Regardless of input length, hash is always the same size
  • Fast computation: Hash can be computed quickly for any input
  • One-way function: Infeasible to reverse the hash to get the original input
  • Avalanche effect: Small change in input drastically changes the output
  • Collision resistant: Hard to find two different inputs with the same hash

What is MD5?

MD5 (Message Digest Algorithm 5) is a widely used cryptographic hash function that produces a 128-bit (16-byte) hash value, typically expressed as a 32-character hexadecimal number. Designed by Ronald Rivest in 1991, MD5 was initially developed as a cryptographic hash function for digital signatures and message authentication.

MD5 processes input data in 512-bit blocks through a series of mathematical operations including bitwise operations, modular additions, and nonlinear functions. Despite its historical popularity, MD5 is now considered cryptographically broken due to discovered vulnerabilities that allow collision attacks, where two different inputs can produce the same hash output.

Technologies Used

PythonMD5Rainbow TablesDictionary AttacksSalt Generation

Key Features

MD5 hash function implementation from scratch

Brute force password cracking engine

Dictionary attack with common password lists

Rainbow table generation and lookup

Salt generation and secure password hashing

Hash collision detection

Showing 6 of 8 features

Implementation Details

1

Q&A: How does the length of hash correspond to input string?

One of the fundamental properties of cryptographic hash functions like MD5 is that they produce a fixed-size output regardless of the input size. Whether you hash a single character or an entire novel, the MD5 hash will always be exactly 128 bits (32 hexadecimal characters) long.

This fixed-size property is crucial for many applications, including password storage and data integrity verification. It means that comparing hashes is always a constant-time operation, and the storage requirement for hashes is predictable.

Code Explanation

The code demonstrates that regardless of input length (1 to 83 characters), MD5 always produces a 32-character hexadecimal hash. This fixed-size output is a defining characteristic of hash functions.

2

Q&A: Are there any visible correlations between the hash and input string?

No, there are no visible correlations between the input string and its hash output. This is due to the "avalanche effect" - a critical property of cryptographic hash functions where even a tiny change in the input produces a dramatically different hash.

For example, hashing "password" versus "Password" (just changing the case of one letter) produces completely different hashes with no visible pattern or relationship. This property is essential for security, as it prevents attackers from making educated guesses about the original input by examining the hash.

  • Similar inputs produce completely different hashes
  • No pattern can be observed between sequential inputs and their hashes
  • Hash values appear random and uniformly distributed
  • Single bit change in input changes approximately 50% of output bits
  • Impossible to predict hash output without computing it
3

Q&A: What are the issues related to cryptographic weakness of MD5?

MD5 is considered cryptographically broken and unsuitable for further use in security-critical applications. The algorithm has several significant vulnerabilities that have been discovered and exploited over the years:

  • Collision Attacks: Researchers can generate two different inputs that produce the same MD5 hash in a matter of seconds using modern hardware. This breaks the collision resistance property.
  • Preimage Attacks: While theoretically difficult, advances in computing power have made finding an input that produces a specific hash more feasible than originally intended.
  • Rainbow Tables: Pre-computed tables of hashes for common passwords make it trivial to reverse MD5 hashes of weak passwords.
  • Speed is a Weakness: MD5 was designed to be fast, which actually makes it vulnerable to brute-force attacks. Modern GPUs can compute billions of MD5 hashes per second.
  • No Salt Support: MD5 itself doesn't include salting, making identical passwords have identical hashes across different systems.

Despite these vulnerabilities, MD5 is still used in non-security contexts such as checksums for file integrity verification. However, for password storage, digital signatures, and other security-critical applications, modern alternatives like bcrypt, Argon2, or SHA-256 should be used instead.

In this lab, we explore these weaknesses hands-on by implementing brute force attacks, generating rainbow tables, and demonstrating how proper salting can mitigate some (but not all) of MD5's security issues.

4

Breaking Hashes with Brute Force

Brute force attacks on MD5 hashes involve systematically trying every possible input until finding one that produces the target hash. Due to MD5's speed and the absence of built-in protection mechanisms, this approach is surprisingly effective for short passwords.

The attack strategy depends on the expected password length and character set. For a 5-character password using lowercase letters and digits (36 possible characters), there are 36^5 = 60,466,176 possible combinations. While this sounds like a lot, modern computers can compute MD5 hashes at millions per second, making this entirely feasible.

  • Character Set Definition: Determine the possible characters (lowercase, uppercase, digits, symbols)
  • Combination Generation: Use itertools.product to generate all possible strings of target length
  • Hash Computation: Calculate MD5 hash for each generated string
  • Comparison: Check if computed hash matches any target hash
  • Optimization: Stop early if all target hashes are found
  • Progress Tracking: Monitor speed and estimated time remaining

The key insight is that MD5's computational efficiency becomes its weakness. A password that takes microseconds to hash can be cracked in seconds through brute force. This is why modern password hashing algorithms like bcrypt are intentionally slow, using key derivation functions with adjustable work factors.

Code Explanation

This brute force implementation systematically generates all 5-character combinations from the charset (a-z, 0-9) and computes their MD5 hashes. It uses a set for O(1) hash lookups, implements progress tracking, and stops early once all target hashes are found. On modern hardware, this can crack millions of hashes per second, demonstrating why password length and complexity are critical.

5

Defense Mechanism: Salt and Secure Password Storage

What is a Salt?

A salt is a random string of characters that is appended or prepended to a password before hashing. Think of it as adding unique "seasoning" to each password hash. The salt is stored alongside the hash in the database, and while not secret, it dramatically increases the difficulty of hash-breaking attacks.

How Salts Work:

  • Random Generation: A unique salt is generated for each password (typically 16-32 bytes)
  • Concatenation: Salt is combined with the password (e.g., salt + password or password + salt)
  • Hashing: The combined string is hashed using MD5 or a better algorithm
  • Storage: Both the salt and hash are stored in the database
  • Verification: During login, the stored salt is retrieved, combined with the entered password, hashed, and compared

Why Salts Are Critical:

  • Prevents Rainbow Table Attacks: Pre-computed hash tables become useless because each password has a unique salt
  • Eliminates Duplicate Hashes: Two users with password "password123" will have completely different hashes
  • Forces Per-Password Cracking: Attackers must crack each hash individually, no bulk cracking
  • Increases Search Space: Even weak passwords become more resistant to dictionary attacks
  • No Secrecy Required: Salt can be stored in plaintext alongside the hash

Example Without Salt:

Code Explanation

Without salt, identical passwords produce identical hashes. If an attacker cracks one hash, they instantly know the password for all users with that hash. Rainbow tables can be used to crack millions of hashes simultaneously.

6

Implementing Salted Password Hashing

With proper salting, each password gets a unique hash even if the passwords are identical. This makes rainbow tables ineffective and forces attackers to crack each hash individually.

Code Explanation

This implementation generates a unique random salt for each password, combines it with the password before hashing, and stores both salt and hash. During verification, the stored salt is used to hash the provided password for comparison. This defeats rainbow tables and prevents bulk password cracking.

7

Hash Breaking Techniques: A Comprehensive Analysis

Professional password cracking employs multiple sophisticated techniques, each with different strengths and weaknesses. Understanding these methods is crucial for both offensive security assessments and defensive password policy design.

1. Brute Force Attack

  • Method: Try every possible combination systematically
  • Time Complexity: O(c^n) where c = charset size, n = password length
  • Effectiveness: Guaranteed to find password given enough time
  • Speed: Modern GPUs can test billions of MD5 hashes per second
  • Best Against: Short passwords (≤6 characters), limited character sets
  • Defense: Use long passwords (12+ characters) with mixed character types

2. Dictionary Attack

  • Method: Try words from a pre-compiled dictionary/wordlist
  • Common Wordlists: rockyou.txt (14M passwords), SecLists, common passwords lists
  • Variations: Add common patterns (password123, Password!, p@ssw0rd)
  • Effectiveness: 10-30% success rate on real-world databases
  • Speed: Can test millions of passwords in seconds
  • Best Against: Dictionary words, common passwords, predictable patterns
  • Defense: Avoid dictionary words, use passphrases, add random characters

3. Rainbow Table Attack

  • Method: Pre-computed hash tables using time-memory tradeoff
  • How it Works: Generate hash chains that cover the entire keyspace efficiently
  • Storage Requirements: Gigabytes to terabytes depending on coverage
  • Speed: Near-instant lookups once table is built
  • Effectiveness: Extremely fast for unsalted hashes
  • Limitation: Each salt requires a new rainbow table (making them impractical)
  • Defense: Always use unique salts, making rainbow tables ineffective

4. Hybrid Attack

  • Method: Combine dictionary words with character substitutions and patterns
  • Examples: "password" → "p@ssw0rd", "P@ssw0rd!", "Password123!"
  • Rules: Common substitutions (a→@, e→3, i→1, o→0, s→$)
  • Tools: Hashcat and John the Ripper excel at hybrid attacks
  • Effectiveness: Catches users who think l33tspeak makes passwords secure
  • Defense: Use truly random passwords or long passphrases

5. Mask Attack (Smart Brute Force)

  • Method: Brute force with pattern constraints (e.g., "?u?l?l?l?l?d?d" for Password12 pattern)
  • Pattern Definition: ?l=lowercase, ?u=uppercase, ?d=digit, ?s=special
  • Effectiveness: Much faster than pure brute force by focusing on likely patterns
  • Common Patterns: Uppercase first letter, digits at end, special char at end
  • Defense: Use random passwords that don't follow predictable patterns

6. Combinator Attack

  • Method: Concatenate words from multiple wordlists
  • Example: "summer" + "2024" = "summer2024"
  • Effectiveness: Catches passphrases made from common word combinations
  • Speed: Faster than pure brute force, more thorough than dictionary
  • Defense: Use 4+ random words or add random characters between words

7. Statistical Attack / Markov Chains

  • Method: Generate passwords based on statistical probability of character sequences
  • How it Works: Learn patterns from leaked password databases
  • Effectiveness: Can crack "random-looking" passwords that follow human patterns
  • Advanced: Machine learning models trained on billions of real passwords
  • Defense: Use cryptographically random password generators
8

Online Hash Breaking Resources

For educational purposes and security testing, several online resources provide hash lookup and cracking capabilities. These demonstrate how quickly unsalted hashes can be compromised.

Popular Online Hash Crackers:

  • CrackStation (https://crackstation.net/) - Free rainbow table lookup for MD5, SHA1, SHA256, NTLM. Database of 15+ billion entries.
  • Hashes.com (https://hashes.com/en/decrypt/hash) - Multi-algorithm hash lookup with submission system. Community-powered database.
  • MD5 Online (https://www.md5online.org/md5-decrypt.html) - Dedicated MD5 hash cracker with large database.
  • HashKiller (https://hashkiller.io/) - Free hash lookup supporting multiple algorithms.
  • OnlineHashCrack (https://www.onlinehashcrack.com/) - Paid service for serious cracking, supports GPU acceleration.
  • Cyber Chef (https://gchq.github.io/CyberChef/) - Not a cracker but excellent for hash analysis and encoding/decoding.

Offline Tools for Professional Use:

  • Hashcat - GPU-accelerated, supports 300+ hash types, incredibly fast. Industry standard.
  • John the Ripper - CPU-based, excellent rule engine, great for dictionary attacks.
  • Hydra - Network login cracker, supports many protocols.
  • Medusa - Parallel network login cracker.
  • RainbowCrack - Generates and uses rainbow tables.
  • ophcrack - Windows password cracker using rainbow tables.

⚠️ Legal and Ethical Considerations:

  • Only crack hashes you have explicit permission to crack
  • Use these tools for security research, penetration testing, or educational purposes
  • Unauthorized access to computer systems is illegal in most jurisdictions
  • Check local laws regarding password cracking tools and activities
  • Always get written authorization before security testing
  • Respect responsible disclosure practices when finding vulnerabilities

Best Practices for Defense:

  • Never use MD5 or SHA1 for password hashing - use bcrypt, Argon2, or PBKDF2
  • Always use unique random salts (16+ bytes)
  • Implement key stretching (multiple rounds of hashing)
  • Enforce minimum password length (12+ characters)
  • Use password complexity requirements wisely (length > complexity)
  • Implement rate limiting and account lockouts
  • Consider multi-factor authentication (MFA)
  • Monitor for breached credentials using services like Have I Been Pwned
  • Educate users about password managers
  • Conduct regular security audits and penetration testing

Challenges

Implementing MD5 algorithm correctly according to RFC 1321, optimizing brute force performance, generating effective rainbow tables, understanding time-space tradeoffs in password cracking.

Outcome & Impact

Successfully implemented MD5 from scratch, cracked thousands of unsalted password hashes, demonstrated effectiveness of salted hashes in preventing rainbow table attacks.

How I Grew

Mastered cryptographic hash function properties and requirements

Learned practical password cracking techniques and defenses

Understood the critical importance of salt in password storage

Gained experience with time-space tradeoffs in cryptanalysis

Developed skills in optimizing cryptographic operations

Learned why MD5 is deprecated for security-critical applications

Try It: Hash Breaker

Experience firsthand how quickly weak passwords can be cracked through brute force attacks. This demo uses a simplified hash function (SHA-1 truncated to 32 chars) for demonstration purposes, as browsers don't natively support MD5. The principles remain the same: systematic enumeration until a match is found.

⚠️ Note: Real MD5 brute force attacks are much faster. This educational demo runs in your browser and may be slower than native implementations.

Target Hash

Understanding the Attack

Character Set: Defines the pool of possible characters (a-z, A-Z, 0-9)

Search Space: For length N with C characters: C^1 + C^2 + ... + C^N combinations

Example: 4-char lowercase+digits (36 chars) = 36^1 + 36^2 + 36^3 + 36^4 = 1,727,604 combinations

Defense: Use longer passwords (10+ chars), mix character types, and modern algorithms like bcrypt