Cracking a 4x hashed md5 with a couple of hints

Question

I made this project in my free time to try to crack a 4x hashed md5 with a couple of hints but I'm looking for ways I can optimize this algo because it's pretty slow and I know you could probs use threading or some other ways to increase speed.

If you can even point out some minor ways to increase performance I would be eternally grateful.

I was told I could try re-using the memory for the strings but I'm not sure how to do that. I was also told to not generate the wordlist in advance and try as I go but I dont think the algorithm I used supports that?

Here's the branch to review (more info in the readme): https://github.com/TransmissionsDev/crack_yearn_md5/tree/v1

Here's the main.rs file:

#[cfg(debug_assertions)]
use num_format::{Locale, ToFormattedString};

fn hash_md5(input: String) -> String {
    format!("{:x}", md5::compute(input.into_bytes()))
}

fn main() {
    let goal_hash = "dbba1bfe930d953cabcc03d7b6ab05e";

    let length = 17;

    let alphabet = "bdeilmostu-"
        .split("")
        .map(String::from)
        .filter(|letter| !letter.is_empty())
        .collect::<Vec<String>>();

    let mut words = alphabet.clone();

    for phase in 1..length {
        let mut temp: Vec<String> = Vec::new();

        #[cfg(not(debug_assertions))]
        let loopable_words = words.iter();

        #[cfg(debug_assertions)]
        let loopable_words = words.iter().enumerate();

        for data in loopable_words {
            #[cfg(not(debug_assertions))]
            let word = data;

            #[cfg(debug_assertions)]
            let word = data.1;
            #[cfg(debug_assertions)]
            let index = data.0;

            for letter in alphabet.iter() {
                let new_word = format!("{}{}", word, letter);

                temp.push(new_word);
            }

            #[cfg(debug_assertions)]
            if index != 0 && ((index % 1000000) == 0) {
                println!(
                    "Completed phase {}/{}'s sub-phase {}/{}",
                    phase,
                    length,
                    index.to_formatted_string(&Locale::en),
                    words.len().to_formatted_string(&Locale::en),
                );
            }
        }

        words = temp;
    }

    #[cfg(debug_assertions)]
    let word_list_length = words.len();
    #[cfg(debug_assertions)]
    println!("\n\nLength of word list: {}\n\n", word_list_length);

    #[cfg(not(debug_assertions))]
    let loopable_words = words.iter();

    #[cfg(debug_assertions)]
    let loopable_words = words.iter().enumerate();

    for data in loopable_words {
        #[cfg(not(debug_assertions))]
        let word = data;

        #[cfg(debug_assertions)]
        let word = data.1;
        #[cfg(debug_assertions)]
        let attempts = data.0;

        let merged = format!(
            "{}{}",
            word, "........................................................!1"
        );

        let hash = hash_md5(hash_md5(hash_md5(hash_md5(merged.clone()))));

        #[cfg(debug_assertions)]
        println!(
            "Attempts: {}/{}, Hash: {}, Text: {}",
            attempts, word_list_length, hash, merged
        );

        if hash == goal_hash {
            println!("We found the password: {}", merged);

            return;
        }
    }
}

\$\begingroup\$ This looks useful: github.com/deeprobin/bruteforce-rs \$\endgroup\$

TransmissionsDev
– TransmissionsDev

2020-12-09 22:37:17 +00:00
Commented Dec 9, 2020 at 22:37 — TransmissionsDev
– TransmissionsDev, Commented Dec 9, 2020 at 22:37

vallentin · Accepted Answer · 2020-12-21 16:24:55Z

First, I would change the approach fundamentally: I'd consider the plaintext as a 17-digit number in "base 11" (because there are 11 "letters" in the alphabeth). Then the loop becomes a simple iteration through a range from 0 to 11¹⁷-1. On top of that, threading will be easy to add using rayon.

As you were told, we should re-use memory for strings and not generate the wordlist in advance.

To re-use memory for strings, we treat them as long-living containers and call .clear() whenever we need to reuse them for another purpose.

This part in your code

for letter in alphabet.iter() {
    let new_word = format!("{}{}", word, letter);

    temp.push(new_word);
}

is very inefficient as it's calling format!, which is complex machinery, just to append a letter. It would be better to have at least String::push there, but we will take another approach below.

On 4 cores, the all-in-one is roughly as fast as just your non-parallelized first pass. For parallelism, I did parallel fold after Range<u64>. So each thread has a little bit of its own state there.

Make sure you add rayon = "0.6" in the Cargo manifest.

Here is the code:

#[cfg(debug_assertions)]
use num_format::{Locale, ToFormattedString};
use std::str;
use std::io::Write;
use rayon::prelude::*;

static PASS_LENGTH: u32 = 17;

#[derive(Default)]
struct State {
    plaintext: Vec<u8>,
    intermediate: Vec<u8>,
}

impl State {
    fn new() -> Self {
        let mut result = State::default();
        for _ in 0 .. PASS_LENGTH {
            result.plaintext.push(b"x"[0]);
        }
        result.plaintext.extend(b"........................................................!1".iter().copied());
        result
    }
}

fn main() {
    let goal_hash = "dbba1bfe930d953cabcc03d7b6ab05e";

    let num_of_md5 = 4;
    let alphabet = "bdeilmostu-";
    let alphabet_len = alphabet.len() as u64;

    let total = alphabet_len.pow(PASS_LENGTH);

    (0 .. total).into_par_iter().fold(
        || State::new(),
        |mut state, num| {
            for i in 0 .. PASS_LENGTH {
                let letter = ((num / alphabet_len.pow(i)) % alphabet_len) as usize;
                state.plaintext[i as usize] = alphabet.as_bytes()[letter];
            }

            state.intermediate.clear();
            write!(&mut state.intermediate, "{:x}", md5::compute(&state.plaintext[..]));
            for _ in 1 .. num_of_md5 {
                let digest = md5::compute(&state.intermediate[..]);
                state.intermediate.clear();
                write!(&mut state.intermediate, "{:x}", digest);
            }
            let result = &state.intermediate[..];

            if result == goal_hash.as_bytes() {
                println!("We found the password: {}", str::from_utf8(&state.plaintext[..]).unwrap());
            }

            if num % 1_000_000 == 0 && num != 0 {
                #[cfg(debug_assertions)]
                println!(
                    "Attempts: {}/{}, Text: {}",
                    num, total, str::from_utf8(&state.plaintext[..]).unwrap()
                );
            }

            state
        }
    ).for_each(|_| {});
}
```

@TransmissionsDev sure thing. It would be interesting to improve the conversion from number to plaintext probably through smarter division or add-1 with carry.Further, I'm not sure how you'd like to use this program, but it's possible to adapt it for distributed computation or even the cloud. — Peter Blackson
– Peter Blackson, Commented Dec 12, 2020 at 20:00

Stack Exchange Network

Cracking a 4x hashed md5 with a couple of hints

1 Answer 1

Your Answer

Hot Network Questions

Cracking a 4x hashed md5 with a couple of hints

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Related

Hot Network Questions