On the Electrodynamics of Generating Hamming Numbers Efficiently

Ok, this was part of my daily drill - something I wanted to spend about 15 minutes on each day. This one ended up eating up two entire day and disrupted my sleep too.. Shows how far behind the curve I am :)

Finding it so difficult (when I was pursuing the wrong approach) forced me to ask - why is this one only 4 kyu? (A unit of credit on Codewars).

Turns out, if you know the right way, it's easy.

The ask :

Write a function that computes the nth smallest Hamming number.

Specifically:

The first smallest Hamming number is 1 = 203050
The second smallest Hamming number is 2 = 213050
The third smallest Hamming number is 3 = 203150
The fourth smallest Hamming number is 4 = 223050
The fifth smallest Hamming number is 5 = 203051

The 20 smallest Hamming numbers are given in example test fixture.

Your code should be able to compute all of the smallest 5,000 (Clojure: 2000, NASM: 13282) Hamming numbers without timing out.

asdf

Sound easy? It looks so simple and elegant. But, consider, how do you generate the nth? You can't, there is no formula.

So, what you can do is start going down the list of natural numbers and check if each is a Hamming number (easy right, if the prime factors are only any or all of 2,3,5, you've got you an H#). Problem? Factoring numbers is expensive in computation. To find the 500th takes about 3 seconds and the 700th takes about 10 seconds. And they want to you be able to get the 5000th in the blink of an eye.

Turns out it is possible if you go a different route. Thanks to : https://stackoverflow.com/questions/29286845/finding-hamming-numbers-not-code-or-distance (the initial part of the answer by M. Oehm was all I needed. Didn't have much use for his code)

Basic idea - start with 1 and keep adding 2,3,5 as factors to generate H#s and, when you think you might have enough, (since you'll have duplicates with this approach - think about it, 2*3 and 3*2), stop, sort the list and report.

Sorting and list-lookup are okay once. Doing them within a loop is a killer.

So, I just generate n (given ask) + about 5000 numbers (or so, you can run with higher numbers and see if your answer changes :) ) and that works - really in the blink of an eye! In Jupyter!

def getHamming( n ) :

''' return the nth smallest Hamming number 2^i * 3^j * 5^k '''

H_l = [1]

if n == 1 :

return 1

else :

N = 2

while N < n + 5000 :

to_add = []

for core in [2,3,4,5] :

to_add.extend( [x*core for x in H_l] )

to_add = set( to_add )

H_l.extend( to_add )

H_l = list( set(H_l) )

N = len( H_l )

final = list( set(H_l) )

final.sort()

return final[n-1]

In case it helps you out, here are a few :

#	H
1000	51200000
1500	859963392
2000	8062156800
2500	53687091200
3000	278942752080
3500	1228800000000
4000	4701849845760
4500	16124313600000
5000	50837316566580

The perceptive software engineer looking at the code I am so proud of would have noticed some mistakes. As it turned out - though, by my standards the code executed in the blink of an eye, the attempt still failed to run all of Codewars' hundreds of tests in the stipulated 12 seconds.

Did you see what I'm doing wrong - or, what I, at this point in time, think I'm doing wrong?

I use the existing list of Hamming numbers to generate new ones by multiplying by 2,3,5. But, look at the code. If I have 100 now, I add 400 (since I'm also using 4 as a multiple). But, next time round, I'm still using the first 100. Is that lame or what? :)

Finally, gave up and peeked at the solution from this genius :

https://blog.katastros.com/a?ID=00650-45e6c39e-a48c-4f00-b8ab-3710f3a02c0c

which is, only generate the correct next Hamming number. That is, starting with [1], only add 2 (after you look at 2, 3, 5. But, next time, you shouldn't look at 1 from [1,2] when you're using 2. Get it?

My crappy code, so amateurish :

def hamming(n) :

H_l = [1]

i,j,k = (0,0,0)

N = 0

while N < n :

c_2,c_3,c_5 = H_l[i]*2,H_l[j]*3,H_l[k]*5

l = len( {c_2,c_3,c_5} )

if l == 1 :

H_l.append( c_2 )

i += 1

j += 1

k += 1

elif l == 2 :

if c_2 == c_3 :

if c_2 < c_5 :

H_l.append( c_2 )

i += 1

j += 1

else :

H_l.append( c_5 )

k += 1

elif c_2 == c_5 :

if c_2 < c_3 :

H_l.append( c_2 )

i += 1

k += 1

else :

H_l.append( c_3 )

j += 1

elif c_3 < c_2 :

H_l.append( c_3 )

j += 1

k += 1

else :

H_l.append( c_2 )

i += 1

else :

if c_2 < c_3 and c_2 < c_5 :

H_l.append( c_2 )

i += 1

elif c_3 < c_2 and c_3 < c_5 :

H_l.append( c_3 )

j += 1

else :

H_l.append( c_5 )

k += 1

N += 1

return H_l[n-1]

So, what can you learn from the best solution? What's better than this horrible nested if statement? We want to increment the i,j,k in the case that any of them was employed in creating the next H :

Instead of using i,j,k, use a list : expos and then, it's just :

for i in range(3):

expos[i] += int(next_hamms[i] == next_hamm)

I was trying to DRY starting with Karthik's code..

Cute! And, for generating the next candidates, be pythonic :)

Don't do :

c_2,c_3,c_5 = H_l[i]*2,H_l[j]*3,H_l[k]*5

Which also doesn't extend to the case of different bases :)

Instead, do :

next_hamms = [bases[i] * hamms[expos[i]] for i in range(3)]

Python Kaizen - The Infinite Loop of Improvement

Search This Blog

One Path to Bankruptcy for Replit

On the Electrodynamics of Generating Hamming Numbers Efficiently

Labels

Comments

Post a Comment

Popular posts from this blog

Align an Embedded Image in Jupyter Markdown

openCV : Really Filtering by Color

One Path to Bankruptcy for Replit