SHA256 is hashing function used by many internet standard such as TLS and SSL. form wiki, It one of SHA2 family of hash functions
SHA-2 includes significant changes from its predecessor, SHA-1. The SHA-2 family consists of six hash functions with digests (hash values) that are 224, 256, 384 or 512 bits:[5] SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256. SHA-256 and SHA-512 are hash functions whose digests are eight 32-bit and 64-bit words, respectively. They use different shift amounts and additive constants, but their structures are otherwise virtually identical, differing only in the number of rounds. SHA-224 and SHA-384 are truncated versions of SHA-256 and SHA-512 respectively, computed with different initial values. SHA-512/224 and SHA-512/256 are also truncated versions of SHA-512, but the initial values are generated using the method described in Federal Information Processing Standards (FIPS) PUB 180-4.
The algorithm is not super hard. I did something similar with MD5
Note 1: All variables are 32 bit unsigned integers and addition is calculated modulo 232 Note 2: For each round, there is one round constant k[i] and one entry in the message schedule array w[i], 0 ≤ i ≤ 63 Note 3: The compression function uses 8 working variables, a through h Note 4: Big-endian convention is used when expressing the constants in this pseudocode, and when parsing message block data from bytes to words, for example, the first word of the input message “abc” after padding is 0x61626380
Initialize hash values: (first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19): h0 := 0x6a09e667 h1 := 0xbb67ae85 h2 := 0x3c6ef372 h3 := 0xa54ff53a h4 := 0x510e527f h5 := 0x9b05688c h6 := 0x1f83d9ab h7 := 0x5be0cd19
Initialize array of round constants: (first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311): k[0..63] := 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
Pre-processing (Padding): begin with the original message of length L bits append a single ‘1’ bit append K ‘0’ bits, where K is the minimum number >= 0 such that (L + 1 + K + 64) is a multiple of 512 append L as a 64-bit big-endian integer, making the total post-processed length a multiple of 512 bits such that the bits in the message are: 1 <L as 64 bit integer> , (the number of bits will be a multiple of 512)
Process the message in successive 512-bit chunks: break message into 512-bit chunks for each chunk create a 64-entry message schedule array w[0..63] of 32-bit words (The initial values in w[0..63] don’t matter, so many implementations zero them here) copy chunk into first 16 words w[0..15] of the message schedule array
Extend the first 16 words into the remaining 48 words w[16..63] of the message schedule array: for i from 16 to 63 s0 := (w[i-15] rightrotate 7) xor (w[i-15] rightrotate 18) xor (w[i-15] rightshift 3) s1 := (w[i-2] rightrotate 17) xor (w[i-2] rightrotate 19) xor (w[i-2] rightshift 10) w[i] := w[i-16] + s0 + w[i-7] + s1 Initialize working variables to current hash value: a := h0 b := h1 c := h2 d := h3 e := h4 f := h5 g := h6 h := h7 Compression function main loop: for i from 0 to 63 S1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25) ch := (e and f) xor ((not e) and g) temp1 := h + S1 + ch + k[i] + w[i] S0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22) maj := (a and b) xor (a and c) xor (b and c) temp2 := S0 + maj h := g g := f f := e e := d + temp1 d := c c := b b := a a := temp1 + temp2 Add the compressed chunk to the current hash value: h0 := h0 + a h1 := h1 + b h2 := h2 + c h3 := h3 + d h4 := h4 + e h5 := h5 + f h6 := h6 + g h7 := h7 + h
Produce the final hash value (big-endian): digest := hash := h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7
Code Link to heading
The Algorithm is straightforward to translate to python. I only needed rightrotate
function. The wiki had the hash for empty string and that what i used for testing.
import struct
def rightrotate(x, n):
return (x >> n) | (x << (32 - n)) & 0xFFFFFFFF
def mysha256(data):
h0 = 0x6a09e667
h1 = 0xbb67ae85
h2 = 0x3c6ef372
h3 = 0xa54ff53a
h4 = 0x510e527f
h5 = 0x9b05688c
h6 = 0x1f83d9ab
h7 = 0x5be0cd19
k = [
0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
]
original_byte_len = len(data)
original_bit_len = original_byte_len * 8
data += b'\x80'
while (len(data) * 8 + 64) % 512 != 0:
data += b'\x00'
data += struct.pack('>Q', original_bit_len)
for i in range(0, len(data), 64):
chunk = data[i:i+64]
w = [0] * 64
for j in range(16):
w[j] = struct.unpack('>I', chunk[j*4:j*4+4])[0]
for j in range(16, 64):
s0 = rightrotate(w[j-15], 7) ^ rightrotate(w[j-15], 18) ^ (w[j-15] >> 3)
s1 = rightrotate(w[j-2], 17) ^ rightrotate(w[j-2], 19) ^ (w[j-2] >> 10)
w[j] = (w[j-16] + s0 + w[j-7] + s1) & 0xFFFFFFFF
a, b, c, d, e, f, g, h = h0, h1, h2, h3, h4, h5, h6, h7
for j in range(64):
S1 = rightrotate(e, 6) ^ rightrotate(e, 11) ^ rightrotate(e, 25)
ch = (e & f) ^ ((~e) & g)
temp1 = (h + S1 + ch + k[j] + w[j]) & 0xFFFFFFFF
S0 = rightrotate(a, 2) ^ rightrotate(a, 13) ^ rightrotate(a, 22)
maj = (a & b) ^ (a & c) ^ (b & c)
temp2 = (S0 + maj) & 0xFFFFFFFF
h = g
g = f
f = e
e = (d + temp1) & 0xFFFFFFFF
d = c
c = b
b = a
a = (temp1 + temp2) & 0xFFFFFFFF
h0 = (h0 + a) & 0xFFFFFFFF
h1 = (h1 + b) & 0xFFFFFFFF
h2 = (h2 + c) & 0xFFFFFFFF
h3 = (h3 + d) & 0xFFFFFFFF
h4 = (h4 + e) & 0xFFFFFFFF
h5 = (h5 + f) & 0xFFFFFFFF
h6 = (h6 + g) & 0xFFFFFFFF
h7 = (h7 + h) & 0xFFFFFFFF
return ''.join([f'{x:08x}' for x in [h0, h1, h2, h3, h4, h5, h6, h7]])
if __name__ == '__main__':
x = mysha256(b"")
print(x)
assert x == "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"