Updated the 'data/data-representation' session to be in line with the OpenEdu methodology. #13
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# ASCII Art | ||
|
||
My friend Tommy is really passionate about art. Lately, he got into something called ASCII art and e-mailed me the following attachment. | ||
I am too afraid to tell him that I don't know how to open it. Can you help? | ||
Review comment: Convert this file to Linux line endings.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
import base64 | ||
Check failure on line 1 in chapters/data/data-representation/drills/tasks/ascii-art/solution/solution.py GitHub Actions / Checkpatch
|
||
|
||
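# Read the Base64-encoded attachment and decode it into the ASCII-art text. | ||
with open("tommys_art_project.txt", "r") as f: | ||
    text = base64.b64decode(f.read()).decode("ascii") | ||
|
||
# Collect every decimal number hidden in the drawing, in reading order. | ||
ascii_list = [] | ||
a = 0 | ||
|
||
for x in text: | ||
    if x.isdigit(): | ||
        a = a * 10 + int(x) | ||
    elif a != 0: | ||
        # Any non-digit character (space, '#', newline) ends the current number. | ||
        ascii_list.append(a) | ||
        a = 0 | ||
|
||
# Flush a number that runs up to the very end of the text. | ||
if a != 0: | ||
    ascii_list.append(a) | ||
|
||
# Each number is an ASCII code; join the characters to recover the message. | ||
print("".join(chr(x) for x in ascii_list)) |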
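The same idea can be written more compactly with a regular expression. This is only a sketch: `extract_message` and the miniature `sample_art` drawing are hypothetical names and data made up for illustration, not part of the submitted solution or the real attachment.

```python
import base64
import re


def extract_message(encoded_art):
    """Decode Base64 ASCII art and turn every decimal number in it into a character."""
    art = base64.b64decode(encoded_art).decode("ascii")
    # re.findall returns the runs of digits in reading order (left to right, top to bottom).
    return "".join(chr(int(number)) for number in re.findall(r"\d+", art))


# Hypothetical miniature drawing: numbers are separated by spaces, '#' or newlines.
sample_art = " 72#105 #\n# 33 ##"
sample_encoded = base64.b64encode(sample_art.encode("ascii"))

print(extract_message(sample_encoded))  # Hi!
```

Using `\d+` treats any non-digit character as a separator, so a number that ends right before a newline is still picked up.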
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
ICA4MyM4MyAgICM4MyMjICAgIzEyMyMgICAgOTcjICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgMTE1ICAgCiAjICAgICAjICMgICAgICMgIyAgICAgIyAgIyAgICAjICAgICMgICM5OSMgMTA1IzEwNSAgICAgICAgIyAgICAjIDk1IzQ5IyA1MyM5NSAgNTMjNDgjICAgICMgIAogIyAgICAgICAjICAgICAgICMgICAgICAgICMgICAgOTUgICAjICMgICAgIyAgICMgICAgICAgICAgICMgICAgIyAjICAgICAgIyAgICAjICMgICAgICAgICAjICAKICA5OSM0OCAgIDQ4IzQ4ICAgNDgjNDggIDQ4ICAgICMgIyAgIyAjICAgICMgICAjICAgICAgICAgICA0OCM0OCMgNDgjNDggICMgICAgIyAjMTA4IyAgICAgIyMgCiAgICAgICAjICAgICAgICMgICAgICAgIyAgIyAgICAjICAjICMgIyAgICAjICAgIyAgICAgICAgICAgIyAgICAjICMgICAgICAjMTI1IyAgIyAgICAgICAgICMgIAogIyAgICAgIyAjICAgICAjICMgICAgICMgICMgICAgIyAgICMjICMgICAgIyAgICMgICAgICAgICAgICMgICAgIyAjICAgICAgIyAgICMgICMgICAgICAgICAjICAKICAjIyMjIyAgICMjIyMjICAgIyMjIyMgICAgIyMjICMgICAgIyAgIyMjIyAgICAjICAgICAgICAgICAjICAgICMgIyMjIyMjICMgICAgIyAjIyMjIyMgIyMjICAgCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICMjIyMjIyMgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIA== |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Encoding Train | ||
|
||
I wanted to learn about encodings, so I changed my e-mail password and then encoded it 20 times with base16, base32, base64 and base85, picking one at random each time. | ||
Now I don't know how to get it back and I really need access to my e-mail. Can you take a look? |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
import base64 | ||
|
||
# Strip surrounding whitespace so a trailing newline does not break detection. | ||
with open("my_encodings.txt", "rb") as f: | ||
    flag = f.read().strip() | ||
|
||
# Candidate codecs, tried from the smallest alphabet to the largest, | ||
# so that a narrower encoding is not mistaken for a wider one. | ||
CODECS = [ | ||
    (base64.b16decode, base64.b16encode), | ||
    (base64.b32decode, base64.b32encode), | ||
    (base64.b64decode, base64.b64encode), | ||
    (base64.b85decode, base64.b85encode), | ||
] | ||
|
||
def try_decode(s): | ||
    """Return s decoded with the first codec that round-trips, or None.""" | ||
    for decode, encode in CODECS: | ||
        try: | ||
            decoded = decode(s) | ||
            if encode(decoded) == s: | ||
                return decoded | ||
        except Exception: | ||
            continue | ||
    return None | ||
|
||
# Peel off one layer at a time until the flag prefix shows up. | ||
while b'SSS{' not in flag: | ||
    decoded = try_decode(flag) | ||
    if decoded is None: | ||
        # Without this guard the loop would spin forever on unknown data. | ||
        raise ValueError("no known encoding matches the current data") | ||
    flag = decoded | ||
print(flag) |
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Froggified | ||
|
||
My human friend got turned into a frog and he's been trying to base his communication with me through a computer, but he couldn't really work his way with Linux and lost the message in all those 0s and 1s. | ||
He jumps around angry on my desk, can you help me find his message? |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Froggified: Solution | ||
|
||
First, we notice that the file contains only 0s and 1s, so we decode it from `binary` and get the first fake flag. | ||
The text that follows looks like a list of `ASCII` codes written in `decimal`, so we decode those and get the next fake flag. | ||
After that, no digit higher than 7 appears, which suggests `octal`; decoding it leads to the next fake flag. | ||
Next, we see a lot of `0x` prefixes, so we decode from `hexadecimal` and reach the last fake flag. | ||
Finally, the remaining text is encoded in `Base64`, so we decode it to get the real flag. |
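The chain of decodings can be sketched in Python. The exact layout of the challenge file is not shown here, so this example assumes that every layer is a list of whitespace-separated character codes; `encode_tokens`, `decode_tokens` and the `secret` value are hypothetical and only build a small stand-in for the real file.

```python
import base64


def encode_tokens(text, fmt):
    """Write each character of text as a number, using a format spec such as '08b'."""
    return " ".join(format(ord(c), fmt) for c in text)


def decode_tokens(text, base):
    """Turn whitespace-separated numbers written in the given base back into characters."""
    return "".join(chr(int(token, base)) for token in text.split())


# Build a stand-in for the challenge: Base64, then hex, octal, decimal and binary layers.
secret = "SSS{example}"
layer = base64.b64encode(secret.encode("ascii")).decode("ascii")
for fmt in ("#x", "o", "d", "08b"):  # "#x" keeps the 0x prefix on every token
    layer = encode_tokens(layer, fmt)

# Undo the layers in the order described above: binary, decimal, octal, hexadecimal.
for base in (2, 10, 8, 16):
    layer = decode_tokens(layer, base)

recovered = base64.b64decode(layer).decode("ascii")
print(recovered)  # SSS{example}
```

`int(token, 16)` accepts the `0x` prefix, so the hexadecimal layer needs no extra cleanup.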
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,4 @@ | ||||||||||
# Infinity Hashes | ||||||||||
|
||||||||||
Thanos had 6 Infinity Stones, but I have something better: 6 Infinity Hashes, that store the truth of the whole universe. | ||||||||||
Wrap it in `SSS{}`, use `_` instead of spaces, and share it with everyone! | ||||||||||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Infinity Hashes: Solution | ||
|
||
Use any tool (like [Crackstation](https://crackstation.net/)) to crack the hashes. |
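For intuition about what such a tool does, here is a minimal dictionary-lookup sketch using `hashlib`. It guesses the algorithm from the digest length, which is only a heuristic because several algorithms share the same output size; `CANDIDATES`, `crack` and the tiny `wordlist` are hypothetical and are not the real answers to this task.

```python
import hashlib

# Hex digest length -> hashlib algorithms with that output size (not exhaustive).
CANDIDATES = {32: ["md5"], 40: ["sha1"], 64: ["sha256"], 128: ["sha512"]}


def crack(digest, wordlist):
    """Return (algorithm, word) for the first word whose hash matches digest, else None."""
    digest = digest.lower()
    for name in CANDIDATES.get(len(digest), []):
        for word in wordlist:
            if hashlib.new(name, word.encode("utf-8")).hexdigest() == digest:
                return name, word
    return None


# Hypothetical word list and target, built here so the example is self-contained.
wordlist = ["password", "thanos", "infinity"]
target = hashlib.sha1(b"infinity").hexdigest()

result = crack(target, wordlist)
print(result)  # ('sha1', 'infinity')
```

Online services work along the same lines, but with precomputed tables covering a very large number of candidate words.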
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
9dee45a24efffc78483a02cfcfd83433 7d7764888dc13da6436666cd97043a5ead014596cd9aef6d12f95a5602d15993beebd44b92014732c340239e7cd1f07c5ed4035d1a02151916e94d43ffa55cfa 54c37762351ac25b7e86477d8a078faf347fcfabdf044c4376d0728ce6cdcfe4801e0e1bcdbc0e2d71042fbf20779b2dd580975ac6c59d157250d0ce398f7c69 2b167392676cab961cb5d2ed5633ae05 db3d405b10675998c030223177d42e71b4e7a312 9ed1515819dec61fd361d5fdabb57f41ecce1a5fe1fe263b98c0d6943b9b232e |
Review comment: Honestly there was no need for this image to be
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,198 @@ | ||
# Character Encoding | ||
|
||
## ASCII | ||
|
||
ASCII (American Standard Code for Information Interchange) is a 7-bit character encoding. | ||
Its codes range from 0 to 127: | ||
|
||
```text | ||
DEC  HEX ASCII   DEC  HEX ASCII   DEC  HEX ASCII   DEC  HEX ASCII   DEC  HEX ASCII | ||
  0  00  NUL      26  1A  SUB      52  34  4        78  4E  N       104  68  h | ||
  1  01  SOH      27  1B  ESC      53  35  5        79  4F  O       105  69  i | ||
  2  02  STX      28  1C  FS       54  36  6        80  50  P       106  6A  j | ||
  3  03  ETX      29  1D  GS       55  37  7        81  51  Q       107  6B  k | ||
  4  04  EOT      30  1E  RS       56  38  8        82  52  R       108  6C  l | ||
  5  05  ENQ      31  1F  US       57  39  9        83  53  S       109  6D  m | ||
  6  06  ACK      32  20  SPACE    58  3A  :        84  54  T       110  6E  n | ||
  7  07  BEL      33  21  !        59  3B  ;        85  55  U       111  6F  o | ||
  8  08  BS       34  22  "        60  3C  <        86  56  V       112  70  p | ||
  9  09  HT       35  23  #        61  3D  =        87  57  W       113  71  q | ||
 10  0A  LF       36  24  $        62  3E  >        88  58  X       114  72  r | ||
 11  0B  VT       37  25  %        63  3F  ?        89  59  Y       115  73  s | ||
 12  0C  FF       38  26  &        64  40  @        90  5A  Z       116  74  t | ||
 13  0D  CR       39  27  '        65  41  A        91  5B  [       117  75  u | ||
 14  0E  SO       40  28  (        66  42  B        92  5C  \       118  76  v | ||
 15  0F  SI       41  29  )        67  43  C        93  5D  ]       119  77  w | ||
 16  10  DLE      42  2A  *        68  44  D        94  5E  ^       120  78  x | ||
 17  11  DC1      43  2B  +        69  45  E        95  5F  _       121  79  y | ||
 18  12  DC2      44  2C  ,        70  46  F        96  60  `       122  7A  z | ||
 19  13  DC3      45  2D  -        71  47  G        97  61  a       123  7B  { | ||
 20  14  DC4      46  2E  .        72  48  H        98  62  b       124  7C  | | ||
 21  15  NAK      47  2F  /        73  49  I        99  63  c       125  7D  } | ||
 22  16  SYN      48  30  0        74  4A  J       100  64  d       126  7E  ~ | ||
 23  17  ETB      49  31  1        75  4B  K       101  65  e       127  7F  DEL | ||
 24  18  CAN      50  32  2        76  4C  L       102  66  f | ||
 25  19  EM       51  33  3        77  4D  M       103  67  g | ||
``` | ||
|
||
In the table, adding 32 to the code of an uppercase letter gives the code of the same letter in lowercase, e.g. `E + 32 = e` (69 + 32 = 101). | ||
|
||
The built-in Python functions `ord` and `chr` convert between a character and its code: `ord` returns the code of a character, and `chr` returns the character that corresponds to a code. | ||
|
||
```py | ||
>>> ord('E') | ||
69 | ||
>>> chr(69) | ||
'E' | ||
>>> chr(ord('E') + 32) | ||
'e' | ||
>>> chr(ord('e') - 32) | ||
'E' | ||
>>> | ||
``` | ||
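Because 32 is `0x20`, a single bit, the upper and lower case forms of an ASCII letter differ in exactly one bit. The sketch below flips that bit with XOR to toggle the case; `toggle_case` is a hypothetical helper written for this illustration, and it only touches ASCII letters.

```python
def toggle_case(text):
    """Swap the case of every ASCII letter by flipping bit 5 (0x20) of its code."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            result.append(chr(ord(ch) ^ 0x20))
        else:
            # Digits, punctuation and non-ASCII characters are left unchanged.
            result.append(ch)
    return "".join(result)


print(toggle_case("Hello, World!"))  # hELLO, wORLD!
```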
|
||
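In terms of storage efficiency, the best Unicode encoding depends on the text. | ||
`UTF-8` is more compact for mostly-ASCII text (English and other Western languages), because each ASCII character takes a single byte. | ||
`UTF-16` is usually more compact for text made mostly of Chinese and other East Asian characters, which typically take 2 bytes in `UTF-16` but 3 bytes in `UTF-8`. | ||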
|
||
Let's say we have a string in Chinese. With Python, we can get the hex bytes of the string, using the built-in function `str.encode()`: | ||
|
||
```py | ||
Str = "老板" | ||
print(Str) | ||
|
||
Str = Str.encode("utf-8") | ||
print(Str) | ||
``` | ||
|
||
The output will be: | ||
|
||
```text | ||
老板 | ||
b'\xe8\x80\x81\xe6\x9d\xbf' | ||
``` | ||
|
||
If we want to get back to the original string, we will execute: | ||
|
||
```py | ||
print(Str.decode("utf-8")) | ||
``` | ||
|
||
And we will get: | ||
|
||
```text | ||
老板 | ||
``` | ||
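To compare the storage cost of the two encodings, we can count the bytes each one produces. This is a small sketch: `encoded_sizes` is a hypothetical helper, and it uses the `utf-16-le` codec so that the 2-byte byte order mark added by plain `utf-16` does not distort the comparison.

```python
def encoded_sizes(text):
    """Return the size in bytes of text in UTF-8 and in UTF-16 (without a byte order mark)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for sample in ("boss", "老板"):
    utf8_size, utf16_size = encoded_sizes(sample)
    print(f"{sample}: UTF-8 = {utf8_size} bytes, UTF-16 = {utf16_size} bytes")
```

The ASCII word takes 4 bytes in UTF-8 and 8 in UTF-16, while the two Chinese characters take 6 bytes in UTF-8 and 4 in UTF-16.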
|
||
## Base64 | ||
|
||
Base64 is a way of representing binary data as text: the input is split into groups of 24 bits (3 bytes), and each group is written as 4 Base64 digits of 6 bits each. | ||
|
||
Base64 Encoding Table: | ||
|
||
```text | ||
Index Char Index Char Index Char Index Char | ||
0 A 16 Q 32 g 48 w | ||
1 B 17 R 33 h 49 x | ||
2 C 18 S 34 i 50 y | ||
3 D 19 T 35 j 51 z | ||
4 E 20 U 36 k 52 0 | ||
5 F 21 V 37 l 53 1 | ||
6 G 22 W 38 m 54 2 | ||
7 H 23 X 39 n 55 3 | ||
8 I 24 Y 40 o 56 4 | ||
9 J 25 Z 41 p 57 5 | ||
10 K 26 a 42 q 58 6 | ||
11 L 27 b 43 r 59 7 | ||
12 M 28 c 44 s 60 8 | ||
13 N 29 d 45 t 61 9 | ||
14 O 30 e 46 u 62 + | ||
15 P 31 f 47 v 63 / | ||
``` | ||
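To see how the table is used, the following sketch encodes one 3-byte group by hand and compares the result with the `base64` module. `ALPHABET` and `encode_group` are names made up for this illustration, and the helper deliberately handles only a full 3-byte group (no padding).

```python
import base64
import string

# The 64 digits in table order: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes):
    """Encode exactly 3 bytes (24 bits) as 4 Base64 digits of 6 bits each."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    value = int.from_bytes(three_bytes, "big")
    # Take the 24-bit value 6 bits at a time, most significant bits first.
    return "".join(ALPHABET[(value >> shift) & 0x3F] for shift in (18, 12, 6, 0))


manual = encode_group(b"Som")
library = base64.b64encode(b"Som").decode("ascii")
print(manual, library)  # U29t U29t
```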
|
||
If we want to convert to and from Base64 in Python we can use the `base64` module: | ||
|
||
```py | ||
import base64 | ||
|
||
message = "Some random message" | ||
message_bytes = message.encode('ascii')  # str -> bytes: b'Some random message' | ||
base64_bytes = base64.b64encode(message_bytes) | ||
base64_message = base64_bytes.decode('ascii')  # bytes -> str, so it prints without the b'' wrapper | ||
|
||
print(base64_message) | ||
``` | ||
|
||
This prints the Base64 string: | ||
|
||
```text | ||
U29tZSByYW5kb20gbWVzc2FnZQ== | ||
``` | ||
|
||
To decode it, we simply reverse the steps: | ||
|
||
```py | ||
import base64 | ||
|
||
base64_message = 'U29tZSByYW5kb20gbWVzc2FnZQ==' | ||
base64_bytes = base64_message.encode('ascii') | ||
message_bytes = base64.b64decode(base64_bytes) | ||
message = message_bytes.decode('ascii') | ||
|
||
print(message) | ||
``` | ||
|
||
This gets us back to: | ||
|
||
```text | ||
Some random message | ||
``` | ||
|
||
As you can see, some Base64 strings end with "=", some end with "==", and others have no padding at all. | ||
Base64 encodes the input in groups of 3 bytes, so the last group needs special treatment when the input length is not divisible by 3. | ||
In that case, the output is padded with "=" as follows: | ||
|
||
```text | ||
length % 3 = 1 => "==" | ||
length % 3 = 2 => "=" | ||
length % 3 = 0 => no padding | ||
``` | ||
|
||
The following can be used to better understand output padding: | ||
|
||
```py | ||
import base64 | ||
|
||
text1 = "SecuritySummmerSch" | ||
text2 = "SecuritySummmerScho" | ||
text3 = "SecuritySummmerSchoo" | ||
text4 = "SecuritySummmerSchool" | ||
|
||
b64_text1 = (base64.b64encode(text1.encode('ascii'))).decode('ascii') | ||
b64_text2 = (base64.b64encode(text2.encode('ascii'))).decode('ascii') | ||
b64_text3 = (base64.b64encode(text3.encode('ascii'))).decode('ascii') | ||
b64_text4 = (base64.b64encode(text4.encode('ascii'))).decode('ascii') | ||
|
||
print(f"Plain text\t\tPT length\tBase64 text\t\t\tB64 length") | ||
print(f"{text1}\t{len(text1)}\t\t{b64_text1}\t{len(b64_text1)}") | ||
print(f"{text2}\t{len(text2)}\t\t{b64_text2}\t{len(b64_text2)}") | ||
print(f"{text3}\t{len(text3)}\t\t{b64_text3}\t{len(b64_text3)}") | ||
print(f"{text4}\t{len(text4)}\t\t{b64_text4}\t{len(b64_text4)}") | ||
``` | ||
|
||
We get: | ||
|
||
```text | ||
Plain text             PT length  Base64 text                   B64 length | ||
SecuritySummmerSch     18         U2VjdXJpdHlTdW1tbWVyU2No      24 | ||
SecuritySummmerScho    19         U2VjdXJpdHlTdW1tbWVyU2Nobw==  28 | ||
SecuritySummmerSchoo   20         U2VjdXJpdHlTdW1tbWVyU2Nob28=  28 | ||
SecuritySummmerSchool  21         U2VjdXJpdHlTdW1tbWVyU2Nob29s  28 | ||
``` | ||
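The lengths in this output follow directly from the padding rule: every started group of 3 input bytes produces 4 output characters. The sketch below checks that relationship for the same input lengths; `expected_b64_length` and `expected_padding` are hypothetical helpers written for this illustration.

```python
import base64
import math


def expected_b64_length(n):
    """Length of the padded Base64 encoding of n bytes: 4 characters per started 3-byte group."""
    return 4 * math.ceil(n / 3)


def expected_padding(n):
    """Number of '=' characters appended when encoding n bytes."""
    return (3 - n % 3) % 3


for n in range(18, 22):
    encoded = base64.b64encode(b"A" * n).decode("ascii")
    padding = len(encoded) - len(encoded.rstrip("="))
    print(n, len(encoded), padding)
    assert len(encoded) == expected_b64_length(n)
    assert padding == expected_padding(n)
```

For the lengths 18 to 21 this gives encoded lengths 24, 28, 28 and 28, with 0, 2, 1 and 0 padding characters.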
|
||
Try decoding this one yourself! | ||
|
||
```text | ||
SGVsbG8gZnJvbSB0aGUgRWFydGgtNjQgIQ== | ||
``` |
Review comment: Use one sentence per line.