By convention sweet is sweet, bitter is bitter, hot is hot, cold is cold, color is color; but in truth there are only atoms and the void. — Democritus
import time from flask import Flask, request app = Flask(__name__) SECRET_TOKEN = "hacky" @app.route("/") def protected(): token = request.headers.get("X-TOKEN") if not token: return "Missing token", 401 if strcmp(token, SECRET_TOKEN): return "Hello admin user! Here is your secret content" else: return "WHO ARE YOU? GET OUT!", 403
def strcmp(s1, s2): if len(s1) != len(s2): return False for c1, c2 in zip(s1, s2): if c1 != c2: return False time.sleep(0.01) return True if __name__ == "__main__": app.run()
$ python3 hack.py Trying to find the character at position 1 with prefix '' ..........................Found character at position 1: 'h' Median: 0.014210520494088996 Max: 0.014913284991052933 Min: 0.011371066997526214 Stddev: 0.0009157040256580121 Following characters were: Character: 'o' Median: 0.0038346719957189634 Max: 0.004211759005556814 Min: 0.0018405050068395212 Stddev: 0.0005487564034908333 (73% slower) Character: 'i' Median: 0.0037917059962637722 Max: 0.004303172987420112 Min: 0.0014267200022004545 Stddev: 0.0006530132317953015 (73% slower) Character: 'j' Median: 0.0037598100025206804 Max: 0.004142267003771849 Min: 0.0015765940042911097 Stddev: 0.0007008104778241527 (73% slower) [snip…] Trying to find the character at position 5 with prefix 'hack' ......................... The token is: 'hacky' !!!!!!!!!!
C7Re8J81k
CTF Team at UTEC
Best (and only) peruvian team at CTF Time
Alvaro
CS graduate
Cybersecurity speciality
To be decided:
A program is a file with instructions for our CPU and OS.
When executed, the OS creates a process (with some state) with the instructions and the CPU runs them.
A command is some executable available in the system.
Running a command means that the OS starts a process with the executable code and passes the corresponding arguments and environment.
NOTE: actually, Bash has some commands that are in fact not system executables (e.g., aliases, builtins)
Open a terminal and type the command, arguments are specified after the command name. Commands can be specified specifying a path or be unqualified. If specified by path, the location must hold the file to be executed.
NOTE: Any program can invoke other program, Bash is just another program
print_args.py
#!/usr/bin/env python3
import sys
for i, a in enumerate(sys.argv):
print(f'arg {i}: {a}')
This python script prints all arguments given.
$ ./print_args.py arg 0: ./print_args.py $ $PWD/print_args.py arg 0: /tmp/tmp.x343BPOn09/print_args.py
The first argument passed to a command (implicitly) is the executable name used to invoke it.
Passing words after the command are arguments to the command.
$ ./print_args.py A B C arg 0: ./print_args.py arg 1: A arg 2: B arg 3: C
$IFSNOTE: The characters in “$IFS” are used to separate the arguments.
$ echo "${IFS@Q}"
$' \t\n'
As $IFS characters separate words, arguments using such characters must be quoted ("" or '') or escaped (\).
$ ./print_args.py "A B" C D arg 0: ./print_args.py arg 1: A B arg 2: C arg 3: D $ ./print_args.py A\ B C D arg 0: ./print_args.py arg 1: A B arg 2: C arg 3: D
It’s up to the command to interpret as whatever they need. E.g., they must resolve the paths to files, convert the strings to integers, etc.
sum_two.py
#!/usr/bin/env python import sys print(sys.argv[1] + sys.argv[2])
$ ./sum_two.py 1 2 12
Python summed the strings and the result is the concatenation.
If we wanted the integer sum we need to convert them to the datatype.
Similar operations should be done in other languages
sum_two_fixed.py
#!/usr/bin/env python import sys print(int(sys.argv[1]) + int(sys.argv[2]))
$ ./sum_two_fixed.py 1 2 3
Paths to files can be specified in relative or absolute form.
Specify the same location anywhere in the system. Start with /
Examples:
$ cat /etc/passwd $ cat /proc/1/environ
Start with ./, ../, or with the name of a file/directory.
Examples:
$ cat ./file.txt $ cat file.txt # Same as last one $ cat ../file_in_parent.txt
They are resolved relative to the shell present working directory (PWD).
$PWDEach Linux process (e.g., Bash) has a PWD associated to it by the OS.
This can be changed by the process itself (OS: chdir syscall).
Bash uses cd builtin (e.g., cd some/path). C programs would do the chdir
syscall. Python has os.chdir().
Note: Target path can be relative or absolute.
In [1]: import os
In [2]: os.getcwd()
Out[2]: '/home/hacker'
In [3]: os.chdir("Desktop")
In [4]: os.getcwd()
Out[4]: '/home/hacker/Desktop'
echoDisplay text
echo STRING
$ echo 'Hello, world' Hello, world
pwdPrint the name of the working directory
$ pwd /home/hacker
cdChange the working directory (also called CWD, current working directory)
cd DIR
$ pwd /home/hacker $ cd Desktop/ $ pwd /home/hacker/Desktop $ cd /etc/systemd/ $ pwd /etc/systemd
Note: This is not actually an external executable but a bash builtin that changes the process’ CWD
lsList directory contents
ls [DIR]
$ ls A B C $ ls A file_A_001.txt file_A_002.txt file_A_003.txt file_A_004.txt file_A_005.txt
catConcatenate file contents
cat FILE1 FILE2
$ cat hello.txt Hello, world! $ cat hello.txt hello.txt Hello, world! Hello, world! $ cat hello_tab.txt Hello there, friend $ cat -A hello_tab.txt Hello^Ithere,^Ifriend$
sshLog onto remote systems and get a shell or run commands
$ ssh hacker@dojo.pwn.college -p 22 Connected! hacker@reverse-engineering~the-patch-directive:~$ id uid=1000(hacker) gid=1000(hacker) groups=1000(hacker)
$ ssh hacker@dojo.pwn.college -p 22 id uid=1000(hacker) gid=1000(hacker) groups=1000(hacker)
NOTE: sftp can be used to transfer files
ncDo TCP or UDP connections or listens
$ nc 127.0.0.1 31337 Hello hacker
Commands print their output to the terminal (STDOUT, STDERR), the output can be redirected to a file or to the input of other commands (STDIN).
COMMAND >FILE
$ echo 'Hello, world!' >hello.txt $ cat hello.txt Hello, world!
COMMAND1 | COMMAND2
$ echo 'Hello, world!' | rev !dlrow ,olleH
NOTE: rev reverses the text line per line.
Programs are usually documented in many places.
You can STFW (search the friendly web), RTFM (read the fine manual) or RTFS (read the fabulous source).
--help
$ cat --help
Usually a summarized version
man
$ man cat
CAT(1) User Commands CAT(1)
NAME
cat - concatenate files and print on the standard output
SYNOPSIS
cat [OPTION]... [FILE]...
DESCRIPTION
Concatenate FILE(s) to standard output.
With no FILE, or when FILE is -, read standard input.
-A, --show-all
equivalent to -vET
More complete
info
$ info cat
File: coreutils.info, Node: cat invocation, Next: tac invocation, Up: Output of entire files
3.1 ‘cat’: Concatenate and write files
======================================
‘cat’ copies each FILE (‘-’ means standard input), or standard input if
none are given, to standard output. Synopsis:
cat [OPTION] [FILE]...
The program accepts the following options. Also see *note Common
options::.
‘-A’
‘--show-all’
Equivalent to ‘-vET’.
Sometimes even more complete, especially for GNU programs
tldr
Shows some common variations of commands
$ tldr cat
cat
Print and concatenate files.
More information: https://www.gnu.org/software/coreutils/manual/html_node/cat-invocation.html.
- Print the contents of a file to `stdout`:
cat path/to/file
- Concatenate several files into an output file:
cat path/to/file1 path/to/file2 ... > path/to/output_file
- Append several files to an output file:
cat path/to/file1 path/to/file2 ... >> path/to/output_file
...
Available online at https://tldr.inbrowser.app/
fileDetermine file type
file FILE
$ file hello.c hello.c: C source, ASCII text $ file hello.out hello.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/zdpby3l6azi78sl83cpad2qjpfj25aqx-glibc-2.40-66/lib/ld-linux-x86-64.so.2, for GNU/Linux 3.10.0, not stripped
findFind files in a directory
find PATH -name NAME
$ find . -name secret.txt ./some/arbitrarily/nested/path/secret.txt
Note: Many, many useful options, check tldr
grepFind patterns in files
grep PATTERN FILE
$ grep -i "flag" dump.txt
flag{secret_value}
$ cat dump.txt | grep -i "flag"
flag{secret_value}
NOTE: Check regular expressions and grep -P
NOTE: quite a lot of commands that take a FILE argument can also read from STDIN.
stringsPrint the sequences of printable characters in files
strings FILE
$ strings hello.out /nix/store/zdpby3l6azi78sl83cpad2qjpfj25aqx-glibc-2.40-66/lib/ld-linux-x86-64.so.2 __printf_chk __libc_start_main libc.so.6 GLIBC_2.3.4 GLIBC_2.34 /nix/store/2651x1rpsl18jlwg5a38vrpv5n0yfv0g-shell/lib:/nix/store/zdpby3l6azi78sl83cpad2qjpfj25aqx-glibc-2.40-66/lib:/nix/store/bmi5znnqk4kg2grkrhk6py0irc8phf6l-gcc-14.2.1.20250322-lib/lib __gmon_start__ PTE1 Hello, world! …
base64Encode or decode FILE to/from base64
base64 FILE
base64 -d FILE
$ echo 'Hello, world' | base64 SGVsbG8sIHdvcmxkCg== $ echo 'SGVsbG8sIHdvcmxkCg==' | base64 -d Hello, world
xxdMake a hexdump or reverse it
$ echo 'Hello, world' | xxd 00000000: 4865 6c6c 6f2c 2077 6f72 6c64 0a Hello, world. $ echo 'Hello, world' | xxd | xxd -r Hello, world
nanoEdit text files in terminal
nano FILE
CTRL-x exits
sudoSoon™
Computers only work on bits (0 or 1), any value we may wish to operate on must be represented by a sequence of 0s and 1s.
How may we represent numbers?
We use binary representation:
This means that any number we may want to operate on will be represented by sums of power of two. The computer carries normal arithmetic operations on them.
Handling binary literals in code is a handful, we need one character for each digit and \(1 + \log_2 n\) digits total.
Representing a C int (32 bits) would look like:
10111000000001101100001100100001
We can use hex digits to make the representation more concise.
Each hex digit represents 4 bits (called a nibble), half a byte.
We count from 0 to 9 and then use A to F for the following digits (10 to 15).
| Binary | Hex | Decimal | Binary | Hex | Decimal |
|---|---|---|---|---|---|
| 0b0000 | 0x0 | 0 | 0b1000 | 0x8 | 8 |
| 0b0001 | 0x1 | 1 | 0b1001 | 0x9 | 9 |
| 0b0010 | 0x2 | 2 | 0b1010 | 0xA | 10 |
| 0b0011 | 0x3 | 3 | 0b1011 | 0xB | 11 |
| 0b0100 | 0x4 | 4 | 0b1100 | 0xC | 12 |
| 0b0101 | 0x5 | 5 | 0b1101 | 0xD | 13 |
| 0b0110 | 0x6 | 6 | 0b1110 | 0xE | 14 |
| 0b0111 | 0x7 | 7 | 0b1111 | 0xF | 15 |
Note: For notation purposes, we prefix binary literals by 0b and hex literals by 0x
Note: Two hex digits make a byte
All previous notations can be used to represent numbers in many programming languages.
i1 = 10 i2 = 0b10 i3 = 0x10 (i1, i2, i3) # (10, 2, 16)
If we are converting from string we can specify base:
int('0b1010', 2) # 10 int('0xA', 16) # 10
Note: The prefix may be skipped here
We can also convert to some representation
bin(10) # '0b1010' hex(10) # '0xa'
We also want to represent negative numbers, for implementation reasons signed numbers (numbers that may be positive or integer) are represented in two’s complement format.
The representation of a given \(-i\) using \(N\) bits is the representation of unsigned \(2^N - i\).
To represent \( -7 \) using \( 8 \) bits:
\[ 2^8 - 7 = 256 - 7 = 249 = \mathrm{0b11111001} \]
So -7 using 8 bits is 0b11111001.
Note: We need to define an integer size (in bits) to determine the binary representation.
An easier way to compute the two’s complement is to invert all bits in the binary representation and add 1 to the result.
So, to compute the two’s complement 8-bit representation of -7, we use the representation of 7 = \(0b00000111\).
Negate: not 0b00000111 = 0b11111000
Add 1: 0b11111000 + 1 = 0b11111001
Same result as before.
The first bit in a signed two’s complement integer indicates the sign, all negative numbers have first bit 1 while all positive numbers (and zero) have 0.
Using 8 bits:
0 = 0b00000000 = 0x00
1 = 0b00000001 = 0x01
127 = 0b01111111 = 0x7F
-1 = 0b11111111 = 0xFF
-128 = 0b10000000 = 0x80
x86-64 has memory addressable memory, we then have a choice of how store multi-byte values in memory.
If we want to store a C int (4 bytes) 0xAABBCCDD (\(2864434397_{10}\)) into 4 bytes of memory at ptr[0], ptr[1], ptr[2], ptr[3], we could do:
| ptr[0] | ptr[1] | ptr[2] | ptr[3] | |
|---|---|---|---|---|
| Little-endian | 0xDD | 0xCC | 0xBB | 0xAA |
| Big-endian | 0xAA | 0xBB | 0xCC | 0xDD |
That is, little-endian stores the last (least significant) byte first in memory, while big-endian stores the first (most significant) byte first.
Despite big-endian seeming more reasonable, x86-64 use little-endian
Using cling (C++ interpreter)
[cling]$ #include <stdint.h>
[cling]$ uint8_t arr[4];
[cling]$ arr
(uint8_t [4]) { '0x00', '0x00', '0x00', '0x00' }
[cling]$ int* int_p = (int*) arr;
[cling]$ *int_p = 0xAABBCCDD
(int) -1430532899
[cling]$ arr
(uint8_t [4]) { '0xdd', '0xcc', '0xbb', '0xaa' }
[cling]$ *int_p = 255
(int) 255
[cling]$ arr
(uint8_t [4]) { '0xff', '0x00', '0x00', '0x00' }
We may also wish to represent characters. As we only have bits to work with, we must associate (that is, encode) bit patterns to some characters.
ASCII (American Standard Code for Information Interchange) is the oldest standard character encoding (1960s).
It uses 7 bits to map letters (a-zA-Z), digits (0-9), symbols (e.g., @#$%^&), space (' ') and control characters (e.g., LF: new line, HT: horizontal tab).
Notably:
0x0A: ’\n’ - line feed, newline
0x20: ’ ’ - space
0x30-0x39: 0, 1, 2, …, 9
0x41-0x5A: A, B, C, …, Z
0x61-0x7A: a, b, c, …, z
Full table: man ascii
ASCII uses only 7 bits, so bytes in the range 0x80-0xFF can not be interpreted as ASCII.
In [1]: b'\x80'.decode('ascii')
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
Cell In[1], line 1
----> 1 b'\x80'.decode('ascii')
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
ordinal not in range(128)
Note: There are some ASCII extensions that use 8 bits instead, but none standard
In low level languages (C, ASM), strings are just a sequence of encoded characters in memory.
The string Hello would look like 0x48656c6c6f. To indicate where the string ends, a special terminator byte (0x00: NUL) is used. So Hello would really be stored like 0x48656c6c6f00.
Note: Other means of indicating string lengths are used by other languages
ASCII can only encode 128 characters (7 bits), not enough for modern times.
Can support all characters in Unicode (Chinese, Arabic, emojis, etc.).
Backward compatible with ASCII, that is, all ASCII encodings represent the same characters in UTF-8.
Variable-length:
✓ = 0xE29C93)Same as ASCII, a single byte in range 0x80–0xFF is not valid UTF-8.
In [8]: b'\x80'.decode() --------------------------------------------------------------------------- UnicodeDecodeError Traceback (most recent call last) Cell In[8], line 1 ----> 1 b'\x80'.decode() UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
There are some other multi-byte sequences that are not valid.
Bytes are just bytes, knowing what they are representing (i.e., the encoding used) allows us to interpret them.
In [1]: s = '😤🔥🚩' In [2]: s.encode() Out[2]: b'\xf0\x9f\x98\xa4\xf0\x9f\x94\xa5\xf0\x9f\x9a\xa9' In [3]: bytes_to_long(s.encode()) Out[3]: 74469342404751707337991887529
We may want to represent binary data with ASCII
We could use hex encoding for this:
$ echo 'Hello, world' | xxd -ps 48656c6c6f2c20776f726c640a
But this doubles the bytes that we must transfer, each byte is now represented by two ASCII characters (two bytes).
Base64 is an alternative encoding that gives more compact representation. It
uses characters A–Za–z0–9+/ to represent bit strings 0b000000 to 0b111111 (in
that order).
Each Base64 character encodes 6 bits of data. Every 4 characters of Base64 encode 3 bytes of data.
As data may not be multiple of 6 bits, padding characters = are appended to indicate how many extra 2-bits were in the base64 encoding that do not correspond to encoded data.
$ printf A | base64 QQ== $ printf AA | base64 QUE= $ printf AAA | base64 QUFB $ printf AAAA | base64 QUFBQQ==
Remember: base64 (command) encodes/decodes to/from base64
There are other similar encodings (base32, base58, base85) but base64 is by far the most popular.