Digital Information

Understanding how digital information is encoded, processed, and transmitted is essential for working with computers and programming languages like Python. By grasping these principles, you'll gain a solid foundation for understanding how computers process data and interact with the digital world.

Number Systems


While we humans primarily use the decimal number system (base 10), computers fundamentally operate on binary numbers (base 2). However, understanding and manipulating long sequences of 0s and 1s can be cumbersome for humans. To bridge this gap, programmers often utilize other number systems: octal (base 8) and hexadecimal (base 16).

Decimal (Base 10)

  • The everyday number system we use.
  • Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
  • Example: 123

Binary (Base 2)

  • Used by computers to represent data.
  • Digits: 0, 1
  • Example: 1101
  • To represent binary numbers in Python, use the 0b prefix.

Octal (Base 8)

  • Sometimes used in older systems, and still used for Unix file permissions.
  • Digits: 0, 1, 2, 3, 4, 5, 6, 7
  • Example: 17
  • To represent octal numbers in Python, use the 0o prefix.

Hexadecimal (Base 16)

  • Commonly used in computer programming, especially for representing colors and memory addresses.
  • Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
  • Example: A3
  • To represent hexadecimal numbers in Python, use the 0x prefix.
## NUMBER SYSTEM EXAMPLES ##
# print() displays every integer in decimal, whichever notation it was written in
decimal_number = 42
print(decimal_number)

binary_number = 0b1101
print(binary_number) # Prints in decimal: 13

octal_number = 0o17
print(octal_number)

hexadecimal_number = 0xA3
print(hexadecimal_number)

print(bin(decimal_number)) # Convert Decimal to Binary with bin()

print(oct(decimal_number)) # Convert Decimal to Octal with oct()

print(hex(decimal_number)) # Convert Decimal to Hexadecimal with hex()
42
13
15
163
0b101010
0o52
0x2a

While decimal numbers are our default, understanding and using binary, octal, and hexadecimal number systems in Python empowers programmers to work more effectively with computers at a fundamental level. These number systems are crucial for tasks ranging from low-level system programming to high-level applications that involve data manipulation, color representation, and network communication.
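
To make the conversions above less of a black box, here is a minimal sketch of the repeated-division method that produces the digits of a number in another base. The helper name to_base is an assumption made for illustration, not a Python built-in; the built-in int(text, base) is used at the end to convert back.

```python
DIGITS = "0123456789abcdef"

def to_base(number, base):
    """Convert a non-negative integer to a digit string in the given base (2 to 16)."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # The remainder is the next digit, starting from the right-hand side.
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))

print(to_base(42, 2))   # 101010
print(to_base(42, 8))   # 52
print(to_base(42, 16))  # 2a

# int() with an explicit base performs the reverse conversion.
print(int("101010", 2), int("52", 8), int("2a", 16))  # 42 42 42
```

Unlike bin(), oct(), and hex(), this sketch returns the digits without the 0b, 0o, or 0x prefix.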

Encoding Text


Computers do not store text directly. Each character must first be mapped to a number, and that number must then be written as binary data. This process is known as encoding.

Why is Encoding Necessary?

  • Universal Representation: Different characters and symbols from various languages need a standardized way to be represented.
  • Storage and Transmission: Encoded text can be efficiently stored and transmitted over digital networks.
  • Computer Processing: Computers can only understand binary data, so text must be converted into binary form before it can be processed.
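
As a small illustration of the last point, the built-in functions ord() and chr() convert between a character and its numeric code point, and format() shows that number as bits. The variable names are chosen for this sketch only.

```python
rows = []
for ch in "Hi!":
    code_point = ord(ch)              # character -> number
    bits = format(code_point, "08b")  # number -> 8 binary digits
    rows.append((ch, code_point, bits))
    print(ch, code_point, bits)

# chr() goes the other way: number -> character
print(chr(72))  # H
```

The loop prints 72 and 01001000 for "H", 105 and 01101001 for "i", and 33 and 00100001 for "!".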

Common Python Encodings:

ASCII

  • A 7-bit encoding standard for English characters.
  • Limited to 128 characters, making it unsuitable for many languages.

UTF-8

  • A variable-length encoding scheme that can represent all characters in the Unicode standard.
  • Widely used due to its efficiency and compatibility with ASCII.

UTF-16

  • A variable-length encoding scheme that uses one or two 16-bit units (2 or 4 bytes) per character.
  • More compact than UTF-8 for many East Asian scripts, but less efficient for English text.

Latin-1

  • An 8-bit encoding scheme for Western European languages.
  • Limited to 256 characters.
## ENCODING & DECODING EXAMPLES ##
text = "Hello, world!"
encoded_text = text.encode('utf-8')  # Encodes the text in UTF-8
print(encoded_text)

decoded_text = encoded_text.decode('utf-8')
print(decoded_text)
b'Hello, world!'
Hello, world!

Encoded text, represented as a sequence of bytes, is how text is stored in files and transmitted over networks. To read those bytes back correctly, a program must know which encoding produced them: decoding with the wrong one yields garbled characters or an error. Multi-byte encodings such as UTF-16 also depend on byte order. Because encryption and digital signatures operate on bytes rather than on characters, text is always encoded before it is secured.
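
To compare the encodings listed earlier, the sketch below counts how many bytes each one needs for a few sample characters. The helper name byte_lengths is an assumption for illustration; "utf-16-le" is used instead of "utf-16" so that the 2-byte byte order mark is not included in the count.

```python
def byte_lengths(character):
    """Return the encoded size of one character in UTF-8 and UTF-16."""
    return {enc: len(character.encode(enc)) for enc in ("utf-8", "utf-16-le")}

for character in ["A", "é", "€", "😀"]:
    print(repr(character), byte_lengths(character))

# Latin-1 can represent "é" in a single byte, but ASCII cannot represent it at all.
print("é".encode("latin-1"))  # b'\xe9'
try:
    "é".encode("ascii")
except UnicodeEncodeError as error:
    print("ASCII failed:", error.reason)
```

The letter "A" takes 1 byte in UTF-8 and 2 in UTF-16, while "€" takes 3 bytes in UTF-8 and 2 in UTF-16, and the emoji takes 4 bytes in both.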

Hexadecimal


Hexadecimal offers a more concise and human-readable way to write binary data, the form in which computers store and process everything. Each hexadecimal digit stands for exactly four binary digits, so a value needs only a quarter as many digits in hexadecimal as in binary. Furthermore, two hexadecimal digits cover eight binary digits (one byte), which spans the decimal range 0 to 255. This is why two-digit hexadecimal values are used for RGB color channels and for the ASCII table.

HEX  BIN     HEX  BIN
0    0000    8    1000
1    0001    9    1001
2    0010    A    1010
3    0011    B    1011
4    0100    C    1100
5    0101    D    1101
6    0110    E    1110
7    0111    F    1111
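
The table above can be applied mechanically: split a binary string into groups of four bits and translate each group into one hexadecimal digit. The function below is a minimal sketch of that idea; the name binary_to_hex is an assumption for illustration.

```python
def binary_to_hex(bits):
    """Convert a string of binary digits to hex by translating 4 bits at a time."""
    # Pad on the left with zeros so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)

print(binary_to_hex("10100011"))  # A3
print(binary_to_hex("11111111"))  # FF
print(binary_to_hex("101010"))    # 2A
```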

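This Python code prints a table of the 128 ASCII characters, showing each character alongside its decimal, octal, hexadecimal, and binary representations. Control characters (codes 0 to 31 and 127) have no visible glyph, so repr() is used to show them as escape sequences.

## ASCII TABLE ##
print("CHR\tDEC\tOCT\tHEX\tBIN")
for i in range(128):  # ASCII defines code points 0 through 127
	print(f"{chr(i)!r}\t{i}\t{oct(i)}\t{hex(i)}\t{bin(i)}")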

Pixel Images


A pixel image is a digital image composed of tiny squares called pixels. Each pixel has a specific color, and the arrangement of these colored pixels creates the image we see. Common image formats like PNG, JPEG, and GIF use this pixel-based approach.
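
One way to picture this is as a grid of color values. The sketch below models a tiny 3x3 image as a list of rows, where each pixel is an (R, G, B) tuple. This is an illustration only; real image files add headers and usually compress the pixel data.

```python
RED = (255, 0, 0)
WHITE = (255, 255, 255)

# A 3x3 image showing a red plus sign on a white background.
image = [
    [WHITE, RED, WHITE],
    [RED, RED, RED],
    [WHITE, RED, WHITE],
]

height = len(image)
width = len(image[0])
print(width, height)  # 3 3
print(image[1][2])    # pixel at x=2, y=1 -> (255, 0, 0)

# Draw the image as text: "#" for red pixels, "." for white ones.
for row in image:
    print("".join("#" if pixel == RED else "." for pixel in row))
```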

Color Formatting

There are many ways to represent colors on a computer, but the two most common are RGB and HEX. It is important to note that the HEX format is just another way of writing the RGB values, not an entirely different color model such as HSV.

  • RGB: Stands for Red, Green, Blue. It's a color model that uses three values, each ranging from 0 to 255, to represent a color.
  • Hexadecimal: A 6-digit hexadecimal number, where each pair of digits represents the intensity of Red, Green, and Blue, respectively.



## RGB/HEX CONVERSIONS ##
def rgb_to_hex(r, g, b):
    hex_color = "#"
    for color_value in (r, g, b):
        hex_value = hex(color_value)[2:]  # Drop the "0x" prefix
        hex_color += hex_value.zfill(2)   # Pad to two digits
    return hex_color

def hex_to_rgb(hex_string):
    # Skip the leading "#" and parse each pair of digits as base 16
    r = int(hex_string[1:3], 16)
    g = int(hex_string[3:5], 16)
    b = int(hex_string[5:], 16)
    return (r, g, b)

rgb_color = (102, 51, 153)
hex_color = rgb_to_hex(*rgb_color)
print(hex_color)

hex_color = "#663399"
rgb_color = hex_to_rgb(hex_color)
print(rgb_color)
#663399
(102, 51, 153)
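
Because each channel occupies exactly one byte, a color can also be treated as a single 24-bit integer and taken apart with bit shifts and masks. The sketch below shows that alternative; the function names int_to_rgb and rgb_to_int are assumptions for illustration.

```python
def int_to_rgb(color):
    """Split a 24-bit color such as 0x663399 into an (r, g, b) tuple."""
    r = (color >> 16) & 0xFF  # top byte
    g = (color >> 8) & 0xFF   # middle byte
    b = color & 0xFF          # bottom byte
    return (r, g, b)

def rgb_to_int(r, g, b):
    """Pack three 0-255 channel values into one 24-bit integer."""
    return (r << 16) | (g << 8) | b

print(int_to_rgb(0x663399))           # (102, 51, 153)
print(hex(rgb_to_int(102, 51, 153)))  # 0x663399
```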

Image Manipulation


Image manipulation refers to the process of modifying or altering an image's content or appearance. This can involve a wide range of techniques, from simple adjustments like cropping and resizing to more complex operations like color correction, filtering, and object detection.

Leveraging PIL

Pillow is the actively maintained fork of the Python Imaging Library (PIL), a popular library for opening, manipulating, and saving images. It is installed with pip install Pillow but is still imported under the name PIL. It provides a user-friendly interface to perform various image processing tasks. Here are some common image manipulation techniques you can achieve with it.

Note: All these examples require we import PIL.

from PIL import Image

Opening and saving

img = Image.open('open_file_path.jpg') # Open the image
img.show()  # Display the image
img.save('save_file_path.jpg') # Save the image

Resizing

resized_img = img.resize((200, 200))

Cropping

cropped_img = img.crop((100, 100, 300, 300))  # (left, top, right, bottom)

Rotating

rotated_img = img.rotate(90)
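
## BLACK & WHITE (GRAYSCALE) IMAGE FILTER ##
from PIL import Image

FILE_PATH = "ric_n_morty_2024.png"
# Convert to RGB so every pixel is an (R, G, B) tuple, even for palette or RGBA files
img = Image.open(FILE_PATH).convert("RGB")

pxls = img.load()  # Pixel access object that supports reading and writing
for y in range(img.height):
	for x in range(img.width):
		r, g, b = pxls[x, y]
		grey = (r + g + b) // 3  # Average the three channels
		pxls[x, y] = (grey, grey, grey)

img.show()
img.save("mod_" + FILE_PATH)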
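
The filter above weights red, green, and blue equally. A common alternative weights the channels by how bright they appear to the eye, using the ITU-R BT.601 luma coefficients. The sketch below compares the two formulas on plain tuples, so it runs without any image file; the function names are assumptions for illustration.

```python
def average_gray(r, g, b):
    """Gray level as the plain average of the three channels."""
    return (r + g + b) // 3

def luma_gray(r, g, b):
    """Gray level weighted by the ITU-R BT.601 luma coefficients."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

pure_green = (0, 255, 0)
pure_blue = (0, 0, 255)

print(average_gray(*pure_green), average_gray(*pure_blue))  # 85 85
print(luma_gray(*pure_green), luma_gray(*pure_blue))        # 150 29
```

With the plain average, pure green and pure blue become the same gray. With the weighted formula, green comes out much lighter than blue, which is closer to how the two colors are perceived.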

Data Compression


Data compression is a technique used to reduce the size of data files. This is achieved by identifying and eliminating redundancy in the data. By doing so, we can store and transmit data more efficiently, saving storage space and reducing transmission time. The two primary methods of data compression are Lossless and Lossy.

Lossless Compression

  • This method reduces file size without losing any of the original data.
  • It works by identifying patterns in the data and replacing them with shorter codes.
  • Common lossless compression algorithms include:
    • Huffman Coding: Assigns shorter codes to more frequent symbols and longer codes to less frequent ones.
    • Lempel-Ziv-Welch (LZW): Replaces repeated sequences of data with shorter codes.
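
To make the Huffman idea concrete, here is a minimal sketch that builds a code table for a short string using the standard heapq module. It is an illustration under simple assumptions (the function name huffman_codes is hypothetical, and codes are kept as strings of "0" and "1" rather than packed bits), not a production compressor.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each symbol in text to its Huffman bit string."""
    frequencies = Counter(text)
    if not frequencies:
        return {}
    if len(frequencies) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {symbol: "0" for symbol in frequencies}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(weight, index, {symbol: ""})
            for index, (symbol, weight) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        weight_a, _, codes_a = heapq.heappop(heap)
        weight_b, _, codes_b = heapq.heappop(heap)
        merged = {symbol: "0" + code for symbol, code in codes_a.items()}
        merged.update({symbol: "1" + code for symbol, code in codes_b.items()})
        heapq.heappush(heap, (weight_a + weight_b, next_id, merged))
        next_id += 1
    return heap[0][2]

text = "abracadabra"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
print(codes)
print(len(encoded), "bits instead of", 8 * len(text))  # 23 bits instead of 88
```

The most frequent letter, "a", receives a one-bit code, while the rarer letters receive three bits each.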

Lossy Compression

  • This method reduces file size by discarding some of the original data.
  • It's suitable for data types like images and audio, where a slight loss in quality is often imperceptible.
  • Common lossy compression algorithms include:
    • JPEG: Stores color at a lower resolution than brightness and discards fine detail that the eye is less sensitive to.
    • MP3: Uses a model of human hearing to discard sounds that are inaudible or masked by louder ones.
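
The core idea of lossy compression, discarding precision that is unlikely to be missed, can be sketched with simple quantization. This is a toy illustration and not how JPEG or MP3 work in detail; the function name quantize and the sample values are assumptions.

```python
def quantize(samples, step):
    """Lossy step: snap each non-negative integer to the nearest multiple of step."""
    return [(value + step // 2) // step * step for value in samples]

original = [12, 13, 15, 200, 203, 198]
restored = quantize(original, 8)

print(restored)                  # [16, 16, 16, 200, 200, 200]
print(len(set(original)), "distinct values before,", len(set(restored)), "after")
```

The quantized data contains long runs of identical values, which a lossless algorithm can then compress very well, but the exact original values cannot be recovered.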

Python offers several built-in modules and libraries for data compression, making it a powerful tool for handling large datasets and optimizing storage and transmission.

Common Compression Modules:

  • gzip: Provides functions for compressing and decompressing files using the gzip format.
  • bz2: Offers functions for compressing and decompressing files using the bzip2 format.
  • lzma: Supports LZMA compression and decompression.
  • zipfile: Allows you to create, read, write, append, and list ZIP archives.

## GZIP COMPRESSION EXAMPLE ##
import gzip
import shutil

# Stream the original file into a gzip-compressed copy
with open('original_file.txt', 'rb') as f_in, gzip.open('compressed_file.gz', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)
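
The same module can also compress bytes in memory, which makes it easy to check that gzip is lossless. The sketch below uses made-up sample data and needs no files on disk.

```python
import gzip

original = b"ABABABABAB" * 100  # Highly repetitive, so it compresses well
compressed = gzip.compress(original)
restored = gzip.decompress(compressed)

print(len(original), "->", len(compressed), "bytes")
print(restored == original)  # True: every byte comes back unchanged
```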

Cryptography


Cryptography is the practice and study of techniques for secure communication in the presence of third parties called adversaries. It involves transforming information or data into a form that is unreadable to anyone except those possessing the secret key.

Key Points:

  • Encryption: The process of converting plain text (readable data) into ciphertext (unreadable data).
  • Decryption: The reverse process of converting ciphertext back into plain text.
  • Cipher: An algorithm used for encryption and decryption.
  • Key: A secret piece of information used in encryption and decryption.
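
These four terms can be seen together in a toy XOR cipher, sketched below. It is for illustration only and must not be used to protect real data, because a short repeating key is easy to break. The function name xor_cipher and the key value are assumptions.

```python
from itertools import cycle

def xor_cipher(data, key):
    """XOR each byte of data with the repeating key; applying it twice restores data."""
    return bytes(byte ^ key_byte for byte, key_byte in zip(data, cycle(key)))

key = b"secret"
plaintext = "Hello, World!".encode("utf-8")

ciphertext = xor_cipher(plaintext, key)                  # Encryption
recovered = xor_cipher(ciphertext, key).decode("utf-8")  # Decryption

print(ciphertext)
print(recovered)  # Hello, World!
```

Here xor_cipher is the cipher, b"secret" is the key, and the same function serves for both encryption and decryption.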

Types of Cryptography:

Symmetric-Key Cryptography

  • Uses a single key for both encryption and decryption.
  • Examples: AES, DES, 3DES (DES and 3DES are legacy algorithms and no longer recommended)

Public-Key Cryptography

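  • Uses a pair of keys: a public key that can be shared freely and a private key that is kept secret. Data encrypted with the public key can only be decrypted with the private key.
  • Examples: RSA, DSA (digital signatures only), ECC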
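
The key-pair idea can be sketched with textbook RSA on deliberately tiny numbers. This is a toy for illustration only: real RSA keys use primes hundreds of digits long together with padding schemes, and the specific values below are assumptions chosen to keep the arithmetic readable.

```python
# Toy RSA key generation with tiny primes
p, q = 61, 53
n = p * q                # 3233, shared by both keys
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # Public exponent, chosen to be coprime with phi
d = pow(e, -1, phi)      # Private exponent: modular inverse of e (Python 3.8+)

message = 65
ciphertext = pow(message, e, n)    # Encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)  # Decrypt with the private key (d, n)

print(d, ciphertext, recovered)  # 2753 2790 65
```

Anyone who knows (e, n) can encrypt, but only the holder of d can reverse the operation.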

Key Considerations:

  • Key Strength: A strong key is essential for secure encryption.
  • Algorithm Choice: Select a robust and secure cryptographic algorithm.
  • Implementation Security: Avoid common vulnerabilities like weak key generation, insecure random number generation, and implementation errors.
  • Secure Key Storage and Distribution: Protect keys from unauthorized access.

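Fernet is a symmetric encryption scheme provided by the third-party cryptography package (pip install cryptography). It combines AES in CBC mode with a 128-bit key and an HMAC-SHA256 signature, so tampered ciphertext is rejected on decryption. It is a good choice for general-purpose encryption tasks.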

## FERNET ENCRYPTION/DECRYPTION EXAMPLE ##
from cryptography.fernet import Fernet

def encrypt(msg, key):
    cipher = Fernet(key)
    ciphertext = cipher.encrypt(msg.encode())
    return ciphertext

def decrypt(enc_msg, key):
    cipher = Fernet(key)
    plaintext = cipher.decrypt(enc_msg)
    return plaintext.decode()

message = "Hello, World!"

# Generate a key
key = Fernet.generate_key()
print("Key:\n", key)

# Encrypt the message
ciphertext = encrypt(message, key)
print("Encrypted Message:\n", ciphertext)

# Decrypt the message
decrypted_message = decrypt(ciphertext, key)
print("Decrypted Message:\n", decrypted_message)