Skip to content

Latest commit

 

History

History
1025 lines (762 loc) · 32.5 KB

File metadata and controls

1025 lines (762 loc) · 32.5 KB

Input and Output

Fancier Output Formatting

str() and repr() Functions

Both str() and repr() are used to convert objects into string representations, but they serve different audiences and purposes:

  • str(): Designed for humans. It provides a readable and clean string suitable for display or end-user interaction.
  • repr(): Designed for developers. It provides a detailed, unambiguous representation, often including information helpful for debugging or recreating the object.

Key Differences:

  1. Audience:

    • str(): End-users, focusing on readability.
    • repr(): Developers, focusing on accuracy and detail.
  2. Purpose:

    • str(): Makes objects look "nice" and easy to understand.
    • repr(): Shows a representation that could recreate the object when passed to eval() (in most cases).

Examples:

# A simple string
s = 'Hello, world.'
print(str(s))   # Output: Hello, world.
print(repr(s))  # Output: 'Hello, world.'

# Notice how `repr()` adds quotes around the string
# and escapes special characters for clarity:
special = 'hello, world\n'
print(str(special))  # Output: hello, world
print(repr(special)) # Output: 'hello, world\\n'

# For more complex objects, the difference becomes clearer:
import datetime
now = datetime.datetime.now()
print(str(now))   # Output: 2025-01-09 12:34:56.789123 (friendly format)
print(repr(now))  # Output: datetime.datetime(2025, 1, 9, 12, 34, 56, 789123)

Usage Guidelines:

  • Use str() when you need a readable string for end-users, such as in UI or logs.
  • Use repr() when you need a precise representation for debugging, logging, or when working with developers.

Rule of Thumb:

If you implement a custom object class, define both __str__() and __repr__() methods to ensure clear and meaningful string representations for different contexts.

Formatted String Literals (f-strings)

Formatted string literals, or f-strings, are a powerful Python feature that enables embedding expressions within strings by prefixing the string with f or F. These expressions, enclosed in curly braces {}, are evaluated at runtime and inserted directly into the string. This method is concise, efficient, and offers advanced formatting capabilities.

Embedding Expressions in Strings

Any Python expression can be placed inside {} within an f-string. The expression is evaluated, and its result is included in the string.

name = "Alice"
age = 30
f"My name is {name} and I am {age} years old."
# Output: 'My name is Alice and I am 30 years old.'

Using Format Specifiers

Format specifiers control the presentation of values, such as the number of decimal places, width, or type of alignment. To use a format specifier, include a colon (:) after the expression, followed by the desired specification.

Example: Rounding values

import math
f"The value of pi is approximately {math.pi:.3f}."
# Output: 'The value of pi is approximately 3.142.'

Example: Aligning columns with minimum width

table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
for name, phone in table.items():
    print(f'{name:10} ==> {phone:10d}')
# Output:
# Sjoerd     ==>       4127
# Jack       ==>       4098
# Dcab       ==>       7678
  • {name:10} ensures the name field is at least 10 characters wide.
  • {phone:10d} ensures the phone field is a 10-character wide integer.

Value Conversion Modifiers

F-strings allow modifying values before formatting:

  • !a: Applies ascii() (produces ASCII-safe representations).
  • !s: Applies str() (converts to a string).
  • !r: Applies repr() (produces interpreter-readable representation).
animals = 'eels'
f"My hovercraft is full of {animals}."
# Output: 'My hovercraft is full of eels.'

f"My hovercraft is full of {animals!r}."
# Output: "My hovercraft is full of 'eels'."

Self-Documenting Expressions with =

The = specifier automatically expands expressions to include the expression itself, followed by its evaluated result. This is useful for debugging or self-documenting code.

bugs = 'roaches'
count = 13
area = 'living room'
f"Debugging {bugs=} {count=} {area=}"
# Output: "Debugging bugs='roaches' count=13 area='living room'"

Advanced Format Specifiers

F-strings use the same format specifications as the str.format() method. For a full reference, consult the formatspec documentation.

Fixed-point formatting:

value = 123.45678
f"{value:.2f}"  # Output: '123.46'

Zero-padding:

number = 42
f"{number:05d}"  # Output: '00042'

Hexadecimal or binary:

num = 255
f"{num:#x}"  # Output: '0xff'
f"{num:#b}"  # Output: '0b11111111'

Formatting option for Floating Point Numbers

Modifier Example Description
number :20d Put the value in a field width of 20
< :<20d Put the value in a field 20 characters wide, left-aligned
> :>20d Put the value in a field 20 characters wide, right-aligned
^ :^20d Put the value in a field 20 characters wide, center-aligned
0 :020d Put the value in a field 20 characters wide, fill in with leading zeros
. :20.2f Put the value in a field 20 characters wide with 2 characters to the right of the decimal point

Advantages of F-strings

  1. Concise and Readable:
    • Embed variables and expressions directly into strings without additional function calls.
  2. Powerful Formatting:
    • Provides granular control over presentation using format specifiers.
  3. Debugging Aid:
    • The = specifier helps self-document code and debug variable values easily.
  4. Performance:
    • Faster than alternatives like str.format() or string concatenation due to direct evaluation.

Comparison with Other Methods

  • F-strings:
    name = "Alice"
    f"My name is {name}."
    # Output: 'My name is Alice.'
  • str.format():
    name = "Alice"
    "My name is {}.".format(name)
    # Output: 'My name is Alice.'
  • Concatenation:
    name = "Alice"
    "My name is " + name + "."
    # Output: 'My name is Alice.'

F-strings are generally preferred for their clarity and simplicity.


The str.format() Method

The str.format() method in Python is a flexible way to format strings by substituting values into placeholders denoted by curly braces {}. It supports both positional and keyword arguments, allowing for advanced customization of string formatting.

Basic Usage

The placeholders {} in a string are replaced by the arguments passed to the format() method.

Example:

print('We are the {} who say "{}!"'.format('knights', 'Ni'))
# Output: We are the knights who say "Ni!"

Positional Arguments

You can specify the order of substitution using numbers inside the curly braces.

Default order:

print('{0} and {1}'.format('spam', 'eggs'))
# Output: spam and eggs

Reversed order:

print('{1} and {0}'.format('spam', 'eggs'))
# Output: eggs and spam

Keyword Arguments

Keyword arguments are referenced by their names within the placeholders.

print('This {food} is {adjective}.'.format(food='spam', adjective='absolutely horrible'))
# Output: This spam is absolutely horrible.

Mixing Positional and Keyword Arguments

You can combine positional and keyword arguments arbitrarily.

print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred', other='Georg'))
# Output: The story of Bill, Manfred, and Georg.

print('The story of {0}, {2}, and {other}.'.format('Bill', 'Manfred', other='Georg'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: Replacement index 2 out of range for positional args tuple

Formatting with Width and Alignment

The str.format() method allows precise control over spacing and alignment.

Example: Formatting integers with aligned columns:

for x in range(1, 11):
    print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
# Output:
#  1    1     1
#  2    4     8
#  3    9    27
#  ...
# 10  100  1000
  • {0:2d}: Ensures a width of 2 characters for the first value.
  • {1:3d}: Ensures a width of 3 characters for the second value.
  • {2:4d}: Ensures a width of 4 characters for the third value.

Accessing Values in Dictionaries

You can use dictionary keys as references within the placeholders:

Using direct dictionary access:

table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; Dcab: {0[Dcab]:d}'.format(table))
# Output: Jack: 4098; Sjoerd: 4127; Dcab: 8637678

Using ** to unpack the dictionary:

print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
# Output: Jack: 4098; Sjoerd: 4127; Dcab: 8637678

Using vars() for Local Variables

The vars() function returns a dictionary of all local variables. This can be used with ** to dynamically format strings.

Example:

x = 42
y = 'example'
table = vars()
print('x: {x}, y: {y}'.format(**table))
# Output: x: 42, y: example

Advantages of str.format()

  1. Flexibility: Combine positional and keyword arguments.
  2. Readability: Reference variables by name or position for clarity.
  3. Customizability: Control width, alignment, and type-specific formatting (e.g., integers, floats).
  4. Dictionary Access: Format strings using values from dictionaries, including dynamically generated ones.

Comparison with f-strings

Feature str.format() f-strings
Basic Syntax '{0}, {1}'.format(val1, val2) f"{val1}, {val2}"
Positional Arguments Supported Not directly supported
Keyword Arguments Supported Supported
Accessing Dictionaries '{key}'.format(**dict) f"{dict['key']}"
Readability Less concise More concise
Performance Slower due to method calls Faster due to direct evaluation

Manual String Formatting

Manual string formatting involves directly manipulating strings using methods like rjust(), ljust(), center(), and zfill() for precise control over layout and alignment. This approach does not involve placeholders like str.format() or f-strings but instead relies on string methods to create properly formatted output.

Right-Justifying with str.rjust()

The rjust() method aligns text to the right within a specified width by padding it with spaces on the left.

for x in range(1, 11):
    print(repr(x).rjust(2), repr(x * x).rjust(3), end=' ')
    print(repr(x * x * x).rjust(4))
# Output:
#  1   1    1
#  2   4    8
#  3   9   27
#  4  16   64
#  5  25  125
#  6  36  216
#  7  49  343
#  8  64  512
#  9  81  729
# 10 100 1000
  • repr(): Ensures that numbers are converted to strings.
  • Alignment: Columns are aligned using rjust().

Left-Justifying with str.ljust()

The ljust() method aligns text to the left within a specified width by padding it with spaces on the right.

print('Hello'.ljust(10) + 'World')
# Output: Hello     World

Centering with str.center()

The center() method centers text within a specified width, padding both sides with spaces.

print('Python'.center(10))
# Output:   Python  

Zero-Padding with str.zfill()

The zfill() method pads a string on the left with zeros, and it recognizes signs (+ or -) for numeric strings.

Examples:

print('12'.zfill(5))          # Output: 00012
print('-3.14'.zfill(7))       # Output: -003.14
print('3.14159265359'.zfill(5))  # Output: 3.14159265359 (unchanged because it's longer than 5)

Truncating Strings

If you need a fixed width and truncation, you can use slicing in combination with ljust() or rjust().

print('Hello, world!'.ljust(10)[:10])
# Output: Hello, wo

Spaces Between Arguments in print()

The print() function automatically adds a space between arguments. To avoid extra spaces, use string concatenation or formatting.

print('Hello', 'World')  # Output: Hello World
print('Hello' + 'World')  # Output: HelloWorld
print('Hello', 'World', sep="") # Output: HelloWorld

Old String Formatting with the % Operator

Old-style string formatting in Python uses the % operator, also known as the modulo operator, to substitute values into a format string. This approach was prevalent before str.format() and f-strings were introduced. The % operator offers several format specifiers to control how values are inserted into a string.

format % values
  • format: A string containing placeholders.
  • values: A single value or a tuple of values to replace the placeholders.

Placeholders and Specifiers

Placeholders are introduced with % (format operator) followed by format specifiers that define how the value will be formatted.

Common Format Specifiers
Specifier Meaning Example Input Example Output
%s String 'Python' Python
%d or %i Integer 42 42
%f Floating-point number 3.14159 3.141590
%x Hexadecimal (lowercase) 255 ff
%X Hexadecimal (uppercase) 255 FF
%e Scientific notation (lowercase) 12345.67 1.234567e+04
%E Scientific notation (uppercase) 12345.67 1.234567E+04
%c Single character 65 A (ASCII 65)
Additional Format Options
Modifier Example Description
number %20d Put the value in a field width of 20
- %-20d Put the value in a field 20 characters wide, left-justified
+ %+20d Put the value in a field 20 characters wide, right-justified
0 %020d Put the value in a field 20 characters wide, fill in with leading zeros
. %20.2f Put the value in a field 20 characters wide with 2 characters to the right of the decimal point
(name) %(name)d Get the value from the supplied dictionary using name as the key

Examples of Basic Usage

Substituting a Single Value

print('Hello, %s!' % 'world')
# Output: Hello, world!

Multiple Values Use a tuple to pass multiple values:

name = 'Alice'
age = 25
print('My name is %s, and I am %d years old.' % (name, age))
# Output: My name is Alice, and I am 25 years old.

Floating-Point Numbers You can control the precision of floating-point numbers:

import math
print('The value of pi is approximately %.2f.' % math.pi)
# Output: The value of pi is approximately 3.14.

Formatting with Width and Precision

The % operator allows specifying the width (minimum number of characters) and precision for numeric values.

Syntax:

%[flags][width][.precision]specifier

Example: Width and Right-Justification:

print('Value: %5d' % 42)
# Output: Value:    42

Example: Width with Zero-Padding:

print('Value: %05d' % 42)
# Output: Value: 00042

Example: Floating-Point Precision:

print('Value: %.3f' % 3.14159)
# Output: Value: 3.142

Example: Combining Width and Precision:

print('Value: %8.2f' % 3.14159)
# Output: Value:     3.14

Using %s for Strings

The %s specifier converts any value into a string.

print('Name: %s' % 'Bob')
# Output: Name: Bob

print('List: %s' % [1, 2, 3])
# Output: List: [1, 2, 3]

Escape Characters

To include a literal % in the string, use %%:

print('Discount: %d%%' % 20)
# Output: Discount: 20%

Limitations of % Formatting

  • Readability: Formatting with % can become hard to read, especially with multiple placeholders.
    print('Coordinates: (%d, %d)' % (10, 20))
    # Output: Coordinates: (10, 20)
  • Flexibility: It lacks features like alignment and nested formatting available in str.format() and f-strings.

Using the Template Class from the string Module:

Templates offer basic string substitution using placeholders like $x. Substitutions are made from a dictionary.

from string import Template
t = Template('Total: $total, Discount: $discount')
t.substitute(total=100, discount=10)
# Output: 'Total: 100, Discount: 10'

Reading and Writing Files

Python provides a straightforward way to handle file operations through the open() function. File handling is a critical feature for reading data from files and writing data to them. This section delves into the various modes, practices, and precautions when working with files in Python.

open() Function

The open() function is the entry point for file operations. It returns a file object used to read, write, or manipulate the file.

open(filename, mode, encoding=None)
  • filename: Name of the file to open (string).
  • mode: Describes the purpose of file access.
  • encoding: Specifies the text encoding; recommended to use "utf-8" unless otherwise required.

File Modes

The mode parameter determines how the file is opened. Below are the common modes:

Mode Description
r Read-only mode (default). The file must exist.
w Write mode. Creates a new file or truncates an existing file.
a Append mode. Writes data to the end of the file, creating it if it doesn't exist.
r+ Read and write mode. The file must exist.
b Binary mode. Appends to r, w, or a to handle files as bytes objects (e.g., rb, wb).

Text Mode vs Binary Mode

Text Mode (Default)

  • Reads and writes strings.
  • Converts platform-specific line endings (in windows it is \r\n) to \n (reading) and back to platform-specific endings (writing).
  • Suitable for text files like .txt or .csv.

Binary Mode

  • Reads and writes data as bytes objects.
  • Does not modify line endings.
  • Essential for handling binary data like images (.jpeg), executables (.exe), or audio files.

Using the with Keyword

The with statement is the preferred way to handle file objects. It ensures the file is automatically closed after its block is executed, even if an exception occurs.

with open('example.txt', 'r', encoding='utf-8') as f:
    data = f.read()

print(f.closed)  # Output: True

Advantages:

  1. Automatically closes the file after use.
  2. Eliminates the need for explicitly calling f.close().
  3. Prevents resource leaks.

Without with:

f = open('example.txt', 'r', encoding='utf-8')
data = f.read()
f.close()

Writing to Files

Use the write() method to write strings or bytes to a file.

Example: Writing Strings

with open('example.txt', 'w', encoding='utf-8') as f:
    f.write('Hello, World!\n')

Precaution: Always use with or ensure f.close() is called to flush data to disk properly.

Example: Appending Data

with open('example.txt', 'a', encoding='utf-8') as f:
    f.write('Appending new line.\n')

Reading from Files

Use methods like read(), readline(), or readlines() to read data from a file.

  • read(size): Reads the specified number of characters (or bytes in binary mode).
  • readline(): Reads a single line. Includes the new line by default.
  • readlines(): Reads all lines and returns a list.

Example: Reading Entire File

with open('example.txt', 'r', encoding='utf-8') as f:
    content = f.read()
print(content)

Example: Reading Line by Line

with open('example.txt', 'r', encoding='utf-8') as f:
    for line in f:
        print(line.strip())

Handling Closed Files

Once a file is closed (either manually or automatically with with), any operations on it will raise an error.

f = open('example.txt', 'r')
f.close()
print(f.read())  # Raises ValueError: I/O operation on closed file

Error Handling

Be cautious when handling files:

  1. Use with to ensure proper cleanup.
  2. Avoid hardcoding paths; use libraries like os or pathlib for path manipulations.
  3. Handle exceptions using try-except blocks when necessary.
try:
    with open('nonexistent.txt', 'r') as f:
        content = f.read()
except FileNotFoundError:
    print('File does not exist!')

Summary of Good Practices

  1. Always use the with statement for file operations.
  2. Specify encoding="utf-8" for text files unless another encoding is required.
  3. Use binary mode for non-text files.
  4. Avoid performing operations on closed files.
  5. Handle exceptions gracefully to avoid crashes.

Methods of File Objects

File objects in Python provide numerous methods to interact with files effectively, whether reading, writing, or manipulating file positions.

1. Reading File Contents

f.read(size)

  • Reads a specified quantity of data and returns it as a string (text mode) or a bytes object (binary mode).
  • size (optional): Number of characters or bytes to read.
    • Omitted or negative: Reads the entire file.
    • Reaches the end of the file: Returns an empty string ('').

Examples:

# Reading the entire file
with open('example.txt', 'r') as f:
    content = f.read()
print(content)

# Reading with a size limit
with open('example.txt', 'r') as f:
    print(f.read(10))  # Reads first 10 characters
    print(f.read(5))   # Reads next 5 characters

# Reading beyond the end of the file
with open('example.txt', 'r') as f:
    print(f.read())  # Reads the content
    print(f.read())  # Returns '' since end of the file is reached

f.readline()

  • Reads one line from the file.
  • Keeps the newline character (\n) at the end of the string unless the file doesn't end with one.
  • Returns an empty string ('') if the end of the file is reached.
with open('example.txt', 'r') as f:
    print(f.readline())  # First line
    print(f.readline())  # Second line
    print(f.readline())  # Returns '' if no more lines exist

Handling Blank Lines:

  • A blank line is represented as '\n'.

f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn't end in a newline. This makes the return value unambiguous; if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by \n, a string containing only a single newline.


Reading Lines Efficiently

Iterating over a file object is an efficient and straightforward way to read lines.

with open('example.txt', 'r') as f:
    for line in f:
        print(line.strip())  # Removes trailing newline

f.readlines() or list(f)

Reads all lines from a file and returns them as a list of strings.

with open('example.txt', 'r') as f:
    lines = f.readlines()
print(lines)

with open('example.txt', 'r') as f:
    lines = list(f)
print(lines)

Writing to Files

f.write(string)

  • Writes the given string to the file.
  • Returns the number of characters written.
with open('example.txt', 'w') as f:
    num_chars = f.write('This is a test\n')
print(num_chars)  # Output: 15

Writing Non-String Objects

  • Non-string objects must be converted to strings or bytes before writing.
value = ('the answer', 42)
with open('example.txt', 'w') as f:
    f.write(str(value))  # Converts tuple to string

File Positioning

f.tell()

  • Returns the current position in the file.
  • In binary mode: Number of bytes from the start.
  • In text mode: An ???opaque value that represents the position.
with open('example.txt', 'r') as f:
    print(f.read(10))  # Reads first 10 characters
    print(f.tell())    # Shows the position after 10 characters

f.seek(offset, whence)

  • Changes the current position in the file.
  • offset: Number of bytes/characters to move.
  • whence: Reference point for movement.
    • 0: Beginning of the file (default).
    • 1: Current position.
    • 2: End of the file.

Examples:

# Binary mode example
with open('example.bin', 'wb+') as f:
    f.write(b'0123456789abcdef')  # Writes bytes

    f.seek(5)            # Move to the 6th byte
    print(f.read(1))     # Output: b'5'

    f.seek(-3, 2)        # Move 3 bytes before the end
    print(f.read(1))     # Output: b'd'

# Text mode example
with open('example.txt', 'w+') as f:
    f.write('Hello, World!')
    f.seek(0)            # Go to the start
    print(f.read(5))     # Reads 'Hello'

Text Mode Restrictions:

  • Seeks relative to the beginning (0) are only allowed.
  • Seeking beyond valid offsets produces undefined behavior.

Miscellaneous Methods

isatty()

  • Checks if the file object is connected to a terminal.
  • Rarely used in general file handling.

truncate(size)

  • Truncates the file to the specified size.
  • If size is omitted, it truncates from the current position.

Example:

with open('example.txt', 'w+') as f:
    f.write('This is a test.')
    f.truncate(10)  # Truncates to the first 10 characters
    f.seek(0)
    print(f.read())  # Output: 'This is a'

Good Practices and Tips

  1. Iterating Over Files: Use for line in f for efficient line-by-line reading.
  2. Avoid Reading Large Files Fully: Use f.read(size) to limit memory usage.
  3. Always Close Files: Use the with keyword to handle files automatically.
  4. Binary Data: Use binary mode ('rb' or 'wb') for non-text files.
  5. Position Handling: Use tell() and seek() for precise control over file navigation.

Saving Structured Data with JSON

Python's json module provides an efficient way to serialize and deserialize data for saving and sharing structured data like lists, dictionaries, and nested objects. JSON (JavaScript Object Notation) is a widely used data interchange format, making it a great choice for interoperability.

Understanding JSON Serialization and Deserialization

Serialization (Converting Python Objects to JSON)

  • Serializing is the process of converting Python objects (like lists, dictionaries, or other data hierarchies) into a JSON string representation.
  • This is helpful for saving data to files or transferring it over a network.

Deserialization (Converting JSON to Python Objects)

  • Deserializing is the process of reconstructing Python objects from a JSON string.
  • Useful for reading structured data stored in JSON format.

Why Use JSON?

  • Simplifies saving and retrieving complex data types like nested lists and dictionaries.
  • Eliminates the need to manually parse and serialize data.
  • JSON is widely supported, making it a great format for data exchange across applications and languages.

Key Functions in the json Module

json.dumps(obj)

Serializes a Python object into a JSON string.

import json

data = [1, 'simple', 'list']
json_string = json.dumps(data)
print(json_string)  # Output: '[1, "simple", "list"]'

json.dump(obj, file)

Serializes a Python object and writes it directly to a file.

data = {'name': 'Alice', 'age': 30, 'is_member': True}

with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f)

json.loads(json_string)

Deserializes a JSON string into a Python object.

json_string = '{"name": "Alice", "age": 30, "is_member": true}'
data = json.loads(json_string)
print(data)  # Output: {'name': 'Alice', 'age': 30, 'is_member': True}

json.load(file)

Deserializes JSON content from a file and reconstructs it into a Python object.

with open('data.json', 'r', encoding='utf-8') as f:
    data = json.load(f)
print(data)  # Output: {'name': 'Alice', 'age': 30, 'is_member': True}

Examples of Using JSON

Saving a Nested Dictionary

nested_data = {
    "user": {
        "id": 1,
        "name": "John",
        "preferences": {
            "language": "English",
            "timezone": "UTC"
        }
    }
}

with open('user_data.json', 'w', encoding='utf-8') as f:
    json.dump(nested_data, f, indent=4)  # `indent` for pretty formatting

Reading and Modifying JSON Data

with open('user_data.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

# Modify the data
data['user']['preferences']['language'] = 'French'

# Save changes back to the file
with open('user_data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=4)

Important Notes

UTF-8 Encoding:

  • JSON files must use UTF-8 encoding. Always specify encoding="utf-8" when opening files for reading or writing.

Handling Arbitrary Python Objects:

  • JSON supports common data types like strings, numbers, lists, and dictionaries.
  • Arbitrary Python objects (e.g., instances of custom classes) require additional effort to serialize.

Example with Custom Serialization:

class User:
   def __init__(self, name, age):
       self.name = name
       self.age = age

def user_to_dict(user):
   return {'name': user.name, 'age': user.age}

user = User('Alice', 25)
json_string = json.dumps(user, default=user_to_dict)
print(json_string)  # Output: '{"name": "Alice", "age": 25}'

Comparison with pickle:

pickle is a protocol which allows serializing and deserializing arbitrarily complex Python objects. It converts Python objects into a byte stream (pickling) and back into objects (unpickling). It's useful for saving objects to files or transmitting them over networks.

  • Advantages of JSON:
  • Language-independent: JSON can be used across different programming languages.
  • Safer: JSON doesn't execute arbitrary code during deserialization.
  • Advantages of pickle:
  • Can serialize arbitrary Python objects, but it's Python-specific.
  • Less secure: Deserializing untrusted pickle data can execute malicious code.

When to Use:

  • Use JSON for interoperability and when sharing data between different systems.
  • Use pickle for Python-specific projects requiring more complex object serialization.

Best Practices

  1. Always Close Files:

    • Use with to ensure proper file handling.

    Example:

    with open('data.json', 'r', encoding='utf-8') as f:
        data = json.load(f)
  2. Use indent for Readability:

    • When saving JSON data to a file, use the indent parameter for pretty-printed output.
  3. Avoid Circular References:

    • JSON serialization doesn't support circular references in data structures.

Advanced Example: Logging JSON Data

import json
import datetime

log_entry = {
    "timestamp": datetime.datetime.now().isoformat(),
    "event": "user_login",
    "details": {
        "user_id": 12345,
        "ip_address": "192.168.1.1"
    }
}

with open('log.json', 'a', encoding='utf-8') as f:
    json.dump(log_entry, f)
    f.write('\n')  # Add a newline for each log entry