Python topics for AI/ML/Data Science (no library names, just Python).
Must-know core
Syntax, indentation, expressions, operator precedence
Numeric types (int, float, complex), boolean logic, comparisons
Strings: slicing, formatting, f-strings, Unicode and encodings (UTF-8), bytes/bytearray
Built-in collections: list, tuple, dict, set; mutability vs immutability
Slicing, unpacking, iterable utilities (zip, enumerate), sorting with key
Comprehensions: list/dict/set; generator expressions
Functions: def, default args, *args/**kwargs, scope, closures
Control flow: if/elif/else, for/while, break/continue, ternary expressions
Exceptions: try/except/else/finally, custom exceptions, raise patterns
File I/O: text vs binary, newline handling, buffering, large file streaming
Modules and packages: imports, __name__ == "__main__", module structure
Docstrings and inline help
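A minimal sketch tying several of the core items above together (zip, enumerate, comprehensions, sorting with a key, f-strings, try/except, explicit encodings). The names, scores, and the parse_score helper are hypothetical and exist only for illustration.

```python
# Hypothetical sample data; names and values are made up for illustration.
names = ["ada", "grace", "alan"]
scores = [91.5, 88.0, 91.5]

# zip pairs the two lists; the dict comprehension builds a lookup table.
score_by_name = {name: score for name, score in zip(names, scores)}

# Sort by descending score, then by name, using a tuple as the sort key.
ranking = sorted(score_by_name, key=lambda n: (-score_by_name[n], n))

# enumerate supplies a 1-based rank; the f-string formats to one decimal place.
report = [
    f"{rank}. {name.title()}: {score_by_name[name]:.1f}"
    for rank, name in enumerate(ranking, start=1)
]


def parse_score(text, default=None):
    """Convert text to float, returning default when it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return default


# Text vs bytes: the encoding is explicit, and "ï" takes two bytes in UTF-8.
encoded = "naïve".encode("utf-8")

print("\n".join(report))
print(parse_score("3.5"), parse_score("n/a"), len(encoded))
```

Sorting on a tuple key keeps the tie-breaking rule visible in one place, which tends to matter once results must be reproducible across runs.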
Intermediate data/ML-leaning Python
Iterables vs iterators; generator functions; lazy evaluation
Context managers and the with statement; writing your own with __enter__/__exit__
Object-oriented basics: classes, instances, self, inheritance, composition
Dunder methods: __repr__, __str__, equality/ordering, hashing, length, iteration
Properties and descriptors for computed and validated attributes
Data modeling helpers: lightweight immutable/mutable record types
Pattern matching (match/case) for parsing structured data
Text processing: tokenizing, simple parsing, and regular expressions
Dates, times, time zones, timedeltas; naive vs aware pitfalls
Randomness and reproducibility: seeding, deterministic workflows
Numeric gotchas: floating point errors, Decimal/Fraction use cases
Basic statistics utilities in the standard library (mean, median, variance)
Serialization: JSON, CSV, and safe vs unsafe persistence; newline and encoding issues
Command-line interfaces: argument parsing and polite CLI ergonomics
Logging and warnings for experiments and runs
Testing basics: unit tests, fixtures, assertions; testable design
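A short sketch of a few intermediate items from the list above: a hand-written context manager, a lazy generator function, an immutable record type, seeded randomness, and a floating-point comparison pitfall. Recorder, squares, and Sample are hypothetical names chosen for this example.

```python
import math
import random
from dataclasses import dataclass
from fractions import Fraction


class Recorder:
    """Hypothetical context manager that logs entry and exit into a list."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append("exit")
        return False  # do not swallow exceptions raised inside the block


def squares(limit):
    """Generator function: each value is computed only when requested."""
    for n in range(limit):
        yield n * n


@dataclass(frozen=True)
class Sample:
    """Lightweight immutable record; frozen instances are hashable."""
    label: str
    value: float


events = []
with Recorder(events):
    lazy = squares(4)
    first = next(lazy)   # only one value has been computed so far
    rest = list(lazy)    # drains the remaining values

# Two generators seeded identically produce identical draws.
rng_a = random.Random(42)
rng_b = random.Random(42)
same_draws = [rng_a.random() for _ in range(3)] == [rng_b.random() for _ in range(3)]

# Binary floats cannot represent 0.1 exactly, so direct equality fails.
naive_equal = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)
exact_equal = Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10)

print(events, first, rest, same_draws, naive_equal, close_enough, exact_equal)
```

Using a local random.Random instance rather than the module-level functions is one way to keep seeding from leaking between unrelated parts of an experiment.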
Advanced and performance
Memory profiling: object sizes, reference cycles, garbage collection
Caching and memoization; purity vs side effects
Concurrency models: threads vs processes vs async/await; I/O-bound vs CPU-bound; queues
The GIL: what it is, when it matters
Streaming pipelines: lazy transforms, backpressure-friendly iteration
Binary data handling: buffers, memoryview, struct packing/unpacking
Stable sorting, custom comparators, key functions for complex data
Defensive programming: input validation, sentinel values, None handling
Import system internals: search paths, packages, relative vs absolute imports
Type hints: generics, unions, optionals, literals, protocols; runtime vs static checking tradeoffs
Packaging basics: module layout, versioning, reproducible runs
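A brief sketch of three of the advanced items above: memoization, struct packing with a memoryview, and stable sorting with key functions versus a custom comparator. The function names, the record layout ("<Hf"), and the sample rows are assumptions made for illustration.

```python
import struct
from functools import cmp_to_key, lru_cache

calls = []


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Memoized function; the append exists only to make cache hits visible."""
    calls.append(n)
    return n * n


slow_square(12)
slow_square(12)  # served from the cache, so the body does not run again

# "<Hf" = little-endian, unsigned 16-bit int, 32-bit float, no padding.
packed = struct.pack("<Hf", 7, 1.5)
ident, value = struct.unpack("<Hf", packed)

# A memoryview slice does not copy; bytes() makes the copy explicit.
view = memoryview(packed)
header = bytes(view[:2])

# sorted() is stable: rows with equal keys keep their original order.
rows = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]
by_count = sorted(rows, key=lambda r: r[1])


def compare(left, right):
    """Custom comparator: higher count first, then name ascending."""
    if left[1] != right[1]:
        return right[1] - left[1]
    return (left[0] > right[0]) - (left[0] < right[0])


by_cmp = sorted(rows, key=cmp_to_key(compare))

print(calls, len(packed), ident, value, header, by_count, by_cmp)
```

A key function is usually simpler and faster than a comparator; cmp_to_key is mainly useful when the ordering rule is easier to state pairwise.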
Pythonic data-work patterns
Clean transforms with comprehensions and generator pipelines
Grouping, aggregating, and windowed calculations with plain iterators
Deduping, set-based operations, and joins with built-ins
Robust error handling around flaky data: retry, fallback, and partial results
Reusable config via env vars, .ini/.json, and dataclass-style configs
Reproducible scripts: fixed seeds, fixed sort orders, explicit encodings, pinned time zones
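The patterns above can be sketched with plain iterators and built-in containers: order-preserving deduplication, grouping with aggregation, a dictionary-based join, and a windowed mean. The event tuples, the regions lookup, and the helper names dedupe and sliding_mean are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical (user, kind, amount) records, including one exact duplicate.
events = [
    ("u1", "click", 3),
    ("u2", "view", 1),
    ("u1", "view", 2),
    ("u2", "view", 1),
    ("u3", "click", 5),
]


def dedupe(items):
    """Yield each hashable item once, keeping first-seen order."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


def sliding_mean(values, size):
    """Yield the mean of each full window of `size` consecutive values."""
    window = deque(maxlen=size)
    for value in values:
        window.append(value)
        if len(window) == size:
            yield sum(window) / size


unique = list(dedupe(events))

# Group by user and sum the amounts.
totals = defaultdict(int)
for user, _kind, amount in unique:
    totals[user] += amount

# Left join against a lookup dict; sorting fixes the output order.
regions = {"u1": "eu", "u2": "us"}
joined = [
    (user, regions.get(user, "unknown"), total)
    for user, total in sorted(totals.items())
]

means = list(sliding_mean([1, 2, 3, 4, 5], 3))

print(unique, dict(totals), joined, means)
```

Because dedupe and sliding_mean are generators, they can be chained over inputs that do not fit in memory; only the seen set and the window are held at any time.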