Software Design By Example A Toolbased Introduction With Python Greg Wilson download https://ebookbell.com/product/software-design-by-example-a- toolbased-introduction-with-python-greg-wilson-55974684 Explore and download more ebooks at ebookbell.com
Here are some recommended products that we believe you will be interested in. You can click the link to download. Software Design By Example A Toolbased Introduction With Python Greg Wilson https://ebookbell.com/product/software-design-by-example-a-toolbased- introduction-with-python-greg-wilson-55959768 Upnp Design By Example A Software Developers Guide To Universal Plug And Play Michael Jeronimo https://ebookbell.com/product/upnp-design-by-example-a-software- developers-guide-to-universal-plug-and-play-michael-jeronimo-2184566 Software Design By Example Greg Wilson https://ebookbell.com/product/software-design-by-example-greg- wilson-47254836 Practical Design Patterns For Java Developers Hone Your Software Design Skills By Implementing Popular Design Patterns In Java 1st Edition Miroslav Wengner https://ebookbell.com/product/practical-design-patterns-for-java- developers-hone-your-software-design-skills-by-implementing-popular- design-patterns-in-java-1st-edition-miroslav-wengner-49111936
Handson Design Patterns With C And Net Core Write Clean And Maintainable Code By Using Reusable Solutions To Common Software Design Problems 1st Edition Gaurav Aroraa https://ebookbell.com/product/handson-design-patterns-with-c-and-net- core-write-clean-and-maintainable-code-by-using-reusable-solutions-to- common-software-design-problems-1st-edition-gaurav-aroraa-53022704 Addison Wesley Domaindriven Design Tackling Complexity In The Heart Of Software By Eric By Eric Evans https://ebookbell.com/product/addison-wesley-domaindriven-design- tackling-complexity-in-the-heart-of-software-by-eric-by-eric- evans-36065612 Handson Domaindriven Design With Net Core Tackling Complexity In The Heart Of Software By Putting Ddd Principles Into Practice Alexey Zimarev https://ebookbell.com/product/handson-domaindriven-design-with-net- core-tackling-complexity-in-the-heart-of-software-by-putting-ddd- principles-into-practice-alexey-zimarev-11074038 Software Design https://ebookbell.com/product/software-design-58948182 Software Design Patterns The Ultimate Guide 1st Edition Sufyan Bin Uzay https://ebookbell.com/product/software-design-patterns-the-ultimate- guide-1st-edition-sufyan-bin-uzay-47260458
Software Design by Example The best way to learn design in any field is to study examples, and some of the best examples of software design come from the tools programmers use in their own work. Software De- sign by Example: A Tool-Based Introduction with Python therefore builds small versions of the things programmers use in order to demystify them and give some insights into how ex- perienced programmers think. From a file backup system and a testing framework to a regular expression matcher, a browser layout engine, and a very small compiler, we explore common design patterns, show how making code easier to test also makes it easier to re-use, and help readers understand how debuggers, profilers, package managers, and version control systems work so that they can use them more effectively. This material can be used for self-paced study, in an undergraduate course on software design, or as the core of an intensive weeklong workshop for working programmers. Each chapter has a set of exercises ranging in size and difficulty from half a dozen lines to a full day’s work. Read- ers should be familiar with the basics of modern Python, but the more advanced features of the language are explained and illustrated as they are introduced. All the written material in this project can be freely re-used under the terms of the Creative Commons - Attribution license, while all of the software is made available under the terms of the Hippocratic License. All proceeds from the sale of this book will go to support the Red Door Family Shelter in Toronto. Features: •   Teaches software design by showing programmers how to build the tools they use every day •   Each chapter includes exercises to help readers check and deepen their understanding •   All the example code can be downloaded, re-used, and modified under an open license Dr. Greg Wilson is a programmer, author, and educator based in Toronto. He co-founded and was the first Executive Director of Software Carpentry, which has taught basic software skills to tens of thousands of researchers worldwide, and he has authored or edited over a dozen books (including two for children). Greg is a member of the Python Software Foundation and a recipi- ent of ACM SIGSOFT’s Influential Educator of the Year Award.
Taylor Francis Taylor Francis Group http://taylorandfrancis.com
Software Design by Example A Tool-Based Introduction with Python Greg Wilson
First edition published 2024 by CRC Press 2385 NW Executive Center Drive, Suite 320, Boca Raton FL 33431 and by CRC Press 4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN CRC Press is an imprint of Taylor Francis Group, LLC © 2024 Greg Wilson Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot as- sume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint. Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including pho- tocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for iden- tification and explanation without intent to infringe. ISBN: 978-1-032-72525-3 (hbk) ISBN: 978-1-032-72521-5 (pbk) ISBN: 978-1-032-72523-9 (ebk) DOI: 10.1201/9781032725239 Typeset in Arial by KnowledgeWorks Global Ltd. Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
Dedication This one’s for Mike and Jon: I’m glad you always found time to chat.
Taylor Francis Taylor Francis Group http://taylorandfrancis.com
Contents 1 Introduction 1 1.1 Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 The Big Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Formatting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.4 Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.5 What People Are Saying . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.6 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Objects and Classes 7 2.1 Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2 Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.3 Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.4 Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3 Finding Duplicate Files 17 3.1 Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.2 Hashing Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.3 Better Hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 4 Matching Patterns 25 4.1 Simple Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 4.2 Rethinking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 4.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 5 Parsing Text 35 5.1 Tokenizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 5.2 Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 5.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 6 Running Tests 43 6.1 Storing and Running Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 6.2 Finding Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 6.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 6.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 vii
viii Contents 7 An Interpreter 51 7.1 Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 7.2 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 7.3 Introspection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 7.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 8 Functions and Closures 59 8.1 Definition and Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 8.2 Calling Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 8.3 Closures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 8.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 8.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 9 Protocols 69 9.1 Mock Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 9.2 Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 9.3 Decorators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 9.4 Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 9.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 9.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 10 A File Archiver 81 10.1 Saving Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 10.2 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 10.3 Tracking Backups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 10.4 Refactoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 10.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 10.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 11 An HTML Validator 91 11.1 HTML and the DOM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 11.2 The Visitor Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 11.3 Checking Style . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 11.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 11.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 12 A Template Expander 99 12.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 12.2 Managing Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 12.3 Visiting Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 12.4 Implementing Handlers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 12.5 Control Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 12.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 12.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 13 A Code Linter 111 13.1 Machinery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 13.2 Finding Duplicate Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 13.3 Finding Unused Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 13.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 13.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Contents ix 14 Page Layout 119 14.1 Sizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 14.2 Positioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 14.3 Rendering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 14.4 Wrapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 14.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 14.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 15 Performance Profiling 131 15.1 Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 15.2 Row-Wise Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 15.3 Column-Wise Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 15.4 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 15.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 15.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 16 Object Persistence 145 16.1 Built-in Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 16.2 Converting to Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 16.3 Aliasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 16.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 16.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 17 Binary Data 155 17.1 Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 17.2 Bitwise Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 17.3 Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 17.4 And Now, Persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 17.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 17.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 18 A Database 165 18.1 Starting Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 18.2 Saving Records . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 18.3 A File-Backed Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 18.4 Playing with Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 18.5 Persisting Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 18.6 Cleaning Up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 18.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 18.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 19 A Build Manager 177 19.1 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 19.2 Initial Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 19.3 Topological Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 19.4 A Better Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 19.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 19.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
x Contents 20 A Package Manager 187 20.1 Semantic Versioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 20.2 Exhaustive Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 20.3 Generating Possibilities Manually . . . . . . . . . . . . . . . . . . . . . . . 191 20.4 Incremental Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 20.5 Using a Theorem Prover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 20.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 20.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 21 Transferring Files 201 21.1 Using TCP/IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 21.2 Chunking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 21.3 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 21.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207 21.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 22 Serving Web Pages 209 22.1 Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 22.2 Hello, Web . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 22.3 Serving Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 22.4 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 22.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 22.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217 23 A File Viewer 219 23.1 Curses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 23.2 Windowing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 23.3 Moving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 23.4 Refactoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 23.5 Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 23.6 Viewport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228 23.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 23.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 24 Undo and Redo 233 24.1 Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233 24.2 Insertion and Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 24.3 Going Backward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236 24.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 24.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 25 A Virtual Machine 243 25.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 25.2 Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245 25.3 Assembly Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247 25.4 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250 25.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252 25.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Contents xi 26 A Debugger 255 26.1 One Step at a Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255 26.2 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257 26.3 Extensibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259 26.4 Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261 26.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263 26.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264 27 Conclusion 267 A Bibliography 269 B Bonus Material 271 B.1 Using Function Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 B.2 Lazy Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274 B.3 Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274 B.4 Tracing Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277 B.5 Inspecting Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278 B.6 User-Defined Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279 B.7 Floating Point Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282 B.8 Big and Little Endian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283 B.9 Generating Test Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284 C Syllabus 287 D License 293 D.1 Writing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293 D.2 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293 E Code of Conduct 297 E.1 Our Standards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297 E.2 Our Responsibilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297 E.3 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 E.4 Enforcement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 E.5 Attribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 F Contributing 299 F.1 Editing Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 F.2 Making Decisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 F.3 FAQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 G Glossary 303 Index 327
Taylor Francis Taylor Francis Group http://taylorandfrancis.com
1 Introduction • The complexity of a system increases more rapidly than its size. • The best way to learn design is to study examples, and the best programs to use as examples are the ones programmers use every day. • These lessons assume readers can write small programs and want to write larger ones, or are looking for material to use in software design classes that they teach. • All of the content is free to read and re-use under open licenses, and all royalties from sales of this book will go to charity. Terms defined: cognitive load The best way to learn design in any field is to study examples [Schon1984; Petre2016], and the most approachable examples are ones that readers are already familiar with. These lessons therefore build small versions of tools that programmers use every day1 to show how experienced software designers think. Along the way, they introduce some fundamen- tal ideas in computer science that many self-taught programmers haven’t encountered. We hope these lessons will help you design better software yourself, and that if you know how programming tools work, you’ll be more likely to use them and better able to use them well. 1.1 Audience This learner persona [Wilson2019] describes who this book is for: Maya has a master’s degree in genomics. She knows enough Python to analyze data from her experiments, but struggles to write code other people can use. These lessons will show her how to design, build, and test large programs in less time and with less pain. Like Maya, you should be able to: • Write Python programs using lists, loops, conditionals, dictionaries, and functions. • Puzzle your way through Python programs that use classes and exceptions. • Run basic Unix shell commands like ls and mkdir. • Read and write a little bit of HTML. • Use Git2 to save and share files. (It’s OK not to know the more obscure commands3 .) 1https://en.wikipedia.org/wiki/Programming_tool 2https://git-scm.com/ 3https://git-man-page-generator.lokaltog.net/ 1
2 1 Introduction Objects and Classes Running Tests An Interpreter Functions and Closures Object Persistence A Template Expander Protocols Binary Data A Build Manager Undo and Redo A File Viewer A Debugger A Virtual Machine A Database Performance Profiling Page Layout An HTML Validator A Code Linter Parsing Text Matching Patterns A File Archiver Finding Duplicate Files Transferring Files Serving Web Pages A Package Manager source sink Legend 2 3 4 5 6 7 8 9 10 11 12 12 14 15 16 17 18 19 20 21 22 23 24 25 26 Figure 1.1: Lesson topics and dependencies. These chapters (Figure 1.1) are also designed to help another persona: Yim teaches two college courses on software development. They are frustrated that so many books talk about details but not about design and use examples that their students can’t relate to. This book will give them material they can use in class and starting points for course projects. 1.2 The Big Ideas Our approach to design is based on three big ideas. First, as the number of components in a system grows, the complexity of the system increases rapidly (Figure 1.2). However, the number of things we can hold in working memory at any time is fixed and fairly small [Hermans2021]. If we want to build large programs that we can understand, we therefore need to construct them out of pieces that interact in a small number of ways. Figuring out what those pieces and interactions should be is the core of what we call “design”. Second, “making sense” depends on who we are. When we use a low-level language, we incur the cognitive load of assembling micro-steps into something more meaningful. When we use a high-level language, on the other hand, we incur a similar load translating functions of functions into actual operations on actual data. More experienced programmers are more capable at both ends of the curve, but that’s not the only thing that changes. If a novice’s comprehension curve looks like the lower one in Figure 1.3, then an expert’s looks like the upper one. Experts don’t just understand more at all levels of abstraction; their preferred level has also shifted so they find √ x2 + y2 easier to read than the Medieval equivalent “the side of the square whose area is the sum of the areas of the two squares whose sides are given by the first part and the second part”. This
1.3 Formatting 3 A B C A B C D E F A B C D E F 3 components have 3 interactions 6 components have 30 interactions 2 components with 3 sub-components have 7 interactions Figure 1.2: How complexity grows with size. Abstraction Comprehension expert novice Figure 1.3: Novice and expert comprehension curves. curve means that for any given task, the code that is quickest for a novice to comprehend will almost certainly be different from the code that an expert can understand most quickly. Our third big idea is that programs are just another kind of data. Source code is just text, which we can process like other text files. Likewise, a program in memory is just a data structure that we can inspect and modify like any other. Treating code like data enables us to solve hard problems in elegant ways, but at the cost of increasing the level of abstraction in our programs. Once again, finding the balance is what we mean by “design”. 1.3 Formatting We display Python source code like this: for ch in example: print(ch) and Unix shell commands like this: for filename in *.dat do cut -d , -f 10 $filename done
4 1 Introduction Data files and program output are shown like this: - name: read params: - sample_data.csv alpha beta gamma delta We use ... to show where lines have been omitted, and occasionally break lines in unnatural ways to make them fit on the page. Where we do this, we end all but the last line with a single backslash . Finally, we show glossary entries in bold text and write functions as function_name rather than function_name(). The latter is more common, but the empty parentheses makes it hard to tell whether we’re talking about the function itself or a call to the function with no parameters. 1.4 Usage The source for this book is available in our Git repository4 and all of it can be read on our website5 . All of the written material in this book is licensed under the Creative Commons - Attribution - NonCommercial 4.0 International license6 (CC-BY-NC-4.0), while the software is covered by the Hippocratic License7 . The first license allows you to use and remix this material for noncommercial purposes, as-is or in adapted form, provided you cite its orig- inal source; if you want to sell copies or make money from this material in any other way, you must contact us8 and obtain permission first. The second license allows you to use and remix the software on this site provided you do not violate international agreements governing human rights; please see Appendix D for details. If you would like to improve what we have, add new material, or ask questions, please file an issue in our GitHub repository9 or send an email10 . All contributors are required to abide by our Code of Conduct (Appendix E). 1.5 What People Are Saying Here’s what people said about the JavaScript version of this book [Wilson2022a]: • Jessica Kerr11 : “Software Design by Example is the book I’ll recommend to every new dev... It is nice to you. It wants you to succeed... It’s a bridge from ‘learn to program’ to working programmer.” 4https://github.com/gvwilson/sdxpy/ 5https://third-bit.com/sdxpy/ 6https://creativecommons.org/licenses/by-nc/4.0/ 7https://firstdonoharm.dev/ 8mailto:gvwilson@third-bit.com 9https://github.com/gvwilson/sdxpy/ 10mailto:gvwilson@third-bit.com 11https://jessitron.com/2023/02/20/book-review-software-design-by-example/
1.6 Acknowledgments 5 • Jenn Schiffer12 : “I am v much enjoying gvwilson’s book Software Design by Example. It makes me miss teaching, it would be such a fun text to use!” • Emily Gorcenski13 : “There’s a lot of books on programming but fewer books that couple software development with effective and practical use of tools, presenting a language not as a main course but as a part of an engineering ecosystem. Greg Wilson’s book hits all the right notes in bringing together theory, pragmatism, and best practices.” • Danielle Navarro14 : “The book is really bloody lovely.” 1.6 Acknowledgments Like [Wilson2022a], this book was inspired by [Kamin1990; Kernighan1979; Kernighan1981; Kernighan1983; Kernighan1988; Oram2007; Wirth1976] and by: • The Architecture of Open Source Applications15 series [Brown2011; Brown2012; Arm- strong2013; Brown2016]; • Mary Rose Cook’s16 Gitlet17 ; • Matt Brubeck’s18 browser engine tutorial19 ; • Web Browser Engineering20 by Pavel Panchekha21 and Chris Harrelson22 ; • Connor Stack’s23 database tutorial24 ; • Maël Nison’s25 package manager tutorial26 ; • Paige Ruten’s27 kilo text editor28 and Wasim Lorgat’s29 editor tutorial30 ; • Bob Nystrom’s31 Crafting Interpreters32 [Nystrom2021]; and • the posts and zines33 created by Julia Evans34 . 12https://mastodon.social/@jenn@pixel.kitchen/109985276835264400 13https://emilygorcenski.com/post/book-report-software-design-by-example-by-greg-wilson/ 14https://blog.djnavarro.net/posts/2023-05-31_software-design-by-example/ 15https://aosabook.org/ 16https://maryrosecook.com/ 17http://gitlet.maryrosecook.com/ 18https://limpet.net/mbrubeck/ 19https://limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.html 20https://browser.engineering/ 21https://pavpanchekha.com/ 22https://twitter.com/chrishtr 23https://connorstack.com/ 24https://cstack.github.io/db_tutorial/ 25https://arcanis.github.io/ 26https://classic.yarnpkg.com/blog/2017/07/11/lets-dev-a-package-manager/ 27https://viewsourcecode.org/ 28https://viewsourcecode.org/snaptoken/kilo/index.html 29https://wasimlorgat.com/ 30https://github.com/seem/editor 31http://journal.stuffwithstuff.com/ 32https://craftinginterpreters.com/ 33https://wizardzines.com/ 34https://jvns.ca/
6 1 Introduction I am grateful to Miras Adilov, Alvee Akand, Rohan Alexander, Alexey Alexapolsky, Lina Andrén, Alberto Bacchelli, Yanina Bellini Saibene, Matthew Bluteau, Adrienne Canino, Marc Chéhab, Stephen Childs, Hector Correa, Socorro Dominguez, Christian Drumm, Christian Epple, Julia Evans, Davide Fucci, Thomas Fritz, Francisco Gabriel, Florian Gaudin-Delrieu, Craig Gross, Jonathan Guyer, McKenzie Hagen, Han Qi, Fraser Hay, Alexandru Hurjui, Bahman Karimi, Carolyn Kim, Kitsios Konstantinos, Jenna Landy, Peter Lin, Zihan Liu, Becca Love, Dan McCloy, Ramiro Mejia, Michael Miller, Firas Moosvi, Joe Nash, Sheena Ng, Reiko Okamoto, Juanan Pereira, Mahmoodur Rahman, Arpan Sarkar, Silvan Schlegel, Rosan Shanmuganathan, Dave W. Smith, Stephen M. Sturdevant, Diyar Taskiran, Ece Tur- nator, and Yao Yundong for feedback on early drafts of this material. I am also grateful to Shashi Kumar for help with LaTeX, to Odin Beuchat35 for help with JavaScript, and to the creators of Black36 , flake837 , Glosario38 , GNU Make39 , isort40 , ark41 , LaTeX42 , pip43 , Python44 , Remark45 , WAVE46 , and many other open source tools: if we all give a little, we all get a lot. All royalties from this book will go to the Red Door Family Shelter47 in Toronto. 1.7 Exercises Setting Up 1. Use pip48 to install Black49 , flake850 , and isort51 on your computer. 2. Run them on a few programs you have already written. (The file setup.cfg in the root directory of this book’s GitHub repository52 has the settings we use for these tools.) What problems do they report? Which of these reports do you disagree with? Avoiding Potholes Go to the GitHub repository53 for this book and look at the open issues. Which of them can you understand? What makes the others hard to understand? What could you add, leave out, or write differently when you report a problem that you have found? 35https://www.drafolin.ch/ 36https://black.readthedocs.io/ 37https://flake8.pycqa.org/ 38https://glosario.carpentries.org/ 39https://www.gnu.org/software/make/ 40https://pycqa.github.io/isort/ 41https://www.dmulholl.com/docs/ark/main/ 42https://www.latex-project.org/ 43https://pip.pypa.io/ 44https://www.python.org/ 45https://remarkjs.com/ 46https://wave.webaim.org/ 47https://www.reddoorshelter.ca/ 48https://pip.pypa.io/ 49https://black.readthedocs.io/ 50https://flake8.pycqa.org/ 51https://pycqa.github.io/isort/ 52https://github.com/gvwilson/sdxpy/ 53https://github.com/gvwilson/sdxpy/
2 Objects and Classes • Objects are useful without classes, but classes make them easier to understand. • A well-designed class defines a contract that code using its instances can rely on. • Objects that respect the same contract are polymorphic, i.e., they can be used interchangeably even if they do different specific things. • Objects and classes can be thought of as dictionaries with stereotyped behavior. • Most languages allow functions and methods to take a variable number of arguments. • Inheritance can be implemented in several ways that differ in the order in which objects and classes are searched for methods. Terms defined: alias, argument, cache, class method, constructor, derived class, design by contract, monkey patching, multiple inheritance, object-oriented programming, parameter, polymorphism, recursion, spread, static method, upcall, varargs We are going to create a lot of objects and classes in these lessons, and they will be a lot easier to use if we understand how they are implemented. Historically, object-oriented programming (OOP) was invented to solve two problems: 1. What is a natural way to represent real-world “things” in code? 2. How can we organize code to make it easier to understand, test, and extend? 2.1 Objects As a motivating problem, let’s define some of the things a generic shape in a drawing package must be able to do: class Shape: def __init__(self, name): self.name = name def perimeter(self): raise NotImplementedError(perimeter) def area(self): raise NotImplementedError(area) A specification like this is sometimes called a contract because an object must satisfy it in order to be considered a shape, i.e., must provide methods with these names that do 7
8 2 Objects and Classes what those names suggest. For example, we can derive classes from Shape to represent squares and circles. class Square(Shape): def __init__(self, name, side): super().__init__(name) self.side = side def perimeter(self): return 4 * self.side def area(self): return self.side ** 2 class Circle(Shape): def __init__(self, name, radius): super().__init__(name) self.radius = radius def perimeter(self): return 2 * math.pi * self.radius def area(self): return math.pi * self.radius ** 2 Since squares and circles have the same methods, we can use them interchangeably. This is called polymorphism, and it reduces cognitive load by allowing the people using related things to ignore their differences: examples = [Square(sq, 3), Circle(ci, 2)] for thing in examples: n = thing.name p = thing.perimeter() a = thing.area() print(f{n} has perimeter {p:.2f} and area {a:.2f}) sq has perimeter 12.00 and area 9.00 ci has perimeter 12.57 and area 12.57 But how does polymorphism work? The first thing we need to understand is that a func- tion is an object. While the bytes in a string represent characters and the bytes in an image represent pixels, the bytes in a function are instructions (Figure 2.1). When Python exe- cutes the code below, it creates an object in memory that contains the instructions to print a string and assigns that object to the variable example: def example(): print(in example) We can create an alias for the function by assigning it to another variable and then call the function by referencing that second variable. Doing this doesn’t alter or erase the connection between the function and the original name: alias = example alias() in example We can also store function objects in data structures like lists and dictionaries. Let’s write some functions that do the same things as the methods in our original Python and store them in a dictionary to represent a square (Figure 2.2):
2.1 Objects 9 as text it's all just bytes as image old_x = x x = 2 * x + 1 as instructions bytes 74 69 73 27 61 20 6C 6C 6A 20 73 75 20 74 79 62 65 74 0A 73 Figure 2.1: Bytes can be interpreted as text, images, instructions, and more. name side perim area square sq 2 square_perim() square_area() name radius perim area circle ci 3 circle_perim() circle_area() Figure 2.2: Using dictionaries to emulate objects. def square_perimeter(thing): return 4 * thing[side] def square_area(thing): return thing[side] ** 2 def square_new(name, side): return { name: name, side: side, perimeter: square_perimeter , area: square_area } If we want to use one of the “methods” in this dictionary, we call it like this: def call(thing, method_name): return thing[method_name](thing) examples = [square_new(sq, 3), circle_new(ci, 2)] for ex in examples: n = ex[name] p = call(ex, perimeter) a = call(ex, area) print(f{n} {p:.2f} {a:.2f}) The function call looks up the function stored in the dictionary, then calls that function with the dictionary as its first object; in other words, instead of using obj.meth(arg) we use obj[meth](obj, arg). Behind the scenes, this is (almost) how objects actually work.
10 2 Objects and Classes We can think of an object as a special kind of dictionary. A method is just a function that takes an object of the right kind as its first parameter (typically called self in Python). 2.2 Classes One problem with implementing objects as dictionaries is that it allows every single object to behave slightly differently. In practice, we want objects to store different values (e.g., different squares to have different sizes) but the same behaviors (e.g., all squares should have the same methods). We can implement this by storing the methods in a dictionary called Square that corresponds to a class and having each individual square contain a ref- erence to that higher-level dictionary (Figure 2.3). In the code below, that special reference uses the key _class: def square_perimeter(thing): return 4 * thing[side] def square_area(thing): return thing[side] ** 2 Square = { perimeter: square_perimeter , area: square_area , _classname: Square } def square_new(name, side): return { name: name, side: side, _class: Square } Calling a method now involves one more lookup because we have to go from the object to the class to the method, but once again we call the “method” with the object as the first argument: name side perim area square sq 2 square_perim() square_area() name radius perim area circle ci 3 circle_perim() circle_area() _class _class _classname square _classname circle Square Circle Figure 2.3: Using dictionaries to emulate classes.
2.3 Arguments 11 def call(thing, method_name): return thing[_class][method_name](thing) examples = [square_new(sq, 3), circle_new(ci, 2)] for ex in examples: n = ex[name] p = call(ex, perimeter) a = call(ex, area) c = ex[_class][_classname] print(f{n} is a {c}: {p:.2f} {a:.2f}) As a bonus, we can now reliably identify objects’ classes and ask whether two objects are of the same class or not by checking what their _class keys refer to. Arguments vs. Parameters Many programmers use the words argument and parameter interchangeably, but to make our meaning clear, we call the values passed into a function its arguments and the names the function uses to refer to them as its parameters. Put another way, parameters are part of the definition, and arguments are given when the function is called. 2.3 Arguments The methods we have defined so far operate on the values stored in the object’s dictionary, but none of them take any extra arguments as input. Implementing this is a little bit tricky because different methods might need different numbers of arguments. We could define functions call_0, call_1, call_2 and so on to handle each case, but like most modern languages, Python gives us a better way. If we define a parameter in a function with a leading *, it captures any “extra” values passed to the function that don’t line up with named parameters. Similarly, if we define a parameter with two leading stars **, it captures any extra named parameters: def show_args(title, *args, **kwargs): print(f{title} args '{args}' and kwargs '{kwargs}') show_args(nothing) show_args(one unnamed argument, 1) show_args(one named argument, second=2) show_args(one of each, 3, fourth=4) nothing args '()' and kwargs '{}' one unnamed argument args '(1,)' and kwargs '{}' one named argument args '()' and kwargs '{'second': '2'}' one of each args '(3,)' and kwargs '{'fourth': '4'}' This mechanism is sometimes referred to as varargs (short for “variable arguments”). A complementary mechanism called spreading allows us to take a list or dictionary full of arguments and spread them out in a call to match a function’s parameters:
12 2 Objects and Classes def show_spread(left, middle, right): print(fleft {left} middle {middle} right {right}) all_in_list = [1, 2, 3] show_spread(*all_in_list) all_in_dict = {right: 30, left: 10, middle: 20} show_spread(**all_in_dict) left 1 middle 2 right 3 left 10 middle 20 right 30 With these tools in hand, let’s add a method to our Square class to tell us whether a square is larger than a user-specified size: def square_larger(thing, size): return call(thing, area) size Square = { perimeter: square_perimeter , area: square_area , larger: square_larger , _classname: Square } The function that implements this check for circles looks exactly the same: def circle_larger(thing, size): return call(thing, area) size We then modify call to capture extra arguments in *args and spread them into the function being called: def call(thing, method_name , *args): return thing[_class][method_name](thing, *args) Our tests show that this works: examples = [square_new(sq, 3), circle_new(ci, 2)] for ex in examples: result = call(ex, larger, 10) print(fis {ex['name']} larger? {result}) is sq larger? False is ci larger? True However, we now have two functions that do exactly the same thing—the only difference between them is their names. Anything in a program that is duplicated in several places will eventually be wrong in at least one, so we need to find some way to share this code. 2.4 Inheritance The tool we want is inheritance. To see how this works in Python, let’s add a method called density to our original Shape class that uses other methods defined by the class
2.4 Inheritance 13 class Shape: def __init__(self, name): self.name = name def perimeter(self): raise NotImplementedError(perimeter) def area(self): raise NotImplementedError(area) def density(self, weight): return weight / self.area() examples = [Square(sq, 3), Circle(ci, 2)] for ex in examples: n = ex.name d = ex.density(5) print(f{n}: {d:.2f}) sq: 0.56 ci: 0.40 To enable our dictionary-based “classes” to do the same thing, we create a dictionary to represent a generic shape and give it a “method” to calculate density: def shape_density(thing, weight): return weight / call(thing, area) Shape = { density: shape_density , _classname: Shape, _parent: None } We then add another specially-named field to the dictionaries for “classes” like Square to keep track of their parents: Square = { perimeter: square_perimeter , area: square_area , _classname: Square, _parent: Shape } and modify the call function to search for the requested method (Figure 2.4): def call(thing, method_name , *args): method = find(thing[_class], method_name) return method(thing, *args) def find(cls, method_name): while cls is not None: if method_name in cls: return cls[method_name] cls = cls[_parent] raise NotImplementedError(method_name)
14 2 Objects and Classes name side perim area square sq 2 square_perim() square_area() name radius perim area circle ci 3 circle_perim() circle_area() _class _class _classname square _classname circle Square Circle _parent _parent density shape_density() _classname shape Shape _parent Figure 2.4: Using dictionary search to implement inheritance. A simple test shows that this is working as intended: examples = [square_new(sq, 3), circle_new(ci, 2)] for ex in examples: n = ex[name] d = call(ex, density, 5) print(f{n}: {d:.2f}) sq: 0.56 ci: 0.40 We do have one task left, though: we need to make sure that when a square or circle is made, it is made correctly. In short, we need to implement constructors. We do this by giving the dictionaries that implement classes a special key _new whose value is the function that builds something of that type: def shape_new(name): return { name: name, _class: Shape } Shape = { density: shape_density , _classname: Shape, _parent: None, _new: shape_new } In order to make an object, we call the function associated with its _new key: def make(cls, *args): return cls[_new](*args) That function is responsible for upcalling the constructor of its parent. For example, the constructor for a square calls the constructor for a generic shape and adds square-specific values using | to combine two dictionaries:
2.5 Summary 15 def square_new(name, side): return make(Shape, name) | { side: side, _class: Square } Square = { perimeter: square_perimeter , area: square_area , _classname: Square, _parent: Shape, _new: square_new } Of course, we’re not done until we test it: examples = [make(Square, sq, 3), make(Circle, ci, 2)] for ex in examples: n = ex[name] d = call(ex, density, 5) print(f{n}: {d:.2f}) sq: 0.56 ci: 0.40 2.5 Summary We have only scratched the surface of Python’s object system. Multiple inheritance, class methods, static methods, and monkey patching are powerful tools, but they can all be understood in terms of dictionaries that contain references to properties, functions, and other dictionaries (Figure 2.5). class object methods attributes defines has instance of contract respects dictionary dictionary stored in refers to constructor includes stored in Figure 2.5: Concept map for implementing objects and classes.
16 2 Objects and Classes 2.6 Exercises Handling Named Arguments The final version of call declares a parameter called *args to capture all the positional arguments of the method being called and then spreads them in the actual call. Modify it to capture and spread named arguments as well. Multiple Inheritance Implement multiple inheritance using dictionaries. Does your implementation look methods up in the same order as Python would? Class Methods and Static Methods 1. Explain the differences between class methods and static methods. 2. Implement both using dictionaries. Reporting Type Python type method reports the most specific type of an object, while isinstance deter- mines whether an object inherits from a type either directly or indirectly. Add your own versions of both to dictionary-based objects and classes. Using Recursion A recursive function is one that calls itself, either directly or indirectly. Modify the find function that finds a method to call so that it uses recursion instead of a loop. Which version is easier to understand? Which version is more efficient? Method Caching Our implementation searches for the implementation of a method every time that method is called. An alternative is to add a cache to each object to save the methods that have been looked up before. For example, each object could have a special key called _cache whose value is a dictionary. The keys in that dictionary are the names of methods that have been called in the past, and the values are the functions that were found to implement those methods. Add this feature to our dictionary-based objects. How much more complex does it make the code? How much extra storage space does it need compared to repeated lookup?
3 Finding Duplicate Files • A hash function creates a fixed-size value from an arbitrary sequence of bytes. • Use big-oh notation to estimate the running time of algorithms. • The output of a hash function is deterministic but not easy to predict. • A good hash function’s output is evenly distributed. • A large cryptographic hash can be used to uniquely identify a file’s contents. Terms defined: big-oh notation, binary mode, bucket, collision (in hashing), cryptographic hash function, hash code, hash function, hexadecimal, SHA-256 (hash function), streaming API, time complexity Suppose we want to find duplicated files, such as extra copies of photos or data sets. People often rename files, so we must compare their contents, but this will be slow if we have a lot of files. We can estimate how slow “slow” will be with a simple calculation. N objects can be paired in N(N − 1) ways. If we remove duplicate pairings (i.e., if we count A-B and B-A as one pair) then there are N(N − 1)/2 = (N2 − N)/2 distinct pairs. As N gets large, this value is approximately proportional to N2 . A computer scientist would say that the time complexity of our algorithm is O(N2 ), which is pronounced “big-oh of N squared”. In simpler terms, when the number of files doubles, the running time roughly quadruples, which means the time per file increases as the number of files increases. Slowdown like this is often unavoidable, but in our case there’s a better way. If we generate a shorter identifier for each file that depends only on the bytes it contains, we can group together the files that have the same identifier and only compare the files within a group. This approach is faster because we only do the expensive byte-by-byte comparison on files that might be equal. And as we’ll see, if we are very clever about how we generate identifiers then we can avoid byte-by-byte comparisons entirely. 3.1 Getting Started We’ll start by implementing the inefficient N2 approach so that we can compare our later designs to it. The short program below takes a list of filenames from the command line, finds duplicates, and prints the matches: def find_duplicates(filenames): matches = [] for left in filenames: for right in filenames: if same_bytes(left, right): matches.append((left, right)) return matches 17
18 3 Finding Duplicate Files if __name__ == __main__: duplicates = find_duplicates(sys.argv[1:]) for (left, right) in duplicates: print(left, right) This program uses a function called same_bytes that reads two files and compares them byte by byte: def same_bytes(left_name, right_name): left_bytes = open(left_name, rb).read() right_bytes = open(right_name, rb).read() return left_bytes == right_bytes Notice that the files are opened in binary mode using rb instead of the usual r. As we’ll see in Chapter 17, this tells Python to read the bytes exactly as they are rather than trying to convert them to characters. To test this program and the others we’re about to write, we create a tests directory with six files: Filename a1.txt a2.txt a3.txt b1.txt b2.txt c1.txt Content aaa aaa aaa bb bb c We expect the three a files and the two b files to be reported as duplicates. There’s no particular reason for these tests—we just have to start somewhere. Our first test looks like this: python brute_force_1.py tests/*.txt tests/a1.txt tests/a1.txt tests/a1.txt tests/a2.txt tests/a1.txt tests/a3.txt tests/a2.txt tests/a1.txt tests/a2.txt tests/a2.txt tests/a2.txt tests/a3.txt tests/a3.txt tests/a1.txt tests/a3.txt tests/a2.txt tests/a3.txt tests/a3.txt tests/b1.txt tests/b1.txt tests/b1.txt tests/b2.txt tests/b2.txt tests/b1.txt tests/b2.txt tests/b2.txt tests/c1.txt tests/c1.txt Our program’s output is correct but not useful: every file is reported as being identical to itself, and every match of different files is reported twice. Let’s fix the nested loop in find_duplicates so that we only check potentially differing pairs once (Figure 3.1): def find_duplicates(filenames): matches = [] for i_left in range(len(filenames)): left = filenames[i_left] for i_right in range(i_left): right = filenames[i_right] if same_bytes(left, right): matches.append((left, right)) return matches
3.2 Hashing Files 19 1 2 3 0 0 1 2 3 i_left i_right Figure 3.1: Scoping the inner loop to produce unique combinations. 3.2 Hashing Files Instead of comparing every file against every other, let’s process each file once to produce a short identifier that depends only on the file’s contents and then only compare files that have the same identifier, i.e., that might be equal. If files are evenly divided into g groups then each group will contain roughly N/g files, so the total work will be roughly O(g(N/g)2 ) (i.e., g groups times (N/g)2 comparisons within each group). Simplifying, this is N2 /g, so as the number of groups grows, and the overall running time should decrease (Figure 3.2). a1.txt a2.txt a3.txt b1.txt b2.txt c1.txt aaa aaa aaa bb bb c 6 6 6 10 10 5 filename contents hash code Figure 3.2: Grouping by hash code reduces comparisons from 15 to 4. We can construct IDs for files using a hash function to produce a hash code. Since bytes are just numbers, we can create a very simple hash function by adding up the bytes in a file and taking the remainder modulo some number: def naive_hash(data): return sum(data) % 13 Here’s a quick test that calculates the hash code for successively longer substrings of the word hashing: example = bytes(hashing, utf-8) for i in range(1, len(example) + 1): substring = example[:i] hash = naive_hash(substring) print(f{hash:2} {substring}) 0 b'h' 6 b'ha' 4 b'has' 4 b'hash' 5 b'hashi'
20 3 Finding Duplicate Files 0 2 4 6 8 10 12 0 1000 2000 3000 hash count Figure 3.3: Distribution of hash codes per line in Dracula. 0 2 4 6 8 10 12 0 200 400 600 800 1000 hash count Figure 3.4: Distribution of hash codes per unique line in Dracula. 11 b'hashin' 10 b'hashing' The output seems random, but is it? As a more stringent test, let’s try hashing every line of text in the Project Gutenberg1 version of the novel Dracula and plot the distribution (Figure 3.3). Most of the buckets are approximately the same height, but why is there a peak at zero? Our big-oh estimate of how efficient our algorithm would be depended on files being distributed evenly between groups; if that’s not the case, our code won’t be as fast as we hoped. After a bit of digging, it turns out that the text file we’re processing uses a blank line to separate paragraphs. These hash to zero, so the peak reflects an unequal distribution in our data. If we plot the distribution of hash codes of unique lines, the result is more even (Figure 3.4). 1https://www.gutenberg.org/
3.3 Better Hashing 21 Hashing is a tremendously powerful tool: for example, Python’s dictionaries hash their keys to make lookup fast. Now that we can hash files, we can build a dictionary with hash codes as keys and sets of filenames as values. The code that does this is shown below; each time it calculate a hash code, it checks to see if that value has been seen before. If not, it creates a new entry in the groups dictionary with the hash code as its key and an empty set as a value. It can then be sure that there’s a set to add the filename to: def find_groups(filenames): groups = {} for fn in filenames: data = open(fn, rb).read() hash_code = naive_hash(data) if hash_code not in groups: groups[hash_code] = set() groups[hash_code].add(fn) return groups We can now re-use most of the code we wrote earlier to find duplicates within each group: groups = find_groups(sys.argv[1:]) for filenames in groups.values(): duplicates = find_duplicates(list(filenames)) for (left, right) in duplicates: print(left, right) tests/a2.txt tests/a1.txt tests/a3.txt tests/a1.txt tests/a3.txt tests/a2.txt tests/b1.txt tests/b2.txt 3.3 Better Hashing Let’s go back to the formula O(N2 /g) that tells us how much work we have to do if we have divided N files between g groups. If we have exactly as many groups as files—i.e., if g is equal to N—then the work to process N files would be O(N2 /N) = O(N), which means that the work will be proportional to the number of files. We have to read each file at least once anyway, so we can’t possibly do better than this, but how can we ensure that each unique file winds up in its own group? The answer is to use a cryptographic hash function. The output of such a function is completely deterministic: given the same bytes in the same order, it will always produce the same output. However, the output is distributed like a uniform random variable: each possible output is equally likely, which ensures that files will be evenly distributed between groups. Cryptographic hash functions are hard to write, and it’s very hard to prove that a partic- ular algorithm has the properties we require. We will therefore use a function from Python’s hashing module2 that implements the SHA-256 hashing algorithm. Given some bytes as input, this function produces a 256-bit hash, which is normally written as a 64-character hexadecimal string. This uses the letters A-F (or a-f) to represent the digits from 10 to 15, so that (for example) 3D5 is (3×162 ) + (13×161 ) + (5×160 ), or 981 in decimal: 2https://docs.python.org/3/library/hashlib.html
22 3 Finding Duplicate Files example = bytes(hash, utf-8) for i in range(1, len(example) + 1): substring = example[:i] hash = sha256(substring).hexdigest() print(f{substring}n{hash}) b'h' aaa9402664f1a41f40ebbc52c9993eb66aeb366602958fdfaa283b71e64db123 b'ha' 8693873cd8f8a2d9c7c596477180f851e525f4eaf55a4f637b445cb442a5e340 b'has' 9150c74c5f92d51a92857f4b9678105ba5a676d308339a353b20bd38cd669ce7 b'hash' d04b98f48e8f8bcc15c6ae5ac050801cd6dcfd428fb5f9e65c4e16e7807340fa The Birthday Problem The odds that two people share a birthday are 1/365 (ignoring February 29). The odds that they don’t are therefore 364/365. When we add a third person, the odds that they don’t share a birthday with either of the preceding two people are 363/365, so the overall odds that nobody shares a birthday are (364/365)×(363/365). If we keep going, there’s a 50% chance of two people sharing a birthday in a group of just 23 people, and a 99.9% chance with 70 people. The same math can tell us how many files we need to hash before there’s a 50% chance of a collision with a 256-bit hash. According to Wikipedia3 , the answer is approximately 4×1038 files. We’re willing to take that risk. Using this library function makes our duplicate file finder much shorter: import sys from hashlib import sha256 def find_groups(filenames): groups = {} for fn in filenames: data = open(fn, rb).read() hash_code = sha256(data).hexdigest() if hash_code not in groups: groups[hash_code] = set() groups[hash_code].add(fn) return groups if __name__ == __main__: groups = find_groups(sys.argv[1:]) for filenames in groups.values(): print(, .join(sorted(filenames))) python dup.py tests/*.txt tests/a1.txt, tests/a2.txt, tests/a3.txt tests/b1.txt, tests/b2.txt tests/c1.txt 3https://en.wikipedia.org/wiki/Birthday_problem
3.4 Summary 23 More importantly, our new approach scales to very large sets of files: as explained above, we only have to look at each file once, so the running time is as good as it possibly can be. 3.4 Summary Figure 3.5 summarizes the key ideas in this chapter, the most important of which is that some algorithms are intrinsically better than others. hash function bytes file hash code from input is output is deterministic uniformly distributed is unique for each sequence of duplicates may be efficiency improves of detecting big-oh measured by such as linear O(N) quadratic O(N2) Figure 3.5: Concept map for duplicate file detection with hashing. 3.5 Exercises Odds of Collision If hashes were only 2 bits long, then the chances of collision with each successive file assuming no previous collision are: Number of Files Odds of Collision 1 0% 2 25% 3 50% 4 75% 5 100% A colleague of yours says this means that if we hash four files, there’s only a 75% chance of any collision occurring. What are the actual odds?
24 3 Finding Duplicate Files Streaming I/O A streaming API delivers data one piece at a time rather than all at once. Read the doc- umentation for the update method of hashing objects in Python’s hashing module4 and rewrite the duplicate finder from this chapter to use it. Big Oh Chapter 1 said that as the number of components in a system grows, the complexity of the system increases rapidly. How fast is “rapidly” in big-oh terms? The hash Function 1. Read the documentation for Python’s built-in hash function. 2. Why do hash(123) and hash(123) work when hash([123]) raises an exception? How Good Is SHA-256? 1. Write a function that calculates the SHA-256 hash code of each unique line of a text file. 2. Convert the hex digests of those hash codes to integers. 3. Plot a histogram of those integer values with 20 bins. 4. How evenly distributed are the hash codes? How does the distribution change as you process larger files? 4https://docs.python.org/3/library/hashlib.html
4 Matching Patterns • Use globs and regular expressions to match patterns in text. • Use inheritance to make matchers composable and extensible. • Simplify code by having objects delegate work to other objects. • Use the Null Object pattern to eliminate special cases in code. • Use standard refactorings to move code from one working state to another. • Build and check the parts of your code you are least sure of first to find out if your design will work. Terms defined: Chain of Responsibility pattern, child class, Extract Parent Class refactoring, globbing, greedy matching, helper method, inheritance, lazy matching, literal (in parsing), Null Object pattern, refactor, regular expression, signature, technical debt, test-driven development We used *.txt to tell the duplicate file finder of Chapter 3 which files to compare. Older programmers (like this author) refer to this kind of pattern-matching as globbing because early versions of Unix had a tool called glob1 to do it. Globbing was so useful that it was quickly added to the shell, and the Python standard library includes a module called glob2 to match filenames in the same way. For example, 2023-*.{pdf,txt} matches 2023-01.txt and 2023-final.pdf but not draft-2023.docx (Figure 4.1). Globbing patterns are simpler than the regular expressions used to scrape data from text files, but the principles are the same. This chapter therefore implements a simple ver- sion of globbing to show how pattern-matching works in general. This matcher will only handle the cases in Table 4.1, but as the exercises will show, our design makes it easy to add new kinds of patterns. 2023- * . {pdf,txt} 2023- 01 . txt 2023- * . {pdf,txt} 2023- final . pdf 2023- * . {pdf,txt} draft 2023 . docx Figure 4.1: Examples of glob matching. 1https://en.wikipedia.org/wiki/Glob_(programming) 2https://docs.python.org/3/library/glob.html 25
26 4 Matching Patterns Pattern Text Match? Pattern Text Match? abc “abc” true a*c “abc” true ab “abc” false {a,b} “a” true abc “ab” false {a,b} “c” false * ”“ true {a,b} “ab” false * “abc” true *{x,y} “abcx” true Table 4.1: Pattern-matching cases. 4.1 Simple Patterns Matching is conceptually simple. If the first element of the pattern matches the target string at the current location, we check if the rest of the pattern matches what’s left of the string. If the element doesn’t match the front of the string, or if the rest of the pattern can’t match the rest of the string, matching fails. (This behavior makes globbing different from regular expressions, which can match parts of strings.) This design makes use of the Chain of Responsibility design pattern. Each matcher matches if it can then asks the next matcher in the chain to try to match the remaining text (Figure 4.2). Crucially, objects don’t know how long the chain after them is: they just know whom to ask next. In some cases we only need to know what kind of matching we’re doing: for example, the * pattern matches any characters. In other cases, though, we need some extra information, such as the literal text abc or the two alternatives pdf and txt. We therefore decide to create matching objects that can hold this extra information rather than just writing functions. Our first matcher checks whether a piece of text like abc matches a string. We call this class Lit because a fixed string of characters is sometimes called a literal, and it has a constructor and a match method: class Lit: def __init__(self, chars, rest=None): self.chars = chars self.rest = rest def match(self, text, start=0): end = start + len(self.chars) if text[start:end] != self.chars: return False if self.rest: return self.rest.match(text, end) return end == len(text) chars is the characters to be matched, while rest is responsible for matching the rest of the text. If rest is None, this matcher is the last one in the chain, so it must match to the end of the target string. 2023- * . {pdf,txt} 2023- 01 . txt Literal Any Literal Either Figure 4.2: Matching with Chain of Responsibility.
4.1 Simple Patterns 27 The match method takes the text to be matched as an input along with an optional start parameter that indicates where matching is to start. This parameter has a default value of 0 (meaning “start at the beginning”), but if this Lit follows other matchers, they need to tell it where to start looking. To see if this works, let’s write and run a few tests: def test_literal_match_entire_string(): # /abc/ matches abc assert Lit(abc).match(abc) def test_literal_substring_alone_no_match(): # /ab/ doesn't match abc assert not Lit(ab).match(abc) def test_literal_superstring_no_match(): # /abc/ doesn't match ab assert not Lit(abc).match(ab) Notice that we give tests long, meaningful names to make failure reports from the test runner easier to read. We could go ahead and build some more matchers right away, but as [Petre2016] ex- plains, good programmers build and check the parts of their code that they are least sure of as early as possible to find out if their entire design is going to work or not. We there- fore write a test to make sure that chaining works when one literal matcher is followed by another: def test_literal_followed_by_literal_match(): # /a/+/b/ matches ab assert Lit(a, Lit(b)).match(ab) def test_literal_followed_by_literal_no_match(): # /a/+/b/ doesn't match ac assert not Lit(a, Lit(b)).match(ac) Chaining two literal matchers together is unnecessary: we could (and probably should) write Lit(ab) instead of Lit(a, Lit(b)). However, the fact that these two tests pass reassures us that our design is working. Test-Driven Development Some programmers write the tests for a piece of code before writing the code itself. This practice is called test-driven development, and its advocates claim that it yields better code in less time because (a) writing tests helps people think about what the code should do before they’re committed to a particular implementation and (b) if people write tests first, they’ll actually write tests. However, research shows that the order in which the tests are written doesn’t actually make a difference [Fucci2016]; what actually matters is alternating short bursts of testing and coding. These tests for Lit pass, so we’re ready to move on to wildcards. A * character in our pattern matches zero or more characters, so if there are no more matchers in the chain, then this * matches to the end of the target string and match returns True right away. If there are other matchers, on the other hand, we try matching no characters, one character, two characters, and so on and see if those other matchers can get us to the end of the string if we do so. If none of these possibilities succeeds, the overall match fails (Figure 4.3).
28 4 Matching Patterns a * .txt 0 no .txt a b c a * .txt 1 no .txt a b c a * .txt yes .txt a b c 2 Figure 4.3: How wildcard matching works. class Any: def __init__(self, rest=None): self.rest = rest def match(self, text, start=0): if self.rest is None: return True for i in range(start, len(text)): if self.rest.match(text, i): return True return False Once again we write a few tests before moving on: def test_any_matches_empty(): # /*/ matches assert Any().match() def test_any_matches_entire_string(): # /*/ matches abc assert Any().match(abc) def test_any_matches_as_prefix(): # /*def/ matches abcdef assert Any(Lit(def)).match(abcdef) def test_any_matches_as_suffix(): # /abc*/ matches abcdef assert Lit(abc, Any()).match(abcdef) def test_any_matches_interior(): # /a*c/ matches abc assert Lit(a, Any(Lit(c))).match(abc) Either/or matching works much the same way. If the first alternative matches, we try the rest of the chain. If not, we try the second alternative, and if that doesn’t work either, we fail: class Either: def __init__(self, left, right, rest=None): self.left = left self.right = right self.rest = rest def match(self, text, start=0): return self.left.match(text, start) or self.right.match(text, start)
4.2 Rethinking 29 Our first few tests pass: def test_either_two_literals_first(): # /{a,b}/ matches a assert Either(Lit(a), Lit(b)).match(a) def test_either_two_literals_not_both(): # /{a,b}/ doesn't match ab assert not Either(Lit(a), Lit(b)).match(ab) but further testing uncovers a bug: def test_either_followed_by_literal_match(): # /{a,b}c/ matches ac assert Either(Lit(a), Lit(b), Lit(c)).match(ac) def test_either_followed_by_literal_no_match(): # /{a,b}c/ doesn't match ax assert not Either(Lit(a), Lit(b), Lit(c)).match(ax) ======================= test session starts ======================== test_glob_problem.py F. [100%] ===================== short test summary info ====================== FAILED test_glob_problem.py::test_either_followed_by_literal_match =================== 1 failed, 1 passed in 0.00s ==================== The problem is that Either.match isn’t using rest properly—in fact, it’s not using rest at all because it doesn’t know what to pass it as a starting point. Instead of having match methods return True or False, we need them to return an indication of where the next match should start so that Either can pass that information along to rest. Before making this change, we will clear up a bit of technical debt in our code. 4.2 Rethinking We now have three matchers with the same interfaces. Before we do any further work, we will refactor using a pattern called Extract Parent Class to make the relationship between the matchers clear (Figure 4.4). At the same time, each matcher is checking to see if its rest is None. We can simplify this by creating a class to represent “nothing here”, which is known as the Null Object pattern. We Didn’t Invent This We didn’t invent any of the patterns or refactorings used in this chapter. Instead, we learned them from books like [Gamma1994; Fowler2018; Kerievsky2004]. And as [Tichy2010] showed, learning these patterns makes people better programmers.
30 4 Matching Patterns Lit __init__() match() Any __init__() match() Either __init__() match() Lit __init__() _match() Any __init__() _match() Either __init__() _match() Null __init__() _match() Match __init__() match() calls Figure 4.4: Using the Extract Parent Class refactoring. Our new parent class Match looks like this: class Match: def __init__(self, rest): self.rest = rest if rest is not None else Null() def match(self, text): result = self._match(text, 0) return result == len(text) Match.rest requires every child class to have a helper method called _match that returns the location from which searching is to continue. Match.match checks whether the entire match reaches the end of the target string and returns True or False as appropriate. Our new Null Object class looks like this: class Null(Match): def __init__(self): self.rest = None def _match(self, text, start): return start Null objects must be at the end of the matching chain, i.e., their rest must be None, so we remove the rest parameter from the class’s constructor and pass None up to the parent constructor every time. Since Null objects don’t match anything, Null._match immediately returns whatever starting point it was given. Every other matcher can now pass responsi- bility down the chain without having to test whether it’s the last matcher in line or not. With these changes in place, our literal matcher becomes: class Lit(Match): def __init__(self, chars, rest=None): super().__init__(rest) self.chars = chars def _match(self, text, start): end = start + len(self.chars) if text[start:end] != self.chars: return None return self.rest._match(text, end) Lit’s constructor calls the constructor of its parent class to initialize the things that all classes share, then adds the data specific to this class. It returns None for “no match” or whatever self.rest returns If this object’s rest is an instance of Null, this result will be the index after the overall match.
4.3 Summary 31 As before, the matcher for * checks what happens if it matches an ever-larger part of the target string: class Any(Match): def __init__(self, rest=None): super().__init__(rest) def _match(self, text, start): for i in range(start, len(text) + 1): end = self.rest._match(text, i) if end == len(text): return end return None (The exercises will ask why loop has to run to len(text) + 1.) Finally, the either/or matcher that prompted this refactoring becomes: class Either(Match): def __init__(self, left, right, rest=None): super().__init__(rest) self.left = left self.right = right def _match(self, text, start): for pat in [self.left, self.right]: end = pat._match(text, start) if end is not None: end = self.rest._match(text, end) if end == len(text): return end return None Looping over the left and right alternative saves us from repeating code or introducing a helper method. It also simplifies the handling of more than two options, which we explore in the exercises. Crucially, none of the existing tests change because none of the matching classes’ constructors changed and the signature of the match method (which they now inherit from the generic Match class) stayed the same as well. We should add some tests for Null, but we have now met our original goal, and as the exercises will show we can easily add matchers for other kinds of patterns. 4.3 Summary Figure 4.5 summarizes the key ideas in this chapter; we will see the Null Object and Chain of Responsibility design patterns again.
32 4 Matching Patterns glob pattern text matchers class base class Chain of Responsiblity is a matches written as made up of each implemented as of cooperate using Null Object ends with Figure 4.5: Regular expression matching concept map. 4.4 Exercises Looping Rewrite the matchers so that a top-level object manages a list of matchers, none of which know about any of the others. Is this design simpler or more complicated than the Chain of Responsibility design? Length Plus One Why does the upper bound of the loop in the final version of Any run to len(text) + 1? Find One or More Extend the regular expression matcher to support +, meaning “match one or more charac- ters”. Match Sets of Characters 1. Add a new matching class that matches any character from a set, so that Charset('aeiou') matches any lower-case vowel. 2. Create a matcher that matches a range of characters. For example, Range(a, z) matches any single lower-case Latin alphabetic character. (This is just a convenience matcher: ranges can always be spelled out in full.) 3. Write some tests for your matchers. Exclusion 1. Create a matcher that doesn’t match a specified pattern. For example, Not(Lit(abc)) only succeeds if the text isn’t “abc”. 2. Write some tests for it.
4.4 Exercises 33 Make Repetition More Efficient Rewrite Any so that it does not repeatedly re-match text. Multiple Alternatives 1. Modify Either so that it can match any number of sub-patterns, not just two. 2. Write some tests for it. 3. What does your implementation do when no sub-patterns are specified? Returning Matches Modify the matcher so that it returns the substrings that matched each part of the expres- sion. For example, when *.txt matches name.txt, the library should return some indication that * matched the string name. Alternative Matching The tool we have built implements lazy matching, i.e., the * character matches the shortest string it can that results in the overall pattern matching. Modify the code to do greedy matching instead, and combine it with the solution to the previous exercise for testing.
Taylor Francis Taylor Francis Group http://taylorandfrancis.com
5 Parsing Text • Parsing transforms text that’s easy for people to read into objects that are easy for computers to work with. • A grammar defines the textual patterns that a parser recognizes. • Most parsers tokenize input text and then analyze the tokens. • Most parsers need to implement some form of precedence to prioritize different patterns. • Operations like addition and function call work just like user-defined functions. • Programs can overload built-in operators by defining specially-named methods that are recognized by the compiler or interpreter. Terms defined: abstract syntax tree, concrete class, CSV, grammar, JSON, operator overloading, parser, token, tokenizer, YAML We constructed objects to match patterns in Chapter 4, but an expression like 2023-*{pdf,txt} is a lot easier to read and write than code like Lit(2023-, Any(Either(pdf, txt))). If we want to use the former, we need a parser to con- vert those human-readable strings into machine-comprehensible objects. Table 5.1 shows the grammar our parser will handle. When we are done, our parser should be able to recognize that 2023-*.{pdf,txt} means the literal 2023-, any charac- ters, a literal ., and then either a literal pdf or a literal txt. Please Don’t Write Parsers Languages that are comfortable for people to read and write are usually difficult for computers to understand and vice versa, so we need parsers to translate the former into the latter. However, the world doesn’t need more file formats: please use CSV, JSON, YAML, or something else that already has an acronym rather than inventing something of your own. 5.1 Tokenizing Most parsers are written in two parts (Figure 5.1). The first stage groups characters into atoms of text called “tokens“, which are meaningful pieces of text like the digits making up a number or the letters making up a variable name. Our grammar’s tokens are the special characters ,, {, }, and *. Any sequence of one or more other characters is a single multi- letter token. This classification determines the design of our tokenizer: 35
36 5 Parsing Text Meaning Character Any literal character c c Zero or more characters * Alternatives {x,y} Table 5.1: Glob grammar. 1. If a character is not special, then append it to the current literal (if there is one) or start a new literal (if there isn’t). 2. If a character is special, then close the existing literal (if there is one) and create a token for the special character. Note that the , character closes a literal but doesn’t produce a token. The result of tokenization is a flat list of tokens. The second stage of parsing assembles tokens to create an abstract syntax tree (AST) that represents the structure of what was parsed. We will re-use the classes defined in Chapter 4 for this purpose. Before we start writing our tokenizer, we have to decide whether to implement it as a set of functions or as one or more classes. Based on previous experience, we choose the latter: this tokenizer is simple enough that we’ll only need a handful of functions, but one capable of handling a language like Python would be much larger, and classes are a handy way to group related functions together. The main method of our tokenizer looks like this: def tok(self, text): self._setup() for ch in text: if ch == *: self._add(Any) elif ch == {: self._add(EitherStart) elif ch == ,: self._add(None) elif ch == }: self._add(EitherEnd) elif ch in CHARS: self.current += ch else: raise NotImplementedError(fwhat is '{ch}'?) self._add(None) return self.result This method calls self._setup() at the start so that the tokenizer can be re-used. It doesn’t call self._add() for regular characters; instead, it creates a Lit entry when it 2023-*{pdf,txt} [Lit, 2023-] [Any] [Either, pdf, txt] Lit 2023- Any Either pdf txt Null tokenizer parser Figure 5.1: Stages in parsing pipeline.
5.1 Tokenizing 37 A * { B , C } A*{B,C} current result A [ ] ch [[Lit, A], [Any]] [[Lit, A], [Any], [EitherStart]] B [[Lit, A], [Any], [EitherStart]] [[Lit, A], [Any], [EitherStart], [Lit, B]] C [[Lit, A], [Any], [EitherStart], [Lit, B]] [[Lit, A], [Any], [EitherStart], [Lit, B], [Lit, C], [EitherEnd]] Figure 5.2: Steps in tokenizing a string. encounters a special character (i.e., when the current literal ends) and after all the input has been parsed (to capture the last literal). The method self._add adds the current thing to the list of tokens. As a special case, self._add(None) means “add the literal but nothing else” (Figure 5.2): def _add(self, thing): if len(self.current) 0: self.result.append([Lit, self.current]) self.current = if thing is not None: self.result.append([thing]) Finally, we work backward to initialize the tokenizer when we construct it and to define the set of characters that make up literals: CHARS = set(string.ascii_letters + string.digits) class Tokenizer: def __init__(self): self._setup() def _setup(self): self.result = [] self.current = A Simple Constant The code fragment above defines CHARS to be a set containing ASCII letters and digits. We use a set for speed: if we used a list, Python would have to search through it each time we wanted to check a character, but finding something in a set is much faster. Chapter 17 will explain why the word “ASCII” appears in string library’s definition of characters, but using it and string.digits greatly reduces the chances of us typing abcdeghi...yz rather than abcdefghi...yz. (The fact that it took you a moment to spot the missing letter ‘f’ proves this point.) We can now write a few tests to check that the tokenizer is producing a list of lists in which each sub-list represents a single token:
Random documents with unrelated content Scribd suggests to you:
venomous snake there, but the only viper, except Russell’s viper, a much larger snake. Only twice could I observe the toxic effects of the Echis carinata at present (1882) in the collection; both cases being in hot weather. It has so far conformed to circumstances in England, as to consent to dine on small white mice, failing scorpions. In the first case it struck the mouse savagely as soon as it was dropped into the cage, and the mouse died in less than two minutes. Echis approached it stealthily and timidly, but having at last got courage to seize it, ate it very quickly; and as the snake moved and dragged it, the mouse appeared to be quite stiff in that short time. On the second occasion, it bit a mouse on the leg, and it was five minutes dying. At first only the leg was paralyzed; then a spasm followed, and the mouse fell over and lay extended flat and still as if dead; but presently a spasmodic convulsion followed. It again appeared to be dead, and the little viper approached; but on a very slight spasm receded swiftly, not once taking its eyes off the mouse, which was dying slowly. The viper was at least five minutes swallowing this, and as if it did not much care about it. One must argue, therefore, that the charge of venom had been scantily expended, as the difference between this and the previous victim was remarkable. Echis poison has been seen to take instantaneous effect. The small Vipera atropos from the South African mountains is also astoundingly virulent. One in the collection in 1881 struck a mouse as soon as it arrived, and death occurred in fifty seconds by the watch. A large store of poison must have accumulated during its journey and since its previous meal. One more African snake must be mentioned before I conclude the painful duty of describing the inevitable—though happily short— sufferings inflicted by venomous serpents. Three young Najas, the well-known Ring Halsschlange of South Africa, were brought in the spring of, I think, 1877. They were very black and very shy, and for a long while one could see nothing more of them than three little heads in a row peeping out from under their blanket, and watching with their large round black eyes, but vanishing like a shot at your approach. ‘They cut away the moment you go near them,’ said the keeper. When they did give us an opportunity of looking at them, we found that one was quite black, and another was speckled with white; they erected their heads and distended their necks
defiantly. Their eyes had a white rim round them, and were bright and undeniably beautiful, even though belonging to a venomous snake. Whether because they were young and inexperienced, or naturally stupid, I could not decide but of all the snakes none ever went so awkwardly to work in feeding, or put their victims to such unnecessary torture, as did these ridiculous little Najas. The feeding observations were made in August, when they had grown considerably, and had become accustomed to their home. They seemed to bite the prey anywhere without much effect, sometimes retaining it in their mouth, and at other times beginning at once to eat it. One frog was ten minutes from the time it was struck until it was swallowed, and for no reason beyond the feeder’s awkwardness. The little snake began at a hind leg, and not being able to get the frog into its mouth, put it down and began again at the side, but with no better result, the legs being in the way. Then he gave it up and let the frog go, and presently his comrade struck the half-dead thing and took five minutes to eat it. One might decide from this that frogs were not their natural food; but with very young sparrows the same mismanagement was observable. The bird was awkwardly bitten on the tip of the wing, and the snake held it helplessly for a quarter of an hour while the bird was struggling violently. Not getting good hold, the snake put it down and began again, so that the poor little sparrow was twenty minutes in being swallowed, gasping to the last, and evidently only very feebly poisoned. One of the Najas bit his companion, and held on for about ten minutes, and for no reason whatever that one could discern. In no other venomous snakes have I seen such prolonged suffering caused by such stupidity or bungling as in those young African ‘Ring Hals.’ Their fangs are, however, exceedingly short, as I found on examining a dead one, and this may account for the slow effect of them. Three other heads were often seen in a row peeping out, but belonging to harmless ‘glass snakes,’ and there was intelligence in their looks; for they recognised the keeper, and advanced to the glass whenever he passed, asking for their dinner as plainly as little snakes could ask. A Heterodon exhibited equal intelligence when it was dinner- time, and sprang at the glass when he saw the keeper coming. Some of the pythons display intelligence too, on feeding days, but of quite an epicure form. One day in May 1876, on remarking that the pythons
were disinclined to eat, Holland said ‘they were waiting for young ducks,’ only elderly birds being in their cage at the time. Even in summer they don’t eat the old ducks so eagerly, because the large, hard quills annoy them. A bunch of these quills passes undigested. Hair or feathers in a desiccated mass pass through the snakes, and occasionally, when they are not in health, digestible but undigested substances too, also the beaks of the ducks. Vegetable substances have been found in snakes, from which it has been argued that they sometimes eat vegetables. But it rather argues that they don’t digest vegetables, which have probably been swallowed in the stomach of a rabbit or some other herbivorous animal that they have caught. An indifference to food was noticeable in the snakes in ungenial weather. One cold, raw, foggy day in October 1873, a python caught a duck and partially coiled it, but so feebly that the bird, after passively submitting for a time, at last disengaged her feet and walked away to shake herself, and then turn and stare as if to discover what possibly had kept her there. A similar disinclination to exert themselves was seen that same chilly day in the largest cage, where were three large pythons. One of them having killed a duck, could not get a satisfactory hold of its head, and let go repeatedly. Another held a duck, but not to crush it or hurt it; for it, like the one above named, only gazed deliberately around, and as if asking the meaning of its detention. A third duck was put into the den for the third python, who, however, only lazily stared at it and made no attempt to seize it; while the bird gazed in astonishment at the one in the embrace of the other snake, as if to inquire, ‘What are you doing there?’ Presently this duck also got away, and was again caught and only partially coiled. The python seemed too large and fat to constrict so small a thing as a duck. It was like tying up a pill-box with a rope. Some of the spectators expressed satisfaction that the duck was not more tightly coiled, and hoped it would succeed in getting away (the duck was not worth two shillings, the python could not be bought for twenty pounds), and were far more horrified when a vigorous constrictor caught and killed its prey in one flash, as when an extended watch-spring flies back to its original position. But a half- constricted creature does suffer, and happily this does not often occur,
the chilly weather that day diminishing ophidian energy considerably. A gentleman, disappointed because they did not eat, and wishing to assign some reason for such unaccountable abstinence, remarked to his friend, ‘I have an idea they sting themselves.’ Watching these gigantic ophidians on one of those half-wintry days, it happened that two of them were lazily gliding, partially hidden by their blanket, and with neither heads nor tails visible, so that the two bodies seemed as only one snake. Two youths stood watching and vainly endeavouring to calculate the numbers of feet or of yards which were entwined and entwisted in those moving coils. Portions and loops of two other pythons in the same cage were visible beyond the rug, but only one head of all the four snakes was to be seen; and to distinguish to which of the gliding, shining curves that head belonged, was impossible. ‘It seems to me that snake’s such a length that he doesn’t know the other part belongs to him,’ remarked one boy to his friend. ‘I don’t think he knows where it is,’ returned the other boy sympathetically. Not a little are the keepers sometimes tried in replying to the inquiries of visitors desirous of improving their minds. Let me repeat one or two conversations overheard on those Fridays. ‘Is that duck put in there for the snake to eat?’ asked a respectably dressed man of the keeper on one of those autumnal days, when a duck sat pluming itself as if settling itself for the evening. ‘Yes, sir,’ replied the keeper. ‘Will he swallow it whole?’ ‘Yes, sir.’ ‘Choke him! I should think?’ ‘No, sir; no—it won’t choke him.’ The man studied the duck, and studied the size of the python’s head and throat for some time. The duck apparently going to rest, but not quite reconciled at so many persons intruding upon her, the man looked disappointed, and again began: ‘Now is that duck charmed, sitting there?’ ‘I should think, sir, she was not at all charmed with the prospect,’ sedately replied Holland.
‘Does that duck know it’s going to be eaten?’ then inquired the man after fresh scrutiny. ‘No, sir,’ returned the keeper with the utmost gravity. ‘That snake don’t seem to be hungry,’ then said the disappointed observer. ‘No, sir. He’ll eat well enough next Friday. He’s going to change his skin.’ ‘Oh!’ said the man to a boy by his side, satisfied, though still rather puzzled, ‘that snake’s going to change his skin next Friday.’ Though there are always on an average fifty snakes in the Reptile House, and on an average each casts its coat three times a year, the visitors are for the most part much mystified about this phenomenon. A snake that had just completed a new toilet had a portion of the old slough still adhering to its tail, when a boy drew attention to it, saying, ‘Papa, that snake is all ragged and torn on its tail.’ ‘Yes, my dear, it is casting its tail.’ Papa must have been reading Aristotle, who wrote: ‘Tails, also, of serpents and lizards when cut off are reproduced.’ With regard to the reproduction of their eyes, Aristotle spake more cautiously. ‘It is reported that the eyes of serpents, if dug out, will be reproduced.’ But, on the contrary, the eyes of snakes are easily injured, and not easily healed; snakes are therefore frequently seen partially blind. As need scarcely be said, only lizards ‘reproduce’ a tail that has been accidentally abridged; and the repair is after all only a boneless one. The truncated member gradually heals, and by and by a short point is again formed, but can always be recognised as a repaired, and not the original, tail; and as far as I have been able to observe, viz. for three or four months, no bone was reproduced. Probably also a snake’s tail might heal in the same way, and to a casual observer appear quite perfect; but the anatomical structure in either case would not, I imagine, be restored. That boy was not far wrong when he said he thought the python did not know which was its own tail. At all events, it is not endowed with much external sensation, as one might judge by the way in which the rats and guinea-pigs take liberties with it. This must be owing to the thickness of the cuticle, because, as we have seen in the constricting snakes, there is keen muscular sensibility in the tail. I may cite an
instance of each case. One day a young rabbit caught hold of a small python with its teeth and held firmly on. The reptile was moving across the cage, and did not appear to feel any hindrance. Indeed, being much the stronger of the two, the persistent bunny was compelled to hop along at the same pace, still holding on by its teeth. But presently, from the position of the snake, the rabbit was obliged to let go, when it next caught hold of the tip of the python’s tail, and again holding tight, hopped after the retreating reptile as if enjoying the joke. In this case I do not think the snake was conscious of the insult, as perhaps the rabbit had hold of the skin only. On the other occasion a guinea-pig was biting a coiled and passive constrictor, Python sebœ. The snake wished to be quiet, but piggy got among its coils and worried it, hopping over it and biting its tail. The python on this, moving only the end of its tail, pushed away the guinea-pig, which soon returned to the attack. The snake again gave the little animal a caudal hint that his fidgeting was annoying; but as the guinea-pig did not take the hint, and still nibbled and teased the snake, the latter with two coils of the tail put an end to the annoyance, not once turning its head, but just tucking up its persecutor in the end of its tail as you might tuck up a parcel under your arm. The python was not hungry, and took no more notice of the offender, though thus effectually punishing the offence with the last two feet of its practical tail. Could we suppose such a quality as muscular intelligence, we might think the tails of those constricting snakes were surely endowed with it. As in other instances already described in chaps. xi. and xii., the eyes took no part in directing the movements of the snake; the whole nine or ten feet of the animal remaining passively coiled, while only the extremity of the tail exerted itself. When reptiles are in a partially torpid condition, their sensations are slow; when hibernating, they are reduced to a minimum. At such times, the creatures being half dead, they may be maimed or injured without any apparent effect. Rats have been known to attack and nibble snakes under these circumstances, and even to eat bits out of them, the snakes being at the time unconscious of injury, though possibly dying from the after effects. A good deal of very interesting matter might be added on the economics of the reptilian ménage, the mode of ventilating and warming it, the cost of its larder, and the best means of preserving the
health of the inmates. There are, besides, some incidental experiences not devoid of sensationalism in connection with snake guardianship, but my own herpetological experience does not extend beyond the keeping of pet lizards, including blindworms. I may add a word, however, in reply to some often-heard lamentations of disappointed spectators who object to the coverlets, after sometimes waiting in vain to see the snakes emerge from beneath them. ‘Those horrid blankets! Why not give the snakes moss or hay in their cages? or turf and sand and dead leaves? Much more natural for them than those woollen rugs.’ I, too, may have echoed such plaints until a better comprehension of ophidian nature showed the wisdom of what is certainly a somewhat disappointing arrangement. And those who have honoured these pages with a patient perusal, and discovered the nervous timidity and sensitiveness of these reptiles, their proneness to reject or to disgorge their food, to injure themselves or each other when molested, not to mention the danger of meddling with the venomous kinds and the easy escape of the swifter snakes, will admit the importance of providing them with such retreat and shelter as can be most speedily arranged, and which will secure the least annoyance to the terrified serpents while the keepers are doing their best to preserve order and cleanliness. The allusion to lizards tempts me to add a word or two on the exceptional species which has lately become an inmate of our Zoological Gardens. There are certain features in it so much in common with viperine snakes, that I may be pardoned for dragging a lizard into these pages. I allude to the Heloderm (Heloderma horridum) from Mexico, presented to the Zoological Society in July 1882 by Sir John Lubbock. Its advent was an event in reptilian annals; and being surrounded by a halo of curiosity, it claims a passing notice. We have been at some pains to exonerate saurians from the evil character which our ancestors were apt to give them; but suddenly—and to the surprise of even some herpetological authorities—there comes a lizard that with one grip of his jaws caused a frog to fall dead in a moment, and a guinea-pig in three minutes, the symptoms appearing to be the same as those produced by deadly snakes. The Heloderm is ‘said’ to be furnished with poison glands in both jaws! But until a dead specimen
has been further examined and described, the signification of ‘poison gland’ must be restricted. Its teeth—many and strong—are grooved with a deep furrow; its salivary glands are largely developed; and under excitement a thick, acrid secretion flows abundantly from its jaws. Yet so far as present observations enable us to form an opinion, the reptile does not use these formidable teeth to secure its prey, or even in feeding. It did not devour the victims of its bite, nor has it since killed any creature for the express purpose of eating it. Up to the date at which I write (Oct. 1882), eggs have formed its chief diet, varied by an occasional dead mouse. Now it certainly does not require deeply- grooved teeth and venomous saliva to bite raw eggs and dead mice. Nor does the noxious secretion flow continuously from its gums in repose, but abundantly so when irritated. Though a stranger in England, this lizard was known more than two hundred years ago. Hernandez, in his Nova Animalium Mexicanum, published at Rome in 1651, described its bite as ‘hurtful, but not deadly;’ and that it was ‘more dreadful in appearance than reality.’ Its Mexican name, Acaltetepon, is (or was then) applied to all large and suspicious-looking lizards. Scorpione is its modern name. As Heloderma horridum was awarded plenty of space in the journals at the time of its arrival, full accounts of it will be found elsewhere; it is introduced here merely as one of the venomous reptiles that form the chief subjects of this chapter, and to trace its analogy with them. In its slow, stealthy movements there is the same striking contrast between the Heloderm and most other lizards, that there is between the deadly vipers and the active colubrine snakes; and the inquiry suggests itself, Can the venom elaborated in their system so act upon themselves as to produce this habitual lethargy? Drowsiness and coma are almost invariable effects of snake venom in the blood, and why is it that the deadly serpents are so constitutionally different from others? The Heloderm has a round, heavy tail, of no service to it in swimming, and short, weak fingers, ill suited to climbing; and it passes its lethargic existence on the sandy plains of Mexico, manifesting in its actions, or rather in its inactivity and stealthiness, a conscious timidity and cowardice. Motionless for hours, with an impulse to retreat if molested, but attempting to bite if angered, its noxious saliva would seem to be rather protective than aggressive. It may have formidable enemies at home; and by all we see
of it here, it does not use its teeth as a means of obtaining food. In this respect, therefore, it is an exception to deadly serpents, and cannot take its venom into its stomach as they do. And, again, the remarkable development of its tongue suggests a peculiarity of food. In lapping the egg, the action of it is apparently perfected by practice; the tongue is twisted, extended, twined under, then over, now used as a shovel, a scoop, or a broom, as occasion requires. It is the very reverse of what I noticed in some other lizards feebly lapping up an egg (see p. 71), for in a most expeditious manner does Heloderm cause its raw eggs to disappear. A word à propos of its name horridum, supposed by many to refer to its objectionable qualities. Unfortunately the word ‘horrid’ has almost entirely lost its original signification and become mere slang in English. But when Wiegmann assigned it the name of Heloderma horridum in 1829, ‘horrid’ was understood according to its original meaning, from horridus, rough, rugged, etc.; and as this reptile has a remarkable skin, dotted over with little prominences, like knobs or warts (hence its generic name, Heloderma, warty skin), there can be but little doubt as to the intention of horridum. In a communication to Knowledge (Sept. 29), I ventured to call this the ‘Warty-skinned Lizard,’ in consequence of the confused accounts of it which have appeared in print. There are several other warty-skinned or ‘tuberculous lizards.’ The specific horridus, as applied to the South American Crotalus, also signified its terrible or dreadful character, and not the ‘horrid’ which spectators apply indiscriminately to snakes and their blankets. With the rapid advance of ophiology comes the splendid new home for snakes which will shortly grace our Zoological Gardens; and in taking leave of my readers, I cannot offer them a kinder wish than that their visits there to observe the snakes will be productive to them of as much pleasure as has been mine to describe them.
INDEX. LIST OF ABBREVIATIONS. Am., American; Ass., Association; Br., British; co., cobra; Con., Convention; cons., constrictor; cro., crotalus; cy., cyclopædia; C.M.Z.S., Corresponding Member of the Zoological Society; F.Z., Fellow of the Zoological Society; hist., history; nat., natural; N.Y., New York; py., python; Soc., Society; s., snake; ss., snakes; ser., serpent; v., viper; vs., vipers; U.S., United States; z., zoological; Z.G., Zoological Gardens; Trans., Transactions; Proc., Proceedings. A ‘Aberfoyle’ (the barque), sea-ser. seen from, 249. Abnormal development, of teeth, 67; of sea-sers., 265, et seq.; of hair, 302; two heads, 189. Abnormal, health in captivity, 84, 440, 450, 457, 565; gestation, 437, 439, et seq., 459, 466, 505. Academy of Sciences, Paris, 444. Acaltetepon, the, 590. ‘Account, of the Beasts in Virginia’, 280; of the rattlesnake, 275. Acrobats, 181, 196, 198, 214, et seq., 219, 239, 472, et seq. Adaptation, of organization, 70, 135;
of habits to temperature, 159, et seq. (see hibernation); of coils, 200, 204; of ribs in progression, 207; of form, 215. Adaptive development, of head bones, 31, 34; of spine bones, 70; of windpipe, 132; of salivary glands, 342, 350, 537, 557; of teeth, 348, 350, 364, 408. Adipose tissue, 394. Admiralty, the, report of a sea-ser. to, 255, 259. Advance of Ophiology, 3, 21, 75, 81, 191, 273, 353, 363, 372, 443, 485, 515, et seq. Ælian, an error traced to, 191. Æsculapius, his remedies, 522. Africa, range of sea-s., 236. Air-bladder, the, 262. Air-breathers, 44, 146, 222, 265. Air-tube, 132, 135, 137, et seq. (see glottis, respiration). ‘Albert Nyanza’, the, 358. Alceste, Voyage of the, 111, et seq. Alcohol a popular remedy, 548, et seq. Aldrovandus, 102; his work, 272. Alexander the Great, 407 (see Bucephalus). Alligators, 43, 261. Amazon, the Jararaca, 421; Wallace’s ‘Travels in’, ib.; anaconda from, 455. ‘American Agriculturist’, the, 485. American Ass. for the Advancement of Science, 485, et seq., 572; secretary to, 490. American colonies, the, 284. ‘American Naturalist’, the, 93, 151, 308, 310, 450, 485, et seq., 490;
editor of, 485 (see Putnam). American sea-ser., 252. Ammonia an approved remedy, 533, 547. Amphibia, the, 46. Amsterdam, py. at, 437. Anatomy, of head, 30; of jaws, 31, 34; of the spine, 209; of the rattlesnake, 275. ‘Anatomy of the Vertebrates’, 67, 119, 131, 143, 147, 180, 184, 196, et seq., 212, 336 (see Owen). ‘Anecdotes of Serpents’, 216, 523. ‘Animal Biography’, 134. Animal kingdom, 51. ‘Animal Physiology’, 147, 210 (see Carpenter). ‘Animal Physiology’ (Roget’s), 120, 195 (see Roget). ‘Animal World’, the, 169. ‘Animalium Mexicanum’, 190, 590. ‘Annales des Sciences Naturelle’, 78, 91, 442, 444. Antennæ of insects, 118, 120, 126, et seq. Antidotes, 275; Fayrer’s definition of, 533; various reputed ones, ib., 534, 539, 547, et seq. Apodal, ss. are, 206. Appendages, mythical, 101; caudal, 172, et seq.; epidermal, 315; ‘horns’, 320; crest, 325; tentacular, 326; as ‘claws’, ib. (see rattle, epidermis). Aquatic ss., 53, 145, 150, 174, 221, 225, et seq., 233, et seq., 401, 423, 453, 495, 496. Arabs chew herbs, 525.
Archeopteryx, 44. Aristotle, his name for reptiles, 45; bite of s., 96; marine ss., 244; vs., 432, et seq. Arius, the, 489. Armadillo, 413. Association, Am., 485, et seq. Association, Br., 42. Atlantic, the, 249; sea-ss. in, 238; sea-ser., 252. Atrophied limbs, 326. Atter, ætter, 479. ‘Aunt Judy’s Magazine’, 21, 72, 303, 312, 333, 470. Aural: powers of anguis fragilis, 476; apparatus of ground ss., 527. Australia, s.-hunting in, 167; sea-s., 236, 246. ‘Australia, Snakes of’ (see Krefft). Axolotl, 44. B Bacon, Lord, quoted, 49, 57; a poor naturalist, 99. Bagrus, the, 489. Baird of U.S. on the bull-s., 502. Baker, Owen, a sea-ser. seen by, 257, et seq. Baker, Sir Samuel: v. fangs, 357. Balance of Nature, 17, 57. Balfour’s ‘British India’, 74, 76, 86. Balfour’s ‘Cyclopædia of India’, 86. Bancroft, H. H., ser.-worship, 514.
Banks, Sir Joseph, action of ribs, 207; letter to Waterton, 418. Bard of Avon, the, 97. Bartlett, Mr. A. D., on the sea-ser., 255, 261, et seq.; presenting a v., 322; buying an anaconda, 455. Barton, Benjamin Smith, on the Cro., 299. Bartram (Mr., of U.S.): a ‘roaring’ s., 155; ‘cluster of teeth’, 371. Bates (H. W.): a cannibal s., 39; Jararaca, 421. Batrachia, the, 51. Battues of ss., 286, et seq., 289. Beal (J. W., of U.S.): sound of the rattle, 309. Beaufort, Duke of (A.D. 1709), 173. Beaumont and Fletcher, quotation from, 487. Beauvoir, Palisot de, his testimony, 488. Bell, Prof. Thomas, 4; food of ring-s., 74; his tame s., 76; his ‘British Reptiles’, ib.; a trustworthy herpetologist, 78; ss. drink, ib., 91; editor ‘Zoological Journal’, 140; on anguis fragilis, 453; gestation of anguis fragilis, 466; sloughing of anguis fragilis, 481; the maternal refuge, 493. Bellowing s., 155, et seq. Ben Jonson, 99. Bengal, Bay of, range of sea-ss., 236. Berkeley; rattlesnake remedies, 538. Beverley, Colonel: a stinging tail, 174; his ‘History of Virginia’, 281; a severed head biting, 390.
Bezoar, 523. Bibron, 80 (see Duméril). Bingley, a boa feeding, 134. ‘Blackie’, 458, et seq. Bladder, the swim-, 145. Blake (Colonel, of Virginia), a rattlesnake, 284. Blanket, swallowed, 35; disappointing, 588. Blowing: the, vi. 152; as ‘puffing’, 148; like a bull, 155. Bluets, 65. Bond, Rev. H., his testimony, 491. Bones: of the head, 30; intermaxillary, 31; of spine, 178, 198, 213; in Deirodon, 67; of tail, 183; of ‘claw’, 220 (see jaw, vertebræ, tails, etc.). Bonnat, 228, 383. Borneo, Crotalidæ in, 386. Boston, U.S., sea-ser. off, 251. Buffon, 197; his era, 383. Bourbon, Isle of, py. incubating in, 444. Bournemouth, lizards at, 164; a capture, 458. Bourrelets, of the rattle, 304. Bowerbank, a two-headed s., 190. Braden, J. G., Esq.: specimens of rattle, 306. Brazil, tree s. in, 219; ‘Discourse’ of, 271; names of Crotalidæ, 277; specimens from, 339, 360;
Jararaca in, 396; ‘Travels’ in, 397; experiments in, 406; vernaculars of, 419, 423, et seq. Breathing: irregular, 142, et seq.; sometimes partial, 144; suspension of, 145, 161, 168, 253; as ‘puffing’, 149, 155 (see hibernation, respiration). Bridgewater Treatise, 120. British Ass., 42. British Guiana, Hist. of, 420. British India (see Günther, Fayrer, Balfour, Nicholson, etc.). British Museum, 19, 131, 291, 377, 405, 418; Dr. Günther’s catalogue of ss. at, 354. ‘British Reptiles’ (Bell’s), 76, 140, 453 (see Bell). Brittain, Mr., evidence of, 504. Broderip, Dr. J., a naturalist, 113; his works, ib.; on the larynx, 135; quoted by Gosse, 112; observations by, 134, et seq. Browne, Sir Thomas: ‘Vulgar Errours’, 171; a two-headed s., 191; the maternal refuge, 487. Bruton, Dr. Lauder, 552. Buckland, Frank: his visit to Chelsea, 13; a ‘flannel sausage’, 36; the Coronella, 83, 435; a waggish s., 104; a mistake of, 115; sensitiveness of tail, 183; a sea-s., 238; a ribbon fish, 250; sea-ser., 255, et seq., 264; two-headed s., 189; reserve fangs, 346; s. eggs, 437; anaconda, 455;
invited, 464; obituary notice of ‘Cleo’, 515. Bull -frog, 156. Bullen, George, Esq., of Br. Museum, 19. Bulletin, U.S., Zoological ‘Reports’, 291, 309, 388. Burman coast, sea-ss. range, 238. Burrowing ss., 46, 53, 188, 459, 468. C Caledonia, New, sea-ss. at, 238. Cannibalism, 37, 182, 199, 401, 562; sometimes unintentional, 38, 572; common among the Elapidæ, 567, 571. Cañons, ss. in, 162. Cantor, Dr. Theodore, sea-ss. drinking, 82; quotes Schlegel, 90; tongue of sea-ss., 125; sea-ss. asleep, 169; on pelagic sers., 235; their poison, 243; their food, 245; number of species, 246. Cape, the, sea ser. at, 252, 259. Capybara, the, 229. Carbolic acid, kills ss., 544; a remedy for the venom, ib. Carinate scales, 319 (see ‘keel’). ‘Carolina, History of’ (see Lawson). ‘Carolina, Natural History of’ (see Catesby). Carpenter, Dr., F.R.S., etc., links and transitions in nature, 44; his opinion of Bacon, 99; lungs of ss., 143; hibernation, 165; length of spine, 210; insalivation of food, 352;
vegetable protectives, 539. Carver, Jonathan, a ‘blowing’ s., 153; large swarms of ss., 225; the maternal refuge, 487. Catesby, Mark: an egg robber, 63; the ‘blowing v.’, 152; ‘Horn ss.’, 174, 189; ‘water vs.’, 226; rattle-ss., 284; ‘hog-nosed’ ss., 410; Indian remedies, 538. Catlin, George, rattlesnake battues, 287; an alarm, 391; conjugal ss., 502; Indian superstitions, 509. Cats, their whiskers as feelers, 124; tenacious of life, 559; resist venom, ib. Caudal, eloquence, 155, 179, 311, 587; appendages, 170, et seq. 174 (see tail, rattle, etc.). Caverns, the retreat of ss., 287, 443. Caves, sacred, 509; abode of ss., 124, 162, 166, 288. Centipedes, legs of, 212. ‘Ceylon, History of’ (see Tennant). Chalk epoch, 262. ‘Challenger’, voyage of the, 262. Chambers: ‘Anecdotes of Serpents’, 216, 523; editor of the ‘Journal’, 23; observations at the Z.G., 217; the ‘Miscellany’, 523. Chancery, ss. in, 13, 515. Chandernagor, travels in, 444. Charas, Moyse: his work, 273; he ‘grovels’ for fangs, 359, 372; experiments on vs., 371;
a ‘nursery’ of fangs, ib.; knew of the mobility of fangs, 372. Charming, Sir H. Sloane on, 281; its origin, 515, et seq., 578, 585 (see fascination). Charms, 281; in s. relics, etc., 509, et seq. Chase, a, with a s., 214. Chased by a s., 185. Chateaubriand’s descriptions of ss., 153, 175, 197, 307; the maternal asylum, 487. Chelonia, 51. Chelsea, tame ss. at, 13, 27, 515, 525. Chicago, observations at, 496. Chimaphila, 65. Chinese mythologies, 509. Chittagong, rare beast from, 261. Chordæ vocales, 147. Circulation of blood, 56; checked by cold, 161; renewed by warmth, ib.; moisture essential to it, 162 (see hibernation, respiration, etc.). ‘City of Baltimore’, the, sea sers. seen from, 249. Clarke’s translation of Der Hoeven’s ‘Handbook of Zoology’, 118. Classification, 50; at present defective, 51; five principal groups of ss., 53; by dentition unsatisfactory, 354, et seq.; difficulties occurring in, 413, 421; Krefft on, 423 (see nomenclature). Clayton, Mr. J., the ‘tayle’ of rattlesnakes, 280. Climbing, 180, et seq., 196, 214, et seq., 230; of sea-ss., 238; of Anguis fragilis, 475, 482. Cockburn versus Mann, 13.
Coiling, the, 48; in constriction, 29, 203; of tail, 182, 587; in convolutions, 185; for a spring, 198; to substitute hands, 199, et seq., 206; swiftness of, 200; flexibility of, 218; of the sea-ser., 257; of Heterodon, 409, 570; in repose, 447, 587; of ‘Lizzie’, 472, 478; of Natrix, 569; of the Lacertines, 571 (see constriction); before striking, 573. Cold; ss. affected by, 143, 159, 165, 584, et seq. Cold-blooded, 56; why so, 142, 146, 159 (see hibernation, etc.). Colours of ss., 10; under excitement, 153, 572; of tree ss., 219, 386; of rattlesnakes, 270, 285; of African vs., 321, 338, et seq.; of ‘Bushmaster’, 417, et seq., 426; variable in young ss., 424; in other ss., ib.; after moulting, 508 (see sloughing). Combats between ss., 37, 199, 563. Congress (U.S.), Government Commissions, 199, 376. Constriction, 29, 199, 203, 214, 245; of young boas, 216, 439, 446; sometimes feeble, 583. Constrictors, 14, 35, 38, 135, 141, 182, 198, et seq., 213, et seq., 258, 336, 438, 446, 454, 583. Convention, a, of ss., 104; U.S. on ss., 485, 505, 552, 572. Cooke, M. C., ‘Our Reptiles’, 491; editor ‘Science Gossip’, ib.; a herpetologist, ib.
Cooper, W. R., ‘Serpent Myths’, 514. Cope, Professor, of U.S., 386. Cotton, Dr., of Tennessee, his rattlesnake, 298. Coues, Dr. Elliott, of U.S., a combat, 199; sound of the rattle, 309; development of rattle, 312; action of fangs, 362; cro. fang, 376, et seq.; species of cro., 386; virulence of bite, 390; pigs not exempt, 394. Council of Z. Soc., 78, 322. Cows sucked by ss., 84, 478. Coypu, 229. Cragin, Mr. F. W., on Heterodon, 450. Cranberry swamps, the Massasauga, 393. Crepitaculum caude, the, 294. Crispe, Dr. Edwards, F.Z.S., on the œsophagus, 490; the maternal refuge, ib. Crocodilia, the, 51, 261. Crotalina, 292. Crotalum, 292. Crotchets mobiles, 353, 362. Cruden, s. poison, 102. Cruelty, 17, 169, 206, 321, 469. ‘Curiosities of Natural History’ (see Buckland). Cuvier; his classification, 46; his ‘boa’, 47; he distinguished fangs, ib.; his era, 90, 383; quoted by Darwin, 176; his ‘vs.’, 353; incubation, 434, 456. ‘Cyclopædia of Anatomy’, 118
‘Cyclopædia of India’, 86. ‘Cyclopædia, the Penny’, 113, 120. D Dahomey, ser. deities at, 514; natives fearless of vs., 523. Dalton, the ‘Bushmaster’, 420. Danish vernaculars, 479. Darwin; complex organisms, 44; cro. mutus, 176; a living fossil, 244; the Platypus, 263; survival of the fittest, 267; on the rattle, 311; atrophied limbs, 326, 452. Daudin, 353, 383, 488. Davidson, Mr. (R. N.), a sea-ser., 251. Davis’ Lectures at the Z. G., 24, 51, 214, 413 (see Flower, Huxley, Mivart). Dean, Dr., a sea-ser. seen by, 249. ‘Deccan Days’, 517 (see Frere, Miss). Deer kill rattlesnakes, 394. Deglutition; manner of, 30, 111, et seq., 132; facilitated by saliva, 35, 109, 113; in drinking, 92, et seq.; Schlegel on, 362; watched at the Z. G., 581, et seq. De Kay; ‘Zoology of N.Y.’, 85. Demerara, ‘Bushmaster’ there, 423. Dentition, 34; of Deirodon, 67, et seq.; of sea-ss., 241; of sea-ser., 266; various forms of, 342, et seq.; distinguishing names in, 347; illus. of, 349, 355, 356, 360;
Welcome to our website – the perfect destination for book lovers and knowledge seekers. We believe that every book holds a new world, offering opportunities for learning, discovery, and personal growth. That’s why we are dedicated to bringing you a diverse collection of books, ranging from classic literature and specialized publications to self-development guides and children's books. More than just a book-buying platform, we strive to be a bridge connecting you with timeless cultural and intellectual values. With an elegant, user-friendly interface and a smart search system, you can quickly find the books that best suit your interests. Additionally, our special promotions and home delivery services help you save time and fully enjoy the joy of reading. Join us on a journey of knowledge exploration, passion nurturing, and personal growth every day! ebookbell.com

Software Design By Example A Toolbased Introduction With Python Greg Wilson

  • 1.
    Software Design ByExample A Toolbased Introduction With Python Greg Wilson download https://ebookbell.com/product/software-design-by-example-a- toolbased-introduction-with-python-greg-wilson-55974684 Explore and download more ebooks at ebookbell.com
  • 2.
    Here are somerecommended products that we believe you will be interested in. You can click the link to download. Software Design By Example A Toolbased Introduction With Python Greg Wilson https://ebookbell.com/product/software-design-by-example-a-toolbased- introduction-with-python-greg-wilson-55959768 Upnp Design By Example A Software Developers Guide To Universal Plug And Play Michael Jeronimo https://ebookbell.com/product/upnp-design-by-example-a-software- developers-guide-to-universal-plug-and-play-michael-jeronimo-2184566 Software Design By Example Greg Wilson https://ebookbell.com/product/software-design-by-example-greg- wilson-47254836 Practical Design Patterns For Java Developers Hone Your Software Design Skills By Implementing Popular Design Patterns In Java 1st Edition Miroslav Wengner https://ebookbell.com/product/practical-design-patterns-for-java- developers-hone-your-software-design-skills-by-implementing-popular- design-patterns-in-java-1st-edition-miroslav-wengner-49111936
  • 3.
    Handson Design PatternsWith C And Net Core Write Clean And Maintainable Code By Using Reusable Solutions To Common Software Design Problems 1st Edition Gaurav Aroraa https://ebookbell.com/product/handson-design-patterns-with-c-and-net- core-write-clean-and-maintainable-code-by-using-reusable-solutions-to- common-software-design-problems-1st-edition-gaurav-aroraa-53022704 Addison Wesley Domaindriven Design Tackling Complexity In The Heart Of Software By Eric By Eric Evans https://ebookbell.com/product/addison-wesley-domaindriven-design- tackling-complexity-in-the-heart-of-software-by-eric-by-eric- evans-36065612 Handson Domaindriven Design With Net Core Tackling Complexity In The Heart Of Software By Putting Ddd Principles Into Practice Alexey Zimarev https://ebookbell.com/product/handson-domaindriven-design-with-net- core-tackling-complexity-in-the-heart-of-software-by-putting-ddd- principles-into-practice-alexey-zimarev-11074038 Software Design https://ebookbell.com/product/software-design-58948182 Software Design Patterns The Ultimate Guide 1st Edition Sufyan Bin Uzay https://ebookbell.com/product/software-design-patterns-the-ultimate- guide-1st-edition-sufyan-bin-uzay-47260458
  • 6.
    Software Design by Example Thebest way to learn design in any field is to study examples, and some of the best examples of software design come from the tools programmers use in their own work. Software De- sign by Example: A Tool-Based Introduction with Python therefore builds small versions of the things programmers use in order to demystify them and give some insights into how ex- perienced programmers think. From a file backup system and a testing framework to a regular expression matcher, a browser layout engine, and a very small compiler, we explore common design patterns, show how making code easier to test also makes it easier to re-use, and help readers understand how debuggers, profilers, package managers, and version control systems work so that they can use them more effectively. This material can be used for self-paced study, in an undergraduate course on software design, or as the core of an intensive weeklong workshop for working programmers. Each chapter has a set of exercises ranging in size and difficulty from half a dozen lines to a full day’s work. Read- ers should be familiar with the basics of modern Python, but the more advanced features of the language are explained and illustrated as they are introduced. All the written material in this project can be freely re-used under the terms of the Creative Commons - Attribution license, while all of the software is made available under the terms of the Hippocratic License. All proceeds from the sale of this book will go to support the Red Door Family Shelter in Toronto. Features: •   Teaches software design by showing programmers how to build the tools they use every day •   Each chapter includes exercises to help readers check and deepen their understanding •   All the example code can be downloaded, re-used, and modified under an open license Dr. Greg Wilson is a programmer, author, and educator based in Toronto. He co-founded and was the first Executive Director of Software Carpentry, which has taught basic software skills to tens of thousands of researchers worldwide, and he has authored or edited over a dozen books (including two for children). Greg is a member of the Python Software Foundation and a recipi- ent of ACM SIGSOFT’s Influential Educator of the Year Award.
  • 7.
    Taylor Francis Taylor Francis Group http://taylorandfrancis.com
  • 8.
    Software Design by Example ATool-Based Introduction with Python Greg Wilson
  • 9.
    First edition published2024 by CRC Press 2385 NW Executive Center Drive, Suite 320, Boca Raton FL 33431 and by CRC Press 4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN CRC Press is an imprint of Taylor Francis Group, LLC © 2024 Greg Wilson Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot as- sume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint. Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including pho- tocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for iden- tification and explanation without intent to infringe. ISBN: 978-1-032-72525-3 (hbk) ISBN: 978-1-032-72521-5 (pbk) ISBN: 978-1-032-72523-9 (ebk) DOI: 10.1201/9781032725239 Typeset in Arial by KnowledgeWorks Global Ltd. Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
  • 10.
    Dedication This one’s forMike and Jon: I’m glad you always found time to chat.
  • 11.
    Taylor Francis Taylor Francis Group http://taylorandfrancis.com
  • 12.
    Contents 1 Introduction 1 1.1Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 The Big Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Formatting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.4 Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.5 What People Are Saying . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.6 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Objects and Classes 7 2.1 Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2 Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.3 Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.4 Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3 Finding Duplicate Files 17 3.1 Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.2 Hashing Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.3 Better Hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 4 Matching Patterns 25 4.1 Simple Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 4.2 Rethinking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 4.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 5 Parsing Text 35 5.1 Tokenizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 5.2 Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 5.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 6 Running Tests 43 6.1 Storing and Running Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 6.2 Finding Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 6.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 6.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 vii
  • 13.
    viii Contents 7 AnInterpreter 51 7.1 Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 7.2 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 7.3 Introspection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 7.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 8 Functions and Closures 59 8.1 Definition and Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 8.2 Calling Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 8.3 Closures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 8.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 8.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 9 Protocols 69 9.1 Mock Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 9.2 Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 9.3 Decorators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 9.4 Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 9.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 9.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 10 A File Archiver 81 10.1 Saving Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 10.2 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 10.3 Tracking Backups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 10.4 Refactoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 10.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 10.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 11 An HTML Validator 91 11.1 HTML and the DOM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 11.2 The Visitor Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 11.3 Checking Style . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 11.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 11.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 12 A Template Expander 99 12.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 12.2 Managing Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 12.3 Visiting Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 12.4 Implementing Handlers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 12.5 Control Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 12.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 12.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 13 A Code Linter 111 13.1 Machinery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 13.2 Finding Duplicate Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 13.3 Finding Unused Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 13.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 13.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
  • 14.
    Contents ix 14 PageLayout 119 14.1 Sizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 14.2 Positioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 14.3 Rendering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 14.4 Wrapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 14.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 14.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 15 Performance Profiling 131 15.1 Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 15.2 Row-Wise Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 15.3 Column-Wise Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 15.4 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 15.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 15.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 16 Object Persistence 145 16.1 Built-in Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 16.2 Converting to Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 16.3 Aliasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 16.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 16.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 17 Binary Data 155 17.1 Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 17.2 Bitwise Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 17.3 Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 17.4 And Now, Persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 17.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 17.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 18 A Database 165 18.1 Starting Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 18.2 Saving Records . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 18.3 A File-Backed Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 18.4 Playing with Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 18.5 Persisting Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 18.6 Cleaning Up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 18.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 18.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 19 A Build Manager 177 19.1 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 19.2 Initial Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 19.3 Topological Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 19.4 A Better Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 19.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 19.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
  • 15.
    x Contents 20 APackage Manager 187 20.1 Semantic Versioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 20.2 Exhaustive Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 20.3 Generating Possibilities Manually . . . . . . . . . . . . . . . . . . . . . . . 191 20.4 Incremental Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 20.5 Using a Theorem Prover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 20.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 20.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 21 Transferring Files 201 21.1 Using TCP/IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 21.2 Chunking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 21.3 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 21.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207 21.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 22 Serving Web Pages 209 22.1 Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 22.2 Hello, Web . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 22.3 Serving Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 22.4 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 22.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 22.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217 23 A File Viewer 219 23.1 Curses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 23.2 Windowing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 23.3 Moving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 23.4 Refactoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 23.5 Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 23.6 Viewport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228 23.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 23.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 24 Undo and Redo 233 24.1 Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233 24.2 Insertion and Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 24.3 Going Backward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236 24.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 24.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 25 A Virtual Machine 243 25.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 25.2 Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245 25.3 Assembly Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247 25.4 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250 25.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252 25.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
  • 16.
    Contents xi 26 ADebugger 255 26.1 One Step at a Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255 26.2 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257 26.3 Extensibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259 26.4 Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261 26.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263 26.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264 27 Conclusion 267 A Bibliography 269 B Bonus Material 271 B.1 Using Function Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 B.2 Lazy Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274 B.3 Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274 B.4 Tracing Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277 B.5 Inspecting Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278 B.6 User-Defined Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279 B.7 Floating Point Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282 B.8 Big and Little Endian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283 B.9 Generating Test Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284 C Syllabus 287 D License 293 D.1 Writing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293 D.2 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293 E Code of Conduct 297 E.1 Our Standards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297 E.2 Our Responsibilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297 E.3 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 E.4 Enforcement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 E.5 Attribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 F Contributing 299 F.1 Editing Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 F.2 Making Decisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 F.3 FAQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 G Glossary 303 Index 327
  • 17.
    Taylor Francis Taylor Francis Group http://taylorandfrancis.com
  • 18.
    1 Introduction • The complexityof a system increases more rapidly than its size. • The best way to learn design is to study examples, and the best programs to use as examples are the ones programmers use every day. • These lessons assume readers can write small programs and want to write larger ones, or are looking for material to use in software design classes that they teach. • All of the content is free to read and re-use under open licenses, and all royalties from sales of this book will go to charity. Terms defined: cognitive load The best way to learn design in any field is to study examples [Schon1984; Petre2016], and the most approachable examples are ones that readers are already familiar with. These lessons therefore build small versions of tools that programmers use every day1 to show how experienced software designers think. Along the way, they introduce some fundamen- tal ideas in computer science that many self-taught programmers haven’t encountered. We hope these lessons will help you design better software yourself, and that if you know how programming tools work, you’ll be more likely to use them and better able to use them well. 1.1 Audience This learner persona [Wilson2019] describes who this book is for: Maya has a master’s degree in genomics. She knows enough Python to analyze data from her experiments, but struggles to write code other people can use. These lessons will show her how to design, build, and test large programs in less time and with less pain. Like Maya, you should be able to: • Write Python programs using lists, loops, conditionals, dictionaries, and functions. • Puzzle your way through Python programs that use classes and exceptions. • Run basic Unix shell commands like ls and mkdir. • Read and write a little bit of HTML. • Use Git2 to save and share files. (It’s OK not to know the more obscure commands3 .) 1https://en.wikipedia.org/wiki/Programming_tool 2https://git-scm.com/ 3https://git-man-page-generator.lokaltog.net/ 1
  • 19.
    2 1 Introduction Objectsand Classes Running Tests An Interpreter Functions and Closures Object Persistence A Template Expander Protocols Binary Data A Build Manager Undo and Redo A File Viewer A Debugger A Virtual Machine A Database Performance Profiling Page Layout An HTML Validator A Code Linter Parsing Text Matching Patterns A File Archiver Finding Duplicate Files Transferring Files Serving Web Pages A Package Manager source sink Legend 2 3 4 5 6 7 8 9 10 11 12 12 14 15 16 17 18 19 20 21 22 23 24 25 26 Figure 1.1: Lesson topics and dependencies. These chapters (Figure 1.1) are also designed to help another persona: Yim teaches two college courses on software development. They are frustrated that so many books talk about details but not about design and use examples that their students can’t relate to. This book will give them material they can use in class and starting points for course projects. 1.2 The Big Ideas Our approach to design is based on three big ideas. First, as the number of components in a system grows, the complexity of the system increases rapidly (Figure 1.2). However, the number of things we can hold in working memory at any time is fixed and fairly small [Hermans2021]. If we want to build large programs that we can understand, we therefore need to construct them out of pieces that interact in a small number of ways. Figuring out what those pieces and interactions should be is the core of what we call “design”. Second, “making sense” depends on who we are. When we use a low-level language, we incur the cognitive load of assembling micro-steps into something more meaningful. When we use a high-level language, on the other hand, we incur a similar load translating functions of functions into actual operations on actual data. More experienced programmers are more capable at both ends of the curve, but that’s not the only thing that changes. If a novice’s comprehension curve looks like the lower one in Figure 1.3, then an expert’s looks like the upper one. Experts don’t just understand more at all levels of abstraction; their preferred level has also shifted so they find √ x2 + y2 easier to read than the Medieval equivalent “the side of the square whose area is the sum of the areas of the two squares whose sides are given by the first part and the second part”. This
  • 20.
    1.3 Formatting 3 AB C A B C D E F A B C D E F 3 components have 3 interactions 6 components have 30 interactions 2 components with 3 sub-components have 7 interactions Figure 1.2: How complexity grows with size. Abstraction Comprehension expert novice Figure 1.3: Novice and expert comprehension curves. curve means that for any given task, the code that is quickest for a novice to comprehend will almost certainly be different from the code that an expert can understand most quickly. Our third big idea is that programs are just another kind of data. Source code is just text, which we can process like other text files. Likewise, a program in memory is just a data structure that we can inspect and modify like any other. Treating code like data enables us to solve hard problems in elegant ways, but at the cost of increasing the level of abstraction in our programs. Once again, finding the balance is what we mean by “design”. 1.3 Formatting We display Python source code like this: for ch in example: print(ch) and Unix shell commands like this: for filename in *.dat do cut -d , -f 10 $filename done
  • 21.
    4 1 Introduction Datafiles and program output are shown like this: - name: read params: - sample_data.csv alpha beta gamma delta We use ... to show where lines have been omitted, and occasionally break lines in unnatural ways to make them fit on the page. Where we do this, we end all but the last line with a single backslash . Finally, we show glossary entries in bold text and write functions as function_name rather than function_name(). The latter is more common, but the empty parentheses makes it hard to tell whether we’re talking about the function itself or a call to the function with no parameters. 1.4 Usage The source for this book is available in our Git repository4 and all of it can be read on our website5 . All of the written material in this book is licensed under the Creative Commons - Attribution - NonCommercial 4.0 International license6 (CC-BY-NC-4.0), while the software is covered by the Hippocratic License7 . The first license allows you to use and remix this material for noncommercial purposes, as-is or in adapted form, provided you cite its orig- inal source; if you want to sell copies or make money from this material in any other way, you must contact us8 and obtain permission first. The second license allows you to use and remix the software on this site provided you do not violate international agreements governing human rights; please see Appendix D for details. If you would like to improve what we have, add new material, or ask questions, please file an issue in our GitHub repository9 or send an email10 . All contributors are required to abide by our Code of Conduct (Appendix E). 1.5 What People Are Saying Here’s what people said about the JavaScript version of this book [Wilson2022a]: • Jessica Kerr11 : “Software Design by Example is the book I’ll recommend to every new dev... It is nice to you. It wants you to succeed... It’s a bridge from ‘learn to program’ to working programmer.” 4https://github.com/gvwilson/sdxpy/ 5https://third-bit.com/sdxpy/ 6https://creativecommons.org/licenses/by-nc/4.0/ 7https://firstdonoharm.dev/ 8mailto:gvwilson@third-bit.com 9https://github.com/gvwilson/sdxpy/ 10mailto:gvwilson@third-bit.com 11https://jessitron.com/2023/02/20/book-review-software-design-by-example/
  • 22.
    1.6 Acknowledgments 5 •Jenn Schiffer12 : “I am v much enjoying gvwilson’s book Software Design by Example. It makes me miss teaching, it would be such a fun text to use!” • Emily Gorcenski13 : “There’s a lot of books on programming but fewer books that couple software development with effective and practical use of tools, presenting a language not as a main course but as a part of an engineering ecosystem. Greg Wilson’s book hits all the right notes in bringing together theory, pragmatism, and best practices.” • Danielle Navarro14 : “The book is really bloody lovely.” 1.6 Acknowledgments Like [Wilson2022a], this book was inspired by [Kamin1990; Kernighan1979; Kernighan1981; Kernighan1983; Kernighan1988; Oram2007; Wirth1976] and by: • The Architecture of Open Source Applications15 series [Brown2011; Brown2012; Arm- strong2013; Brown2016]; • Mary Rose Cook’s16 Gitlet17 ; • Matt Brubeck’s18 browser engine tutorial19 ; • Web Browser Engineering20 by Pavel Panchekha21 and Chris Harrelson22 ; • Connor Stack’s23 database tutorial24 ; • Maël Nison’s25 package manager tutorial26 ; • Paige Ruten’s27 kilo text editor28 and Wasim Lorgat’s29 editor tutorial30 ; • Bob Nystrom’s31 Crafting Interpreters32 [Nystrom2021]; and • the posts and zines33 created by Julia Evans34 . 12https://mastodon.social/@jenn@pixel.kitchen/109985276835264400 13https://emilygorcenski.com/post/book-report-software-design-by-example-by-greg-wilson/ 14https://blog.djnavarro.net/posts/2023-05-31_software-design-by-example/ 15https://aosabook.org/ 16https://maryrosecook.com/ 17http://gitlet.maryrosecook.com/ 18https://limpet.net/mbrubeck/ 19https://limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.html 20https://browser.engineering/ 21https://pavpanchekha.com/ 22https://twitter.com/chrishtr 23https://connorstack.com/ 24https://cstack.github.io/db_tutorial/ 25https://arcanis.github.io/ 26https://classic.yarnpkg.com/blog/2017/07/11/lets-dev-a-package-manager/ 27https://viewsourcecode.org/ 28https://viewsourcecode.org/snaptoken/kilo/index.html 29https://wasimlorgat.com/ 30https://github.com/seem/editor 31http://journal.stuffwithstuff.com/ 32https://craftinginterpreters.com/ 33https://wizardzines.com/ 34https://jvns.ca/
  • 23.
    6 1 Introduction Iam grateful to Miras Adilov, Alvee Akand, Rohan Alexander, Alexey Alexapolsky, Lina Andrén, Alberto Bacchelli, Yanina Bellini Saibene, Matthew Bluteau, Adrienne Canino, Marc Chéhab, Stephen Childs, Hector Correa, Socorro Dominguez, Christian Drumm, Christian Epple, Julia Evans, Davide Fucci, Thomas Fritz, Francisco Gabriel, Florian Gaudin-Delrieu, Craig Gross, Jonathan Guyer, McKenzie Hagen, Han Qi, Fraser Hay, Alexandru Hurjui, Bahman Karimi, Carolyn Kim, Kitsios Konstantinos, Jenna Landy, Peter Lin, Zihan Liu, Becca Love, Dan McCloy, Ramiro Mejia, Michael Miller, Firas Moosvi, Joe Nash, Sheena Ng, Reiko Okamoto, Juanan Pereira, Mahmoodur Rahman, Arpan Sarkar, Silvan Schlegel, Rosan Shanmuganathan, Dave W. Smith, Stephen M. Sturdevant, Diyar Taskiran, Ece Tur- nator, and Yao Yundong for feedback on early drafts of this material. I am also grateful to Shashi Kumar for help with LaTeX, to Odin Beuchat35 for help with JavaScript, and to the creators of Black36 , flake837 , Glosario38 , GNU Make39 , isort40 , ark41 , LaTeX42 , pip43 , Python44 , Remark45 , WAVE46 , and many other open source tools: if we all give a little, we all get a lot. All royalties from this book will go to the Red Door Family Shelter47 in Toronto. 1.7 Exercises Setting Up 1. Use pip48 to install Black49 , flake850 , and isort51 on your computer. 2. Run them on a few programs you have already written. (The file setup.cfg in the root directory of this book’s GitHub repository52 has the settings we use for these tools.) What problems do they report? Which of these reports do you disagree with? Avoiding Potholes Go to the GitHub repository53 for this book and look at the open issues. Which of them can you understand? What makes the others hard to understand? What could you add, leave out, or write differently when you report a problem that you have found? 35https://www.drafolin.ch/ 36https://black.readthedocs.io/ 37https://flake8.pycqa.org/ 38https://glosario.carpentries.org/ 39https://www.gnu.org/software/make/ 40https://pycqa.github.io/isort/ 41https://www.dmulholl.com/docs/ark/main/ 42https://www.latex-project.org/ 43https://pip.pypa.io/ 44https://www.python.org/ 45https://remarkjs.com/ 46https://wave.webaim.org/ 47https://www.reddoorshelter.ca/ 48https://pip.pypa.io/ 49https://black.readthedocs.io/ 50https://flake8.pycqa.org/ 51https://pycqa.github.io/isort/ 52https://github.com/gvwilson/sdxpy/ 53https://github.com/gvwilson/sdxpy/
  • 24.
    2 Objects and Classes •Objects are useful without classes, but classes make them easier to understand. • A well-designed class defines a contract that code using its instances can rely on. • Objects that respect the same contract are polymorphic, i.e., they can be used interchangeably even if they do different specific things. • Objects and classes can be thought of as dictionaries with stereotyped behavior. • Most languages allow functions and methods to take a variable number of arguments. • Inheritance can be implemented in several ways that differ in the order in which objects and classes are searched for methods. Terms defined: alias, argument, cache, class method, constructor, derived class, design by contract, monkey patching, multiple inheritance, object-oriented programming, parameter, polymorphism, recursion, spread, static method, upcall, varargs We are going to create a lot of objects and classes in these lessons, and they will be a lot easier to use if we understand how they are implemented. Historically, object-oriented programming (OOP) was invented to solve two problems: 1. What is a natural way to represent real-world “things” in code? 2. How can we organize code to make it easier to understand, test, and extend? 2.1 Objects As a motivating problem, let’s define some of the things a generic shape in a drawing package must be able to do: class Shape: def __init__(self, name): self.name = name def perimeter(self): raise NotImplementedError(perimeter) def area(self): raise NotImplementedError(area) A specification like this is sometimes called a contract because an object must satisfy it in order to be considered a shape, i.e., must provide methods with these names that do 7
  • 25.
    8 2 Objectsand Classes what those names suggest. For example, we can derive classes from Shape to represent squares and circles. class Square(Shape): def __init__(self, name, side): super().__init__(name) self.side = side def perimeter(self): return 4 * self.side def area(self): return self.side ** 2 class Circle(Shape): def __init__(self, name, radius): super().__init__(name) self.radius = radius def perimeter(self): return 2 * math.pi * self.radius def area(self): return math.pi * self.radius ** 2 Since squares and circles have the same methods, we can use them interchangeably. This is called polymorphism, and it reduces cognitive load by allowing the people using related things to ignore their differences: examples = [Square(sq, 3), Circle(ci, 2)] for thing in examples: n = thing.name p = thing.perimeter() a = thing.area() print(f{n} has perimeter {p:.2f} and area {a:.2f}) sq has perimeter 12.00 and area 9.00 ci has perimeter 12.57 and area 12.57 But how does polymorphism work? The first thing we need to understand is that a func- tion is an object. While the bytes in a string represent characters and the bytes in an image represent pixels, the bytes in a function are instructions (Figure 2.1). When Python exe- cutes the code below, it creates an object in memory that contains the instructions to print a string and assigns that object to the variable example: def example(): print(in example) We can create an alias for the function by assigning it to another variable and then call the function by referencing that second variable. Doing this doesn’t alter or erase the connection between the function and the original name: alias = example alias() in example We can also store function objects in data structures like lists and dictionaries. Let’s write some functions that do the same things as the methods in our original Python and store them in a dictionary to represent a square (Figure 2.2):
  • 26.
    2.1 Objects 9 astext it's all just bytes as image old_x = x x = 2 * x + 1 as instructions bytes 74 69 73 27 61 20 6C 6C 6A 20 73 75 20 74 79 62 65 74 0A 73 Figure 2.1: Bytes can be interpreted as text, images, instructions, and more. name side perim area square sq 2 square_perim() square_area() name radius perim area circle ci 3 circle_perim() circle_area() Figure 2.2: Using dictionaries to emulate objects. def square_perimeter(thing): return 4 * thing[side] def square_area(thing): return thing[side] ** 2 def square_new(name, side): return { name: name, side: side, perimeter: square_perimeter , area: square_area } If we want to use one of the “methods” in this dictionary, we call it like this: def call(thing, method_name): return thing[method_name](thing) examples = [square_new(sq, 3), circle_new(ci, 2)] for ex in examples: n = ex[name] p = call(ex, perimeter) a = call(ex, area) print(f{n} {p:.2f} {a:.2f}) The function call looks up the function stored in the dictionary, then calls that function with the dictionary as its first object; in other words, instead of using obj.meth(arg) we use obj[meth](obj, arg). Behind the scenes, this is (almost) how objects actually work.
  • 27.
    10 2 Objectsand Classes We can think of an object as a special kind of dictionary. A method is just a function that takes an object of the right kind as its first parameter (typically called self in Python). 2.2 Classes One problem with implementing objects as dictionaries is that it allows every single object to behave slightly differently. In practice, we want objects to store different values (e.g., different squares to have different sizes) but the same behaviors (e.g., all squares should have the same methods). We can implement this by storing the methods in a dictionary called Square that corresponds to a class and having each individual square contain a ref- erence to that higher-level dictionary (Figure 2.3). In the code below, that special reference uses the key _class: def square_perimeter(thing): return 4 * thing[side] def square_area(thing): return thing[side] ** 2 Square = { perimeter: square_perimeter , area: square_area , _classname: Square } def square_new(name, side): return { name: name, side: side, _class: Square } Calling a method now involves one more lookup because we have to go from the object to the class to the method, but once again we call the “method” with the object as the first argument: name side perim area square sq 2 square_perim() square_area() name radius perim area circle ci 3 circle_perim() circle_area() _class _class _classname square _classname circle Square Circle Figure 2.3: Using dictionaries to emulate classes.
  • 28.
    2.3 Arguments 11 defcall(thing, method_name): return thing[_class][method_name](thing) examples = [square_new(sq, 3), circle_new(ci, 2)] for ex in examples: n = ex[name] p = call(ex, perimeter) a = call(ex, area) c = ex[_class][_classname] print(f{n} is a {c}: {p:.2f} {a:.2f}) As a bonus, we can now reliably identify objects’ classes and ask whether two objects are of the same class or not by checking what their _class keys refer to. Arguments vs. Parameters Many programmers use the words argument and parameter interchangeably, but to make our meaning clear, we call the values passed into a function its arguments and the names the function uses to refer to them as its parameters. Put another way, parameters are part of the definition, and arguments are given when the function is called. 2.3 Arguments The methods we have defined so far operate on the values stored in the object’s dictionary, but none of them take any extra arguments as input. Implementing this is a little bit tricky because different methods might need different numbers of arguments. We could define functions call_0, call_1, call_2 and so on to handle each case, but like most modern languages, Python gives us a better way. If we define a parameter in a function with a leading *, it captures any “extra” values passed to the function that don’t line up with named parameters. Similarly, if we define a parameter with two leading stars **, it captures any extra named parameters: def show_args(title, *args, **kwargs): print(f{title} args '{args}' and kwargs '{kwargs}') show_args(nothing) show_args(one unnamed argument, 1) show_args(one named argument, second=2) show_args(one of each, 3, fourth=4) nothing args '()' and kwargs '{}' one unnamed argument args '(1,)' and kwargs '{}' one named argument args '()' and kwargs '{'second': '2'}' one of each args '(3,)' and kwargs '{'fourth': '4'}' This mechanism is sometimes referred to as varargs (short for “variable arguments”). A complementary mechanism called spreading allows us to take a list or dictionary full of arguments and spread them out in a call to match a function’s parameters:
  • 29.
    12 2 Objectsand Classes def show_spread(left, middle, right): print(fleft {left} middle {middle} right {right}) all_in_list = [1, 2, 3] show_spread(*all_in_list) all_in_dict = {right: 30, left: 10, middle: 20} show_spread(**all_in_dict) left 1 middle 2 right 3 left 10 middle 20 right 30 With these tools in hand, let’s add a method to our Square class to tell us whether a square is larger than a user-specified size: def square_larger(thing, size): return call(thing, area) size Square = { perimeter: square_perimeter , area: square_area , larger: square_larger , _classname: Square } The function that implements this check for circles looks exactly the same: def circle_larger(thing, size): return call(thing, area) size We then modify call to capture extra arguments in *args and spread them into the function being called: def call(thing, method_name , *args): return thing[_class][method_name](thing, *args) Our tests show that this works: examples = [square_new(sq, 3), circle_new(ci, 2)] for ex in examples: result = call(ex, larger, 10) print(fis {ex['name']} larger? {result}) is sq larger? False is ci larger? True However, we now have two functions that do exactly the same thing—the only difference between them is their names. Anything in a program that is duplicated in several places will eventually be wrong in at least one, so we need to find some way to share this code. 2.4 Inheritance The tool we want is inheritance. To see how this works in Python, let’s add a method called density to our original Shape class that uses other methods defined by the class
  • 30.
    2.4 Inheritance 13 classShape: def __init__(self, name): self.name = name def perimeter(self): raise NotImplementedError(perimeter) def area(self): raise NotImplementedError(area) def density(self, weight): return weight / self.area() examples = [Square(sq, 3), Circle(ci, 2)] for ex in examples: n = ex.name d = ex.density(5) print(f{n}: {d:.2f}) sq: 0.56 ci: 0.40 To enable our dictionary-based “classes” to do the same thing, we create a dictionary to represent a generic shape and give it a “method” to calculate density: def shape_density(thing, weight): return weight / call(thing, area) Shape = { density: shape_density , _classname: Shape, _parent: None } We then add another specially-named field to the dictionaries for “classes” like Square to keep track of their parents: Square = { perimeter: square_perimeter , area: square_area , _classname: Square, _parent: Shape } and modify the call function to search for the requested method (Figure 2.4): def call(thing, method_name , *args): method = find(thing[_class], method_name) return method(thing, *args) def find(cls, method_name): while cls is not None: if method_name in cls: return cls[method_name] cls = cls[_parent] raise NotImplementedError(method_name)
  • 31.
    14 2 Objectsand Classes name side perim area square sq 2 square_perim() square_area() name radius perim area circle ci 3 circle_perim() circle_area() _class _class _classname square _classname circle Square Circle _parent _parent density shape_density() _classname shape Shape _parent Figure 2.4: Using dictionary search to implement inheritance. A simple test shows that this is working as intended: examples = [square_new(sq, 3), circle_new(ci, 2)] for ex in examples: n = ex[name] d = call(ex, density, 5) print(f{n}: {d:.2f}) sq: 0.56 ci: 0.40 We do have one task left, though: we need to make sure that when a square or circle is made, it is made correctly. In short, we need to implement constructors. We do this by giving the dictionaries that implement classes a special key _new whose value is the function that builds something of that type: def shape_new(name): return { name: name, _class: Shape } Shape = { density: shape_density , _classname: Shape, _parent: None, _new: shape_new } In order to make an object, we call the function associated with its _new key: def make(cls, *args): return cls[_new](*args) That function is responsible for upcalling the constructor of its parent. For example, the constructor for a square calls the constructor for a generic shape and adds square-specific values using | to combine two dictionaries:
  • 32.
    2.5 Summary 15 defsquare_new(name, side): return make(Shape, name) | { side: side, _class: Square } Square = { perimeter: square_perimeter , area: square_area , _classname: Square, _parent: Shape, _new: square_new } Of course, we’re not done until we test it: examples = [make(Square, sq, 3), make(Circle, ci, 2)] for ex in examples: n = ex[name] d = call(ex, density, 5) print(f{n}: {d:.2f}) sq: 0.56 ci: 0.40 2.5 Summary We have only scratched the surface of Python’s object system. Multiple inheritance, class methods, static methods, and monkey patching are powerful tools, but they can all be understood in terms of dictionaries that contain references to properties, functions, and other dictionaries (Figure 2.5). class object methods attributes defines has instance of contract respects dictionary dictionary stored in refers to constructor includes stored in Figure 2.5: Concept map for implementing objects and classes.
  • 33.
    16 2 Objectsand Classes 2.6 Exercises Handling Named Arguments The final version of call declares a parameter called *args to capture all the positional arguments of the method being called and then spreads them in the actual call. Modify it to capture and spread named arguments as well. Multiple Inheritance Implement multiple inheritance using dictionaries. Does your implementation look methods up in the same order as Python would? Class Methods and Static Methods 1. Explain the differences between class methods and static methods. 2. Implement both using dictionaries. Reporting Type Python type method reports the most specific type of an object, while isinstance deter- mines whether an object inherits from a type either directly or indirectly. Add your own versions of both to dictionary-based objects and classes. Using Recursion A recursive function is one that calls itself, either directly or indirectly. Modify the find function that finds a method to call so that it uses recursion instead of a loop. Which version is easier to understand? Which version is more efficient? Method Caching Our implementation searches for the implementation of a method every time that method is called. An alternative is to add a cache to each object to save the methods that have been looked up before. For example, each object could have a special key called _cache whose value is a dictionary. The keys in that dictionary are the names of methods that have been called in the past, and the values are the functions that were found to implement those methods. Add this feature to our dictionary-based objects. How much more complex does it make the code? How much extra storage space does it need compared to repeated lookup?
  • 34.
    3 Finding Duplicate Files •A hash function creates a fixed-size value from an arbitrary sequence of bytes. • Use big-oh notation to estimate the running time of algorithms. • The output of a hash function is deterministic but not easy to predict. • A good hash function’s output is evenly distributed. • A large cryptographic hash can be used to uniquely identify a file’s contents. Terms defined: big-oh notation, binary mode, bucket, collision (in hashing), cryptographic hash function, hash code, hash function, hexadecimal, SHA-256 (hash function), streaming API, time complexity Suppose we want to find duplicated files, such as extra copies of photos or data sets. People often rename files, so we must compare their contents, but this will be slow if we have a lot of files. We can estimate how slow “slow” will be with a simple calculation. N objects can be paired in N(N − 1) ways. If we remove duplicate pairings (i.e., if we count A-B and B-A as one pair) then there are N(N − 1)/2 = (N2 − N)/2 distinct pairs. As N gets large, this value is approximately proportional to N2 . A computer scientist would say that the time complexity of our algorithm is O(N2 ), which is pronounced “big-oh of N squared”. In simpler terms, when the number of files doubles, the running time roughly quadruples, which means the time per file increases as the number of files increases. Slowdown like this is often unavoidable, but in our case there’s a better way. If we generate a shorter identifier for each file that depends only on the bytes it contains, we can group together the files that have the same identifier and only compare the files within a group. This approach is faster because we only do the expensive byte-by-byte comparison on files that might be equal. And as we’ll see, if we are very clever about how we generate identifiers then we can avoid byte-by-byte comparisons entirely. 3.1 Getting Started We’ll start by implementing the inefficient N2 approach so that we can compare our later designs to it. The short program below takes a list of filenames from the command line, finds duplicates, and prints the matches: def find_duplicates(filenames): matches = [] for left in filenames: for right in filenames: if same_bytes(left, right): matches.append((left, right)) return matches 17
  • 35.
    18 3 FindingDuplicate Files if __name__ == __main__: duplicates = find_duplicates(sys.argv[1:]) for (left, right) in duplicates: print(left, right) This program uses a function called same_bytes that reads two files and compares them byte by byte: def same_bytes(left_name, right_name): left_bytes = open(left_name, rb).read() right_bytes = open(right_name, rb).read() return left_bytes == right_bytes Notice that the files are opened in binary mode using rb instead of the usual r. As we’ll see in Chapter 17, this tells Python to read the bytes exactly as they are rather than trying to convert them to characters. To test this program and the others we’re about to write, we create a tests directory with six files: Filename a1.txt a2.txt a3.txt b1.txt b2.txt c1.txt Content aaa aaa aaa bb bb c We expect the three a files and the two b files to be reported as duplicates. There’s no particular reason for these tests—we just have to start somewhere. Our first test looks like this: python brute_force_1.py tests/*.txt tests/a1.txt tests/a1.txt tests/a1.txt tests/a2.txt tests/a1.txt tests/a3.txt tests/a2.txt tests/a1.txt tests/a2.txt tests/a2.txt tests/a2.txt tests/a3.txt tests/a3.txt tests/a1.txt tests/a3.txt tests/a2.txt tests/a3.txt tests/a3.txt tests/b1.txt tests/b1.txt tests/b1.txt tests/b2.txt tests/b2.txt tests/b1.txt tests/b2.txt tests/b2.txt tests/c1.txt tests/c1.txt Our program’s output is correct but not useful: every file is reported as being identical to itself, and every match of different files is reported twice. Let’s fix the nested loop in find_duplicates so that we only check potentially differing pairs once (Figure 3.1): def find_duplicates(filenames): matches = [] for i_left in range(len(filenames)): left = filenames[i_left] for i_right in range(i_left): right = filenames[i_right] if same_bytes(left, right): matches.append((left, right)) return matches
  • 36.
    3.2 Hashing Files19 1 2 3 0 0 1 2 3 i_left i_right Figure 3.1: Scoping the inner loop to produce unique combinations. 3.2 Hashing Files Instead of comparing every file against every other, let’s process each file once to produce a short identifier that depends only on the file’s contents and then only compare files that have the same identifier, i.e., that might be equal. If files are evenly divided into g groups then each group will contain roughly N/g files, so the total work will be roughly O(g(N/g)2 ) (i.e., g groups times (N/g)2 comparisons within each group). Simplifying, this is N2 /g, so as the number of groups grows, and the overall running time should decrease (Figure 3.2). a1.txt a2.txt a3.txt b1.txt b2.txt c1.txt aaa aaa aaa bb bb c 6 6 6 10 10 5 filename contents hash code Figure 3.2: Grouping by hash code reduces comparisons from 15 to 4. We can construct IDs for files using a hash function to produce a hash code. Since bytes are just numbers, we can create a very simple hash function by adding up the bytes in a file and taking the remainder modulo some number: def naive_hash(data): return sum(data) % 13 Here’s a quick test that calculates the hash code for successively longer substrings of the word hashing: example = bytes(hashing, utf-8) for i in range(1, len(example) + 1): substring = example[:i] hash = naive_hash(substring) print(f{hash:2} {substring}) 0 b'h' 6 b'ha' 4 b'has' 4 b'hash' 5 b'hashi'
  • 37.
    20 3 FindingDuplicate Files 0 2 4 6 8 10 12 0 1000 2000 3000 hash count Figure 3.3: Distribution of hash codes per line in Dracula. 0 2 4 6 8 10 12 0 200 400 600 800 1000 hash count Figure 3.4: Distribution of hash codes per unique line in Dracula. 11 b'hashin' 10 b'hashing' The output seems random, but is it? As a more stringent test, let’s try hashing every line of text in the Project Gutenberg1 version of the novel Dracula and plot the distribution (Figure 3.3). Most of the buckets are approximately the same height, but why is there a peak at zero? Our big-oh estimate of how efficient our algorithm would be depended on files being distributed evenly between groups; if that’s not the case, our code won’t be as fast as we hoped. After a bit of digging, it turns out that the text file we’re processing uses a blank line to separate paragraphs. These hash to zero, so the peak reflects an unequal distribution in our data. If we plot the distribution of hash codes of unique lines, the result is more even (Figure 3.4). 1https://www.gutenberg.org/
  • 38.
    3.3 Better Hashing21 Hashing is a tremendously powerful tool: for example, Python’s dictionaries hash their keys to make lookup fast. Now that we can hash files, we can build a dictionary with hash codes as keys and sets of filenames as values. The code that does this is shown below; each time it calculate a hash code, it checks to see if that value has been seen before. If not, it creates a new entry in the groups dictionary with the hash code as its key and an empty set as a value. It can then be sure that there’s a set to add the filename to: def find_groups(filenames): groups = {} for fn in filenames: data = open(fn, rb).read() hash_code = naive_hash(data) if hash_code not in groups: groups[hash_code] = set() groups[hash_code].add(fn) return groups We can now re-use most of the code we wrote earlier to find duplicates within each group: groups = find_groups(sys.argv[1:]) for filenames in groups.values(): duplicates = find_duplicates(list(filenames)) for (left, right) in duplicates: print(left, right) tests/a2.txt tests/a1.txt tests/a3.txt tests/a1.txt tests/a3.txt tests/a2.txt tests/b1.txt tests/b2.txt 3.3 Better Hashing Let’s go back to the formula O(N2 /g) that tells us how much work we have to do if we have divided N files between g groups. If we have exactly as many groups as files—i.e., if g is equal to N—then the work to process N files would be O(N2 /N) = O(N), which means that the work will be proportional to the number of files. We have to read each file at least once anyway, so we can’t possibly do better than this, but how can we ensure that each unique file winds up in its own group? The answer is to use a cryptographic hash function. The output of such a function is completely deterministic: given the same bytes in the same order, it will always produce the same output. However, the output is distributed like a uniform random variable: each possible output is equally likely, which ensures that files will be evenly distributed between groups. Cryptographic hash functions are hard to write, and it’s very hard to prove that a partic- ular algorithm has the properties we require. We will therefore use a function from Python’s hashing module2 that implements the SHA-256 hashing algorithm. Given some bytes as input, this function produces a 256-bit hash, which is normally written as a 64-character hexadecimal string. This uses the letters A-F (or a-f) to represent the digits from 10 to 15, so that (for example) 3D5 is (3×162 ) + (13×161 ) + (5×160 ), or 981 in decimal: 2https://docs.python.org/3/library/hashlib.html
  • 39.
    22 3 FindingDuplicate Files example = bytes(hash, utf-8) for i in range(1, len(example) + 1): substring = example[:i] hash = sha256(substring).hexdigest() print(f{substring}n{hash}) b'h' aaa9402664f1a41f40ebbc52c9993eb66aeb366602958fdfaa283b71e64db123 b'ha' 8693873cd8f8a2d9c7c596477180f851e525f4eaf55a4f637b445cb442a5e340 b'has' 9150c74c5f92d51a92857f4b9678105ba5a676d308339a353b20bd38cd669ce7 b'hash' d04b98f48e8f8bcc15c6ae5ac050801cd6dcfd428fb5f9e65c4e16e7807340fa The Birthday Problem The odds that two people share a birthday are 1/365 (ignoring February 29). The odds that they don’t are therefore 364/365. When we add a third person, the odds that they don’t share a birthday with either of the preceding two people are 363/365, so the overall odds that nobody shares a birthday are (364/365)×(363/365). If we keep going, there’s a 50% chance of two people sharing a birthday in a group of just 23 people, and a 99.9% chance with 70 people. The same math can tell us how many files we need to hash before there’s a 50% chance of a collision with a 256-bit hash. According to Wikipedia3 , the answer is approximately 4×1038 files. We’re willing to take that risk. Using this library function makes our duplicate file finder much shorter: import sys from hashlib import sha256 def find_groups(filenames): groups = {} for fn in filenames: data = open(fn, rb).read() hash_code = sha256(data).hexdigest() if hash_code not in groups: groups[hash_code] = set() groups[hash_code].add(fn) return groups if __name__ == __main__: groups = find_groups(sys.argv[1:]) for filenames in groups.values(): print(, .join(sorted(filenames))) python dup.py tests/*.txt tests/a1.txt, tests/a2.txt, tests/a3.txt tests/b1.txt, tests/b2.txt tests/c1.txt 3https://en.wikipedia.org/wiki/Birthday_problem
  • 40.
    3.4 Summary 23 Moreimportantly, our new approach scales to very large sets of files: as explained above, we only have to look at each file once, so the running time is as good as it possibly can be. 3.4 Summary Figure 3.5 summarizes the key ideas in this chapter, the most important of which is that some algorithms are intrinsically better than others. hash function bytes file hash code from input is output is deterministic uniformly distributed is unique for each sequence of duplicates may be efficiency improves of detecting big-oh measured by such as linear O(N) quadratic O(N2) Figure 3.5: Concept map for duplicate file detection with hashing. 3.5 Exercises Odds of Collision If hashes were only 2 bits long, then the chances of collision with each successive file assuming no previous collision are: Number of Files Odds of Collision 1 0% 2 25% 3 50% 4 75% 5 100% A colleague of yours says this means that if we hash four files, there’s only a 75% chance of any collision occurring. What are the actual odds?
  • 41.
    24 3 FindingDuplicate Files Streaming I/O A streaming API delivers data one piece at a time rather than all at once. Read the doc- umentation for the update method of hashing objects in Python’s hashing module4 and rewrite the duplicate finder from this chapter to use it. Big Oh Chapter 1 said that as the number of components in a system grows, the complexity of the system increases rapidly. How fast is “rapidly” in big-oh terms? The hash Function 1. Read the documentation for Python’s built-in hash function. 2. Why do hash(123) and hash(123) work when hash([123]) raises an exception? How Good Is SHA-256? 1. Write a function that calculates the SHA-256 hash code of each unique line of a text file. 2. Convert the hex digests of those hash codes to integers. 3. Plot a histogram of those integer values with 20 bins. 4. How evenly distributed are the hash codes? How does the distribution change as you process larger files? 4https://docs.python.org/3/library/hashlib.html
  • 42.
    4 Matching Patterns • Useglobs and regular expressions to match patterns in text. • Use inheritance to make matchers composable and extensible. • Simplify code by having objects delegate work to other objects. • Use the Null Object pattern to eliminate special cases in code. • Use standard refactorings to move code from one working state to another. • Build and check the parts of your code you are least sure of first to find out if your design will work. Terms defined: Chain of Responsibility pattern, child class, Extract Parent Class refactoring, globbing, greedy matching, helper method, inheritance, lazy matching, literal (in parsing), Null Object pattern, refactor, regular expression, signature, technical debt, test-driven development We used *.txt to tell the duplicate file finder of Chapter 3 which files to compare. Older programmers (like this author) refer to this kind of pattern-matching as globbing because early versions of Unix had a tool called glob1 to do it. Globbing was so useful that it was quickly added to the shell, and the Python standard library includes a module called glob2 to match filenames in the same way. For example, 2023-*.{pdf,txt} matches 2023-01.txt and 2023-final.pdf but not draft-2023.docx (Figure 4.1). Globbing patterns are simpler than the regular expressions used to scrape data from text files, but the principles are the same. This chapter therefore implements a simple ver- sion of globbing to show how pattern-matching works in general. This matcher will only handle the cases in Table 4.1, but as the exercises will show, our design makes it easy to add new kinds of patterns. 2023- * . {pdf,txt} 2023- 01 . txt 2023- * . {pdf,txt} 2023- final . pdf 2023- * . {pdf,txt} draft 2023 . docx Figure 4.1: Examples of glob matching. 1https://en.wikipedia.org/wiki/Glob_(programming) 2https://docs.python.org/3/library/glob.html 25
  • 43.
    26 4 MatchingPatterns Pattern Text Match? Pattern Text Match? abc “abc” true a*c “abc” true ab “abc” false {a,b} “a” true abc “ab” false {a,b} “c” false * ”“ true {a,b} “ab” false * “abc” true *{x,y} “abcx” true Table 4.1: Pattern-matching cases. 4.1 Simple Patterns Matching is conceptually simple. If the first element of the pattern matches the target string at the current location, we check if the rest of the pattern matches what’s left of the string. If the element doesn’t match the front of the string, or if the rest of the pattern can’t match the rest of the string, matching fails. (This behavior makes globbing different from regular expressions, which can match parts of strings.) This design makes use of the Chain of Responsibility design pattern. Each matcher matches if it can then asks the next matcher in the chain to try to match the remaining text (Figure 4.2). Crucially, objects don’t know how long the chain after them is: they just know whom to ask next. In some cases we only need to know what kind of matching we’re doing: for example, the * pattern matches any characters. In other cases, though, we need some extra information, such as the literal text abc or the two alternatives pdf and txt. We therefore decide to create matching objects that can hold this extra information rather than just writing functions. Our first matcher checks whether a piece of text like abc matches a string. We call this class Lit because a fixed string of characters is sometimes called a literal, and it has a constructor and a match method: class Lit: def __init__(self, chars, rest=None): self.chars = chars self.rest = rest def match(self, text, start=0): end = start + len(self.chars) if text[start:end] != self.chars: return False if self.rest: return self.rest.match(text, end) return end == len(text) chars is the characters to be matched, while rest is responsible for matching the rest of the text. If rest is None, this matcher is the last one in the chain, so it must match to the end of the target string. 2023- * . {pdf,txt} 2023- 01 . txt Literal Any Literal Either Figure 4.2: Matching with Chain of Responsibility.
  • 44.
    4.1 Simple Patterns27 The match method takes the text to be matched as an input along with an optional start parameter that indicates where matching is to start. This parameter has a default value of 0 (meaning “start at the beginning”), but if this Lit follows other matchers, they need to tell it where to start looking. To see if this works, let’s write and run a few tests: def test_literal_match_entire_string(): # /abc/ matches abc assert Lit(abc).match(abc) def test_literal_substring_alone_no_match(): # /ab/ doesn't match abc assert not Lit(ab).match(abc) def test_literal_superstring_no_match(): # /abc/ doesn't match ab assert not Lit(abc).match(ab) Notice that we give tests long, meaningful names to make failure reports from the test runner easier to read. We could go ahead and build some more matchers right away, but as [Petre2016] ex- plains, good programmers build and check the parts of their code that they are least sure of as early as possible to find out if their entire design is going to work or not. We there- fore write a test to make sure that chaining works when one literal matcher is followed by another: def test_literal_followed_by_literal_match(): # /a/+/b/ matches ab assert Lit(a, Lit(b)).match(ab) def test_literal_followed_by_literal_no_match(): # /a/+/b/ doesn't match ac assert not Lit(a, Lit(b)).match(ac) Chaining two literal matchers together is unnecessary: we could (and probably should) write Lit(ab) instead of Lit(a, Lit(b)). However, the fact that these two tests pass reassures us that our design is working. Test-Driven Development Some programmers write the tests for a piece of code before writing the code itself. This practice is called test-driven development, and its advocates claim that it yields better code in less time because (a) writing tests helps people think about what the code should do before they’re committed to a particular implementation and (b) if people write tests first, they’ll actually write tests. However, research shows that the order in which the tests are written doesn’t actually make a difference [Fucci2016]; what actually matters is alternating short bursts of testing and coding. These tests for Lit pass, so we’re ready to move on to wildcards. A * character in our pattern matches zero or more characters, so if there are no more matchers in the chain, then this * matches to the end of the target string and match returns True right away. If there are other matchers, on the other hand, we try matching no characters, one character, two characters, and so on and see if those other matchers can get us to the end of the string if we do so. If none of these possibilities succeeds, the overall match fails (Figure 4.3).
  • 45.
    28 4 MatchingPatterns a * .txt 0 no .txt a b c a * .txt 1 no .txt a b c a * .txt yes .txt a b c 2 Figure 4.3: How wildcard matching works. class Any: def __init__(self, rest=None): self.rest = rest def match(self, text, start=0): if self.rest is None: return True for i in range(start, len(text)): if self.rest.match(text, i): return True return False Once again we write a few tests before moving on: def test_any_matches_empty(): # /*/ matches assert Any().match() def test_any_matches_entire_string(): # /*/ matches abc assert Any().match(abc) def test_any_matches_as_prefix(): # /*def/ matches abcdef assert Any(Lit(def)).match(abcdef) def test_any_matches_as_suffix(): # /abc*/ matches abcdef assert Lit(abc, Any()).match(abcdef) def test_any_matches_interior(): # /a*c/ matches abc assert Lit(a, Any(Lit(c))).match(abc) Either/or matching works much the same way. If the first alternative matches, we try the rest of the chain. If not, we try the second alternative, and if that doesn’t work either, we fail: class Either: def __init__(self, left, right, rest=None): self.left = left self.right = right self.rest = rest def match(self, text, start=0): return self.left.match(text, start) or self.right.match(text, start)
  • 46.
    4.2 Rethinking 29 Ourfirst few tests pass: def test_either_two_literals_first(): # /{a,b}/ matches a assert Either(Lit(a), Lit(b)).match(a) def test_either_two_literals_not_both(): # /{a,b}/ doesn't match ab assert not Either(Lit(a), Lit(b)).match(ab) but further testing uncovers a bug: def test_either_followed_by_literal_match(): # /{a,b}c/ matches ac assert Either(Lit(a), Lit(b), Lit(c)).match(ac) def test_either_followed_by_literal_no_match(): # /{a,b}c/ doesn't match ax assert not Either(Lit(a), Lit(b), Lit(c)).match(ax) ======================= test session starts ======================== test_glob_problem.py F. [100%] ===================== short test summary info ====================== FAILED test_glob_problem.py::test_either_followed_by_literal_match =================== 1 failed, 1 passed in 0.00s ==================== The problem is that Either.match isn’t using rest properly—in fact, it’s not using rest at all because it doesn’t know what to pass it as a starting point. Instead of having match methods return True or False, we need them to return an indication of where the next match should start so that Either can pass that information along to rest. Before making this change, we will clear up a bit of technical debt in our code. 4.2 Rethinking We now have three matchers with the same interfaces. Before we do any further work, we will refactor using a pattern called Extract Parent Class to make the relationship between the matchers clear (Figure 4.4). At the same time, each matcher is checking to see if its rest is None. We can simplify this by creating a class to represent “nothing here”, which is known as the Null Object pattern. We Didn’t Invent This We didn’t invent any of the patterns or refactorings used in this chapter. Instead, we learned them from books like [Gamma1994; Fowler2018; Kerievsky2004]. And as [Tichy2010] showed, learning these patterns makes people better programmers.
  • 47.
    30 4 MatchingPatterns Lit __init__() match() Any __init__() match() Either __init__() match() Lit __init__() _match() Any __init__() _match() Either __init__() _match() Null __init__() _match() Match __init__() match() calls Figure 4.4: Using the Extract Parent Class refactoring. Our new parent class Match looks like this: class Match: def __init__(self, rest): self.rest = rest if rest is not None else Null() def match(self, text): result = self._match(text, 0) return result == len(text) Match.rest requires every child class to have a helper method called _match that returns the location from which searching is to continue. Match.match checks whether the entire match reaches the end of the target string and returns True or False as appropriate. Our new Null Object class looks like this: class Null(Match): def __init__(self): self.rest = None def _match(self, text, start): return start Null objects must be at the end of the matching chain, i.e., their rest must be None, so we remove the rest parameter from the class’s constructor and pass None up to the parent constructor every time. Since Null objects don’t match anything, Null._match immediately returns whatever starting point it was given. Every other matcher can now pass responsi- bility down the chain without having to test whether it’s the last matcher in line or not. With these changes in place, our literal matcher becomes: class Lit(Match): def __init__(self, chars, rest=None): super().__init__(rest) self.chars = chars def _match(self, text, start): end = start + len(self.chars) if text[start:end] != self.chars: return None return self.rest._match(text, end) Lit’s constructor calls the constructor of its parent class to initialize the things that all classes share, then adds the data specific to this class. It returns None for “no match” or whatever self.rest returns If this object’s rest is an instance of Null, this result will be the index after the overall match.
  • 48.
    4.3 Summary 31 Asbefore, the matcher for * checks what happens if it matches an ever-larger part of the target string: class Any(Match): def __init__(self, rest=None): super().__init__(rest) def _match(self, text, start): for i in range(start, len(text) + 1): end = self.rest._match(text, i) if end == len(text): return end return None (The exercises will ask why loop has to run to len(text) + 1.) Finally, the either/or matcher that prompted this refactoring becomes: class Either(Match): def __init__(self, left, right, rest=None): super().__init__(rest) self.left = left self.right = right def _match(self, text, start): for pat in [self.left, self.right]: end = pat._match(text, start) if end is not None: end = self.rest._match(text, end) if end == len(text): return end return None Looping over the left and right alternative saves us from repeating code or introducing a helper method. It also simplifies the handling of more than two options, which we explore in the exercises. Crucially, none of the existing tests change because none of the matching classes’ constructors changed and the signature of the match method (which they now inherit from the generic Match class) stayed the same as well. We should add some tests for Null, but we have now met our original goal, and as the exercises will show we can easily add matchers for other kinds of patterns. 4.3 Summary Figure 4.5 summarizes the key ideas in this chapter; we will see the Null Object and Chain of Responsibility design patterns again.
  • 49.
    32 4 MatchingPatterns glob pattern text matchers class base class Chain of Responsiblity is a matches written as made up of each implemented as of cooperate using Null Object ends with Figure 4.5: Regular expression matching concept map. 4.4 Exercises Looping Rewrite the matchers so that a top-level object manages a list of matchers, none of which know about any of the others. Is this design simpler or more complicated than the Chain of Responsibility design? Length Plus One Why does the upper bound of the loop in the final version of Any run to len(text) + 1? Find One or More Extend the regular expression matcher to support +, meaning “match one or more charac- ters”. Match Sets of Characters 1. Add a new matching class that matches any character from a set, so that Charset('aeiou') matches any lower-case vowel. 2. Create a matcher that matches a range of characters. For example, Range(a, z) matches any single lower-case Latin alphabetic character. (This is just a convenience matcher: ranges can always be spelled out in full.) 3. Write some tests for your matchers. Exclusion 1. Create a matcher that doesn’t match a specified pattern. For example, Not(Lit(abc)) only succeeds if the text isn’t “abc”. 2. Write some tests for it.
  • 50.
    4.4 Exercises 33 MakeRepetition More Efficient Rewrite Any so that it does not repeatedly re-match text. Multiple Alternatives 1. Modify Either so that it can match any number of sub-patterns, not just two. 2. Write some tests for it. 3. What does your implementation do when no sub-patterns are specified? Returning Matches Modify the matcher so that it returns the substrings that matched each part of the expres- sion. For example, when *.txt matches name.txt, the library should return some indication that * matched the string name. Alternative Matching The tool we have built implements lazy matching, i.e., the * character matches the shortest string it can that results in the overall pattern matching. Modify the code to do greedy matching instead, and combine it with the solution to the previous exercise for testing.
  • 51.
    Taylor Francis Taylor Francis Group http://taylorandfrancis.com
  • 52.
    5 Parsing Text • Parsingtransforms text that’s easy for people to read into objects that are easy for computers to work with. • A grammar defines the textual patterns that a parser recognizes. • Most parsers tokenize input text and then analyze the tokens. • Most parsers need to implement some form of precedence to prioritize different patterns. • Operations like addition and function call work just like user-defined functions. • Programs can overload built-in operators by defining specially-named methods that are recognized by the compiler or interpreter. Terms defined: abstract syntax tree, concrete class, CSV, grammar, JSON, operator overloading, parser, token, tokenizer, YAML We constructed objects to match patterns in Chapter 4, but an expression like 2023-*{pdf,txt} is a lot easier to read and write than code like Lit(2023-, Any(Either(pdf, txt))). If we want to use the former, we need a parser to con- vert those human-readable strings into machine-comprehensible objects. Table 5.1 shows the grammar our parser will handle. When we are done, our parser should be able to recognize that 2023-*.{pdf,txt} means the literal 2023-, any charac- ters, a literal ., and then either a literal pdf or a literal txt. Please Don’t Write Parsers Languages that are comfortable for people to read and write are usually difficult for computers to understand and vice versa, so we need parsers to translate the former into the latter. However, the world doesn’t need more file formats: please use CSV, JSON, YAML, or something else that already has an acronym rather than inventing something of your own. 5.1 Tokenizing Most parsers are written in two parts (Figure 5.1). The first stage groups characters into atoms of text called “tokens“, which are meaningful pieces of text like the digits making up a number or the letters making up a variable name. Our grammar’s tokens are the special characters ,, {, }, and *. Any sequence of one or more other characters is a single multi- letter token. This classification determines the design of our tokenizer: 35
  • 53.
    36 5 ParsingText Meaning Character Any literal character c c Zero or more characters * Alternatives {x,y} Table 5.1: Glob grammar. 1. If a character is not special, then append it to the current literal (if there is one) or start a new literal (if there isn’t). 2. If a character is special, then close the existing literal (if there is one) and create a token for the special character. Note that the , character closes a literal but doesn’t produce a token. The result of tokenization is a flat list of tokens. The second stage of parsing assembles tokens to create an abstract syntax tree (AST) that represents the structure of what was parsed. We will re-use the classes defined in Chapter 4 for this purpose. Before we start writing our tokenizer, we have to decide whether to implement it as a set of functions or as one or more classes. Based on previous experience, we choose the latter: this tokenizer is simple enough that we’ll only need a handful of functions, but one capable of handling a language like Python would be much larger, and classes are a handy way to group related functions together. The main method of our tokenizer looks like this: def tok(self, text): self._setup() for ch in text: if ch == *: self._add(Any) elif ch == {: self._add(EitherStart) elif ch == ,: self._add(None) elif ch == }: self._add(EitherEnd) elif ch in CHARS: self.current += ch else: raise NotImplementedError(fwhat is '{ch}'?) self._add(None) return self.result This method calls self._setup() at the start so that the tokenizer can be re-used. It doesn’t call self._add() for regular characters; instead, it creates a Lit entry when it 2023-*{pdf,txt} [Lit, 2023-] [Any] [Either, pdf, txt] Lit 2023- Any Either pdf txt Null tokenizer parser Figure 5.1: Stages in parsing pipeline.
  • 54.
    5.1 Tokenizing 37 A * { B , C } A*{B,C} currentresult A [ ] ch [[Lit, A], [Any]] [[Lit, A], [Any], [EitherStart]] B [[Lit, A], [Any], [EitherStart]] [[Lit, A], [Any], [EitherStart], [Lit, B]] C [[Lit, A], [Any], [EitherStart], [Lit, B]] [[Lit, A], [Any], [EitherStart], [Lit, B], [Lit, C], [EitherEnd]] Figure 5.2: Steps in tokenizing a string. encounters a special character (i.e., when the current literal ends) and after all the input has been parsed (to capture the last literal). The method self._add adds the current thing to the list of tokens. As a special case, self._add(None) means “add the literal but nothing else” (Figure 5.2): def _add(self, thing): if len(self.current) 0: self.result.append([Lit, self.current]) self.current = if thing is not None: self.result.append([thing]) Finally, we work backward to initialize the tokenizer when we construct it and to define the set of characters that make up literals: CHARS = set(string.ascii_letters + string.digits) class Tokenizer: def __init__(self): self._setup() def _setup(self): self.result = [] self.current = A Simple Constant The code fragment above defines CHARS to be a set containing ASCII letters and digits. We use a set for speed: if we used a list, Python would have to search through it each time we wanted to check a character, but finding something in a set is much faster. Chapter 17 will explain why the word “ASCII” appears in string library’s definition of characters, but using it and string.digits greatly reduces the chances of us typing abcdeghi...yz rather than abcdefghi...yz. (The fact that it took you a moment to spot the missing letter ‘f’ proves this point.) We can now write a few tests to check that the tokenizer is producing a list of lists in which each sub-list represents a single token:
  • 55.
    Random documents withunrelated content Scribd suggests to you:
  • 56.
    venomous snake there,but the only viper, except Russell’s viper, a much larger snake. Only twice could I observe the toxic effects of the Echis carinata at present (1882) in the collection; both cases being in hot weather. It has so far conformed to circumstances in England, as to consent to dine on small white mice, failing scorpions. In the first case it struck the mouse savagely as soon as it was dropped into the cage, and the mouse died in less than two minutes. Echis approached it stealthily and timidly, but having at last got courage to seize it, ate it very quickly; and as the snake moved and dragged it, the mouse appeared to be quite stiff in that short time. On the second occasion, it bit a mouse on the leg, and it was five minutes dying. At first only the leg was paralyzed; then a spasm followed, and the mouse fell over and lay extended flat and still as if dead; but presently a spasmodic convulsion followed. It again appeared to be dead, and the little viper approached; but on a very slight spasm receded swiftly, not once taking its eyes off the mouse, which was dying slowly. The viper was at least five minutes swallowing this, and as if it did not much care about it. One must argue, therefore, that the charge of venom had been scantily expended, as the difference between this and the previous victim was remarkable. Echis poison has been seen to take instantaneous effect. The small Vipera atropos from the South African mountains is also astoundingly virulent. One in the collection in 1881 struck a mouse as soon as it arrived, and death occurred in fifty seconds by the watch. A large store of poison must have accumulated during its journey and since its previous meal. One more African snake must be mentioned before I conclude the painful duty of describing the inevitable—though happily short— sufferings inflicted by venomous serpents. Three young Najas, the well-known Ring Halsschlange of South Africa, were brought in the spring of, I think, 1877. They were very black and very shy, and for a long while one could see nothing more of them than three little heads in a row peeping out from under their blanket, and watching with their large round black eyes, but vanishing like a shot at your approach. ‘They cut away the moment you go near them,’ said the keeper. When they did give us an opportunity of looking at them, we found that one was quite black, and another was speckled with white; they erected their heads and distended their necks
  • 57.
    defiantly. Their eyeshad a white rim round them, and were bright and undeniably beautiful, even though belonging to a venomous snake. Whether because they were young and inexperienced, or naturally stupid, I could not decide but of all the snakes none ever went so awkwardly to work in feeding, or put their victims to such unnecessary torture, as did these ridiculous little Najas. The feeding observations were made in August, when they had grown considerably, and had become accustomed to their home. They seemed to bite the prey anywhere without much effect, sometimes retaining it in their mouth, and at other times beginning at once to eat it. One frog was ten minutes from the time it was struck until it was swallowed, and for no reason beyond the feeder’s awkwardness. The little snake began at a hind leg, and not being able to get the frog into its mouth, put it down and began again at the side, but with no better result, the legs being in the way. Then he gave it up and let the frog go, and presently his comrade struck the half-dead thing and took five minutes to eat it. One might decide from this that frogs were not their natural food; but with very young sparrows the same mismanagement was observable. The bird was awkwardly bitten on the tip of the wing, and the snake held it helplessly for a quarter of an hour while the bird was struggling violently. Not getting good hold, the snake put it down and began again, so that the poor little sparrow was twenty minutes in being swallowed, gasping to the last, and evidently only very feebly poisoned. One of the Najas bit his companion, and held on for about ten minutes, and for no reason whatever that one could discern. In no other venomous snakes have I seen such prolonged suffering caused by such stupidity or bungling as in those young African ‘Ring Hals.’ Their fangs are, however, exceedingly short, as I found on examining a dead one, and this may account for the slow effect of them. Three other heads were often seen in a row peeping out, but belonging to harmless ‘glass snakes,’ and there was intelligence in their looks; for they recognised the keeper, and advanced to the glass whenever he passed, asking for their dinner as plainly as little snakes could ask. A Heterodon exhibited equal intelligence when it was dinner- time, and sprang at the glass when he saw the keeper coming. Some of the pythons display intelligence too, on feeding days, but of quite an epicure form. One day in May 1876, on remarking that the pythons
  • 58.
    were disinclined toeat, Holland said ‘they were waiting for young ducks,’ only elderly birds being in their cage at the time. Even in summer they don’t eat the old ducks so eagerly, because the large, hard quills annoy them. A bunch of these quills passes undigested. Hair or feathers in a desiccated mass pass through the snakes, and occasionally, when they are not in health, digestible but undigested substances too, also the beaks of the ducks. Vegetable substances have been found in snakes, from which it has been argued that they sometimes eat vegetables. But it rather argues that they don’t digest vegetables, which have probably been swallowed in the stomach of a rabbit or some other herbivorous animal that they have caught. An indifference to food was noticeable in the snakes in ungenial weather. One cold, raw, foggy day in October 1873, a python caught a duck and partially coiled it, but so feebly that the bird, after passively submitting for a time, at last disengaged her feet and walked away to shake herself, and then turn and stare as if to discover what possibly had kept her there. A similar disinclination to exert themselves was seen that same chilly day in the largest cage, where were three large pythons. One of them having killed a duck, could not get a satisfactory hold of its head, and let go repeatedly. Another held a duck, but not to crush it or hurt it; for it, like the one above named, only gazed deliberately around, and as if asking the meaning of its detention. A third duck was put into the den for the third python, who, however, only lazily stared at it and made no attempt to seize it; while the bird gazed in astonishment at the one in the embrace of the other snake, as if to inquire, ‘What are you doing there?’ Presently this duck also got away, and was again caught and only partially coiled. The python seemed too large and fat to constrict so small a thing as a duck. It was like tying up a pill-box with a rope. Some of the spectators expressed satisfaction that the duck was not more tightly coiled, and hoped it would succeed in getting away (the duck was not worth two shillings, the python could not be bought for twenty pounds), and were far more horrified when a vigorous constrictor caught and killed its prey in one flash, as when an extended watch-spring flies back to its original position. But a half- constricted creature does suffer, and happily this does not often occur,
  • 59.
    the chilly weatherthat day diminishing ophidian energy considerably. A gentleman, disappointed because they did not eat, and wishing to assign some reason for such unaccountable abstinence, remarked to his friend, ‘I have an idea they sting themselves.’ Watching these gigantic ophidians on one of those half-wintry days, it happened that two of them were lazily gliding, partially hidden by their blanket, and with neither heads nor tails visible, so that the two bodies seemed as only one snake. Two youths stood watching and vainly endeavouring to calculate the numbers of feet or of yards which were entwined and entwisted in those moving coils. Portions and loops of two other pythons in the same cage were visible beyond the rug, but only one head of all the four snakes was to be seen; and to distinguish to which of the gliding, shining curves that head belonged, was impossible. ‘It seems to me that snake’s such a length that he doesn’t know the other part belongs to him,’ remarked one boy to his friend. ‘I don’t think he knows where it is,’ returned the other boy sympathetically. Not a little are the keepers sometimes tried in replying to the inquiries of visitors desirous of improving their minds. Let me repeat one or two conversations overheard on those Fridays. ‘Is that duck put in there for the snake to eat?’ asked a respectably dressed man of the keeper on one of those autumnal days, when a duck sat pluming itself as if settling itself for the evening. ‘Yes, sir,’ replied the keeper. ‘Will he swallow it whole?’ ‘Yes, sir.’ ‘Choke him! I should think?’ ‘No, sir; no—it won’t choke him.’ The man studied the duck, and studied the size of the python’s head and throat for some time. The duck apparently going to rest, but not quite reconciled at so many persons intruding upon her, the man looked disappointed, and again began: ‘Now is that duck charmed, sitting there?’ ‘I should think, sir, she was not at all charmed with the prospect,’ sedately replied Holland.
  • 60.
    ‘Does that duckknow it’s going to be eaten?’ then inquired the man after fresh scrutiny. ‘No, sir,’ returned the keeper with the utmost gravity. ‘That snake don’t seem to be hungry,’ then said the disappointed observer. ‘No, sir. He’ll eat well enough next Friday. He’s going to change his skin.’ ‘Oh!’ said the man to a boy by his side, satisfied, though still rather puzzled, ‘that snake’s going to change his skin next Friday.’ Though there are always on an average fifty snakes in the Reptile House, and on an average each casts its coat three times a year, the visitors are for the most part much mystified about this phenomenon. A snake that had just completed a new toilet had a portion of the old slough still adhering to its tail, when a boy drew attention to it, saying, ‘Papa, that snake is all ragged and torn on its tail.’ ‘Yes, my dear, it is casting its tail.’ Papa must have been reading Aristotle, who wrote: ‘Tails, also, of serpents and lizards when cut off are reproduced.’ With regard to the reproduction of their eyes, Aristotle spake more cautiously. ‘It is reported that the eyes of serpents, if dug out, will be reproduced.’ But, on the contrary, the eyes of snakes are easily injured, and not easily healed; snakes are therefore frequently seen partially blind. As need scarcely be said, only lizards ‘reproduce’ a tail that has been accidentally abridged; and the repair is after all only a boneless one. The truncated member gradually heals, and by and by a short point is again formed, but can always be recognised as a repaired, and not the original, tail; and as far as I have been able to observe, viz. for three or four months, no bone was reproduced. Probably also a snake’s tail might heal in the same way, and to a casual observer appear quite perfect; but the anatomical structure in either case would not, I imagine, be restored. That boy was not far wrong when he said he thought the python did not know which was its own tail. At all events, it is not endowed with much external sensation, as one might judge by the way in which the rats and guinea-pigs take liberties with it. This must be owing to the thickness of the cuticle, because, as we have seen in the constricting snakes, there is keen muscular sensibility in the tail. I may cite an
  • 61.
    instance of eachcase. One day a young rabbit caught hold of a small python with its teeth and held firmly on. The reptile was moving across the cage, and did not appear to feel any hindrance. Indeed, being much the stronger of the two, the persistent bunny was compelled to hop along at the same pace, still holding on by its teeth. But presently, from the position of the snake, the rabbit was obliged to let go, when it next caught hold of the tip of the python’s tail, and again holding tight, hopped after the retreating reptile as if enjoying the joke. In this case I do not think the snake was conscious of the insult, as perhaps the rabbit had hold of the skin only. On the other occasion a guinea-pig was biting a coiled and passive constrictor, Python sebœ. The snake wished to be quiet, but piggy got among its coils and worried it, hopping over it and biting its tail. The python on this, moving only the end of its tail, pushed away the guinea-pig, which soon returned to the attack. The snake again gave the little animal a caudal hint that his fidgeting was annoying; but as the guinea-pig did not take the hint, and still nibbled and teased the snake, the latter with two coils of the tail put an end to the annoyance, not once turning its head, but just tucking up its persecutor in the end of its tail as you might tuck up a parcel under your arm. The python was not hungry, and took no more notice of the offender, though thus effectually punishing the offence with the last two feet of its practical tail. Could we suppose such a quality as muscular intelligence, we might think the tails of those constricting snakes were surely endowed with it. As in other instances already described in chaps. xi. and xii., the eyes took no part in directing the movements of the snake; the whole nine or ten feet of the animal remaining passively coiled, while only the extremity of the tail exerted itself. When reptiles are in a partially torpid condition, their sensations are slow; when hibernating, they are reduced to a minimum. At such times, the creatures being half dead, they may be maimed or injured without any apparent effect. Rats have been known to attack and nibble snakes under these circumstances, and even to eat bits out of them, the snakes being at the time unconscious of injury, though possibly dying from the after effects. A good deal of very interesting matter might be added on the economics of the reptilian ménage, the mode of ventilating and warming it, the cost of its larder, and the best means of preserving the
  • 62.
    health of theinmates. There are, besides, some incidental experiences not devoid of sensationalism in connection with snake guardianship, but my own herpetological experience does not extend beyond the keeping of pet lizards, including blindworms. I may add a word, however, in reply to some often-heard lamentations of disappointed spectators who object to the coverlets, after sometimes waiting in vain to see the snakes emerge from beneath them. ‘Those horrid blankets! Why not give the snakes moss or hay in their cages? or turf and sand and dead leaves? Much more natural for them than those woollen rugs.’ I, too, may have echoed such plaints until a better comprehension of ophidian nature showed the wisdom of what is certainly a somewhat disappointing arrangement. And those who have honoured these pages with a patient perusal, and discovered the nervous timidity and sensitiveness of these reptiles, their proneness to reject or to disgorge their food, to injure themselves or each other when molested, not to mention the danger of meddling with the venomous kinds and the easy escape of the swifter snakes, will admit the importance of providing them with such retreat and shelter as can be most speedily arranged, and which will secure the least annoyance to the terrified serpents while the keepers are doing their best to preserve order and cleanliness. The allusion to lizards tempts me to add a word or two on the exceptional species which has lately become an inmate of our Zoological Gardens. There are certain features in it so much in common with viperine snakes, that I may be pardoned for dragging a lizard into these pages. I allude to the Heloderm (Heloderma horridum) from Mexico, presented to the Zoological Society in July 1882 by Sir John Lubbock. Its advent was an event in reptilian annals; and being surrounded by a halo of curiosity, it claims a passing notice. We have been at some pains to exonerate saurians from the evil character which our ancestors were apt to give them; but suddenly—and to the surprise of even some herpetological authorities—there comes a lizard that with one grip of his jaws caused a frog to fall dead in a moment, and a guinea-pig in three minutes, the symptoms appearing to be the same as those produced by deadly snakes. The Heloderm is ‘said’ to be furnished with poison glands in both jaws! But until a dead specimen
  • 63.
    has been furtherexamined and described, the signification of ‘poison gland’ must be restricted. Its teeth—many and strong—are grooved with a deep furrow; its salivary glands are largely developed; and under excitement a thick, acrid secretion flows abundantly from its jaws. Yet so far as present observations enable us to form an opinion, the reptile does not use these formidable teeth to secure its prey, or even in feeding. It did not devour the victims of its bite, nor has it since killed any creature for the express purpose of eating it. Up to the date at which I write (Oct. 1882), eggs have formed its chief diet, varied by an occasional dead mouse. Now it certainly does not require deeply- grooved teeth and venomous saliva to bite raw eggs and dead mice. Nor does the noxious secretion flow continuously from its gums in repose, but abundantly so when irritated. Though a stranger in England, this lizard was known more than two hundred years ago. Hernandez, in his Nova Animalium Mexicanum, published at Rome in 1651, described its bite as ‘hurtful, but not deadly;’ and that it was ‘more dreadful in appearance than reality.’ Its Mexican name, Acaltetepon, is (or was then) applied to all large and suspicious-looking lizards. Scorpione is its modern name. As Heloderma horridum was awarded plenty of space in the journals at the time of its arrival, full accounts of it will be found elsewhere; it is introduced here merely as one of the venomous reptiles that form the chief subjects of this chapter, and to trace its analogy with them. In its slow, stealthy movements there is the same striking contrast between the Heloderm and most other lizards, that there is between the deadly vipers and the active colubrine snakes; and the inquiry suggests itself, Can the venom elaborated in their system so act upon themselves as to produce this habitual lethargy? Drowsiness and coma are almost invariable effects of snake venom in the blood, and why is it that the deadly serpents are so constitutionally different from others? The Heloderm has a round, heavy tail, of no service to it in swimming, and short, weak fingers, ill suited to climbing; and it passes its lethargic existence on the sandy plains of Mexico, manifesting in its actions, or rather in its inactivity and stealthiness, a conscious timidity and cowardice. Motionless for hours, with an impulse to retreat if molested, but attempting to bite if angered, its noxious saliva would seem to be rather protective than aggressive. It may have formidable enemies at home; and by all we see
  • 64.
    of it here,it does not use its teeth as a means of obtaining food. In this respect, therefore, it is an exception to deadly serpents, and cannot take its venom into its stomach as they do. And, again, the remarkable development of its tongue suggests a peculiarity of food. In lapping the egg, the action of it is apparently perfected by practice; the tongue is twisted, extended, twined under, then over, now used as a shovel, a scoop, or a broom, as occasion requires. It is the very reverse of what I noticed in some other lizards feebly lapping up an egg (see p. 71), for in a most expeditious manner does Heloderm cause its raw eggs to disappear. A word à propos of its name horridum, supposed by many to refer to its objectionable qualities. Unfortunately the word ‘horrid’ has almost entirely lost its original signification and become mere slang in English. But when Wiegmann assigned it the name of Heloderma horridum in 1829, ‘horrid’ was understood according to its original meaning, from horridus, rough, rugged, etc.; and as this reptile has a remarkable skin, dotted over with little prominences, like knobs or warts (hence its generic name, Heloderma, warty skin), there can be but little doubt as to the intention of horridum. In a communication to Knowledge (Sept. 29), I ventured to call this the ‘Warty-skinned Lizard,’ in consequence of the confused accounts of it which have appeared in print. There are several other warty-skinned or ‘tuberculous lizards.’ The specific horridus, as applied to the South American Crotalus, also signified its terrible or dreadful character, and not the ‘horrid’ which spectators apply indiscriminately to snakes and their blankets. With the rapid advance of ophiology comes the splendid new home for snakes which will shortly grace our Zoological Gardens; and in taking leave of my readers, I cannot offer them a kinder wish than that their visits there to observe the snakes will be productive to them of as much pleasure as has been mine to describe them.
  • 65.
    INDEX. LIST OF ABBREVIATIONS. Am.,American; Ass., Association; Br., British; co., cobra; Con., Convention; cons., constrictor; cro., crotalus; cy., cyclopædia; C.M.Z.S., Corresponding Member of the Zoological Society; F.Z., Fellow of the Zoological Society; hist., history; nat., natural; N.Y., New York; py., python; Soc., Society; s., snake; ss., snakes; ser., serpent; v., viper; vs., vipers; U.S., United States; z., zoological; Z.G., Zoological Gardens; Trans., Transactions; Proc., Proceedings. A ‘Aberfoyle’ (the barque), sea-ser. seen from, 249. Abnormal development, of teeth, 67; of sea-sers., 265, et seq.; of hair, 302; two heads, 189. Abnormal, health in captivity, 84, 440, 450, 457, 565; gestation, 437, 439, et seq., 459, 466, 505. Academy of Sciences, Paris, 444. Acaltetepon, the, 590. ‘Account, of the Beasts in Virginia’, 280; of the rattlesnake, 275. Acrobats, 181, 196, 198, 214, et seq., 219, 239, 472, et seq. Adaptation, of organization, 70, 135;
  • 66.
    of habits totemperature, 159, et seq. (see hibernation); of coils, 200, 204; of ribs in progression, 207; of form, 215. Adaptive development, of head bones, 31, 34; of spine bones, 70; of windpipe, 132; of salivary glands, 342, 350, 537, 557; of teeth, 348, 350, 364, 408. Adipose tissue, 394. Admiralty, the, report of a sea-ser. to, 255, 259. Advance of Ophiology, 3, 21, 75, 81, 191, 273, 353, 363, 372, 443, 485, 515, et seq. Ælian, an error traced to, 191. Æsculapius, his remedies, 522. Africa, range of sea-s., 236. Air-bladder, the, 262. Air-breathers, 44, 146, 222, 265. Air-tube, 132, 135, 137, et seq. (see glottis, respiration). ‘Albert Nyanza’, the, 358. Alceste, Voyage of the, 111, et seq. Alcohol a popular remedy, 548, et seq. Aldrovandus, 102; his work, 272. Alexander the Great, 407 (see Bucephalus). Alligators, 43, 261. Amazon, the Jararaca, 421; Wallace’s ‘Travels in’, ib.; anaconda from, 455. ‘American Agriculturist’, the, 485. American Ass. for the Advancement of Science, 485, et seq., 572; secretary to, 490. American colonies, the, 284. ‘American Naturalist’, the, 93, 151, 308, 310, 450, 485, et seq., 490;
  • 67.
    editor of, 485(see Putnam). American sea-ser., 252. Ammonia an approved remedy, 533, 547. Amphibia, the, 46. Amsterdam, py. at, 437. Anatomy, of head, 30; of jaws, 31, 34; of the spine, 209; of the rattlesnake, 275. ‘Anatomy of the Vertebrates’, 67, 119, 131, 143, 147, 180, 184, 196, et seq., 212, 336 (see Owen). ‘Anecdotes of Serpents’, 216, 523. ‘Animal Biography’, 134. Animal kingdom, 51. ‘Animal Physiology’, 147, 210 (see Carpenter). ‘Animal Physiology’ (Roget’s), 120, 195 (see Roget). ‘Animal World’, the, 169. ‘Animalium Mexicanum’, 190, 590. ‘Annales des Sciences Naturelle’, 78, 91, 442, 444. Antennæ of insects, 118, 120, 126, et seq. Antidotes, 275; Fayrer’s definition of, 533; various reputed ones, ib., 534, 539, 547, et seq. Apodal, ss. are, 206. Appendages, mythical, 101; caudal, 172, et seq.; epidermal, 315; ‘horns’, 320; crest, 325; tentacular, 326; as ‘claws’, ib. (see rattle, epidermis). Aquatic ss., 53, 145, 150, 174, 221, 225, et seq., 233, et seq., 401, 423, 453, 495, 496. Arabs chew herbs, 525.
  • 68.
    Archeopteryx, 44. Aristotle, hisname for reptiles, 45; bite of s., 96; marine ss., 244; vs., 432, et seq. Arius, the, 489. Armadillo, 413. Association, Am., 485, et seq. Association, Br., 42. Atlantic, the, 249; sea-ss. in, 238; sea-ser., 252. Atrophied limbs, 326. Atter, ætter, 479. ‘Aunt Judy’s Magazine’, 21, 72, 303, 312, 333, 470. Aural: powers of anguis fragilis, 476; apparatus of ground ss., 527. Australia, s.-hunting in, 167; sea-s., 236, 246. ‘Australia, Snakes of’ (see Krefft). Axolotl, 44. B Bacon, Lord, quoted, 49, 57; a poor naturalist, 99. Bagrus, the, 489. Baird of U.S. on the bull-s., 502. Baker, Owen, a sea-ser. seen by, 257, et seq. Baker, Sir Samuel: v. fangs, 357. Balance of Nature, 17, 57. Balfour’s ‘British India’, 74, 76, 86. Balfour’s ‘Cyclopædia of India’, 86. Bancroft, H. H., ser.-worship, 514.
  • 69.
    Banks, Sir Joseph,action of ribs, 207; letter to Waterton, 418. Bard of Avon, the, 97. Bartlett, Mr. A. D., on the sea-ser., 255, 261, et seq.; presenting a v., 322; buying an anaconda, 455. Barton, Benjamin Smith, on the Cro., 299. Bartram (Mr., of U.S.): a ‘roaring’ s., 155; ‘cluster of teeth’, 371. Bates (H. W.): a cannibal s., 39; Jararaca, 421. Batrachia, the, 51. Battues of ss., 286, et seq., 289. Beal (J. W., of U.S.): sound of the rattle, 309. Beaufort, Duke of (A.D. 1709), 173. Beaumont and Fletcher, quotation from, 487. Beauvoir, Palisot de, his testimony, 488. Bell, Prof. Thomas, 4; food of ring-s., 74; his tame s., 76; his ‘British Reptiles’, ib.; a trustworthy herpetologist, 78; ss. drink, ib., 91; editor ‘Zoological Journal’, 140; on anguis fragilis, 453; gestation of anguis fragilis, 466; sloughing of anguis fragilis, 481; the maternal refuge, 493. Bellowing s., 155, et seq. Ben Jonson, 99. Bengal, Bay of, range of sea-ss., 236. Berkeley; rattlesnake remedies, 538. Beverley, Colonel: a stinging tail, 174; his ‘History of Virginia’, 281; a severed head biting, 390.
  • 70.
    Bezoar, 523. Bibron, 80(see Duméril). Bingley, a boa feeding, 134. ‘Blackie’, 458, et seq. Bladder, the swim-, 145. Blake (Colonel, of Virginia), a rattlesnake, 284. Blanket, swallowed, 35; disappointing, 588. Blowing: the, vi. 152; as ‘puffing’, 148; like a bull, 155. Bluets, 65. Bond, Rev. H., his testimony, 491. Bones: of the head, 30; intermaxillary, 31; of spine, 178, 198, 213; in Deirodon, 67; of tail, 183; of ‘claw’, 220 (see jaw, vertebræ, tails, etc.). Bonnat, 228, 383. Borneo, Crotalidæ in, 386. Boston, U.S., sea-ser. off, 251. Buffon, 197; his era, 383. Bourbon, Isle of, py. incubating in, 444. Bournemouth, lizards at, 164; a capture, 458. Bourrelets, of the rattle, 304. Bowerbank, a two-headed s., 190. Braden, J. G., Esq.: specimens of rattle, 306. Brazil, tree s. in, 219; ‘Discourse’ of, 271; names of Crotalidæ, 277; specimens from, 339, 360;
  • 71.
    Jararaca in, 396; ‘Travels’in, 397; experiments in, 406; vernaculars of, 419, 423, et seq. Breathing: irregular, 142, et seq.; sometimes partial, 144; suspension of, 145, 161, 168, 253; as ‘puffing’, 149, 155 (see hibernation, respiration). Bridgewater Treatise, 120. British Ass., 42. British Guiana, Hist. of, 420. British India (see Günther, Fayrer, Balfour, Nicholson, etc.). British Museum, 19, 131, 291, 377, 405, 418; Dr. Günther’s catalogue of ss. at, 354. ‘British Reptiles’ (Bell’s), 76, 140, 453 (see Bell). Brittain, Mr., evidence of, 504. Broderip, Dr. J., a naturalist, 113; his works, ib.; on the larynx, 135; quoted by Gosse, 112; observations by, 134, et seq. Browne, Sir Thomas: ‘Vulgar Errours’, 171; a two-headed s., 191; the maternal refuge, 487. Bruton, Dr. Lauder, 552. Buckland, Frank: his visit to Chelsea, 13; a ‘flannel sausage’, 36; the Coronella, 83, 435; a waggish s., 104; a mistake of, 115; sensitiveness of tail, 183; a sea-s., 238; a ribbon fish, 250; sea-ser., 255, et seq., 264; two-headed s., 189; reserve fangs, 346; s. eggs, 437; anaconda, 455;
  • 72.
    invited, 464; obituary noticeof ‘Cleo’, 515. Bull -frog, 156. Bullen, George, Esq., of Br. Museum, 19. Bulletin, U.S., Zoological ‘Reports’, 291, 309, 388. Burman coast, sea-ss. range, 238. Burrowing ss., 46, 53, 188, 459, 468. C Caledonia, New, sea-ss. at, 238. Cannibalism, 37, 182, 199, 401, 562; sometimes unintentional, 38, 572; common among the Elapidæ, 567, 571. Cañons, ss. in, 162. Cantor, Dr. Theodore, sea-ss. drinking, 82; quotes Schlegel, 90; tongue of sea-ss., 125; sea-ss. asleep, 169; on pelagic sers., 235; their poison, 243; their food, 245; number of species, 246. Cape, the, sea ser. at, 252, 259. Capybara, the, 229. Carbolic acid, kills ss., 544; a remedy for the venom, ib. Carinate scales, 319 (see ‘keel’). ‘Carolina, History of’ (see Lawson). ‘Carolina, Natural History of’ (see Catesby). Carpenter, Dr., F.R.S., etc., links and transitions in nature, 44; his opinion of Bacon, 99; lungs of ss., 143; hibernation, 165; length of spine, 210; insalivation of food, 352;
  • 73.
    vegetable protectives, 539. Carver,Jonathan, a ‘blowing’ s., 153; large swarms of ss., 225; the maternal refuge, 487. Catesby, Mark: an egg robber, 63; the ‘blowing v.’, 152; ‘Horn ss.’, 174, 189; ‘water vs.’, 226; rattle-ss., 284; ‘hog-nosed’ ss., 410; Indian remedies, 538. Catlin, George, rattlesnake battues, 287; an alarm, 391; conjugal ss., 502; Indian superstitions, 509. Cats, their whiskers as feelers, 124; tenacious of life, 559; resist venom, ib. Caudal, eloquence, 155, 179, 311, 587; appendages, 170, et seq. 174 (see tail, rattle, etc.). Caverns, the retreat of ss., 287, 443. Caves, sacred, 509; abode of ss., 124, 162, 166, 288. Centipedes, legs of, 212. ‘Ceylon, History of’ (see Tennant). Chalk epoch, 262. ‘Challenger’, voyage of the, 262. Chambers: ‘Anecdotes of Serpents’, 216, 523; editor of the ‘Journal’, 23; observations at the Z.G., 217; the ‘Miscellany’, 523. Chancery, ss. in, 13, 515. Chandernagor, travels in, 444. Charas, Moyse: his work, 273; he ‘grovels’ for fangs, 359, 372; experiments on vs., 371;
  • 74.
    a ‘nursery’ offangs, ib.; knew of the mobility of fangs, 372. Charming, Sir H. Sloane on, 281; its origin, 515, et seq., 578, 585 (see fascination). Charms, 281; in s. relics, etc., 509, et seq. Chase, a, with a s., 214. Chased by a s., 185. Chateaubriand’s descriptions of ss., 153, 175, 197, 307; the maternal asylum, 487. Chelonia, 51. Chelsea, tame ss. at, 13, 27, 515, 525. Chicago, observations at, 496. Chimaphila, 65. Chinese mythologies, 509. Chittagong, rare beast from, 261. Chordæ vocales, 147. Circulation of blood, 56; checked by cold, 161; renewed by warmth, ib.; moisture essential to it, 162 (see hibernation, respiration, etc.). ‘City of Baltimore’, the, sea sers. seen from, 249. Clarke’s translation of Der Hoeven’s ‘Handbook of Zoology’, 118. Classification, 50; at present defective, 51; five principal groups of ss., 53; by dentition unsatisfactory, 354, et seq.; difficulties occurring in, 413, 421; Krefft on, 423 (see nomenclature). Clayton, Mr. J., the ‘tayle’ of rattlesnakes, 280. Climbing, 180, et seq., 196, 214, et seq., 230; of sea-ss., 238; of Anguis fragilis, 475, 482. Cockburn versus Mann, 13.
  • 75.
    Coiling, the, 48; inconstriction, 29, 203; of tail, 182, 587; in convolutions, 185; for a spring, 198; to substitute hands, 199, et seq., 206; swiftness of, 200; flexibility of, 218; of the sea-ser., 257; of Heterodon, 409, 570; in repose, 447, 587; of ‘Lizzie’, 472, 478; of Natrix, 569; of the Lacertines, 571 (see constriction); before striking, 573. Cold; ss. affected by, 143, 159, 165, 584, et seq. Cold-blooded, 56; why so, 142, 146, 159 (see hibernation, etc.). Colours of ss., 10; under excitement, 153, 572; of tree ss., 219, 386; of rattlesnakes, 270, 285; of African vs., 321, 338, et seq.; of ‘Bushmaster’, 417, et seq., 426; variable in young ss., 424; in other ss., ib.; after moulting, 508 (see sloughing). Combats between ss., 37, 199, 563. Congress (U.S.), Government Commissions, 199, 376. Constriction, 29, 199, 203, 214, 245; of young boas, 216, 439, 446; sometimes feeble, 583. Constrictors, 14, 35, 38, 135, 141, 182, 198, et seq., 213, et seq., 258, 336, 438, 446, 454, 583. Convention, a, of ss., 104; U.S. on ss., 485, 505, 552, 572. Cooke, M. C., ‘Our Reptiles’, 491; editor ‘Science Gossip’, ib.; a herpetologist, ib.
  • 76.
    Cooper, W. R.,‘Serpent Myths’, 514. Cope, Professor, of U.S., 386. Cotton, Dr., of Tennessee, his rattlesnake, 298. Coues, Dr. Elliott, of U.S., a combat, 199; sound of the rattle, 309; development of rattle, 312; action of fangs, 362; cro. fang, 376, et seq.; species of cro., 386; virulence of bite, 390; pigs not exempt, 394. Council of Z. Soc., 78, 322. Cows sucked by ss., 84, 478. Coypu, 229. Cragin, Mr. F. W., on Heterodon, 450. Cranberry swamps, the Massasauga, 393. Crepitaculum caude, the, 294. Crispe, Dr. Edwards, F.Z.S., on the œsophagus, 490; the maternal refuge, ib. Crocodilia, the, 51, 261. Crotalina, 292. Crotalum, 292. Crotchets mobiles, 353, 362. Cruden, s. poison, 102. Cruelty, 17, 169, 206, 321, 469. ‘Curiosities of Natural History’ (see Buckland). Cuvier; his classification, 46; his ‘boa’, 47; he distinguished fangs, ib.; his era, 90, 383; quoted by Darwin, 176; his ‘vs.’, 353; incubation, 434, 456. ‘Cyclopædia of Anatomy’, 118
  • 77.
    ‘Cyclopædia of India’,86. ‘Cyclopædia, the Penny’, 113, 120. D Dahomey, ser. deities at, 514; natives fearless of vs., 523. Dalton, the ‘Bushmaster’, 420. Danish vernaculars, 479. Darwin; complex organisms, 44; cro. mutus, 176; a living fossil, 244; the Platypus, 263; survival of the fittest, 267; on the rattle, 311; atrophied limbs, 326, 452. Daudin, 353, 383, 488. Davidson, Mr. (R. N.), a sea-ser., 251. Davis’ Lectures at the Z. G., 24, 51, 214, 413 (see Flower, Huxley, Mivart). Dean, Dr., a sea-ser. seen by, 249. ‘Deccan Days’, 517 (see Frere, Miss). Deer kill rattlesnakes, 394. Deglutition; manner of, 30, 111, et seq., 132; facilitated by saliva, 35, 109, 113; in drinking, 92, et seq.; Schlegel on, 362; watched at the Z. G., 581, et seq. De Kay; ‘Zoology of N.Y.’, 85. Demerara, ‘Bushmaster’ there, 423. Dentition, 34; of Deirodon, 67, et seq.; of sea-ss., 241; of sea-ser., 266; various forms of, 342, et seq.; distinguishing names in, 347; illus. of, 349, 355, 356, 360;
  • 78.
    Welcome to ourwebsite – the perfect destination for book lovers and knowledge seekers. We believe that every book holds a new world, offering opportunities for learning, discovery, and personal growth. That’s why we are dedicated to bringing you a diverse collection of books, ranging from classic literature and specialized publications to self-development guides and children's books. More than just a book-buying platform, we strive to be a bridge connecting you with timeless cultural and intellectual values. With an elegant, user-friendly interface and a smart search system, you can quickly find the books that best suit your interests. Additionally, our special promotions and home delivery services help you save time and fully enjoy the joy of reading. Join us on a journey of knowledge exploration, passion nurturing, and personal growth every day! ebookbell.com