Posts: 20 Threads: 11 Joined: Jan 2019

I need to turn a list of IPs into a regular expression, then copy and paste the result elsewhere.

The starting list
Quote:
10.10.10.10 host1
10.10.10.11 host2
10.10.10.12 host3
10.10.10.13 host4
10.10.10.14 host5

The desired output
Quote:
^10.10.10.10$|^10.10.10.11$|^10.10.10.12$|^10.10.10.13$|^10.10.10.14$

The current output. Notice the last "|"; I want that removed.
Quote:
^10.10.10.10$|^10.10.10.11$|^10.10.10.12$|^10.10.10.13$|^10.10.10.14$|

My cheap code

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    from __future__ import print_function
    import sys, os, re

    def cls():
        os.system('clear')

    def main():
        cls()
        try:
            #olist = []
            for line in open(sys.argv[1], 'r'):
                word_list = line.split()
                word_list[0] = re.sub("^", "^", word_list[0], flags=re.M)
                word_list[0] = re.sub("$", "$|", word_list[0], flags=re.M)
                print(word_list[0], end='')
            print('\n\n')
        except IOError as e:
            print("File Open Error")
            print("Error :", str(e))
        except IndexError as i:
            print("Usage: argv[0] <file having ip as the first field, hostname as the second>\nExample : 10.10.10.10 host1\n 10.10.10.10 host2\n 10.10.10.12 host3")

    main()

Working on a Linux VM

    [localhost etc]$ cat system-release
    CentOS Linux release 7.6.1810 (Core)
    [localhost etc]$ python -V
    Python 2.7.5

I know, the Python version is old and crusty considering 3.8 is in beta, but they are still using 2.7 at work.

My Question
The only way I can think of to get rid of the trailing pipe is to count the lines in the file, increment a separate counter as I run through the file, compare the total to the line counter, and if they are equal do something like print word_list[0][:-1].

Is there a better way to do this? As a side question, is there a way to combine the two re calls into a single line?

Thanks for any help provided!
Regards
Sum

Posts: 2,171 Threads: 12 Joined: May 2017 Sep-06-2019, 03:17 PM (This post was last modified: Sep-06-2019, 03:17 PM by DeaD_EyE.)
Use str.join

My output:
Output:
deadeye@nexus ~ $ python2.7 parse_ips.py
Without piping to program, you have to use --input-file
deadeye@nexus ~ $ python2.7 parse_ips.py --input-file
usage: parse_ips.py [-h] [--input-file INPUT_FILE]
parse_ips.py: error: argument --input-file: expected one argument
deadeye@nexus ~ $ python2.7 parse_ips.py --input-file hosts.txt
^10\.10\.10\.10$|^10\.10\.10\.11$|^10\.10\.10\.12$|^10\.10\.10\.13$|^10\.10\.10\.14$
deadeye@nexus ~ $ cat hosts.txt | python2.7 parse_ips.py
^10\.10\.10\.10$|^10\.10\.10\.11$|^10\.10\.10\.12$|^10\.10\.10\.13$|^10\.10\.10\.14$
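As a quick illustration of why str.join answers the trailing-pipe question: join puts the separator only between items, never after the last one. This is a minimal sketch with a few of the sample addresses from the first post hard-coded; the names `ips`, `anchored` and `pattern` are only for illustration.

```python
# Sample addresses from the first post, hard-coded for illustration.
ips = ['10.10.10.10', '10.10.10.11', '10.10.10.12']

# Wrap each address in ^...$ first, then join the pieces with '|'.
# join inserts the separator only BETWEEN items, so nothing trails.
anchored = ['^' + ip + '$' for ip in ips]
pattern = '|'.join(anchored)

print(pattern)
```

Because the anchors are added before joining, there is no need to count lines or slice off the last character.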
Code:

    #!/usr/bin/env python2.7
    from __future__ import print_function
    import sys
    import argparse


    def ip2regex(text):
        ips = []
        for row in text.splitlines():
            try:
                ip, hostname = row.split()
            except ValueError:
                # skip errors
                continue
            ip = '^' + ip.replace('.', r'\.') + '$'
            ips.append(ip)
        return '|'.join(ips)


    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('--input-file', required=False, help='Input file to generate regex output.')
        args = parser.parse_args()
        if args.input_file is None and not sys.stdin.isatty():
            print(ip2regex(sys.stdin.read()))
        elif args.input_file and sys.stdin.isatty():
            with open(args.input_file) as fd:
                print(ip2regex(fd.read()))
        else:
            print('Without piping to program, you have to use --input-file', file=sys.stderr)

Lines 15-17 prepare the IP address.

By the way, a dot is a metachar in regex. The dot stands for any kind of char. If you use the dot without escaping it, the regex ^10.10.10.10$ will also match: 10510710310

PS: split is the opposite of join.

Posts: 8,198 Threads: 162 Joined: Sep 2016

why complicate things that much? simple string methods and formatting would do?

    infile = sys.argv[1]
    with open(infile, 'r') as f:
        ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]
    print('|'.join('^{}$'.format(ip_addr) for ip_addr in ips))

and these 4 lines can be shortened to 2

Posts: 20 Threads: 11 Joined: Jan 2019 Sep-06-2019, 04:59 PM (This post was last modified: Sep-06-2019, 05:04 PM by sumncguy.)
(Sep-06-2019, 03:17 PM)DeaD_EyE Wrote: By the way, a dot is a metachar in regex. The dot stands for any kind of char. If you use the dot without escaping it, the regex ^10.10.10.10$ will also match: 10510710310

Yep, thanks. I understand that the "." means any char. The app I'm pasting into recognizes an IP as long as it's wrapped in ^$. Thanks.
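For anyone who does need the pattern to behave as a strict regex, the dot issue quoted above is easy to check. A small sketch, assuming the standard library's re.escape is acceptable in place of the ip.replace call used in the script earlier in the thread (the variable names are only for illustration):

```python
import re

ip = '10.10.10.10'

# Unescaped: every '.' matches any single character.
unescaped = '^' + ip + '$'
# Escaped: re.escape turns each '.' into a literal dot.
escaped = '^' + re.escape(ip) + '$'

loose_hit = bool(re.match(unescaped, '10510710310'))
strict_hit = bool(re.match(escaped, '10510710310'))
exact_hit = bool(re.match(escaped, '10.10.10.10'))

print(loose_hit, strict_hit, exact_hit)
```

The unescaped pattern accepts the digit string 10510710310, while the escaped one accepts only the literal address.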
I've seen this construct in some example code, but not in any instruction, probably because I'm just starting out. What is it called and where can I learn about it?

    ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]

Posts: 8,198 Threads: 162 Joined: Sep 2016

(Sep-06-2019, 04:59 PM)sumncguy Wrote: What is it called and where can I learn about it ..

this is a list comprehension. but you can also have a generator expression, e.g.

    (ip_addr for line in f for ip_addr, *_ in [line.split()])

in which case it will not create the full list in memory, or a dict comprehension.

note that it can be expanded as a normal for loop

    infile = sys.argv[1]
    ips = []
    with open(infile, 'r') as f:
        for line in f:
            for ip_addr, *_ in [line.split()]:
                ips.append(ip_addr)
    print('|'.join('^{}$'.format(ip_addr) for ip_addr in ips))

Posts: 20 Threads: 11 Joined: Jan 2019

list comprehension .. thank you

I don't like to just copy and paste solutions that I don't understand. Main reason: if I did, next month when I look at the code again I'd be thinking "What the heck does that do again?" So if I get even a high-level understanding and annotate my script, it will be easier to jar this crusty 54 year old memory! :)

Thanks
Sum

Posts: 2,171 Threads: 12 Joined: May 2017 Sep-06-2019, 11:59 PM (This post was last modified: Sep-06-2019, 11:59 PM by DeaD_EyE.)

The _ is a valid name in Python. In interactive mode it holds the last result, if it was not assigned to a name.

    >>> 5+5
    10
    >>> print(_)
    10

And the effect of the star (*) in front of one of the names in an assignment:

    start, *middle, end = 'start 1 2 3 4 5 end'.split()
    print(start)
    print(middle)
    print(end)

Output:
start
['1', '2', '3', '4', '5']
end
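One caveat for the interpreter from the first post: starred assignment is Python 3 syntax and raises a SyntaxError on 2.7. A rough equivalent using slicing, which should behave the same on 2.x and 3.x (the name `parts` is added for illustration; the other names mirror the example above):

```python
parts = 'start 1 2 3 4 5 end'.split()

# Slicing stands in for: start, *middle, end = parts
start, middle, end = parts[0], parts[1:-1], parts[-1]

print(start)
print(middle)
print(end)
```

For the IP file, only the first field is needed, so `line.split()[0]` covers the same ground without any unpacking.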
    ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]

Is the same as:

    ips = []
    for line in f:
        for ip_addr, *_ in [line.split()]:
            ips.append(ip_addr)

The name f should point to an open file. Iterating over a file object yields it line by line. But I think this example is overcomplicated. You can write this as:

    ips = []
    for line in f:
        ip_addr, *rest = line.split()
        ips.append(ip_addr)

Then you get rid of the nested loop. Turning this into a list comprehension:

    ips = [line.split()[0] for line in f]

Posts: 20 Threads: 11 Joined: Jan 2019 Sep-16-2019, 03:40 PM (This post was last modified: Sep-16-2019, 03:40 PM by sumncguy.)

I found that a few VMs are using 2.6.6. It seems that format wasn't introduced until 2.7, so the print solution doesn't work in some cases. Can anyone point me to a place where I can find out how to truncate the last "|" in 2.6.6?

I wish they would
1. standardize the Linux and Python versions they are using.
2. upgrade at least to Python 3.x, especially since 3.8 is in beta 2!

I work for a big company (can't say which), but I find it incredible that they aren't really doing any admin on their VMs.

Thanks for the help
Sum

Posts: 8,198 Threads: 162 Joined: Sep 2016 Sep-16-2019, 03:50 PM (This post was last modified: Sep-16-2019, 03:50 PM by buran.)

it works in 2.6, you just need to number the placeholder(s) (i.e. explicitly specify the order in which to place values in the placeholders).

    print('|'.join('^{0}$'.format(ip_addr) for ip_addr in ips))

this will also work in 2.7 and 3.x versions. Or to say it the other way around: in 2.7 and 3.x you can skip the number.
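Pulling the thread's suggestions together, here is a minimal sketch meant to run unchanged on 2.6, 2.7 and 3.x. The function name `build_pattern` and the `sample` list are hypothetical, introduced only for this example; it uses the numbered `{0}` placeholder from the post above and avoids starred unpacking.

```python
def build_pattern(lines):
    """Return '^ip$|^ip$|...' from lines shaped like 'ip hostname'."""
    ips = []
    for line in lines:
        fields = line.split()
        if not fields:
            # Skip blank lines instead of failing on fields[0].
            continue
        ips.append(fields[0])
    # '{0}' rather than '{}' keeps this working on 2.6 as well.
    # join adds '|' only between items, so no trailing pipe appears.
    return '|'.join('^{0}$'.format(ip) for ip in ips)


# Hypothetical sample input standing in for the lines of the hosts file.
sample = ['10.10.10.10 host1\n', '10.10.10.11 host2\n', '\n', '10.10.10.12 host3\n']
result = build_pattern(sample)
print(result)
```

With a real file, the open file object can be passed in directly, since iterating over it yields lines: `with open(sys.argv[1]) as f: print(build_pattern(f))`. Plain concatenation, `'^' + ip + '$'`, would sidestep format entirely if that is preferred.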