1 | | -# fortran-regex |
2 | | -Fortran port of the tiny-regex-c library |
| 1 | +### fortran-regex |
| 2 | + |
| 3 | +Fortran-regex is a Modern Fortran port of the [tiny-regex-c](https://github.com/kokke/tiny-regex-c) library for regular expressions. It is based on the original C implementation, but the API follows Fortran conventions and resembles the intrinsic `index` function. |
| 4 | + |
| 5 | +### API |
| 6 | + |
| 7 | +The main API is modelled around Fortran's `index` intrinsic function (which performs a simple search for a substring within a string): |
| 8 | + |
| 9 | +```fortran |
| 10 | + ! Simple regex |
| 11 | + result = REGEX(string, pattern) |
| 12 | + |
| 13 | + ! Regex with output matched pattern length |
| 14 | + result = REGEX(string, pattern, length) |
| 15 | +``` |
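For comparison, here is how the intrinsic `index` behaves on a plain substring search. This is a minimal sketch using only standard Fortran; the program name and sample text are illustrative and not part of fortran-regex.

```fortran
! Illustrative only: shows the standard `index` intrinsic that REGEX is modelled on.
program index_demo
   implicit none

   character(*), parameter :: text = 'table football'
   integer :: idx

   ! Position of the first occurrence of the substring
   idx = index(text, 'foo')
   print *, idx   ! 7

   ! `index` returns 0 when the substring is absent
   idx = index(text, 'xyz')
   print *, idx   ! 0

end program index_demo
```

`REGEX` takes the same string-first argument order, with a regex pattern in place of the literal substring.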
| 16 | + |
| 17 | +### Object-oriented interface |
| 18 | + |
| 19 | +One can also parse a regex pattern into a `type(regex_op)` structure and pass that in place of a string pattern. The practical benefit is unclear so far, but the interface is consistent with the string-based one. |
| 20 | + |
| 21 | +### Overview |
| 22 | + |
| 23 | +The original tiny-regex-c code has been significantly refactored, to: |
| 24 | + |
| 25 | +* Remove all references to `NULL` character string termination, and replace them with Fortran's string intrinsics (`len`, `len_trim`, etc.) |
| 26 | +* Remove all C escape sequences (`\n`, `\t`, etc.) and replace them with Fortran syntax. |
| 27 | +* Use `pure` `elemental` functions wherever possible, even when handling strings. |
| 28 | +* Keep the library a standalone module with no external dependencies besides compiler modules. |
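The Fortran idioms named in the list above can be sketched with standard intrinsics only. The helper `is_digit` and all variable names are hypothetical and chosen for illustration; they are not taken from the fortran-regex source.

```fortran
! Illustrative only: Fortran replacements for C string conventions.
program string_idioms
   implicit none

   ! Special characters are built with achar() instead of C escapes like \t and \n
   character(len=1), parameter :: tab     = achar(9)
   character(len=1), parameter :: newline = achar(10)
   character(len=8) :: padded

   padded = 'abc'                 ! stored blank-padded, with no NULL terminator

   print *, len(padded)           ! declared length: 8
   print *, len_trim(padded)      ! length without trailing blanks: 3
   print *, iachar(tab), iachar(newline)   ! 9 10

   ! An elemental function applies to each element of a character array
   print *, is_digit(['1', 'x', '7'])      ! T F T

contains

   ! Hypothetical helper: true if c is an ASCII digit
   pure elemental logical function is_digit(c)
      character(len=1), intent(in) :: c
      is_digit = iachar(c) >= iachar('0') .and. iachar(c) <= iachar('9')
   end function is_digit

end program string_idioms
```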
| 29 | + |
| 30 | +### Example programs |
| 31 | + |
| 32 | +```fortran |
| 33 | +! Demonstrate use of regex |
| 34 | +program test_regex |
| 35 | + use regex_module |
| 36 | + implicit none |
| 37 | + |
| 38 | + integer :: idx,ln |
| 39 | + character(*), parameter :: text = 'table football' |
| 40 | + |
| 41 | + |
| 42 | + |
| 43 | + idx = REGEX(string=text,pattern='foo*',length=ln) |
| 44 | + |
| 45 | + ! Prints "foo" ('fo' followed by zero or more 'o') |
| 46 | + print *, text(idx:idx+ln-1) |
| 47 | + |
| 48 | +end program |
| 49 | +``` |
| 50 | + |
| 51 | + |
| 52 | +```fortran |
| 53 | + |
| 54 | +! Demonstrate use of object-oriented interface |
| 55 | +program test_regex |
| 56 | + use regex_module |
| 57 | + implicit none |
| 58 | + |
| 59 | + integer :: idx,ln |
| 60 | + character(*), parameter :: text = 'table football' |
| 61 | + type(regex_op) :: re |
| 62 | + |
| 63 | + ! Parse pattern into a regex structure |
| 64 | + re = parse_pattern('foo*') |
| 65 | + |
| 66 | + idx = REGEX(string=text,pattern=re,length=ln) |
| 67 | + |
| 68 | + ! Prints "foo" ('fo' followed by zero or more 'o') |
| 69 | + print *, text(idx:idx+ln-1) |
| 70 | + |
| 71 | +end program |
| 72 | + |
| 73 | +``` |
| 74 | + |
| 75 | +### To do list |
| 76 | + |
| 77 | + - [ ] Add a `BACK` optional keyword to return the last instance instead of the first. |
| 78 | + - [ ] Option to return ALL instances as an array, instead of the first/last one only. |
| 79 | + - [ ] Replace fixed-size static storage with allocatable character strings (slower?) |
| 80 | + |
| 81 | +### Reporting problems |
| 82 | + |
| 83 | +Please report any problems; reports are appreciated. The original C library had hacks to account for the fact that several special characters are read in as escape sequences, which partially collides with the escape sequences used by regex itself. Expect the current API to still be a bit rough around the edges. |
| 84 | + |
| 85 | +### License |
| 86 | + |
| 87 | +fortran-regex is released under the MIT license. The code it's based upon is in the public domain. |