I created a PowerShell script to do certain tasks for our sales/finance department.
Our system creates a large PDF file with several pages where each page is one invoice.
The PDF file needs to be split into single pages and renamed with their invoice number.
One of our plants creates 5-digit invoice numbers, which is a bit problematic because my regex search finds the ZIP code of our plant before it finds the invoice number in the document.
Right now the script excludes the 3 ZIP codes that our plants have. But if we ever get to the point where an invoice number equals one of those ZIP codes, my script will fail.
If I could get the 2nd match of the regex pattern, it would work every time. All of our invoices share the same overall structure, so the invoice number will always be the 2nd match.
This is the current snippet of my script:
    $regexPattern = "((?!11111|22222|33333)\d{5,6})"
    if ($documentcontent -match $regexPattern) {
        $invoiceNumber = $matches[1]
    }

$documentcontent is basically the text of a Word COM object filled with the content of a single-page PDF. The splitting is done beforehand, before the script gets to this section.
The "\d{5,6}" is there because only 1 plant has 5-digit invoice numbers; the other 2 have 6-digit invoice numbers. The script in its entirety works fine as of now, but 3 invoices will fail once their invoice numbers match one of our ZIP codes.
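To make the failure mode concrete, here is a minimal sketch of what the current pattern does. The two page strings are made-up sample text (the real content comes from Word), and the variable names $normalPage, $clashingPage, $normalResult and $clashResult are only for illustration:

```powershell
$regexPattern = "((?!11111|22222|33333)\d{5,6})"

# Hypothetical page text; the real content comes from the Word COM object.
$normalPage   = "Plant A, 11111 Sampletown`rInvoice No. 54321"
$clashingPage = "Plant A, 11111 Sampletown`rInvoice No. 11111"

# -match stops at the first hit, so $Matches only ever holds one match.
# The else branch avoids reading a stale $Matches from an earlier success.
$normalResult = if ($normalPage -match $regexPattern) { $Matches[1] } else { $null }
$clashResult  = if ($clashingPage -match $regexPattern) { $Matches[1] } else { $null }
```

With this sample text, $normalResult is 54321, while $clashResult stays empty because the lookahead rejects the invoice number along with the ZIP code.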
Is there a way to change the "-match" or "matches[1]" so it always skips the first match it finds?
Thanks in advance and please tell me if I can provide you any more information!
Edit: A bit more information
    $word = New-Object -ComObject Word.Application
    $document = $word.Documents.Open($invoice.Fullname)
    $documentcontent = $document.content.text

$invoice is the current single-page invoice (this is happening in a foreach loop).
-skip 1 -first 1 to do this I believe. I'm not sure how complex your requirements truly are, but explore this for a starting point. If there are exceptions where a page only ever has one match, then taking only the 2nd match would bring back a null value. You may need additional conditional logic with $matches.Count -ge 1 and $matches.Count -ge 2 with some if logic to process accordingly and differently based on the matches count.

"^((?!11111|22222|33333)\d{5,6})$" and then take the first result.
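A minimal sketch of the skip-the-first-match idea, not a definitive implementation. It assumes, as the question states, that the ZIP code is always the 1st match and the invoice number the 2nd. The sample text is made up, and the \b word boundaries and the [regex]::Matches call are my own additions rather than something taken from the original script:

```powershell
# Hypothetical page text; in the real script this would be $document.Content.Text.
$documentContent = "Plant A, 11111 Sampletown`rInvoice No. 11111`rTotal: 250.00"

# No ZIP exclusion needed any more. \b keeps the pattern from matching
# inside longer digit runs (an added assumption about the invoice layout).
$regexPattern = '\b\d{5,6}\b'

# [regex]::Matches returns every match, not just the first one.
$allMatches = [regex]::Matches($documentContent, $regexPattern)

if ($allMatches.Count -ge 2) {
    # Skip the ZIP code (1st match) and take the invoice number (2nd match).
    $invoiceNumber = $allMatches |
        Select-Object -Skip 1 -First 1 -ExpandProperty Value
} else {
    # Fewer than two matches: leave the result empty so the caller can handle it.
    $invoiceNumber = $null
}
```

Indexing with $allMatches[1].Value should give the same result as the Select-Object pipeline. The count check is there so that a page with only one match yields an empty result instead of an error.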