Author Topic: Text Searching  (Read 560 times)

macktoast

  • Newbie
  • *
  • Karma: +0/-0
  • Posts: 2
Text Searching
« on: September 19, 2023, 01:40:34 PM »

Question for the group regarding text searching in a PDF. I have a document, 1429 pages, not protected. I am searching for the key phrase "Officer Station" and want to find it throughout the complete document. If I uncheck "Case Sensitive", uncheck "Whole Words Only", and select only the current page to limit the results, the search reports that no matches were found. Clearly, I can see 2 instances of "Officer Station" on this page.

If I search for "Station", it finds both instances but also finds "Station" as part of "Nurse Station", and I only want "Officer Station". If I search for the letter "r", it finds every "r" on the page, including the "r" in Officer. Searching for the letter "e" gives the same result, including the "e" in Officer. Same for the letter "c": it selects every "c" on the page as well as the "c" in Officer.

Now it gets interesting. If I search for the letter "i", it returns all the i's on the page except the "i" in Officer, but it does find the "i" in Station. If I search for the combined letters "cer", it finds the "cer" of Officer. If I search for "icer", it will NOT find the "icer" of Officer. It will not find the letter "f" in Officer, but it will find the "o" of Officer. If I search for "cer Station", it finds both instances.

If I open the page in a web browser and use its search function to look for "Officer Station", the browser finds both instances. I have OCR'd the page and got the same results as listed above. Just wondering if anyone else has run into search errors like this.

CHARLIEEASTWOOD

  • Jr. Member
  • **
  • Karma: +0/-0
  • Posts: 58
Re: Text Searching
« Reply #1 on: September 20, 2023, 05:10:41 AM »

I've noticed Bluebeam has a hissy fit with some fonts, so it might have something to do with that. I've also accepted that Bluebeam is REDACTED for finding text, so I use other editors for that haha.

If you edit the pdf text using Edit > PDF Content > Edit Text, how does it cope with it? If you copy the text directly from the Edit Text into the text searching box, does that do anything differently?

Could there be a hidden double space between "Officer" and "Station"?

macktoast

  • Newbie
  • *
  • Karma: +0/-0
  • Posts: 2
Re: Text Searching
« Reply #2 on: September 20, 2023, 09:10:59 AM »

Thank you, Charlie, for your response. I believe you hit the nail on the head. If I use Edit > Edit Content > Select All and paste into Notepad, "Officer Station" in the PDF becomes "Ocer Station", "Medicaid Office" becomes "Medicaid Oce", and "PREA Office" becomes "PREA Oce". It appears the scanning or save-as-PDF process the owner used had some issues. Thanks again.
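
The symptoms in this thread ("i" and "f" not found in Officer, "cer" found, pasted text reading "Ocer") are consistent with the "ffi" in Office/Officer being stored as a single ligature glyph rather than three separate letters. That is an assumption, not something confirmed in the thread. A minimal Python sketch of that situation follows; the sample strings and the helper names `find` and `normalized` are made up for illustration and have nothing to do with Bluebeam's own search code.

```python
import unicodedata

# Assumption: the PDF stores "ffi" as the single ligature character U+FB03.
with_ligature = "O\ufb03cer Station"
# What the Notepad paste in this thread showed: the ligature is gone entirely.
dropped = "Ocer Station"


def find(needle: str, haystack: str) -> bool:
    """Plain case-insensitive substring search (hypothetical helper).

    lower() is used on purpose: it leaves the ligature character untouched,
    which mimics a search that compares character by character.
    """
    return needle.lower() in haystack.lower()


def normalized(text: str) -> str:
    """Expand compatibility characters such as ligatures (Unicode NFKC)."""
    return unicodedata.normalize("NFKC", text)


# Repeat the probes from the first post against the ligature version.
probes = ["Officer Station", "icer", "cer", "cer Station", "f"]
results = {probe: find(probe, with_ligature) for probe in probes}
for probe, hit in results.items():
    print(f"{probe!r}: {'found' if hit else 'not found'}")

# Normalizing first restores the match when the ligature is still present.
print(find("Officer Station", normalized(with_ligature)))

# If the glyph was dropped, only a fragment after the gap still matches.
print(find("Officer Station", dropped), find("cer Station", dropped))
```

If the ligature still carries a Unicode mapping, normalizing the extracted text before searching brings the match back. If the glyph was dropped altogether, as the Notepad paste suggests, normalization cannot help, and searching for a fragment such as "cer Station" remains the workaround.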

CHARLIEEASTWOOD

  • Jr. Member
  • **
  • Karma: +0/-0
  • Posts: 58
Re: Text Searching
« Reply #3 on: September 20, 2023, 09:19:33 AM »

Excellent, glad I could help!