To be more specific, I am searching for something that will scan a file or document and then categorize each word (or set of characters) by the frequency with which it appears. For example, if "Harry Potter" were scanned, it would list words like "wizard, witch, spell, magic, the, etc, school, room" along with the number of times each appears. Are there any tools capable of doing this?
Can you tell me the format of the document to be scanned? Because that is the only messy part. Other than that it is a simple counter application. Depending on what kind of stuff you are dealing with (say a book or two, or a series), it is as simple as using Counter in Python or a hashmap / dict / equivalent in other languages. There is also a command-line solution I found (the final sort needs -nr so the output is ordered by count, highest first): Code: cat potato.txt | tr '[:space:]' '[\n*]' | tr -d '[:punct:]' | grep -v "^\s*$" | sort | uniq -c | sort -nr
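To show what the Counter approach looks like in practice, here is a minimal Python sketch. The regex, the function name, and the sample sentence are my own choices for illustration; a real run would read the book's text from a file instead.

```python
import re
from collections import Counter

def word_frequencies(text):
    # Lowercase everything, then pull out runs of letters/apostrophes
    # so punctuation doesn't split or pollute the words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

text = "The wizard cast a spell. The spell worked; the wizard smiled."
freqs = word_frequencies(text)
print(freqs.most_common(3))  # [('the', 3), ('wizard', 2), ('spell', 2)]
```

`Counter.most_common(n)` gives the n most frequent words, which is essentially the whole application; for a file you would just do `word_frequencies(open("potato.txt").read())`.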
The format is a bit flexible. Ideally it would be a web page, but the text can easily be copied into a Word doc or pasted into the application itself.
If it's an image file, you would use OCR: it scans the image and outputs the text, though it's not always accurate. If you're trying to count particular words in a text file, paste it into https://wordcounter.net/ and look at the Keyword Density section on the right side. It doesn't let you search for a particular keyword, but it shows the 30 most used keywords.
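If you do want counts for specific keywords (which wordcounter.net doesn't offer), the Counter approach handles that too, since a Counter returns 0 for words it never saw. This is a sketch with a made-up sample sentence and keyword list:

```python
import re
from collections import Counter

def keyword_counts(text, keywords):
    # Count every word, then report only the keywords you care about.
    # Counter returns 0 for missing keys, so absent words show up as 0.
    freqs = Counter(re.findall(r"[a-z']+", text.lower()))
    return {kw: freqs[kw] for kw in keywords}

text = "Magic school of magic: the school taught magic."
print(keyword_counts(text, ["magic", "school", "wand"]))
# {'magic': 3, 'school': 2, 'wand': 0}
```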