Are there any tools for scanning documents and tracking words??

Gandire Alea · Nov 3, 2021

To be more specific, I am searching for something that will scan a file or document and then categorize each word or set of characters by how often it appears. For example, if "Harry Potter" were scanned, it would list words such as "wizard, witch, spell, magic, the, etc, school, room" along with the number of times each appears.

Are there any such tools capable of doing this??

Lacey_Avocato · Nov 3, 2021

The most similar thing I know is:
https://www.easycalculation.com/word-count.php

Ddraig · Nov 3, 2021

Can you tell me the format of the document to be scanned? Because that is the only messy part. Other than that, it is a simple counter application.
Depending on what kind of material you are dealing with (say a book or two, or a series), it is as simple as using Counter in Python or a hashmap / dict / equivalent in other languages.
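
As a rough illustration of the Counter approach, here is a minimal sketch. It assumes the text is already available as a plain string, and it treats any run of letters, digits, or apostrophes as a word; the function name `count_words` and the sample sentence are made up for the example.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    # Assumption: a "word" is a run of letters, digits, or apostrophes;
    # everything else (spaces, punctuation) acts as a separator.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


if __name__ == "__main__":
    sample = "The wizard cast a spell. The witch cast a stronger spell!"
    counts = count_words(sample)
    # most_common() yields (word, count) pairs from most to least frequent.
    for word, count in counts.most_common():
        print(word, count)
```

For a whole book, the same function could be fed the contents of a text file read with `open(path, encoding="utf-8").read()`.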

There is also a command-line solution I found:
Code:
cat potato.txt | tr '[:space:]' '[\n*]' | tr -d '[:punct:]' | grep -v "^\s*$" | sort | uniq -c | sort -n

Gandire Alea · Nov 3, 2021

The format is a bit flexible.
Ideally, it would be from a web page, but it can easily be copied into a Word document or even into the application itself.
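
For the web-page case, the markup would need to be stripped before any counting. A rough sketch using only Python's standard library follows, assuming the page's HTML is already in hand as a string; the names `TextExtractor` and `html_to_text` are hypothetical, and real pages may need more careful handling than this.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Return the text of an HTML string with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

The resulting string can then be passed to whatever word counter is used for plain text.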

Ai chan · Nov 4, 2021

If it's an image file, use OCR. It scans the image and outputs the text, though it's not always accurate.

If you're trying to count certain words in a text file, paste the text into https://wordcounter.net/

Then look at the Keyword Density section on the right side. It doesn't let you search for particular keywords, but it lists the 30 most-used keywords.
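
If looking up particular keywords matters, a few lines of Python can do it. This is only a sketch: it assumes single-word keywords, ignores case, and the function name `keyword_counts` is invented for the example.

```python
import re
from collections import Counter


def keyword_counts(text, keywords):
    """Return {keyword: occurrences} for the given single-word keywords, ignoring case."""
    # Assumption: a "word" is a run of letters, digits, or apostrophes.
    counts = Counter(re.findall(r"[a-z0-9']+", text.lower()))
    # A Counter returns 0 for words it has never seen, so absent keywords are safe.
    return {kw: counts[kw.lower()] for kw in keywords}


if __name__ == "__main__":
    text = "The wizard and the witch met another wizard."
    print(keyword_counts(text, ["wizard", "witch", "spell"]))
```

Multi-word phrases such as a character's full name would need a different approach, for example counting substring matches instead of single tokens.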
