Reads a text file containing one URL per line, scrapes the content of each web page, and extracts key terms using natural language processing. Built with Python.
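As a rough illustration of the input format, here is a minimal sketch of reading such a file. The function name `read_urls` is hypothetical and not taken from the script itself; skipping blank lines is an assumption.

```python
from pathlib import Path


def read_urls(path):
    """Return the URLs in a text file that holds one URL per line.

    Hypothetical helper: blank lines are skipped and surrounding
    whitespace is stripped (an assumption, not confirmed behavior).
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```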
Run the script from the command line. There are a few required options:
-i, --input    the name of the txt file containing the URLs
-c, --content  the selector for the content region to parse
-o, --output   the name of the file to be output. Acceptable formats are csv or json.
-l, --length   the minimum length of each keyword returned by the script
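
For reference, the four options could be declared with Python's standard `argparse` module roughly as follows. This is a sketch under assumptions, not the script's actual code: the function name `build_parser`, the help strings, the integer type for `--length`, and the sample values (`urls.txt`, `article`, `keywords.csv`, `4`) are all illustrative.

```python
import argparse


def build_parser():
    """Build a parser for the four required options (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Scrape web pages and extract key terms."
    )
    parser.add_argument("-i", "--input", required=True,
                        help="name of the txt file containing the URLs")
    parser.add_argument("-c", "--content", required=True,
                        help="selector for the content region to parse")
    parser.add_argument("-o", "--output", required=True,
                        help="name of the output file (csv or json)")
    # Assumption: the minimum keyword length is a whole number.
    parser.add_argument("-l", "--length", required=True, type=int,
                        help="minimum length of each keyword returned")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run anywhere; every value here is a made-up sample.
args = build_parser().parse_args(
    ["-i", "urls.txt", "-c", "article", "-o", "keywords.csv", "-l", "4"]
)
```

Passing the list explicitly keeps the example independent of how the script is invoked; in a real command-line run, `parse_args()` with no arguments reads the options from the command line.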