GitHub - p1ng-request/document-automation: Scripts to Automate Documentation Workflow

Documentation best practices 📚

Something I have been working on 🎯

broken-links-checker.py: A Python script that scans a given web domain for broken links.
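As an illustration of the idea, here is a minimal standard-library sketch of a link check. It is not the code in broken-links-checker.py; the names `extract_links`, `http_status`, and `find_broken` are hypothetical, and treating any status of 400 or above as broken is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen
from urllib.error import HTTPError


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) links found in html, in order, without duplicates."""
    parser = LinkCollector()
    parser.feed(html)
    seen, links = set(), []
    for href in parser.hrefs:
        # Resolve relative links and drop the #fragment part.
        url = urljoin(base_url, href).split("#", 1)[0]
        if urlparse(url).scheme in ("http", "https") and url not in seen:
            seen.add(url)
            links.append(url)
    return links


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except (OSError, ValueError):
        # Covers DNS failures, refused connections, timeouts, malformed URLs.
        return None


def find_broken(links, get_status=http_status):
    """Return (url, status) pairs for links that failed or returned >= 400."""
    broken = []
    for url in links:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Passing `get_status` as a parameter keeps the network call replaceable, so the filtering logic can be exercised offline. Some servers reject HEAD requests; a fuller checker would probably fall back to GET.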

translate.py: Translates documentation (.md files) into a target language.

This pipeline uses a high-performance Neural Machine Translation (NMT) system. The current code runs the Helsinki-NLP/opus-mt-en-zh model, which is trained on a diverse range of parallel texts from the internet. Switch to another pre-trained model to translate between any pair of languages covered by the OPUS corpus.
(Optional) Fully automated documentation publishing workflow: commit and push to GitHub, and the docs are deployed automatically.
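The translation step can be sketched as follows. This is an assumption about how such a pipeline might be wired, not the contents of translate.py: `make_opus_translator` and `translate_markdown` are hypothetical names, the model is loaded through the Hugging Face `transformers` translation pipeline (which also needs a backend such as PyTorch and the `sentencepiece` package), and skipping fenced code blocks is a design choice of this sketch.

```python
FENCE = "`" * 3  # Markdown code-fence marker


def make_opus_translator(model_name="Helsinki-NLP/opus-mt-en-zh"):
    """Build a translate(text) function backed by a translation pipeline.

    The import is deferred so the rest of this sketch runs without
    transformers installed.
    """
    from transformers import pipeline

    translator = pipeline("translation", model=model_name)
    return lambda text: translator(text)[0]["translation_text"]


def translate_markdown(text, translate):
    """Translate prose lines of a Markdown string, leaving fenced code untouched."""
    out, in_fence = [], False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            # Toggle on every fence line and copy the fence itself unchanged.
            in_fence = not in_fence
            out.append(line)
        elif in_fence or not line.strip():
            # Code and blank lines pass through as-is.
            out.append(line)
        else:
            out.append(translate(line))
    return "\n".join(out)
```

To switch language pairs, pass a different model name to `make_opus_translator`. Translating line by line is the simplest approach but hands Markdown markers such as `#` to the model; a more careful version would strip and restore them.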

✨✨ml-docs-scanner.py✨✨: The updated version of the NLP Docs Scanner.

Uses machine learning techniques to train models on the cleaned data, make predictions on it, and score the documentation against the criteria you specify.
Uses supervised learning to train a model that predicts the quality of a document from a set of labeled examples. For example, you can use grammatical error correction models, spell-checker models, readability metrics such as the Flesch-Kincaid readability test, and sentiment analysis models to measure the objectivity and tone of a document.
Supports a customized ignore_list.txt. Sample:

["ignored_word1", "ignored_phrase1", "ignored_word2", "ignored_phrase2", ... ]

Uses transformer-based language models to check for consistency and coherence in style, tone, and terminology throughout the text, and to give improved readability scores.
Screenshot:
Dependencies:

## Prerequisites: python3, jre
## Install dependencies:
pip3 install nltk textstat markdown textblob language-tool-python pyfiglet
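The ignore_list.txt sample shown for ml-docs-scanner.py is a JSON array of words and phrases. A minimal sketch of how such a list could be applied is below; the helper names are hypothetical, and the case-insensitive matching is an assumption rather than the scanner's documented behavior.

```python
import json


def load_ignore_list(raw):
    """Parse the JSON array from ignore_list.txt into a lowercase set."""
    return {entry.lower() for entry in json.loads(raw)}


def filter_flagged(flagged, ignore):
    """Drop flagged words or phrases that appear in the ignore set."""
    return [item for item in flagged if item.lower() not in ignore]
```

In practice `raw` would come from reading the file, for example `open("ignore_list.txt", encoding="utf-8").read()`.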

nlp-docs-scanner.py: Automated documentation scanner. Features:

Scans all .md files in a given directory and its subdirectories, and uses natural language processing (NLP) techniques to identify complicated words by breaking the text down into individual sentences.
Checks grammar and spelling.
Evaluates readability: the Flesch Reading Ease score.
Evaluates objectivity: computes the Automated Readability Index (ARI) and the Flesch-Kincaid Grade Level.
Evaluates clarity: applies named entity recognition (NER) to identify specific words in the text and suggest improvements.
Evaluates tone: applies sentiment analysis using machine learning (ML) techniques.
Evaluates consistency: analyzes the text with NLP and ML to detect terms and check that they are used consistently.
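The readability metrics named in the feature list have standard closed-form definitions. The sketch below implements them in plain Python so that it runs without extra packages; the `textstat` package from the dependency list offers equivalent functions (`flesch_reading_ease`, `flesch_kincaid_grade`, `automated_readability_index`), and the scanner may well use those instead. The vowel-group syllable counter is a rough heuristic chosen for this example.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels (at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_stats(text):
    """Return (sentences, words, syllables, characters) for a plain-text string."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z0-9']+", text)
    syllables = sum(count_syllables(w) for w in words)
    characters = sum(len(w) for w in words)
    return sentences, len(words), syllables, characters


# The three formulas below assume words > 0 and sentences > 0.
def flesch_reading_ease(sentences, words, syllables):
    """Higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences, words, syllables):
    """Approximate US school grade level needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(sentences, words, characters):
    """Grade-level estimate based on characters per word instead of syllables."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
```

To apply this across a documentation tree, the text of each file found by `pathlib.Path(root).rglob("*.md")` can be passed to `text_stats`.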

Note 1: You must create a terminology_dict.json file in the following format:

{
    "word1": count1,
    "word2": count2,
    ...
}
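A file in this word-to-count format could be produced along the following lines. The repository includes a create-term-dic.py script, but its logic is not shown here, so this is only a sketch under assumptions: the function names, the tokenization regex, and the `min_count` threshold are all hypothetical.

```python
import json
import re
from collections import Counter


def build_terminology_dict(texts, min_count=2):
    """Count lowercase word occurrences across texts, keeping frequent ones."""
    counts = Counter()
    for text in texts:
        # Words start with a letter and may contain digits or hyphens.
        counts.update(re.findall(r"[a-z][a-z0-9-]*", text.lower()))
    return {word: n for word, n in counts.most_common() if n >= min_count}


def to_json(term_counts):
    """Serialize the counts in the indented layout shown in Note 1."""
    return json.dumps(term_counts, indent=4, ensure_ascii=False)
```

Writing the result of `to_json(...)` to terminology_dict.json gives a file with the same shape as the format above.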

Note 2: Grammar, spelling, and clarity checks at the word level have proven unreliable because they generate too many false positives. Best practice: use Grammarly instead.
