n-Gram Generation System: Difference between revisions

From GM-RKB
(Created page with "An n-Gram Generation System is a tuple generation system that can solve an n-Gram generation task (to produce an n-gram sets for a sequence record). * <B>S...")
 
m (Text replacement - " it." to " it.")
Line 7:
===2008===
* http://code.prashanthellina.com/code/generate_ngrams.py
** QUOTE: The “generate_ngrams.py” script creates [[uni-gram|uni]], [[bi-gram|bi]] and [[tri-gram]]s of whatever [[text]] is piped into it. The following command pipes all the txt files through both the scripts to create the [[ngram set|ngram]]s [[file]]. <code>for i in `find gutenberg_txt/ -name "*.txt"`; do cat $i | python remove_gutenberg_text.py | grep -i -v "project gutenberg" | python generate_ngrams.py >> gutenberg_ngrams; done</code>
** QUOTE: The “generate_ngrams.py” script creates [[uni-gram|uni]], [[bi-gram|bi]] and [[tri-gram]]s of whatever [[text]] is piped into [[it]]. The following command pipes all the txt files through both the scripts to create the [[ngram set|ngram]]s [[file]]. <code>for i in `find gutenberg_txt/ -name "*.txt"`; do cat $i | python remove_gutenberg_text.py | grep -i -v "project gutenberg" | python generate_ngrams.py >> gutenberg_ngrams; done</code>


----
__NOTOC__
[[Category:Concept]]

Revision as of 22:15, 7 November 2015

An n-Gram Generation System is a tuple generation system that can solve an n-Gram generation task (to produce an n-gram set for a sequence record).
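
The task in the definition can be illustrated with a minimal sketch. It assumes that a sequence record is any Python sequence and that an n-gram set is the set of distinct contiguous n-item tuples in it; the function name `ngram_set` is hypothetical and is not taken from any cited system.

```python
from typing import Hashable, Sequence, Set, Tuple


def ngram_set(sequence: Sequence[Hashable], n: int) -> Set[Tuple[Hashable, ...]]:
    """Return the set of distinct contiguous n-item tuples in `sequence`.

    Hypothetical helper for illustration only.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n; a sequence shorter than n yields an empty set.
    return {tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)}


tokens = "to be or not to be".split()
bigrams = ngram_set(tokens, 2)  # the repeated ("to", "be") pair is kept once
```

Returning a set discards repeated n-grams and their order; a system that needs frequencies would likely count occurrences instead.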



References

2008

  • http://code.prashanthellina.com/code/generate_ngrams.py
    • QUOTE: The “generate_ngrams.py” script creates uni, bi and tri-grams of whatever text is piped into it. The following command pipes all the txt files through both the scripts to create the ngrams file.
      for i in `find gutenberg_txt/ -name "*.txt"`; do cat $i | python remove_gutenberg_text.py | grep -i -v "project gutenberg" | python generate_ngrams.py >> gutenberg_ngrams; done
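
The behaviour described in the quote (uni-, bi- and tri-grams of piped text) might be sketched as follows. This is not the contents of the cited generate_ngrams.py, whose source is not reproduced here; whitespace tokenization, per-line windows, the output order, and the name `generate_ngrams` are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List


def generate_ngrams(lines: Iterable[str], max_n: int = 3) -> Iterator[str]:
    """Yield space-joined 1..max_n-grams of each line's whitespace tokens.

    Assumed behaviour for illustration; n-grams do not span line breaks here.
    """
    for line in lines:
        tokens: List[str] = line.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                yield " ".join(tokens[i:i + n])


# Small in-memory demo; any iterable of text lines works the same way.
sample = ["the quick brown fox"]
grams = list(generate_ngrams(sample))
```

Because the function accepts any iterable of lines, passing `sys.stdin` and printing each yielded string would let it sit at the end of a shell pipeline like the one quoted above.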