CogStack Pipeline

AKA: cogstack-pipeline.
Context:
- GitHub Repository: https://github.com/CogStack/CogStack-Pipeline
- It is based on the Java Spring Batch framework.
- …
Example(s):
- …
Counter-Example(s):
- CogStack Job Repository.
- CogStack-ElasticSearch.
- Bio-YODIE.
- Bio-LarK.
- Elasticsearch.
- PostgreSQL.
- NGINX.
- Kibana.
See: SemEHR System, Semantic Web Search System, Semantic Web, Ontology Search System, Natural Language Processing, Annotation Task, Electronic Health Record, Clinical Trial.

References

(Roguski, 2019) ⇒ Lukasz Roguski (Apr, 2019). "CogStack platform"
- QUOTE: The data processing workflow of CogStack is based on Java Spring Batch framework. Not to dwell too much into technical details and just to give a general idea – the data is being read in batches from a predefined data source, later it follows a number of processing operations with the final result stored in a predefined data sink. CogStack implements variety of data processors, data readers and writers with scalability mechanisms that can be specified in CogStack job configuration.
  Each CogStack data processing pipeline is configured using a number of parameters defined in the corresponding Java properties file. Moreover, multiple CogStack data processing pipelines can be launched in parallel or chained together (see Examples), each using its own properties configuration file.
  (...) CogStack data processing pipeline design follows the principles behind Spring Batch. That is, there are different Jobs and Steps and custom processing components called ItemReaders, ItemWriters and ItemProcessors. A Job has one to many Step, which has exactly one ItemReader, ItemProcessor, and ItemWriter. A Job needs to be launched by JobLauncher, and meta data about the currently running process needs to be stored in JobRepository. The picture below presents a simplified version of a reference batch processing architecture.
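
The reader, processor and writer roles described in the quote can be illustrated with a minimal plain-Java sketch. It is not CogStack code and does not use the Spring Batch library: the interfaces below are simplified stand-ins that only borrow the names ItemReader, ItemProcessor and ItemWriter, and the `runStep` method, the `BatchSketch` class and the `step.chunkSize` property key are hypothetical names introduced for illustration. The sketch assumes that a step reads items one at a time, transforms each one, and hands them to the writer in fixed-size batches, with the batch size taken from a Java properties source.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Properties;

public class BatchSketch {
    // Simplified stand-ins named after the Spring Batch concepts.
    // These are NOT the real Spring Batch interfaces.
    interface ItemReader<I> { I read(); }              // returns null when the source is exhausted
    interface ItemProcessor<I, O> { O process(I item); }
    interface ItemWriter<O> { void write(List<O> batch); }

    // One step: exactly one reader, one processor and one writer.
    // Items are processed one by one and written in batches of batchSize.
    // Returns the number of batches handed to the writer.
    static <I, O> int runStep(ItemReader<I> reader, ItemProcessor<I, O> processor,
                              ItemWriter<O> writer, int batchSize) {
        int batches = 0;
        List<O> batch = new ArrayList<>();
        I item;
        while ((item = reader.read()) != null) {
            batch.add(processor.process(item));
            if (batch.size() == batchSize) {
                writer.write(new ArrayList<>(batch));  // pass a copy so the writer may keep it
                batch.clear();
                batches++;
            }
        }
        if (!batch.isEmpty()) {                        // flush the final, partially filled batch
            writer.write(new ArrayList<>(batch));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical property key; real CogStack property names are not shown here.
        Properties config = new Properties();
        config.load(new StringReader("step.chunkSize=2\n"));
        int batchSize = Integer.parseInt(config.getProperty("step.chunkSize", "1"));

        Iterator<String> source = Arrays.asList("note a", "note b", "note c").iterator();
        List<String> sink = new ArrayList<>();

        ItemReader<String> reader = () -> source.hasNext() ? source.next() : null;
        ItemProcessor<String, String> processor = String::toUpperCase;
        ItemWriter<String> writer = sink::addAll;

        int batches = runStep(reader, processor, writer, batchSize);
        System.out.println(batches + " batches written: " + sink);
    }
}
```

In this sketch a Job would simply be an ordered list of such steps. The launching and run-metadata bookkeeping that the quote assigns to JobLauncher and JobRepository are left out.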