Data Stream Source
A Data Stream Source is a data source that provides sequential data from asynchronous events (to support continuous processing and real-time analysis).
- AKA: Streaming Data.
- Context:
- ...
- It can range from being a Finite Stream to being an Infinite Stream, depending on its duration type.
- It can range from being a Simple Stream to being a Complex Stream, depending on its data structure.
- It can range from being a Raw Stream to being a Processed Stream, depending on its processing level.
- It can range from being a Static Stream to being a Dynamic Stream, depending on its data nature (as of 2017).
- ...
- It can transmit Data Elements through streaming mechanisms.
- It can maintain Data Sequence via ordering protocols.
- It can support Continuous Processing with processing engines.
- It can enable Real-Time Analysis through analysis components.
- It can handle Data Volume via buffering mechanisms.
- It can process Data Content incrementally without complete data access.
- It can have a Data Stream Freshness Measure for data quality.
- It can often include Data Patterns with temporal characteristics.
- It can often support Data Filtering through filter rules.
- It can often enable Data Transformation via transformation logic.
- It can often maintain Data Quality through validation checks.
- It can often adapt to Concept Drift in stream properties.
- It can exhibit Time-Based Patterns through temporal analysis.
- It can support Time Window Operations via window functions.
- It can handle Time-Series Analysis through temporal processing.
- It can integrate with Stream Processor for stream processing.
- It can connect to Data Source for data ingestion.
- It can support Data Consumer for data consumption.
- It can differ from Data Lake in processing speed and analysis continuity (as of 2020).
- It can be processed by Data Stream Processing System for stream analysis.
- It can have Implementation Detail, such as:
- Data Stream Protocols, including:
- Data Stream Formats, including:
- Stream Processing Models, including:
- ...
- Examples:
- Log Data Streams, such as:
- System Log Stream for system monitoring.
- Clickstream Log for user activity tracking (as of 2018).
- Temporal Data Streams, such as:
- Sensor Streams, such as:
- Temperature Data Stream for environmental monitoring.
- Continuous Sensor Data Stream for real-time measurement (as of 2018).
- Media Streams, such as:
- Streaming Video for visual content delivery (as of 2018).
- Audio Stream for sound transmission.
- Big Data Streams, such as:
- ...
- Counter-Examples:
- Batch Data Source, which lacks continuous flow.
- Finite Dataset, which lacks streaming nature.
- Data Collection, which lacks sequential ordering.
- Data Lake, which requires data storage before analysis.
- See: Stream Processing, Data Flow, Real-Time System, Continuous Processing, Big Data, Concept Drift, Ordered Data Object, Data Stream Mining, Dynamic Data.
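Two of the capabilities listed in the Context above, processing data content incrementally without complete data access and supporting time window operations, can be illustrated with a minimal Python sketch. The function names (`running_mean`, `tumbling_window_sums`), the `(timestamp, value)` event shape, and the assumption that events arrive in timestamp order are hypothetical choices made for this illustration and do not come from any particular stream processing system.

```python
from typing import Iterable, Iterator, Optional, Tuple


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all elements seen so far, one element at a time."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        # Incremental update: only the count and current mean are kept,
        # so the full history of the stream is never stored.
        mean += (value - mean) / count
        yield mean


def tumbling_window_sums(
    events: Iterable[Tuple[float, float]], width: float
) -> Iterator[Tuple[float, float]]:
    """Yield (window_start, sum) for fixed-width, non-overlapping windows.

    Assumption: events are (timestamp, value) pairs in timestamp order.
    Windows that receive no events are not emitted.
    """
    current_start: Optional[float] = None
    total = 0.0
    for timestamp, value in events:
        start = (timestamp // width) * width
        if current_start is None:
            current_start = start
        elif start != current_start:
            # The previous window is closed once a later window begins.
            yield current_start, total
            current_start, total = start, 0.0
        total += value
    if current_start is not None:
        # Flush the last open window when a finite stream ends.
        yield current_start, total


if __name__ == "__main__":
    print(list(running_mean([2.0, 4.0, 6.0])))
    readings = [(0.5, 1.0), (1.5, 2.0), (10.2, 3.0), (12.0, 4.0), (25.0, 5.0)]
    print(list(tumbling_window_sums(readings, width=10.0)))
```

Both functions are generators, so they work on a Finite Stream (a list) as well as on an unbounded iterator; on an Infinite Stream the final flush in `tumbling_window_sums` is simply never reached. Real systems typically also need to handle out-of-order events, which this sketch does not attempt.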
References
2020
- (Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/Streaming_data Retrieved:2020-10-19.
- Streaming data is data that is continuously generated by different sources. Such data should be processed incrementally using Stream Processing techniques without having access to all of the data. In addition, it should be considered that concept drift may happen in the data which means that the properties of the stream may change over time.
It is usually used in the context of big data in which it is generated by many different sources at high speed. Data streaming can also be explained as a technology used to deliver content to devices over the internet, and it allows users to access the content immediately, rather than having to wait for it to be downloaded. Big data is forcing many organizations to focus on storage costs, which brings interest to data lakes and data streams. A data lake refers to the storage of a large amount of unstructured and semi data, and is useful due to the increase of big data as it can be stored in such a way that firms can dive into the data lake and pull out what they need at the moment they need it. Whereas a data stream can perform real-time analysis on streaming data, and it differs from data lakes in speed and continuous nature of analysis, without having to store the data first.
2018
- (Oussous et al., 2018) ⇒ Ahmed Oussous, Fatima-Zahra Benjelloun, Ayoub Ait Lahcen, and Samir Belfkih. (2018). “Big Data Technologies: A Survey.” Journal of King Saud University-Computer and Information Sciences, 30(4).
- QUOTE: ... for querying streaming data such as streaming video or continuous sensor data. ... It can handle many data source and types, including clickstream logs, …
2017
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/real-time_data Retrieved:2017-8-3.
- Real-time data (RTD) is information that is delivered immediately after collection. There is no delay in the timeliness of the information provided. Real-time data is often used for navigation or tracking. [1]
Some uses of the term "real-time data" confuse it with the term dynamic data. The presence of real-time data is actually irrelevant to whether it is dynamic or static.
- ↑ Wade, T. and Sommer, S. eds. A to Z GIS