Click-Through Log Dataset
A Click-Through Log Dataset is a click-through data source that is a log dataset (of click-through records).
- Context:
- It can be associated with a Click-Through Data Stream.
- …
- Example(s):
- Counter-Example(s):
- See: Clickthrough Rate Estimation, Interaction Dataset.
References
2017
- (Joachims, Swaminathan et al., 2017) ⇒ Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. (2017). “Unbiased Learning-to-Rank with Biased Feedback.” In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. ISBN:978-1-4503-4675-7 doi:10.1145/3018661.3018699
- QUOTE: Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly using click data as a training signal in Learning-to-Rank (LTR) methods yields sub-optimal results. To overcome this bias problem, we present a counterfactual inference framework that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data.
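A common way to correct logged clicks for position bias is to weight each click by the inverse of the probability that its rank was examined. The following is a minimal sketch of that idea and not necessarily the exact estimator of the cited paper. The power-law propensity model, the `eta` parameter, the function names, and the `(doc_id, rank, clicked)` log layout are all assumptions made for illustration.

```python
def examination_propensity(rank: int, eta: float = 1.0) -> float:
    """Assumed probability that a user examines the result at a 1-based rank.

    Hypothetical power-law model: propensity = (1 / rank) ** eta.
    """
    return (1.0 / rank) ** eta


def ips_weighted_clicks(log, eta: float = 1.0) -> dict:
    """Sum clicks per document, weighting each click by 1 / propensity of its rank.

    `log` is an iterable of (doc_id, rank, clicked) tuples (assumed layout).
    Non-clicked impressions contribute nothing to the totals.
    """
    totals = {}
    for doc_id, rank, clicked in log:
        if clicked:
            weight = 1.0 / examination_propensity(rank, eta)
            totals[doc_id] = totals.get(doc_id, 0.0) + weight
    return totals


# Toy click-through log (fabricated for illustration only).
LOG = [
    ("a", 1, 1),
    ("a", 1, 1),
    ("b", 2, 1),
    ("b", 2, 0),
    ("c", 4, 1),
]

weighted = ips_weighted_clicks(LOG)
print(weighted)
```

Under this assumed model, a click at a low-visibility rank counts for more than a click at rank 1, which compensates for the fact that lower results are seen less often.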
2014
- http://kaggle.com/c/avazu-ctr-prediction
- QUOTE:
- train - Training set. 10 days of click-through data, ordered chronologically. Non-clicks and clicks are subsampled according to different strategies.
- test - Test set. 1 day of ads for testing your model predictions.
- sampleSubmission.csv - Sample submission file in the correct format, corresponds to the All-0.5 Benchmark.
- QUOTE:
Data fields:
- id: ad identifier
- click: 0/1 for non-click/click
- hour: format is YYMMDDHH, so 14091123 means 23:00 on Sept. 11, 2014 UTC.
- C1 -- anonymized categorical variable
- banner_pos
- site_id
- site_domain
- site_category
- app_id
- app_domain
- app_category
- device_id
- device_ip
- device_model
- device_type
- device_conn_type
- C14-C21 -- anonymized categorical variables
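The `hour` and `click` fields described above can be read with the Python standard library alone. The sketch below parses the YYMMDDHH format and computes an overall click-through rate. The sample rows are fabricated for illustration, only a subset of the columns is included, and the helper names `parse_hour` and `click_through_rate` are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone


def parse_hour(value: str) -> datetime:
    """Parse a YYMMDDHH string into a timezone-aware UTC datetime."""
    return datetime.strptime(value, "%y%m%d%H").replace(tzinfo=timezone.utc)


def click_through_rate(rows) -> float:
    """Return clicks / impressions for rows that carry a 0/1 `click` field."""
    clicks = 0
    total = 0
    for row in rows:
        total += 1
        clicks += int(row["click"])
    return clicks / total if total else 0.0


# Fabricated sample in the documented column layout (subset of fields).
SAMPLE = """id,click,hour,banner_pos
1001,0,14091123,0
1002,1,14091123,1
1003,0,14091200,0
1004,1,14091200,0
"""

records = list(csv.DictReader(io.StringIO(SAMPLE)))
parsed = parse_hour("14091123")
ctr = click_through_rate(records)

print(parsed.isoformat())
print(ctr)
```

Because the dataset description states that clicks and non-clicks are subsampled according to different strategies, a rate computed this way describes the sample and should not be read as the true population click-through rate.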