ML Page Categories
Overview
This dataset contains the results of daily ML inference runs in which Sincera categorizes text assets from the publisher_text_assets dataset using the IAB model currently available in that environment.
Dataset
| Field | Type | Description |
|---|---|---|
| id | integer | Unique identifier of the inference run. |
| page_id | integer | Unique identifier for the specific publisher page asset being classified. |
| classification | jsonb | Metadata giving the weight assigned to each IAB category. |
| model_id | integer | Identifier of the ML model version used; corresponds to the latest model version. |
| created_at | datetime | Date and time when the inference run was first seen by the Sincera platform. |
| updated_at | datetime | Date and time when the inference run was last updated by the Sincera platform. |
| page_url | string | URL of the publisher page being classified. |
| article_title | string | Title of the article on the page being classified. |
| reviewed | boolean | Indicates whether the inference run has been reviewed. |
| user_id | integer | ID of the user who ran the inference. |
| en_text | boolean | Indicates whether the text was detected as English. |
| truncated_text | string | Truncated excerpt of the page text that was classified. |
| pretty_classification | jsonb | Human-readable version of the classification metadata. |
| publisher_id | integer | Maps to the publisher_id primary key in the publishers dataset. |
| traffic_rank | integer | Indicates the publisher’s traffic rank. |
| result_status | integer | Result ID. |
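
As an illustration of working with the classification field, the sketch below ranks the categories for a single row by weight. The exact shape of the jsonb payload is not documented here, so the example assumes a flat JSON object mapping IAB category names to numeric weights. The helper name top_categories and the sample values are hypothetical.

```python
import json


def top_categories(classification, k=3):
    """Return the k highest-weighted (category, weight) pairs.

    ASSUMPTION: `classification` is a flat mapping of IAB category name
    to numeric weight. The real jsonb layout may differ.
    """
    # Accept either a raw JSON string (e.g. from a CSV export) or a decoded dict.
    if isinstance(classification, str):
        weights = json.loads(classification)
    else:
        weights = classification
    # Sort by weight, highest first; break ties by category name for stable output.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical row; the values are illustrative and not taken from the dataset.
sample_row = {
    "page_id": 1,
    "classification": '{"Sports": 0.7, "News": 0.2, "Travel": 0.1}',
}
print(top_categories(sample_row["classification"], k=2))
```

If the actual payload is nested or uses IAB category IDs rather than names, only the line that builds `weights` should need to change.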