Anti-Targeting Solution
Solution Background and Business Value
Anti-targeting during digital advertising campaigns identifies the customers who are least likely to purchase a campaign’s promoted product, so that businesses can reallocate resources away from users who are unlikely to convert. This strategy is particularly important for campaigns with limited budgets or short timelines, where wasting impressions on uninterested users can lead to large losses in efficiency.
Such business problems can be approached from many different angles, often requiring distinct predictive models depending on the interpretation. Kumo is flexible enough to accommodate these different perspectives, enabling you to tailor models that meet the specific needs of your use case. We will now demonstrate two ways of training predictive models based on different interpretations of anti-targeting, one requiring a static prediction and another requiring a temporal prediction. Keep in mind that, in general, the difference between static and temporal predictions lies in how the training tables are constructed, so the decision of which one to use stems directly from your use case.
To develop an effective anti-targeting model, we need a structured set of tables that captures all the relevant campaign data, user data, and transaction history. While there is a minimum number of tables required for generating anti-targeting predictions, adding further relevant tables and relationships to the graph can improve model accuracy.
Example: Anti-targeting in flight throughout the campaign
Performing anti-targeting in flight throughout a campaign requires a model that is trained for temporal predictions. When constructing a training table for temporal predictions, the model looks back in time and samples several moments for the same user. For example, if we want to train a model to predict whether a user is going to perform a certain action in the next 30 days, the model will sample 30-day intervals in the past in order to obtain training data. This kind of prediction allows the model to capture behavior over time, which can be very helpful when performing anti-targeting during digital campaigns.
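The look-back sampling described above can be illustrated with a small, self-contained sketch. This is not how Kumo builds training tables internally; the function `build_temporal_training_table`, the toy purchase log, and the choice of non-overlapping 30-day windows are all assumptions made for illustration. The point is only that the same user appears several times, once per sampled anchor date, each time with a label computed from the following window.

```python
from datetime import date, timedelta

# Hypothetical purchase log: (user_id, purchase_date).
purchases = [
    ("u1", date(2024, 1, 10)),
    ("u1", date(2024, 3, 5)),
    ("u2", date(2024, 2, 20)),
]
users = ["u1", "u2", "u3"]


def build_temporal_training_table(users, purchases, first_anchor, last_anchor, window_days=30):
    """Return one row per (user, anchor date) with a look-ahead label.

    The label is 1 if the user purchased in the window (anchor, anchor + window_days].
    """
    rows = []
    window = timedelta(days=window_days)
    anchor = first_anchor
    while anchor <= last_anchor:
        for user in users:
            bought = any(
                u == user and anchor < d <= anchor + window
                for u, d in purchases
            )
            rows.append({"user_id": user, "anchor": anchor, "label": int(bought)})
        anchor += window  # step to the next sampled moment
    return rows


table = build_temporal_training_table(
    users, purchases, first_anchor=date(2024, 1, 1), last_anchor=date(2024, 3, 1)
)
for row in table:
    print(row["user_id"], row["anchor"], row["label"])
```

Each user contributes one training row per anchor date, so a user who buys in one window but not the next yields both a positive and a negative example.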
This example of a temporal prediction uses three tables: campaign data, user data, and session data. Each user is associated with a specific campaign, so all purchases made by that user are attributed to that campaign. Likewise, each user is associated with several sessions, which store information about purchases the user made.
Core Tables:
- Campaign Data:
  - Stores all data about each campaign
  - Key Attributes:
    - `campaign_id`: unique campaign identifier
    - `start_date`: start time of campaign
    - Optional: daily budget, campaign type, campaign name
- User Data:
  - Stores all data about each user
  - Key Attributes:
    - `user_id`: unique user identifier
    - `acquired_through_campaign`: boolean indicating whether this user was acquired through a campaign
    - `campaign_id`: the campaign this user is associated with
    - Optional: age, gender, income, signup date, income range, etc.
- Session Data:
  - Stores all data about every session a user has
  - Key Attributes:
    - `session_id`: unique session identifier
    - `user_id`: the user this session belongs to
    - `claim_reconciled`: whether the user made a purchase
    - Optional: duration, engagement score, sale type, claim site, etc.
Additional Tables (Optional):
For improved prediction, consider including:
- Payload Table: Data about each transaction within each session (e.g., interaction time, margin)
- Ad Group Data Table: Data about the targeted audience of each campaign
Entity Relationship Diagram (ERD)
Sub Graph:
When preparing the data to be ingested into a model, Kumo creates subgraphs of each graph centered at the entity that we are predicting for. In this case, since we want to predict whether or not each user is going to purchase a given product, the subgraph will be centered on each user. Here is an example of a subgraph for the schema above:
**By default, model plans only consider nodes that are within two hops of the root node.** However, since the graph is centered at each user, a two-hop subgraph would only include the user we are predicting for, their own sessions, their corresponding campaign, and the other users who are part of that campaign. It would completely omit the sessions of those other users and thus forgo useful information. For this reason, before training, we need to manually add a third hop in the model plan.
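To make the hop count concrete, here is a minimal sketch that walks a toy version of the schema with a breadth-first search. The node names, the `edges` list, and the helper `neighbors_within` are hypothetical and exist only for this illustration; the sketch does not use Kumo's model plan API or reproduce its sampling logic.

```python
from collections import deque

# Hypothetical toy graph following the schema above:
# users link to their campaign and to their own sessions.
edges = [
    ("user:A", "campaign:K"),
    ("user:B", "campaign:K"),
    ("user:A", "session:A1"),
    ("user:B", "session:B1"),
]


def neighbors_within(root, edges, max_hops):
    """Return every node reachable from root in at most max_hops hops."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    distance = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adjacency.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return set(distance)


two_hops = neighbors_within("user:A", edges, max_hops=2)
three_hops = neighbors_within("user:A", edges, max_hops=3)
print(sorted(three_hops - two_hops))
```

In this toy graph, the only node gained by the third hop is the session of the other user in the same campaign, which is exactly the information a two-hop subgraph leaves out.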
Predictive Query:
Anti-targeting identifies the users who are least likely to purchase a given product that is advertised by a specific campaign. One challenge when performing these predictions is deciding whether or not to filter out users who have previously purchased the advertised product. For example, a user is not likely to purchase several high-priced or non-perishable items over the course of one advertisement campaign, so it may be valuable to only consider users who have never bought the item before.
At prediction time, this model predicts the likelihood that a user associated with campaign K will not purchase the promoted item in the next N days, given that the user has not purchased the item before.
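The filter and the negated target described above can be pictured with a short sketch. It is written in plain Python rather than Kumo's predictive query language, and the function `anti_target_labels`, the toy purchase list, and the example dates are assumptions made for illustration only.

```python
from datetime import date, timedelta

# Hypothetical purchases of the promoted item: (user_id, purchase_date).
promoted_purchases = [
    ("u1", date(2024, 1, 5)),
    ("u2", date(2024, 2, 10)),
]
users = ["u1", "u2", "u3"]


def anti_target_labels(users, promoted_purchases, anchor, horizon_days):
    """Label 1 means the user did NOT buy the promoted item within the horizon.

    Users who already bought the item on or before the anchor date are skipped.
    """
    horizon_end = anchor + timedelta(days=horizon_days)
    labels = {}
    for user in users:
        dates = [d for u, d in promoted_purchases if u == user]
        if any(d <= anchor for d in dates):
            continue  # filter: this user has purchased the item before
        buys = any(anchor < d <= horizon_end for d in dates)
        labels[user] = int(not buys)
    return labels


labels = anti_target_labels(users, promoted_purchases, anchor=date(2024, 2, 1), horizon_days=30)
print(labels)
```

Here `u1` is excluded because of an earlier purchase, `u2` buys within the horizon and so is not an anti-target, and `u3` never buys and so receives the positive anti-target label.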
Example: Anti-targeting at the beginning of an advertisement campaign
Performing anti-targeting at the beginning of a campaign requires a model that is trained for static predictions. Static predictions, unlike temporal ones, only consider features at a single point in time when predicting an outcome. Because of their static nature, each entity has to correspond to exactly one label, meaning the same entity cannot have different entries in a training table. There are ways to get around this limitation, however, such as adding duplicate entities with different timestamp properties.
Static predictions often require the creation of label tables or columns, which represent the prediction that we want to make. Since we need to generate the labels ahead of time, there is much more fine-grained control over the time period that the training data covers. For example, while temporal predictions gather training data iteratively, looking back into the past over several intervals, static predictions give you the flexibility to choose a specific interval, such as a month or a season, to generate training data for.
This example of static prediction uses four tables: user data, orders, products, and labels. The schema represents data for a single campaign. Each user places several orders, each of which has one product attributed to it.
- User Data:
  - Stores static data for each user
  - Key Attributes:
    - `user_id`: unique user identifier
    - Optional: age, gender, income, signup date, income range, etc.
- Orders Data:
  - Stores all transactional data about each purchase made
  - Key Attributes:
    - `order_id`: unique order identifier
    - `user_id`: the user that made the order
    - `product_id`: the product that was purchased
    - `time_created`: when the order was placed
    - `promoted_flag`: whether or not the product was promoted by the campaign
- Product Data:
  - Stores metadata about each product
  - Key Attributes:
    - `product_id`: unique product identifier
    - Optional: product name, brand, size, category, price
- Label Table:
  - Stores binary labels for supervised learning tasks
  - Key Attributes:
    - `label_id`: unique label identifier
    - `user_id`: the user this label is associated with
    - `time`: when the label applies
    - `target` (1/0): binary outcome for prediction (e.g., purchase or not)
Entity Relationship Diagram (ERD)
Predictive Query:
The challenge of constructing a predictive query for a static prediction comes not from the query itself, but from the creation of the labels. The labels need to be pre-generated and must represent what you want to predict. For this example, an appropriate label for each user would be 1 or 0, depending on whether or not the user bought the promoted product within the desired timeframe. With this generated set of labels, this is what the predictive query looks like:
Since labels and users have a one-to-one relationship, this short predictive query is all that is needed.
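The label generation step described above can be sketched in plain Python. The helper `build_label_table`, the toy order rows, and the June interval are hypothetical; in particular, setting the label's `time` to the start of the interval is an assumption made here so that the label timestamp does not fall after the outcome it describes, and a real pipeline may choose differently.

```python
from datetime import date

# Hypothetical rows from the orders table (only the columns needed here).
orders = [
    {"user_id": "u1", "time_created": date(2024, 6, 3), "promoted_flag": True},
    {"user_id": "u1", "time_created": date(2024, 6, 9), "promoted_flag": False},
    {"user_id": "u2", "time_created": date(2024, 6, 12), "promoted_flag": False},
    {"user_id": "u3", "time_created": date(2024, 7, 2), "promoted_flag": True},
]
user_ids = ["u1", "u2", "u3"]


def build_label_table(user_ids, orders, start, end):
    """Create exactly one label row per user for the interval [start, end].

    target is 1 if the user ordered a promoted product inside the interval.
    """
    buyers = {
        o["user_id"]
        for o in orders
        if o["promoted_flag"] and start <= o["time_created"] <= end
    }
    return [
        # Assumption: the label is timestamped at the start of the interval.
        {"label_id": i, "user_id": uid, "time": start, "target": int(uid in buyers)}
        for i, uid in enumerate(user_ids, start=1)
    ]


label_table = build_label_table(user_ids, orders, start=date(2024, 6, 1), end=date(2024, 6, 30))
for row in label_table:
    print(row)
```

Because the function emits one row per user, the resulting table keeps the one-to-one relationship between labels and users that the static setup relies on.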