Given just account id , transaction amount per period n count of transaction per period how to build a model to predict churn ??
Start creating variables that you think would help indicate it
Milk sugar and fruit
The first task is to define what counts as churn. This will eventually define the labels in whatever supervised model you use. You should talk about this with the business side. It ultimately isn’t that important. There are hundreds of different ways to define churn for a non subscription product. They’re all correlated with each other, so it doesn’t matter. However, you want all parties to agree on what counts. One sort of caveat is that defining churn as a cessation of activity is not great, because it’s noisy wrt to random logins, etc. An alternative framework is survival modeling, but let’s set that aside. Once you have a definition of churn, you want to translate your data into a standard supervised problem. The standard way is to extract features from the time series data, of which the most important is usually time since last activity. This feature alone has usually gotten me 80% of the performance of an arbitrarily complex model. A disciplined way to extract features is to do things like Laplace/Fourier transforms. Another line of thought works with the whole time series directly. For example, knn with a good choice of metric, like dynamic time warping. You have to be careful and smooth your data, because it is too noisy in its natural state. Training RNN’s is also a popular approach, but I haven’t tried. I suspect the same caveats about noisy data apply. Another great approach is to tie the churn modeling with the proposed intervention. Instead of predicting churn, see what kind of lift you can get by predicting the likelihood that an intervention (promotion, ad, etc) will induce more activity. This could be like a churn model with features corresponding to your intervention. This way naturally leads to variants of bandits. Overall, start with a baseline model using just time since login. See if you can improve performance enough to justify additional complexity.
Do you have any usage information?
I just have account id, amount per day and transaction counts per day info.. i am looking to just build very basic model
I see. I would consider these factors: - log(time a customer) - log(total transactions) - dollars over the last 7 days / dollars over the 7 days prior - same as above but with 30 days Use logistic regression and check which variables are significant. You may wish to make these variables into discrete factors, then consider their interaction.
lol OP, try this paper napkin approach. 1) (# acct id at period end - # acct id at period start) divided by # acct if at period start (make sure to exclude any new/added acct id during the period, this will get u a ballpark user churn) 2) to calculate contraction mrr, run the same math but replace acct id with (avg transaction amount X avg transaction volume) you can then run some basic analysis to correlate back to user churn
Tech Industry
Yesterday
880
The new Tesla Model 3 P goes from 0-60 in 2.9 seconds
Tech Industry
Yesterday
3155
ByteDance is officially fucked
Ask Blinders
Yesterday
978
Tipping culture is really getting out of control! Waiter gave me ‘a look’ because I tipped her 10% for ‘BAD service!’
Tech Industry
Yesterday
2251
TESLA UP 14% AFTER HOURS 🎉🎉🎉🎉
Tech Industry
47m
354
I wish I were East Asian instead of Vietnamese (Southeast Asian)
You need to pass in the desired level of creaminess. Churning for longer makes it creamier
I am having hard time creating labeled data..