What would it take for company X to collect all internal emails and messages and, knowing the unique ID of each sender, build a profile to train a model that recognizes and matches writing patterns based on style, misspellings, grammar, length, etc., so that you could “blind” test it and estimate the odds of a message coming from a given person? If that company had training data from its own users, it could then buy Blind data extracts for users belonging to company X and, with decent accuracy, narrow down who each user is and give a confidence score based on speech patterns alone, not on any other digital fingerprint. EDIT: Thanks for the replies and thoughts. In the end I guess it’s more of a thought experiment about creating a fingerprint from typed text sourced internally from companies where the sender is known, and also about ways Blind could use the data here for its business model. The two could be considered separately but have some shared implications. Thanks for any continued thoughts on confidence scores for identifying people based on typed text.
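For anyone who wants to poke at the idea, here is a minimal sketch of the kind of matching described above. It is only an illustration under loud assumptions: it uses character trigram counts as the "fingerprint", cosine similarity for matching, and a softmax to turn similarities into relative scores. The function names, the toy messages, and the `temperature` value are all made up for the example, and the scores are not calibrated probabilities.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Count overlapping character n-grams. Case and punctuation are kept
    on purpose, since capitalization and typos are part of a writing style."""
    s = " ".join(text.split())  # collapse runs of whitespace
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_profiles(labeled_messages):
    """Merge every known message from a sender into one n-gram profile."""
    profiles = {}
    for sender_id, text in labeled_messages:
        profiles.setdefault(sender_id, Counter()).update(char_ngrams(text))
    return profiles

def score_candidates(profiles, anonymous_text, temperature=0.05):
    """Return a relative score per known sender (scores sum to 1).
    The temperature is an arbitrary, hypothetical choice."""
    query = char_ngrams(anonymous_text)
    sims = {sid: cosine(query, prof) for sid, prof in profiles.items()}
    top = max(sims.values())
    exps = {sid: math.exp((s - top) / temperature) for sid, s in sims.items()}
    total = sum(exps.values())
    return {sid: e / total for sid, e in exps.items()}

# Toy, invented data: (sender ID, internal message text).
internal = [
    ("u1", "Hi team, pls find teh report attached. Thx!!"),
    ("u1", "pls review teh doc asap. Thx!!"),
    ("u2", "Good afternoon. Kindly review the attached document at your convenience."),
    ("u2", "Good morning. Kindly confirm the meeting at your convenience."),
]
profiles = build_profiles(internal)
scores = score_candidates(profiles, "pls fix teh build. Thx!!")
best_guess = max(scores, key=scores.get)
```

With only two very different writers the top guess is easy. The open question in this thread is how well anything like this holds up across thousands of employees and across different mediums.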
Not sure you can make a fingerprint using only an individual's writing. Coursera does it using typing behavior (the timing between keystrokes differs from person to person).
Paranoia is a terrible thing, my friend
I considered adding a paranoia disclaimer lol, but I'm actually thinking more about the business model of Blind. Same for LinkedIn: they know enough about jobs and job seeker behavior that they could easily sell monthly reports to companies to let them know of flight risks.
Yes but if it ever got out that Blind was doin that shit, there would be a mass exodus and they'd be worth dogshit.
Nah they’re all stakeholders and will dox everyone when it’s “time”
Wouldn't the writing styles be different over internal email vs Blind posts given the shorter format of Blind? I know I write internal work emails a bit differently compared to this post I just wrote.
As mentioned, company X would have emails and internal messages (Skype, Slack, etc.), but yes, to your point, the writer’s style would vary by medium. My assumption is that the model that gives confidence scores on potential matches would be able to handle differences in length and formality across platforms. Perhaps the scoring would be more accurate on long-form posts.
Aah yes, then yes I agree.
Technically they could terminate someone if they found the Blind verification email and knew what it was. Or, even if they didn’t know, for non-work-related email (even though it is work related...) lol
How about the work email you used to sign up? You already gave them what they need.
I don't think you could, because of the amount of noise. You can't extract that much data from text, and you're talking about making a prediction over thousands of people. But hey, I could be wrong.
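To put rough numbers on the "thousands of people" point, here is a small Bayes' rule sketch. It assumes a uniform prior over all candidate employees and a single hypothetical likelihood ratio (how much more likely the text is under the true author than under any other one candidate). The function name and the figures are invented for illustration, not measured from any real system.

```python
def posterior_for_flagged_author(n_candidates, likelihood_ratio):
    """Posterior probability that one flagged person is the author,
    given a uniform prior over n_candidates and a hypothetical
    likelihood ratio for the stylistic evidence."""
    if n_candidates < 2:
        return 1.0
    prior_odds = 1.0 / (n_candidates - 1)      # one person vs. everyone else
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Same hypothetical evidence strength, very different pool sizes.
small_team = posterior_for_flagged_author(11, 100)
big_company = posterior_for_flagged_author(5000, 100)
```

Under these assumptions the same evidence that looks fairly convincing in a team of 11 leaves the flagged person at only about a 2% chance in a pool of 5,000, so the confidence score would depend heavily on how much the candidate pool can be narrowed first.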