I have several tables. Im trying to build an anomaly detector that can scan through a column (numerical or text data) and determine if a certain entry is anomalous. For ex a SSN showing up in an Employee Id column (same number of digits). I was thinking in the direction of regex at the low end of the complexity spectrum and autoencoders at the other end. Any thoughts on how to approach this problem? Training data of valid entries for certain classes is available #machinelearning
Tech Industry
Yesterday
516
Which company is best to have on your resume?
Tech Industry
Yesterday
944
I haven’t done shit today!
Tech Industry
Yesterday
2115
I’m Sooo Happy about Biden signing TikTok ban bill today!!
Tech Industry
Yesterday
2339
Buy FB now before 20% increase after hours?
Tech Industry
2d
39457
Worried that our top performer is an attrition risk. How do managers handle this?
We built something similar for detecting pii in log data streams. There are a lot of python libraries and tools available which you can reuse in a pipeline for this.
Pii?
Personally Identifiable Information