
Finding the aliases in your data

(Image credit: The Digital Artist / Pixabay)

Security firm Agari recently exposed an email scam that took advantage of Gmail’s “dot” feature to streamline operations. According to Axios’ reporting, “If you own the right to someusername@gmail.com, you will receive emails sent to some.user.name@email.com and s.o.m.e.u.s.e.r.n.a.m.e@email.com.” Axios goes on to say that while “that may seem like a minor feature, the vast majority of email providers treat each of those as different accounts. That allows you to sign up for multiple accounts on most websites in each of those email addresses.” Using the “dots” approach, the criminal group discovered by Agari was able to cause real harm, from applying for nearly fifty credit cards at US-based financial institutions to submitting FEMA disaster assistance applications and filing more than a dozen fraudulent tax returns.

Alias and the Ego

Most people who use multiple email accounts do not do so for nefarious reasons. I have at least three: one for work, one for personal use, and one that I use when I must share an email address but don’t want to give a vendor or person my ‘real’ one (aka my junk email account). Knowing this, it’s clear that a data management approach could help avoid this type of theft. A company using data management could identify my personal email, or as I like to call it, my ‘ego email,’ along with my fake (alias) email address, and tie them together.

The process could be as simple as stripping all the periods out of the part of the email address before the “@” sign and doing a string comparison to identify the ego and the alias. It gets more complex if a person starts to transpose parts of their address. For example, if they use firstname.lastname@domain for their ego email and lastname.firstname@domain for an alias, stripping the periods out of the address will not catch the match. This is where data management enters the picture. Data management can process the email addresses and, based on built-in intelligence, recognise that the first and last names were flipped and that the two emails probably belong to the same person. Additional components can be added to the decisioning process (e.g. postal address, phone, IP address, location information, social security number, and so on). If some or all of this additional data aligns across the two email addresses, you’re left with a high level of confidence that they belong to the same person.
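As a rough illustration of the two comparisons just described, here is a minimal Python sketch. The function names (`dotless_key`, `token_key`, `same_person_candidate`) and the example.com addresses are made up for this example; they are not taken from any particular data management product, and a real matching engine would do considerably more.

```python
def split_address(email: str) -> tuple[str, str]:
    """Return the lower-cased local part and domain of an address."""
    local, _, domain = email.strip().lower().rpartition("@")
    return local, domain


def dotless_key(email: str) -> str:
    """Strip every period before the "@" so dotted variants compare equal."""
    local, domain = split_address(email)
    return local.replace(".", "") + "@" + domain


def token_key(email: str) -> str:
    """Sort the period-separated name tokens so flipped names compare equal."""
    local, domain = split_address(email)
    tokens = sorted(t for t in local.split(".") if t)
    return ".".join(tokens) + "@" + domain


def same_person_candidate(a: str, b: str) -> bool:
    """True if the two addresses match after either normalisation."""
    return dotless_key(a) == dotless_key(b) or token_key(a) == token_key(b)


if __name__ == "__main__":
    print(same_person_candidate("some.user.name@example.com",
                                "someusername@example.com"))
    print(same_person_candidate("firstname.lastname@example.com",
                                "lastname.firstname@example.com"))
```

A match here only marks the pair as a candidate. The extra attributes mentioned above (phone, IP address, and so on) would still be needed before treating the two addresses as one person.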

Now, let’s say one email address uses a full name and another uses a nickname: for example, Margaret.smith@email and Peggy.smith@email. An organisation would not know that these could be the same person because, using simple string or exact matching, Peggy and Margaret would never come together. Thus, as companies try to link the alias to the ego in customer data, there is more to it than simply removing the periods between the letters.
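One common way to bring Peggy and Margaret together is a nickname lookup applied before comparison. The sketch below assumes a tiny, hand-written `NICKNAMES` table and hypothetical function names; a production dictionary would be far larger, and this version looks only at the name tokens and ignores the domain.

```python
# Hypothetical, deliberately tiny nickname table (nickname -> formal name).
NICKNAMES = {
    "peggy": "margaret",
    "maggie": "margaret",
    "bill": "william",
    "bob": "robert",
}


def canonical_tokens(email: str) -> tuple[str, ...]:
    """Lower-case the name tokens, map nicknames to formal names, and sort."""
    local = email.strip().lower().rpartition("@")[0]
    tokens = [t for t in local.split(".") if t]
    return tuple(sorted(NICKNAMES.get(t, t) for t in tokens))


def names_match(a: str, b: str) -> bool:
    """True if both addresses reduce to the same set of formal name tokens."""
    return canonical_tokens(a) == canonical_tokens(b)


if __name__ == "__main__":
    print(names_match("Margaret.smith@example.com", "Peggy.smith@example.com"))
```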

Many organisations operate on a global scope, so the language and cultural aspects of the name must be considered for correct identification and linkage. Don in the US is a nickname for Donald; however, in Spain, Mexico, and Italy, it’s an honorific. A nickname dictionary doesn’t solve the cultural aspects of the data. The origins of the data must be considered as part of the identification process to ensure the correct alignment and linkage. As an organisation considers data management strategies, international functionality must be part of their capabilities.

Business rules

Business rules for duplicitous emails must be established to inspect incoming data, so that alias emails entering the organisation and tied to a financial decision are quickly identified and quarantined. Once quarantined, a record can be reviewed by a specialist or by machine learning to determine whether it is fraudulent. Business rules should include a level of fuzziness in their decision-making process to identify other email combinations that could be additional aliases of the ego. A data management solution would then put these new email addresses into a remediation queue for further investigation and flagging. If a new email alias is identified, it can be appended to the existing cluster of known fraudulent email records.

Now that you have the ego linked to the known and new aliases, the question becomes: what should an organisation do with this new insight to ensure fraudulent activities are not taking place? The easiest solution is to append the ego email to all the alias emails, link them together in a cluster of like records, and assign the cluster a unique identifier (such as a hash) that is populated across the various source systems. In the transactional systems, data management solutions should move higher up in the transactional stack. These data management offerings can make calls into various database systems, for example via SOAP or REST, with negligible latency to determine the viability of the transactional details. If the system identifies a deceitful email, it can reject the transaction or place it in a queue for a person to investigate further.
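The clustering step might look something like the following sketch, which groups addresses by their period-stripped form and tags each group with a short key derived from a SHA-256 hash. The function names and the 12-character key length are arbitrary choices for this example, not a description of how any particular product assigns cluster keys.

```python
import hashlib
from collections import defaultdict


def dotless_key(email: str) -> str:
    """Lower-case the address and strip periods before the "@"."""
    local, _, domain = email.strip().lower().rpartition("@")
    return local.replace(".", "") + "@" + domain


def cluster_id(key: str) -> str:
    """Derive a short, stable identifier for a cluster from its match key."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


def build_clusters(emails: list[str]) -> dict[str, list[str]]:
    """Group addresses that share a match key under one cluster identifier."""
    clusters: dict[str, list[str]] = defaultdict(list)
    for email in emails:
        clusters[cluster_id(dotless_key(email))].append(email)
    return dict(clusters)


if __name__ == "__main__":
    sample = ["some.user.name@example.com",
              "someusername@example.com",
              "other.person@example.com"]
    for cid, members in build_clusters(sample).items():
        print(cid, members)
```

Because the identifier is computed from the match key rather than generated at random, each source system can derive the same value independently for the same group of addresses.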

Employing data management should also bring fuzzy matching capabilities into the transactional stack to quickly determine whether a new email entering an organisation is similar to an existing email address (or to several). If the new email is similar enough, business rules kick in to determine how the organisation handles the transaction, and who should handle it. The new email can be automatically inserted into an existing cluster of fraudulent records, updating the cluster, or it can be routed for manual inspection and then inserted into the cluster.
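A fuzzy check with rule-based routing could be sketched as follows, using `difflib.SequenceMatcher` from Python’s standard library as a stand-in for whatever similarity measure a data management tool actually provides. The two thresholds and the outcome labels are hypothetical and would need tuning against real data.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds; real values would come from the business rules.
AUTO_LINK = 0.95   # similar enough to add to the cluster automatically
REVIEW = 0.80      # similar enough to send to a person


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0 for two addresses."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def route(new_email: str, known_emails: list[str]) -> str:
    """Decide what to do with a new address given known flagged addresses."""
    best = max((similarity(new_email, k) for k in known_emails), default=0.0)
    if best >= AUTO_LINK:
        return "auto_link"
    if best >= REVIEW:
        return "manual_review"
    return "accept"


if __name__ == "__main__":
    known = ["jane.doe@example.com"]
    for candidate in ("jane.doe1@example.com",
                      "j.doe@example.com",
                      "zzz@qqq.org"):
        print(candidate, route(candidate, known))
```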

The identification of duplicitous emails is a continuous cycle. Keep in mind that business rules and calls to action will evolve as an organisation’s experience grows and as its approach to the problem is refined. The key to any successful programme of this kind is a well thought out and documented process that is shared with the appropriate decision-makers inside your organisation. The more plainly you can explain why you are doing this and the value you expect the company to receive, and the more clearly you define the rules, workflows and decisioning parameters, the more successful the effort will be. So, what are you waiting for?

Kim Kaluba, Senior Product Marketing Manager in Data Management, SAS