Forensic Linguistics and Information Security

The way humans use language is so transparent to us that we can barely see it. Language is like an iceberg that way: it carries far more information than we notice, and therefore far more than we can consciously manipulate.

Forensic linguistics is the study of the language iceberg for legal purposes. Those purposes can include investigation, prosecution, litigation, or scrutiny of how our legal system functions. I became aware of forensic linguistics around the same time that I became aware of the escalating need for IT leaders to know more about information security. For me it was like finding a problem and a tool for solving it at around the same time.

Though I don't want to minimize what a struggle it is to secure networks, systems, and data, there are processes that address those things, and many resources pointing the way. What I couldn't find were resources for solving what I think of as the "Eligible Receiver" problem. Secured networks and systems can still often be bypassed with a phone call, just as the Joint Staff's server was in the 1997 Eligible Receiver defense hacking exercise. (See Fred Kaplan's Dark Territory: The Secret History of Cyber War.) I saw social engineering/human hacking in action at DefCon, and I worried about processes for resetting passwords through phone interactions. I wanted to know if there was a way to detect these kinds of attacks the way we detect a network intrusion.

Robert A. Leonard, the head of the forensic linguistics program and the Institute for Forensic Linguistics, Threat Assessment, and Strategic Analysis at Hofstra University, was extremely interested in the question. Though at first he didn't see the applicability of forensic linguistics to cybersecurity, when I explained to him how social engineering worked, he immediately saw how it was related to other types of language crime, such as fraud and extortion, that had already been studied. Any type of crime committed with language, he said, was within the purview of his program. Professor Leonard encouraged me to tackle the research, and through him I've met other forensic linguists who are now getting interested in this area as well. Though I had studied several languages in my life, linguistics was completely new to me. I already had a PhD in a different field. Many friends reacted with surprise when I told them I was going back to graduate school to learn linguistics. I referred to it as my midlife crisis: more useful than a fast car, surely.

The great folks at Social-Engineer.com, who run the social engineering booth at DefCon, were also very interested. They helped educate me about the process of human hacking. Social engineers develop the trust of their targets, insinuating themselves into their target organizations with background research. I quickly found, though, that mass scams like fraudulent IRS calls or Microsoft repair centers show no evidence of building up trust. Understanding these types of phone scams seemed like a first step toward understanding human hacking in general, with the goal of detecting it rather than doing it.

I believe some automated answers may be possible in the future, but I have also found some answers that can be used to help secure humans right now. For example, there are linguistic features of IRS phone call scams that might alert people that they are the target of a scam. Asking people to be alert for lies doesn't work. Research has shown that humans are terrible lie detectors. But we can be good at noticing polar tag questions, if we're sensitized to them. A polar tag question occurs when a speaker ends their turn in a conversation with a yes/no question but expects a yes answer. (Now you're going to be listening for them too, right?) Once we're sensitized to them, we can follow up by asking a question of our own to see if the caller will be responsive and allow us to change the topic. IRS phone scammers won't let you change the topic and won't answer questions. We can use this type of knowledge to improve our public service announcements to our constituents.[1] Eventually, I hope we use this research to improve tools like Lenny, a conversation bot that can help tie up scam phone callers, and to develop tools that can perhaps identify a fraudulent phone call in progress, just as we can currently identify a phone service interaction that doesn't meet our standards.[2]
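
As a rough sketch of what an automated check for this one feature might look like, the Python below flags conversation turns that end in a polar tag question. It is an illustration under loud assumptions, not the method used in the research described here: the list of tags is hypothetical and far from exhaustive, the function names ends_with_polar_tag and flag_caller_turns are invented for this example, and a transcript is assumed to be a simple list of (speaker, text) pairs. A real detector would need much more than a regular expression.

```python
import re

# Hypothetical, non-exhaustive pattern for a turn-final tag such as
# ", right?" or ", didn't you?". This list is an assumption made for
# illustration only; it is not drawn from the research in this post.
TAG_PATTERN = re.compile(
    r"""[,;]?\s*\b
        (?:right|okay|ok|correct|yes|
           (?:is|are|was|were|do|does|did|have|has|had|
              will|would|can|could|should)
           (?:n['’]t|\s+not)?\s+
           (?:you|he|she|it|we|they|i|that|there))
        \s*\?\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def ends_with_polar_tag(turn: str) -> bool:
    """Return True if the turn ends with a tag question from the list above.

    The tag must follow some earlier text in the same turn, so a bare
    "Right?" on its own is not counted.
    """
    match = TAG_PATTERN.search(turn.strip())
    return match is not None and match.start() > 0


def flag_caller_turns(transcript, speaker="caller"):
    """Return the indexes of the given speaker's turns that end in a tag question.

    `transcript` is assumed to be a list of (speaker, text) tuples.
    """
    return [
        index
        for index, (who, text) in enumerate(transcript)
        if who == speaker and ends_with_polar_tag(text)
    ]


if __name__ == "__main__":
    sample = [
        ("caller", "You received our letter, didn't you?"),
        ("target", "What letter? Who is this?"),
        ("caller", "You want to avoid arrest, right?"),
    ]
    print(flag_caller_turns(sample))
```

A flagged turn is only a prompt to pay attention, not proof of fraud. The more telling test described above is what happens next: whether the caller answers a question or allows a change of topic.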

Studying forensic linguistics and its interplay with information security is tremendously intellectually satisfying. The field of forensic linguistics itself is broad. I had the opportunity to study threat letters with an ex-FBI analyst and language crimes with the founder of the field. I had to stretch myself as a student in ways that I never had and might not have been brave enough to try before. But it is a field with a payoff. As nerdy as I get about the pragmatics of the infinitive or modal verbs, I'm doing it for a purpose. The outcome of forensic linguistics research is incredibly practical.

If the definition of insanity is doing the same thing over and over and expecting different results, we can't keep doing awareness programs for cybersecurity without more research about what is actually happening in social engineering. We all know that carefully thought-through information security can be bypassed by a phishing e-mail or a phone call from a fake customer. Human language interactions are not beyond the reach of research; forensic linguistics offers the right toolbox. To me it's not a diversion from my IT career at all; it's all part and parcel of what we need to better secure our information resources.

Notes

  1. For more on this topic, you can read my paper, "Linguistic Features of Phone Scams: A Qualitative Survey," in the proceedings of the 12th Annual Symposium on Information Assurance (ASIA '16), June 8–9, 2016, Empire State Plaza, Albany, NY, 52–58.
  2. It's not only about phone calls. My next project to be completed is on the language of phishing e-mails. I would also love to study CEO impersonations or spear-phishing. If you have recordings or transcripts of these types of scams, e-mail me!

Judith L. Tabron is an IT leader who researches, teaches, and writes about human factors in cybersecurity.

© 2017 Judith L. Tabron. This EDUCAUSE Review blog is licensed under Creative Commons BY-NC-SA 4.0.