Dorkbot: A Managed Application Security Assessment Service for Higher Education


The University of Texas at Austin Information Security Office's Dorkbot service identifies and verifies high-risk security vulnerabilities in web applications and provides real-time reports for colleges and universities.

[Image: iconic drawing, in University of Texas colors, of round eyeglasses with a zig-zag crack in one lens. Credit: The University of Texas at Austin © 2019]

"Google dorking" has been around since 2002, and most of us have done some "Google hacking" to look for sensitive information about our campus or ourselves.

This is true for the Information Security Office at the University of Texas at Austin (UTISO). For many years the UTISO devoted lunch hours to hunting and Google dorking for vulnerable UT Austin websites. Through this process, we confirmed that adequate nourishment was a key factor for effective meetings involving the UTISO and promptly added the following (arguably the most important) document to our information security program: Policy on Food Provisioning at Meetings.

Only slightly more important was the fact that we also began to automate the detection and verification of high-risk web application security vulnerabilities such as cross-site scripting (XSS), SQL injection, local/remote file inclusion, and OS command injection. Thus, the birth of Dorkbot. By 2015, we had a solution that worked pretty darn well, and we extended it to some 25 million pages for the UT Austin campus. By 2016, more kinks were ironed out and we extended it to all fifteen UT System campuses.

After a year of broader deployment and refinement, we had created a highly automated service, and by March 2017 we were ready to make it generally available to higher education institutions. The thinking was that we could extend an automated service—one that we were already operating for the UT System—and provide a proven risk-reducing, no-cost capability to any interested campus. We knew that many campuses simply don't have the ability to devote time and/or resources to this effort, so we wanted to share as much as possible. We also made our source code available for institutions that wanted to contribute to the project or operate the capability locally.

At this point, in less than two years, more than 800 institutions across six continents have subscribed to the Dorkbot service (we could use some help with Antarctica!). We are grateful to be able to serve the community in this way.

Dorkbot subscribers include almost every institution in the state of Texas, 82% of all Carnegie R1 institutions, some of the most prestigious campuses across the globe, and a wide variety of other classifications of higher education institutions. We are forecasting that an additional 300–400 campuses will subscribe to the Dorkbot service in 2019 and are trying to raise awareness with institutions that might not otherwise be connected with a community like EDUCAUSE or REN-ISAC. To that end, we encourage our higher ed colleagues across the globe to tell local community colleges and other partner campuses about Dorkbot.

Dorkbot has a proven track record at this point, having reported 26,439 verified high-risk web application vulnerabilities to subscribers in 2018. This represented an increase of 349% over 2017 (of course, the number of subscribing campuses also grew rapidly in 2018).

At peak operation, the UT Dorkbot service's scan-engine farm runs 256 concurrent Docker containers, performing over 8,000 throttled page scans and reporting an average of 140 verified vulnerabilities each day.
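
As a rough illustration of what "concurrent" and "throttled" scanning can mean in practice, the sketch below caps the number of scans in flight and pauses between hits to the same host. It is not Dorkbot's code: the `scan_page` stub, the `ThrottledScanner` class, and all parameter values are hypothetical, and the real service runs Docker containers rather than coroutines.

```python
import asyncio
import time
from collections import defaultdict
from urllib.parse import urlsplit


async def scan_page(url: str) -> str:
    """Hypothetical stand-in for a real page scan; makes no network request."""
    await asyncio.sleep(0.01)
    return f"scanned {url}"


class ThrottledScanner:
    """Bounded concurrency plus a pause between scans of the same host."""

    def __init__(self, max_concurrent: int = 4, min_interval: float = 0.05):
        self.max_concurrent = max_concurrent
        self.min_interval = min_interval
        self.hit_times = defaultdict(list)  # host -> start time of each scan

    async def _scan_one(self, url, slots, host_locks):
        host = urlsplit(url).hostname or ""
        async with host_locks[host]:        # one scan per host at a time
            async with slots:               # cap total scans in flight
                self.hit_times[host].append(time.monotonic())
                result = await scan_page(url)
            # Slot is released here; keep the host lock while we pause,
            # so the next scan of this host waits out the interval.
            await asyncio.sleep(self.min_interval)
        return result

    async def run(self, urls):
        slots = asyncio.Semaphore(self.max_concurrent)
        host_locks = defaultdict(asyncio.Lock)
        tasks = (self._scan_one(u, slots, host_locks) for u in urls)
        return await asyncio.gather(*tasks)  # results keep the input order
```

A call such as `asyncio.run(ThrottledScanner(max_concurrent=2).run(urls))` scans different hosts in parallel while spacing out requests to any single host, which is the general idea behind protecting fragile systems during a sweep.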

Figures 1 and 2 present some summary statistics from UT's Dorkbot service for 2018:

Figure 1. Number and type of vulnerabilities, and number of subscribers
Figure 2. Verified vulnerabilities, by Carnegie classification

One neat thing about the Dorkbot service is that the data show the vulnerability exposure (or attack surface) for long-term subscribers has decreased by roughly 55% compared to the previous year. There are many factors at play, but this is a notable observation, especially given that many data thieves tend to use tactics similar to Dorkbot to locate and exploit vulnerable applications. For example, many of the credentials recently exposed in the "Collection #1" breach were acquired from successful SQL injection attacks.

Dorkbot currently sweeps through a list of target URLs for each subscribed campus each month. This allows campuses to remediate and rely on an automated process to validate their work. Campuses can request a rescan, or they can wait until the next monthly sweep. Dorkbot activity also emanates from a single IP address, so it is completely transparent to the subscribing campus in the event they want to specifically restrict or permit access.
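
Because the scans come from one known address, a campus can pick them out in firewall rules or log analysis. Here is a minimal sketch of that check. The address `192.0.2.10` is a placeholder from the RFC 5737 documentation range, not Dorkbot's real IP, and the names `SCANNER_IP`, `is_scanner`, and `split_log` are hypothetical.

```python
import ipaddress

# Placeholder from the RFC 5737 documentation range; NOT the real scanner address.
SCANNER_IP = ipaddress.ip_address("192.0.2.10")


def is_scanner(source: str) -> bool:
    """Return True if a request's source address is the known scanner IP."""
    try:
        return ipaddress.ip_address(source) == SCANNER_IP
    except ValueError:
        return False  # malformed address, e.g. a corrupt log field


def split_log(sources):
    """Partition source addresses into (scanner hits, everything else)."""
    scanner, other = [], []
    for src in sources:
        (scanner if is_scanner(src) else other).append(src)
    return scanner, other
```

The same comparison could drive either an allow rule (so scans are not blocked by a web application firewall) or a deny rule (for a campus that wants to restrict access), depending on local policy.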

While we encourage campuses to subscribe their entire top-level domain and any other TLDs they might own, campuses can start small (e.g., one host or a few subdomains) if they need to build out local awareness. Campuses can also exclude extremely fragile systems on request.

We are also willing to work with campuses that would like to have deeper spidering of their internal content, but we are required to recover our costs for that activity (though it will still be quite reasonable).

What's new for Dorkbot? We'll be rolling this capability into a new cyber hunting platform we call CHOMP (Cyber Hunting Orchestrated Maneuvers Platform). Similar to the Dorkbot service, we'd like to make sure CHOMP is available to colleges and universities at no cost. More on this offering will be shared with the EDUCAUSE community in the coming months.

If you aren't yet subscribed to Dorkbot, please check out the service. And feel free to share any questions or comments with us at [email protected]

Thanks for everyone's support!


Cam Beasley is Chief Information Security Officer for the University of Texas at Austin.

© 2019 Cam Beasley.