Microsoft’s encrypted Remote Desktop Protocol (RDP) contains vulnerabilities that could enable attackers to detect activities, keystrokes, and mouse movements in 30-second traces. That’s according to a preprint study from researchers at Queen’s University in Kingston, Canada, who claim the protocol’s design exposes “fine-grained” actions on which machine learning models could be trained to identify usage patterns.

Encryption is a common response to network weaknesses. It’s estimated that in 2018, encryption was used in more than 70% of all network communications. But it isn’t a cure-all. That’s because traffic analysis need not rely on the content of data packets to reveal network activity; analysis can instead draw on things like the services being used, signatures in data payloads, data analytics, and behavioral classification.

In something of a case study, the coauthors investigated RDP, which is designed to let a client PC user interact with a host as if sitting at that host. They sourced a Windows 10 workstation to serve as the client; another PC running two virtual machines — a Windows 10 installation and the Linux distribution CentOS — as the host; and a remote machine at Queen’s University behind a physical firewall as a second host.

The researchers had the client PC connect to either the local host or the physically distant one via RDP and recorded activities in 30-second windows. Using the client PC, they downloaded files, used Firefox and Chrome, typed in Notepad, played YouTube videos, and copied content from the hosts to the local client using the Windows clipboard.

The coauthors used two tools, CICFlowMeter and TShark, to extract attributes like packet lengths for each network traffic exchange. To classify each activity, they built an ensemble machine learning model consisting of the most effective classifiers for each traffic class, chosen to maximize precision (the fraction of relevant instances among retrieved instances) so the ensemble could determine whether a given kind of traffic was present. After training the classifiers, the researchers applied the ensemble to a corpus of 2,160 samples, each 30 seconds long, and then evaluated prediction performance on a per-class basis.
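To make the feature-extraction step concrete, here is a minimal Python sketch of how packet lengths might be summarized per 30-second window. It is an illustration only, not the researchers’ pipeline: the function name `window_features`, the chosen statistics, and the sample packets are all assumptions, and the real tools export far more attributes than these.

```python
from statistics import mean, pstdev

WINDOW_SECONDS = 30.0  # window length reported in the study


def window_features(packets, window=WINDOW_SECONDS):
    """Summarize packet lengths per fixed-size time window.

    `packets` is an iterable of (timestamp_seconds, length_bytes) pairs.
    Returns a dict mapping window index -> dict of summary features.
    The feature set here is hypothetical, chosen only for illustration.
    """
    buckets = {}
    for ts, length in packets:
        # Window 0 covers [0, 30), window 1 covers [30, 60), and so on.
        buckets.setdefault(int(ts // window), []).append(length)

    features = {}
    for idx, lengths in sorted(buckets.items()):
        features[idx] = {
            "packet_count": len(lengths),
            "total_bytes": sum(lengths),
            "mean_length": mean(lengths),
            "std_length": pstdev(lengths),  # population std; 0.0 for one packet
            "max_length": max(lengths),
        }
    return features


# Made-up packets: three fall in the first window, one in the second.
sample = [(0.5, 100), (12.0, 300), (29.9, 200), (31.0, 1500)]
feats = window_features(sample)
```

None of these features depend on packet contents, which is why encryption alone does not hide them.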

The researchers report that for two types of traffic, TCP and UDP, the ensemble succeeded in identifying a single activity or even several simultaneous activities taking place via RDP. The classifiers accurately detected in-progress file downloads, internet browsing, Notepad writing, YouTube viewing, and text copying-and-pasting with greater than 97% precision and at least 94% recall (the fraction of all relevant instances that were actually retrieved). Perhaps more problematically, the ensemble detected keystrokes sent from the client to the remote systems from their TCP frames. The coauthors note the total number of frames in a window correlates with visual changes on the screen and can reveal how many keystrokes have been sent, opening the door to password attacks.
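For readers unfamiliar with the two metrics, the following Python sketch shows one way per-class precision and recall could be computed when a single window may contain several activities at once. The labels and predictions are invented for illustration and do not come from the study; `per_class_precision_recall` is a hypothetical helper, not the researchers’ code.

```python
def per_class_precision_recall(true_labels, predicted_labels, classes):
    """Compute (precision, recall) for each class over multi-label windows.

    `true_labels` and `predicted_labels` are parallel lists of sets, one set
    of activity names per window. A metric is None when it is undefined
    (no predictions, or no true instances, of that class).
    """
    results = {}
    for cls in classes:
        tp = fp = fn = 0
        for truth, pred in zip(true_labels, predicted_labels):
            if cls in pred and cls in truth:
                tp += 1  # activity present and detected
            elif cls in pred:
                fp += 1  # detected but not actually present
            elif cls in truth:
                fn += 1  # present but missed
        precision = tp / (tp + fp) if tp + fp else None
        recall = tp / (tp + fn) if tp + fn else None
        results[cls] = (precision, recall)
    return results


# Invented example: four windows, one of which holds two activities.
truth = [{"download"}, {"browsing", "youtube"}, {"youtube"}, {"notepad"}]
pred = [{"download"}, {"browsing"}, {"youtube", "browsing"}, {"notepad"}]
classes = ["download", "browsing", "youtube", "notepad"]

scores = per_class_precision_recall(truth, pred, classes)
# scores["browsing"] is (0.5, 1.0): one of two browsing predictions was right,
# and the single true browsing window was found.
```

High precision means few false alarms for a class, while high recall means few missed windows; the figures reported above indicate the ensemble did well on both.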

The researchers concede they only analyzed traffic between Windows 10 systems and that different systems, PCs, and RDP updates could conceivably affect accuracy. But they say ensemble retraining would likely be sufficient to adapt to new network environments.

“We have shown that, for an encrypted protocol such as RDP, it is still possible to infer five common categories of activities with high reliability from traffic properties that cannot be concealed by encryption,” the researchers wrote. “It is conceivable that some of these predictions could be defeated by obfuscation in the protocol but protocol designers are caught between the need to conceal activity and the need to provide responsiveness.”