Network Security

What is Tor?


Posted by Ya Xiao

Have you heard of "Tor"? This post aims to give you a brief introduction to it.

“Tor” is an acronym derived from “The Onion Router”, a project that researches, designs, builds, and analyzes anonymous communication systems. The Tor network is a group of volunteer-operated servers that allows people to improve their privacy and security on the Internet. Tor's users employ this network by connecting through a series of virtual tunnels rather than making a direct connection, which allows both organizations and individuals to share information over public networks without compromising their privacy. Along the same lines, Tor is an effective censorship circumvention tool that lets its users reach otherwise blocked destinations or content. Software developers can also use Tor as a building block to create new communication tools with built-in privacy features.

An initial design for Onion Routing was published at the first Information Hiding Workshop and deployed in mid-1996. Today, however, the word “Tor” usually refers to the second-generation onion router, which began in mid-2002, while the earlier work tends to be called the first-generation design. The latest onion routing system is freely available and runs on most common operating systems. There is a Tor network of several hundred nodes, processing traffic from hundreds of thousands of unknown users.

Below, I introduce the Tor network from two angles: the technology and the controversy.

1. Technologies of Tor


How does Tor make anonymity possible? Figure 1 below may give you a brief overview of the main idea of Tor.

Figure 1. Overview of Tor network

At a high level, the main idea behind Tor is to route users' traffic through a series of different nodes such that no single node knows both where a packet came from and where it is going. To achieve this, Tor uses onion routing. In an onion network, messages (packets) are enclosed in layers of encryption, called onions because of their layered structure. Each node in this network is called an onion router (OR), and each OR is responsible for peeling away (decrypting) a single layer, uncovering the data for the next node in the network.
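The layering idea can be sketched in a few lines of Python. This is only a toy illustration, not Tor's actual cipher: the `keystream_xor` helper, the key values, and the three-hop path are all assumptions made for the example, and an XOR keystream is used only because it needs no external libraries. (With XOR the peeling order does not actually matter; a real cipher construction is more involved.)

```python
import hashlib

def keystream_xor(key, data):
    """Toy stream cipher: XOR data with a SHA-256 counter keystream.
    Illustration only, not secure. Applying it twice with the same key undoes it."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message, hop_keys):
    """Sender side: add one layer per hop, innermost layer for the last hop."""
    onion = message
    for key in reversed(hop_keys):
        onion = keystream_xor(key, onion)
    return onion

def peel(onion, key):
    """Router side: remove exactly one layer with this router's key."""
    return keystream_xor(key, onion)

# Hypothetical per-hop keys for a three-router path.
hop_keys = [b"key-OR1", b"key-OR2", b"key-OR3"]
onion = wrap(b"hello", hop_keys)
for key in hop_keys:
    onion = peel(onion, key)
print(onion)  # b'hello'
```

Each router only ever applies its own key, so an intermediate router sees data that is still wrapped in the layers meant for the routers after it.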

Before going into a more detailed explanation of how Tor works, a few concepts need to be clarified:

  • Cells
  • In the Tor network, traffic passes along connections in fixed-size cells. Each cell is 512 bytes and consists of a header and a payload. The header carries auxiliary information such as the circuit ID and the command. The payload carries the communication data.

  • Circuit
  • Communication in the Tor network is circuit-based. Each circuit is a connection path along which cells are forwarded from the sender to the destination. Before any communication takes place, the sender builds a circuit consisting of router nodes. In Figure 1, the transmission route A-D-B-C-F is one such circuit.

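  • Onion Router (OR)
  • Each relay node in the Tor network is an onion router (OR). An OR peels away one layer of encryption and forwards the cell to the next node on the circuit.

  • Onion Proxy (OP)
  • In the Tor network, each user runs local software called an onion proxy (OP). An OP builds circuits for communication, fetches directories, and handles connections from user applications. In Figure 1, the sender (the woman in the blue suit on the left) can be regarded as an OP.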

    1.1 Cell Structure


    Cells are the smallest unit of transmission in Tor. There are two kinds of cells: control cells and relay cells. Control cells are always interpreted by the node that receives them, while relay cells carry end-to-end stream data.
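As a rough sketch of what a fixed-size cell looks like on the wire, the snippet below packs and unpacks a 512-byte cell. The header layout (a 2-byte circuit ID followed by a 1-byte command, leaving 509 bytes of payload) follows the original Tor design paper and is an assumption here, as are the function names and the example command value; later protocol versions differ.

```python
import struct

CELL_SIZE = 512                          # fixed cell size stated above
HEADER = struct.Struct(">HB")            # assumed header: 2-byte circuit ID, 1-byte command
PAYLOAD_SIZE = CELL_SIZE - HEADER.size   # 509 bytes left for the payload

def pack_cell(circ_id, command, payload):
    """Build one fixed-size cell, zero-padding the payload to full length."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload too large for one cell")
    return HEADER.pack(circ_id, command) + payload.ljust(PAYLOAD_SIZE, b"\x00")

def unpack_cell(cell):
    """Split a cell into (circuit ID, command, padded payload)."""
    if len(cell) != CELL_SIZE:
        raise ValueError("cells are always exactly 512 bytes")
    circ_id, command = HEADER.unpack_from(cell)
    return circ_id, command, cell[HEADER.size:]

CMD_EXAMPLE = 1  # hypothetical command value, for illustration only
cell = pack_cell(7, CMD_EXAMPLE, b"hi")
print(len(cell))  # 512
```

Because every cell has the same length regardless of how much data it carries, an observer cannot distinguish cells by size.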


    1.2 Circuit Construction


    Before transmitting data, the OP needs to establish the circuit and negotiate a symmetric key with each OR on it. It incrementally builds a circuit of encrypted connections through relays on the network. The circuit is extended one hop at a time, and each relay along the way knows only which relay gave it data and which relay it is giving data to. No individual relay ever knows the complete path that a data packet has taken. The OP negotiates a separate set of encryption keys for each hop along the circuit so that no hop can trace these connections as they pass through.

    This process can be shown as:

    Figure 4. Circuit Construction Step 1

    The OP randomly selects the first OR (OR1) and negotiates a shared symmetric key with it. This key is used as the session key to encrypt data.

    Figure 5. Circuit Construction Step 2

    The OP extends the circuit by one hop. It sends the handshake information to OR1, and OR1 helps the OP and OR2 negotiate a shared session key.

    Figure 6. Circuit Construction Step 3

    At this point, a complete circuit has been built.

    The key negotiation in this process is shown in detail in Figure 7.

    Figure 7. Detailed Information of Circuit Establishment

    The OP negotiates a session key with each OR using a Diffie-Hellman handshake.

    $OP\rightarrow OR1$ : c1, $E_{pk_{OR1}}(g^{x_1})$

    $OR1 \rightarrow OP$ : c1, $g^{y_1}, H(K_1||'handshake')$

    From the above handshake, OP and OR1 establish their shared session key: $K_1 = g^{x_1y_1}$.

    In the next step, OP extends the circuit by one hop with the help of OR1:
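The first handshake can be mimicked with a toy Diffie-Hellman exchange. This is a sketch under loud assumptions: the group parameters `P` and `G`, the use of SHA-256 for $H$, and all function names are chosen for illustration, and the public-key encryption $E_{pk_{OR1}}$ of the first message is left out entirely.

```python
import hashlib
import secrets

# Toy group parameters (assumptions for illustration only; real deployments
# use much larger, standardized groups).
P = 2**127 - 1   # a Mersenne prime
G = 3

def handshake_digest(key):
    """H(K || 'handshake'), with SHA-256 standing in for H (an assumption)."""
    return hashlib.sha256(key.to_bytes(16, "big") + b"handshake").digest()

def op_start():
    """OP picks x1 and computes g^x1 (sent to OR1; encryption under pk_OR1 omitted)."""
    x1 = secrets.randbelow(P - 2) + 1
    return x1, pow(G, x1, P)

def or_respond(gx1):
    """OR1 picks y1, derives K1 = (g^x1)^y1, and returns g^y1 plus the key digest."""
    y1 = secrets.randbelow(P - 2) + 1
    k1 = pow(gx1, y1, P)
    return pow(G, y1, P), handshake_digest(k1), k1

def op_finish(x1, gy1, proof):
    """OP derives K1 = (g^y1)^x1 and checks that OR1 really knows the same key."""
    k1 = pow(gy1, x1, P)
    if handshake_digest(k1) != proof:
        raise ValueError("handshake digest mismatch")
    return k1

x1, gx1 = op_start()
gy1, proof, k1_at_or = or_respond(gx1)
k1_at_op = op_finish(x1, gy1, proof)
print(k1_at_op == k1_at_or)  # True
```

The digest in OR1's reply lets the OP confirm that both sides computed the same $K_1$ without the key itself ever crossing the network.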

    $OP \rightarrow OR1$ : c1, $E_{K_1}(OR2,E_{pk_{OR2}}(g^{x_2}))$

    $OR1 \rightarrow OR2$ : c2, $E_{pk_{OR2}}(g^{x_2})$

    $OR2 \rightarrow OR1$ : c2, $g^{y_2},H(K_2||'handshake')$

    $OR1 \rightarrow OP$ : c1, $E_{K_1}(g^{y_2},H(K_2||'handshake'))$

    OP sends the encrypted handshake information to OR1. OR1 decrypts it, changes the circuit ID, and relays it to OR2. After receiving OR2’s response, OR1 encrypts it and sends it back to OP.

    The circuit is extended to OR3 in the same way. OP sends information that is encrypted in multiple layers, first with $K_2$ and then with $K_1$. OR1 removes one layer with $K_1$ and relays the result to OR2. OR2 removes another layer and relays it to OR3. OR3's response is likewise encrypted in multiple layers, one per session key of each node on the circuit, as it is forwarded back towards OP.
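    By analogy with the messages above, the extension to OR3 might be written out as follows. This is a sketch extrapolated from the previous step rather than something taken from Figure 7; c3 is a hypothetical circuit ID for the link between OR2 and OR3, and $K_3 = g^{x_3y_3}$.

    $OP \rightarrow OR1$ : c1, $E_{K_1}(E_{K_2}(OR3,E_{pk_{OR3}}(g^{x_3})))$

    $OR1 \rightarrow OR2$ : c2, $E_{K_2}(OR3,E_{pk_{OR3}}(g^{x_3}))$

    $OR2 \rightarrow OR3$ : c3, $E_{pk_{OR3}}(g^{x_3})$

    $OR3 \rightarrow OR2$ : c3, $g^{y_3},H(K_3||'handshake')$

    $OR2 \rightarrow OR1$ : c2, $E_{K_2}(g^{y_3},H(K_3||'handshake'))$

    $OR1 \rightarrow OP$ : c1, $E_{K_1}(E_{K_2}(g^{y_3},H(K_3||'handshake')))$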

    The asymmetric cryptography in this process helps prevent man-in-the-middle attacks on the Diffie-Hellman protocol. Because OR1 relays the cells between OP and OR2, OR2 does not learn about any node that comes before OR1. Therefore, no single OR knows the whole path.

    1.3 Traffic Relay


    Once the circuit is established, end-to-end stream data can be transmitted anonymously over it. At this point, the OP knows the keys it shares with every OR, while each OR knows only its previous hop and its next hop and has no other information, such as who the OP is.

    This process is similar to circuit establishment. The OP encrypts a cell in multiple layers with the session keys it shares with the ORs, in order from the farthest OR to the closest. The OP then sends the cell to the first OR on the circuit.

    Upon receiving a relay cell, an OR looks up the corresponding circuit and decrypts the relay header and payload with the session key for that circuit. The OR then checks whether the digest is valid. If this OR is not the last hop, the digest looks meaningless because it is still encrypted under the remaining layers. In that case, the OR looks up the circID and OR for the next step in the circuit, replaces the circID as appropriate, and sends the decrypted relay cell to the next OR.
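The lookup-and-replace step on the circuit ID can be pictured as a small per-router table. The sketch below is a simplification with hypothetical names (`OnionRouter`, `add_circuit`, `forward`) and leaves out the decryption step; it only shows that a router's state mentions its immediate neighbours and nothing else.

```python
from dataclasses import dataclass, field

@dataclass
class OnionRouter:
    name: str
    # Maps (previous hop, incoming circuit ID) -> (next hop, outgoing circuit ID).
    table: dict = field(default_factory=dict)

    def add_circuit(self, prev_hop, in_circ, next_hop, out_circ):
        """Record one circuit passing through this router."""
        self.table[(prev_hop, in_circ)] = (next_hop, out_circ)

    def forward(self, prev_hop, in_circ, payload):
        """Look up the circuit, swap the circuit ID, and name the next hop."""
        next_hop, out_circ = self.table[(prev_hop, in_circ)]
        return next_hop, out_circ, payload

or1 = OnionRouter("OR1")
or1.add_circuit("OP", 1, "OR2", 2)       # example circuit IDs, chosen arbitrarily
print(or1.forward("OP", 1, b"cell"))     # ('OR2', 2, b'cell')
```

Since circuit IDs are only meaningful on a single link, seeing the ID on one link reveals nothing about the IDs used elsewhere on the path.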

    When an OR replies to OP with a relay cell, it encrypts the cell’s relay header and payload with the single key it shares with OP and sends the cell back toward OP along the circuit. Subsequent ORs add further layers of encryption as they relay the cell back to OP.

    OP treats incoming relay cells similarly: it iteratively unwraps the relay header and payload with the session keys shared with each OR on the circuit, from closest to farthest. If at any stage the digest is valid, the cell must have originated at the OR whose encryption has just been removed.
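The return path and the digest check can be simulated in a short sketch. Everything here is a simplified assumption: the toy `keystream_xor` cipher, a 6-byte digest computed as a truncated SHA-256 over the key and payload, and the helper names are all invented for the example and do not reflect how Tor computes its digests.

```python
import hashlib

DIGEST_LEN = 6  # assumed digest length for this sketch

def keystream_xor(key, data):
    """Toy stream cipher (not secure): XOR with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def digest(key, payload):
    """Simplified integrity tag tied to the key shared with one OR."""
    return hashlib.sha256(b"digest" + key + payload).digest()[:DIGEST_LEN]

def reply_from(origin, hop_keys, payload):
    """Cell as it reaches OP when the OR at index `origin` replies."""
    key = hop_keys[origin]
    body = keystream_xor(key, digest(key, payload) + payload)
    for k in reversed(hop_keys[:origin]):   # ORs closer to OP add their layers
        body = keystream_xor(k, body)
    return body

def op_unwrap(hop_keys, body):
    """OP removes layers from closest to farthest until the digest is valid."""
    for index, key in enumerate(hop_keys):
        body = keystream_xor(key, body)
        tag, payload = body[:DIGEST_LEN], body[DIGEST_LEN:]
        if tag == digest(key, payload):
            return index, payload           # the cell originated at this OR
    raise ValueError("cell not recognized at any hop")

hop_keys = [b"K1", b"K2", b"K3"]            # hypothetical session keys
print(op_unwrap(hop_keys, reply_from(1, hop_keys, b"data")))  # (1, b'data')
```

The index returned by `op_unwrap` tells the OP which OR produced the cell, which is the property described in the paragraph above.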

    1.4 Location Hidden Services


    With the design described above, Tor helps users achieve anonymity. The responder may also need anonymity. Tor supports this through location hidden services, which allow a server to provide a TCP service to users without revealing the server's IP address.

    Goals of location hidden services include:


    The process to achieve it can be summarized as follows (where Bob is the server and Alice represents the user’s OP):

2. Controversies on Tor


    The real purpose of the Tor network is to keep your browsing habits anonymous and protect your privacy. By passing web traffic through a series of volunteer relays and encrypting the information at each stop, Tor makes the origin of internet traffic much more difficult to trace.

    An anonymous and private online experience is of value to many people including:

  • Parents seeking to protect their children from predators
  • Anyone accessing sensitive, personal information
  • Freedom fighters, whistle-blowers and journalists protecting themselves and their sources
  • Citizens of countries where censorship restricts internet access
  • Government agents handling classified information
  • Criminals seeking to deal in illegal goods or services

    However, in the public mind, the Tor network is typified by black market sites that law enforcement agencies attempt to locate and shut down. Not everyone using Tor has innocent motives. Some of Tor’s dark reputation is deserved, since the anonymity it grants can provide a haven for criminals.

    Perhaps the most notable examples are dark markets such as the Silk Road, AlphaBay and Hansa, all now closed down by the authorities. But the demand for such markets never disappears. While the Silk Road was live, it processed an estimated $15 million in transactions each year. These markets used Tor hidden services to connect customers with merchants, overwhelmingly for illegal goods such as drugs and fake IDs, and used cryptocurrencies such as Bitcoin to facilitate payments.

    But Tor has been used for other malicious ends as well. Over the years, law enforcement agencies have discovered and taken down sites that trade in child abuse imagery and others that offer illegal hacking services on the dark web. Terrorists have also been experimenting with the service, although the surprisingly low-tech approach of many would-be murderers reportedly puts them off such secure solutions.

    One 2015 survey of the dark web put the proportion of illicit content at around 40%.

    Such figures have led nations such as France to wonder if banning the service might be a possibility, but proponents say the 60% of sites that are perfectly legitimate are important to defend.

    Tor seems to be a double-edged sword that people may love and hate at the same time. What do you think of the Tor network?

    CS/ECE 5584: Network Security, Fall 2017, Ning Zhang