To understand the subtle and not so subtle differences between the three, one needs to know the OSI model and where in that model these operate.
Here is a brief overview of the OSI model.
The left column defines the data model for each layer. For example the smallest data block at the Network layer is a packet and the smallest data block at the transport layer is a segment.
Protocols like TCP and UDP function at the Transport layer, whereas IP and routing works at the Network layer.
Now coming to what operates in which layer:
Router
See the diagram below:
There are two machines, each on a different network connected through a Router. If machine A wants to communicate to machine B it needs to create a TCP connection. When the connection is created, both the machines A and B are unaware of the presence of a router in between. What this means is, in both the machines if we look up the IP:PORT combination it would be 10.0.0.1:1234 - 11.0.0.1:4567 i.e router is transparent.
In this case router does not participate at the Transport (TCP) layer and just acts as a relay for datagram packets. All it knows is where to send the packet. It does not modify the packets and does not require the response to come back through it. Infact, if there is another router somewhere in between these networks which often is the case if we assume two machines connecting over the internet, then the response may come back through some other route. Router is Network layer (TCP) protocol unaware.
NAT (Network Address Translator)
Now NAT is nothing but an intelligent router. Since we all know that IPv4 ips are in short supply and these days even something as dumb as a fridge has an IP. It is also a fact that most of the devices are consumers and not producers i.e they do not need a publicly resolvable IP. For eg : all the work stations inside a building need not have a public IP assigned to them.
This is where NAT comes into play. What NAT does is hide the machines behind it from the internet. All the machines go through NAT to access the outside world.
See the image below:
The machine A still thinks that it is talking to machine B directly (10.0.0.1:1234 - 11.0.0.1:4567). What the NAT does is replace the Source:Port Header of the packet it received from A with its own IP (w.x.y.z) and its own random port (7897). When the packet reaches machine B it thinks it came from the NAT IP.
Since a response is always to the source, it tries to send the response packet back to the NAT at the same port as was overridden in the header(7897). When the NAT had received the packet from A it had assigned it a random port (7897) and kept that in a table called NAT table. So when the response comes back from Machine B it just does a reverse look up in the same table and forwards it to the desired recipient (Machine A). This way more than one machines can access the internet through NAT.
One important point to note here is that at each layer of OSI model there is a checksum to determine if the packet/segment is valid. The same applies here. If the NAT changes the source IP:PORT combination, it need to recalculate the checksum again both at the Transport as well as Network layer. This leads to some additional work for the NAT.
Proxy:
A proxy works at the Transport layer and is aware of the protocol. Its not transparent in nature. It actually creates two connections one each with source and destination. Machine A does not even know about machine B. For machine A Proxy is the only thing its talking to and does not care how and where the proxy gets its data.
See the image below:
In the above picture Server A does not even know the IP of Server B. All it knows is the IP and port of the proxy server. The proxy server creates two connections, one with Server A and one with Server B. This happens at the Transport layer. Similarly server B does not even know the IP of server A, for it Proxy is the Source.
Examples of proxy servers are load balancers like HAProxy, Nginx, Apache, AWS ELB, F5 BIG-IP. They hide the backend from the outside world and do lot of nifty stuff like load balancing. optimization, etc.
Above is a very high level distinction of the three and the lines infact have blurred. Proxy for one is loosely used even for a NAT and vice versa. This just given you a starting point to learn more.
Here is a brief overview of the OSI model.
The left column defines the data model for each layer. For example the smallest data block at the Network layer is a packet and the smallest data block at the transport layer is a segment.
Protocols like TCP and UDP function at the Transport layer, whereas IP and routing works at the Network layer.
Now coming to what operates in which layer:
Router
See the diagram below:
There are two machines, each on a different network connected through a Router. If machine A wants to communicate to machine B it needs to create a TCP connection. When the connection is created, both the machines A and B are unaware of the presence of a router in between. What this means is, in both the machines if we look up the IP:PORT combination it would be 10.0.0.1:1234 - 11.0.0.1:4567 i.e router is transparent.
In this case router does not participate at the Transport (TCP) layer and just acts as a relay for datagram packets. All it knows is where to send the packet. It does not modify the packets and does not require the response to come back through it. Infact, if there is another router somewhere in between these networks which often is the case if we assume two machines connecting over the internet, then the response may come back through some other route. Router is Network layer (TCP) protocol unaware.
NAT (Network Address Translator)
Now NAT is nothing but an intelligent router. Since we all know that IPv4 ips are in short supply and these days even something as dumb as a fridge has an IP. It is also a fact that most of the devices are consumers and not producers i.e they do not need a publicly resolvable IP. For eg : all the work stations inside a building need not have a public IP assigned to them.
This is where NAT comes into play. What NAT does is hide the machines behind it from the internet. All the machines go through NAT to access the outside world.
See the image below:
The machine A still thinks that it is talking to machine B directly (10.0.0.1:1234 - 11.0.0.1:4567). What the NAT does is replace the Source:Port Header of the packet it received from A with its own IP (w.x.y.z) and its own random port (7897). When the packet reaches machine B it thinks it came from the NAT IP.
Since a response is always to the source, it tries to send the response packet back to the NAT at the same port as was overridden in the header(7897). When the NAT had received the packet from A it had assigned it a random port (7897) and kept that in a table called NAT table. So when the response comes back from Machine B it just does a reverse look up in the same table and forwards it to the desired recipient (Machine A). This way more than one machines can access the internet through NAT.
One important point to note here is that at each layer of OSI model there is a checksum to determine if the packet/segment is valid. The same applies here. If the NAT changes the source IP:PORT combination, it need to recalculate the checksum again both at the Transport as well as Network layer. This leads to some additional work for the NAT.
Proxy:
A proxy works at the Transport layer and is aware of the protocol. Its not transparent in nature. It actually creates two connections one each with source and destination. Machine A does not even know about machine B. For machine A Proxy is the only thing its talking to and does not care how and where the proxy gets its data.
See the image below:
In the above picture Server A does not even know the IP of Server B. All it knows is the IP and port of the proxy server. The proxy server creates two connections, one with Server A and one with Server B. This happens at the Transport layer. Similarly server B does not even know the IP of server A, for it Proxy is the Source.
Examples of proxy servers are load balancers like HAProxy, Nginx, Apache, AWS ELB, F5 BIG-IP. They hide the backend from the outside world and do lot of nifty stuff like load balancing. optimization, etc.
Above is a very high level distinction of the three and the lines infact have blurred. Proxy for one is loosely used even for a NAT and vice versa. This just given you a starting point to learn more.
Very good brief! Appreciate your effort!
ReplyDeleteThank you, that cleared up things a lot for me. So for example, a computer on network A wants to make a request to one of Google's servers. The public IP address it gets from the DNS server is actually a proxy. So the computer would send the packets to the through the router, which would translate the return address into a public IP using NAT, to the proxy server. The proxy server would then set up a separate connection with one of the Google servers, which would only see the IP address of the proxy server. The Google server would send the results to the proxy server, which (I assume the proxy server also does NAT or something similar) would then send them back to our router with the return address from NAT. The router would translate that IP address into a private address and send it to the computer making the request. Is that accurate?
ReplyDeleteBrief and clear explanation on nat and proxy. This is what i am looking for since long time which provides me complete detail of all the things. It is needed for my project which is based on dedicated socks5 proxy. Hope it is works fine.
ReplyDelete