By Abhinav Bhatia. Dec 6, 2021
Thanks for being with me so far. In the last blog we went through all the options to store our data. Up until now we have learnt about the cloud way of computing and storing data, but an important piece of the puzzle is still left: how to connect the application with the end user. Remember that with utility companies, you still need a power connection. Power lines are laid so that electricity can be transmitted from the generation source to the destination homes. Similarly, in the case of cloud, your source is the cloud data centre where all the computing power sits, but you need a way for a user to tap that computing power. What is that mode of connectivity? It's the Internet, and if you are a company looking for dedicated lines that guarantee higher performance at your offices, you can also have a leased line from your office network to the cloud network.
A few Disclaimers
All opinions discussed in this blog series are my own and should in no way be attributed to the companies that I am or have been a part of. These are my learnings, which I have tried to put in as simple a manner as possible. I understand that oversimplification can sometimes lead to an alternate version which might not be true. I will try my best not to oversimplify, but my only request is to take all of this with a pinch of salt. Validate from as many sources as possible.
Sitting at home or at the office, when you use an application on your mobile phone or laptop, which devices does your request pass through before it reaches its destination? Have you heard of names like DNS, CDN, Load Balancers, Firewalls, Routers? No? Let’s explore them in this blog today.
A couple of decades back, a telephone directory and the yellow pages used to be available in every home and hotel so that you could call a neighbour or a business. DNS performs the role of that telephone directory: your browser needs to know the IP address (the equivalent of a telephone number) of a domain name before it can connect, because every domain on the internet is reached through an IP address.
Every node on the internet has an IP address, which can be either IPv4 with 32 bits (126.96.36.199) or IPv6 with 128 bits (2606:2800:220:1:248:1893:25c8:194). It can be either Private (not exposed to the Internet) or Public (reachable on the Internet).
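To make the IPv4/IPv6 and Private/Public distinction concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `describe` and the sample addresses `10.0.0.5` and `8.8.8.8` are my own picks for illustration, not something from the series.

```python
import ipaddress


def describe(address: str) -> dict:
    """Return the IP version, address width in bits, and whether it is private."""
    ip = ipaddress.ip_address(address)
    return {
        "version": ip.version,        # 4 or 6
        "bits": ip.max_prefixlen,     # 32 for IPv4, 128 for IPv6
        "private": ip.is_private,     # True for ranges such as 10.0.0.0/8
    }


# Sample addresses chosen for illustration only.
for sample in ("10.0.0.5", "8.8.8.8", "2606:2800:220:1:248:1893:25c8:194"):
    print(sample, describe(sample))
```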
Essentially, when you type a domain name like example.com. into your browser, the browser uses the DNS client of your laptop to make a recursive request to a DNS resolver, normally run by your ISP. The job of the DNS resolver is to get the IP address. It first reaches out to a root server (note the . after example.com.) and fetches the IP address of the Top Level Domain (TLD) name servers (com). The resolver then reaches out to the TLD server and fetches the IP address of the authoritative name server of example.com. Finally it reaches the authoritative name server of example.com, which answers that example.com is located at the IPv4 address 188.8.131.52. Normally every component along the way keeps a cache, which lets it serve the answer directly and save several round trips.
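The walk from root to TLD to authoritative server, plus the cache, can be sketched as a toy simulation. This is only an illustration under my own assumptions: the three dictionaries stand in for real servers, the server names are hypothetical, and `192.0.2.10` is a documentation-range address rather than the real answer for example.com. A real resolver does all of this over the network.

```python
# Hypothetical in-memory "servers"; real resolution happens over the network.
ROOT = {"com.": "tld-com"}                                  # root knows the TLD servers
TLD = {"tld-com": {"example.com.": "ns-example"}}           # TLD knows the authoritative server
AUTHORITATIVE = {"ns-example": {"example.com.": "192.0.2.10"}}  # made-up answer


class Resolver:
    """Toy recursive resolver with a cache (no TTLs, no error handling)."""

    def __init__(self):
        self.cache = {}
        self.lookups = 0  # counts upstream queries, to show what the cache saves

    def resolve(self, name: str) -> str:
        if not name.endswith("."):
            name += "."                      # make the root dot explicit
        if name in self.cache:
            return self.cache[name]          # answered from cache, no round trips

        labels = name.rstrip(".").split(".")
        tld = labels[-1] + "."               # "example.com." -> "com."
        domain = ".".join(labels[-2:]) + "."  # registered domain, e.g. "example.com."

        tld_server = ROOT[tld]               # step 1: ask the root server
        self.lookups += 1
        auth_server = TLD[tld_server][domain]  # step 2: ask the TLD server
        self.lookups += 1
        address = AUTHORITATIVE[auth_server][name]  # step 3: ask the authoritative server
        self.lookups += 1

        self.cache[name] = address
        return address
```

Calling `Resolver().resolve("example.com")` twice on the same instance performs three upstream lookups the first time and none the second time, which is the saving the cache buys you.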
Generally, if you are hosting your own web server, you also have to think about hosting its authoritative DNS server and creating records like
- A record, which gives the IPv4 address
- AAAA record, which gives the IPv6 address
- CNAME record, which aliases one domain name to another
- MX record, which gives the host name of the email server for the domain (it cannot point to a CNAME record)
- NS record, which stores the name server for a DNS zone
- TXT record, which stores arbitrary text notes. Useful for email security or, sometimes, to verify that you own the domain name.
- SOA (Start of Authority) record, used to store admin information about the domain (like the email of the domain administrator)
- SRV (Service) record, used to store the host and port for specific services (protocols like Session Initiation Protocol, used in VoIP calls, often require SRV records)
- PTR record, which provides the domain name for an IP address (in reverse lookups)
- CAA (Certification Authority Authorization) record, which lists the acceptable Certificate Authorities for a host/domain
- DNSKEY: the key record for DNSSEC (DNS Security Extensions: the use of public key cryptography to make DNS more secure by verifying the records returned)
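To picture how a few of these record types sit together in a zone, here is a rough sketch of a zone as a Python dictionary, with a lookup that follows a CNAME to its target. All names and addresses in `ZONE` are made up for illustration (documentation-range values), and the `lookup` helper is hypothetical; a real DNS server has many more rules than this.

```python
# A toy zone: (name, record type) -> list of values. All values are made up.
ZONE = {
    ("example.com.", "A"): ["192.0.2.10"],
    ("example.com.", "AAAA"): ["2001:db8::10"],
    ("www.example.com.", "CNAME"): ["example.com."],   # alias to the apex
    ("example.com.", "MX"): ["10 mail.example.com."],  # priority, then mail host
    ("mail.example.com.", "A"): ["192.0.2.25"],        # MX target has its own A record
    ("example.com.", "TXT"): ["v=spf1 mx -all"],
}


def lookup(name: str, rtype: str, max_hops: int = 5) -> list:
    """Return records of the given type, following CNAME aliases if needed."""
    for _ in range(max_hops):            # bound the chain to avoid alias loops
        records = ZONE.get((name, rtype))
        if records:
            return records
        alias = ZONE.get((name, "CNAME"))
        if not alias:
            return []                    # no such record and no alias to follow
        name = alias[0]                  # retry the query against the alias target
    return []
```

For example, `lookup("www.example.com.", "A")` finds no A record on `www`, follows the CNAME to `example.com.`, and returns that name's A record.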
If you need such a service, you can either install it on your own VM and manage it yourself or go for a managed DNS offering in which you only have to configure it. Google Cloud has Cloud DNS, which uses anycast name servers (a request is sent to the nearest node among multiple possible nodes) to serve your DNS zones from redundant locations around the world, providing high availability and lower latency for your users. With Cloud DNS you can set up either Public Zones (used by public domains hosted by your business, like websites and user apps) or Private Zones (used by private domains, like internal sites and service-to-service calls within your application).
There is also Cloud Domains which enables you to register, transfer and configure a domain in Google Cloud.
Once we have the IP address from the A record of the domain we have hosted, what do you think that address belongs to? A web server running middleware like Apache/nginx in the case of Linux, or Internet Information Services in the case of Windows. But do you think a single VM would be enough to take the load of all your users? No. That’s where load balancers come in: as the name signifies, they distribute the load across multiple servers hosting the web application. There are primarily two kinds of load balancers
- Application Load Balancers: These load balancers work at Layer 7 of the OSI model, where the application layer sits. They understand application layer protocols like HTTP/HTTPS (ports 80, 8080, 443) and WebSockets, so they can take decisions like routing a request to a specific server based on host and path rules (a host rule says: for a.example.com go to servers 1, 2, 3 and for b.example.com go to servers 4, 5, 6; a path rule says: for example.com/a go to servers 1, 2, 3 and for example.com/b go to servers 4, 5, 6). They can also take care of terminating an HTTPS connection and opening a new HTTP or HTTPS connection with the backend.
- Network Load Balancers: These load balancers work at Layer 4 of the OSI model and are good for applications using network protocols like TCP and UDP. Not all application modules have a web server. Some have logic layers that work on something other than the web protocols and ports, such as port 3000 or any other.
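The host and path rules above can be sketched in a few lines. This is my own simplified illustration, not how any particular load balancer is implemented: the pool names, the server names and the `route` helper are hypothetical, and round robin is just one of several possible ways to spread requests within a pool.

```python
from itertools import cycle

# Hypothetical backend pools; each pool hands out its servers round robin.
POOLS = {
    "pool-a": cycle(["server-1", "server-2", "server-3"]),
    "pool-b": cycle(["server-4", "server-5", "server-6"]),
}

# Host rules: the whole host name decides the pool.
HOST_RULES = {"a.example.com": "pool-a", "b.example.com": "pool-b"}

# Path rules for example.com: the first matching path prefix decides the pool.
PATH_RULES = [("/a", "pool-a"), ("/b", "pool-b")]


def route(host: str, path: str):
    """Pick a backend server for a request, or None if no rule matches."""
    pool = HOST_RULES.get(host)
    if pool is None and host == "example.com":
        for prefix, candidate in PATH_RULES:
            # Match "/a" itself or anything below it, but not "/about".
            if path == prefix or path.startswith(prefix + "/"):
                pool = candidate
                break
    if pool is None:
        return None
    return next(POOLS[pool])  # round robin inside the chosen pool
```

A Layer 4 load balancer could reuse the round-robin part, but it would never look at `host` or `path`, because it only sees IP addresses, ports and the protocol.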
GCP has different options with respect to load balancers.
Content Delivery Network
Web Application Firewall
Specifically for web traffic, you need a device which keeps a check on the incoming traffic and detects whether someone is trying to attack you. The attack can target your database (the heart of your application) or your web layer. Some of these attacks are
- Cross Site Request Forgery: Tricks authenticated users or admins into clicking malicious links (through social engineering attacks like emails or chats), which makes them perform state-changing activities like transferring funds or changing passwords.
- SQL injection: A malicious SQL query, written with the intent to read or delete sensitive data in the database, often injected through objects like forms.
On cloud, solutions like Cloud Armor go one step further and also protect against Layer 7 DDoS attacks.
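To see why SQL injection works, here is a small sketch with Python's built-in `sqlite3` module. The table, the data and the two helper functions are made up for illustration. It shows the root cause in application code (building a query by pasting user input into a string) and the usual fix (a parameterized query); a web application firewall is an extra layer in front of this, not a replacement for it.

```python
import sqlite3

# In-memory database with made-up data, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_unsafe(name: str) -> list:
    """Vulnerable: user input is pasted straight into the SQL text."""
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(name: str) -> list:
    """Safer: the driver passes the input as data, never as SQL."""
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


# A typical injection payload typed into a form field.
payload = "nobody' OR '1'='1"
print(len(find_unsafe(payload)))  # every row leaks
print(len(find_safe(payload)))    # no row matches the literal string
```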
Virtual Private Cloud
We discussed earlier that every node on the network has an IP address. While sending a letter, you always write the To address and the From address. Similarly, all communication between nodes requires an IP address at either end, because that’s how the system is designed. Everything in cloud is virtualised, and so is the network. A Virtual Private Cloud is an encompassing network container which provides logical segregation along with
- IP addresses: through constructs like subnets (184.108.40.206/24)
- Routes: Similar to roads connecting a source and a destination, and enabled by a network device called a router, routes connect two nodes and provide different paths to reach a particular point
- Gateway to the internet: A device similar to a router that has a route to the Internet (0.0.0.0/0)
- NAT: A device whose main aim is to hide sensitive private IP addresses on outbound traffic by substituting a public IP, so that the private IP is not exposed to the world, and to translate the public IP back to the private one when the response traffic is received.
- Firewall: Similar to a security guard, it allows or denies traffic entering or leaving a VPC based on policies that are generally defined on a 5-tuple (source IP, destination IP, source port, destination port, protocol).
- Virtual Private Network (VPN): Extends a private network (with private IP addresses) across a public network with encryption, so that you can extend a cloud data centre to, for example, your office network and communicate in a safe and secure manner
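The firewall's 5-tuple check can be sketched as below. This is a simplified illustration under my own assumptions: the rule format, the subnets and the first-match-wins, deny-by-default behaviour are made up for the example and are not taken from any specific cloud firewall.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                      # "allow" or "deny"
    src: str                         # source range in CIDR notation
    dst: str                         # destination range in CIDR notation
    protocol: str                    # "tcp" or "udp"
    dst_port: Optional[int] = None   # None means any port
    src_port: Optional[int] = None   # None means any port

    def matches(self, src_ip, dst_ip, src_port, dst_port, protocol) -> bool:
        """Compare one packet's 5-tuple against this rule."""
        return (
            ipaddress.ip_address(src_ip) in ipaddress.ip_network(self.src)
            and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(self.dst)
            and self.protocol == protocol
            and self.src_port in (None, src_port)
            and self.dst_port in (None, dst_port)
        )


# Hypothetical policy: HTTPS from anywhere, SSH only from inside 10.0.0.0/16.
RULES = [
    Rule("allow", "0.0.0.0/0", "10.0.1.0/24", "tcp", dst_port=443),
    Rule("allow", "10.0.0.0/16", "10.0.1.0/24", "tcp", dst_port=22),
]


def evaluate(src_ip, dst_ip, src_port, dst_port, protocol) -> str:
    """First matching rule wins; anything unmatched is denied (an assumption)."""
    for rule in RULES:
        if rule.matches(src_ip, dst_ip, src_port, dst_port, protocol):
            return rule.action
    return "deny"
```

With this policy, an HTTPS request from a public address to `10.0.1.5` is allowed, while an SSH attempt from the same public address falls through to the default deny.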
And that’s it. I hope this five-part series on understanding cloud has helped all of you grasp the basic tenets on which the cloud is built, so that you can start joining the dots and designing your first application on cloud. There are still a lot of pertinent functions that I have not covered, like Monitoring, Logging, Message queues, and AI and ML systems, which will hopefully make Part 6.
Check out the other parts here:
- Learning Cloud through GCP - Part 1: What is Cloud ? where I have tried to demystify cloud in as simple a manner as possible so that our mind can put a picture to it (Is the Cloud Model similar to a Vending Machine / a utility company ?)
- Learning Cloud through GCP - Part 2: How can I consume Cloud ? where I first discuss the layers of a software application and then the different service models in which those layers are bundled and sold by a Cloud vendor with a Shared Responsibility Model. (IaaS, PaaS, SaaS, FaaS, XaaS)
- Learning Cloud through GCP - Part 3: Where do I compute on Cloud ? where I touch upon the several compute options available on cloud and how to choose between them. (VM vs Kubernetes vs Serverless vs Event Driven Serverless Framework)
- Learning Cloud through GCP - Part 4: Where do I store my data on Cloud ? where I try to answer a very pertinent question on how to select the right storage unit to hold your data. (OLTP vs OLAP, ETL vs ELT, SQL vs NoSQL, File vs Block Storage, What is NewSQL ?)
- Learning Cloud through GCP - Part 5: How do I connect to my Cloud ? (this blog)
The original article was published on Medium.