
Introduction to Web and HTTP Protocol

HTTP (Hypertext Transfer Protocol) is one of the building blocks of the WWW (World Wide Web); you could call it the backbone of the WWW. If you don’t know what the WWW is, we will cover it later in this article. Before diving deep into the web world, I assume you already know the basics of computer networking. If you don’t, I highly recommend learning networking concepts such as the OSI model, TCP/IP, protocols and network devices first.

World Wide Web

The WWW is a collection of web pages that can be accessed via the internet. In other words, the WWW is the part of the internet that lets us access web pages. The WWW, or simply the web, has gone through multiple versions. To make the web work, a few things need to come together.

  • Protocol – To access anything over a network we need a defined protocol for the specific resource or service. A protocol is defined for everything you do over the network, such as SSH or DNS. The WWW is a way to access web pages, and to access those pages over the network or internet we have the HTTP protocol.
  • Web Page – The WWW shows web pages, so we need a web page. A web page is defined using HTML, which is the core part of the page. We could use plain text files as pages too, but the WWW is a collection of web pages, and to connect multiple pages we need hyperlinks, which HTML provides.
  • URL – To identify the many web pages on the internet we need a specific and unique identifier. The URL, or Uniform Resource Locator, provides a unique name and location for each resource on the internet.
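To make the URL idea concrete, here is a minimal sketch that splits a URL into its parts with Python's standard library. The path, query and fragment in the example URL are made up for illustration; only the host name comes from this article.

```python
from urllib.parse import urlsplit

# Example URL; everything after the host name is invented for this sketch.
url = "https://allabouthack.com/blog/post.html?page=2#comments"

parts = urlsplit(url)
print(parts.scheme)    # protocol used to reach the resource
print(parts.hostname)  # server that hosts the resource
print(parts.path)      # location of the resource on that server
print(parts.query)     # optional parameters
print(parts.fragment)  # optional position inside the page
```

Together, the scheme, host and path are what make the identifier unique: change any one of them and you are pointing at a different resource.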

Web Versions

Over the years the web has been optimized and improved, and a lot has changed. There are mainly 3 versions of the web.

  • Web 1.0 – Web 1.0 is where everything started. It was just a collection of static web pages. A static page is the same for everyone. You can think of it as a newspaper, where the content is the same for every reader and you can only read or view it.
  • Web 2.0 – The next version of the web changed the way web pages work. Web 2.0 is not just about static content. Web pages can be dynamic, and web 2.0 allows users to both read and modify data. Think of your social media account, where each user can see different content even on the same page. Users can also edit, delete or modify data such as profile images. Everything we currently do on the web is possible because of web 2.0.
  • Web 3.0 – Web 3.0 is the upcoming version of the web that works on top of blockchain technology. The idea behind web 3.0 is to decentralize user data.

Web 3.0 is out of scope for this series, so we are going to focus on web 2.0.

The Working of Web

So far we have covered the web and the components it needs to work. Now let’s look at how the web actually works and how a web page is accessed.

The user or client who wants to view a web page must have a URL. Once the user has the URL, the client (usually a browser) uses the HTTP protocol to send a request to the server. The server receives the request and responds with the requested web page.

There are many different kinds of servers on a network, such as SMTP servers and FTP servers. To understand requests for web pages, we need a web server. There are many web servers available, such as Microsoft IIS, Nginx and Apache.

HTTP Request

The underlying protocol for the web is HTTP. The default port number is 80 for HTTP and 443 for HTTPS (HTTP over a secure connection). In the above image, the user sends an HTTP request to the web server. The / is the path of the home page: the URL https://allabouthack.com is referred to as / in the HTTP request. We will see all this later in this article.
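As a rough sketch of the two rules above (default ports and the / path), the hypothetical helper below works out where a client would connect and what it would ask for. The function name and the port 8080 example are assumptions made for illustration.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def connection_target(url):
    """Return (host, port, request_target) for an http(s) URL."""
    parts = urlsplit(url)
    # Use the explicit port if the URL has one, else the scheme's default.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    # An empty path means the home page, which is requested as "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return parts.hostname, port, target

print(connection_target("https://allabouthack.com"))
print(connection_target("http://allabouthack.com:8080/login.html"))
```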

Once the web server receives the request, it answers the user with an HTTP response that contains the web page. The server may carry out many processing steps before it gives the response.
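The request and response cycle can be tried out without touching the internet. The sketch below starts a throwaway web server on the local machine and requests the home page from it. The handler class, the page content and the `fetch` helper are all invented for this demo, and it assumes the loopback address 127.0.0.1 is available.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Home page</body></html>"

class HomePageHandler(BaseHTTPRequestHandler):
    """Answers GET / with a tiny page and everything else with 404."""

    def do_GET(self):
        found = self.path == "/"
        body = PAGE if found else b"Not Found"
        self.send_response(200 if found else 404)
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet

def fetch(path):
    """Start a local server, request `path` from it, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HomePageHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)          # the HTTP request
        response = conn.getresponse()      # the HTTP response
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()

print(fetch("/"))
```

A real web server such as Nginx or Apache does far more than this, but the shape of the exchange is the same: one request in, one response out.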

What is HTTP

HTTP is the backbone of the web. The browser uses HTTP to communicate with the web server. HTTP is an application layer protocol used for client and server communication on the web. As a protocol, it defines a set of rules for how HTTP works and what its elements are. There are different versions of HTTP. Up to HTTP/2, HTTP uses TCP for communication. The most recent version, HTTP/3, runs over QUIC, which is built on UDP, and is not yet used by every application.

HTTP requires the client to initiate the exchange, and the server then gives the response.

HTTP Request

The main parts of the HTTP request are as below.

  • Request Header – Each HTTP request has headers that describe the request being sent to the server. There are many request headers in use.
  • Request Method – Every request uses a method that tells the server what to do. Each method works differently, for example GET, DELETE and POST.
  • Request Body – The request can have a body, depending on the HTTP method. The body is used to send data to the server; for example, the username and password you enter are sent in the body.
GET /homepage.html HTTP/1.1
Host: allabouthack.com
Cookie: session=abcd

The above HTTP request is simple. Let’s go through it piece by piece.

  • GET – This is the HTTP method. GET is used to fetch details from the server. As we are only fetching, the request doesn’t have a body.
  • /homepage.html – This is the path or location we want to fetch. In this case, we are telling the server that we want to fetch the home page.
  • HTTP/1.1 – This is the version of HTTP used in this request.
  • Host and Cookie – Host and Cookie are the headers of this request, which give the server information about the request. We will cover these in detail later in this article.
POST /login.html HTTP/1.1
Host: allabouthack.com
Cookie: session=abcd
Content-Type: application/x-www-form-urlencoded

username=admin&password=admin123

In the above request, the method is POST, which is used to post or send data to the server. There is an additional header, Content-Type, which tells the server what kind of data the body contains. Lastly, after a blank line, we have the body.
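To see how those parts fit together, here is a small sketch that takes a raw request like the one above and splits it into method, path, version, headers and body. The `parse_request` function is a simplified illustration, not a full HTTP parser. It labels the body as application/x-www-form-urlencoded, the type browsers use for HTML form data.

```python
from urllib.parse import parse_qs

RAW_REQUEST = (
    "POST /login.html HTTP/1.1\r\n"
    "Host: allabouthack.com\r\n"
    "Cookie: session=abcd\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "username=admin&password=admin123"
)

def parse_request(raw):
    """Split a raw HTTP/1.1 request into its main parts (simplified)."""
    # A blank line separates the request line and headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body

method, target, version, headers, body = parse_request(RAW_REQUEST)
form = parse_qs(body)  # decode the form fields carried in the body
print(method, target, version)
print(headers["content-type"])
print(form)
```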

HTTP Methods

HTTP methods are part of the request and define what the request is about and what the server is expected to do. HTTP methods are also known as HTTP verbs. Remember that method names are case-sensitive, and the standard methods are written in capital letters.

GET

The GET method is used to fetch or retrieve information from the server. For example, if you type google.com, you want to retrieve the Google home page, so the browser sends a GET request. Remember that any URL you enter in the browser address bar is sent as a GET request by default. Also, a GET request normally has no request body, so all the data is in the URL. For that reason, sensitive data like usernames and passwords should not be sent using the GET method.

POST

The POST method is used to send client-side information or user data to the server. It is also used for file uploads. The data of a POST request is in the body, so you won’t see it in the browser address bar or history. When you submit a form, such as a contact or login form, it is sent with POST most of the time, which is why you can’t see the data in the URL in the browser.

DELETE

The DELETE method, as the name suggests, is used to delete a resource on the server. The request might look similar to a GET request, but it names a resource that should be deleted.

PUT

The PUT method is used to replace or modify data on the server side. A PUT request looks quite similar to a POST request, with a body. PUT is also used to upload files.

TRACE

The TRACE method is used for debugging. The server simply returns, or echoes, the request that was sent, so you can see what it received. This method is mostly not allowed over the internet, as it may disclose internal information.

HEAD

The HEAD method is similar to the GET method. A HEAD request looks like a GET request; the only difference is that the response has no body, so only the response headers are returned.

OPTIONS

The OPTIONS method is used to ask the server which methods it has enabled. By using the OPTIONS method we can check which HTTP methods the server supports.

CONNECT

The CONNECT method is used to create a tunnel to the server, most often to carry an HTTPS connection. It is mostly used when there is a proxy between the user and the web server. In that case, the client sends CONNECT to the proxy, and the proxy opens a connection to the destination server and relays the traffic between the two.

PATCH

The PATCH method is similar to PUT. It is also used to modify a resource, but only part of it. In other words, PATCH is used to partially modify a resource.
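The difference between PUT and PATCH is easier to see with data. The sketch below models a server-side resource as a plain dictionary; the `put` and `patch` functions and the profile fields are hypothetical and only imitate the intended behaviour of the two methods.

```python
def put(store, key, new_resource):
    """PUT-style update: replace the whole stored resource."""
    store[key] = dict(new_resource)

def patch(store, key, changes):
    """PATCH-style update (simple merge): change only the given fields."""
    store[key] = {**store[key], **changes}

profiles = {"admin": {"name": "Admin", "email": "admin@example.com"}}

patch(profiles, "admin", {"email": "new@example.com"})
print(profiles["admin"])  # name is kept, only email changed

put(profiles, "admin", {"email": "other@example.com"})
print(profiles["admin"])  # name is gone because the resource was replaced
```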

These HTTP methods are designed to perform specific tasks, but nothing in the protocol stops an application from using a different method in most cases. For example, a login request could be sent with either GET or POST.

POST /login.html HTTP/1.1
Host: allabouthack.com
Cookie: session=abcd
Content-Type: application/x-www-form-urlencoded

username=admin&password=admin123

GET /login.html?username=admin&password=admin123 HTTP/1.1
Host: allabouthack.com
Cookie: session=abcd

Both requests are valid and can be used, but it is always good to use the appropriate method. From a security point of view, sensitive information should not be sent with the GET method. In the above GET request, username and password are called parameters.
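Here is a short sketch of why the choice matters. The same parameters are encoded once, then placed either in the URL (GET) or in the body (POST). The variable names are illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

params = {"username": "admin", "password": "admin123"}
encoded = urlencode(params)

# GET: the parameters become part of the request target, so they also end up
# in anything that records the URL (address bar, history, server logs).
get_target = "/login.html?" + encoded

# POST: the request target stays clean and the same string travels in the body.
post_target = "/login.html"
post_body = encoded

print(get_target)
print(parse_qs(urlsplit(get_target).query))
print(post_target, post_body)
```

Note that POST only keeps the data out of the URL. The body is still readable on the network unless the connection uses HTTPS.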

Common HTTP Request Headers

Now that we know about HTTP methods, we can see in the raw HTTP requests above that each request has headers. These headers give the server information about the request. There are many request headers that can be used in HTTP requests, and apart from the standard ones a developer can create custom request headers for their own purposes. Requests are normally initiated by the browser, so the browser adds the headers based on the request and the circumstances. Some headers will always be there, and others are added only for certain requests. To send a custom header from the user’s browser, developers need to build the request themselves with JavaScript code. We will cover all these parts in another article of this series.

Host

The Host header is one of the most important request headers. Since HTTP/1.1, the Host header is required and must always be present in the request. It tells the server which host (website) the request is meant for. The Host header can also include a port number, which is needed when the server is not listening on the default port.

Other concepts, such as shared hosting and reverse proxies, also use the Host header to work out which website a request is for. We will look at those later in this series.
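As a rough illustration of how one machine can serve several websites, the sketch below picks a site based only on the Host header. The site table, the function names and the second host name are hypothetical. The parser is deliberately simple and does not handle IPv6 addresses.

```python
# Hypothetical sites served from one machine (names are illustrative).
SITES = {
    "allabouthack.com": "All About Hack home page",
    "blog.example.com": "Example blog home page",
}

def split_host(host_header, default_port=80):
    """Split a Host value such as 'example.com:8080' into (name, port).

    Simplified: IPv6 literals like '[::1]:80' are not handled.
    """
    name, sep, port = host_header.strip().partition(":")
    return name.lower(), (int(port) if sep else default_port)

def pick_site(host_header):
    """Choose which site should answer, based only on the Host header."""
    name, _port = split_host(host_header)
    return SITES.get(name, "Unknown site")

print(split_host("allabouthack.com"))
print(split_host("blog.example.com:8080"))
print(pick_site("Blog.Example.com:8080"))
```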

Referer

The Referer is an optional header used to tell the server which web page the current request came from. For example, if you are at google.com and click a button or link there that sends a request to gmail.com, the Referer header in that request will look like this:

Referer: https://google.com

This tells the server which web page the request is coming from.

User-Agent

The User-Agent header tells the server about the device and browser the user is using, such as Chrome or Firefox, along with details like the version and the OS. The server can give different responses based on the browser or OS, for example mobile or desktop.

Content-Type

The Content-Type header is used in both requests and responses. Its purpose is to tell the server or browser the type of the request or response body, such as HTML, XML or JSON.
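The receiver uses Content-Type to decide how to read the body. The sketch below shows one way a program might do that for three common types. The `decode_body` function is an illustrative helper, and falling back to UTF-8 when no charset is given is an assumption of this sketch.

```python
import json
from email.message import Message
from urllib.parse import parse_qs

def decode_body(content_type, body):
    """Decode raw body bytes according to a Content-Type header value."""
    header = Message()
    header["Content-Type"] = content_type
    media_type = header.get_content_type()             # e.g. "application/json"
    charset = header.get_content_charset() or "utf-8"  # assumed default
    text = body.decode(charset)
    if media_type == "application/json":
        return json.loads(text)
    if media_type == "application/x-www-form-urlencoded":
        return parse_qs(text)
    return text  # HTML, plain text and anything else stay as a string

print(decode_body("application/json", b'{"username": "admin"}'))
print(decode_body("application/x-www-form-urlencoded", b"username=admin"))
print(decode_body("text/html; charset=UTF-8", b"<p>hi</p>"))
```

If the header and the body disagree (for example, JSON sent with a form content type), the receiver will misread the data, which is why the header should always match the body.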

HTTP Response

Each HTTP request sent by the user gets a response from the server. The response has a similar format to the request, but carries other information that can be used by the browser or other relevant applications. The main parts of the HTTP response are as below.

  • Status Code
  • Response Header
  • Response Body
HTTP/1.1 200 OK
Date: Sun, 10 Jul 2022 08:53:01 GMT
Content-Type: text/html; charset=UTF-8
x-powered-by: PHP/8.1.1
Strict-Transport-Security: max-age=15552000
Server: Nginx
Content-Length: 92791

<!DOCTYPE html><html lang="en-GB">
....

HTTP Status Code

The HTTP status code is a 3-digit code sent by the server with each HTTP response. It allows the user or browser to understand whether the request succeeded, failed or ran into an error, along with other details. Status codes are divided into 5 series or categories.

  • 1 or 1xx – Used for information
  • 2 or 2xx – Used for success
  • 3 or 3xx – Used for redirection
  • 4 or 4xx – Error from the client
  • 5 or 5xx – Error at the server
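Because the category is decided by the first digit, a program can classify any status code with simple arithmetic. The function and the label strings below are illustrative.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def status_class(code):
    """Return the category of a 3-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]  # the first digit picks the category

for code in (101, 200, 301, 404, 500):
    print(code, status_class(code))
```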

Common Status Code

101 Switching Protocols

The 101 Switching Protocols status code tells the client that the server is switching from HTTP to another protocol, as the client requested. It is mostly used to upgrade an HTTP connection to WebSocket (ws:// or wss://).

200 OK

The 200 status tells the client that the request was successful. It is mostly used when you request a page like /contact-us and the server returns the page in the response body. 200 OK means the request has been fulfilled.

201 Created

It indicates that the request has been fulfilled and a new resource has been created as a result.

301 Moved Permanently

The 301 Moved Permanently status indicates that the resource you are looking for has permanently moved to another location, for example /contact is now /contact-us.

302 Found (Moved Temporarily)

The 302 status is used when the resource has temporarily moved to another location.
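With both redirect codes, the server names the new location in a Location response header, and the client makes a second request to that address. The helper below is a hypothetical sketch of that step; it only covers 301 and 302.

```python
from urllib.parse import urljoin

def redirect_target(request_url, status, location):
    """Return the URL to request next, or None if this is not a 301/302 redirect."""
    if status in (301, 302) and location:
        # Location may be relative, so resolve it against the original URL.
        return urljoin(request_url, location)
    return None

print(redirect_target("https://allabouthack.com/contact", 301, "/contact-us"))
print(redirect_target("https://allabouthack.com/contact", 200, None))
```

The practical difference is in what the client remembers: a 301 can be cached so future requests go straight to the new address, while a 302 should be checked again next time.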

400 Bad Request

The 400 Bad Request status indicates that the server cannot handle the request because it is invalid, for example because it contains malformed data or uses invalid request syntax.

401 Unauthorized

The 401 Unauthorized status means that the resource requires valid credentials. For example, you need to log in before you can access the profile page.

403 Forbidden

The 403 Forbidden status is used when you don’t have permission to access a particular resource, irrespective of your credentials.
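401 and 403 are easy to mix up, so here is a minimal sketch of the decision a server might make for a protected page. The function, its parameters and the user names are all hypothetical.

```python
def access_status(user, allowed_users):
    """Pick a status code for a protected page.

    user: the logged-in user name, or None when no valid credentials were sent.
    allowed_users: names permitted to see the page (hypothetical policy).
    """
    if user is None:
        return 401  # not authenticated: log in first
    if user not in allowed_users:
        return 403  # authenticated, but not permitted
    return 200

print(access_status(None, {"admin"}))
print(access_status("guest", {"admin"}))
print(access_status("admin", {"admin"}))
```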

404 Not Found

The most popular one, 404 Not Found, indicates that the resource is not available or could not be found by the server.

500 Internal Server Error

The 500 Internal Server Error indicates that the server ran into a problem while processing your request and was unable to complete it.

There are many other status codes, which are used for different purposes.
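If you want to look up a code that is not listed here, Python's standard library ships a table of the registered status codes. This short sketch prints the official reason phrase for each code discussed above.

```python
from http import HTTPStatus

for code in (101, 200, 201, 301, 302, 400, 401, 403, 404, 500):
    status = HTTPStatus(code)
    print(status.value, status.phrase)
```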

Response Header

Just like request headers, there are response headers as well. Their purpose is similar, but response headers are used by the browser instead of the server. The browser uses them to process the response body, apply security and access rules, and perform other relevant tasks.

Common HTTP Response Headers

Content-Type

This is the same header that is used in requests, and its meaning is the same in responses. It tells the browser the type of the body, such as HTML, JS, CSS or JSON.

Set-Cookie

The Set-Cookie header sets a cookie for the user. Cookies are used to track user activity and allow the server to validate the user’s session and authentication. We will cover this in detail later.
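The round trip can be sketched with Python's standard cookie parser: the server sends Set-Cookie once, and the browser sends the value back in a Cookie header on later requests. The Path and HttpOnly attributes in the example value are illustrative choices, not something taken from a real response.

```python
from http.cookies import SimpleCookie

# Header value a server might send (attribute choices are illustrative).
set_cookie_value = "session=abcd; Path=/; HttpOnly"

jar = SimpleCookie()
jar.load(set_cookie_value)
morsel = jar["session"]
print(morsel.value)    # the session identifier
print(morsel["path"])  # the part of the site the cookie applies to

# On later requests the browser sends only the name=value pair back.
cookie_header = "Cookie: " + "; ".join(
    f"{name}={item.value}" for name, item in jar.items()
)
print(cookie_header)
```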

Strict-Transport-Security

The Strict-Transport-Security (HSTS) header is used by the server to tell the browser that the site should only be opened via HTTPS, not HTTP. This is one of the security headers that should be implemented on any site served over HTTPS.
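The max-age value in this header is a number of seconds. As a small sketch, the hypothetical parser below reads the value from the sample response shown earlier and converts it to days.

```python
def parse_hsts(value):
    """Parse a Strict-Transport-Security value into a dict of directives."""
    directives = {}
    for part in value.split(";"):
        name, sep, arg = part.strip().partition("=")
        if name:
            # Directives without "=" (such as includeSubDomains) are flags.
            directives[name.lower()] = arg.strip('"') if sep else True
    return directives

policy = parse_hsts("max-age=15552000")
days = int(policy["max-age"]) // 86400  # 86400 seconds in a day
print(policy, days)
```

During that period the browser rewrites any http:// request for the site to https:// before sending it.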

Content-Length

Content-Length is used in both requests and responses and indicates the size of the request or response body in bytes.
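The important detail is that the size is counted in bytes, not characters. The sketch below computes the value for a form body like the login example and for a string with a non-ASCII character. The helper name and the choice of UTF-8 are assumptions of this sketch.

```python
def content_length(body_text, encoding="utf-8"):
    """Return the Content-Length value for a text body sent in `encoding`."""
    return len(body_text.encode(encoding))

form_body = "username=admin&password=admin123"
print(len(form_body), content_length(form_body))  # same for plain ASCII

accented = "café"
print(len(accented), content_length(accented))    # bytes exceed characters
```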

There are other headers as well; we will look at them as we go further in the series. Each header has a meaning and serves a purpose such as information or security. That is all for now; we will look at the other details in another part of the series.
