CS Web Applications
Whether it’s using the Internet or controlling your lawnmower from a distance, web applications are essential to nearly everything we do. We will explore the fundamentals of web application security in this introductory course.
The HTTP protocol
HTTP is the carrier protocol that allows our browsers and applications to receive content such as HTML (“Hyper Text Markup Language”), CSS (“Cascading Style Sheets”), images, and videos.
URLs, Query Parameters and Scheme
We utilize a URL (“Uniform Resource Locator”) to access web applications. For instance, https://www.google.com/search?q=w3schools+cyber+security&ie=UTF-8
The URL for google.com contains a domain, a script that is being accessed, and query parameters.
We are accessing a script named /search. The / denotes that it is located in the server’s top directory, from which files are served. The ? marks the start of the script’s input parameters, while & separates the individual input parameters. The input parameters for our URL are:
- q with a value of w3schools cyber security
- ie with a value of UTF-8
It is up to the webserver application to interpret these inputs.
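The breakdown above can be reproduced with a short sketch. It assumes Python and its standard `urllib.parse` module, which is only one of many ways to split a URL; the variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.google.com/search?q=w3schools+cyber+security&ie=UTF-8"

# Split the URL into its components (domain, path, query string, ...).
parts = urlsplit(url)

# Turn the query string into a dictionary; "+" is decoded to a space.
params = parse_qs(parts.query)

print(parts.netloc)  # www.google.com
print(parts.path)    # /search
print(params)        # {'q': ['w3schools cyber security'], 'ie': ['UTF-8']}
```

Each parameter maps to a list because a query string may repeat the same name several times.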
Occasionally you will see just / or /?, which means that a script has been set up to respond to this address. This script usually functions as an index file, handling all requests unless a specific script is named.
The scheme defines which protocol to use. In this case it is the first part of the URL, https. When the URL does not specify a scheme, the application is free to decide which one to use. Schemes can cover a wide range of protocols, including:
- HTTP
- HTTPS
- FTP
- SSH
- SMB
HTTP Headers
The HTTP protocol uses numerous headers. Some are specific to the application, while others are well-defined and supported by the technology.
Example request to http://google.com
GET /search?q=w3schools+cyber+security&ie=UTF-8 HTTP/1.1
Host: google.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36
Accept: image/avif,image/webp,image/apng,image/*,*/*;q=0.8
Referer: https://w3schools.com/
Accept-Encoding: gzip, deflate
Cookie: cookie1=value1;cookie2=value2
The request headers specify what the client wants to accomplish on the target webserver. They also carry information on the type of client making the request, whether it accepts compression, and any cookies that the server has instructed the client to present. Here is an explanation of the HTTP request headers:
Header | Explanation |
---|---|
GET /search… HTTP/1.1 | GET is the verb we are using to access the application. Explained in detail in the section HTTP Verbs. We also see the path and query parameters and HTTP version |
Host: google.com | This header indicates the target service we want to use. A server can have multiple services as explained in the section on VHOSTS. |
User-Agent | A client application, that is the browser in most cases, can identify itself with the version, engine and operating system |
Accept | Defines which content the client can accept |
Referer: https://w3schools.com/ | If the client followed a link from a different website, the Referer header states where the client came from |
Accept-Encoding: gzip, deflate | Can the content be compressed or encoded? This defines what we can accept |
Cookie | Cookies are values sent by the server in previous requests which the client sends back in every subsequent request. Explained in detail in the section State |
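To make the structure of a request more concrete, here is a minimal Python sketch that splits a raw request into its request line and headers. The `parse_request` helper is hypothetical and deliberately simplified (no body handling, no repeated headers); real servers use far more careful parsers.

```python
# A shortened version of the example request, with the CRLF line
# endings that HTTP/1.1 uses on the wire.
raw_request = (
    "GET /search?q=w3schools+cyber+security&ie=UTF-8 HTTP/1.1\r\n"
    "Host: google.com\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Cookie: cookie1=value1;cookie2=value2\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw request head into verb, target, version and headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    # The first line holds the verb, the path with query, and the version.
    verb, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return verb, target, version, headers


verb, target, version, headers = parse_request(raw_request)
print(verb, target, version)
print(headers["host"])
print(headers["cookie"])
```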
The server responds to this request with headers and content. Example headers are shown below:
HTTP/1.1 200 OK
Content-Type: text/html
Set-Cookie:
What appears in our browser is determined by the response header and content. The explanation of the HTTP response headers is as follows:
Header | Explanation |
---|---|
HTTP/1.1 200 OK | The HTTP Response code. Explained in detail in the HTTP Response Codes section |
Content-Type: text/html | Specifies the type of content being returned, e.g. HTML, JSON or XML |
Set-Cookie: | Any special values the client should remember and return in the next request |
HTTP Verbs
When accessing a web application, the client uses an HTTP verb to indicate how it is sending data to the application. An application can accept many different verbs.
Verb | Used for |
---|---|
GET | Typically used to retrieve values via Query Parameters |
POST | Used to send data to a script via values in the body of the Request sent to the webserver. Typically it involves creating, uploading or sending large quantities of data |
PUT | Often used to upload or write data to the webserver |
DELETE | Indicate a resource which should be deleted |
PATCH | Can be used to update a resource with a new value |
The web application uses these verbs as needed. RESTful web services in particular tend to use the full range of HTTP verbs to define what should be done on the backend.
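As a rough illustration of how a client selects a verb, the sketch below builds (but never sends) a few requests with Python's standard `urllib.request.Request`. The `http://example.com/users` endpoint and the JSON body are assumptions made up for this example.

```python
import json
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
base = "http://example.com/users"

# Without a body or an explicit method, the request defaults to GET.
get_req = Request(base + "?name=w3schools")

# POST carries its data in the request body rather than in the URL.
post_req = Request(
    base,
    data=json.dumps({"name": "w3schools"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# DELETE names the resource that should be removed.
delete_req = Request(base + "/w3schools", method="DELETE")

# The requests are only constructed here; nothing goes over the network.
for req in (get_req, post_req, delete_req):
    print(req.get_method(), req.full_url)
```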
HTTP Response Codes
Depending on what happened on the server side, the webserver application can react with various codes. The following are typical response codes that the webserver will provide to the client and that security experts need to be aware of:
Code | Explanation |
---|---|
200 | Application returned normally |
301 | The server asks the client to permanently remember a redirect to the new location it should use |
302 | Redirect temporarily. Client doesn’t need to save this reply |
400 | The client made an invalid request |
403 | The client is not allowed to access this resource; it lacks the required authorization |
404 | The client tried to access a resource which does not exist |
500 | The server encountered an error while trying to fulfill the request |
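The codes in the table fall into ranges by their first digit. A small Python sketch using the standard `http.HTTPStatus` enum shows the official reason phrase and the range of each code; the `describe` helper is a hypothetical name chosen for this example.

```python
from http import HTTPStatus


def describe(code):
    """Return the reason phrase and the general class of a status code."""
    status = HTTPStatus(code)
    if code < 200:
        category = "informational"
    elif code < 300:
        category = "success"
    elif code < 400:
        category = "redirection"
    elif code < 500:
        category = "client error"
    else:
        category = "server error"
    return f"{code} {status.phrase}: {category}"


for code in (200, 301, 302, 400, 403, 404, 500):
    print(describe(code))
```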
REST
REST services, also known as RESTful services, draw on the full range of HTTP verbs and HTTP response codes to make the web application easier to use. To control what happens on the web application, RESTful services often use parts of the URL path in place of query parameters. REST is commonly used for APIs (“Application Programming Interfaces”).
REST URLs will call functions according to the various components in the URL.
http://example.com/users/search/w3schools is an example of a REST URL.
Instead of query parameters, this URL carries its functionality in the path. The URL can be interpreted as:
Parameter | Comment |
---|---|
users | Accessing the users part of the functionality |
search | Accessing the search feature |
w3schools | The user to search for |
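One way to picture this is a small dispatcher that maps path components to functions. The Python sketch below is an assumption about how such routing might look; `search_users`, `ROUTES`, and `dispatch` are hypothetical names, and real frameworks offer much richer routing.

```python
from urllib.parse import urlsplit


def search_users(name):
    # Hypothetical handler; a real service would query a data store.
    return f"searching users for {name!r}"


# Map the leading path components to the function that handles them.
ROUTES = {("users", "search"): search_users}


def dispatch(url):
    """Call the handler selected by the path; the last part is its argument."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if not segments:
        return "404: no route"
    *route, argument = segments
    handler = ROUTES.get(tuple(route))
    if handler is None:
        return "404: no route"
    return handler(argument)


print(dispatch("http://example.com/users/search/w3schools"))
```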
Sessions & State
With HTTP, a server has no built-in way to recognize a returning visitor. For a webserver to identify the user, a secret value must be issued to the client and sent back by the client in each request. Cookies in headers are usually used for this, although alternatives such as GET and POST parameters or other headers are also common. Passing state in GET parameters is not advised, since they are frequently logged on the server or by intermediaries like proxies.
Here are a few typical cookie names that let the webserver application manage sessions and state:
- PHPSESSID
- JSESSIONID
- ASP.NET_SessionId
These values refer to a particular state on the server, often called a session. This state represents things like:
- What user you have logged in as
- Privileges and authorizations
It is crucial that the session value given to the client is difficult for other parties to guess or otherwise obtain. An attacker who managed to do so could impersonate other users on the web application.
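A hard-to-guess session value can be produced with a cryptographically secure random generator. The following Python sketch uses the standard `secrets` module; the in-memory `sessions` dictionary and the helper names are simplifications assumed for illustration, not how any particular framework stores sessions.

```python
import secrets

# Server-side session store; a real application would use a database
# or cache and would also expire old sessions.
sessions = {}


def create_session(username):
    """Create a session and return the value to place in the cookie."""
    # 32 random bytes from a secure source, encoded as URL-safe text.
    session_id = secrets.token_urlsafe(32)
    sessions[session_id] = {"user": username}
    return session_id


def lookup(session_id):
    """Return the stored state for this session value, or None if unknown."""
    return sessions.get(session_id)


sid = create_session("alice")
print(lookup(sid))
print(lookup("guessed-value"))
```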
State can also be kept on the client. For this to work, the server must send the complete state to the client, and the client must send it back with its requests. These implementations use cryptography (typically a signature or message authentication code, sometimes combined with encryption) to verify that the state presented by the client is authentic. Examples of such implementations include:
- JWT (“JSON Web Tokens”)
- ASP.Net ViewState
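The idea behind client-side state can be sketched with a message authentication code. The Python example below is not JWT or ViewState; it is a simplified stand-in that assumes a server-only `SECRET_KEY` and uses hypothetical helpers `sign_state` and `verify_state` to show why tampered state gets rejected.

```python
import base64
import hashlib
import hmac
import json

# Assumption: a key known only to the server. Never hard-code real keys.
SECRET_KEY = b"demo-key-for-illustration-only"


def sign_state(state):
    """Serialize the state and append an HMAC tag so changes are detectable."""
    payload = base64.urlsafe_b64encode(
        json.dumps(state, sort_keys=True).encode("utf-8")
    ).decode("ascii")
    tag = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return payload + "." + tag


def verify_state(token):
    """Return the state if the tag matches, otherwise None."""
    payload, _, tag = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = sign_state({"user": "alice", "admin": False})
print(verify_state(token))
```

Note that signing only proves the state was not modified; the client can still read it, so secrets should not be placed in such a token unless it is also encrypted.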
You are using cookies right now to take this class! You can inspect them with your browser’s developer tools. Press F12 to open the developer tools window, where you should be able to find the place that lists your cookies.
In Google Chrome, the cookies are found in the Application tab.
Note: Can you figure out why the cookies in the screenshot are hidden so you can’t see them?
Virtual Hosts
Virtual hosts, also known as vhosts, allow a single web server to serve several applications. To enable access to the different virtual hosts, the web server normally reads the Host header of the client request and forwards the request to the appropriate application based on its value.
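The lookup by Host header can be sketched in a few lines of Python. The host names, the two toy applications, and the `route_by_host` helper are all assumptions invented for this example; real web servers configure this in their own configuration files.

```python
def shop_app(path):
    # Hypothetical application served under one host name.
    return f"shop handles {path}"


def blog_app(path):
    # Hypothetical application served under another host name.
    return f"blog handles {path}"


# One server, several applications, selected by the Host header.
VHOSTS = {
    "shop.example.com": shop_app,
    "blog.example.com": blog_app,
}


def route_by_host(headers, path):
    """Pick the application that matches the request's Host header."""
    # Drop an optional ":port" suffix and ignore case.
    # (IPv6 literals are not handled in this simplified sketch.)
    host = headers.get("Host", "").split(":")[0].lower()
    app = VHOSTS.get(host)
    if app is None:
        return "no virtual host configured"
    return app(path)


print(route_by_host({"Host": "shop.example.com"}, "/cart"))
print(route_by_host({"Host": "BLOG.example.com:8080"}, "/posts"))
```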
URL Encoding
For an application to safely transfer content between server and client without interfering with the protocol, certain characters need to be encoded. URL encoding is used to preserve the integrity of the communication.
Unsafe characters are replaced by a % and two hexadecimal digits when using URL encoding. As an illustration:
- The percent sign (%) is replaced with %25
- A space is replaced with %20
- A double quote (") is replaced with %22
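These replacements can be checked with Python's standard `urllib.parse` module. This is a small sketch; the sample text is made up for the example.

```python
from urllib.parse import quote, unquote

# Encode the three characters from the list above.
# safe="" tells quote() not to leave any reserved character unencoded.
for ch in ("%", " ", '"'):
    print(repr(ch), "->", quote(ch, safe=""))

# Encode and then decode a whole string.
encoded = quote('50% off "today"', safe="")
print(encoded)           # 50%25%20off%20%22today%22
print(unquote(encoded))  # 50% off "today"
```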
CyberChef is a great tool for text analysis and for running operations like URL decoding. It runs in the browser at https://gchq.github.io/CyberChef/.
Note: Play around with CyberChef and see if you can reveal what the following URL-encoded message holds: %48 %65 %6c %6c %6f %20 %64 %65 %61 %72 %20 %77 %33 %73 %63 %68 %6f %6f %6c %73 %20 %73 %74 %75 %64 %65 %6e %74 %2e %20 %48 %6f %70 %65 %20 %79 %6f %75 %20 %61 %72 %65 %20 %6c %65 %61 %72 %6e %69 %6e %67 %20 %73 %6f %6d %65 %74 %68 %69 %6e %67 %20 %74 %6f %64 %61 %79 %21
JavaScript
Browsers employ the scripting language JavaScript to support dynamic content. This makes it possible for programmers to create client-side solutions, resulting in more dynamic and “alive” online content.
Numerous attacks against web applications and client apps, like browsers, also use JavaScript.
Encryption with TLS
Since the HTTP protocol does not provide encryption for data in transit, encryption support must be added by wrapping HTTP in another layer. This is indicated by the S in HTTPS.
SSL (“Secure Sockets Layer”) was the encryption used previously, but it has since been deprecated. Instead, encryption is usually enforced using TLS (“Transport Layer Security”).
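On the client side, this wrapper is usually configured through a TLS context. The Python sketch below uses the standard `ssl` module to build one without opening any connection; pinning the minimum version to TLS 1.2 is an assumption chosen for illustration, not a recommendation from this course.

```python
import ssl

# Build a client-side context with the library's secure defaults.
context = ssl.create_default_context()

# Refuse the deprecated SSL and early TLS protocol versions.
context.minimum_version = ssl.TLSVersion.TLSv1_2

# By default the server certificate and host name are verified.
print(context.check_hostname)
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.minimum_version.name)
```

A context like this can then be passed to networking functions such as `urllib.request.urlopen(url, context=context)` when an HTTPS request is actually made.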