Introduction to Web Technologies – Day 2

E&R Department, Mysore

Session Plan Day 2
• Web Applications
• Web Servers
• Introduction to the Common Gateway Interface (CGI)
• Load Balancing
• Application Server
• Web Security
• Encryption
• Digital Certificates
• Performance of Web Applications
• Appendix

Copyright © 2004, Infosys Technologies Ltd

ER/CORP/CRS/OS41/003 Version no: 2.0

Recap of Day 1
• Process, Thread, Daemon process, Server
• HTTP Protocol
• Web Servers
• Static Content
• Dynamic Content


Web Applications
• Web Servers evolved as a mechanism to share documentation and information among researchers (Static content)
• To enable dynamically generated content, special extensions are needed
• CGI (Common Gateway Interface) Gateways
• Other technologies for Dynamic Content Generation
– Using Server’s API for Dynamic content code (NSAPI, ISAPI)
– Java Servlets
– ASP (Active Server Pages)
– JSP (Java Server Pages)


Common Gateway Interface (CGI)
• A general framework for creating server-side web applications.
• The first mechanism for creating dynamic web sites.
• Instead of returning a static web document, the web server returns the output of a program.
• CGI programs can be written in languages like C/C++, Perl, Java, etc.


CGI Overview
For example
– Browser sends the parameter: name=Sachin
– Web server passes the request to a Perl program
– Perl program returns HTML that says, Hello, Sachin!

[Diagram: Web Browser sends Name=Sachin to the Web Server, which passes it to a C/Perl program; the program returns "Hello, Sachin!" to the Web Server, which returns it to the Web Browser]
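The exchange above can be sketched as a small CGI-style program. The slide names Perl or C; Python is used here only for illustration. The sketch assumes the web server passes GET parameters in the QUERY_STRING environment variable, as CGI does, and the function name render_greeting is hypothetical.

```python
#!/usr/bin/env python3
"""Minimal CGI-style program: reads name=... and prints an HTML greeting."""
import html
import os
from urllib.parse import parse_qs


def render_greeting(query_string: str) -> str:
    """Build the full CGI response (header, blank line, HTML body)."""
    params = parse_qs(query_string)
    # parse_qs maps each key to a list of values; take the first one.
    name = params.get("name", ["World"])[0]
    # Escape user input so it cannot inject markup into the page.
    body = f"<html><body><h1>Hello, {html.escape(name)}!</h1></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # A CGI-capable web server passes GET parameters in QUERY_STRING.
    print(render_greeting(os.environ.get("QUERY_STRING", "")))
```

The web server forwards whatever the program writes to stdout back to the browser, so the program only has to print a header, a blank line, and the HTML.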


Working of a CGI Application
[Diagram: Web Browser PC (Local Network), DNS Server, Web Server abcbooks.co.in at 202.68.33.49 (Internet), and getinfo.exe]

1. User submits form; URL http://abcbooks.co.in/cgi-bin/getinfo.exe?title=Web+Servers
2. Translate DNS name to IP address
3. Cannot resolve locally? Resolve from DNS Server other DNS…
4. Returns IP Address 202.68.33.49
5. Connect to the server thru the Internet n/w
6. Send HTTP Request
7. Invoke getinfo.exe under cgi-bin, passes params thru env variable/stdin
8. Query DB for Data
9. Compose the HTML on stdout
10. Send HTTP Response which contains dynamically generated HTML
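As a rough illustration of what the browser derives from the submitted URL, the sketch below splits the slide's URL into the host name to resolve, the HTTP request to send, and the decoded parameters the CGI program receives. The function name build_get_request and the exact header set are assumptions made for this example; a real browser sends many more headers.

```python
from urllib.parse import parse_qs, urlsplit


def build_get_request(url: str) -> tuple[str, str, dict]:
    """Return (host, raw HTTP request text, decoded query parameters)."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    request = (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {parts.hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return parts.hostname, request, parse_qs(parts.query)


host, request, params = build_get_request(
    "http://abcbooks.co.in/cgi-bin/getinfo.exe?title=Web+Servers"
)
print(host)     # the name the browser asks DNS to resolve
print(request)  # the text the browser sends to the web server
print(params)   # the parameters getinfo.exe sees after decoding
```

Note that the "+" in the query string decodes to a space, so the program receives the title "Web Servers".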


Developer’s Role in this Scenario (Dynamic Content) - 1
• Installation of Web Server
• Configuration of Web Server
– TCP/IP Port for listening
– Web Server Root
– Folder to serve static content from
– Thread-Pool size and other performance parameters
• CGI Gateway configuration
– CGI-BIN Folder: the folder which contains CGI applications or scripts
– Additional modules for connecting to other servers or Application Servers
  • Example: mod_jk for Apache web server configures connection to Tomcat JSP/Servlet Engine


Developer’s Role in this Scenario (Dynamic Content) - 2
• Develop Static content
– Static HTML is required even in some applications for UI and presentation purposes
• Develop Forms in HTML
– Develop forms in HTML to submit user input to a CGI application
• Develop CGI Applications
– Simple CGI applications: Write code to read query parameters and to generate HTML dynamically on stdout (and DB queries if any)
  • Languages: C, Perl, VBScript, JavaScript, Shell Script
– If using advanced server-side technologies like JSP/ASP, develop JSPs or ASPs to generate content dynamically
• Domain Name/Server Name Registration (if required)
• Debug and Test


Performance aspects in Web Applications

Load
• Every request serviced by the server adds some load in terms of CPU time on the server side
• As the number of concurrent users increases, load also increases
• Can a server serve any number of Requests?
– No! Several factors determine the maximum load on a server
  • Speed of CPU(s) in the server
  • Number of CPU(s) in the server
  • Memory capacity of the server
  • Hard disk space of the server (Marginal impact)
  • Amount of time CPU of the server spends (Turn-around time) for each request
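A back-of-the-envelope estimate shows how two of these factors interact. The sketch below assumes a purely CPU-bound server whose requests spread evenly across all CPUs, which real servers only approximate; the function name and the figures are hypothetical.

```python
def max_requests_per_second(cpu_count: int, cpu_seconds_per_request: float) -> float:
    """Rough upper bound on throughput for a CPU-bound server."""
    if cpu_seconds_per_request <= 0:
        raise ValueError("cpu_seconds_per_request must be positive")
    return cpu_count / cpu_seconds_per_request


# Hypothetical figures: 2 CPUs, 50 ms of CPU time per dynamic request.
print(max_requests_per_second(2, 0.05))
```

With these figures the bound is about 40 requests per second; halving the CPU time per request or doubling the CPU count doubles it.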


Scalability
• What is Scalability?
– “The ability of an application (or server) to perform without degradation in Quality of Service as the load or demand increases”
– Every server/application has a limit based on several factors
• How do I scale up my server application?
– Tune up and optimize code to reduce turnaround time for each request
  • Tune code
  • Tune SQL Queries
  • Use Database Connection Pools
  • Tune the server (use optimum thread pool size)
  • Caching
– Increase memory capacity
– Increase number of CPUs
– Increase the speed of CPUs
– Increase hard-disk capacity (Marginal impact)
• Still not Scalable enough?
– Go for multiple servers
– Load balance the servers
• Dynamic content and Scalability
– Each request requires more processing compared to static content
– Turnaround time for each request is higher


Performance of Web Applications
• Response Time
• Scalability
• Peak Load

Performance Enhancement:
• Web Farms
• Clustering App Servers and Database Servers
• Tuning the code
• Changing the architecture
• Optimizing database queries
• Optimizing the database indexes


Load Balancing

Load Balancing Schemes
• “Distributing processing activity evenly across computers in a cluster so that no single device is overwhelmed”
• Types of Load Balancing Schemes for Web Servers
– DNS Load Balancing (Round Robin Type)
– Hardware and Software Load balancing
– Distributed Content (Among Hosts)
– Distributed Content (Among Providers)
– Reverse Proxying (Not a popular method)


DNS Load Balancing
• Multiple copies of the website are created on separate physical servers
• Each server has its own distinct IP Address
• DNS server returns one of these IP addresses in a round-robin fashion
• Load gets distributed
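The round-robin behaviour can be simulated in a few lines. This is a toy model of the rotation only, not a real DNS server; the class name and the addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical.

```python
from itertools import cycle


class RoundRobinDNS:
    """Toy name server that rotates through the addresses for one name."""

    def __init__(self, addresses):
        self._rotation = cycle(addresses)

    def resolve(self) -> str:
        """Return the next address in the rotation."""
        return next(self._rotation)


dns = RoundRobinDNS(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
answers = [dns.resolve() for _ in range(6)]
print(answers)
```

Each server receives every third lookup regardless of how busy or healthy it is, which is the root of the scheme's main weaknesses.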


DNS Load Balancing – Advantages and Disadvantages
• Advantages
– Easy to implement
– Cost-effective (no additional hardware/software required)
• Disadvantages
– DNS server doesn’t have a way to detect which server is overloaded
– DNS server cannot detect if one of the servers is down
– Difficult to predict which IP address each client will resolve the site name to
– Browser caching prevents effective load balancing and can result in additional load on the network
• Application
– For fewer than 5 servers


Hardware and Software Load Balancing
• A machine is set up to intercept all requests to one IP address
• This machine runs specialized load balancing software
• Distribution occurs at the IP routing level, which transparently maps a single source/destination address


Hardware and Software Load Balancing - Advantages
• Advantages
– Specialized load balancing software which is tunable
– Identifies if a web server in the pool has gone down and stops forwarding requests to it
– Can assess load on a particular server in the pool before deciding to route the request
• Disadvantages
– Not cost-effective (specialized software and additional hardware required)
• Application
– Heavy-hit sites, ISPs (Internet Service Providers)
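The two capabilities listed under advantages, skipping servers that are down and checking load before routing, can be sketched as a selection function. The Backend fields, the least-connections policy, and the addresses are assumptions chosen for this example; commercial balancers offer many other policies.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    address: str
    active_requests: int = 0
    healthy: bool = True


def pick_backend(pool: list[Backend]) -> Backend:
    """Choose the healthy backend with the fewest requests in flight."""
    candidates = [b for b in pool if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy backend available")
    return min(candidates, key=lambda b: b.active_requests)


pool = [
    Backend("10.0.0.1", active_requests=7),
    Backend("10.0.0.2", active_requests=2, healthy=False),  # marked down
    Backend("10.0.0.3", active_requests=4),
]
chosen = pick_backend(pool)
print(chosen.address)
```

Here 10.0.0.2 has the lowest load but is skipped because it is marked down, so the request goes to 10.0.0.3.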


Distributed Content (Among Hosts)
• Put content on different servers based on type or size
• Hyperlinks refer to different servers for different types of content
• Example:
– http://www.infosys.com contains HTML documents
– http://image1.infosys.com contains GIF, JPG and other types of content
• Advantages
– Option to use a bigger server for content which is big in size (images, movie clips, etc.)
– Option to use older/less-powerful servers for content which is smaller in size (HTML files, etc.)
– Cost-effective (no additional cost incurred)
• Application
– Heavy-hit sites, Portals


Distributed Content (Among Providers)
• Servers are located at different geographical locations
• Based on the user’s geographic location, the nearest server is chosen to serve the request
• On most download/mirror sites, the user specifies his/her geographic location
– Example: Indian ISP’s server for downloads within India, German ISP’s server for downloads within Germany
• It is possible to find the geographic location of the user automatically
– Example: www.google.com
• Application
– Download sites, Heavy-hit Portals


Composite approach to load balancing
• One approach to load balancing may not serve the needs of an application
• A combination of the schemes can be used
– Simple and Cost Effective: DNS load balancing
– Heavy Hit Sites, Content providers: Hardware and Software Load balancing along with Distributed Content among hosts
– ISPs: DNS balancing and Distributed content among Providers
– Download Sites: Distributed content among Providers


Turn-Around Time for a Request using CGI application
[Diagram: timeline of a single request]
1. Application loads into memory
2. Application connects to database
3. Application logic (useful processing time)
   – Parse parameters
   – Database query
   – Compose HTML
4. Disconnect from database
5. Application unloads from memory

Turnaround time for a single request covers steps 1 to 5; useful processing time is step 3 only.

• Request for Dynamic Content is costly in terms of CPU time, compared to a request for static content
• If Turnaround time = Useful Processing Time only
– performance and scalability will improve drastically


Thread and Database Connection Pooling
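Connection pooling removes the connect and disconnect cost from each request by opening connections once and lending them out. The sketch below is a minimal illustration of that idea, assuming sqlite3 as a stand-in for a real database server; the ConnectionPool class is hypothetical and omits the error handling and connection validation a production pool needs.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Keeps a fixed set of open connections and lends them out."""

    def __init__(self, database: str, size: int):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets worker threads share the pool.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._idle.get()   # blocks if every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it


pool = ConnectionPool(":memory:", size=2)
with pool.connection() as conn:
    print(conn.execute("SELECT 1 + 1").fetchone()[0])
```

A thread pool applies the same idea to worker threads: they are created once at startup and reused, so no request pays the creation cost.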


Application Servers

Application Servers
• An application server is a software server that lets thin clients use applications and databases that are managed by the server. It handles all the application operations and connections for the clients.
• An application server is a server program in a computer within a distributed network that provides the business logic for an application program.
• It is frequently viewed as part of a three-tier application, consisting of a graphical user interface (GUI) server, an application (business logic) server, and a database and transaction server.


App Server versus Web Server
• A Web Server is a specialized server that speaks the HTTP protocol
– Best suited to serving static content (file-based content which doesn’t change)
– Can generate dynamic content as well
– Dynamic content generation within the Web Server may be a costly operation in terms of CPU time
• Application Servers
– Sometimes also referred to as middleware
– Handle business logic
– Data access and data manipulation
• Other Services provided by Application Servers
– State and Session Management
– Load Balancing
– Support for Distributed Application Services
– Security
– Transaction Management
– Pooling (Connection and Thread)

Application Servers - Examples
• BEA’s Weblogic (J2EE)
• IBM’s Websphere (J2EE)
• Microsoft’s IIS (ISAPI)
• Microsoft’s MTS (Microsoft Transaction Server)
• iPlanet (Formerly Netscape) (NSAPI, J2EE)


Multi-Tier Applications


Advantages of Multi-Tier Applications
• Modular Design
• Scalability
• High Availability
• Incremental Development of applications


The Big Picture – (Real World Scenario)

[Diagram: Clients on their own networks reach the site through a firewall, the Internet, and a second firewall. Behind it, several Web Server / App Server pairs connect to Database1, Database2, a Mainframe, and other applications or application servers.]


Security in Web Applications

Web Security Issues
Authentication:
– The process of identifying an individual, usually based on a username and password.
– Ability of each party in a transaction to ascertain the identity of the other party.

Authorization:
– The process of giving individuals access to system objects based on their identity.
– Authorization involves granting or denying access to a network resource.

Auditing:
– The process an operating system uses to detect and record security-related events, such as an attempt to create, access, or delete objects such as files and directories.
– Identifies all controls that govern the information system, and assesses their effectiveness.
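The three terms can be told apart in a short sketch: authentication checks who the user is, authorization checks what that user may do, and auditing records both decisions. The user store, the permission names, and the PBKDF2 parameters below are hypothetical choices for illustration, not a recommended production design.

```python
import hashlib
import hmac
import os

audit_log = []  # auditing: a record of security-related events


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical user store: name -> (salt, password hash, permitted actions).
_salt = os.urandom(16)
USERS = {"sachin": (_salt, hash_password("s3cret", _salt), {"read"})}


def authenticate(user: str, password: str) -> bool:
    """Authentication: is this really the user they claim to be?"""
    record = USERS.get(user)
    ok = record is not None and hmac.compare_digest(
        record[1], hash_password(password, record[0])
    )
    audit_log.append(("login", user, ok))
    return ok


def authorize(user: str, action: str) -> bool:
    """Authorization: may this identified user perform the action?"""
    ok = user in USERS and action in USERS[user][2]
    audit_log.append((action, user, ok))
    return ok
```

A request handler would call authenticate once per login and authorize before every protected action; failed attempts stay visible in audit_log.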


Integrity and Confidentiality
Integrity:
– A service provided by cryptographic technology that ensures data has not been modified.
– In a network environment, data integrity allows the receiver of a message to verify that data has not been modified in transit.
– Ability to ascertain that the transmitted message has not been copied or altered.

Confidentiality:
– A service provided by cryptographic technology to assure that data can be read only by authorized users or programs.
– In a network, data confidentiality ensures that intruders cannot read data.
– It is a status indicating that the information is sensitive.
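One common way to check integrity is a keyed hash (HMAC) sent along with the message. The sketch below assumes sender and receiver already share a secret key; the key value and the messages are made up for the example.

```python
import hashlib
import hmac

shared_key = b"demo-key-not-for-production"  # hypothetical shared secret


def sign(message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes) -> bool:
    """True only if the message is unchanged since it was signed."""
    return hmac.compare_digest(sign(message), tag)


tag = sign(b"transfer 100 to account 42")
print(verify(b"transfer 100 to account 42", tag))  # unmodified message
print(verify(b"transfer 900 to account 42", tag))  # altered in transit
```

An HMAC detects alteration but does not hide the message; confidentiality needs encryption in addition.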


Threat Detection
Threat detection systems help reduce risk and mitigate threats to your network and critical data. The three types of intrusion detection systems are host-based, network-based, and anomaly-based.
– Host-based detection: Detects changes made to operating system files and other critical files such as data. This method uses checksums and hashes to determine that a change has occurred.
– Network-based detection: Examines network traffic and provides alerts when undesired traffic is present on the network.
– Anomaly-based detection: Looks for network traffic that is not expected.
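The checksum idea behind host-based detection can be sketched as follows: record a hash of each critical file, then compare later. This is an assumption-laden miniature (function names are hypothetical, and a temporary file stands in for a system file), not how any particular product works.

```python
import hashlib
import os
import tempfile


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline: dict[str, str]) -> list[str]:
    """List files whose current digest differs from the recorded one."""
    return [path for path, old in baseline.items() if file_digest(path) != old]


# Demo on a temporary file standing in for a critical system file.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("original contents")
baseline = {tmp.name: file_digest(tmp.name)}
with open(tmp.name, "a") as handle:
    handle.write(" tampered")
print(changed_files(baseline) == [tmp.name])
os.remove(tmp.name)
```

The baseline itself must be stored where an intruder cannot rewrite it, otherwise the comparison proves nothing.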


Network Security Challenges
[Diagram: threats at each point between the client and the corporate system]
• Client: computer viruses, line taps, data corruption, identity theft
• In transit: tapping, sniffing, message alteration, theft and fraud
• Server: hacking, computer viruses, theft and fraud, line taps, vandalism, denial-of-service attacks
• Corporate system: theft of data, copying of data, alteration of the data


Encryption and Decryption
Encryption:
– Encryption is a method by which information is scrambled to make it unreadable to everyone except the intended recipient.
– It prevents unauthorized users from reading or tampering with the data.
– Encryption uses mathematical algorithms to scramble data.
– The effectiveness of encryption depends on the choice of the algorithm.
– Cipher text: The scrambled data produced by encryption is known as cipher text.

Decryption: To read the encrypted file, the recipient must convert the encrypted data back into its original form. This process is known as decryption.


Symmetric Encryption (private key encryption):
– Symmetric encryption is a traditional scheme of encryption where the same secret key is used for encryption and decryption.
– The sender and the receiver share the same secret key.

Advantages:
– The keys are very short
– The encryption and decryption are fast

Disadvantages:
– Requires secure transmission of keys
– Requires a separate key for each group of people who exchange information


Symmetric Encryption
[Diagram: Person A's encryption software uses the secret key to turn the plain text "38975" into encrypted data ("s7%&`=S|") that moves over the network; Person B's decryption software uses the same secret key to recover the plain text "38975"]
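The defining property of symmetric encryption, one shared key for both directions, can be shown with a toy cipher. The construction below (XOR with a keystream derived from SHA-256) is an assumption chosen because it needs only the standard library; it is for illustration and must not be used to protect real data, where a vetted algorithm such as AES belongs.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


secret_key = b"shared by A and B"
cipher_text = xor_cipher(secret_key, b"38975")    # Person A encrypts
plain_text = xor_cipher(secret_key, cipher_text)  # Person B decrypts
print(cipher_text.hex(), plain_text)
```

Because the same function and key serve both roles, anyone who learns the key can read every message, which is why the key must travel over a secure channel.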


Asymmetric Encryption (public key encryption):
– Asymmetric encryption uses two keys.
– One is the public key, which can be distributed to all users; the other is the private key, which corresponds to the public key and is held only by its owner.
– The public and private keys are related in such a way that when the public key is used to encrypt messages, only the corresponding private key can be used to decrypt them, and vice versa.

Advantages:
– Increased security
– Ensures non-repudiation

Disadvantages:
– Encryption and decryption are much slower than with symmetric encryption


Asymmetric Encryption
[Diagram: Person A and Person B; a computer algorithm produces Person A's public key and private key]
– The public key is sent to Person B over the Internet.
– The public key can only be used to encrypt (it can’t decode ciphers).
– Person B then uses the public key to encrypt any messages that he wishes to send to Person A.
– The private key is held in a safe place. Only this key can decode ciphers that have been encrypted using the public key.
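The relationship between the two keys can be seen in textbook RSA with deliberately tiny numbers. RSA is one asymmetric algorithm among several, and the primes, exponent, and message below are illustrative assumptions; real keys are thousands of bits long and use padding, which this sketch omits.

```python
# Textbook RSA with tiny primes: for illustration only, never for real use.
p, q = 61, 53
n = p * q                # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                   # public exponent
d = pow(e, -1, phi)      # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)      # Person A publishes this
private_key = (d, n)     # Person A keeps this secret


def apply_key(key: tuple[int, int], value: int) -> int:
    """Raise value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65
cipher = apply_key(public_key, message)     # Person B encrypts
recovered = apply_key(private_key, cipher)  # only Person A can decrypt
print(cipher, recovered)
```

Applying the keys in the opposite order also round-trips, which is the "vice versa" property that digital signatures build on.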


Secure Sockets Layer (SSL) (1 of 6)
– A well-known application of encryption is Secure Sockets Layer (SSL).
– SSL provides authenticated and encrypted communication between clients and servers.
– It is a protocol developed by Netscape for transmitting information securely over an insecure network.
– It is widely accepted and used by web browsers and web servers for transmitting sensitive information.


How SSL works (2 of 6)
– A client asks for a connection to the server.
– The server listens for the client request.
– The server sends its digital certificate, along with its public key, to the client to authenticate itself.
– The client verifies the server’s certificate.
– If the server is authenticated, the client creates a random session key and encrypts it with the server's public key.
– The server decrypts the session key using its private key and establishes a secure session.
– Optionally, the client is authenticated to the server.
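In application code the whole handshake is normally delegated to a library. The sketch below uses Python's standard ssl module, which negotiates TLS, the successor to SSL; in current TLS versions the key agreement differs in detail from the key-transport steps listed above, but certificate verification plays the same role. The function names are hypothetical, and fetch_peer_certificate needs network access, so it is defined here but not called.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Context that verifies the server's certificate and host name."""
    return ssl.create_default_context()


def fetch_peer_certificate(host: str, port: int = 443) -> dict:
    """Run the handshake with `host` and return its verified certificate."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        # wrap_socket performs the handshake: certificate check, key
        # agreement, and cipher selection all happen here.
        with context.wrap_socket(raw, server_hostname=host) as secure:
            return secure.getpeercert()


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

If the certificate cannot be verified against a trusted certificate authority, wrap_socket raises an error and no application data is sent.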


SSL – How it works – Animation (3 of 6)
SSL Handshaking - Step a
1. Client request (http://icicibank.com) travels over the Internet to the server
2. Server sends the digital certificate, public key and the encryption algorithm preferences in response
– Digital certificate
– Server's public key
– List of symmetric encryption algorithms that the server supports


SSL – How it works – Animation (4 of 6)
SSL Handshaking - Step b
3. Client verifies the server's certificate with the certifying authority
4. Client chooses a particular encryption algorithm which both client and server support, from the list of symmetric encryption algorithms that the server supports

SSL Handshaking - Step c
5. Client creates a secret key using the encryption algorithm selected
6. Client encrypts the secret key using the server's public key, producing the encrypted secret key (shown as "111@#gjsdu nn&1677%^1 000jhhdh11")


SSL – How it works – Animation (5 of 6)

SSL Handshaking - Step d
7. The key thus generated and encrypted is sent to the server over the Internet; the server recovers the secret key


SSL – How it works – Animation (6 of 6)

Secure Data Communication
– Client encrypts the request using the secret key and sends it over the Internet to the server (http://icicibank.com)
– Server decrypts using the same secret key and sends the response to the client


Digital Certificates
Digital signature:
A digital code attached to an electronically transmitted message to uniquely identify its contents and sender.

Digital certificate:
An attachment to an electronic message that verifies the sender and gives the receiver the means to encode a reply.

Certificate authority:
The certificate authority acts as an intermediary trusted by both computers. It confirms that each computer is, in fact, who it claims to be, and then provides the public key of each computer to the other.


Testing Web Applications
Different Kinds of Tests
– Functionality
– Usability
– Reliability
– Performance (Response Time)
– Scalability


Summary
• Basics of networking and Internet
– TCP/IP and networking, IP address, DHCP, DNS
• World Wide Web
– Basics of the World Wide Web, Browser architecture
• HTTP protocol
– Request and Response, GET, POST, HEAD methods
• Web Servers
– Working of a web server, Dynamic content generation
• Load Balancing
– Scalability, Load Balancing schemes
• Web Security
– Authentication, Authorization and Auditing
• Application Servers
– The Big Picture, Real-world Scenario


Appendix – Apache Web Server, IIS

Installing and configuring Apache Web Server
• Download the latest version of the installer from http://www.apache.org (available for many platforms; select the appropriate one)
• If installing on Windows, run the installer
• Installer provides the option to install as an NT Service or Stand-alone
• Complete Installation
• Options to Start and Stop the Apache Server are provided in the Start Menu under the folder “Apache Web Server”
• The config file ‘httpd.conf’ is located under Apache’s installation folder
• Typical configuration parameters to look at
– ServerRoot: Apache install folder
– DocumentRoot: Content folder
– ErrorLog: The Error Log file
– cgi-bin

Installing IIS
• Windows 2000 and beyond
– Open Control Panel
– Select Add/Remove Programs
– Select Add/Remove Windows Components
• Select the “Internet Information Services (IIS)” check box if not yet selected
• If IIS is already installed, the option will already be selected
• Click on the “Next” button to start installation (Setup Program might ask for the Windows CD)


Configuring IIS
• Complete Installation
• Open Control Panel
• Select Administrative Tools
• Open the “Internet Information Services” icon
• The control panel applet for IIS shows up with the default web site configuration


Thank You!