Features

Fragments and templates

Any web application can be used to provide html fragments or page templates for other applications. Fragments are defined using ESI tags.

Cross technologies

Because ESIGate relies only on ESI tags placed inside application pages, an application developed with any technology, including Java, PHP or .NET, can work with ESIGate.

Reverse proxy

ESIGate works as a reverse proxy and can be used to retrieve and cache static content (images, CSS, JS...) as well as dynamic content.

Pages aggregator

ESIGate parses HTML pages and processes ESI instructions in order to merge pages and fragments from several applications.

ESI 1.0 specification support

ESIGate fully implements the ESI 1.0 specification and adds some useful custom extensions.

Xpath expressions and XSLT

ESIGate also provides tools to retrieve and transform pages using regular expressions and XPath expressions, and to apply on-the-fly XSLT transformations. XPath and XSLT can be used even with malformed HTML documents.

HTTP 1.1 Cache

To improve performance, ESIGate uses a cache that fully implements the HTTP 1.1 specification. The cache is also highly configurable, which helps you improve cache efficiency and the overall performance of your web site.

User context and Single Sign On

Applications may have to share information about connected users. ESIGate provides a CAS module supporting the proxy authentication mode and the JASIG CAS client. The extension mechanism lets you integrate other Single Sign On systems if needed.

Installation

ESIGate runs as a standard Java servlet filter and can be run in any Java servlet container such as Tomcat, Jetty, WebSphere, JBoss... If you are not familiar with Java servlet containers, you can use esigate-server, which includes a pre-configured web application with an embedded Jetty server. Since ESIGate 5.0, the minimum Java version is 1.7.

If you are familiar with the Java Servlet API, you may want to build your own web application using esigate-servlet as a dependency. With Maven:

<dependency>
	<groupId>org.esigate</groupId>
	<artifactId>esigate-servlet</artifactId>
	<version>RELEASE</version>
</dependency>
			

The servlet filter can be declared in the WEB-INF/web.xml file as follows:

	
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" "http://java.sun.com/dtd/web-app_2_3.dtd">
<web-app>
	<filter>
		<filter-name>EsiGate</filter-name>
		<filter-class>org.esigate.servlet.ProxyFilter</filter-class>
	</filter>
	<filter-mapping>
		<filter-name>EsiGate</filter-name>
		<url-pattern>/*</url-pattern>
	</filter-mapping>
</web-app>

				

You can use the esigate-war web application as an example.

esigate.properties

Then you have to configure esigate.properties, which defines provider applications, mappings, network and caching parameters... See the Configuration chapter for all details.

Here is an example configuration with 4 providers using different mapping types. Samples 4 and 5 both define the theme provider and are alternatives: use one or the other.


# Sample 1 : process all urls	        
provider1.remoteUrlBase=http://host1/
provider1.mappings=/*

# Sample 2 : virtual host configuration
# Process requests made for myhost 
provider2.remoteUrlBase=http://host2/
provider2.mappings=http://myhost/* 

# Sample 3 : Process all php files
provider3.remoteUrlBase=http://host3/
provider3.mappings=*.php

# Sample 4 : Process all files in css and images directories
theme.remoteUrlBase=http://host4/
theme.mappings=/css/*, /images/*

# Sample 5 : Process all files in css and images directories and strip mapping path
theme.remoteUrlBase=http://host4/css-and-images/
theme.mappings=/css/*, /images/*
theme.stripMappingPath=true


	

Each application can use ESI tags in its HTML pages to include fragments coming from the others.

Configuration

Configuration file

By default, ESIGate loads its configuration from the file /esigate.properties on the classpath.

Alternatively, you can use the method org.esigate.DriverFactory.configure(Properties).
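
As an illustration of the programmatic alternative, here is a minimal sketch that builds the same kind of settings in a java.util.Properties object. The class name ProgrammaticConfigSketch and the method buildConfiguration are hypothetical; only the directive names and the configure(Properties) method come from this documentation. The call to DriverFactory is shown as a comment because it requires ESIGate on the classpath, and it assumes that configure is a static method, as the notation above suggests.

```java
import java.util.Properties;

/** Builds ESIGate settings in code instead of reading /esigate.properties. */
final class ProgrammaticConfigSketch {

    /** Returns properties equivalent to a small esigate.properties file. */
    static Properties buildConfiguration() {
        Properties props = new Properties();
        // Directive names and the "provider1" prefix follow the documented samples.
        props.setProperty("provider1.remoteUrlBase", "http://host1/");
        props.setProperty("provider1.mappings", "/*");
        return props;
    }

    // With ESIGate on the classpath, the properties would then be handed over
    // (assumed to be a static call):
    //   org.esigate.DriverFactory.configure(ProgrammaticConfigSketch.buildConfiguration());
}
```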

Configuration directives

| Directive | Usage | Mandatory | Default value |
| remoteUrlBase | Base URL of the remote application. E.g.: remoteUrlBase=http://localhost:8080/ When load balancing (the remote application runs on several servers), use a comma-separated list. | Yes | - |
| mappings | Path mappings: the URL patterns for which the remote application should be called. E.g.: mappings=/cms/* Use a comma-separated list to define several mappings for the same provider. A mapping is split into 3 parts: host, including the scheme and port (http://www.example:8080); path, the part to the left of the wildcard character *; extension, the part to the right of the wildcard character *. | No | - |
| stripMappingPath | If enabled, the mapping path is stripped from the incoming URL before calling the remote application. E.g. with mappings=/cms/*: with stripMappingPath=true, a request to http://esigate/cms/myresource is proxied to http://remote/myresource; with stripMappingPath=false, a request to http://esigate/cms/myresource is proxied to http://remote/cms/myresource. | No | false 5.0 |
| uriEncoding | Charset used for encoding parameters in URIs. | No | ISO-8859-1 |
| parsableContentTypes | List of parsable content types. Use this syntax to set content types: parsableContentTypes=text/html,application/xhtml+xml,text/plain | No | text/html,application/xhtml+xml |
| maxConnectionsPerHost | Maximum number of HTTP connections simultaneously open to a single server. | No | 20 |
| connectTimeout | Timeout while trying to establish a connection with the server. | No | 1000 |
| socketTimeout | Timeout while waiting for data once the connection to the server has been opened. | No | 10000 |
| proxyHost | Proxy host name or IP. ESIGate can work through an HTTP proxy server. | No | |
| proxyPort | Proxy port. | No | |
| proxyUser | Username used by the driver for proxy authentication. Leave blank if the proxy requires no authentication. | No | |
| proxyPassword | Proxy password. | No | |
| preserveHost | Sends the request to the target server with the same Host header value as in the incoming request. This feature is very useful when the target server uses virtual hosts. | No | true |
| cookieManager | The cookieManager to use. Must be a class that implements org.esigate.cookie.CookieManager. | No | org.esigate.cookie.DefaultCookieManager |
| discardCookies | Comma-separated list of the names of the cookies to ignore. By default, cookies are forwarded. You can use the value * to discard all cookies. | No | |
| storeCookiesInSession | Comma-separated list of the names of the cookies to store in the session on the ESIGate side. By default, cookies are forwarded. You can use the value * to store all the cookies. Domain and path are rewritten in order to match the domain and path that are visible from the client. | No | |
| fixMode | If "relative", the generated URLs are relative to the root of the server (i.e. starting with "/"). If "absolute", the generated URLs are absolute (i.e. starting with "http://"). | No | relative |
| visibleUrlBase | The base URL to use while rewriting URLs for links or resources, if different from remoteUrlBase. | No | same value as remoteUrlBase |
| remoteUrlBaseStrategy | The strategy to use when load balancing (i.e. remoteUrlBase has been defined as a comma-separated list). The value can be "roundrobin", "iphash" or "stickysession". See clustering for details. | No | roundrobin |
| extensions | A comma-separated list of extensions (class names). Extensions can register to events and customize standard behavior, for instance to add logging or handle authentication. Extensions are called in the same order as in this list. | No | org.esigate.extension.FragmentLogging, org.esigate.extension.FetchLogging, org.esigate.authentication.RemoteUserAuthenticationHandler, org.esigate.extension.Esi, org.esigate.extension.ResourceFixup, org.esigate.extension.XPoweredBy, org.esigate.extension.surrogate.Surrogate, org.esigate.extension.ConfigReloadOnChange |
| useCache | Use the cache. | No | true |
| maxCacheEntries | Maximum number of entries in the cache (this parameter is only taken into account by the default implementation). | No | 1000 |
| maxObjectSize | Maximum size of a cache entry (bytes). If 0, no size limit. This directive can be used to avoid excessive memory usage. | No | 1000000 |
| cacheStorage | Implementation of org.esigate.cache.CacheStorage to use. It can be one of these values: org.esigate.cache.BasicCacheStorage, org.esigate.cache.EhcacheCacheStorage, org.esigate.cache.MemcachedCacheStorage. | No | org.esigate.cache.BasicCacheStorage |
| xCacheHeader | Activates the X-Cache header in HTTP responses (useful to debug the cache). | No | false |
| viaHeader | Activates the Via header in HTTP responses. | No | true |
| ttl | Time to live (seconds) of any cached page. If 0, cache expiration is calculated automatically from the HTTP response headers. If set to a non-zero value, the value applies to all GET requests, ignoring any Cache-Control header! | No | 0 |
| heuristicCachingEnabled | Heuristic caching enabled (see Caching in HTTP). | No | true |
| heuristicCoefficient | Heuristic coefficient. | No | 0.1 |
| heuristicDefaultLifetimeSecs | Default lifetime of a cache entry if there is absolutely no information about it in the HTTP headers. | No | 0 |
| staleWhileRevalidate | If non-zero, when a request arrives for which the cache holds a stale entry, the stale entry is sent immediately and the cache tries to update it from the server for next time. The value indicates the maximum staleness of the cache entry. This strategy can greatly reduce the load on the target server, as there is only 1 refresh request for a given cache entry at a time. | No | 0 |
| staleIfError | If non-zero, when the target server returns an error, the corresponding cache entry is used even if it is stale. The value indicates the maximum staleness of the cache entry. | No | 0 |
| minAsynchronousWorkers | Minimum number of threads processing background revalidations. | No | 0 |
| maxAsynchronousWorkers | Maximum number of threads processing background revalidations. Set this parameter to 0 in order to deactivate background revalidation. | No | 0 |
| asynchronousWorkerIdleLifetimeSecs | Maximum idle lifetime for a background revalidation thread before it gets reclaimed. | No | 60 |
| maxUpdateRetries | Number of retries on a failed cache update. | No | 1 |
| revalidationQueueSize | Maximum number of requests in the revalidation queue. | No | |
| ehcache.cacheName | Name of the EhCache. | No | esigate |
| ehcache.configurationFile | XML configuration file for EhCache (loaded via the classloader of the application). | No | /ehcache.xml |
| memcached.servers | Comma-separated list of MemCached servers and ports. Syntax: server1:port1,server2:port2 | No | |

Local and cross-context providers

You can use local resources, or resources from another web application deployed on the same server, by using the extension org.esigate.servlet.ServletExtension.

Advantages:

  • Use local resources and apply ESI transformations to them
  • Use resources from another web application without network calls

# In this example, we use 2 providers: local and crosscontext
# Local context. Local resources may contain esi tags
local.remoteUrlBase=http://localhost:8080/esigate/
local.extensions=org.esigate.servlet.ServletExtension,org.esigate.extension.Esi
local.mappings=*

# Another web application deployed in the same servlet container
crosscontext.remoteUrlBase=http://localhost:8080/myWebapp/
crosscontext.extensions=org.esigate.servlet.ServletExtension,org.esigate.extension.Esi
crosscontext.context=/myWebapp
crosscontext.mappings=/crosscontext/*

Technically, local and cross-context calls rely on the following methods:

| Call type | Proxy | Include |
| Local | filterChain.doFilter(request, response) | request.getRequestDispatcher(url).include(request, response) |
| Cross-context | servletContext().getContext(context).getRequestDispatcher(url).forward(request, response) | servletContext().getContext(context).getRequestDispatcher(url).include(request, response) |

Note: cross-context has to be enabled in the context configuration. For example, in a Tomcat context XML file:

<Context docBase="esigate" path="" crossContext="true"/>
			

Note: local and crosscontext providers do not support background revalidation.

Variables resolver

You can define variables in a file named esigate-vars.properties on the classpath:

variable_name=variable_value
someUrl=/cms/article123

The syntax for using a variable is $(variable_name), as defined by the ESI standard.

How URLs are mapped and rewritten

Mapping requests

When a request is received, the ProxyFilter first has to find the provider application to which the request should be forwarded:

  1. remove the context path
  2. try to match the request against the mappings declared in the configuration. If several mappings match, the longest one takes priority
  3. if stripMappingPath=true, remove the path part of the mapping

Ex: the application is deployed under the path "/myapplication" and the mapping for a provider application is "/mypath/*.jsp". When a request for "http://server/myapplication/mypath/other/test.jsp" is received, ESIGate maps the request to that provider and transforms the URL into the relative URL "/mypath/other/test.jsp".
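
The three steps above can be sketched as follows. This is a simplified illustration, not ESIGate's actual implementation: the class MappingSketch and its method names are hypothetical, the host part of a mapping is ignored, and "longest" is assumed to mean the longest pattern string.

```java
import java.util.List;
import java.util.Optional;

/** Sketch of mapping selection for patterns shaped like path + "*" + extension. */
final class MappingSketch {

    /** Step 1: remove the context path from the request path. */
    static String removeContextPath(String requestPath, String contextPath) {
        return requestPath.startsWith(contextPath)
                ? requestPath.substring(contextPath.length())
                : requestPath;
    }

    /** Returns true if the relative URL matches the pattern. */
    static boolean matches(String pattern, String relUrl) {
        int star = pattern.indexOf('*');
        if (star < 0) {
            return pattern.equals(relUrl); // No wildcard: exact match only.
        }
        String path = pattern.substring(0, star);
        String extension = pattern.substring(star + 1);
        return relUrl.length() >= path.length() + extension.length()
                && relUrl.startsWith(path)
                && relUrl.endsWith(extension);
    }

    /** Step 2: among matching patterns, keep the longest one (assumed criterion). */
    static Optional<String> select(List<String> patterns, String relUrl) {
        String best = null;
        for (String pattern : patterns) {
            if (matches(pattern, relUrl)
                    && (best == null || pattern.length() > best.length())) {
                best = pattern;
            }
        }
        return Optional.ofNullable(best);
    }

    /** Step 3: remove the path part of the mapping, keeping a leading slash. */
    static String stripMappingPath(String pattern, String relUrl) {
        int star = pattern.indexOf('*');
        String path = star < 0 ? pattern : pattern.substring(0, star);
        String rest = relUrl.substring(path.length());
        return rest.startsWith("/") ? rest : "/" + rest;
    }
}
```

With the values from the example, removing "/myapplication" from "/myapplication/mypath/other/test.jsp" leaves "/mypath/other/test.jsp", which matches both "/*" and "/mypath/*.jsp"; the sketch selects the latter because it is longer.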

Includes

The situation is simpler for includes, as the provider is explicitly declared. Ex: <esi:include src="$(PROVIDER{myprovider})/test.jsp"/> will be mapped to "myprovider" and the URL transformed to "/test.jsp".

Routing requests to the provider applications

ESIGate now has to build the URL to which the request will be sent. It takes the base URL of the server and appends the relative URL found in the previous step. Ex: if remoteUrlBase=http://localhost:8080/, the previous example's request "http://server/myapplication/mypath/other/test.jsp", transformed to "/mypath/other/test.jsp", will be rewritten to "http://localhost:8080/mypath/other/test.jsp".

If preserveHost=true, the request will still be sent to localhost:8080 but the host header will contain "server" and the URL will become "http://server/mypath/other/test.jsp".

Note: when load balancing is used, the URLs are rewritten according to the target host selected by the balancing algorithm.
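
The routing step can be illustrated with the short sketch below. It is a simplified illustration under stated assumptions, not ESIGate's code: the class RoutingSketch and its methods are hypothetical, and load balancing is left out.

```java
import java.net.URI;

/** Sketch of how the target URL and Host header could be derived. */
final class RoutingSketch {

    /** Target URL = remoteUrlBase + relative URL, without doubling the slash. */
    static String targetUrl(String remoteUrlBase, String relUrl) {
        String base = remoteUrlBase.endsWith("/")
                ? remoteUrlBase.substring(0, remoteUrlBase.length() - 1)
                : remoteUrlBase;
        String rel = relUrl.startsWith("/") ? relUrl : "/" + relUrl;
        return base + rel;
    }

    /** Host header sent to the provider, depending on preserveHost. */
    static String hostHeader(String incomingHost, String remoteUrlBase, boolean preserveHost) {
        if (preserveHost) {
            return incomingHost; // Same Host header as in the incoming request.
        }
        URI base = URI.create(remoteUrlBase);
        return base.getPort() == -1 ? base.getHost() : base.getHost() + ":" + base.getPort();
    }
}
```

In both cases the connection itself is opened to the host named in remoteUrlBase; only the Host header differs.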

Rewriting URLs in responses

All the URLs contained in the responses are automatically rewritten. This includes:

  • URLs contained in pages (<a href="...">, scripts, images, links)
  • Redirections (location header)
  • Cookies path and domain
  • Other HTTP headers (Refresh, Content-location, Referer)

The rewriting takes the original URL into account in order to perform the rewriting in the reverse direction. Ex: a link <a href="http://localhost:8080/mypath/result.jsp"> will be rewritten back to <a href="http://server/myapplication/mypath/result.jsp">.

Depending on the fixMode parameter, all URLs are rewritten either as absolute URLs (ex: "http://server/myapplication/mypath/result.jsp") or relative to the server root (ex: "/myapplication/mypath/result.jsp"). In both cases, the URL is normalized, i.e. dot segments are removed (ex: "../test.jsp" will be replaced by "http://server/myapplication/test.jsp").

The visibleUrlBase parameter makes it possible to force the base URL used for rewriting, instead of the one taken from the incoming request.
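
The rewriting rules above can be approximated with java.net.URI. The sketch below is an illustration only: the class UrlRewriteSketch and its parameters are hypothetical, the visible base URL is passed in explicitly (as visibleUrlBase would provide it), and URLs that do not belong to the provider are simply returned unchanged.

```java
import java.net.URI;

/** Sketch of response URL rewriting; not ESIGate's actual implementation. */
final class UrlRewriteSketch {

    /**
     * @param link           URL found in the provider response (absolute or relative)
     * @param remoteRequest  URL that was requested from the provider
     * @param remoteUrlBase  provider base URL, e.g. http://localhost:8080/
     * @param visibleUrlBase base URL seen by the browser, e.g. http://server/myapplication/
     * @param absolute       true for fixMode=absolute, false for fixMode=relative
     */
    static String rewrite(String link, String remoteRequest, String remoteUrlBase,
                          String visibleUrlBase, boolean absolute) {
        // Resolve against the requested page; this also removes "." and ".." segments.
        String resolved = URI.create(remoteRequest).resolve(link).normalize().toString();
        if (!resolved.startsWith(remoteUrlBase)) {
            return resolved; // Not served by this provider: left untouched in this sketch.
        }
        String visible = visibleUrlBase + resolved.substring(remoteUrlBase.length());
        if (absolute) {
            return visible;
        }
        // Relative mode: keep only the part after scheme and authority.
        URI uri = URI.create(visible);
        StringBuilder relative = new StringBuilder(uri.getRawPath());
        if (uri.getRawQuery() != null) {
            relative.append('?').append(uri.getRawQuery());
        }
        if (uri.getRawFragment() != null) {
            relative.append('#').append(uri.getRawFragment());
        }
        return relative.toString();
    }
}
```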

ESI syntax

Edge Side Include

ESIGate fully implements the ESI Language Specification 1.0 and also adds a few extra (but useful) features. Here is the reference list of all supported tags, attributes and variable expressions.

<esi:include>

This tag includes some part of another web page. Here is a basic example:

<esi:include src="$(PROVIDER{cms})/news" fragment="news_1"/>

This tag has several other attributes that give precise control over what you retrieve from the page, how it is transformed and cached, how errors are handled... See the reference table below for details. The src value should start with a $(PROVIDER{...}) expression; any characters before it are ignored.

<esi:replace>

This tag can only be used nested inside an include tag. It specifies what to replace inside the included fragment.

<esi:include src="...">
    <esi:replace fragment="my_fragment"> Replacement text</esi:replace>
</esi:include>

See the reference table below for the attributes supported by this tag.

<esi:fragment>

Delimits a fragment inside a page. This fragment can then be retrieved by another page.

<esi:fragment name="my_fragment">
   Content of the fragment
</esi:fragment>

<esi:try> <esi:attempt> <esi:except>

Handles HTTP errors such as 404 or 500.

<esi:try>
   <esi:attempt> ... </esi:attempt>
   <esi:except code="500"> ... </esi:except>
</esi:try>

<esi:choose> <esi:when> <esi:otherwise>

Defines conditional structures.

<esi:choose>
   <esi:when test="..."> ... </esi:when>
   <esi:when test="..."> ... </esi:when>
   <esi:otherwise> ... </esi:otherwise>
</esi:choose>
			

<esi:inline>

Defines fragments that will be stored separately in the cache in order to be reused later. See the ESI Language Specification 1.0 for details.

<esi:comment>

A comment that appears in the source code of the page but will be removed after ESI processing.

<esi:comment text="This comment will not be sent to the client" />

<esi:remove>

Similar to the comment tag. The HTML nested inside the remove tag is visible in the page before processing but is removed by the ESI processing. This tag is very useful, for example, when you use a page as a template and want to see it with sample content when it has not been ESI-processed.

<esi:remove>
   <strong>This is a sample text that will be removed</strong>
</esi:remove>
			

<!--esi -->

Exactly the opposite of the remove tag. Before ESI processing, the browser sees the content of the tag as an HTML comment; after processing, the tag itself is removed and the content becomes visible.

<!--esi
   <strong>This page has been processed by an ESI processor!</strong>
-->
			

<esi:vars>

Some variable expressions can be used inside esi tag attribute values. With this tag you can use expressions anywhere in your page, you just have to put a vars tag around the part of the page that may contain expressions.

<esi:vars>
   The user-agent of your browser is: $(HTTP_USER_AGENT)
</esi:vars>
			

Tag reference

| Tag | Attribute | Usage | Examples | ESI 1.0 | ESIGate 4 | Akamai Edgesuite 5* | Varnish 3* |
| <esi:include> | | Include a part of another page or the complete page | <esi:include src="URI" /> | Yes | Yes | Yes | Yes |
| | src | URL of the page to include | <esi:include src="$(PROVIDER...)URI" /> | Yes | Yes | Yes | Yes |
| | fragment | Name of the fragment to retrieve | <esi:include src="URI" fragment="NAME" /> | | Yes | | |
| | alt | Alternative if src cannot be fetched | <esi:include src="URI" alt="URI" /> | Yes | Yes | Yes | |
| | onerror | If "continue", the processor ignores any error that occurs (http code > 400). If "display", the error page retrieved is displayed. | <esi:include src="URI" onerror="display" /> | Yes | Yes | Yes | |
| | stylesheet | XSL stylesheet to apply to the resource (works with XML and HTML; the stylesheet is searched first as a local resource and, only if not found, on the remote server) | <esi:include src="a.html" stylesheet="s.xsl">...</esi:include> | | Yes | Yes | |
| | xpath | XPath expression to retrieve | <esi:include src="..." xpath="..." /> | | Yes | | |
| <esi:replace> | | Replace some part of the included resource | | | Yes | | |
| | fragment | Name of the fragment to replace | <esi:replace fragment="NAME">...</esi:replace> | | Yes | | |
| | expression | Regular expression to replace. See regular expression syntax | <esi:replace expression="$(HTTP_HOST)">www.my_host.com</esi:replace> | | Yes | | |
| <esi:fragment> | | Delimits a fragment | <esi:fragment name="my_fragment"/> | | Yes | | |
| | name | Name of the fragment | | | Yes | | |
| <esi:try> | | Try block. If multiple except tags are present, only the first matching one is used. As a result, except tags with the code attribute should appear first. | <esi:try> <esi:attempt>...</esi:attempt> <esi:except code="404">...</esi:except> <esi:except code="500">...</esi:except> <esi:except>...</esi:except> </esi:try> | Yes | Yes | Yes | |
| <esi:attempt> | | The part to try to execute | | Yes | Yes | Yes | |
| <esi:except> | | A block that replaces the attempt block in case an exception occurs | | Yes | Yes | Yes | |
| | code | HTTP return code that triggers this except block | | | Yes | | |
| <esi:choose> | | Conditional block | <esi:choose> <esi:when test="..."> ... </esi:when> <esi:when test="..."> ... </esi:when> <esi:otherwise> ... </esi:otherwise> </esi:choose> | Yes | Yes | Yes | |
| <esi:when> | | Condition | | Yes | Yes | Yes | |
| | test | Expression to evaluate | | Yes | Yes | Yes | |
| <esi:otherwise> | | Fallback if none of the previous conditions has been fulfilled | | Yes | Yes | Yes | |
| <esi:inline> | | A fragment that will be stored independently in the cache and fetched | <esi:inline name="URI" fetchable="{yes no}">...</esi:inline> | Yes | Yes | Yes | |
| | name | Name of the fragment | | Yes | Yes | Yes | |
| | fetchable | Whether the fragment is independently fetchable by name or not | | Yes | Yes | Yes | |
| <esi:comment> | | A comment that will be removed by the processor | <esi:comment text="..." /> | Yes | Yes | Yes | |
| <esi:remove> | | A page fragment that will be removed by the processor | <esi:remove> ... </esi:remove> | Yes | Yes | Yes | Yes |
| <!--esi--> | | An HTML-commented fragment that will be uncommented by the processor | <!--esi ...--> | Yes | Yes | Yes | Yes |
| <esi:vars> | | Delimits a fragment that may contain expressions to evaluate | <esi:vars> ... </esi:vars> | Yes | Yes | Yes | |

(If you find a mistake in this table, feel free to contact us.)

Please note that this table lists only the ESI 1.0 tags and the additional ESIGate tags. Other implementations may provide additional tags of their own (Akamai Edgesuite does).

Variable reference

Variable types

Type Syntax
String $(VAR) : value
List $(VAR) : complete list
$(VAR{item}) : true if item exists in the list, false otherwise
Dictionary $(VAR) : complete list
$(VAR{key}) : value of item key or an empty string if not found.
$(VAR{key}|'default value') : value of item key or default value if not found.
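
To make the dictionary syntax concrete, here is a minimal sketch of how $(VAR{key}) and $(VAR{key}|'default value') could be evaluated. It is not ESIGate's parser: the class EsiVariableSketch is hypothetical, only keyed lookups are handled, and expressions without a key are left untouched.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of keyed variable lookup with an optional default value. */
final class EsiVariableSketch {

    // Matches $(NAME{key}) optionally followed by |'default' before the closing parenthesis.
    private static final Pattern EXPR =
            Pattern.compile("\\$\\((\\w+)\\{([^}]*)\\}(?:\\|'([^']*)')?\\)");

    static String resolve(String text, Map<String, Map<String, String>> dictionaries) {
        Matcher m = EXPR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(2);
            // Accept quoted keys such as {'username'} as well as bare keys.
            if (key.length() >= 2 && key.startsWith("'") && key.endsWith("'")) {
                key = key.substring(1, key.length() - 1);
            }
            Map<String, String> dict = dictionaries.get(m.group(1));
            String value = dict == null ? null : dict.get(key);
            // Missing item: use the default if given, else an empty string.
            String fallback = m.group(3) == null ? "" : m.group(3);
            m.appendReplacement(out, Matcher.quoteReplacement(value != null ? value : fallback));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For instance, with an HTTP_COOKIE dictionary containing id=571, the text "$(HTTP_COOKIE{id})" resolves to "571", while "$(HTTP_COOKIE{lang}|'en')" falls back to "en".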

Variables

| Variable Name | Type | Example | ESI 1.0 |
| HTTP_ACCEPT_LANGUAGE | List | da, en-gb, en | Yes |
| HTTP_COOKIE | Dictionary | id=571; visits=42 | Yes |
| HTTP_HOST | String | esi.xyz.com | Yes |
| HTTP_REFERER | String | http://roberts.xyz.com/ | Yes |
| HTTP_USER_AGENT | Dictionary | Mozilla; MSIE 5.5 | Yes |
| QUERY_STRING | Dictionary | first=Robin&last=Roberts | Yes |
| PROVIDER | Dictionary | http://provider.com | No |

Functions

| Function Name | Return type | Example | ESI 1.0 |
| exists | boolean | <esi:choose><esi:when test="$exists($(HTTP_COOKIE{'username'}))">[Take some action]</esi:when></esi:choose> | No |

For the complete list of variables and expressions that can be used, see §4 of the ESI Language Specification 1.0. Note: the expression "PROVIDER" is specific to ESIGate; it is useful for externalizing the base URL of provider applications into the configuration file. All the other expressions supported by ESIGate are the ones defined in the specification.

Cache

Cache configuration

ESIGate uses the HttpClient Cache module, available in Apache HttpClient since version 4.1. This cache is compliant with the HTTP/1.1 specification. The cache can use several storage backends: in addition to its native backend, where cache entries are kept in memory, there are 2 other implementations based on EhCache and MemCached. All the cache parameters, including the choice of backend, are set in the main configuration file described in the Configuration section.

Content expiration, heuristic expiration, forced expiration

By default, the cache uses the Cache-Control and Expires headers to check whether a response is cacheable and for how long.

If the response contains only a Last-Modified header, with no other header defining when the response expires, the cache keeps the response for a duration that is a fraction of its age, 10% by default. This mechanism is known as "heuristic expiration" and is described in the HTTP 1.1 specification.

In certain cases you may want to force the time-to-live of all the responses coming from a server, regardless of the HTTP headers present in the responses. This can be done using the "ttl" parameter in the configuration.
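
The three expiration modes can be summarized by the sketch below. It is a simplified model, not the HttpClient Cache algorithm: the class FreshnessSketch and its parameters are hypothetical, the explicit lifetime stands for whatever Cache-Control or Expires yields, and the parameter names ttl, heuristicCoefficient and heuristicDefaultLifetimeSecs are borrowed from the configuration directives.

```java
import java.time.Duration;
import java.time.Instant;

/** Sketch of how long a response may be considered fresh. */
final class FreshnessSketch {

    /**
     * @param ttlSecs              forced time to live, 0 to disable (ttl)
     * @param explicitLifetimeSecs lifetime from Cache-Control or Expires, null if absent
     * @param date                 value of the Date response header
     * @param lastModified         value of the Last-Modified header, null if absent
     * @param coefficient          fraction of the age to use (heuristicCoefficient)
     * @param defaultLifetimeSecs  fallback lifetime (heuristicDefaultLifetimeSecs)
     */
    static Duration freshnessLifetime(long ttlSecs, Long explicitLifetimeSecs, Instant date,
                                      Instant lastModified, double coefficient,
                                      long defaultLifetimeSecs) {
        if (ttlSecs > 0) {
            return Duration.ofSeconds(ttlSecs); // Forced expiration ignores the headers.
        }
        if (explicitLifetimeSecs != null) {
            return Duration.ofSeconds(explicitLifetimeSecs); // Content expiration.
        }
        if (lastModified != null && date.isAfter(lastModified)) {
            // Heuristic expiration: a fraction of the time since the last modification.
            long ageSecs = Duration.between(lastModified, date).getSeconds();
            return Duration.ofSeconds(Math.round(ageSecs * coefficient));
        }
        return Duration.ofSeconds(defaultLifetimeSecs);
    }
}
```

For a response last modified 10 hours before its Date header, a coefficient of 0.1 gives a heuristic lifetime of 1 hour.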

Variants, E-tag and Vary headers

The cache stores several response variants depending on the ETag and Vary headers. This strategy can be very effective when some content depends on the user profile or language.

Cache revalidation

When a server response contains an ETag and/or Last-Modified header, the cache uses conditional requests, with If-None-Match and/or If-Modified-Since request headers, on subsequent requests in order to revalidate the cache entries without having to reload them each time.

Background revalidation

The cache also implements the HTTP Cache-Control Extensions for Stale Content specification. This means that when the cache receives a request for which it already holds a stale response, it can return that response immediately, to be faster, and then revalidate it in order to have an up-to-date response next time. It can also use the stale cache entry when the target server is not responding.

According to the specification, this behavior depends only on the stale-while-revalidate and stale-if-error Cache-Control directives sent by the target server, but in ESIGate you can also set these parameters by default for all cacheable responses.
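
The decision described above can be sketched as follows. This is an illustration of the stale-content rules, not the cache's actual code: the class StaleSketch, its enum and its parameters are hypothetical, and all values are expressed in seconds.

```java
/** Sketch of the stale-while-revalidate and stale-if-error decisions. */
final class StaleSketch {

    enum Action { SERVE_FRESH, SERVE_STALE_AND_REVALIDATE, REVALIDATE_FIRST }

    /** What to do with a cached entry when a request arrives. */
    static Action onRequest(long ageSecs, long freshnessSecs, long staleWhileRevalidateSecs) {
        if (ageSecs < freshnessSecs) {
            return Action.SERVE_FRESH;
        }
        long staleness = ageSecs - freshnessSecs;
        if (staleWhileRevalidateSecs > 0 && staleness <= staleWhileRevalidateSecs) {
            // Answer immediately with the stale entry, refresh it in the background.
            return Action.SERVE_STALE_AND_REVALIDATE;
        }
        return Action.REVALIDATE_FIRST;
    }

    /** Whether a cached entry may be served after the target server returned an error. */
    static boolean canServeOnError(long ageSecs, long freshnessSecs, long staleIfErrorSecs) {
        long staleness = Math.max(0, ageSecs - freshnessSecs);
        return staleness == 0 || (staleIfErrorSecs > 0 && staleness <= staleIfErrorSecs);
    }
}
```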

Http headers support

Request headers

Some request headers are forwarded to the target server, some are ignored and some are transformed. Here is the list of supported Request HTTP headers. Any other header would be ignored.

Request header Action
Accept Forwarded
Accept-Charset Forwarded
Accept-Encoding Forwarded
Accept-Language Forwarded
Authorization Forwarded
Cache-control Forwarded
Connection Not forwarded. The value will always be "keep-alive" as managed by the HTTP client
Content-Encoding Forwarded
Content-Language Forwarded
Content-Length Not forwarded. Managed automatically by the HTTP client. Usually chunked content encoding will be used (see HttpEntity.setChunked)
Content-MD5 Forwarded
Content-Range Forwarded
Content-Type Forwarded
Cookie Depends on cookie configuration. Rewritten and forwarded by default (see Cookies)
Date Forwarded
Expect Not supported: replies with a 417 (Expectation Failed) as required by HTTP specification
From Forwarded
Host Forwarded by default, depends on preserveHost parameter
If-Match All cache validators are recalculated by the cache (see Cache)
If-Modified-Since All cache validators are recalculated by the cache (see Cache)
If-None-Match All cache validators are recalculated by the cache (see Cache)
If-Range All cache validators are recalculated by the cache (see Cache)
If-Unmodified-Since All cache validators are recalculated by the cache (see Cache)
Max-Forwards Not forwarded
Pragma Ignored (see Cache)
Proxy-Authorization Not forwarded
Range Forwarded
Referer Rewritten
TE Not forwarded (chunked encoding managed by the container)
Trailer Not forwarded (chunked encoding managed by the container)
Transfer-Encoding Not forwarded (chunked encoding managed by the container)
Upgrade Not forwarded
User-Agent Forwarded
Warning Forwarded
X-Forwarded-For Forwarded or created if not present.
X-Forwarded-Proto Can be "http" or "https", set to the scheme in the original request uri

Response headers

Some of the response headers sent by the target servers are forwarded to the browser, some are ignored, some are transformed and some are used for Cache management. Any header not in the following list would be ignored.

Response header Action
Age Recalculated by the cache
Allow Forwarded
Cache-control Forwarded
Connection Not forwarded (keep-alive managed by the servlet container)
Content-Disposition Forwarded
Content-Encoding Forwarded only if the entity has not been decompressed (if we have to transform it, we have to decompress it)
Content-Language Forwarded
Content-Length Recalculated by the servlet container (usually set if content length is less than the buffer size)
Content-Location Rewritten
Content-MD5 Not forwarded
Content-Range Forwarded
Content-Type Forwarded
Date Set automatically by the servlet container
Expires Forwarded
ETag Forwarded
Keep-Alive Not forwarded (managed by the servlet container)
Last-Modified Forwarded
Location Rewritten
Link Rewritten
P3p Rewritten
Proxy-Authenticate Not forwarded, see authentication
Refresh Forwarded
Retry-After Forwarded
Server Forwarded
Set-Cookie Depends on cookie configuration. By default rewritten and forwarded (see Cookies)
Trailer Not forwarded (chunked encoding managed by the container)
Transfer-Encoding Not forwarded (chunked encoding managed by the container)
Vary Forwarded
Via Set by Http caching client.
Warning Forwarded
WWW-Authenticate Forwarded

Cookies

Cookie policy and cookie specifications

ESIGate is designed to behave exactly like any web browser. It should accept any cookie coming from provider applications if a standard browser accepts them.

Cookies are checked against cookie specifications. Note that the "path" and "domain" attributes are checked against the target server name and path. The target host name is the name specified in the "Host" header of the request sent to the server. In other words, when the preserveHost option is set to true, the cookie domain must match the original server name used by the browser; when preserveHost is set to false, the cookie domain must match the server name defined in the remoteUrlBase parameter in the configuration.

Cookie storing

By default, cookies sent by the browser are forwarded to target applications and cookies sent back by the server are forwarded to the browser. It is also possible to store them in the session on the ESIGate side: they are then kept server side in the user context. Of course, every user has a separate user context.

Every driver instance has its own UserContext. Contexts are isolated. This means that cookies sent to one provider are not shared with other sites, even domain cookies, just as if a separate browser were used for each provider.

As a result, when cookies are stored in the session, the only cookie visible from the browser is the "jsessionid" cookie corresponding to the user session in which ESIGate stores the user context.

Cookie forwarding, cookie storing, cookie discarding

It is possible to configure ESIGate to store all cookies or specific cookies to the session and also discard all or specific cookies. This is done by using the properties "storeCookiesInSession" and "discardCookies" in the configuration file. Values for these properties should be a comma separated list of cookie names or the single value "*".

Example 1: store cookies named "cookie1" and "cookie2", discard the cookie named "cookie3"; all other cookies will be forwarded (default behavior).

storeCookiesInSession=cookie1,cookie2
discardCookies=cookie3
					

Example 2: store cookies named "cookie1" and "cookie2", discard all other cookies.

storeCookiesInSession=cookie1,cookie2
discardCookies=*
					

Note: if the name of a cookie is present in storeCookiesInSession, it takes priority over "*" in discardCookies

Note: if the name of a cookie is present in discardCookies, it takes priority over "*" in storeCookiesInSession

Note: using value "*" for both storeCookiesInSession and discardCookies is not allowed.

Note: forwarded cookies are not kept server side, they are forwarded both ways, server to browser when the header "set-cookie" is received, browser to server on every request.
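
The priority rules in the notes above can be expressed as a small decision function. This is a sketch under stated assumptions rather than ESIGate's implementation: the class CookiePolicySketch is hypothetical, and the case where the same name is listed explicitly in both properties is not covered by this documentation, so the sketch arbitrarily stores such a cookie.

```java
import java.util.Set;

/** Sketch of the store / discard / forward decision for one cookie. */
final class CookiePolicySketch {

    enum Action { STORE, DISCARD, FORWARD }

    /**
     * @param name    cookie name
     * @param store   parsed value of storeCookiesInSession (names or "*")
     * @param discard parsed value of discardCookies (names or "*")
     */
    static Action decide(String name, Set<String> store, Set<String> discard) {
        if (store.contains("*") && discard.contains("*")) {
            throw new IllegalArgumentException(
                    "\"*\" is not allowed for both storeCookiesInSession and discardCookies");
        }
        // An explicit name takes priority over "*" in the other property.
        if (store.contains(name)) {
            return Action.STORE; // Also wins if listed in both (assumption).
        }
        if (discard.contains(name)) {
            return Action.DISCARD;
        }
        if (store.contains("*")) {
            return Action.STORE;
        }
        if (discard.contains("*")) {
            return Action.DISCARD;
        }
        return Action.FORWARD; // Default behavior.
    }
}
```

With the settings of Example 2, "cookie1" is stored and any other cookie is discarded; with those of Example 1, a cookie that appears in neither list is forwarded.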

Cookie rewriting

For forwarded cookies, the domain, path and secure attributes cannot be kept as-is, because a cookie that does not match its originating domain would be rejected by the browser.

Domain:

  • if the domain is the server name, it is converted to the ESIGate server name used in the request
  • if the domain is a more general domain, ESIGate tries to convert it to a domain matching the request domain

Path:

  • the path is rewritten to the longest matching path in the URL

Secure:

  • the cookie is marked secure only if the server sent it as secure and the scheme is https; in all other cases, the cookie is not marked secure when sent to the browser

Authentication and SSO

Authentication

User authentication consists of two different things:

  1. authenticating user requests
  2. forwarding user information to the providers

Authentication is managed by extensions in ESIGate 4.x. In older versions, authentication was handled by implementations of the AuthenticationHandler interface.

To ease upgrading from a previous ESIGate version, or simply for inspiration, an adapter between AuthenticationHandler and Extension is provided: GenericAuthenticationHandler.

Extensions are declared for each provider. This means that ESIGate can get content from applications using different authentication systems.

Default authentication handler

The default AuthenticationHandler is the RemoteUserAuthenticationHandler implementation. It tries to retrieve the user by calling the container's request.getRemoteUser() method. If this method returns a user name, the name is forwarded to the provider in the "X_REMOTE_USER" HTTP header.
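As a rough illustration of this behavior, the sketch below copies the outgoing headers and adds "X_REMOTE_USER" only when a user name is available. It does not use the ESIGate or servlet APIs: the class RemoteUserForwardingSketch and the plain header map are hypothetical stand-ins, and remoteUser plays the role of the value returned by request.getRemoteUser().

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the default handler's behavior; not ESIGate's own code. */
final class RemoteUserForwardingSketch {
    static final String HEADER = "X_REMOTE_USER";

    /** Returns a copy of the backend request headers, with the user name added if there is one. */
    static Map<String, String> addRemoteUser(Map<String, String> headers, String remoteUser) {
        Map<String, String> result = new LinkedHashMap<>(headers);
        if (remoteUser != null) {
            result.put(HEADER, remoteUser);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept", "text/html");
        System.out.println(addRemoteUser(headers, "jdoe"));
    }
}
```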

CAS authentication

There is also an AuthenticationHandler for CAS Single Sign On. See CAS for more information.

Custom authentication

Any other authentication method can be implemented by writing a class that implements Extension or extends GenericAuthenticationHandler.

Errors management

While retrieving a page or resource from a remote server, many things can go wrong. ESIGate provides ways to handle these problems properly.

Http errors

Any HTTP status code other than the following is considered an error:

  • 200 OK
  • 301 Moved permanently
  • 302 Found
  • 304 Not modified

Network errors and timeout

There are several kinds of network problems, for example:

  • DNS errors (host name cannot be resolved)
  • Connection refused (can be a wrong port number or a firewall problem)
  • Broken pipe (host closed connection unexpectedly)
  • ...

In addition, for performance reasons, ESIGate cannot wait indefinitely for the target server to answer. This is the purpose of the "timeout" configuration parameter, which applies in two cases:

  • Connection timeout (trying to connect to the server but the server does not answer)
  • Socket timeout (the connection has been established and the server may have started sending the response, but no packet has been received for too long)

All these problems are handled like Http errors with the following codes :

Problem type | HTTP status | HTTP message
Connection refused | 502 | Bad Gateway
Connection pool timeout (all the http connections to current host are busy) | 504 | Gateway timeout
Connect timeout | 504 | Gateway timeout
Socket timeout | 504 | Gateway timeout
Error retrieving URL (any other error) | 500 | Internal server error
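The mapping in this table can be approximated with standard JDK exception types. This is a sketch under assumptions, not ESIGate's code: the class NetworkErrorMappingSketch is hypothetical, and it relies only on java.net exceptions. The JDK has no type for a connection pool timeout, and java.net.SocketTimeoutException covers both connect and read timeouts here, so those table rows collapse into a single branch.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

/** Hypothetical mapping from JDK network exceptions to the statuses in the table above. */
final class NetworkErrorMappingSketch {

    static int statusFor(IOException e) {
        if (e instanceof ConnectException) {
            return 502; // Connection refused -> Bad Gateway
        }
        if (e instanceof SocketTimeoutException) {
            return 504; // Connect or socket timeout -> Gateway timeout
        }
        return 500; // Any other error -> Internal server error
    }

    static String messageFor(int status) {
        switch (status) {
            case 502: return "Bad Gateway";
            case 504: return "Gateway timeout";
            default:  return "Internal server error";
        }
    }

    public static void main(String[] args) {
        IOException dnsFailure = new UnknownHostException("no.such.host");
        int status = statusFor(dnsFailure);
        System.out.println(status + " " + messageFor(status));
    }
}
```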

Handling errors with ESIGate

In case of an error, there are several possible ways to handle the problem :

  • Display the error page from the target server in case of an HTTP error
  • Display an error page in case of other errors (application server default error page with stacktrace or custom error page)
  • Display a simple error message (status code + status text)
  • Display a generic error message or just nothing

Not every option is available in every situation. For example, if a problem occurs while rendering a block inside a page and the response is already committed, only a simplified or generic message can be displayed, because rendering the complete error page could leave the page totally broken.

Cache and errors

HTTP caches are usually conservative about storing responses with a status code other than 200 OK. ESIGate deliberately does the opposite:

error pages are always cached.

This behaviour is designed to avoid performance issues: when a resource returns an error, there is no reason to keep requesting the same resource again and again. ESIGate keeps the response in the cache and sends a new request only when the cache entry has expired.
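The effect of caching error responses can be shown with a toy cache. The sketch is illustrative only: the class ErrorCachingSketch, its Backend interface and the fixed time-to-live are hypothetical, and the real cache decides expiry from its own configuration and the HTTP headers. The point is that a failing backend is contacted once per expiry period, not once per request.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy cache that stores every response status, errors included; not ESIGate's own code. */
final class ErrorCachingSketch {

    /** Hypothetical backend abstraction returning an HTTP status for a URL. */
    interface Backend {
        int fetchStatus(String url);
    }

    private static final class Entry {
        final int status;
        final long expiresAtMillis;

        Entry(int status, long expiresAtMillis) {
            this.status = status;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Backend backend;
    private final long ttlMillis;

    ErrorCachingSketch(Backend backend, long ttlMillis) {
        this.backend = backend;
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached status while the entry is fresh, otherwise asks the backend again. */
    int get(String url, long nowMillis) {
        Entry entry = cache.get(url);
        if (entry != null && nowMillis < entry.expiresAtMillis) {
            return entry.status; // served from cache, even if it is an error
        }
        int status = backend.fetchStatus(url);
        cache.put(url, new Entry(status, nowMillis + ttlMillis));
        return status;
    }
}
```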

Load balancing

For better performance and availability, each provider application can be served by several servers. As ESIGate can handle the load balancing itself, you do not need any extra software or hardware. To enable it, define a comma-separated list of backend servers in the 'remoteUrlBase' property. E.g.:

default.remoteUrlBase=http://example.com:8080/,http://example2.com:8080/

There are 3 strategies for choosing the backend server URL for the current request:

  • roundrobin - rotates backend URLs on every request (this is the default value)
  • iphash - always returns the same backend URL for the same remote client IP
  • stickysession - sets a cookie in the client browser containing the id of the backend server URL, so that ESIGate uses the same backend URL for all requests from this browser

The strategy is defined in the 'remoteUrlBaseStrategy' property. If this property is not defined, the 'roundrobin' strategy is used.
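To make the difference between the first two strategies concrete, here is a minimal sketch of how a backend URL could be picked. It is not ESIGate's implementation: the class BackendSelectionSketch is hypothetical, and using String.hashCode() on the client IP is just one simple way to get a stable choice. The stickysession strategy is omitted because it also involves reading and writing a browser cookie.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical selection logic for the roundrobin and iphash strategies. */
final class BackendSelectionSketch {
    private final List<String> urls = new ArrayList<>();
    private final AtomicInteger counter = new AtomicInteger();

    /** Accepts the same comma-separated format as the 'remoteUrlBase' property. */
    BackendSelectionSketch(String remoteUrlBase) {
        for (String url : remoteUrlBase.split(",")) {
            urls.add(url.trim());
        }
    }

    /** roundrobin: rotate through the backends on every call. */
    String roundRobin() {
        int index = Math.floorMod(counter.getAndIncrement(), urls.size());
        return urls.get(index);
    }

    /** iphash: the same client IP always maps to the same backend. */
    String ipHash(String clientIp) {
        int index = Math.floorMod(clientIp.hashCode(), urls.size());
        return urls.get(index);
    }

    public static void main(String[] args) {
        BackendSelectionSketch selector = new BackendSelectionSketch(
                "http://example.com:8080/,http://example2.com:8080/");
        System.out.println(selector.roundRobin());
        System.out.println(selector.roundRobin());
        System.out.println(selector.ipHash("192.0.2.10"));
    }
}
```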

Extending and customizing

ESIGate 4.0 introduces Extensions and Events. These are an easy way to customize ESIGate behavior: remove unused features and add user-defined functions.

Extensions

Extensions are simply classes which implement the Extension interface.

They are loaded at startup by ESIGate according to the "extensions" configuration directive. Each provider can use a different set of extensions.

In the init() method, an extension will usually read configuration and register to events.

Events

Events are hooks on ESIGate's request processing that make it possible to safely insert custom code at every step.

Using events, extensions can:

  • handle security or login in remote applications by doing more http calls to correctly provide credentials when access is refused.
  • alter html content or headers before a request result is used or even put into the cache.
  • change cache TTL based on URLs.
  • rewrite links in headers or in the html body.
  • update or remove cookies.
  • send additional headers to client or to backends.
  • cancel requests.
  • do custom logging.
  • ...and much more

Extensions can register an event listener to events using the Event Manager : Driver#getEventManager().register()

The following events are supported :

  • Proxy events : ESIGate processes an incoming request (ESIGate configured as a proxy).
    • EVENT_PROXY_PRE : before processing an incoming request.
    • EVENT_PROXY_POST : after processing an incoming request.
  • Fragment events : A fragment is required for inclusion (esi:include). ESIGate will try to use its cache or fall back to an http call to the remote backend.
    • EVENT_FRAGMENT_PRE : before retrieving a fragment.
    • EVENT_FRAGMENT_POST : after retrieving a fragment.
  • Fetch events : An HTTP call is made to a remote backend.
    • EVENT_FETCH_PRE : before creating the HTTP call.
    • EVENT_FETCH_POST : after we receive the response.
  • Render events : Renderers are applied to the current page. This event can be used to inject additional renderers.
    • EVENT_RENDER_PRE : before applying renderers
    • EVENT_RENDER_POST : after applying renderers
  • Read entity event : response is read using the charset declared by HTTP headers.
    • EVENT_READ_ENTITY : after reading response using the default encoding
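The register-then-fire pattern behind these events can be modeled in plain Java. The sketch below is a toy model and deliberately does not use ESIGate classes: EventHookSketch, its Listener interface and the use of a string map as event context are all hypothetical. Only the event name EVENT_FETCH_PRE comes from the list above. In a real extension, registration goes through Driver#getEventManager().register(), and the Javadoc gives the actual signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of event hooks; the names here are illustrative, not the ESIGate API. */
final class EventHookSketch {

    /** Hypothetical listener: receives the event name and a mutable context. */
    interface Listener {
        void onEvent(String eventName, Map<String, String> context);
    }

    private final Map<String, List<Listener>> listeners = new HashMap<>();

    /** Called by an extension, typically from its init() method. */
    void register(String eventName, Listener listener) {
        List<Listener> registered = listeners.get(eventName);
        if (registered == null) {
            registered = new ArrayList<>();
            listeners.put(eventName, registered);
        }
        registered.add(listener);
    }

    /** Called by the request processing code at the matching step. */
    void fire(String eventName, Map<String, String> context) {
        List<Listener> registered = listeners.get(eventName);
        if (registered == null) {
            return;
        }
        for (Listener listener : registered) {
            listener.onEvent(eventName, context);
        }
    }

    public static void main(String[] args) {
        EventHookSketch events = new EventHookSketch();
        // An "extension" that adds a header before each backend call.
        events.register("EVENT_FETCH_PRE", new Listener() {
            public void onEvent(String eventName, Map<String, String> context) {
                context.put("X-Example-Header", "added-by-extension");
            }
        });
        Map<String, String> requestHeaders = new HashMap<>();
        events.fire("EVENT_FETCH_PRE", requestHeaders);
        System.out.println(requestHeaders);
    }
}
```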

Available extensions

ESIGate comes with several existing extensions :

Class name Description Default From
org.esigate.extension.FetchLogging log http calls to remote backends, including target host, url, status code, request and response headers. Yes 4.0
org.esigate.extension.FragmentLogging log the use of http fragments (requests to the cache) including request and response headers and cache use (hit/miss/validated). Yes 4.0
org.esigate.extension.ResourceFixup if enabled by configuration directives (fixResources, fixMode, visibleUrlBase), rewrites html content to ensure links point directly to the remote backend. This should be used when Esigate is embedded in an application or to ensure all application links go through esigate. Yes 4.0
org.esigate.authentication.RemoteUserAuthenticationHandler sends current user id as an http request header (X_REMOTE_USER). Yes 4.0
org.esigate.authentication.CasAuthenticationHandler handles backends requiring authentication on CAS SSO. No 4.0
org.esigate.authentication.RequestAuthenticationHandler sends selected session attributes and request attributes as http request headers. No 4.0
org.esigate.extension.ForwardOriginalUrl sends original request url as received by esigate as http request header (X-Esigate-Request). No 4.0
org.esigate.extension.XPoweredBy adds
X-Powered-By: Esigate
in response.
Yes 4.0
org.esigate.extension.DefaultCharset use a custom default charset instead of ISO-8859-1 when no charset information is available in HTTP headers. Use providerid.defaultCharset parameter. No 4.1
org.esigate.extension.HtmlEncodingProcessor Read HTML documents (text/html and application/xhtml+xml) meta tags to get the right charset. No 5.0
org.esigate.extension.Esi This extension processes ESI directives, like :
<esi:include src="$(PROVIDER{cms})/news" fragment="news_1"/>
Yes 5.0
org.esigate.extension.Aggregate (deprecated) This extension processes the old esigate directives based on html comments, like :
<!--$includeblock$aggregated2$block.html$myblock$-->

see : http://www.esigate.org/html-comments.html for complete syntax.
Yes 5.0
org.esigate.extension.ConfigReloadOnChange This extension reloads configuration when esigate.properties is updated.
This only works on configuration defined using "esigate.config" system property.
This extension is not intended for use in production.
No 5.0
org.esigate.extension.ConfigReloadOnHup This extension reloads configuration when signal HUP is received. On POSIX systems, this signal is sent using :
kill -1 <esigatepid>

This only works on configuration defined using "esigate.config" system property.
This class relies on the sun.misc package and may not work on all JVMs.
No 5.0
org.esigate.extension.Metric This extension records proxy requests and backend requests to generate driver statistics.
The Metric extension counts occurrences of backend request (fetch-post) and driver proxy request (proxy-post) events to generate statistics about the rate of events per second. The statistics show the mean throughput and the average throughput during the last one, five and fifteen minutes.
Error events are maintained in separate counters for each status code.

Statistics are logged using SLF4J at INFO level every 60 seconds. The period can be configured in driver properties:
metricPeriod=60

Sample statistics logs :
 Metric.aggregated1.org.esigate.fetch-post, count=13, mean_rate=0.28, m1=0.86, m5=1.41, m15=1.53, rate_unit=events/second
 Metric.aggregated1.org.esigate.fetch-post.error.404, count=32, mean_rate=0.60, m1=3.02, m5=5.50, m15=6.08, rate_unit=events/second
 Metric.aggregated1.org.esigate.proxy-post, count=31, mean_rate=0.68, m1=1.04, m5=1.46, m15=1.55, rate_unit=events/second
 Metric.aggregated1.org.esigate.proxy-post.error.404, count=32, mean_rate=0.60, m1=3.02, m5=5.50, m15=6.08, rate_unit=events/second
 Metric.aggregated2.org.esigate.fetch-post, count=59, mean_rate=1.37, m1=6.17, m5=9.80, m15=10.58, rate_unit=events/second
                            
In this example, proxy-post for the aggregated1 driver shows 31 successful requests and 32 errors with a 404 status code.
No 5.0
org.esigate.extension.http.DNS The DNS extension associates IP addresses with a given host in a DNS overrider. The IP addresses are assumed to be already resolved. With this extension, you can force an arbitrary Host header value to be sent to the remote app without modifying the DNS or adding a network alias in /etc/hosts.
Sample configuration:
dns.remoteUrlBase=http://myPrivateVirtualHost:8080/
dns.remoteIP=172.17.42.1
No 5.2

ESIGate users can add custom extensions packaged in a jar or simply compiled classes depending on the way ESIGate is used (standalone server or library).

How to debug HTTP requests/responses?

FragmentLogging and FetchLogging extensions

Two extensions can be used to log all requests and responses (with request/response headers, cookies and status code) for debugging:

  • org.esigate.extension.FragmentLogging
  • org.esigate.extension.FetchLogging

The difference between these two extensions is that FetchLogging logs only the requests actually sent to the target server, whereas FragmentLogging also logs the responses served from the cache and gives some details about the cache status.

If you did not set the "extensions" parameter in the esigate.properties file, these extensions are already active. If you have set it, you will have to declare them explicitly. Then you just need to set the corresponding logging category to INFO. For example, if you are using log4j:

log4j.rootLogger=WARN, A

log4j.appender.A.threshold=TRACE
log4j.appender.A=org.apache.log4j.ConsoleAppender
log4j.appender.A.layout=org.apache.log4j.PatternLayout
log4j.appender.A.layout.ConversionPattern=%d{dd-MM HH:mm:ss} %-8r [%t] %-5p %c %x - %m%n

#log4j.category.org.esigate.extension.FetchLogging=INFO
log4j.category.org.esigate.extension.FragmentLogging=INFO

The logs are compact (every request generates only 1 line in the logs) and the performance overhead is small so these extensions can be used safely in production if needed.

HttpClient wire logging

If you need even more details, you can use HttpClient wire logging. This gives you all the information you need, but the log files become very verbose, so it is not recommended in production. Also, because every request generates many lines in the log, the output becomes very hard to analyze when the application is used concurrently by many users.

API documentation

Any functionality used in the taglib, reverse proxy or aggregator can be used through the API for more specific purposes. All functionalities are implemented in the class org.esigate.Driver.

See the Javadoc for more information

Other features: Taglib, JSF, Wicket, html comments-based syntax

Taglib

The JSP taglib, JSF and Wicket components were removed in version 5.0 and replaced by a servlet filter that benefits from the full ESI syntax and is compatible with all presentation frameworks (JSP, JSF, Wicket...).

If you plan to migrate an application developed using the taglib, please refer to the upgrade documentation.

Html comments-based syntax

In addition to the ESI language, ESIGate also supports a syntax based on html comments, though its functionalities are more limited.

See the html comments-based syntax
