    Nov 21 09:27:31 localhost haproxy[2169]: 217.232.135.126:43837 10.50.14.244:35693 [21/Nov/2012:09:27:24.304] ConCardisPL ConCardisPL/aflpve100210 1/1/6941 209 -- 0/0/0/0/0 0/0

  Field   Format                                                   Extract from the example above
      1   process_name '[' pid ']:'                                haproxy[2169]:
      2   client_ip ':' client_port                                217.232.135.126:43837
      3   backend_source_ip ':' backend_source_port                10.50.14.244:35693
      4   '[' accept_date ']'                                      [21/Nov/2012:09:27:24.304]
      5   frontend_name                                            ConCardisPL
      6   backend_name '/' server_name                             ConCardisPL/aflpve100210
      7   Tw '/' Tc '/' Tt*                                        1/1/6941
      8   bytes_read*                                              209
      9   termination_state                                        --
     10   actconn '/' feconn '/' beconn '/' srv_conn '/' retries*  0/0/0/0/0
     11   srv_queue '/' backend_queue                              0/0

Detailed fields description :

  - "backend_source_ip" is the IP address of haproxy while connecting to the backend.

  - "backend_source_port" is the TCP port of haproxy while connecting to the backend.

  - "client_ip" is the IP address of the client which initiated the TCP connection to haproxy. If the connection was accepted on a UNIX socket instead, the IP address would be replaced with the word "unix". Note that when the connection is accepted on a socket configured with "accept-proxy" and the PROXY protocol is correctly used, then the logs will reflect the forwarded connection's information.

  - "client_port" is the TCP port of the client which initiated the connection. If the connection was accepted on a UNIX socket instead, the port would be replaced with the ID of the accepting socket, which is also reported in the stats interface.

  - "accept_date" is the exact date when the connection was received by haproxy (which might be very slightly different from the date observed on the network if there was some queuing in the system's backlog). This is usually the same date which may appear in any upstream firewall's log.

  - "frontend_name" is the name of the frontend (or listener) which received and processed the connection.
  - "backend_name" is the name of the backend (or listener) which was selected to manage the connection to the server. This will be the same as the frontend if no switching rule has been applied, which is common for TCP applications.

  - "server_name" is the name of the last server to which the connection was sent, which might differ from the first one if there were connection errors and a redispatch occurred. Note that this server belongs to the backend which processed the request. If the connection was aborted before reaching a server, "<NOSRV>" is indicated instead of a server name.

  - "Tw" is the total time in milliseconds spent waiting in the various queues. It can be "-1" if the connection was aborted before reaching the queue. See "Timers" below for more details.

  - "Tc" is the total time in milliseconds spent waiting for the connection to establish to the final server, including retries. It can be "-1" if the connection was aborted before a connection could be established. See "Timers" below for more details.

  - "Tt" is the total time in milliseconds elapsed between the accept and the last close. It covers all possible processing. There is one exception: if "option logasap" was specified, then the time counting stops at the moment the log is emitted. In this case, a '+' sign is prepended before the value, indicating that the final one will be larger. See "Timers" below for more details.

  - "bytes_read" is the total number of bytes transmitted from the server to the client when the log is emitted. If "option logasap" is specified, then this value will be prefixed with a '+' sign indicating that the final one may be larger. Please note that this value is a 64-bit counter, so log analysis tools must be able to handle it without overflowing.

  - "termination_state" is the condition the session was in when the session ended. This indicates the session state, which side caused the end of session to happen, and for what reason (timeout, error, ...).
    The normal flags should be "--", indicating the session was closed by either end with no data remaining in buffers. See below "Session state at disconnection" for more details.

  - "actconn" is the total number of concurrent connections on the process when the session was logged. It is useful to detect when some per-process system limits have been reached. For instance, if actconn is close to 512 when multiple connection errors occur, chances are high that the system limits the process to use a maximum of 1024 file descriptors and that all of them are used. See section 3 "Global parameters" to find how to tune the system.

  - "feconn" is the total number of concurrent connections on the frontend when the session was logged. It is useful to estimate the amount of resource required to sustain high loads, and to detect when the frontend's "maxconn" has been reached. Most often when this value increases by huge jumps, it is because there is congestion on the backend servers, but sometimes it can be caused by a denial of service attack.

  - "beconn" is the total number of concurrent connections handled by the backend when the session was logged. It includes the total number of concurrent connections active on servers as well as the number of connections pending in queues. It is useful to estimate the amount of additional servers needed to support high loads for a given application. Most often when this value increases by huge jumps, it is because there is congestion on the backend servers, but sometimes it can be caused by a denial of service attack.

  - "srv_conn" is the total number of concurrent connections still active on the server when the session was logged. It can never exceed the server's configured "maxconn" parameter.
    If this value is very often close or equal to the server's "maxconn", it means that traffic regulation is involved a lot, meaning that either the server's maxconn value is too low, or that there aren't enough servers to process the load with an optimal response time. When only one server's "srv_conn" is high, it usually means that this server has some trouble causing the connections to take longer to be processed than on other servers.

  - "retries" is the number of connection retries experienced by this session when trying to connect to the server. It must normally be zero, unless a server is being stopped at the same moment the connection was attempted. Frequent retries generally indicate either a network problem between haproxy and the server, or a misconfigured system backlog on the server preventing new connections from being queued. This field may optionally be prefixed with a '+' sign, indicating that the session has experienced a redispatch after the maximal retry count has been reached on the initial server. In this case, the server name appearing in the log is the one the connection was redispatched to, and not the first one, though both may sometimes be the same in case of hashing for instance. So as a general rule of thumb, when a '+' is present in front of the retry count, this count should not be attributed to the logged server.

  - "srv_queue" is the total number of requests which were processed before this one in the server queue. It is zero when the request has not gone through the server queue. It makes it possible to estimate the approximate server's response time by dividing the time spent in queue by the number of requests in the queue. It is worth noting that if a session experiences a redispatch and passes through two server queues, their positions will be cumulated. A request should not pass through both the server queue and the backend queue unless a redispatch occurs.
  - "backend_queue" is the total number of requests which were processed before this one in the backend's global queue. It is zero when the request has not gone through the global queue. It makes it possible to estimate the average queue length, which easily translates into a number of missing servers when divided by a server's "maxconn" parameter. It is worth noting that if a session experiences a redispatch, it may pass twice in the backend's queue, and then both positions will be cumulated. A request should not pass through both the server queue and the backend queue unless a redispatch occurs.

Session state at disconnection

TCP and HTTP logs provide a session termination indicator in the "termination_state" field, just before the number of active connections. It is 2 characters long in TCP mode, and is extended to 4 characters in HTTP mode, each of which has a special meaning :

  - On the first character, a code reporting the first event which caused the session to terminate :

        C : the TCP session was unexpectedly aborted by the client.

        S : the TCP session was unexpectedly aborted by the server, or the server explicitly refused it.

        P : the session was prematurely aborted by the proxy, because of a connection limit enforcement, because a DENY filter was matched, because of a security check which detected and blocked a dangerous error in server response which might have caused information leak (e.g. cacheable cookie), or because the response was processed by the proxy (redirect, stats, etc...).

        R : a resource on the proxy has been exhausted (memory, sockets, source ports, ...). Usually, this appears during the connection phase, and system logs should contain a copy of the precise error. If this happens, it must be considered as a very serious anomaly which should be fixed as soon as possible by any means.

        I : an internal error was identified by the proxy during a self-check.
            This should NEVER happen, and you are encouraged to report any log containing this, because this would almost certainly be a bug. It would be wise to preventively restart the process after such an event too, in case it would be caused by memory corruption.

        D : the session was killed by haproxy because the server was detected as down and was configured to kill all connections when going down.

        U : the session was killed by haproxy on this backup server because an active server was detected as up and was configured to kill all backup connections when going up.

        K : the session was actively killed by an admin operating on haproxy.

        c : the client-side timeout expired while waiting for the client to send or receive data.

        s : the server-side timeout expired while waiting for the server to send or receive data.

        - : normal session completion, both the client and the server closed with nothing left in the buffers.

  - on the second character, the TCP or HTTP session state when it was closed :

        R : the proxy was waiting for a complete, valid REQUEST from the client (HTTP mode only). Nothing was sent to any server.

        Q : the proxy was waiting in the QUEUE for a connection slot. This can only happen when servers have a 'maxconn' parameter set. It can also happen in the global queue after a redispatch consecutive to a failed attempt to connect to a dying server. If no redispatch is reported, then no connection attempt was made to any server.

        C : the proxy was waiting for the CONNECTION to establish on the server. The server might at most have noticed a connection attempt.

        H : the proxy was waiting for complete, valid response HEADERS from the server (HTTP only).

        D : the session was in the DATA phase.

        L : the proxy was still transmitting LAST data to the client while the server had already finished. This one is very rare as it can only happen when the client dies while receiving the last packets.

        T : the request was tarpitted.
            It has been held open with the client during the whole "timeout tarpit" duration or until the client closed, both of which will be reported in the "Tw" timer.

        - : normal session completion after end of data transfer.

  - the third character tells whether the persistence cookie was provided by the client (only in HTTP mode) :

        N : the client provided NO cookie. This is usually the case for new visitors, so counting the number of occurrences of this flag in the logs generally indicates a valid trend in the site's traffic.

        I : the client provided an INVALID cookie matching no known server. This might be caused by a recent configuration change, mixed cookies between HTTP/HTTPS sites, persistence conditionally ignored, or an attack.

        D : the client provided a cookie designating a server which was DOWN, so either "option persist" was used and the client was sent to this server, or it was not set and the client was redispatched to another server.

        V : the client provided a VALID cookie, and was sent to the associated server.

        E : the client provided a valid cookie, but with a last date which was older than what is allowed by the "maxidle" cookie parameter, so the cookie is considered EXPIRED and is ignored. The request will be redispatched just as if there was no cookie.

        O : the client provided a valid cookie, but with a first date which was older than what is allowed by the "maxlife" cookie parameter, so the cookie is considered too OLD and is ignored. The request will be redispatched just as if there was no cookie.

        U : a cookie was present but was not used to select the server because some other server selection mechanism was used instead (typically a "use-server" rule).

        - : does not apply (no cookie set in configuration).

  - the last character reports what operations were performed on the persistence cookie returned by the server (only in HTTP mode) :

        N : NO cookie was provided by the server, and none was inserted either.
        I : no cookie was provided by the server, and the proxy INSERTED one. Note that in "cookie insert" mode, if the server provides a cookie, it will still be overwritten and reported as "I" here.

        U : the proxy UPDATED the last date in the cookie that was presented by the client. This can only happen in insert mode with "maxidle". It happens every time there is activity at a different date than the date indicated in the cookie. If any other change happens, such as a redispatch, then the cookie will be marked as inserted instead.

        P : a cookie was PROVIDED by the server and transmitted as-is.

        R : the cookie provided by the server was REWRITTEN by the proxy, which happens in "cookie rewrite" or "cookie prefix" modes.

        D : the cookie provided by the server was DELETED by the proxy.

        - : does not apply (no cookie set in configuration).

The combination of the first two flags gives a lot of information about what was happening when the session terminated, and why it did terminate. It can be helpful to detect server saturation, network troubles, local system resource starvation, attacks, etc...

The most common termination flags combinations are indicated below. They are alphabetically sorted, with the lowercase set just after the upper case for easier finding and understanding.

     Flags   Reason

      --     Normal termination.

      CC     The client aborted before the connection could be established to the server. This can happen when haproxy tries to connect to a recently dead (or unchecked) server, and the client aborts while haproxy is waiting for the server to respond or for "timeout connect" to expire.

      CD     The client unexpectedly aborted during data transfer. This can be caused by a browser crash, by an intermediate equipment between the client and haproxy which decided to actively break the connection, by network routing issues between the client and haproxy, or by a keep-alive session between the server and the client terminated first by the client.
      cD     The client did not send nor acknowledge any data for as long as the "timeout client" delay. This is often caused by network failures on the client side, or the client simply leaving the net uncleanly.

      CH     The client aborted while waiting for the server to start responding. It might be the server taking too long to respond or the client clicking the 'Stop' button too fast.

      cH     The "timeout client" struck while waiting for client data during a POST request. This is sometimes caused by too large TCP MSS values for PPPoE networks which cannot transport full-sized packets. It can also happen when client timeout is smaller than server timeout and the server takes too long to respond.

      CQ     The client aborted while its session was queued, waiting for a server with enough empty slots to accept it. It might be that either all the servers were saturated or that the assigned server was taking too long to respond.

      CR     The client aborted before sending a full HTTP request. Most likely the request was typed by hand using a telnet client, and aborted too early. The HTTP status code is likely a 400 here. Sometimes this might also be caused by an IDS killing the connection between haproxy and the client.

      cR     The "timeout http-request" struck before the client sent a full HTTP request. This is sometimes caused by too large TCP MSS values on the client side for PPPoE networks which cannot transport full-sized packets, or by clients sending requests by hand and not typing fast enough, or forgetting to enter the empty line at the end of the request. The HTTP status code is likely a 408 here.

      CT     The client aborted while its session was tarpitted. It is important to check if this happens on valid requests, in order to be sure that no wrong tarpit rules have been written. If a lot of them happen, it might make sense to lower the "timeout tarpit" value to something closer to the average reported "Tw" timer, in order not to consume resources for just a few attackers.
      SC     The server or an equipment between it and haproxy explicitly refused the TCP connection (the proxy received a TCP RST or an ICMP message in return). Under some circumstances, it can also be the network stack telling the proxy that the server is unreachable (e.g. no route, or no ARP response on local network). When this happens in HTTP mode, the status code is likely a 502 or 503 here.

      sC     The "timeout connect" struck before a connection to the server could complete. When this happens in HTTP mode, the status code is likely a 503 or 504 here.

      SD     The connection to the server died with an error during the data transfer. This usually means that haproxy has received an RST from the server or an ICMP message from an intermediate equipment while exchanging data with the server. This can be caused by a server crash or by a network issue on an intermediate equipment.

      sD     The server did not send nor acknowledge any data for as long as the "timeout server" setting during the data phase. This is often caused by too short timeouts on L4 equipments before the server (firewalls, load-balancers, ...), as well as keep-alive sessions maintained between the client and the server expiring first on haproxy.

      SH     The server aborted before sending its full HTTP response headers, or it crashed while processing the request. Since a server aborting at this moment is very rare, it would be wise to inspect its logs to check whether it crashed and why. The logged request may indicate a small set of faulty requests, demonstrating bugs in the application. Sometimes this might also be caused by an IDS killing the connection between haproxy and the server.

      sH     The "timeout server" struck before the server could return its response headers. This is the most common anomaly, indicating too long transactions, probably caused by server or database saturation.
             The immediate workaround consists in increasing the "timeout server" setting, but it is important to keep in mind that the user experience will suffer from these long response times. The only long-term solution is to fix the application.

      sQ     The session spent too much time in queue and has been expired. See the "timeout queue" and "timeout connect" settings to find out how to fix this if it happens too often. If it often happens massively in short periods, it may indicate general problems on the affected servers due to I/O or database congestion, or saturation caused by external attacks.

      PC     The proxy refused to establish a connection to the server because the process' socket limit has been reached while attempting to connect. The global "maxconn" parameter may be increased in the configuration so that it does not happen anymore. This status is very rare and might happen when the global "ulimit-n" parameter is forced by hand.

      PD     The proxy blocked an incorrectly formatted chunked encoded message in a request or a response, after the server has emitted its headers. In most cases, this will indicate an invalid message from the server to the client.

      PH     The proxy blocked the server's response, because it was invalid, incomplete, dangerous (cache control), or matched a security filter. In any case, an HTTP 502 error is sent to the client. One possible cause for this error is an invalid syntax in an HTTP header name containing unauthorized characters. It is also possible, but quite rare, that the proxy blocked a chunked-encoding request from the client due to an invalid syntax, before the server responded. In this case, an HTTP 400 error is sent to the client and reported in the logs.

      PR     The proxy blocked the client's HTTP request, either because of an invalid HTTP syntax, in which case it returned an HTTP 400 error to the client, or because a deny filter matched, in which case it returned an HTTP 403 error.
      PT     The proxy blocked the client's request and has tarpitted its connection before returning it a 500 server error. Nothing was sent to the server. The connection was maintained open for as long as reported by the "Tw" timer field.

      RC     A local resource has been exhausted (memory, sockets, source ports) preventing the connection to the server from establishing. The error logs will tell precisely what was missing. This is very rare and can only be solved by proper system tuning.

The combination of the last two flags gives a lot of information about how persistence was handled by the client, the server and by haproxy. This is very important to troubleshoot disconnections, when users complain they have to re-authenticate. The commonly encountered flags are :

      --     Persistence cookie is not enabled.

      NN     No cookie was provided by the client, none was inserted in the response. For instance, this can be in insert mode with "postonly" set on a GET request.

      II     A cookie designating an invalid server was provided by the client, a valid one was inserted in the response. This typically happens when a "server" entry is removed from the configuration, since its cookie value can be presented by a client when no other server knows it.

      NI     No cookie was provided by the client, one was inserted in the response. This typically happens for first requests from every user in "insert" mode, which makes it an easy way to count real users.

      VN     A cookie was provided by the client, none was inserted in the response. This happens for most responses for which the client has already got a cookie.

      VU     A cookie was provided by the client, with a last visit date which is not completely up-to-date, so an updated cookie was provided in response. This can also happen if there was no date at all, or if there was a date but the "maxidle" parameter was not set, so that the cookie can be switched to unlimited time.
      EI     A cookie was provided by the client, with a last visit date which is too old for the "maxidle" parameter, so the cookie was ignored and a new cookie was inserted in the response.

      OI     A cookie was provided by the client, with a first visit date which is too old for the "maxlife" parameter, so the cookie was ignored and a new cookie was inserted in the response.

      DI     The server designated by the cookie was down, a new server was selected and a new cookie was emitted in the response.

      VI     The server designated by the cookie was not marked dead but could not be reached. A redispatch happened and selected another one, which was then advertised in the response.
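
As an illustration of the TCP log format and the two-character termination state described above, here is a minimal Python sketch that splits the example line into its eleven fields and translates the termination flags into short descriptions. It is not part of haproxy: the names LOG_RE, parse_tcp_log and describe_termination are hypothetical, the regular expression assumes exactly the field layout shown in the example (including the backend source address field), and the description strings are abbreviated from the lists above.

```python
import re

# Hypothetical pattern following the 11 fields of the example TCP log line.
LOG_RE = re.compile(
    r"(?P<process_name>\S+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<client_ip>\S+):(?P<client_port>\d+)\s+"
    r"(?P<backend_source_ip>\S+):(?P<backend_source_port>\d+)\s+"
    r"\[(?P<accept_date>[^\]]+)\]\s+"
    r"(?P<frontend_name>\S+)\s+"
    r"(?P<backend_name>[^/\s]+)/(?P<server_name>\S+)\s+"
    r"(?P<Tw>-?\d+)/(?P<Tc>-?\d+)/(?P<Tt>\+?\d+)\s+"
    r"(?P<bytes_read>\+?\d+)\s+"
    r"(?P<termination_state>\S{2})\s+"
    r"(?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/"
    r"(?P<srv_conn>\d+)/(?P<retries>\+?\d+)\s+"
    r"(?P<srv_queue>\d+)/(?P<backend_queue>\d+)"
)

# First character: the first event which caused the session to terminate.
FIRST_EVENT = {
    "C": "aborted by the client",
    "S": "aborted or refused by the server",
    "P": "aborted by the proxy",
    "R": "proxy resource exhausted",
    "I": "internal proxy error",
    "D": "killed because the server went down",
    "U": "killed on backup server because an active server came up",
    "K": "killed by an admin",
    "c": "client-side timeout expired",
    "s": "server-side timeout expired",
    "-": "normal session completion",
}

# Second character: the session state when it was closed.
SESSION_STATE = {
    "R": "waiting for a REQUEST from the client",
    "Q": "waiting in the QUEUE",
    "C": "waiting for the CONNECTION to the server",
    "H": "waiting for response HEADERS from the server",
    "D": "DATA phase",
    "L": "transmitting LAST data to the client",
    "T": "request was tarpitted",
    "-": "normal session completion",
}


def parse_tcp_log(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_RE.search(line)  # search() skips the syslog prefix
    if match is None:
        return None
    fields = match.groupdict()
    # These fields may carry a '+' prefix ("option logasap" or redispatch).
    for name in ("Tt", "bytes_read", "retries"):
        fields[name + "_plus"] = fields[name].startswith("+")
        fields[name] = int(fields[name].lstrip("+"))
    # Plain integers; Python ints do not overflow, which suits bytes_read.
    for name in ("pid", "Tw", "Tc", "actconn", "feconn", "beconn",
                 "srv_conn", "srv_queue", "backend_queue"):
        fields[name] = int(fields[name])
    return fields


def describe_termination(state):
    """Translate the first two termination flags into short descriptions."""
    return (FIRST_EVENT.get(state[0], "unknown"),
            SESSION_STATE.get(state[1], "unknown"))


EXAMPLE = ("Nov 21 09:27:31 localhost haproxy[2169]: 217.232.135.126:43837 "
           "10.50.14.244:35693 [21/Nov/2012:09:27:24.304] ConCardisPL "
           "ConCardisPL/aflpve100210 1/1/6941 209 -- 0/0/0/0/0 0/0")

if __name__ == "__main__":
    parsed = parse_tcp_log(EXAMPLE)
    print(parsed["server_name"], parsed["Tt"], parsed["termination_state"])
    print(describe_termination(parsed["termination_state"]))
```

The client port and backend source port are left as strings in this sketch. The '+' prefix is stored separately (for instance "retries_plus") so that, as noted above, a retry count following a redispatch is not attributed to the logged server by mistake.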