0001 TPBRIDGE(8)
0002 ===========
0003 :doctype: manpage
0004 
0005 
0006 NAME
0007 ----
0008 tpbridge - Enduro/X Bridge Server.
0009 
0010 
0011 SYNOPSIS
0012 --------
0013 *tpbridge* ['OPTIONS']
0014 
0015 
0016 DESCRIPTION
0017 -----------
0018 This is a special ATMI server which is used to connect local ATMI instances
0019 over the network. The result is a set of network-joined instances which form
0020 an Enduro/X cluster.
0021 
0022 The bridge process is used to exchange service lists between two nodes,
0023 calculate the monotonic clock difference between nodes (so that message times
0024 can be adjusted later) and send XATMI IPC traffic between the machines.
0025 
0026 To establish a network connection, the bridge on one machine must be in
0027 passive mode (socket server) and on the other machine it must be configured
0028 in active mode (socket client). The active 'tpbridge' periodically tries to connect
0029 to the passive Enduro/X instance. If the connection is dropped, the active node
0030 retries the connection. A single 'tpbridge' process accepts only a single
0031 TCP connection. Between two Enduro/X instances only one link can be defined,
0032 with a 'tpbridge' process configured accordingly on each Enduro/X node.
0033 An Enduro/X node may have several 'tpbridge' process definitions,
0034 but each of these processes must define a link to a different Enduro/X
0035 cluster node.
0036 
0037 All data messages are prefixed with a 4 byte message length indicator.
0038 This means that a logical message can be split over multiple packets, or
0039 multiple logical messages can be carried within one packet.
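
The length-prefixed framing described above can be illustrated with a small
Python sketch. It is not the tpbridge implementation: the byte order
(big-endian) and the assumption that the length counts only the payload are
illustrative guesses, and the names 'frame' and 'extract_messages' are
hypothetical.

```python
import struct

HEADER_LEN = 4  # 4 byte message length indicator


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length (assumed unsigned, big-endian)."""
    return struct.pack("!I", len(payload)) + payload


def extract_messages(buffer: bytearray) -> list:
    """Remove and return every complete logical message held in buffer.

    Incomplete trailing data stays in the buffer until more bytes arrive,
    which covers both a message split over several packets and several
    messages arriving in one packet.
    """
    messages = []
    while len(buffer) >= HEADER_LEN:
        (length,) = struct.unpack_from("!I", buffer, 0)
        if len(buffer) < HEADER_LEN + length:
            break  # the rest of this message arrives in a later packet
        messages.append(bytes(buffer[HEADER_LEN:HEADER_LEN + length]))
        del buffer[:HEADER_LEN + length]
    return messages
```

A receiver would append every chunk returned by 'recv()' to the buffer and
call 'extract_messages()' after each append.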
0040 
0041 When the connection is established, the following sequence of actions happens: 1)
0042 The clock difference between nodes and advertised service lists are exchanged. 2)
0043 After the initial data exchange (clock & tables) 'tpbridge' is used to send
0044 XATMI IPC traffic over the network, i.e. 'tpcall()', 'tpforward()',
0045 conversations, etc.
0046 
0047 When the connection is stopped, this is reported to the 'ndrxd' daemon, which
0048 removes the services from shared memory accordingly.
0049 
0050 tpbridge supports two network message formats. The first format is the native format,
0051 which sends internal (C language) structures directly over the network. This format
0052 is faster, but cannot be used between different types of computers,
0053 i.e. in this case it is not possible to mix, for example, x86_64 with x86, or
0054 x86 with RISC/ARM 32bit. If mixing is necessary, use the Enduro/X Network Protocol option,
0055 activated by flag '-f' on both nodes. In this case a standard common TLV data format is used
0056 for data exchange between nodes. This might be slower than the native format.
0057 
0058 When a host name (*-h*) is used for resolving the binding host or connection address,
0059 tpbridge resolves it to IP addresses. Multiple IP addresses per host name are
0060 supported. The logic for using them is as follows:
0061 
0062 In case of binding (*server*):
0063 
0064 1. Query host name
0065 
0066 2. Select first IP found in list
0067 
0068 3. Try to bind to selected IP
0069 
0070 4. If bind failed with *EADDRINUSE* or *EADDRNOTAVAIL*, select next IP, continue with 3.
0071 
0072 5. If the list of IP addresses is exhausted, start with 1.
0073 
0074 In case of connecting to address (*client*):
0075 
0076 1. Query host name
0077 
0078 2. Select first IP found in list
0079 
0080 3. Perform connect() to selected IP
0081 
0082 4. If connect() asynchronously failed (after *EINPROGRESS*) with any error, 
0083 select next IP, continue with 3.
0084 
0085 5. If the list of IP addresses is exhausted, start with 1.
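
The client-side steps above can be sketched in Python as follows. This is a
simplified illustration, not the tpbridge code: it uses a blocking connect
instead of the asynchronous *EINPROGRESS* handling, and the names
'resolve_ips', 'connect_with_failover' and the 'max_rounds' limit are
hypothetical (tpbridge itself keeps retrying periodically).

```python
import socket


def resolve_ips(host, port, family=socket.AF_INET):
    """Return the distinct IP addresses the OS resolver reports for host."""
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    ips = []
    for info in infos:
        ip = info[4][0]
        if ip not in ips:
            ips.append(ip)
    return ips


def connect_with_failover(host, port, resolve=resolve_ips,
                          connect=socket.create_connection, max_rounds=3):
    """Try each resolved IP in order; query the host name again when exhausted."""
    for _ in range(max_rounds):
        for ip in resolve(host, port):      # steps 1 and 2
            try:
                return connect((ip, port))  # step 3
            except OSError:
                continue                    # step 4: select the next IP
        # step 5: list exhausted, start again with a fresh query
    raise OSError("no address of %s accepted the connection" % host)
```

The 'resolve' and 'connect' parameters exist only so that the selection logic
can be exercised without a network.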
0086 
0087 tpbridge to-network and from-network streams are handled by separate threads.
0088 Thus reading from XATMI queues is done by the server main thread, and reading from
0089 the network is done by another thread.
0090 
0091 BLOCKING CONDITION SOLVING
0092 --------------------------
0093 
0094 *In case if IPC queues are full:*
0095 
0096 When messages are received from the network, but local queues are blocked (full), e.g.
0097 services are too slow to process the amount of incoming requests, tpbridge will try
0098 to solve the situation in the following way:
0099 
0100 1. If the bridge gets an *EAGAIN* error when sending to an XATMI IPC queue, the message is
0101 added to an internal temporary queue. The queue size is set by the *-Q* cli flag (or read
0102 from *NDRX_MSGMAX*, or defaulted to 100), see 'TEMPORARY_QUEUE_SIZE' below.
0103 When a message is queued, a special queue sender thread is activated.
0104 
0105 2. If the queued message count is greater than
0106 'TEMPORARY_QUEUE_SIZE', the incoming net-in thread is blocked until the
0107 queue sender thread has sent some messages and the queued message count is less than
0108 'TEMPORARY_QUEUE_SIZE'.
0109 
0110 3. If queue action (*-A*) is set to *2* and a new message arrives for a
0111 single XATMI IPC queue (e.g. service queue, reply queue, etc.) which is full,
0112 the message is discarded.
0113 
0114 4. If queue action (*-A*) is set to *3* and a new message arrives for a
0115 single XATMI IPC queue which is full, or a message arrives for another queue
0116 but the overall limit of temporarily enqueued messages is exceeded,
0117 the message is discarded.
0118 
0119 5. If the bridge is resolving the condition by blocking and is blocked,
0120 the net-in thread is unblocked once every second to process
0121 any incoming messages from the socket.
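
The interaction between the queue action (*-A*) and the two limits (*-Q* and
*-q*) can be summarised as a small decision function. The sketch below is an
illustration written from the rules in this page, not the tpbridge source.
The function name, its parameters and the use of '>=' to mean "full" are
assumptions.

```python
def on_destination_busy(action, queued_total, queued_for_dest,
                        global_limit, dest_limit):
    """Return 'enqueue', 'block' or 'discard' for a message that hit EAGAIN.

    action          -- value of -A (1, 2 or 3)
    queued_total    -- messages currently held in the temporary queue
    queued_for_dest -- messages held for this single XATMI IPC queue
    global_limit    -- value of -Q
    dest_limit      -- value of -q
    """
    dest_full = queued_for_dest >= dest_limit
    global_full = queued_total >= global_limit
    # Actions 2 and 3 drop the message when the per-queue space is used up;
    # action 1 ignores that condition.
    if dest_full and action in (2, 3):
        return "discard"
    # When the global space is used up, action 3 drops, others block net-in.
    if global_full:
        return "discard" if action == 3 else "block"
    return "enqueue"
```

With the default action *1* the result is never 'discard': the bridge either
keeps the message or blocks incoming traffic.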
0122 
0123 *Queue sender Logic:*
0124 
0125 1. Enqueued messages are grouped by single XATMI IPC queue names.
0126 
0127 2. Each enqueued message gets a scheduled time for its next send attempt. The minimum wait
0128 time is set by the *-m* parameter.
0129 
0130 3. On every unsuccessful send attempt, the wait time until the next attempt is multiplied by 2.
0131 
0132 4. If the send attempt for a single XATMI IPC queue was successful, the next message of the
0133 same queue is sent immediately. If it fails, the next attempt for the
0134 same queue is scheduled and the queue runner switches to the next single XATMI
0135 IPC queue (if any).
0136 
0137 5. If the maximum number of attempts (*-R*) is reached or the TTL (*-L*) has expired (if configured), the message is discarded.
0138 
0139 6. If the message itself is timed (i.e. tpcall() without the *TPNOTIME* flag)
0140 and it expires, the message is discarded.
0141 
0142 7. If the destination queue is broken (the attempt did not end with *EAGAIN* or
0143 *EINTR*), the message is discarded.
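
The retry timing from the steps above can be sketched as a doubling schedule.
This is a minimal illustration which assumes that the wait time starts at the
*-m* value and is capped at the *-M* value. The exact scheduling inside
tpbridge may differ, and 'retry_delays' is a hypothetical helper.

```python
def retry_delays(min_sleep_ms=40, max_sleep_ms=150, attempts=5):
    """Return the wait times (ms) before each of the given retry attempts.

    The wait starts at min_sleep_ms (-m), doubles after every unsuccessful
    attempt and is assumed never to exceed max_sleep_ms (-M).
    """
    delays = []
    delay = min_sleep_ms
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, max_sleep_ms)
    return delays
```

With the default values of *-m* and *-M* the first waits would be 40, 80 and
then 150 milliseconds.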
0144 
0145 
0146 *In case if network socket is full:*
0147 
0148 1. If the outgoing socket is blocked (full), tpbridge stops processing
0149 outgoing traffic, e.g. the "@TPBRIDGEXXX" queue will not be read.
0150 
0151 2. To respond to *ndrxd(8)* pings, the bridge allows processing of
0152 one outgoing message per second (i.e. reads Enduro/X XATMI server queues).
0153 
0154 
0155 *Message discard strategy*
0156 
0157 If the message is a service call and the client is waiting for an answer, the server
0158 error *TPESVCERR* is returned to the caller.
0159 
0160 All discarded messages are logged with log level *3* (warning) to the bridge
0161 logs.
0162 
0163 The ULOG contains a "Discarding message" entry for every discarded message.
0164 
0165 CLOCK SYNC
0166 ----------
0167 
0168 As Enduro/X accounts for almost all time elements (timeouts, etc.) in local monotonic
0169 time, a time correction (adjustment from the remote monotonic clock to the local one) is
0170 required when an XATMI IPC message is received from a remote node. The bridge process
0171 uses special messages to exchange the clock information between the nodes.
0172 
0173 When the connection is established, each node sends its
0174 local monotonic time to the other node. This information is used for time correction between the nodes.
0175 However, over time the monotonic clocks of connected hosts may drift away from
0176 the difference measured at connection startup. Messages
0177 received or sent from one node might then appear expired on the other node.
0178 To solve this issue, tpbridge periodically sends dynamic clock exchange messages
0179 between nodes in a synchronous fashion. The round trip time is measured (just like a ping time)
0180 and if it is within acceptable boundaries, the time from the other host is accepted
0181 and the time correction value is updated. The maximum roundtrip time is set by the '-k'
0182 flag (default 200 ms), and the interval is set by the '-K' flag (default 600 sec).
0183 To monitor the clock status, the TM_MIB class *T_BRCON* can be
0184 used, e.g. with the "$ xadmin -c T_BRCON" call.
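
The periodic clock exchange can be pictured with the following Python sketch.
It only illustrates the acceptance rule (roundtrip within the '-k' limit).
The way the correction value is derived here (local time at reply minus the
remote time) is an assumption, and 'sync_clock' with its parameters is a
hypothetical helper, not a tpbridge function.

```python
import time


def sync_clock(exchange, current_diff_ms, max_roundtrip_ms=200,
               clock=time.monotonic):
    """Return the clock correction (ms) to use after one sync attempt.

    exchange        -- callable that asks the remote node for its monotonic
                       time in milliseconds and waits for the reply
    current_diff_ms -- correction in use before this attempt
    """
    started = clock()
    remote_ms = exchange()
    finished = clock()
    roundtrip_ms = (finished - started) * 1000.0
    if roundtrip_ms > max_roundtrip_ms:
        # Reply took too long, so the remote value is ignored.
        return current_diff_ms
    # Assumed formula: difference between local and remote monotonic time.
    return finished * 1000.0 - remote_ms
```

A slow reply leaves the previous correction untouched, so a single congested
exchange does not distort message timeouts.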
0185 
0186 OPTIONS
0187 -------
0188 *-n* 'NODE_ID'::
0189 Other Enduro/X instance's Node ID. Numerical 1..32.
0190 
0191 [*-r*]::
0192 Send Refresh messages to the other node. If not set, the other node will
0193 not see our node's services. OPTIONAL flag.
0194 
0195 *-t* 'MODE'::
0196 'MODE' can be 'P' for passive/TCPIP server mode; any other value (e.g. 'A')
0197 selects client mode.
0198 
0199 *-i* 'IP_ADDRESS'::
0200 In active mode it is the IP address to connect to. In passive mode it is the
0201 binding/listen address.
0202 
0203 *-h* 'HOST_NAME'::
0204 The binding/connection IP address may be resolved from the host name set in the -h parameter.
0205 The host name is resolved by the OS, DNS queries, etc. *tpbridge* shall be started with either
0206 *-i* or *-h*; if both flags are set, an error is generated.
0207 
0208 *-6*::
0209 If set, IPv6 addresses are used. By default *tpbridge* operates with
0210 *IPv4* addresses.
0211 
0212 *-p* 'PORT_NUMBER'::
0213 In active mode 'PORT_NUMBER' is the port to connect to. In passive mode it is the
0214 port on which to listen for connections.
0215 
0216 *-T* 'TIME_OUT_SEC'::
0217 Time-out value for packet receive, in seconds. This is a
0218 socket option. A receive is initiated either when there is a poll event on the socket,
0219 or when an incomplete logical message has been received and the next 'recv()' is called.
0220 If the message part is not received in time, the socket is closed and the connection
0221 is restarted. This parameter is also used when the target socket to which a message
0222 is being sent is full for this time period. If the message is not fully sent
0223 and the time-out is reached, the connection is restarted and the outgoing message is dropped.
0224 
0225 [*-b* 'BACKLOG_NR']::
0226 Number of backlog entries. This is the server's (passive mode) connection queue, used before
0227 the server accepts a connection. OPTIONAL parameter. Default value is 100, but
0228 it could be set to something like 5.
0229 
0230 [*-c* 'CONNECTION_CHECK_SEC']::
0231 Connection check interval in seconds. OPTIONAL parameter. Default value 5.
0232 
0233 [*-z* 'PERIODIC_ZERO_SEND_SEC']::
0234 Interval in seconds at which a zero length message is written to the socket.
0235 This is useful to keep the connection open over firewalls, etc.
0236 OPTIONAL parameter. Default value 0 (do not send).
0237 
0238 [*-a* 'INCOMING_RECV_ACTIVITY_SEC']::
0239 If set, this is the maximum time within which some packet from the network must be
0240 received. If there is no receive activity on the socket, the connection is reset.
0241 The *0* value disables this functionality. The default value is '-z'
0242 multiplied by 2. Note that checks are performed at '-c'
0243 intervals. Usually this is used with '-z', so that it is guaranteed that during
0244 that period there will be some traffic.
0245 
0246 [*-f*]::
0247 Use 'Enduro/X Standard Network TLV Protocol' instead of native data structures
0248 for sending data over the network. This also ensures some backwards compatibility
0249 between Enduro/X versions, but cases of backwards compatibility must be checked
0250 individually.
0251 
0252 [*-P* 'THREAD_POOL_SIZE']::
0253 This is the number of worker threads for sending and receiving messages
0254 to/from the network. 50% of the threads are used for upload and the other 50% are
0255 used for network download. Thus the number is divided by 2 and two thread pools
0256 are created. If the divided value is less than 1, the default is used.
0257 The default size is *4*.
0258 
0259 [*-R* 'QUEUE_RETRIES']::
0260 Number of attempts to send a message to a local queue, if on the previous attempt the queue
0261 was full. The first attempt is done in real time; any further attempts (if this flag allows)
0262 are performed with a calculated frequency of: nr_messages_failed_to_send - nr_messages_sent
0263 in milliseconds. Default value is *999999*. To disable the temporary queue, set the value
0264 to *0*.
0265 
0266 [*-A* 'TEMPORARY_QUEUE_ACTION']::
0267 This value indicates how tpbridge shall handle the cases when the temporary
0268 queue space for unsent / blocking messages is full. The values are as follows:
0269 value *1* - if the global temp queue is full (*-Q* param) - block the
0270 bridge / stop incoming traffic, if a single XATMI IPC queue is full (*-q*) -
0271 ignore the condition (i.e. let it fill up to the *-Q* limit). Value *2* - if the
0272 global temp queue is full - block, if a single XATMI IPC queue is full -
0273 discard the message. Value *3* - if the global temp queue is full - discard the message,
0274 if a single XATMI IPC queue is full - discard the message. Default is *1*.
0275 
0276 [*-Q* 'TEMPORARY_QUEUE_SIZE']::
0277 This is the number of messages that tpbridge can accumulate when a message is
0278 received from the network and the destination queue is full (e.g. service call queue, reply queue, etc).
0279 If this parameter is not set, the value of the *NDRX_MSGMAX* environment variable is used.
0280 If the env variable is not available, the value defaults to *100*. The
0281 temporary queue size is a preferred limit (and not a strict one), because due to parallel processing
0282 conditions, the number of messages in the queue might go over this number until
0283 the bridge is locked.
0284 
0285 [*-q* 'TEMPORARY_QUEUE_SIZE_DEST']::
0286 This is the maximum number of messages tpbridge can accumulate
0287 for a single XATMI IPC queue which is full/blocking.
0288 This parameter is used when the queue action (-A) is configured to drop messages
0289 if the temporary space of a single XATMI IPC queue is full.
0290 
0291 [*-L* 'TEMPORARY_QUEUE_TTL']::
0292 This is the number of milliseconds for messages to live in the temporary queue. The default
0293 value is the *NDRX_TOUT* env setting converted to milliseconds.
0294 
0295 [*-m* 'TEMPORARY_QUEUE_MINSLEEP']::
0296 This is the minimum number of milliseconds to wait before the next attempt
0297 to send the message to the XATMI IPC queue. The default is *40*.
0298 
0299 [*-M* 'TEMPORARY_QUEUE_MAXSLEEP']::
0300 This is the maximum number of milliseconds to wait before the next attempt
0301 to send the message to the XATMI IPC queue. The default is *150*.
0302 
0303 [*-B* 'THREADPOOL_BUFFER_SIZE']::
0304 This is the number of messages that either the net-out or net-in threads can accumulate
0305 in the corresponding thread job queue. A higher number means that tpbridge will
0306 start to collect some unprocessed messages, but the pipeline between the
0307 incoming/outgoing main threads and the thread pool workers will be better. The default
0308 value is half of the 'THREAD_POOL_SIZE'.
0309 
0310 [*-k* 'CLOCKSYNC_ROUNDTRIP']::
0311 Maximum periodic clock sync message roundtrip time from the local host to the remote host and
0312 back, in milliseconds, for accepting the remote host's monotonic clock value
0313 for time adjustments. If the roundtrip time for a clock request is greater than this
0314 value, the response with the remote host's monotonic clock value is ignored.
0315 Default is *200*.
0316 
0317 [*-K* 'CLOCKSYNC_PERIOD']::
0318 Number of seconds between clock synchronization requests. Value *0* disables
0319 this functionality. Default value is *600*. Checking is performed with
0320 the granularity of 'CONNECTION_CHECK_SEC'.
0321 
0322 
0323 EXIT STATUS
0324 -----------
0325 *0*::
0326 Success
0327 
0328 *1*::
0329 Failure
0330 
0331 BUGS
0332 ----
0333 Report bugs to support@mavimax.com
0334 
0335 SEE ALSO
0336 --------
0337 *ex_env(5)* *ndrxconfig.xml(5)* *xadmin(8)* *ndrxd(8)* *ex_netproto(guides)*
0338 
0339 COPYING
0340 -------
0341 (C) Mavimax, Ltd
0342