doc/manpage/tmsrv.adoc

0001 TMSRV(8)
0002 ========
0003 :doctype: manpage
0004
0005
0006 NAME
0007 ----
0008 tmsrv - Transaction Manager Server.
0009
0010
0011 SYNOPSIS
0012 --------
0013 *tmsrv* ['OPTIONS']
0014
0015
0016 DESCRIPTION
0017 -----------
0018 This is a special ATMI server that is used for distributed transaction coordination.
0019 For transactions started with *tpbegin(3)*, *tmsrv* generates a new XID and passes
0020 it back to the transaction initiator. At the same time, *tmsrv* records the transaction
0021 as active, and a background thread checks it for time-out expiry.
0022
0023 In Enduro/X, XA Resource Managers are identified by numeric identifiers in
0024 the range 1..255. The Enduro/X Resource Manager ID (RMID) is the same identifier as the
0025 Group Number or grpno known in other ATMI systems. At most *32* different
0026 resources may participate in one global transaction.
0027
0028 If a new resource manager is associated with the transaction during distributed
0029 transaction processing, the process notifies the initial transaction manager that
0030 the new association must be made.
0031
0032 Every transaction is logged to a separate file on the disk. The file name contains
0033 the resource manager ID and transaction XID. The statuses of all involved resource managers are logged
0034 to this machine-readable log file. Once the transaction is
0035 completed, the file is removed. If *tmsrv* crashes with transactions in progress,
0036 the transaction log files are read from the disk at the next *tmsrv* startup,
0037 and the appropriate actions are performed according to the two-phase commit
0038 state machine (the transactions are committed or aborted).
0039
0040 If a running transaction times out, the background thread aborts it automatically,
0041 and for the caller process *tpcommit(3)* fails with the *TPEABORT* error code.
0042
0043 If several resource managers are used in a single transaction,
0044 the other transaction managers for the corresponding resource managers are used as workers
0045 for executing prepare/abort/commit actions on the enlisted resource managers. These
0046 other transaction managers may be located on other cluster nodes, depending on the
0047 system setup. Cluster setup must be done correctly because an initiator
0048 transaction manager must have direct access (i.e. direct *tpbridge(8)* link) to
0049 all enlisted resource manager-associated transaction managers (workers).
0050
0051 Transaction managers can be load-balanced by using the min/max attributes in *ndrxconfig.xml(5)*.
0052 In load-balanced mode, a free transaction manager is selected
0053 at *tpbegin()*. The selected manager is then responsible for the full life cycle
0054 of the transaction. Other enlisted transaction/resource managers for this transaction help to
0055 prepare/commit/abort the transaction branches. These other TMs are selected in load-balanced
0056 mode.
0057
0058 Every instance of *tmsrv* advertises the following services:
0059
0060 1. '@TM-<Resource Manager ID>'
0061
0062 2. '@TM-<Cluster (or Virtual) Node ID>-<Resource Manager ID>'
0063
0064 3. '@TM-<Cluster (or Virtual) Node ID>-<Resource Manager ID>-<EnduroX Server ID>'
0065
0066 For example, for Resource Manager ID 1, Cluster Node ID 6, and Enduro/X Server ID 10,
0067 the services will look like:
0068
0069 1. @TM-1
0070
0071 2. @TM-6-1
0072
0073 3. @TM-6-1-10
0074
0075 Currently, service format 1 is used for starting a new transaction
0076 and for accepting prepare/commit/abort calls from the transaction initiator TM.
0077 Service format 3 is used by the transaction initiator for subsequent
0078 *tpcommit(3)*/*tpabort(3)* calls. Format 3 is also used by services involved in the transaction
0079 to register a new Resource Manager ID as part of the transaction.
0080
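The three service name formats above can be illustrated with a minimal Python sketch.
The helper name 'tm_service_names' is hypothetical and is not part of Enduro/X; it only
reproduces the naming pattern and the 1..255 RMID range described in this manpage.

```python
def tm_service_names(rmid: int, node_id: int, srvid: int) -> list:
    """Build the three service names in the formats listed above.

    Hypothetical helper for illustration only (not an Enduro/X API).
    """
    if not 1 <= rmid <= 255:
        raise ValueError("RMID must be in range 1..255")
    return [
        f"@TM-{rmid}",                      # format 1
        f"@TM-{node_id}-{rmid}",            # format 2
        f"@TM-{node_id}-{rmid}-{srvid}",    # format 3
    ]


# Values taken from the example in the text: RMID 1, node 6, server ID 10.
print(tm_service_names(rmid=1, node_id=6, srvid=10))
# prints ['@TM-1', '@TM-6-1', '@TM-6-1-10']
```
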
0081 For XA processing, resource manager drivers are loaded as dynamically loadable shared libraries.
0082 The resource manager shall expose its xa_switch in the shared library. A separate Enduro/X
0083 *tmsrv* runs for every different resource manager. An Enduro/X process is associated with
0084 the corresponding RMID via the *NDRX_XA_RES_ID* environment variable.
0085
0086 To configure different RMIDs for different processes or tmsrvs, use the Enduro/X built-in
0087 environment variable override facility, or associate the processes with a *<cctag>* setting
0088 that corresponds to the *[@global/<cctag>]* sub-section where the XA settings can be placed.
0089 See the *ex_env(5)* manpage for more details.
0090
0091 Enduro/X supports static and dynamic XA registration.
0092
0093 LOGGING AND RECOVERY
0094 --------------------
0095 *tmsrv* registers all activities of the transactions and resource managers in the
0096 machine-readable log file. The log file is used for crash recovery, where the last
0097 state of the transaction is read and the transaction is completed according to the
0098 target state set in the log file.
0099
0100 Each log file line contains a CRC32 checksum, which is verified during
0101 recovery. A line may be bad if the data was not fully flushed to the disk
0102 before the crash. Any lines found to be invalid during the recovery
0103 process are ignored, and *tmsrv* acts on the last
0104 known state.
0105
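As an illustration of the per-line checksum verification described above, the following
Python sketch keeps only the lines whose CRC32 matches and takes the last valid one as the
last known state. The line layout used here (payload, a colon, then eight hex digits of
CRC32) and the helper names are assumptions made for the example; the actual on-disk
format of the *tmsrv* log is not described in this manpage.

```python
import zlib


def make_line(payload: str) -> str:
    """Append a CRC32 of the payload (hypothetical layout: '<payload>:<8 hex digits>')."""
    crc = zlib.crc32(payload.encode("utf-8")) & 0xFFFFFFFF
    return f"{payload}:{crc:08x}"


def valid_payloads(lines):
    """Return payloads of lines whose checksum verifies; bad lines are skipped."""
    good = []
    for line in lines:
        payload, sep, crc_hex = line.rstrip("\n").rpartition(":")
        if not sep or len(crc_hex) != 8:
            continue  # torn or truncated line
        try:
            expected = int(crc_hex, 16)
        except ValueError:
            continue  # checksum field is not hexadecimal
        if zlib.crc32(payload.encode("utf-8")) & 0xFFFFFFFF == expected:
            good.append(payload)
    return good


# The last line is cut short, as if it had not been fully flushed before a crash.
log = [
    make_line("stage=active"),
    make_line("stage=preparing"),
    make_line("stage=committing")[:-3],
]
states = valid_payloads(log)
last_known = states[-1] if states else None
print(last_known)  # prints stage=preparing
```
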
0106 When the transaction is started, when a new resource manager joins the transaction,
0107 or when a commit/abort request is made, logging is mandatory, i.e. if the disk is
0108 full or a permissions error occurs, the transaction is either not started or its state is not
0109 changed.
0110
0111 When the transaction is finalized (committed or aborted), the transaction and
0112 resource states are logged optionally, so write errors are ignored
0113 (but are reported in ULOG). If recovery is necessary at this stage,
0114 the transaction is finalized according to the last valid data logged.
0115
0116 If, after crash recovery, some transactions still exist in the Resource Manager
0117 as not completed, the following *xadmin(8)* commands may be used to finish them at
0118 the level of the particular Resource Manager:
0119
0120 - $ xadmin recoverlocal
0121
0122 - $ xadmin commitlocal
0123
0124 - $ xadmin abortlocal
0125
0126 - $ xadmin forgetlocal
0127
0128 *WARNING:* These commands do not consult the originating transaction
0129 managers for the transaction statuses, thus these commands shall be used only
0130 when the system is idle (it is not processing any useful workload, and it is known that
0131 some records at the resources are locked / stuck at the prepare stage).
0132
0133 To collect and roll back any orphaned prepared transactions, it is recommended
0134 to configure a single *tmrecoversv(8)* copy at the end of the *ndrxconfig.xml(5)*
0135 server startup sequence, so that this server automatically collects any
0136 broken transaction branches. Alternatively, after the system startup, manually invoke the
0137 *tmrecovercl(8)* command line tool to perform transaction collection and rollback.
0138
0139 By default, *tmsrv* writes log files to disk and uses the *fflush()* call to
0140 persist the data. This call only hands the data over to the operating system kernel
0141 and does not guarantee that the data is written to disk. Thus, if a power outage happens,
0142 some transaction information may be lost. For critical systems it is therefore
0143 recommended to use special flags which instruct *tmsrv* to perform disk
0144 synchronization when commit decisions have been taken. These flags shall be
0145 set in *NDRX_XA_FLAGS*: *FSYNC* or *FDATASYNC*, and *DSYNC*. *FSYNC* and
0146 *FDATASYNC* correspond to the *fsync()* and *fdatasync()* unix calls, which flush
0147 the transaction log data to disk. *DSYNC* ensures that the log file directory structure
0148 is flushed to disk. *FDATASYNC* is a bit faster than *FSYNC*, as it does not update the
0149 file last change time and other insignificant attributes. Usage of these flags may
0150 significantly reduce the transaction TPS performance. Whether *DSYNC* is needed depends
0151 on the operating system. It is necessary for Linux and Solaris operating systems.
0152 For other operating systems, please consult the vendor's manuals to find out whether a directory
0153 fsync is needed for new files to be persisted.
0154
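The flush levels discussed above can be sketched in Python on a POSIX system. The helper
'append_durably' and its parameters are hypothetical and do not show how *tmsrv* is
implemented; the sketch only illustrates which system call each flag name refers to:
a plain flush (data reaches the kernel only), *fsync()* or *fdatasync()* on the file,
and an *fsync()* on the directory so that a newly created file entry is persisted.

```python
import os
import tempfile


def append_durably(path, data, use_fdatasync=False, sync_dir=False):
    """Append data and force it to disk (illustrative helper, POSIX assumed)."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()  # user-space buffer -> kernel only, like fflush()
        if use_fdatasync and hasattr(os, "fdatasync"):
            os.fdatasync(f.fileno())  # file data, minimal metadata
        else:
            os.fsync(f.fileno())      # file data and metadata
    if sync_dir:
        # Flush the directory entry, so the new file itself survives a power loss.
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


with tempfile.TemporaryDirectory() as log_dir:
    log_path = os.path.join(log_dir, "example.log")
    append_durably(log_path, b"commit decided\n", use_fdatasync=True, sync_dir=True)
    with open(log_path, "rb") as f:
        print(f.read())  # prints b'commit decided\n'
```

Each extra synchronization step costs a round trip to the storage device, which is
consistent with the TPS reduction mentioned above.
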
0155 OPTIONS
0156 -------
0157 *-t* 'DEFAULT_TIMEOUT'::
0158 'DEFAULT_TIMEOUT' is the maximum transaction time-out in seconds, used when *tpbegin(3)* was
0159 called with 'timeout' *0*.
0160
0161 *-l* 'LOG_DIR'::
0162 'LOG_DIR' is the full path to the transaction log file directory. At startup, the process
0163 scans the directory for transaction files. The format of the file name
0164 is the following: 'TRN-<Cluster Node ID>-<RMID>-<Server ID>'. To move transactions
0165 to a different transaction manager, the log file should be renamed accordingly.
0166
0167 [*-s* 'SCAN_TIME']::
0168 Time in seconds between background thread cycles that perform transaction actions,
0169 i.e. the background thread sleeps for this time on every loop. The default is '10'.
0170
0171 [*-c* 'TIME_OUT_CHECK']::
0172 This is the periodic timer for active transaction time-out checks. The default is '1'.
0173
0174 [*-m* 'MAX_TRIES']::
0175 Maximum number of tries by the background thread to complete the whole transaction. Once the counter is reached,
0176 no more attempts to complete the transaction are made. The counter is restarted at
0177 *tmsrv* reboot. The default is '100'.
0178
0179 [*-r* 'XA_RETRIES']::
0180 This is the number of attempts made on the resource manager when it returns *XA_RETRY* or *XAER_RMFAIL*
0181 during the commit or other types of operations (in the case of *XAER_RMFAIL*).
0182 For example, suppose *tpcommit()* has been issued and one of the involved databases returns
0183 *XA_RETRY*. If '-r' is set above 2, then during the processing
0184 of 'tpcommit()', the XA commit to the database is retried one more time.
0185 If *XA_RETRY* is returned again for the third time, then *TPEHAZARD* is returned to the caller,
0186 the transaction is moved to the background thread, and it is then processed
0187 with up to '-m' tries. Here, too, every '-m' try that gets *XA_RETRY*/*XAER_RMFAIL* is
0188 multiplied by '-r' attempts. The default value is '3'.
0189
0190 [*-p* 'THREAD_POOL_SIZE']::
0191 This is the number of threads processing incoming requests. If all threads are busy, the
0192 job is queued internally. Some databases are known to process some of
0193 the XA operations slowly, for example 'xa_rollback', and multiple threads handle this
0194 more efficiently. *The default thread pool size is 10*. For more load balancing, it
0195 is recommended to start multiple *tmsrv* processes for the same RMID.
0196 Note that *tmsrv* runs with multiple threads, thus for Oracle DB the flag '+Threads=true'
0197 *MUST* be set in *NDRX_XA_OPEN_STR*.
0198 Otherwise, unexpected core dumps may be produced by *tmsrv*.
0199
0200 [*-P* 'PING_SECONDS']::
0201 Interval in seconds between database pings, performed either by xa_start with the TMJOIN flag or
0202 by xa_recover with the TMSTARTRSCAN and TMENDRSCAN flags. The xa_recover mode is enabled by the
0203 *-R* parameter. The *default* is xa_start. In the case of xa_start, the database is
0204 expected to return error code XAER_NOTA (transaction not found), as the call is made
0205 for a nonexistent XID, generated for each worker thread. For xa_recover, the operation is
0206 expected to succeed. If the operations deviate from the normal
0207 behavior, the re-connection procedure configured in *NDRX_XA_FLAGS* with the *RECON* tag is performed,
0208 i.e. the thread performs xa_close() and xa_open() and retries the operation. See the
0209 *ex_env(5)* manpage for the details. For quick reference, you may use the value
0210 'RECON:*:3:100', which performs 3 attempts on any error, sleeping 100 ms
0211 between attempts. *NDRX_XA_FLAGS* must be set in CC config or in the environment,
0212 and the number of attempts must be greater than 1. Otherwise, *tmsrv* will not boot
0213 with the *-P* flag set.
0214
0215 [*-R*]::
0216 Enable xa_recover() call for PINGs instead of xa_start(). See *-P* flag description.
0217
0218 [*-h* 'HOUSEKEEP_TIMEOUT']::
0219 The number of seconds after which corrupted transaction log files are removed at
0220 tmsrv startup or at later load attempts set by *-a*. The default value is *5400* (1 hour 30 min).
0221 If the *-X* flag is enabled, the broken log file housekeeping is performed regularly.
0222
0223 [*-n* 'ENDUROX_NODE_ID']::
0224 Indicates the virtual Enduro/X cluster node ID. If the parameter is not set,
0225 it defaults to the local application domain node ID set in the *NDRX_NODEID*
0226 environment variable. The parameter is normally set to some common cluster node ID
0227 number for singleton process group operations, so that in case of group failover,
0228 the tmsrv server would read, recognize and process the failed (other) node's
0229 transaction logs from shared storage. Additionally, it is required that the <srvid>
0230 of the <server> tag in *ndrxconfig.xml(5)* used by the tmsrv instance matches the
0231 failed node's <srvid> value. The same applies to the *NDRX_XA_RES_ID* setting from
0232 *ex_env(5)*: it must match the failed node's *NDRX_XA_RES_ID* setting.
0233 If the parameters do not match, the log entries from shared storage are
0234 silently ignored.
0235
0236 [*-X* 'NR_OF_SECONDS']::
0237 Number of seconds for periodic verification of the transaction log files. The check verifies
0238 that there are no transaction logs on the disk that have not yet been loaded. Normally
0239 that shall never happen. However, if working with singleton groups
0240 and the infrastructure does not use a STONITH device, or the shared fs used for logs does not become
0241 read-only or unavailable at the moment of node failure, then at failover,
0242 under a coincidence of circumstances (such as a pause/resume cycle of the failed
0243 machine), there is a chance that for a short period
0244 duplicate *tmsrv* processes run, which might introduce transaction log files unknown to the currently active
0245 *tmsrv*. When *-X* is
0246 enabled (is greater than *0*), *tmsrv* scans the log file directory with *-s*
0247 granularity. If any unknown log is found, *tmsrv* marks that
0248 transaction log file for loading at the next check cycle. If the log fails
0249 to load, additional attempts to load the log file are made on
0250 the next cycles (the number of attempts is set by *-a*). This parameter also activates
0251 periodic housekeeping of the transaction log files,
0252 configured by the *-h* setting. Housekeeping is needed to release any
0253 prepared transactions in resource managers for which a broken transaction log
0254 is stored on the disk, as such a log prevents *tmrecoversv(8)* from zapping the prepared
0255 transactions properly. If the parameter is disabled, a broken log file on
0256 the disk does not prevent *tmrecoversv(8)* from zapping the prepared transaction.
0257 The default parameter value is *0* (disabled).
0258
0259 [*-a* 'NR_OF_ATTEMPTS']::
0260 The number of attempts made by the background thread to load incomplete transaction
0261 log files (left by any crashed process instances and not yet flushed to
0262 the file system by the OS). The default value is *1*. While the attempts are
0263 processed, housekeeping is additionally performed on the files (configured by the *-h* flag).
0264
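The interaction of the *-m*, *-r* and *-s* options described above can be put into numbers
with a small Python sketch. It assumes the worst case, where the resource manager keeps
answering *XA_RETRY* or *XAER_RMFAIL*, and it assumes that the background thread makes one
try per scan cycle; the function names are hypothetical and the result is an estimate,
not a guarantee of *tmsrv* behavior.

```python
def background_xa_call_budget(max_tries: int = 100, xa_retries: int = 3) -> int:
    """Worst-case XA calls per branch in the background: each -m try makes -r attempts."""
    return max_tries * xa_retries


def min_seconds_until_give_up(max_tries: int = 100, scan_time: int = 10) -> int:
    """Rough lower bound on elapsed time, assuming one try per -s scan cycle."""
    return max_tries * scan_time


# Defaults from this manpage: -m 100, -r 3, -s 10.
print(background_xa_call_budget())   # prints 300
print(min_seconds_until_give_up())   # prints 1000
```
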
0265 XA RECOVER SETTINGS FOR ORACLE DB
0266 ---------------------------------
0267 The *-R* mode might not be enabled in the database for the user, i.e. the user is not allowed
0268 to see open transactions. In that case it must be enabled by running the following commands for the DB user
0269 set in the XA open string:
0270
0271 --------------------------------------------------------------------------------
0272 grant select on pending_trans$ to <database_user>;
0273 grant select on dba_2pc_pending to <database_user>;
0274 grant select on dba_pending_transactions to <database_user>;
0275 grant execute on dbms_system to <database_user>;  -- if using Oracle 10.2
0276 grant execute on dbms_xa to <database_user>;  -- if using Oracle 10.2
0277 --------------------------------------------------------------------------------
0278
0279
0280 ORACLE RAC SETTINGS
0281 -------------------
0282 When using Oracle RAC, to successfully process a distributed transaction
0283 across binaries which are connected to different RAC nodes, an Oracle RAC Singleton
0284 Service must be configured so that only one node actively serves the transactions.
0285 This ensures XA affinity.
0286
0287 Typically, on grid infrastructure, that can be configured as:
0288
0289 --------------------------------------------------------------------------------
0290
0291 $ srvctl add service -db RACDB -service XARAC -preferred RAC1 \
0292   -available RAC2
0293
0294 --------------------------------------------------------------------------------
0295
0296 For policy-based RAC cluster management, use:
0297
0298 --------------------------------------------------------------------------------
0299
0300 $ srvctl add service -db RACDB -service XARAC -serverpool xa_pool \
0301   -cardinality SINGLETON
0302
0303 --------------------------------------------------------------------------------
0304
0305 *NOTE:* The *-dtp* option shall be left at its default, which is *FALSE*.
0306
0307 If the above is not configured and, say, two binaries are working with the same XA
0308 transaction, one binary connected to the first RAC node and the other to the second RAC node,
0309 the transaction will not work, as the XA API does not see the transaction on any
0310 node other than the one where it was started, and the following error is generated:
0311
0312 --------------------------------------------------------------------------------
0313
0314 ORA-24798: cannot resume the distributed transaction branch on another instance
0315
0316 --------------------------------------------------------------------------------
0317
0318
0319 For more details, consult the Oracle documentation, as Enduro/X basically uses the
0320 plain X/Open XA API for managing the transactions, and it is expected that
0321 Oracle DB provides support for the XA API.
0322
0323 LIMITATIONS
0324 -----------
0325 When using dynamic registration XA switches with the *RECON* XA flag functionality,
0326 to keep the process working if communications are lost while executing non-XA AP code,
0327 e.g. SQL statements, the process itself must perform *tpclose(3)*/*tpopen(3)* until
0328 it succeeds, or the process shall exit so that Enduro/X restarts it.
0329 This extra logic is needed because, if communications are lost outside of the XA API,
0330 Enduro/X by itself does not see that the comms status has changed, because ax_start()
0331 is executed only when the resource is modified by the application.
0332 If comms are not working in the application, the resource is not modified and
0333 thus ax_start() is not invoked.
0334
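The close/open loop that the application is expected to run in this situation can be
sketched as follows. This is a minimal Python illustration under stated assumptions:
'close_fn' and 'open_fn' are hypothetical stand-ins for *tpclose(3)* and *tpopen(3)* that
return 0 on success and -1 on failure, and 'FlakyResource' only simulates a connection
that recovers after a few failures. The attempt limit and delay are example values.

```python
import time


def reopen_until_success(close_fn, open_fn, max_attempts=5, delay_s=0.1, sleep=time.sleep):
    """Close and re-open the resource until open succeeds.

    Returns the number of attempts used, or 0 if all attempts failed
    (at which point the process would exit and let Enduro/X restart it).
    """
    for attempt in range(1, max_attempts + 1):
        close_fn()  # result ignored: the connection may already be gone
        if open_fn() == 0:
            return attempt
        sleep(delay_s)
    return 0


class FlakyResource:
    """Simulated resource whose open() fails a given number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.opened = False

    def close(self):
        self.opened = False
        return 0

    def open(self):
        if self.failures > 0:
            self.failures -= 1
            return -1
        self.opened = True
        return 0


res = FlakyResource(failures=2)
attempts = reopen_until_success(res.close, res.open, sleep=lambda s: None)
print(attempts)  # prints 3
```
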
0335 When a process joins the transaction (either the initiator or a participating XATMI server), it first
0336 registers with *tmsrv* and only then performs the xa_start() API call. If the transaction
0337 expires at *tmsrv* while the joining process has not yet called xa_start(),
0338 an orphan transaction may be created (i.e. an active transaction is created
0339 in the resource, but it is not managed by Enduro/X, as it is already rolled back there).
0340 To overcome this limitation, careful transaction timeout planning shall be performed, covering
0341 both the tpbegin() setting and the resource's timeout setting for inactive transactions.
0342
0343 If a transaction expires at *tmsrv*, this does not terminate any *tpcall(3)* operations,
0344 except when the called service's associated resource manager is not registered with the given
0345 global transaction.
0346
0347 EXIT STATUS
0348 -----------
0349 *0*::
0350 Success
0351
0352 *1*::
0353 Failure
0354
0355 BUGS
0356 ----
0357 Report bugs to support@mavimax.com
0358
0359 If the log directory (*-l*) is located on a Linux *ext4* file system
0360 and the FSYNC/FDATASYNC/DSYNC flags are used, the transaction manager
0361 might perform much slower than the physical hard disk is capable of.
0362 Instead, it is recommended to use the *xfs* file system on Linux,
0363 which performs better.
0364
0365 SEE ALSO
0366 --------
0367 *ex_env(5)* *buildtms(8)* *xadmin(8)* *tmrecoversv(8)* *tmrecovercl(8)*
0368
0369 COPYING
0370 -------
0371 (C) Mavimax, Ltd
0372