summaryrefslogtreecommitdiff
path: root/bgpd/bgp_fsm.c
AgeCommit message (Collapse)Author
2009-07-28bgpd: fd leak in bgpdSteve Hill
* bgp_fsm.c: I have found an fd leak in bgpd that is caused by the 'new' Clearing state. I've been seeing it from hold timer failures, but it can also be triggered by other things. When Hold_Timer_expired fires in Established state, a notify is sent and BGP_Stop event queued. The fsm then transitions into Clearing state. That is the problem; When the BGP_Stop event is serviced, the state table says to ignore it while in Clearing. Thus bgp_stop is not called and the fd leaks. Previously the peer would be in Idle state, which correctly handles the BGP_Stop event. Fix by making bgp_stop safe to call from Clearing state, without losing ClearingCompleted events, and then ensuring it is called prior to transition from Clearing->Idle.
2009-06-18[bgpd/cleanup] Make BGP FSM table read-only staticStephen Hemminger
The finite state machine table is immutable.
2009-06-12[cleanup] functions taking no args should be declared with void argsStephen Hemminger
Use Ansi-C prototypes rather than old K&R method of declaring function without arguments
2008-08-26Revert "[bgpd] Add 'bgp open-accept' option, to send OPEN immediately on ↵Paul Jakma
accepted conns" Revert commit d664ae1182c29b74b409bc8594b7bd0575e91ce9. An experimental patch which violates RFC4271 quite badly, but managed to accidently sneak its way in.
2008-08-22[bgpd] Add 'bgp open-accept' option, to send OPEN immediately on accepted connsPaul Jakma
2007-08-31 Paul Jakma <paul.jakma@sun.com> * (general) Add 'bgp open-accept' option, to allow bgpd to send OPEN on accepted connections, i.e. to not wait till after collision-detect to send OPEN, which appears to be allowed in RFC4271. This may help speed up establishing sessions, or help avoid FSM problems with sessions to certain peers. Not enabled by default though.
2007-06-22[bgpd] bug #368: Fix possible loop between peers going Idle<->OpenSentPaul Jakma
2007-06-22 Paul Jakma <paul.jakma@sun.com> * bgp_fsm.c: (struct FSM) Bug #368. TCP Errors during OpenSent should cycle to Active, not to Idle or else peer bringup can race and cycle Idle<->Active. Reported and fix tested by Mukesh Agrawal.
2007-02-22[bgpd] Peer delete can race with reconfig leading to crashPaul Jakma
2007-02-22 Paul Jakma <paul.jakma@sun.com> * bgp_fsm.c: (bgp_fsm_change_status) Handle state change into clearing or greater here. Simpler. (bgp_event) Clearing state change work moved to previous * bgp_route.c: (bgp_clear_route_node) Clearing adj-in here is too late, as it leaves a race between a peer being deleted and an identical peer being configured before clearing completes, leading to a crash. Simplest fix is to clean peers Adj-in up-front, rather than queueing such work. (bgp_clear_route_table) Clear peer's Adj-In and Adj-Out up-front here, rather than queueing such work. Extensive comment added on the various bits of indexed data that exist and how they need to be dealt with. (bgp_clear_route) Update comment.
2006-12-08[bgpd] Bug #302, bgpd can get stuck in state ClearingPaul Jakma
2006-12-07 Paul Jakma <paul.jakma@sun.com> * bgp_fsm.c: Bug #302 fix, diagnosis, suggestions and testing by Juergen Kammer <j.kammer@eurodata.de>. Fix follows from his suggested fix, just made in a slightly different way. (bgp_event) Transitions into Clearing always must call bgp_clear_route_all(). (bgp_stop) No need to clear routes here, BGP FSM should do it.
2006-10-15[bgpd] Bug #302 fixes. ClearingCompleted event gets flushed, leaving peers ↵Paul Jakma
stuck in Clearing. 2006-10-14 Paul Jakma <paul.jakma@sun.com> * bgp_fsm.h: Remove BGP_EVENT_FLUSH_ADD, dangerous and not needed. * bgp_fsm.c: (bgp_stop) Move BGP_EVENT_FLUSH to the top of the of the function, otherwise it could flush a ClearingCompleted event, bug #302. * bgp_packet.c: Replace all BGP_EVENT_FLUSH_ADD with BGP_EVENT_ADD, fixing bug #302.
2006-09-14[bgpd] simplify peer refcounts, squash slow peer leakPaul Jakma
2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) fix the peer refcount issue exposed by previous, by just removing refcounting of peer threads, which is mostly senseless as they're references leading from struct peer, which peer_free cancels anyway. No need to muck around.. * bgp_fsm.h: Just remove the refcounting from the various TIMER/READ/WRITE/EVENT ON/OFF/ADD macros. * bgp_fsm.c: (bgp_stop) use BGP_EVENT_FLUSH, no refcounts attached to events anymore. (bgp_event) remove peer_unlock, events not refcounted. * bgpd.c: (peer_free) flush events before free.
2006-09-14[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted statesPaul Jakma
2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-07-02[bgpd] Fix crash on shutdown of peerPaul Jakma
2006-07-02 Paul Jakma <paul.jakma@sun.com> * bgp_fsm.c: (bgp_{stop,start}) Move clear/free of certain bits of state from stop to start, as they may be used via peer references on clearing queues..
2006-02-21[bgpd] Record afi/safi in bgp_table. Serialise peer clear with FSM.Paul Jakma
2006-02-21 Paul Jakma <paul.jakma@sun.com> * bgpd.h: move the clear_node_queue to be peer specific. Add a new peer status flag, PEER_STATUS_CLEARING. * bgp_table.h: (struct bgp_table) Add fields to record afi, safi of the table. (bgp_table_init) Take afi and safi to create table for. * bgp_table.c: (bgp_table_init) record the afi and safi. * bgp_nexthop.c: Update all calls to bgp_table_init. * bgp_vty.c: ditto. * bgpd.c: ditto. * bgp_fsm.c: (bgp_timer_set) dont bring up a session which is clearing. * bgp_route.c: (general) Update all bgp_table_init calls. (bgp_process_{rsclient,main}) clear_node is serialised via PEER_STATUS_CLEARING and fsm now. (struct bgp_clear_node_queue) can be removed. struct bgp_node can be the queue item data directly, as struct peer can be kept in the new wq global user data and afi/safi can be retrieved via bgp_node -> bgp_table. (bgp_clear_route_node) fix to get peer via wq->spec.data, afi/safi via bgp_node->bgp_table. (bgp_clear_node_queue_del) no more item data to delete, only unlock the bgp_node. (bgp_clear_node_complete) only need to unset CLEARING flag and unlock struct peer. (bgp_clear_node_queue_init) queue attaches to struct peer now. record peer name as queue name. (bgp_clear_route_table) If queue transitions to active, serialise clearing by setting PEER_STATUS_CLEARING rather than plugging process queue, and lock peer while queue active. Update to pass only bgp_node as per-queue-item specific data.
2005-06-282005-06-28 Paul Jakma <paul.jakma@sun.com>paul
* (global) The great bgpd extern and static'ification. * bgp_routemap.c: remove unused ROUTE_MATCH_ASPATH_OLD code (route_set_metric_compile) fix u_int32_t to ULONG_MAX comparison warnings. * bgp_route.h: (bgp_process, bgp_withdraw, bgp_update) export these used by various files which had their own private declarations, in the case of mplsvpn - incorrect.
2005-06-012005-06-01 Paul Jakma <paul.jakma@sun.com>paul
* bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-05-192005-05-19 Paul Jakma <paul@dishone.st>paul
* bgp_fsm.c: (bgp_stop) use sockunion_free, not XFREE.. * bgp_network.c: (bgp_getsockname) ditto * bgp_routemap.c: (route_match_peer) ditto, als use a ret value and remove one sockunion_free. * bgpd.c: (peer_delete) ditto
2005-02-02 * bgp_fsm.c, bgp_open.c, bgp_packet.c, bgp_route.[ch], bgp_vty.c,hasso
bgpd.[ch]: Add BGP_INFO_STALE flag and end-of-rib support. "bgp graceful-restart" commands added. Show numbers of individual messages in "show ip bgp neighbor" command. Final pieces of graceful restart. [merge from GNU Zebra]
2005-02-01 * bgp_nexthop.c: Improve debug.hasso
* bgpd.[ch], bgp_nexthop.c, bgp_snmp.c: Remove useless bgp_get_master() function. * bgp_packet.c: MP AFI_IP update and withdraw parsing. * bgp_fsm.c: Reset peer synctime in bgp_stop(). bgp_fsm_change_status() is better place to log about peer status change than bgp_event(). Log in bgp_connect_success(). * bgp_vty.c: Fix typo in comment. * bgp_attr.c: Better log about unknown attribute. [merge from GNU Zebra]
2004-12-082004-12-08 Andrew J. Schorr <ajschorr@alumni.princeton.edu>ajs
* *.c: Change level of debug messages to LOG_DEBUG.
2004-10-132004-10-13 Paul Jakma <paul@dishone.st>paul
* (global) more const'ification and fixups of types to clean up code. * bgp_mplsvpn.{c,h}: (str2tag) fix abuse. Still not perfect, should use something like the VTY_GET_INTEGER macro, but without the vty_out bits.. * bgp_routemap.c: (set_aggregator_as) use VTY_GET_INTEGER_RANGE (no_set_aggregator_as) ditto. * bgpd.c: (peer_uptime) fix unlikely bug, where no buffer is returned, add comments about troublesome return value.
2004-05-21Merge graceful restart capability display and some small fixes from Zebrahasso
repository by Rivo Nurges.
2004-05-20Merge bgpd changeset 1176 from Zebra repository by Rivo Nurges.hasso
2004-05-032004-05-03 Daniel Roesen <dr@cluenet.de>paul
* bgp_fsm.c: (bgp_stop) Reset uptime only on transition from Established so that it reflects true downtime (rather time since last transition, eg Active->Idle)
2004-05-012004-05-01 Paul Jakma <paul@dishone.st>paul
* Revert the attempted clean-up of the dummy peer hack, reverts patchsets 435 (see 2004-02-17 below) and 456.
2004-02-172004-02-17 Paul Jakma <paul@dishone.st>paul
* bgpd.h: (bgp_peer) add fd_local and fd_accept file descriptor's, fd becomes a pointer to one of these. * bgpd.c: (global) adjust for fact that fd is now a pointer. (peer_create_accept) removed. * bgp_route.c: (global) adjust for change of peer fd to pointer * bgp_packet.c: (bgp_collision_detect) adjust and remove the "replace with other peer" hack. * bgp_network.c: (bgp_accept) Remove the dummy peer hack. Update peer->fd_accept instead. (global) Adjust fd references - now a pointer. * bgp_fsm.c: (global) adjust peer fd to pointer. (bgp_connection_stop) new function, to stop connection. (global) adjust everything which closed peer fd to use bgp_connection_stop().
2003-08-132003-08-13 kunihiro <kunihiro@zebra.org>paul
* bgpd/bgp{_fsm.c,_vty.c,d.c,d.h}: Add support for "bgp log-neighbor-changes" command.
2002-12-13Initial revisionpaul