Yingdi Yu | cb5c551 | 2014-08-03 22:31:40 -0700 | [diff] [blame^] | 1 | Design Document |
| 2 | =============== |
| 3 | |
| 4 | |
| 5 | Leaves |
| 6 | ------ |
| 7 | |
| 8 | Leaf Node |
| 9 | ~~~~~~~~~ |
| 10 | |
| 11 | ``Leaf`` is the base class which contains NameInfo and SeqNo. |
| 12 | |
| 13 | NameInfo |
| 14 | ++++++++ |
| 15 | |
| 16 | ``NameInfo`` is an abstract class and has only one implementation class ``StdNameInfo``. |
| 17 | Name is managed as a string in ``StdNameInfo``. |
| 18 | |
| 19 | ``StdNameInfo`` also maintains an internal table of all created ``StdNameInfo`` objects. |
| 20 | (**NOT CLEAR WHY**) |
| 21 | |
| 22 | SeqNo |
| 23 | +++++ |
| 24 | |
| 25 | ``SeqNo`` contains two pieces of information ``m_session`` and ``m_seq``. |
| 26 | |
| 27 | ``m_session`` (uint64) is Session ID (e.g., after crash, application will choose new session ID. |
| 28 | Session IDs for the same name should always increase. |
| 29 | |
| 30 | ``m_seq`` (uint64) is Sequence Number for a session. |
| 31 | It always starts with 0 and goes to max value. |
| 32 | (**For now, wrapping sequence number after max to zero is not supported**) |
| 33 | |
| 34 | ``SeqNo`` also contains a member ``m_valid``. |
| 35 | It seems that ``SeqNo`` is "invalid" if ``m_seq`` is zero. |
| 36 | |
| 37 | Full Leaf Node |
| 38 | ~~~~~~~~~~~~~~ |
| 39 | |
| 40 | ``FullLeaf`` inherits from ``Leaf``. |
| 41 | Compared to ``Leaf``, ``FullLeaf`` can calculate digest of a leaf. |
| 42 | |
| 43 | Digest is calculated as ``H(H(name)|H(session|seq))``. |
| 44 | |
| 45 | Diff Leaf Node |
| 46 | ~~~~~~~~~~~~~~ |
| 47 | |
| 48 | ``DiffLeaf`` inherits from ``Leaf``. |
| 49 | Compared to ``Leaf``, ``DiffLeaf`` contains information about operation: Update & Remove. |
| 50 | |
| 51 | |
| 52 | State (Tree) |
| 53 | ------------ |
| 54 | |
| 55 | State |
| 56 | ~~~~~ |
| 57 | |
| 58 | ``State`` is an abstract class of digest tree. |
| 59 | This class defines two methods ``update`` and ``remove``. |
| 60 | It has only one implementation class ``FullState``. |
| 61 | |
| 62 | FullState |
| 63 | ~~~~~~~~~ |
| 64 | |
| 65 | ``FullState`` implements the interface defined by ``State`` and also support digest calculation. |
| 66 | Digest is calculated as: |
| 67 | |
| 68 | 0. If the tree is empty, the digest is directly set to ``00``. |
| 69 | 1. Sort all leaves according to names (in std::string order, see std::string::compare). |
| 70 | 2. ``H(H(l1)|H(l2)|...|H(ln))`` |
| 71 | |
| 72 | |
| 73 | |
| 74 | ``FullState`` also records the timestamp of the last update. |
| 75 | |
| 76 | DiffState |
| 77 | ~~~~~~~~~ |
| 78 | |
| 79 | ``DiffState`` can be considered as a "commitment", which includes a set of ``DiffLeaves``. |
| 80 | ``DiffState`` is maintained as a list. |
| 81 | Each ``DiffState`` object has a pointer to the next diff. |
| 82 | Each ``DiffState`` object also has a digest which corresponds to its full state. |
| 83 | |
| 84 | |
| 85 | SyncLogic |
| 86 | --------- |
| 87 | |
| 88 | ``SyncLogic`` is the core class of ChronoSync. |
| 89 | |
| 90 | Initialization |
| 91 | ~~~~~~~~~~~~~~ |
| 92 | |
| 93 | ``SyncLogic`` requires a ``m_syncPrefix``, a callback ``m_onUpdate`` to handle notification of state update, and a callback ``m_onRemove`` to handle notification of leaf removal. |
| 94 | |
| 95 | When created, ``SyncLogic`` will listen to ``m_syncPrefix`` to receive sync interest. |
| 96 | A ``m_syncInterestTable`` is created to hold incoming sync interest with the same state digest as the receiving instance. |
| 97 | |
| 98 | ``SyncLogic`` also has a ``m_perBranch`` boolean member and a ``m_onUpdateBranch`` callback member. |
| 99 | (**NOT CLEAR ABOUT THE USAGE**. It seems that if ``m_perBranch`` is set, ``m_onUpdateBranch`` will be called, but ``m_onRemove`` and ``m_onUpdate`` will not be called.) |
| 100 | |
| 101 | Working Flow (onData) |
| 102 | ~~~~~~~~~~~~~~~~~~~~~ |
| 103 | |
| 104 | ``SyncLogic`` periodically expresses sync interest which contains its own state digest. |
| 105 | Sync interest is constructed by concatenating the ``m_syncPrefix`` with the hex-encoded state digest. |
| 106 | The ``MustBeFresh`` field of the sync interest must be set. |
| 107 | When a sync interest is sent out, a timer is set up to re-express sync interest. |
| 108 | The value of timer is set to ``4,000,000 + rand(100, 500)`` milliseconds. |
| 109 | |
| 110 | Sync interest is associated with two callbacks: ``onSyncData`` and ``onSyncTimeout``. |
| 111 | ``onSyncTimeout`` does nothing, because interest re-expressing is handled by a scheculer. |
| 112 | ``onSyncData`` validates the data. Invalid data will be dropped. |
| 113 | If data can be validated, the packet will go through following process: |
| 114 | |
| 115 | 1. ``SyncLogic`` deletes interests in ``m_syncInterestTable`` which can be satisfied by the sync data, because these interests |
| 116 | in ``m_syncInterestTable`` must have been satisfied before the data packet is received. |
| 117 | |
| 118 | 2. ``SyncLogic`` decodes the sync data into a ``DiffState`` object. ``DiffLeaves`` in ``DiffState`` is sorted according to name. |
| 119 | ``SyncLogic`` applies each ``DiffLeaf`` in the sorted order. If the operation in ``DiffLeaf`` changes the status, ``SyncLogic`` |
| 120 | will generate its own diff log which is also a ``DiffState`` object. If ``SyncLogic`` detect that the received sync data has some |
| 121 | information that it is missing, it will prepare a ``MissingDataInfo`` which includes ``prefix`` (name), ``low`` (lowest missing SeqNo), |
| 122 | and ``high`` (highest missing SeqNo), and call ``m_onUpdate`` on it. |
| 123 | If the content of sync data cannot be parsed, the packet is dropped. |
| 124 | |
| 125 | 3. ``SyncLogic`` compares the sync data name with ``m_outstandingInterestName`` which is the name of the outstanding sync interest |
| 126 | sent by the instance. If so, ``SyncLogic`` expresses new sync interest after sync data processing is done, otherwise does nothing |
| 127 | |
| 128 | Working Flow (onInterest) |
| 129 | ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 130 | |
| 131 | ``SyncLogic`` also expects to receive sync interest from others. |
| 132 | If a sync interest is received, the packet will be processed as following: |
| 133 | |
| 134 | 1. ``SyncLogic`` checks whether the interest is a normal interest or recovery interest (which will be described later). |
| 135 | If data is a normal sync data packet, go to Step 2, otherwise go to Step 7. |
| 136 | |
| 137 | 2. Receiving a normal sync interest, check if its digest is ``00``. If so, go to Step 3, otherwise go to Step 4. |
| 138 | |
| 139 | 3. Receiving an initial sync interest (sending instance has an empty sync tree). If receiving ``SyncLogic`` 's sync tree is not empty, |
| 140 | encode the whole sync tree into response (sync data), otherwise both sender and receiver are in the same status, put the interest into |
| 141 | ``m_syncInterestTable``. |
| 142 | |
| 143 | 4. Receiving a sync interest which contains some state. Check if the sender and receiver are in the same status, if so, put the interest |
| 144 | into ``m_syncInterestTable``. Otherwise check if sender's state exist in receiver's log, if so, merge all the difference into one |
| 145 | ``DiffState`` and encode it into response (sync data). If the receiver cannot recognize the sender's state, go to Step 5. |
| 146 | |
| 147 | 5. Check if immediate recovery is needed (whether ``timedProcessing`` is set), if so, express recovery interest immediately (go to Step 6). |
| 148 | Otherwise check if the interest exist in ``m_syncInterestTable``. If the interest has not been inserted into the table before, |
| 149 | process the interest again after ``rand(200, 100)`` milliseconds but with ``timedProcessing set``. If the interest exist in the table, |
| 150 | insert it again simply refreshes the table and defers the interest processing for another random period. It is guaranteed that a recovery |
| 151 | interest will be sent out when the interest is processed again. |
| 152 | |
| 153 | 6. Send recovery interest. A recovery interest is constructed by inserting a name component ``recovery`` between the ``m_syncPrefix`` and |
| 154 | the received digest. Default recovery interest re-expressing timer is 200 milliseconds with a jitter ``rand(100, 500)`` milliseconds. |
| 155 | Every time when recovery interest times out, the timer will be doubled until is longer than 100 seconds. |
| 156 | |
| 157 | 7. Receiving a recovery sync interest, check if its digest is recongized. If not, drop the interest, otherwise, encode the sync tree in the |
| 158 | response. |
| 159 | |
| 160 | Name Publishing |
| 161 | ~~~~~~~~~~~~~~~ |
| 162 | |
| 163 | ``addLocalName`` method is used to update leaf nodes. |
| 164 | Caller needs to supply prefix (name), session, and seq. |
| 165 | The 3-tuple will trigger state changes and sync interest/data exchange. |
| 166 | |
| 167 | Data Validation (in paper, not implemented) |
| 168 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 169 | |
| 170 | Each leaf as a sibling for endorsement. |
| 171 | The name of the sibling is constructed by concatenating the original leaf's name with a name component ``ENDORSE``. |
| 172 | ``SyncLogic``, when parsing the returned data, will automatically fetch & process ``ENDORSE`` data. |
| 173 | |
| 174 | Unresolved Issue |
| 175 | ~~~~~~~~~~~~~~~~ |
| 176 | |
| 177 | 1. Remove leaf |
| 178 | |
| 179 | 2. SeqNo wrapp-up |
| 180 | |
| 181 | SyncSocket |
| 182 | ---------- |
| 183 | |
| 184 | ``SyncSocket`` is a wrapper of ``SyncLogic``. It can be viewed as an interface to a leaf node. |
| 185 | Through SyncSocket, user only needs to publish its data. |
| 186 | ``SyncSocket`` will maitain corresponding sequence number, and trigger synchronization throug ``SyncLogic::addLocalName`` method. |
| 187 | ``SyncSocket`` is also responsible for actual data publishing. |