  1. <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  2. <html>
  3. <head>
  4. <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
  5. <title>KSpread Development Notes</title>
  6. </head>
  7. <body>
  8. <h1>KSpread Development Notes</h1>
  9. <p>Maintainer: Ariya Hidayat (<a href="mailto:ariya@kde.org">ariya@kde.org</a>)</p>
  10. <p>Some portions by Tomas Mecir (<a href="mailto: mecirt@gmail.com">mecirt@gmail.com</a>)</p>
  11. <p>Revision: September 2004.</p>
  12. <h2>Introduction</h2>
<p>This document describes the internal structure of KSpread and collects
notes on the upcoming redesign. Its main sources are the discussions on the
koffice-devel mailing list and the source code itself.</p>
  17. <h2>Document/View Architecture</h2>
<p>Status: IN PROGRESS.</p>
  19. <p>MVC (Model/View/Controller) means that the application consists of three
  20. big parts, the <i>Model</i> which holds the data structure and objects,
  21. the <i>View</i> which shows the model to the user and the <i>Controller</i>
  22. which handles user inputs and changes the model accordingly. Like other
  23. office applications, KSpread uses the Document/View architecture, a slightly
  24. different variant of MVC where the View and Controller are put together
  25. as one part.</p>
<p>Given its complexity, KSpread code has to be cleanly separated into
the <i>Document</i> and the <i>View</i>, which we may also call the
<i>back-end</i> and the <i>front-end</i> respectively. Right now, code that
should belong to the Document sometimes has access to the View. For example,
a cell stores information about its metrics in pixels (which is zoom dependent),
knows whether it is visible to the user or not (which is view dependent), etc.
This needs to be changed.</p>
<p>One easy way to decide whether some data or relationship really belongs
in the Document is to imagine that somebody wants to create another
View (front-end) for the Document object model (back-end) that is being worked
on. Say someone would like to copy the look-and-feel of classic Lotus
1-2-3 (for whatever reason, which does not matter here); ideally
he or she could take most of the KSpread back-end unchanged and glue a new
user interface around it.</p>
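<p>The separation described in this section can be sketched as follows. This is
only an illustration: the names <tt>ColumnFormat</tt> and <tt>SheetView</tt> are
hypothetical and do not exist in KSpread, and the point-to-pixel conversion
assumes that one point maps to one pixel at 100% zoom.</p>

```cpp
#include <cmath>

// Hypothetical back-end type: holds only zoom-independent data.
struct ColumnFormat {
    double widthPt;  // column width in points
};

// Hypothetical front-end type: everything zoom dependent lives here.
class SheetView {
public:
    explicit SheetView(double zoom) : m_zoom(zoom) {}

    // Pixel width of a column as seen by this particular view.
    // Assumes 1 point maps to 1 pixel at 100% zoom, purely for illustration.
    int widthInPixels(const ColumnFormat& column) const {
        return static_cast<int>(std::lround(column.widthPt * m_zoom));
    }

private:
    double m_zoom;  // 1.0 means 100%
};
```

<p>Two views with different zoom levels can then share one document object,
and a second front-end would only need to provide its own view class.</p>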
  40. <h2>Dependency Handling</h2>
<p>Status: IN PROGRESS.</p>
  42. <p>When a cell holds a formula, then it is likely that it depends on other
  43. cell(s) for calculating the result. For example, if cell A11 has the formula
  44. &quot;=SUM(A1:A10)&quot;, this means that values in cells A1, A2, A3, until
  45. A10 must be correctly calculated first before the sum can be obtained for
  46. cell A11. This is called <i>dependency</i>.</p>
<p>For now, KSpread manages dependencies by storing the dependent
cells or ranges in the cell itself. This is not very efficient. If a cell
is very simple, i.e. it stores only a value and no formula, this scheme just
wastes a few bytes on pointers for the dependency data structure.
It is much wiser to create one <i>dependency manager</i> for each
worksheet; it should be responsible for maintaining and handling cell
dependencies for that sheet. KSpread also always stores both directions of
each relationship in the cell structure itself: the ranges that a cell depends
on and the ranges that depend on that cell. This is not necessary, as
the information is redundant. The dependency manager should be able to handle
both cases.</p>
  58. <p>Let us have a look at this simple example:</p>
  59. <table cellspacing="0" cellpadding="3" border="1">
  60. <tr>
  61. <td align="center">&nbsp;</td>
  62. <td align="center">A</td>
  63. <td align="center">B</td>
  64. <td align="center">C</td>
  65. <td align="center">D</td>
  66. </tr>
  67. <tr>
  68. <td align="center">1</td>
  69. <td align="right">14</td>
  70. <td align="right">36</td>
  71. <td align="right">&nbsp;</td>
  72. <td align="right">&nbsp;</td>
  73. </tr>
  74. <tr>
  75. <td align="center">2</td>
  76. <td align="right">3</td>
  77. <td align="right">&nbsp;</td>
  78. <td align="right">&nbsp;</td>
  79. <td align="right">&nbsp;</td>
  80. </tr>
  81. <tr>
  82. <td align="center">3</td>
  83. <td align="right">77</td>
  84. <td align="right">&nbsp;</td>
  85. <td align="right">&nbsp;</td>
  86. <td align="right">&nbsp;</td>
  87. </tr>
  88. <tr>
  89. <td align="center">4</td>
  90. <td align="right">=SUM(<b>A1:A3</b>)</td>
  91. <td align="right">=A4+SUM(<b>B1:B3</b>)</td>
  92. <td align="right">=100*<b>B4</b></td>
  93. <td align="right">&nbsp;</td>
  94. </tr>
  95. </table>
<p>Such a sheet should produce the following dependencies:</p>
  97. <table cellspacing="0" cellpadding="3" border="1">
  98. <tr>
  99. <td><b>Reference</b></td>
  100. <td><b>Dependent(s)</b></td>
  101. </tr>
  102. <tr>
  103. <td>A4</td>
  104. <td>A1:A3</td>
  105. </tr>
  106. <tr>
  107. <td>B4</td>
  108. <td>A4 and B1:B3</td>
  109. </tr>
  110. <tr>
  111. <td>C4</td>
  112. <td>B4</td>
  113. </tr>
  114. </table>
<p>When we want to recalculate cell B4, the dependencies shown above tell us
that we first need the values of cell A4 and range B1:B3. In turn,
cell A4 needs the values of the cells in range A1:A3. Therefore, <i>given one reference
cell</i> (e.g. B4), the dependency manager must be able to <i>return all
dependents, cells and/or ranges</i> (e.g. A4, B1:B3). Do we need to go
recursively when searching for dependencies? That depends on the
implementation, but it is not a big problem either way.</p>
  122. <p>In another case, say the user has changed cell A3 so we need to update the
  123. calculation. We should not recalculate the whole sheet because it wastes time.
  124. We just need to recalculate cells that depend on A3, in this case A4, B4 and C4.
  125. So the dependency manager has another responsibility: <i>given a cell</i>
  126. it should <i>find all cells and/or ranges which depend on that particular
  127. cell</i>. It is a matter of iterating over all dependencies and checking
  128. whether the cell is within the dependent(s) and returning the reference cell.
  129. In this example, cell A3 is in the range A1:A3, a dependent range of cell A4.
  130. Hence, we just return A4. Recursive or not, we can either continue finding
  131. dependents of A4 or just stop here.</p>
<p>Note also that the dependency manager should not store cell pointers, but rather
only the location of the cell (i.e. the sheet that owns the cell, row number and
column number). This is because in some cases the dependent cell may not exist
yet. As illustrated in the example, the dependents of cell B4 are A4, B1, B2 and B3,
but cells B2 and B3 are still empty. Of course, when we just want to
know which cells we need to recalculate for one reference cell, the dependency
manager is allowed to return only non-empty cells (e.g. A4 and B1 in our case),
as empty cells have no effect and will not be recalculated anyway.</p>
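<p>A minimal sketch of such a dependency manager is given here. All type and
method names (<tt>CellRef</tt>, <tt>RangeRef</tt>, <tt>setDependents()</tt>,
<tt>dependentsOf()</tt>, <tt>referencesTo()</tt>) are assumptions made for
illustration, not the actual KSpread API. It follows the terminology of the
table above (a reference cell and its dependents), stores locations instead of
cell pointers, and answers both queries by a plain linear search.</p>

```cpp
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Location of a cell: owning sheet, column and row. No cell pointer is stored.
struct CellRef {
    std::string sheet;
    int col;
    int row;
    bool operator<(const CellRef& o) const {
        return std::tie(sheet, col, row) < std::tie(o.sheet, o.col, o.row);
    }
    bool operator==(const CellRef& o) const {
        return sheet == o.sheet && col == o.col && row == o.row;
    }
};

// Rectangular range on one sheet; a single cell is a 1x1 range.
struct RangeRef {
    std::string sheet;
    int left, top, right, bottom;
    bool contains(const CellRef& c) const {
        return c.sheet == sheet && c.col >= left && c.col <= right
            && c.row >= top && c.row <= bottom;
    }
};

class DependencyManager {
public:
    // Record that the formula in 'reference' reads the given ranges.
    void setDependents(const CellRef& reference, std::vector<RangeRef> ranges) {
        m_deps[reference] = std::move(ranges);
    }

    // First query: which cells/ranges does 'reference' need for its result?
    std::vector<RangeRef> dependentsOf(const CellRef& reference) const {
        auto it = m_deps.find(reference);
        return it == m_deps.end() ? std::vector<RangeRef>() : it->second;
    }

    // Second query: which reference cells read 'changed' directly?
    std::vector<CellRef> referencesTo(const CellRef& changed) const {
        std::vector<CellRef> result;
        for (const auto& entry : m_deps) {
            for (const RangeRef& range : entry.second) {
                if (range.contains(changed)) {
                    result.push_back(entry.first);
                    break;  // list each reference cell only once
                }
            }
        }
        return result;
    }

private:
    std::map<CellRef, std::vector<RangeRef>> m_deps;
};
```

<p>With the example sheet loaded, <tt>referencesTo()</tt> for A3 yields only A4;
whether the caller continues from A4 to B4 and C4 is the recursion question
raised above.</p>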
<p>In the same manner, the dependency manager can also be held responsible when
charts come into play. Any chart placed in the sheet (these are actually KChart
parts) depends on the values of some cells. A user action that changes
those cells, directly or indirectly, should trigger an update of the respective
charts.</p>
<p>Inter-sheet dependencies can be handled well if we store the owner of
each dependent. This was left out of the explanation above to avoid
unnecessary complication. But let us have one example now: if Sheet2!A1 is
&quot;=SUM(Sheet1!A1:A10)&quot; then changing Sheet1!A1 (the dependent)
means updating Sheet2!A1 (the reference). Of course, during recalculation we
must make sure that all sheets in the document are processed, even though
only a single cell in one sheet has been changed.</p>
<p>Implementation-wise, there will be one instance of the dependency manager
for each sheet. This class will fully manage all dependencies and trigger cell
recalculation. An important part of this concept is that the cell itself knows
<em>nothing</em> about dependencies, and it does not care about them either.
The cell just reports that its value has changed,
and the dependency manager does the rest. In addition, this gives us
recursive dependency calculation at almost no cost.</p>
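<p>One way the recursive recalculation could be ordered is sketched here. It is
an assumption, not the KSpread implementation: cells are identified by plain
strings, the hypothetical <tt>ReaderMap</tt> stands in for the dependency
manager's reverse lookup, and the order is obtained from a depth-first walk
(reverse post-order). Circular references are not detected; the walk simply
never visits a cell twice.</p>

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical reverse index: for each cell, the cells whose formulas read it.
using ReaderMap = std::map<std::string, std::vector<std::string>>;

// Depth-first walk; a cell is appended only after all of its readers.
static void visit(const ReaderMap& readers, const std::string& cell,
                  std::set<std::string>& seen,
                  std::vector<std::string>& postOrder) {
    if (!seen.insert(cell).second) return;  // already visited
    auto it = readers.find(cell);
    if (it != readers.end())
        for (const std::string& reader : it->second)
            visit(readers, reader, seen, postOrder);
    postOrder.push_back(cell);
}

// Cells to recalculate after 'changed' was edited, each one listed
// after the cells it reads (assuming there are no circular references).
std::vector<std::string> recalcOrder(const ReaderMap& readers,
                                     const std::string& changed) {
    std::set<std::string> seen;
    std::vector<std::string> postOrder;
    visit(readers, changed, seen, postOrder);
    postOrder.pop_back();  // the changed cell itself comes last; drop it
    return std::vector<std::string>(postOrder.rbegin(), postOrder.rend());
}
```

<p>For the example sheet, changing A3 gives the order A4, B4, C4, so each
formula is evaluated exactly once and only after its inputs are up to date.</p>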
  159. <h2>Manipulators</h2>
<p>Status: PLANNED.</p>
<p>Currently, every operation on a cell or on a range of cells is quite complex.
You need to ensure correct repainting and recalculation, iterate over a range, and so on.</p>
<p>To address this issue, manipulators shall be implemented. A manipulator will
implement one operation (formatting change, sequence fill, and so on).</p>
  165. <p>Basically, usage of a manipulator should look like this:</p>
  166. <p><pre>
Manipulator *manip = manipulatorManager::self()->getManip ("seqfill");
manip->setArgument ("type", 1);
// ... (more setArgument() calls)
manip->exec (selection);
  171. </pre></p>
<p>That is all.</p>
<p>As for the implementation of a manipulator, you derive from the base
manipulator and reimplement the constructor and the methods initialize()
(called just before the operation starts), processCell(), and maybe
done(). The constructor or initialize() sets some properties for
the cell-walking algorithm and then does not need to care about it anymore.
The base class will walk the range and call processCell() for each
cell, possibly creating it if it does not exist (if the manipulator
wants that). There will also be some methods that can be used to process
the whole range or a row/column at once, if the manipulator wants to do so
(useful for, say, formatting manipulators that will be able to set attributes
of a whole range or row/column, in accordance with the thoughts about format storage
below).</p>
  185. <p>In addition, the manipulator can implement the undo/redo functionality
  186. - the base manipulator will provide some common stuff needed to
  187. accomplish this.</p>
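<p>The division of labour between the base manipulator and a derived one might
look like the following sketch. Only the method names initialize(),
processCell(), done() and exec() come from the text above; the
<tt>Cell</tt>, <tt>Range</tt> and <tt>Sheet</tt> types are simplified stand-ins
(a single column of numbers), and the signatures are assumptions.</p>

```cpp
#include <vector>

// Simplified stand-ins for KSpread types, for illustration only.
struct Cell { double value = 0.0; };
struct Range { int firstRow; int lastRow; };  // rows of one column, 1-based
using Sheet = std::vector<Cell>;              // element i holds row i + 1

class Manipulator {
public:
    virtual ~Manipulator() {}

    // The base class owns the cell walk; subclasses never write this loop.
    void exec(Sheet& sheet, const Range& range) {
        initialize();
        for (int row = range.firstRow; row <= range.lastRow; ++row)
            processCell(sheet[row - 1], row);
        done();
    }

protected:
    virtual void initialize() {}
    virtual void processCell(Cell& cell, int row) = 0;
    virtual void done() {}
};

// A sequence fill: start, start + step, start + 2 * step, ...
class SeqFill : public Manipulator {
public:
    SeqFill(double start, double step)
        : m_start(start), m_step(step), m_next(start) {}

protected:
    void initialize() override { m_next = m_start; }
    void processCell(Cell& cell, int) override {
        cell.value = m_next;
        m_next += m_step;
    }

private:
    double m_start, m_step, m_next;
};
```

<p>A caller would create <tt>SeqFill(1.0, 2.0)</tt> and call <tt>exec()</tt> on
rows 2 to 4 to obtain 1, 3 and 5 in those rows while all other rows stay
untouched.</p>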
  188. <h2>Selection handling</h2>
<p>Status: PLANNED.</p>
<p>The selection shall be an instance of some RangeList class,
or whatever we want to call it. Like the current selection, it will contain
a list of cells, ranges, rows and so on, but it will be able to hold
more entries. This will allow easy implementation of Ctrl-selections and the like,
because thanks to manipulators, each operation will automatically support them.</p>
  195. <h2>Repaint Triggering</h2>
<p>Status: PLANNED.</p>
<p>As mentioned above, the interface between the core and the GUI needs to be kept
to a minimum. Also, the number of repaints needs to be as low as possible, and repaints
should be grouped whenever possible. To achieve all this, the following approach
can be used:</p>
  201. <p>When a cell is changed, it calls some method in KSpread::Sheet - valueChanged()
  202. or formattingChanged(). These methods then trigger everything necessary, like
  203. a call to the painting routine or dependency calculation.</p>
<p>This simple system would work on its own, but it would be slow. If you do
a sequence fill on A1:A1000 and you have a SUM(A1:A1000) somewhere, why would
you want to compute that SUM 1000 times, when you can simply compute it once after
the sequence fill has finished? Hence, the sheet will offer some more
methods: disableUpdates(), enableUpdates(), rangeListChanged() and
rangeListFormattingChanged(). All these will be used (solely?) by manipulators,
preferably by the base manipulator class, so that we do not have to call these
functions in each operation. After a call to disableUpdates(), there will
be no repainting and no dependency calculation. Note that a call to
enableUpdates() will not cause any repaints either, as the sheet cannot remember
all the calls (the range information is lost). Hence, the base manipulator
class needs to call the correct rangeList*Changed method to trigger an
update in an efficient way. The base manipulator needs to be configurable by
the manipulators that derive from it, so that it knows whether it changed
the cells' content or their formatting.</p>
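<p>The effect of batching can be illustrated with a small stub. The method names
are the ones proposed above, but their signatures and the counting
<tt>Sheet</tt> class are assumptions; the stub only counts how often an update
(repaint plus dependency calculation) would be triggered.</p>

```cpp
#include <string>
#include <vector>

// Hypothetical sheet stub that only counts the updates it would trigger.
class Sheet {
public:
    // Called by a cell whenever its value changes.
    void valueChanged(const std::string& cell) {
        if (m_updatesEnabled) update({cell});
    }
    void disableUpdates() { m_updatesEnabled = false; }
    void enableUpdates() { m_updatesEnabled = true; }  // triggers nothing itself
    // Called once by the base manipulator after the whole operation.
    void rangeListChanged(const std::vector<std::string>& ranges) {
        update(ranges);
    }
    int updateCount() const { return m_updates; }

private:
    // Stands in for repainting and dependency calculation.
    void update(const std::vector<std::string>&) { ++m_updates; }

    bool m_updatesEnabled = true;
    int m_updates = 0;
};

// What a base manipulator might do around its cell walk.
void fillColumn(Sheet& sheet, int rows) {
    sheet.disableUpdates();
    for (int row = 1; row <= rows; ++row)
        sheet.valueChanged("A" + std::to_string(row));  // ignored while disabled
    sheet.enableUpdates();
    sheet.rangeListChanged({"A1:A" + std::to_string(rows)});
}
```

<p>Filling 1000 rows through <tt>fillColumn()</tt> results in a single update,
whereas 1000 unbatched valueChanged() calls would result in 1000.</p>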
  219. <h2>Formula Engine</h2>
<p>Status: FINISHED.</p>
<p>This formula engine is just an expression evaluator. To offer better
performance, the expression is first compiled into byte codes which will
be executed later by a virtual machine.</p>
<p>Before compilation, the expression is separated into pieces called tokens.
This step, also known as lexical analysis, takes place in one pass
and produces a sequence of tokens. The tokens are not stored; they are used only
for the subsequent step.
Tokens are supplied to the parser, also known as the syntax analyzer. In this
design, the parser is also the code generator: it generates
the byte codes which represent the expression.
Evaluating the formula expression then amounts to running the virtual
machine to execute the compiled byte codes. No further scanning or parsing
is performed during evaluation, which saves a lot of time.</p>
<p>The virtual machine itself (and of course the byte codes) is designed to be
as simple as possible. It is stack-based, i.e. the virtual
machine has an execution stack of values which is manipulated
as each byte code is processed. Besides the stack, there is a list of
constants (sometimes also called the "constants pool") that holds Boolean,
integer, floating-point or string values. When a certain byte code needs
a constant as its operand, it specifies an index which is used
to look up the constant in the constants pool.</p>
<p>There are only a few byte codes, just enough to perform calculations.
This is really minimalist, yet it does the job fairly well.
The following provides a brief description of each type of byte code.</p>
  245. <blockquote>
  246. <p><i>Nop</i> means no operation.</p>
<p><i>Load</i> loads a constant and pushes it to the stack. The constant is
found in the constants pool at the position given by 'index'; it can be a Boolean,
integer, floating-point or string value.</p>
<p><i>Ref</i> gets a value from a reference. The member variable 'index'
refers to a string value in the constants pool, i.e. the name of the reference.
Typically the reference is either a cell (e.g. A1), a range of cells (A1:B10)
or possibly a function name. Example: the expression A2+B2 will be compiled as:<br>
  254. Constants:<br>
  255. #0: "A2"<br>
  256. #1: "B2"<br>
  257. Codes:<br>
  258. Ref #0<br>
  259. Ref #1<br>
  260. Add
  261. </p>
<p><i>Function</i> calls a function; its operand is the number of function arguments.
Example: the expression "sin(x)" will be compiled as:<br>
  264. Constants:<br>
  265. #0: "sin"<br>
  266. #1: "x"<br>
  267. Codes:<br>
  268. Ref #0<br>
  269. Ref #1<br>
  270. Function 1
  271. </p>
<p><i>Neg</i> is a unary operator: a value is popped from the stack, negated and
pushed back to the stack. If it is not a number (i.e. it is a Boolean or a string), it
is converted first.</p>
<p><i>Add, Sub, Mul, Div and Pow</i> are binary operators: two values are popped from
the stack and processed (added, subtracted, multiplied, divided, or raised to a power) and
the result is pushed to the stack.</p>
<p><i>Concat</i> is a string operation: two values are popped from the stack (and converted
to strings if they are not string values), concatenated, and the result is
pushed to the stack.</p>
<p><i>Not</i> is a logical operation: a value is popped from the stack and its Boolean
negation is pushed to the stack. When it is not a Boolean value, it is
cast first.</p>
<p><i>Equal, Less, and Greater</i> are comparison operators: two values are
popped from the stack and compared appropriately. The result, which is a Boolean
value, is pushed to the stack. To keep things simple, there is no &quot;not equal&quot;
comparison because it can be expressed as &quot;equal&quot; followed by
&quot;not&quot; byte codes. The same goes for &quot;less than or equal to&quot; and
&quot;greater than or equal to&quot;.</p>
  290. </blockquote>
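<p>To make the stack-based execution concrete, here is a reduced sketch of such
a virtual machine. It is not the KSpread formula engine: the constants pool is
restricted to floating-point values, only the arithmetic byte codes are
covered (no Ref, Function, Concat, Not or comparisons), and the names
<tt>Op</tt>, <tt>Opcode</tt> and <tt>execute()</tt> are assumptions.</p>

```cpp
#include <cmath>
#include <vector>

enum class Op { Nop, Load, Neg, Add, Sub, Mul, Div, Pow };

struct Opcode {
    Op op;
    unsigned index;  // used by Load only: position in the constants pool
};

// Runs compiled byte codes on a value stack; assumes well-formed code.
double execute(const std::vector<Opcode>& codes,
               const std::vector<double>& constants) {
    std::vector<double> stack;
    for (const Opcode& code : codes) {
        if (code.op == Op::Nop) continue;
        if (code.op == Op::Load) {
            stack.push_back(constants[code.index]);
            continue;
        }
        if (code.op == Op::Neg) {
            stack.back() = -stack.back();
            continue;
        }
        // Binary operators: pop the right operand, combine into the left one.
        const double right = stack.back();
        stack.pop_back();
        double& left = stack.back();
        switch (code.op) {
            case Op::Add: left += right; break;
            case Op::Sub: left -= right; break;
            case Op::Mul: left *= right; break;
            case Op::Div: left /= right; break;
            case Op::Pow: left = std::pow(left, right); break;
            default: break;
        }
    }
    return stack.back();
}
```

<p>For example, the expression 2+3*4 corresponds to the constants 2, 3, 4 and
the codes Load #0, Load #1, Load #2, Mul, Add; running them leaves 14 on the
stack.</p>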
<p>The expression scanner is based on a finite state acceptor. The state denotes
the position of the cursor, e.g. inside a cell token, inside an identifier, etc.
A state transition is followed by emitting the associated token to the
result buffer. Rather than showing the state diagrams here, it is much more
convenient and less complicated to browse the scanner source code and try
to follow its algorithm from there.</p>
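<p>For readers who want a rough idea before opening the source, the following is
a much simplified scanner in the same spirit. It is a sketch under assumptions,
not the KSpread scanner: it knows only three states, treats a range such as
A1:A3 as a single identifier, and reports every other non-space character as a
one-character operator. The names <tt>Token</tt> and <tt>scan()</tt> are
hypothetical.</p>

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class TokenType { Number, Identifier, Operator };

struct Token {
    TokenType type;
    std::string text;
};

// Minimal finite state scanner: the state records what the cursor is inside.
std::vector<Token> scan(const std::string& expr) {
    enum class State { Start, InNumber, InIdentifier };
    std::vector<Token> tokens;
    State state = State::Start;
    std::string text;
    // One extra iteration with '\0' flushes the token still being built.
    for (std::size_t i = 0; i <= expr.size(); ++i) {
        const char ch = i < expr.size() ? expr[i] : '\0';
        const unsigned char uch = static_cast<unsigned char>(ch);
        if (state == State::InNumber) {
            if (std::isdigit(uch) || ch == '.') { text += ch; continue; }
            tokens.push_back({TokenType::Number, text});
            state = State::Start;
        } else if (state == State::InIdentifier) {
            if (std::isalnum(uch) || ch == ':' || ch == '!') { text += ch; continue; }
            tokens.push_back({TokenType::Identifier, text});
            state = State::Start;
        }
        // State::Start: decide which kind of token begins at this character.
        if (std::isdigit(uch)) {
            text = ch;
            state = State::InNumber;
        } else if (std::isalpha(uch)) {
            text = ch;
            state = State::InIdentifier;
        } else if (ch != '\0' && !std::isspace(uch)) {
            tokens.push_back({TokenType::Operator, std::string(1, ch)});
        }
    }
    return tokens;
}
```

<p>Scanning <tt>=SUM(A1:A3)*2.5</tt> with this sketch yields seven tokens:
<tt>=</tt>, <tt>SUM</tt>, <tt>(</tt>, <tt>A1:A3</tt>, <tt>)</tt>, <tt>*</tt>
and <tt>2.5</tt>.</p>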
<p>The parser is designed using a bottom-up parsing technique
based on Polish notation. Instead of ordering the tokens in suffix Polish
form, the parser (which is also the code generator) simply outputs
byte codes. In its operation, the parser requires knowledge of operator
precedence to correctly translate unparenthesized infix expressions and
thus requires the use of a syntax stack.</p>
  303. <p>The parser algorithm is given as follows:</p>
  304. <blockquote>
  305. Repeat the following steps:<br>
  306. Step 1: Get next token<br>
Step 2: If it is an identifier<br>
- push it to the syntax stack<br>
- generate "Ref"<br>
Step 3: If it is a Boolean, integer, float or string value<br>
- push it to the syntax stack<br>
- generate "Load"<br>
Step 4: If it is an operator<br>
- check for reduce rules<br>
- when no more rules apply, push the token to the syntax stack<br>
  317. </blockquote>
  318. <p>The reduce rules are:</p>
<p>Rule A: function argument<br>
if the token is a semicolon or a right parenthesis and the syntax stack looks as:<br>
  322. <ul type="square">
  323. <li>non-operator &lt;--- top</li>
  324. <li>operator ;</li>
  325. <li>non-operator</li>
  326. <li>operator (</li>
  327. <li>identifier</li>
  328. </ul>
  329. then reduce to
  330. <ul type="circle">
<li>non-operator</li>
  332. <li>operator (</li>
  333. <li>identifier</li>
  334. <li>increase number of function arguments</li>
  335. </ul>
  336. </p>
  337. <p>Rule B: last function argument<br>
  338. if syntax stack looks as:<br>
  339. <ul type="square">
  340. <li>operator )</li>
  341. <li>non-operator</li>
  342. <li>operator (</li>
  343. <li>identifier</li>
  344. </ul>
  345. then reduce to:<br>
  346. <ul type="circle">
  347. <li>non-operator</li>
<li>generate "Function" + number of function arguments</li>
  349. </ul>
  350. </p>
  351. <p>Rule C: function without argument<br>
  352. if syntax stack looks as:<br>
  353. <ul type="square">
  354. <li>operator )</li>
  355. <li>operator (</li>
  356. <li>identifier</li>
  357. </ul>
  358. then reduce to:<br>
  359. <ul type="circle">
  360. <li>non-operator (dummy)</li>
  361. </ul>
  362. </p>
  363. <p>Rule D: parenthesis removal<br>
  364. if syntax stack looks as:<br>
  365. <ul type="square">
  366. <li>operator (</li>
  367. <li>non-operator</li>
  368. <li>operator )</li>
  369. </ul>
  370. then reduce to:<br>
  371. <ul type="circle">
  372. <li>non-operator</li>
  373. </ul>
  374. </p>
  375. <p>Rule E: binary operator<br>
  376. if syntax stack looks as:<br>
  377. <ul type="square">
  378. <li>non-operator</li>
  379. <li>binary operator</li>
  380. <li>non-operator</li>
<li>and if the precedence of the binary operator in the syntax stack
is greater than or equal to the precedence of the token</li>
  383. </ul>
  384. then reduce to:<br>
  385. <ul type="circle">
  386. <li>non-operator</li>
<li>and generate the appropriate byte code for the binary operator</li>
  388. </ul>
  389. </p>
  390. <p>Rule F: unary operator<br>
  391. if syntax stack looks as:<br>
  392. <ul type="square">
  393. <li>non-operator</li>
  394. <li>unary operator</li>
  395. <li>operator</li>
  396. </ul>
  397. then reduce to:<br>
  398. <ul type="circle">
  399. <li>operator</li>
<li>and generate "Neg" if the unary operator is '-'</li>
  401. </ul>
  402. </p>
<p>The percent operator is a special case and is not handled by the rules above.
When the parser finds the percent operator, it checks whether there is a non-operator
token right before the percent. If so, the following code is generated:
<tt>load 0.01</tt> followed by <tt>multiply</tt>.</p>
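<p>The core idea of emitting byte codes directly while consulting operator
precedence can be sketched as follows. This is a simplified operator-stack
variant, not a transcription of rules A to F: operands are single characters
(a digit becomes "Load", a letter becomes "Ref"), only binary operators and
parentheses are supported, and functions, unary operators and the percent
operator are left out. The precedence values and the names
<tt>precedence()</tt>, <tt>byteCodeFor()</tt> and <tt>compile()</tt> are
assumptions.</p>

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical precedence table; anything else (such as '(') gets 0.
static int precedence(char op) {
    switch (op) {
        case '^': return 3;
        case '*': case '/': return 2;
        case '+': case '-': return 1;
        default: return 0;
    }
}

static const char* byteCodeFor(char op) {
    switch (op) {
        case '+': return "Add";
        case '-': return "Sub";
        case '*': return "Mul";
        case '/': return "Div";
        default: return "Pow";
    }
}

// Compiles an infix expression into a list of byte code names.
std::vector<std::string> compile(const std::string& expr) {
    std::vector<std::string> codes;
    std::vector<char> syntaxStack;  // pending operators and '('
    // Emit stacked operators whose precedence is at least 'minPrec',
    // stopping at an open parenthesis.
    auto reduce = [&](int minPrec) {
        while (!syntaxStack.empty() && syntaxStack.back() != '('
               && precedence(syntaxStack.back()) >= minPrec) {
            codes.push_back(byteCodeFor(syntaxStack.back()));
            syntaxStack.pop_back();
        }
    };
    for (char ch : expr) {
        const unsigned char uch = static_cast<unsigned char>(ch);
        if (std::isdigit(uch)) {
            codes.push_back(std::string("Load ") + ch);
        } else if (std::isalpha(uch)) {
            codes.push_back(std::string("Ref ") + ch);
        } else if (ch == '(') {
            syntaxStack.push_back(ch);
        } else if (ch == ')') {
            reduce(0);                                         // flush to '('
            if (!syntaxStack.empty()) syntaxStack.pop_back();  // drop the '('
        } else if (precedence(ch) > 0) {
            reduce(precedence(ch));  // compare with rule E above
            syntaxStack.push_back(ch);
        }
    }
    reduce(0);  // end of input: flush whatever is left
    return codes;
}
```

<p>With this sketch, <tt>1+2*3</tt> compiles to Load 1, Load 2, Load 3, Mul,
Add, while <tt>(a+b)*c</tt> compiles to Ref a, Ref b, Add, Ref c, Mul.</p>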
  407. <h2>Value</h2>
<p>Status: FINISHED.</p>
<p>To be written.</p>
<h2>Commands Based on KCommand</h2>
<p>Status: IN PROGRESS.</p>
<p>Until recently, to implement undo and redo, KSpread created a corresponding
KSpreadUndo class for each action and ran it when the user undid
that action. KSpreadUndo also has a redo function whose job is to perform
the action again after it has been undone.</p>
<p>All this needs to be converted to manipulators. These will be KCommands,
hence we should be able to undo/redo every operation (provided that the
corresponding manipulator provides methods to store/recall the undo information).</p>
  421. <h2>Cell Storage</h2>
<p>Status: PLANNED.</p>
  423. <p>Cells are grouped together, and then hashed.</p>
  424. <h2>Format Storage</h2>
<p>Status: PLANNED.</p>
<p>Formatting specifies what a cell should look like. It involves font
attributes like bold or italics, vertical and horizontal alignment,
rotation angle, shading, background color and so on. Each cell can have
its own format, but bear in mind that the format of a whole row or column
should also apply.</p>
<p>The current way of storing formatting information is rather inefficient:
it is packed together inside the cell. This is wasteful because most cells
are either very plain (no formatting) or have only a few attributes set
(e.g. only marked as bold, with no font family or color specified).
Therefore the approximately 20 bytes used to hold formatting information
are mostly a waste of memory. Even worse, this requires that the cell
exists even if it is not in use. As an illustration, imagine a worksheet
where only 5 cells within range A1:B20 are not empty. When the user
selects this range and changes the background color to yellow,
those 5 cells must store this information in their data structure, but
what about the other 35 cells? Since the formatting is attached to the cell,
there is no choice but to create them. Creating cells just for the sake of
storing a format is not a good idea.</p>
  444. <p>A new way to store formatting information is proposed below.</p>
<p>For each type of format a user can apply, we have a corresponding
<i>formatting piece</i>, for example &quot;bold on&quot;, &quot;bold off&quot;,
&quot;font Arial&quot;, &quot;left-border 1 px&quot;, etc. Whenever the user
applies formatting to a range (which could also be a whole column, row, or worksheet),
we save the respective formatting piece in a stack. Say the user has
marked column B as bold, row 2 as italics, and set range A1:C5 to a
yellow background color. Our formatting stack would look like:
</p>
  453. <table cellspacing="0" cellpadding="3" border="1">
  454. <tr>
  455. <td><b>Range</b></td>
  456. <td><b>Formatting Piece</b></td>
  457. </tr>
  458. <tr>
  459. <td>Column B</td>
  460. <td>Bold on</td>
  461. </tr>
  462. <tr>
  463. <td>Row 2</td>
  464. <td>Italics on</td>
  465. </tr>
  466. <tr>
  467. <td>A1:C5</td>
  468. <td>Yellow background</td>
  469. </tr>
  470. </table>
<p>Now let us try to figure out the overall format of cell B2. From the first piece
we know it should be bold, from the second it should be italics, and from the last
it should have a yellow background. This complete format is the one which
we use to render cell B2 on screen. In similar fashion, we can see that
cell A1, on the other hand, only has the yellow background, because the first
and second pieces do not apply there.</p>
<p>Another possible way to see the format storage is by using a 3-D perspective.
For each formatting piece, imagine there is a surface which covers the
formatted range (in the xy-plane). The formatting information is simply attached
to the surface (say, as a surface attribute). The surfaces are stacked together;
the depth (the z-axis) denotes the sequence, i.e. the first surface is the
deepest. For the example above, we can view the pieces as one surface specifying
&quot;bold on&quot; which is a vertical band along column B, one surface specifying
&quot;italics on&quot; which is a horizontal band along row 2, and one last surface
specifying &quot;yellow background&quot; stretched over the range A1:C5.
How do we find the complete format for A2? This is now a matter of surface
determination. Traversing in the direction of the z-axis from A2 reveals
that we hit the last and second surfaces only; thereby we know the complete
format is &quot;italics, yellow background&quot;.</p>
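<p>The lookup described above can be sketched in a few lines. The names
<tt>FormattingPiece</tt>, <tt>FormatStorage</tt>, <tt>apply()</tt> and
<tt>formatOf()</tt> are hypothetical, attributes are kept as plain strings,
and whole rows or columns are modelled as ranges that extend to an assumed
sheet limit <tt>kMax</tt>. The sketch only collects the pieces that cover a
cell; it does not resolve conflicts such as "bold on" followed by "bold
off".</p>

```cpp
#include <string>
#include <vector>

// A rectangular area given by column/row numbers (1-based, inclusive).
struct Range {
    int left, top, right, bottom;
    bool contains(int col, int row) const {
        return col >= left && col <= right && row >= top && row <= bottom;
    }
};

// Assumed sheet limit, used to express whole rows and whole columns.
const int kMax = 1 << 20;

struct FormattingPiece {
    Range range;
    std::string attribute;  // e.g. "bold on", "italics on"
};

class FormatStorage {
public:
    // Push a new piece on top of the stack.
    void apply(const Range& range, const std::string& attribute) {
        m_pieces.push_back({range, attribute});
    }

    // Collect, oldest first, every piece whose range covers the cell.
    std::vector<std::string> formatOf(int col, int row) const {
        std::vector<std::string> result;
        for (const FormattingPiece& piece : m_pieces)
            if (piece.range.contains(col, row))
                result.push_back(piece.attribute);
        return result;
    }

private:
    std::vector<FormattingPiece> m_pieces;
};
```

<p>After applying the three pieces of the example, <tt>formatOf(2, 2)</tt>
(cell B2) returns all three attributes, <tt>formatOf(1, 2)</tt> (cell A2)
returns italics and the yellow background, and <tt>formatOf(1, 1)</tt> (cell
A1) returns the yellow background only. No cell object has to exist for any of
these queries.</p>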
<p>It is clear that a format storage corresponds to one sheet only. For each
sheet, there should be one format storage. Cells can still have accessors for
their formatting information; these simply become wrappers around calls to the
format storage. Since each formatting piece holds information about the range
it applies to, we must take care that the format storage is correctly updated
whenever cells, rows or columns are inserted or deleted.</p>
<p>To avoid poor performance, we must use a smart way to iterate over
the formatting pieces whenever we want to find out the complete format of given
cell(s). When the sheet gets very complex, it is likely that we will have
a great many formatting pieces that do not even overlap. This means that when we
need the formatting of cell A1, there is no use in checking the formatting pieces of range
Z1:Z10 or A100:B100, which do not include cell A1 and are even very far from A1.
One possible solution is to arrange the formatting pieces in a quad-tree.
Because one piece can cover a very large area, it is possible that it will
be in more than one leaf of the quad-tree. <i>Details on the possible use of
a quad-tree or other methods should be explored further</i>.</p>
  506. <h2>Default Toolbars</h2>
<p>Status: IN PROGRESS.</p>
  508. <p>Relevant mailing-list threads:</p>
  509. <ul>
  510. <li><a href="http://lists.kde.org/?l=kde-usability&m=110751297906231&w=2">
  511. http://lists.kde.org/?l=kde-usability&amp;m=110751297906231&amp;w=2</a></li>
  512. <li><a href="http://lists.kde.org/?l=koffice-devel&m=110723903332496&w=2">
  513. http://lists.kde.org/?l=koffice-devel&amp;m=110723903332496&amp;w=2</a></li>
  514. </ul>
<p>Toolbars hold the most frequently used actions. It is
important to present the user with default toolbars which make
sense, i.e. they do not contain unnecessary buttons. In-depth usability
analysis and/or further discussion is needed to decide which
buttons need to be in the toolbars and which do not.</p>
<p>For reference, here is a list of the toolbars shown by default in
some spreadsheet applications:</p>
  522. <p><b>Microsoft Excel 2002</b>:</p>
  523. <ul>
  524. <li><em>Standard toolbar</em>: New, Open, Save, Search, Print, Print Preview,
  525. Spelling, Cut, Copy, Paste, Format Painter, Undo, Redo, Insert Hyperlink,
  526. Autosum, Sort Ascending, Sort Descending, Chart Wizard, Drawing, Zoom, Help</li>
<li><em>Formatting toolbar</em>: Font, Font Size, Bold, Italic, Align Left,
  528. Align Center, Align Right, Merge and Center, Format Currency, Format Percent,
  529. Comma Style, Increase Precision, Decrease Precision, Increase Indent,
  530. Decrease Indent, Merge Cells, Unmerge Cells, Merge Across, Borders,
  531. Fill Color, Font Color</li>
  532. </ul>
  533. <p><b>OpenOffice 2.0</b>:</p>
  534. <ul>
  535. <li><em>Standard toolbar</em>: New, Open, Save, E-mail, Edit, PDF, Print,
  536. Page Preview, Spellcheck, Cut, Copy, Paste, Format Paintbrush, Undo, Redo,
  537. Hyperlink, Sort Ascending, Sort Descending, Insert Chart, Navigator, Styles,
  538. Gallery, Zoom</li>
  539. <li><em>Format Object toolbar:</em> Font Name, Font Size, Bold, Italic,
  540. Underline, Font Color, Align Left, Align Center, Align Right, Justified,
  541. Merge Cells, Format as Currency, Format as Percent, Format Standard,
  542. Increase Precision, Decrease Precision, Increase Indent, Decrease Indent,
  543. Borders, Line Style, Line Color, Background Color, Align Top, Align Center
  544. Vertically, Align Bottom </li>
  545. </ul>
  546. <p><b>Gnumeric 1.4</b>:</p>
  547. <ul>
  548. <li><em>Standard toolbar</em>: New, Open, Save, Print, Print Preview,
  549. Cut, Copy, Paste, Undo, Redo, Hyperlink, Autosum, Function/Formula,
  550. Insert Chart, Zoom</li>
  551. <li><em>Format toolbar:</em> Font Family, Font Size, Bold, Italic,
  552. Underline, Align Left, Align Center, Align Right, Merge and Center,
  553. Merge Cells, Split Cell, Format Currency, Format Percent, Thousand Separator,
  554. Increase Precision, Decrease Precision, Decrease Indent, Increase Indent,
  555. Borders, Background, Foreground</li>
  556. </ul>
  557. <h2>Test Framework</h2>
<p>Status: IN PROGRESS.</p>
  559. <p>Relevant mailing-list threads:</p>
  560. <ul>
  561. <li><a href="http://lists.kde.org/?l=kde-cvs&m=109511244721480&w=2">
  562. http://lists.kde.org/?l=kde-cvs&amp;m=109511244721480&amp;w=2</a></li>
  563. <li><a href="http://lists.kde.org/?l=koffice-devel&m=109518196922944&w=2">
  564. http://lists.kde.org/?l=koffice-devel&amp;m=109518196922944&amp;w=2</a></li>
  565. </ul>
<p>It is well known that writing clean and easily understandable modules leads
to better maintenance. However, testing a particular module every time there is
a significant change requires a considerable amount of time and effort. Since KSpread
and other applications of its scale consist of hundreds of modules,
automatic testing of each module helps a lot, not least because it can
catch bugs as early as possible.</p>
<p>KSpread has a simple test framework to facilitate this kind of testing. It can
be activated using the shortcut <tt>Ctrl+Shift+T</tt>. The test is, however,
not accessible via the menu, because it is intended to be used only by developers.
Ideally, there should be tests for all modules contained in KSpread. It is
the responsibility of the developer to create the corresponding <tt>tester</tt>
for the code that he or she is working on. All tests should be kept in
<tt>koffice/kspread/tests/</tt>.</p>
  579. <p>Making a new tester is not difficult. The easiest way is to copy an already
  580. existing tester and modify it. Basically, it must be a subclass of class Tester
  581. (see <tt>koffice/kspread/tests/tester.h</tt>). Just reimplement the virtual
  582. function <tt>run()</tt> and it is ready. In order to make it possible to run
  583. the new tester, add an instance of the class in TestRunner
  584. (for details, see <tt>koffice/kspread/tests/testrunner.cc</tt>).</p>
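<p>The shape of a tester can be sketched as follows. Note that the
<tt>Tester</tt> class shown here is only a stand-in written for this example;
the real base class is declared in <tt>koffice/kspread/tests/tester.h</tt> and
may differ. Only the virtual function <tt>run()</tt> is taken from the text
above, while <tt>check()</tt>, <tt>failed()</tt> and <tt>SumTester</tt> are
hypothetical.</p>

```cpp
#include <string>
#include <vector>

// Stand-in for the real base class; only run() is taken from the text above.
class Tester {
public:
    virtual ~Tester() {}
    virtual void run() = 0;
    int failed() const { return static_cast<int>(m_errors.size()); }

protected:
    // Hypothetical helper: records a failure message instead of aborting.
    void check(bool ok, const std::string& message) {
        if (!ok) m_errors.push_back(message);
    }

private:
    std::vector<std::string> m_errors;
};

// A self-contained tester: it hard-codes its own data.
class SumTester : public Tester {
public:
    void run() override {
        const double values[] = {14, 3, 77};  // cells A1:A3 of the example sheet
        double sum = 0;
        for (double v : values) sum += v;
        check(sum == 94, "SUM(A1:A3) should be 94");
    }
};
```

<p>The tester uses no data from the current document, so it produces the same
result no matter which file is open when the tests are started.</p>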
<p>A tester must be self-contained: it should not use any test data from
the current document. If necessary, it must create (or hard-code) the data
itself.</p>
<p>Whenever parts of KSpread are improved or rewritten, it is always
a good idea to run the related tests to ensure that the changes do not do
any harm. However, bear in mind that there is no 100% guarantee that the new
code is bug-free.</p>
<p>Also, if there is a bug which is not caught by the tester (i.e. it does not
fail the tester, but the bug is confirmed), then the relevant tester must be
extended with one or more test cases that reproduce the offending bug.
Once the bug is finally fixed, the tester should from that point on always pass all
test cases.</p>
  598. <h2>Coding Style</h2>
<p>Status: IN PROGRESS.</p>
<p>(To be written in detail.)</p>
<p>Write <b>clean code</b>. To be correct is better than to be fast.
The KSpread source code grew very fast in its early days, and later
on it also became more difficult to understand.</p>
<p>Document classes and member functions with comments. Documentation is still
lacking for now; whoever understands something about the
classes and functions should write the documentation.</p>
<p>In complex source files, the list of header includes can be very long. Unless
there is a special reason not to do so, group them together, i.e. standard
C/C++ headers come first, followed by TQt headers, then KDE headers,
KOffice core/UI headers and application-specific headers. Within each group,
sort the header files alphabetically.</p>
  612. <p>Write test cases. This will ease further maintenance. See also the section on Test
  613. Framework above.</p>
<p>Do not use the term <i>table</i>. It was quite likely introduced by mistake
because of the term <i>Tabelle</i> (German, literally "table"). The correct
term is <i>sheet</i> or <i>worksheet</i>. The English version of Microsoft Excel
uses <i>sheet</i> while the German version uses <i>Tabelle</i>.</p>
<p>Use the <a href="http://developer.kde.org/documentation/library/kdeqt/trinityarch/devel-binarycompatibility.htm">d-pointer</a> trick (also known as pimpl) whenever possible. This practice will help when later on
we want to expose the API and need to maintain binary compatibility. But the
most important thing is that it separates the interface from the implementation.
Furthermore, build time is reduced since a modification of the implementation
does not cause tons of recompilation.</p>
  623. <p>When creating a new class, use namespace KSpread. Do not use KSpread prefix
  624. anymore. Example: use <tt>KSpread::Foo</tt> instead of <tt>KSpreadFoo</tt>.
  625. Also source file name should not contain kspread prefix anymore, i.e.
  626. <tt>foo.h</tt> and <tt>foo.cc</tt> (but not <tt>kspread_foo.h</tt> and
  627. <tt>kspread_foo.cc</tt>) for the above example.</p>
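<p>The last two rules can be combined in one small sketch. The class
<tt>KSpread::Foo</tt> is the placeholder name from the text above; its members
<tt>setName()</tt> and <tt>name()</tt> are made up for illustration. The two
parts are shown in one block here, but would normally live in <tt>foo.h</tt>
and <tt>foo.cc</tt> respectively.</p>

```cpp
#include <string>

namespace KSpread {

// --- foo.h: public interface; the only data member is the d-pointer ---
class Foo {
public:
    Foo();
    ~Foo();
    Foo(const Foo&) = delete;             // copying is left out of this sketch
    Foo& operator=(const Foo&) = delete;

    void setName(const std::string& name);
    std::string name() const;

private:
    class Private;  // defined in foo.cc only
    Private* d;
};

// --- foo.cc: implementation details, free to change without touching foo.h ---
class Foo::Private {
public:
    std::string name;
};

Foo::Foo() : d(new Private) {}
Foo::~Foo() { delete d; }
void Foo::setName(const std::string& name) { d->name = name; }
std::string Foo::name() const { return d->name; }

}  // namespace KSpread
```

<p>Adding another member to <tt>Foo::Private</tt> later changes neither the size
nor the layout of <tt>Foo</tt>, so code compiled against <tt>foo.h</tt> does not
need to be rebuilt.</p>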
  628. </body></html>