Diff

Differences From Artifact [0f9001c9f0]:

To Artifact [3bce849e6e]:


Known problems and areas to work on
===================================

*	Not yet able to handle the specification of multiple projects
	for one CVS repository. I.e. I can, for example, import all of
	tcllib, or a single subproject of tcllib, like tklib, but not
	multiple sub-projects in one go.

*	We have to look into the pass 'InitCsets' and hunt for the
	cause of the large amount of memory it is gobbling up.

	Results from the first look using the new memory tracking
	subsystem:

	(1) The general architecture, workflow, is a bit wasteful. All
	    changesets are generated and kept in memory before getting
	    persisted. This means that allocated memory piles up over
	    time, with later changesets pushing the boundaries. This
	    is made worse by the fact that some of the preliminary
	    changesets seem to require a lot of temporary memory as
	    part of getting broken down into the actual ones.
	    InitializeBreakState seems to be the culprit here. Its
	    memory usage is possibly quadratic in the number of items
	    in the changeset.

	(2) A number of small inefficiencies, like 'state eval' always
	    pulling the whole result into memory before processing it
	    with 'foreach'. These results are potentially large lists.

	(3) We maintain an in-memory map from tagged items to their
	    changesets. While this is needed later in the sorting
	    passes, during the creation it is wasted space. It is
	    also wasted time to maintain it during the creation and
	    breaking.
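
The inefficiency in (2) can be illustrated outside the importer. The following is only a sketch in Python, not the importer's Tcl code: the table `item` and its columns are invented for the example, and the two functions stand in for "pull the whole result into a list, then loop" versus "loop over the rows as they are fetched".

```python
import sqlite3

# Hypothetical schema, for illustration only; this is not the importer's state database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, cset INTEGER)")
conn.executemany("INSERT INTO item (cset) VALUES (?)",
                 [(i % 10,) for i in range(1000)])


def count_eager(conn, wanted):
    """Materialize the whole result as a list first, then loop over it."""
    rows = conn.execute("SELECT id, cset FROM item").fetchall()
    return sum(1 for _id, cset in rows if cset == wanted)


def count_streaming(conn, wanted):
    """Iterate over the cursor, so rows are fetched as the loop advances."""
    cursor = conn.execute("SELECT id, cset FROM item")
    return sum(1 for _id, cset in cursor if cset == wanted)


eager = count_eager(conn, 3)
streaming = count_streaming(conn, 3)
```

Both functions compute the same count. The difference is only that the first holds every row in memory at once, which is what hurts when the result is a large list.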


	Changes:

	(a) Re-architect to create, break, and persist changesets one
	    by one, completely releasing all associated in-memory data
	    before going to the next. Should be low-hanging fruit with
	    high impact, as we have all the necessary operations
	    already, just not in that order, and that alone should
	    already keep the pile from forming, making the spikes of
	    (2) more manageable.


	(b) Look into the smaller problems described in (2), and
	    especially (3). These should still be low-hanging fruit,
	    although of lesser effect than (a). For (3) disable the
	    map and its maintenance during construction, and put it
	    into a separate command, to be used when loading the
	    created changesets at the end.

	(c) With larger effect, but more difficult to achieve, go into
	    command 'InitializeBreakState' and the preceding
	    'internalsuccessors', and re-architect them. Definitely
	    not low-hanging fruit. Possibly also something we can skip
	    if doing (a) had a large enough effect.
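
The workflow proposed in (a) might look roughly like the sketch below. It is written in Python rather than the importer's Tcl, and every name in it is hypothetical. In particular, `break_changeset` splits purely by size as a stand-in; the real breaking step works on dependencies between items and is not modelled here. The point is only the order of operations: each preliminary changeset is created, broken, and persisted before the next one is touched.

```python
from typing import Callable, Iterable, Iterator, List

Changeset = List[int]  # stand-in: a changeset is just a list of item ids


def break_changeset(cset: Changeset, limit: int) -> Iterator[Changeset]:
    """Hypothetical breaker: yield pieces of at most `limit` items each."""
    for start in range(0, len(cset), limit):
        yield cset[start:start + limit]


def process_one_by_one(preliminary: Iterable[Changeset],
                       persist: Callable[[Changeset], None],
                       limit: int) -> int:
    """Create, break, and persist one changeset at a time.

    Returns the number of changesets persisted.
    """
    count = 0
    for cset in preliminary:          # created lazily by the caller
        for piece in break_changeset(cset, limit):
            persist(piece)            # written out immediately
            count += 1
        # Nothing here keeps a reference to `cset` or its pieces,
        # so they can be released before the next iteration.
    return count


# Usage: a generator supplies preliminary changesets; a list stands in
# for the persistent store.
stored: List[Changeset] = []
persisted = process_one_by_one(([i] * i for i in range(1, 6)),
                               stored.append, limit=2)
```

With this shape the peak memory is bounded by the largest single changeset plus its temporary break state, instead of by the sum over all changesets.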

*	Look at the dependencies on external packages and consider
	which of them can be moved into the importer, either as a
	simple utility command, or wholesale.

	struct::list
		assign, map, reverse, filter

Known problems and areas to work on
===================================

*	Not yet able to handle the specification of multiple projects
	for one CVS repository. I.e. I can, for example, import all of
	tcllib, or a single subproject of tcllib, like tklib, but not
	multiple sub-projects in one go.



*	Consider reworking the breaker- and sort-passes so that they
	do not need all changesets as objects in memory.

	Current memory consumption after all changesets are loaded:

	project		   bytes     MB
	bwidget		 6971627    6.6
	cvs-memchan	 4634049    4.4
	cvs-sqlite	45674501   43.6
	cvs-trf		 8781289    8.4
	faqs		 2835116    2.7
	libtommath	 4405066    4.2
	mclistbox	 3350190    3.2
	newclock	 5020460    4.8
	oocore		 4064574    3.9
	sampleextension	 4729932    4.5
	tclapps		 8482135    8.1
	tclbench	 4116887    3.9
	tcl_bignum	 2545192    2.4
	tclconfig	 4105042    3.9
	tcllib		31707688   30.2
	tcltutorial	 3512048    3.3
	tcl	       109926382  104.8
	thread		 8953139    8.5
	tklib		13935220   13.3
	tk		66149870   63.1
	widget		 2625609    2.5
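
The two numeric columns in the table are not labelled in the original notes. A quick check suggests the second is the first divided by 2**20, i.e. a byte count followed by megabytes rounded to one decimal. The sketch below (Python, with three rows copied from the table) tests that assumption; it is a reading of the data, not something the notes state.

```python
# Three rows copied from the table: project -> (first column, second column).
rows = {
    "bwidget": (6971627, 6.6),
    "tcllib": (31707688, 30.2),
    "tcl": (109926382, 104.8),
}


def to_mb(n_bytes: int) -> float:
    """Convert bytes to MB, assuming 1 MB = 2**20 bytes, rounded to one decimal."""
    return round(n_bytes / 2**20, 1)


# True for a project when the computed value matches the listed one.
matches = {name: to_mb(n_bytes) == listed
           for name, (n_bytes, listed) in rows.items()}
```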



*	Look at the dependencies on external packages and consider
	which of them can be moved into the importer, either as a
	simple utility command, or wholesale.

	struct::list
		assign, map, reverse, filter