EvC Forum: The Minkowski's challenge

Message 106 of 120 (367675)
12-04-2006 2:33 PM

Reply to: Message 104 by Barbarian
12-04-2006 3:02 AM

Could Objective Caml itself be a performance bottleneck? I'm not myself familiar with the language, and in the five minutes I gave myself to study up I was unable to discover whether compilation produces actual object code or just preparsed scripts that still have to be interpreted. The frequent focus of Caml articles on performance, with frequent mention of such issues as avoidance of run-time type checking, leads me to believe that it is interpretive. If this is the case, then even the couple positive statements I saw claiming the performance is pretty good ("50% of a decent C compiler" was one comment) are not enough to persuade me away from the possibility that your biggest performance bottleneck could well be your choice of language.

--Percy

Edited by Percy, : Fix garbled last sentence.

This message is a reply to:
	Message 104 by Barbarian, posted 12-04-2006 3:02 AM		Barbarian has replied

Replies to this message:
	Message 107 by Barbarian, posted 12-05-2006 2:14 PM		Percy has replied

Barbarian

Inactive Member

Message 107 of 120 (367805)
12-05-2006 2:14 PM

Reply to: Message 106 by Percy
12-04-2006 2:33 PM

Actually, it is not me who allocates and deallocates, I just forget about the objects I no longer use and the runtime library frees them when sees so fit. I am not given a chance to write my own suballocator, and considering the years that went into the generic garbage collector of OCaml, I would need lots of optimism to even try. GC is not usually a showstopper, as I would waste more time re-debugging the allocation-deallocation schemes in C whenever I change the code.

I agree it is unusual to have such an even spread of CPU usage. Maybe it would be advisable to try to profile it under Linux too - I did so only over Cygwin, the native-code compiler can produce code which writes logs for gprof to process, but I cannot exactly vouchsafe for the compatibility of gprof and the OCaml compiler.

OCaml can produce both interpretable and native code - the latter boasts the 50% of the performance of a decent C compiler. In my case only native code can be profiled, for the profiler of the interpreted code insists on no preprocessing whatsoever, and I use the standard camlp4o preprocessor for stream support.

I don't think the performance is that bad, actually; it is just not so lightning-fast as I imagined it to be. It's a few hundred births per second for a normal program, whereas I hoped for about 2000 births per second. I just produced a different profile which shows List.length as first, probably due to extensive actual evolution happening. Since you asked for the other functions towards the top of the profile, I quote it here. Functions above 1%, accounting for a total of 153sec out of 162sec total are shown:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 19.72     31.99    31.99                             camlList__length_aux_57
 17.39     60.21    28.22                             camlPervasives__$40_166
 13.38     81.92    21.71                             sweep_slice
 12.38    102.01    20.09                             mark_slice
  6.09    111.89     9.88                             caml_obj_dup
  4.54    119.25     7.36                             camlMachine__exec_instr_306
  3.09    124.27     5.01                             caml_oldify_mopup
  3.04    129.20     4.93                             caml_modify
  3.03    134.12     4.92                             caml_oldify_one
  2.23    137.74     3.62                             caml_fl_merge_block
  2.13    141.19     3.46                             caml_alloc_shr
  1.68    143.92     2.73                             caml_fl_allocate
  1.67    146.63     2.71                             caml_alloc_small
  1.45    148.99     2.35                             caml_oldify_local_roots
  1.43    151.31     2.32                             camlMachine__exec_aux_412
  1.34    153.47     2.17                             allocate_block

Only the two camlMachine_* ones are mine.

I do not think excessive optimization is worth it - I might try to remove excessive calls to List.length(), but only after the mysterious bug is caught.

This message is a reply to:
	Message 106 by Percy, posted 12-04-2006 2:33 PM		Percy has replied

Replies to this message:
	Message 108 by Percy, posted 12-05-2006 2:58 PM		Barbarian has replied

Percy

Member

Posts: 22503

From: New Hampshire

Joined: 12-23-2000

Member Rating: 4.9

Message 108 of 120 (367813)
12-05-2006 2:58 PM

Reply to: Message 107 by Barbarian
12-05-2006 2:14 PM

Am I misinterpreting your table when I think it's telling me that 85% of your time is OCaml overhead? If so, are the services that the runtime OCaml system is providing of sufficiently high value to justify that?

Is compiled OCaml known to be much, much faster than interpreted Perl? A couple years ago I tried to rig up something real quick because it was so easy in Perl, and that's when I discovered that Perl is less than a hundredth as fast as compiled C++. All that OCaml runtime stuff in your performance profile seems odd if it's really producing object code, and runtime reference tracking for GC is expensive. Taking a look at Objective CAML Tutorial, my guess is that, yes, compiling produces object code, but most of the object code is just calls to runtime system routines. The performance claims no doubt hold up extremely well for Caml code like "let max a b = if a > b then a else b;;", but I have my doubts about anything requiring calls to the runtime system.

But it sounds like you're not really worried about performance. I guess something you said must have made me think you were, but if you're not worried about it then I'm certainly not.

--Percy

This message is a reply to:
	Message 107 by Barbarian, posted 12-05-2006 2:14 PM		Barbarian has replied

Replies to this message:
	Message 109 by Barbarian, posted 12-07-2006 4:22 AM	Percy has not replied
	Message 112 by Jazzns, posted 12-12-2006 12:29 PM	Percy has not replied

Barbarian

Inactive Member

Message 109 of 120 (368129)
12-07-2006 4:22 AM

Reply to: Message 108 by Percy
12-05-2006 2:58 PM

More like 65%, since camlList__length_aux (with almost 20%) is a library function I might be calling unnecessarily too often. But yes, memory management takes about 65% of the time, and that is unexpectedly high. I suspect that the unusually high List.length() usage and the memory management issue are related somehow; most likely, the program, being excessively functional, keeps recomputing some value unnecessarily and I haven't found the place yet where this happens. It may also be that the code is not doing what I intended it to do, hence the excessive use of such functions. The conclusion is that I have to find the reason for this anomaly because it could be a symptom of a logical error; otherwise, the performance would not bother me that much.

I think the choice of language is justified. For one, I know this language fairly well, so there is no overhead of library lookups etc. Then, I am sure these results are abnormal in some sense, so if I find the mistake, everything is well. Finally, the time I lose when running the program is more than compensated for what I would waste debugging a manual memory management scheme after every change in the code, and changes are bound to happen fairly often to this code.

As for comparison with Perl, I really cannot tell; I haven't used Perl for years now and consequently have little relevant experience. I do not think the OCaml native code compiler is so simple - to me it looks like a bona fide optimizing compiler, which only calls routines for memory management, otherwise operations are performed in generated code. Most of the OCaml library is written in OCaml anyway, and there are extremely few built-in functions.

This message is a reply to:
	Message 108 by Percy, posted 12-05-2006 2:58 PM		Percy has not replied

Barbarian

Inactive Member

Message 110 of 120 (369239)
12-12-2006 6:50 AM

The tool is ready for production

Have found the source of the mysterious bug, fixed it and the tool seems to be able to run now for hours without hitting any errors. IOW, it is ready to use. I could add some cosmetic touches including even more comments to the source code, but all that is optional, really.

As a bonus, unnecessarily complex self-replicating proglets can now evolve into the minimal self-replicating one without throwing exceptions. To illustrate this, let me show a sample session (with the tool run in a DOS command window) aimed at evolving the shortest possible self-replicating proglet from a complex one:

D:\My Data\TheZone\Misc>evot
Evot launched
(evot) help
    The following are valid commands
    newworld SELECTOR FILENAME*
                           : create new world
    load FILENAME          : load world from file
    save FILENAME          : save world to file
    sels                   : list available selectors
    newpop SELECTOR        : add new population based on selector
    back                   : move active population pointer back
    forward                : move active population pointer forward
    cut                    : cut all populations before the current one
    drop                   : drop all populations above the current one
    do COUNT               : run COUNT generations' worth of evolution
    run                    : run evolution until interrupted
    try THRESHOLD          : run until last population reaches threshold
    pops                   : enumerates populations
    mems                   : lists all members
    mems FIRST LAST        : lists members between FIRST and LAST
    mems COUNT             : lists the topmost COUNT members
    list INDEX             : disassemble member identified by INDEX
    dbug INDEX             : launch identified member in debugger
    dump INDEX FILENAME    : save member code to file
    size SIZE              : set max size of current population
    globsize SIZE          : set max size of new populations
                    : print short status
    quit                   : leave program
    help or '?'            : this list
(evot) sels
    Selectors:
    ----------
    Null (max 1) = Accepts any self-reproducing program
    SegmentCount (max 1073741823) = Counts the segments of the program
    Stomp (max 36) = Selects for smaller max segment length
(evot) newworld Stomp "reprod2.gel"
    Current world: 0 cut and 1 active populations, 1 total
    Global max fitness = 9
    Current population is #1
    Selected by 'Stomp', size = 1, top fitness = 9
(evot) try 36
    Top fitness = 36, generation #367, target reached
    Current world: 0 cut and 1 active populations, 1 total
    Global max fitness = 36
    Current population is #1
    Selected by 'Stomp', size = 200, top fitness = 36
(evot)

(evot) is the prompt of the evo-tool. The first command asks for a listing of available commands, just to show off. The second one asks for a listing of installed selectors/fitness evaluators. The third one instructs the tool to set up a new world with one single population, having Stomp as a selector and the proglet whose source can be found in the file "reprod2.gel" as the only individual. The Stomp selector grants a fitness of zero to any proglet which fails to self-replicate, otherwise grants two points for every loss of an instruction from the longest segment, one point for every loss of a complete segment and applies a five-point penalty for proglets finishing by timeout.

The fourth command instructs the tool to run evolution until an individual with fitness =36 appears (36 is the theoretical maximum for self-replicating proglets under Stomp). It took about two minutes to produce the following:

prog
    *Main
    specinstr 0
    specinstr 1
end

which is indeed the shortest possible self-replicating entity.

For your reference, here is the source listing ("reprod2.gel") of the proglet we started with (please refer to my post earlier where I explained the instructions, and please ignore the comments between (* and *), which contain the assertions used to prove the correctness of the proglet and hence find the bugs in it):

prog
    #0 newseg                   (* [*FirstData] *)
    *Main                       (* [*Main,*FirstData]  *)
Cycle1:                         (* [*Addr,*FirstData] *)
    #0 dup                      (* [*Addr,*Addr,*FirstData] *)
    *CopyToStack call           (* [count,val,val,...,val,*Addr,*FirstData] *)
    *CopyToNewSeg call          (* [*NewSeg,*Addr,*FirstData] *)
    specinstr 0                 (* -- *)
    drop                        (* [*Addr,*Firstdata] *)
    nextseg                     (* [*Addr',*FirstData] *)
    #1 dup                      (* [*FirstData,*Addr',*FirstData] *)
    #1 dup                      (* [*Addr',*FirstData,*Addr',*FirstData] *)
    *Done1 ifequal              (* [*Addr',*FirstData] *)
    *Cycle1 jump                (* -- *)
Done1:
    specinstr 1
                                (* *)
seg CopyToStack                 (* [*Addr|...] *)
                                (* [*Addr,|...] *)
    #0 dup                      (* [*Addr,*Addr,|...] *)
    *Done2 invalidaddr          (* [*Addr,|...] *)
    #0 dup                      (* [*Addr,*Addr,|...] *)
    load                        (* [i,*Addr,|...] *)
    swap                        (* [*Addr,i,|...] *)
    incr                        (* [*Addr',i,|...] *)
    *CopyToStack jump           (* [*Addr',i,|...] *)
Done2:                          (* [*Addr,|...] *)
    val                         (* [count,|...] *)
    ret                         (* -- *)
                                (* *)
seg CopyToNewSeg                (* [count,|...] *)
    #1 neg add                  (* [count-1,|...] *)
    swap                        (* [fistval,count-1,|...] *)
    newseg                      (* [*Addr,count-1,|...] *)
Cycle2:                         (* [*Addr,ccnt,|...] *)
    #1 dup                      (* [ccnt,*Addr,ccnt,|...] *)
    #0                          (* [0,ccnt,*Addr,ccnt,|...] *)
    *Done3 ifequal              (* [*Addr,cnt,|...] *)
    #2 lift                     (* [firstval,*Addr,cnt,|...] *)
    #1 dup                      (* [*Addr,firstval,*Addr,cnt,|...] *)
    push                        (* [*Addr,cnt,|...] *)
    swap                        (* [cnt,*Addr,|...] *)
    #1 neg add                  (* [cnt-1,*Addr,|...] *)
    swap                        (* [*Addr,cnt-1,|...] *)
    *Cycle2 jump                (* -- *)
Done3:                          (* [*Addr,0|...] *)
    swap drop                   (* [*Addr|...] *)
    ret                         (* -- *)

end

I might add that in every generation the intermediate form was able to write a perfect replica of itself, and that I find it amazing that there exists a line of single mutations leading from the long quoted proglet to this small one all the while being able to self-replicate. Otherwise, the complete mutation history is preserved for every individual, so it can be inspected for the benefit of the doubters.

Now the project can enter the second phase: proper selectors have to be implemented in order to evolve encryption. We should also return to the issue of how to distribute the source if anyone is interested in playing with it.

Edited by Barbarian, : missing space, missing 'end' from the short proglet

Replies to this message:
	Message 111 by Jazzns, posted 12-12-2006 12:26 PM		Barbarian has not replied

Jazzns

Member (Idle past 3940 days)

Posts: 2657

From: A Better America

Joined: 07-23-2004

Message 111 of 120 (369285)
12-12-2006 12:26 PM

Reply to: Message 110 by Barbarian
12-12-2006 6:50 AM

Re: The tool is ready for production

How many generations did it take from the proglet to the simple replicator?

Hopefully Percy can take some time off his busy schedule to talk about hosting. If not then I can arange something either temporarily while we wait for Percy or permanently if he is unable.

Of course, biblical creationists are committed to belief in God's written Word, the Bible, which forbids bearing false witness; --AIG (lest they forget)

This message is a reply to:
	Message 110 by Barbarian, posted 12-12-2006 6:50 AM		Barbarian has not replied

Jazzns

Member (Idle past 3940 days)

Posts: 2657

From: A Better America

Joined: 07-23-2004

Message 112 of 120 (369286)
12-12-2006 12:29 PM

Reply to: Message 108 by Percy
12-05-2006 2:58 PM

Email Exchange

I am always leery of posting my email in a public forum. I noticed that Barbarian also does not have his email in his profile. Wish his permission, could you send us eachothers email?

At the very least, could you send him mine so I can get a copy of his code to start playing with. I have ocaml installed and everything!

Of course, biblical creationists are committed to belief in God's written Word, the Bible, which forbids bearing false witness; --AIG (lest they forget)

This message is a reply to:
	Message 108 by Percy, posted 12-05-2006 2:58 PM		Percy has not replied

Replies to this message:
	Message 113 by Barbarian, posted 12-12-2006 1:35 PM		Jazzns has replied

Barbarian

Inactive Member

Message 113 of 120 (369299)
12-12-2006 1:35 PM

Reply to: Message 112 by Jazzns
12-12-2006 12:29 PM

Re: Email Exchange

If that is enough, I do agree with Percy giving my e-mail address to you, although it would also be enough if I got yours, and sent you the packed sources.

Concerning the number of generations needed for the evolution of a minimal replicator: in the particular case I posted, it took 367 generations, but I have seen extremes like 210 and 1580 generations too - the framework re-seeds the random number generator at startup, so evolution is not repeatable (used to be, during debugging ...) and every run takes different paths and different generation counts. The range depends on the ur-proglet, as the first one I wrote only needed about 8-10 generations to get to the minimal case. But the one I posted above calls two subroutines in a cycle, the first subroutine copies a segment on the stack while the second one copies the stacked instructions to a newly allocated segment. In addition the subroutines are - in fact, must be - located in separate segments. From this we get a single-segment, no-subroutines program which then fairly quickly evolves into the minimal one; it takes a lot of time to do away with the subroutine mechanism, and the maximal fitness shows stagnation for the first 85-90% of the total time, after which it quickly goes up.

This message is a reply to:
	Message 112 by Jazzns, posted 12-12-2006 12:29 PM		Jazzns has replied

Replies to this message:
	Message 114 by Jazzns, posted 12-12-2006 1:46 PM		Barbarian has replied

Jazzns

Member (Idle past 3940 days)

Posts: 2657

From: A Better America

Joined: 07-23-2004

Message 114 of 120 (369301)
12-12-2006 1:46 PM

Reply to: Message 113 by Barbarian
12-12-2006 1:35 PM

Re: Email Exchange

It would be nice to be able to handpick the random seed so you could reproduce the exact evolution if you wanted to.

Of course, biblical creationists are committed to belief in God's written Word, the Bible, which forbids bearing false witness; --AIG (lest they forget)

This message is a reply to:
	Message 113 by Barbarian, posted 12-12-2006 1:35 PM		Barbarian has replied

Replies to this message:
	Message 115 by Barbarian, posted 12-12-2006 2:20 PM		Jazzns has not replied

Barbarian

Inactive Member

Message 115 of 120 (369311)
12-12-2006 2:20 PM

Reply to: Message 114 by Jazzns
12-12-2006 1:46 PM

Picking random seeds

Done.

Invocation              Effect
--------------------------------------------------
evot                 => initializes seed from clock
evot -randseed 123   => seed set to 123
evot -randseed 0     => initializes seed from clock

This message is a reply to:
	Message 114 by Jazzns, posted 12-12-2006 1:46 PM		Jazzns has not replied

Percy

Member

Posts: 22503

From: New Hampshire

Joined: 12-23-2000

Member Rating: 4.9

Message 116 of 120 (369317)
12-12-2006 2:40 PM

To Jazzns and Barbarian,

Okay, okay, I've been remiss. Yes, I can host. I promise to try to find time tonight to set up a subdomain for Jazzns, this will give him the ability to take things from there. Takes max of 24 hours for DNS name to propagate through the Internet, though.

--Percy

Replies to this message:
	Message 117 by Barbarian, posted 12-13-2006 2:02 PM		Percy has not replied

Barbarian

Inactive Member

Message 117 of 120 (369542)
12-13-2006 2:02 PM

Reply to: Message 116 by Percy
12-12-2006 2:40 PM

I will be away from the internets from now till Friday afternoon (in CET = GMT + 1), so you don't have to hurry.

This message is a reply to:
	Message 116 by Percy, posted 12-12-2006 2:40 PM		Percy has not replied

Replies to this message:
	Message 118 by Barbarian, posted 12-19-2006 11:29 AM		Barbarian has not replied

Barbarian

Inactive Member

Message 118 of 120 (370859)
12-19-2006 11:29 AM

Reply to: Message 117 by Barbarian
12-13-2006 2:02 PM

Is it possible that we are waiting for each other in a deadlock? For instance, did anyone send me an e-mail and holds now the vain hope that it got past the draconian spamfilter of my ISP?

This message is a reply to:
	Message 117 by Barbarian, posted 12-13-2006 2:02 PM		Barbarian has not replied

Replies to this message:
	Message 119 by Jazzns, posted 12-19-2006 11:41 AM		Barbarian has not replied

Jazzns

Member (Idle past 3940 days)

Posts: 2657

From: A Better America

Joined: 07-23-2004

Message 119 of 120 (370862)
12-19-2006 11:41 AM

Reply to: Message 118 by Barbarian
12-19-2006 11:29 AM

Bootstrapping the communication

I just created a yahoo account. Go ahead and email me there. My yahoo username is 'jazzns'. I'll reply to you after that form my real account.

This message is a reply to:
	Message 118 by Barbarian, posted 12-19-2006 11:29 AM		Barbarian has not replied

Replies to this message:
	Message 120 by Percy, posted 12-19-2006 1:25 PM		Jazzns has not replied

Percy

Member

Posts: 22503

From: New Hampshire

Joined: 12-23-2000

Member Rating: 4.9