|
Register | Sign In |
|
QuickSearch
Thread ▼ Details |
|
Thread Info
|
|
|
Author | Topic: The Minkowski's challenge | |||||||||||||||||||||||
Percy Member Posts: 22503 From: New Hampshire Joined: Member Rating: 4.9 |
Could Objective Caml itself be a performance bottleneck? I'm not myself familiar with the language, and in the five minutes I gave myself to study up I was unable to discover whether compilation produces actual object code or just preparsed scripts that still have to be interpreted. The frequent focus of Caml articles on performance, with frequent mention of such issues as avoidance of run-time type checking, leads me to believe that it is interpretive. If this is the case, then even the couple positive statements I saw claiming the performance is pretty good ("50% of a decent C compiler" was one comment) are not enough to persuade me away from the possibility that your biggest performance bottleneck could well be your choice of language.
--Percy Edited by Percy, : Fix garbled last sentence.
|
|||||||||||||||||||||||
Barbarian Inactive Member |
Actually, it is not me who allocates and deallocates, I just forget about the objects I no longer use and the runtime library frees them when sees so fit. I am not given a chance to write my own suballocator, and considering the years that went into the generic garbage collector of OCaml, I would need lots of optimism to even try. GC is not usually a showstopper, as I would waste more time re-debugging the allocation-deallocation schemes in C whenever I change the code.
I agree it is unusual to have such an even spread of CPU usage. Maybe it would be advisable to try to profile it under Linux too - I did so only over Cygwin, the native-code compiler can produce code which writes logs for gprof to process, but I cannot exactly vouchsafe for the compatibility of gprof and the OCaml compiler. OCaml can produce both interpretable and native code - the latter boasts the 50% of the performance of a decent C compiler. In my case only native code can be profiled, for the profiler of the interpreted code insists on no preprocessing whatsoever, and I use the standard camlp4o preprocessor for stream support. I don't think the performance is that bad, actually; it is just not so lightning-fast as I imagined it to be. It's a few hundred births per second for a normal program, whereas I hoped for about 2000 births per second. I just produced a different profile which shows List.length as first, probably due to extensive actual evolution happening. Since you asked for the other functions towards the top of the profile, I quote it here. Functions above 1%, accounting for a total of 153sec out of 162sec total are shown:
Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls Ts/call Ts/call name 19.72 31.99 31.99 camlList__length_aux_57 17.39 60.21 28.22 camlPervasives__$40_166 13.38 81.92 21.71 sweep_slice 12.38 102.01 20.09 mark_slice 6.09 111.89 9.88 caml_obj_dup 4.54 119.25 7.36 camlMachine__exec_instr_306 3.09 124.27 5.01 caml_oldify_mopup 3.04 129.20 4.93 caml_modify 3.03 134.12 4.92 caml_oldify_one 2.23 137.74 3.62 caml_fl_merge_block 2.13 141.19 3.46 caml_alloc_shr 1.68 143.92 2.73 caml_fl_allocate 1.67 146.63 2.71 caml_alloc_small 1.45 148.99 2.35 caml_oldify_local_roots 1.43 151.31 2.32 camlMachine__exec_aux_412 1.34 153.47 2.17 allocate_blockOnly the two camlMachine_* ones are mine. I do not think excessive optimization is worth it - I might try to remove excessive calls to List.length(), but only after the mysterious bug is caught.
|
|||||||||||||||||||||||
Percy Member Posts: 22503 From: New Hampshire Joined: Member Rating: 4.9 |
Am I misinterpreting your table when I think it's telling me that 85% of your time is OCaml overhead? If so, are the services that the runtime OCaml system is providing of sufficiently high value to justify that?
Is compiled OCaml known to be much, much faster than interpreted Perl? A couple years ago I tried to rig up something real quick because it was so easy in Perl, and that's when I discovered that Perl is less than a hundredth as fast as compiled C++. All that OCaml runtime stuff in your performance profile seems odd if it's really producing object code, and runtime reference tracking for GC is expensive. Taking a look at Objective CAML Tutorial, my guess is that, yes, compiling produces object code, but most of the object code is just calls to runtime system routines. The performance claims no doubt hold up extremely well for Caml code like "let max a b = if a > b then a else b;;", but I have my doubts about anything requiring calls to the runtime system. But it sounds like you're not really worried about performance. I guess something you said must have made me think you were, but if you're not worried about it then I'm certainly not. --Percy
|
|||||||||||||||||||||||
Barbarian Inactive Member |
More like 65%, since camlList__length_aux (with almost 20%) is a library function I might be calling unnecessarily too often. But yes, memory management takes about 65% of the time, and that is unexpectedly high. I suspect that the unusually high List.length() usage and the memory management issue are related somehow; most likely, the program, being excessively functional, keeps recomputing some value unnecessarily and I haven't found the place yet where this happens. It may also be that the code is not doing what I intended it to do, hence the excessive use of such functions. The conclusion is that I have to find the reason for this anomaly because it could be a symptom of a logical error; otherwise, the performance would not bother me that much.
I think the choice of language is justified. For one, I know this language fairly well, so there is no overhead of library lookups etc. Then, I am sure these results are abnormal in some sense, so if I find the mistake, everything is well. Finally, the time I lose when running the program is more than compensated for what I would waste debugging a manual memory management scheme after every change in the code, and changes are bound to happen fairly often to this code. As for comparison with Perl, I really cannot tell; I haven't used Perl for years now and consequently have little relevant experience. I do not think the OCaml native code compiler is so simple - to me it looks like a bona fide optimizing compiler, which only calls routines for memory management, otherwise operations are performed in generated code. Most of the OCaml library is written in OCaml anyway, and there are extremely few built-in functions.
|
|||||||||||||||||||||||
Barbarian Inactive Member |
Have found the source of the mysterious bug, fixed it and the tool seems to be able to run now for hours without hitting any errors. IOW, it is ready to use. I could add some cosmetic touches including even more comments to the source code, but all that is optional, really.
As a bonus, unnecessarily complex self-replicating proglets can now evolve into the minimal self-replicating one without throwing exceptions. To illustrate this, let me show a sample session (with the tool run in a DOS command window) aimed at evolving the shortest possible self-replicating proglet from a complex one:
D:\My Data\TheZone\Misc>evot Evot launched (evot) help The following are valid commands newworld SELECTOR FILENAME* : create new world load FILENAME : load world from file save FILENAME : save world to file sels : list available selectors newpop SELECTOR : add new population based on selector back : move active population pointer back forward : move active population pointer forward cut : cut all populations before the current one drop : drop all populations above the current one do COUNT : run COUNT generations' worth of evolution run : run evolution until interrupted try THRESHOLD : run until last population reaches threshold pops : enumerates populations mems : lists all members mems FIRST LAST : lists members between FIRST and LAST mems COUNT : lists the topmost COUNT members list INDEX : disassemble member identified by INDEX dbug INDEX : launch identified member in debugger dump INDEX FILENAME : save member code to file size SIZE : set max size of current population globsize SIZE : set max size of new populations : print short status quit : leave program help or '?' : this list (evot) sels Selectors: ---------- Null (max 1) = Accepts any self-reproducing program SegmentCount (max 1073741823) = Counts the segments of the program Stomp (max 36) = Selects for smaller max segment length (evot) newworld Stomp "reprod2.gel" Current world: 0 cut and 1 active populations, 1 total Global max fitness = 9 Current population is #1 Selected by 'Stomp', size = 1, top fitness = 9 (evot) try 36 Top fitness = 36, generation #367, target reached Current world: 0 cut and 1 active populations, 1 total Global max fitness = 36 Current population is #1 Selected by 'Stomp', size = 200, top fitness = 36 (evot)(evot) is the prompt of the evo-tool. The first command asks for a listing of available commands, just to show off. The second one asks for a listing of installed selectors/fitness evaluators. The third one instructs the tool to set up a new world with one single population, having Stomp as a selector and the proglet whose source can be found in the file "reprod2.gel" as the only individual. The Stomp selector grants a fitness of zero to any proglet which fails to self-replicate, otherwise grants two points for every loss of an instruction from the longest segment, one point for every loss of a complete segment and applies a five-point penalty for proglets finishing by timeout. The fourth command instructs the tool to run evolution until an individual with fitness =36 appears (36 is the theoretical maximum for self-replicating proglets under Stomp). It took about two minutes to produce the following:
prog *Main specinstr 0 specinstr 1 endwhich is indeed the shortest possible self-replicating entity. For your reference, here is the source listing ("reprod2.gel") of the proglet we started with (please refer to my post earlier where I explained the instructions, and please ignore the comments between (* and *), which contain the assertions used to prove the correctness of the proglet and hence find the bugs in it):
prog #0 newseg (* [*FirstData] *) *Main (* [*Main,*FirstData] *) Cycle1: (* [*Addr,*FirstData] *) #0 dup (* [*Addr,*Addr,*FirstData] *) *CopyToStack call (* [count,val,val,...,val,*Addr,*FirstData] *) *CopyToNewSeg call (* [*NewSeg,*Addr,*FirstData] *) specinstr 0 (* -- *) drop (* [*Addr,*Firstdata] *) nextseg (* [*Addr',*FirstData] *) #1 dup (* [*FirstData,*Addr',*FirstData] *) #1 dup (* [*Addr',*FirstData,*Addr',*FirstData] *) *Done1 ifequal (* [*Addr',*FirstData] *) *Cycle1 jump (* -- *) Done1: specinstr 1 (* *) seg CopyToStack (* [*Addr|...] *) (* [*Addr,|...] *) #0 dup (* [*Addr,*Addr,|...] *) *Done2 invalidaddr (* [*Addr,|...] *) #0 dup (* [*Addr,*Addr,|...] *) load (* [i,*Addr,|...] *) swap (* [*Addr,i,|...] *) incr (* [*Addr',i,|...] *) *CopyToStack jump (* [*Addr',i,|...] *) Done2: (* [*Addr,|...] *) val (* [count,|...] *) ret (* -- *) (* *) seg CopyToNewSeg (* [count,|...] *) #1 neg add (* [count-1,|...] *) swap (* [fistval,count-1,|...] *) newseg (* [*Addr,count-1,|...] *) Cycle2: (* [*Addr,ccnt,|...] *) #1 dup (* [ccnt,*Addr,ccnt,|...] *) #0 (* [0,ccnt,*Addr,ccnt,|...] *) *Done3 ifequal (* [*Addr,cnt,|...] *) #2 lift (* [firstval,*Addr,cnt,|...] *) #1 dup (* [*Addr,firstval,*Addr,cnt,|...] *) push (* [*Addr,cnt,|...] *) swap (* [cnt,*Addr,|...] *) #1 neg add (* [cnt-1,*Addr,|...] *) swap (* [*Addr,cnt-1,|...] *) *Cycle2 jump (* -- *) Done3: (* [*Addr,0|...] *) swap drop (* [*Addr|...] *) ret (* -- *) end I might add that in every generation the intermediate form was able to write a perfect replica of itself, and that I find it amazing that there exists a line of single mutations leading from the long quoted proglet to this small one all the while being able to self-replicate. Otherwise, the complete mutation history is preserved for every individual, so it can be inspected for the benefit of the doubters. Now the project can enter the second phase: proper selectors have to be implemented in order to evolve encryption. We should also return to the issue of how to distribute the source if anyone is interested in playing with it. Edited by Barbarian, : missing space, missing 'end' from the short proglet
|
|||||||||||||||||||||||
Jazzns Member (Idle past 3940 days) Posts: 2657 From: A Better America Joined: |
How many generations did it take from the proglet to the simple replicator?
Hopefully Percy can take some time off his busy schedule to talk about hosting. If not then I can arange something either temporarily while we wait for Percy or permanently if he is unable. Of course, biblical creationists are committed to belief in God's written Word, the Bible, which forbids bearing false witness; --AIG (lest they forget)
|
|||||||||||||||||||||||
Jazzns Member (Idle past 3940 days) Posts: 2657 From: A Better America Joined: |
I am always leery of posting my email in a public forum. I noticed that Barbarian also does not have his email in his profile. Wish his permission, could you send us eachothers email?
At the very least, could you send him mine so I can get a copy of his code to start playing with. I have ocaml installed and everything! Of course, biblical creationists are committed to belief in God's written Word, the Bible, which forbids bearing false witness; --AIG (lest they forget)
|
|||||||||||||||||||||||
Barbarian Inactive Member |
If that is enough, I do agree with Percy giving my e-mail address to you, although it would also be enough if I got yours, and sent you the packed sources.
Concerning the number of generations needed for the evolution of a minimal replicator: in the particular case I posted, it took 367 generations, but I have seen extremes like 210 and 1580 generations too - the framework re-seeds the random number generator at startup, so evolution is not repeatable (used to be, during debugging ...) and every run takes different paths and different generation counts. The range depends on the ur-proglet, as the first one I wrote only needed about 8-10 generations to get to the minimal case. But the one I posted above calls two subroutines in a cycle, the first subroutine copies a segment on the stack while the second one copies the stacked instructions to a newly allocated segment. In addition the subroutines are - in fact, must be - located in separate segments. From this we get a single-segment, no-subroutines program which then fairly quickly evolves into the minimal one; it takes a lot of time to do away with the subroutine mechanism, and the maximal fitness shows stagnation for the first 85-90% of the total time, after which it quickly goes up.
|
|||||||||||||||||||||||
Jazzns Member (Idle past 3940 days) Posts: 2657 From: A Better America Joined: |
It would be nice to be able to handpick the random seed so you could reproduce the exact evolution if you wanted to.
Of course, biblical creationists are committed to belief in God's written Word, the Bible, which forbids bearing false witness; --AIG (lest they forget)
|
|||||||||||||||||||||||
Barbarian Inactive Member |
Done.
Invocation Effect -------------------------------------------------- evot => initializes seed from clock evot -randseed 123 => seed set to 123 evot -randseed 0 => initializes seed from clock
|
|||||||||||||||||||||||
Percy Member Posts: 22503 From: New Hampshire Joined: Member Rating: 4.9 |
To Jazzns and Barbarian,
Okay, okay, I've been remiss. Yes, I can host. I promise to try to find time tonight to set up a subdomain for Jazzns, this will give him the ability to take things from there. Takes max of 24 hours for DNS name to propagate through the Internet, though. --Percy
|
|||||||||||||||||||||||
Barbarian Inactive Member |
I will be away from the internets from now till Friday afternoon (in CET = GMT + 1), so you don't have to hurry.
|
|||||||||||||||||||||||
Barbarian Inactive Member |
Is it possible that we are waiting for each other in a deadlock? For instance, did anyone send me an e-mail and holds now the vain hope that it got past the draconian spamfilter of my ISP?
|
|||||||||||||||||||||||
Jazzns Member (Idle past 3940 days) Posts: 2657 From: A Better America Joined: |
I just created a yahoo account. Go ahead and email me there. My yahoo username is 'jazzns'. I'll reply to you after that form my real account.
|
|||||||||||||||||||||||
Percy Member Posts: 22503 From: New Hampshire Joined: Member Rating: 4.9 |
The subdomain jazzns. has been created, but the DNS entry probably won't have propagated until tonight or tomorrow morning. There's a corresponding jazzns account, you can probably log in now, I'll send you a little info via email. The account has ftp, telnet and ssh access, your choice. It also has CGI capability, of course, and MySQL access.
--Percy
|
|
|
Do Nothing Button
Copyright 2001-2023 by EvC Forum, All Rights Reserved
Version 4.2
Innovative software from Qwixotic © 2024