You should add Dawkins' The Blind Watchmaker to your reading list, or at the very least the first half of Chapter 3, in which he describes the difference between single-step selection (SSS, to make my typing go faster) and cumulative selection (CS). In SSS, you try to get everything to fall randomly into place all at once, then you try again from scratch, and again from scratch, etc. OTOH, CS uses each attempt as the starting point for the next attempt. SSS is what creationists and IDists claim that evolution uses, whereas CS is what evolution actually does use.
The effectiveness of SSS can be calculated directly and, as you stated, it results in an extremely low probability, so low as to be deemed virtually impossible. But that's not what evolution uses. To illustrate and test a CS model, Dawkins wrote his WEASEL program in BASIC, which used CS to generate a single line of Shakespeare, "Methinks it is like a weasel". They started the program and left for lunch, and it was finished by the time they returned. In the book, he does not provide a code listing, but he does describe what the program does. Armed with that information, many skeptics have written their own versions of WEASEL in their own choice of language (BASIC, being an interpreted language, is rather slow and so is not a very good choice unless you're not a very experienced programmer). The Wikipedia article that Modulus pointed you to (please read it) points to a long-established page that presents a number of those programs:
Almost Like a Whale.
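To make the SSS/CS difference concrete, here is a minimal Python sketch of a WEASEL-style program. It is not Dawkins' code (he published none) and not MONKEY either; the function names, the population size of 100, and the 5% per-letter mutation rate are all my own assumptions for illustration.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "  # assumed alphabet: 26 letters plus space


def score(candidate, target=TARGET):
    """Count the positions where candidate already matches the target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy parent, replacing each letter with a random one with probability rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else c for c in parent
    )


def cumulative_selection(target=TARGET, pop_size=100, rate=0.05,
                         seed=None, max_generations=100_000):
    """CS: each generation starts from the best string of the previous one."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    generations = 0
    while parent != target and generations < max_generations:
        offspring = [mutate(parent, rate, rng) for _ in range(pop_size)]
        parent = max(offspring, key=lambda s: score(s, target))
        generations += 1
    return parent, generations


def single_step_selection(target=TARGET, max_tries=100_000, seed=None):
    """SSS: every try is a fresh random string; nothing carries over."""
    rng = random.Random(seed)
    for tries in range(1, max_tries + 1):
        if "".join(rng.choice(ALPHABET) for _ in target) == target:
            return tries
    return None  # gave up without a match


if __name__ == "__main__":
    result, gens = cumulative_selection(seed=1)
    print(f"CS reached {result!r} after {gens} generations")
    print("SSS result after 100,000 tries:", single_step_selection(seed=1))
```

The only structural difference between the two functions is whether the next attempt starts from the previous best string or from scratch, which is the whole point of the chapter.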
My own program, MONKEY, was written a couple of decades ago in Turbo Pascal (I am David Wise). You can download my program along with the source code from Musgrave's site (my own is down until I can find a new provider). Unfortunately, a timing loop in the start-up code of TP started failing when PCs started running faster, as described by Musgrave on his page. I found and incorporated a fix for that, but I don't know whether Musgrave ever got it.
I wrote MONKEY because I couldn't believe Dawkins' claim. I calculated the SSS probability and assumed a computer much faster than my PC/XT (Norton Factor 2), one that could perform a million tests per second. I came to the conclusion that in order to have one chance in a million, that computer would need to run for a couple hundred billion years, many times longer than the universe has been in existence. But with CS, MONKEY succeeded within a few minutes (most often within 30 seconds, depending on population size) consistently, repeatedly, without fail.
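The shape of that SSS calculation can be sketched in a few lines of Python. The alphabet size and string length below are assumptions (a 27-character alphabet and Dawkins' 28-character target), not necessarily the values MONKEY used, so the number printed will differ from the figure quoted above; the conclusion that it dwarfs the age of the universe does not depend on those details.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def sss_years(alphabet_size, length, tests_per_second, desired_chance):
    """Years of random tries needed for the given chance of one exact match."""
    # Number of equally likely strings of this length.
    possibilities = alphabet_size ** length
    # For small chances, P(at least one hit in n tries) is about n / possibilities,
    # so n is about desired_chance * possibilities.
    tries_needed = desired_chance * possibilities
    return tries_needed / tests_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Assumed inputs: 27 symbols, 28 positions, a million tests per second,
    # and a one-in-a-million chance of success.
    years = sss_years(27, 28, 1_000_000, 1e-6)
    print(f"{years:.2e} years")
```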
I then analyzed the probabilities involved, though for CS I had to employ Markov chains. What I found was that the only way for MONKEY to fail using CS would be for almost every single attempt to fail, the probability for which is much lower than the probability for SSS to succeed.
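For a feel of how a Markov-chain treatment goes, here is a deliberately simplified Python model. It is not the analysis in my write-up; it assumes each offspring changes exactly one randomly chosen position to a random letter and that the parent is kept if no offspring improves on it, so the state (number of correct letters) can only stay put or go up by one. The function names and parameter values are hypothetical.

```python
def expected_generations(length, alphabet_size, pop_size):
    """Expected generations to go from 0 correct letters to all correct."""
    total = 0.0
    for correct in range(length):
        # One offspring improves if it picks a wrong position
        # and then draws the right letter for it.
        p_child = (length - correct) / length / alphabet_size
        # The generation advances if at least one offspring improves.
        p_generation = 1.0 - (1.0 - p_child) ** pop_size
        # Waiting time in each state is geometric with mean 1 / p_generation.
        total += 1.0 / p_generation
    return total


def chance_of_stalling(length, alphabet_size, pop_size, generations):
    """Chance of staying stuck on the last wrong letter for that many generations."""
    # Hardest state: only one position is still wrong.
    p_child = 1.0 / length / alphabet_size
    # Stalling requires every single offspring in every generation to fail.
    return (1.0 - p_child) ** (pop_size * generations)


if __name__ == "__main__":
    print(expected_generations(28, 27, 100))
    print(chance_of_stalling(28, 27, 100, 1000))
```

The second function is the "almost every single attempt has to fail" point in miniature: the failure probability shrinks geometrically with the total number of attempts.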
I just checked and Musgrave included my calculations and write-up on his page.