
Tuesday, 25 April 2017

TTestComplete



A while ago the British Government was silly enough to allow me onto committees to decide how to spend millions of taxpayers' money on scientific and engineering research.  They even had me chair the meetings occasionally.

We'd get a stack of proposals for experiments, together with peer-review reports from other people like me on whether the experiments were worth doing or not.  The committees' modus operandi was to put the proposals that the reviewers said were best at the top of the pile, then work down discussing them and giving their proposers the money they wanted until the money ran out.

I liked to cause trouble by starting each meeting with my explanation of why this approach is All Wrong.

"The ones we should put at the top of the pile," I said, "are the ones where half the reviewers say 'Brilliant!' and the other half say 'Rubbish!'.  Those are the proposals that nobody knows the answer to, clearly.  So those are the experiments that are most important."

The other academics there would smile at me indulgently because of my political naivety.  The civil servants would smile at me nervously in case any of my fellow academics actually decided to do what I proposed.  And then everyone would carry on exactly as they had always done.

After a while I started saying no when I was asked to attend.

---o---

There has been an understandable fuss recently prompted by some good research by my erstwhile colleague Joanna Bryson and others about algorithmic racism - that is to say things like Google's autocomplete function giving the sort of results you can see in the picture above.

Google's (and others') argument in defence of this is a strong one.  The essence of it is that their systems are driven by their users' preferences and actions; they gather the statistics and show people what most other people want to see when those other people do the same as you do.  The results are sometimes modified from "most other people" to "most other people like you", where "like you" is again the result of a statistical process.  If most other people are racist, historically ignorant cretins, then you will see results suitable for racist, historically ignorant cretins.  They (Google and the rest) are not like newspaper editors deciding what to put in front of people; they are just reflecting humanity back at you, you human you.

But you can see from the picture that the results of this are sometimes very bad, by almost any sensible moral definition.

Clearly what is needed is not the intervention of an editor - that would result in Google, Facebook and the rest turning into the New York Times or the Daily Mail, which would be a retrograde step, not an improvement.  What is needed is an unbiased statistical process that weights searches, hyperlinks and the rest from clever people more heavily than those from stupid people.

Note that I'm not saying that clever people aren't racists, and that stupid people are. I suspect that there is not that good a correlation, though this is interesting.  I'm just saying that in general all the web's automated linking and ranking systems ought to work better if they weighted the actions of people by their intelligence.

But how to grade the intellectual ability of web users?  The answer lies in the big data that all the web companies already use.  Facebook, for example, has a record of billions of people's educational achievements.  More interestingly, it should be simple to train a neural network to examine tweets, blog posts and so on and to correlate their content with that educational data.  That network would then be able to grade new people - including those who had revealed no qualifications - just by reading what they say online, and to apply weights accordingly.
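Just to make the idea concrete, here is a minimal sketch in Python.  The example posts and numbers are invented, and this is certainly not how Google or Facebook actually do anything; it just shows the shape of the thing: learn a mapping from what people write to a known attainment label, then use the predicted score of an unknown user as a ranking weight.

```python
# A minimal sketch of the weighting idea. All data below are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical training data: posts by users whose education level is on record,
# coded here as years of formal education (any ordinal proxy would do).
posts = [
    "The error bars on that graph are doing a lot of work.",
    "lol who even reads the article before commenting",
    "Correlation structure matters more than the headline r value.",
    "everyone knows thats fake news wake up",
]
years_of_education = [18, 11, 20, 10]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
model.fit(posts, years_of_education)

# Grade a new user who has revealed no qualifications, and turn the grade
# into a weight for their searches, links and likes.
new_post = ["The sample size here seems far too small to support that claim."]
score = model.predict(new_post)[0]
weight = max(score, 1.0)  # never weight anyone at zero or below
print(f"estimated attainment: {score:.1f}, ranking weight: {weight:.1f}")
```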

I have no idea if this is a good idea or not.  It is my idea, but I'm not intelligent enough...


Thursday, 12 January 2017

HardClever


By now there must be a lot of people who actually believe that little glowing lights move along their axons and dendrites when they think, flashing at the synapses.

Anyway.

There has been a lot of fuss about AI lately, what with Google Translate switching over to a neural network, rich people funding AI ethics research, and the EU trying to get ahead of the legislative curve.  There has also (this is humans in conversation, after all...) been a lot of stuff on the grave dangers to humanity of super-intelligent AIs from the likes of Stephen Hawking and Nick Bostrom.

Before we get too carried away, it seems to me that there is one very important question that we should be investigating.  It is: What is the computational complexity of general intelligence?  Before I say how we might find an answer, let me explain why this is important by looking at the extremes that that answer might take.  

At one end is linear complexity.  In this case, if we have a smart computer, we can make it ten times smarter by using a computer that is ten times bigger or faster.

At the other end is exponential complexity.  In this case, if we have a smart computer, we can make it ten times smarter only by having a computer that is twenty-two-thousand times bigger or faster.  (That is e^10 times bigger or faster; there may be a factor in there too, but that's the essence of it.)
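The arithmetic behind that twenty-two-thousand is just a couple of lines (the "ten times smarter" factor comes from the paragraphs above; the rest is the exponential function):

```python
import math

# Resources needed to make a machine ten times smarter, under the two extremes.
smartness_gain = 10

linear_cost = smartness_gain                 # ten times the hardware
exponential_cost = math.exp(smartness_gain)  # e^10 of the hardware

print(f"linear:      {linear_cost}x bigger or faster")
print(f"exponential: {exponential_cost:,.0f}x bigger or faster")  # ~22,026
```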

If smart computers do really present a danger, then the linear case is bad news because the machines can easily outstrip us once they start designing and building themselves and it is quicker to make a computer than to make a person.  In the exponential case the danger becomes negligible because the machines would have great difficulty obtaining the resources to make smarter versions of themselves.  The same problem would inhibit us trying to make smarter machines too (or smarter people by genetic engineering, come to that).

Note, in passing, that given genetic engineering the computers have no advantage over us when they, or we, make smarter versions of themselves or ourselves.  The computational complexity of the problem must be the same for both.

The big fuss about AI at the moment is almost all about machine learning using neural networks.  These have been around for decades doing interesting little tricks like recognising printed letters of the alphabet in images.  Indeed, thirty years ago I used to set my students a C programming exercise to make a neural network that did precisely that.

Some of the computational complexity of neural-net machine learning falls neatly into two separate parts.  The first is the complexity of teaching the network, and the second is the complexity of it thinking out an answer to a given problem once it has been taught.  The computer-memory required for the underlying network is the same in both cases, but the time taken for the teaching process and the give-an-answer process are different and separable.

Typically learning takes a lot longer than finding an answer to a problem once the learning is finished.  This is not a surprise - you are a neural network, and it took you a lot longer to learn to read than it now takes you actually to read - say - a blog post.
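You can see the two costs separate on any desk machine.  Here is a hedged little experiment using the small printed-digit images that ship with scikit-learn (a stand-in for my students' letters-of-the-alphabet exercise, not the original); it simply times the teaching step against the answering step:

```python
import time

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Small images of printed digits, bundled with scikit-learn.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)

t0 = time.perf_counter()
net.fit(X_train, y_train)             # the teaching process
teach_time = time.perf_counter() - t0

t0 = time.perf_counter()
accuracy = net.score(X_test, y_test)  # the give-an-answer process
answer_time = time.perf_counter() - t0

print(f"teaching took {teach_time:.2f} s, answering took {answer_time:.4f} s, "
      f"accuracy {accuracy:.2f}")
```

On my assumptions about a typical laptop, teaching takes seconds while answering takes milliseconds; the exact numbers will vary, but the gap is the point.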

The reason for the current fuss about machine learning is that the likes of Google have realised that their big-data stores (which are certainly exponentially bigger than the newsprint that I used to give my students to get a computer to read) are an amazingly rich teaching resource for a neural network.

And here lies a possible hint at an answer to my question.  The teaching data has increased exponentially, and as a result the machines have got a little bit smarter.

On the other hand, once you have taught a neural network, it comes up with answers (that are often right...) to problems blindingly fast.  The time taken is roughly proportional to the logarithm of the size of the network.  This is to say that, if a network takes one millisecond to answer a question, a network twenty-two-thousand times bigger will take just ten milliseconds.

But the real experiments to find the computational complexity of general intelligence are staring us in the face.  They lie in biology, not in computing.  Psychologists have spent decades figuring out how smart squirrels, crows, ants, and all the rest are.  And they have also investigated related matters like how fast they learn, and how much they can remember.  Brain sections and staining should allow us to plot a graph of numbers of neurons and their degree of interconnectivity against an ordering of smartness of species.  We'd then be able to get an idea if ten times as smart requires ten times as much brain, or twenty-two-thousand times as much, or somewhere in between.

Finally, Isaac Asimov had a nice proof that telepathy doesn't exist.  If it did, he said, evolution would have exploited and refined it so fast and so far that it would be obvious everywhere.

We, as the smartest organisms on the planet, like to think we have taken it over.  We have certainly had an effect, and now find ourselves living in the Anthropocene.  But that effect on the planet is negligible compared to - say - the effect of phytoplankton, which are not smart at all.  And our unique intelligence took three billion years to achieve.  This is a strong indication that it is quite hard to engineer, even for evolution.

My personal guess is that general intelligence, by which I mean what a crow does when it bends a wire to hook a nut from a bottle, or what a human does when they explain quantum chromodynamics, will turn out to be exponentially hard.  We may well get there by throwing exponential resources at the problem.  But to get further either the intelligent computer, or we, will require exponentially more resources.

Saturday, 17 December 2016

TubeFreight



Occasionally one sees a freight train like this one on the London Tube.

If you want to move something smaller, like a parcel, quickly from an office in Fenchurch Street to one in the Fulham Road you give it to a bike messenger.  That person goes off at speed, makes a dent in a BMW bumper ("I didn't see you, mate."), gets sticky red stuff all over the BMW's windscreen, and fails to transfer the parcel.  This is not a very satisfactory solution to the delivery problem.

But it would be quite easy to design a QR-code-based system that allowed you to drop off your parcel at Aldgate station.  Your parcel would slide down a chute to the appropriate platform, having had its destination automatically scanned and having had your account appropriately debited.  There a robot would load it onto the next train (maybe on a parcel and letter rack between the passenger carriages).

At each station the robots would be loading and unloading packages, and swapping them by conveyor to different lines automatically.

When your parcel reached Fulham Broadway that station's robot would unload it and send it on a conveyor out to a collection point on the street.  Its recipient would get a text to say their parcel was ready, whereupon they would stroll to the station, wave their phone at the collection point, and be given their parcel.
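The routing step is the easy bit.  Here is a sketch of it over a toy fragment of the network; the station names are real but the connectivity is illustrative only, and a real system would route over the full map with timetables and capacities:

```python
from collections import deque

# Toy fragment: station -> neighbouring stations the parcel robots can pass a
# package to (by train or by interchange conveyor). Illustrative, not the real map.
network = {
    "Aldgate": ["Tower Hill", "Liverpool Street"],
    "Tower Hill": ["Aldgate", "Monument"],
    "Monument": ["Tower Hill", "Westminster"],
    "Westminster": ["Monument", "Victoria"],
    "Victoria": ["Westminster", "Earl's Court"],
    "Earl's Court": ["Victoria", "Fulham Broadway"],
    "Fulham Broadway": ["Earl's Court"],
    "Liverpool Street": ["Aldgate"],
}

def route(origin: str, destination: str) -> list[str]:
    """Fewest-hops route for a parcel, found by breadth-first search."""
    queue = deque([[origin]])
    seen = {origin}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for nxt in network[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    raise ValueError("no route found")

print(" -> ".join(route("Aldgate", "Fulham Broadway")))
```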

The whole system would be fast and fully automatic, and it would make extra income for Transport for London. It would also reduce the need for BMW drivers to keep cleaning their windscreens.




Tuesday, 18 October 2016

MemoryLane


Google Street View lets you go anyplace on Earth that Google's cameras have previously visited (which is pretty much everywhere) and explore that place interactively as a 3D virtual world.  Sometimes the pictures are a bit out of date, but the system is still both interesting and useful.

In one way, however, the pictures are not out of date enough.

There are now many complete 3D computer models of cities as they were in different historical eras.  The picture above, for example, is a still from a video fly-through of a model of seventeenth century London created by De Montfort University.  But a directed video fly-through is not the same as a virtual world that you can explore interactively.

So why not integrate these models with Street View?  You could have an extra slider on the screen that would allow you to wind back to any point in history and walk round your location at that date.  There would be gaps, of course, which could be filled in as more models became available.  And also some of the buildings and other features would be conjecture (the De Montfort model is accurate as far as the known information is concerned, but it is set before the Great Fire so there are interpolations).  As long as these were flagged as such there would be no danger of confusion.  Street View does allow you to go back through Google's scanned archive, but in the seventeenth century they were quite a small company without the resources needed to do the scanning.
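The lookup behind the slider would be simple.  A sketch, with an invented catalogue (the date ranges and the second entry are placeholders, not real datasets): given a date, pick the model that covers it and flag how much of it is conjecture.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HistoricalModel:
    name: str           # e.g. the De Montfort seventeenth-century London model
    year_from: int
    year_to: int
    conjectural: bool   # True where buildings are interpolated, not documented

# Invented placeholder catalogue; a real system would also index by location.
catalogue = [
    HistoricalModel("London, De Montfort reconstruction", 1600, 1666, True),
    HistoricalModel("London, post-Fire rebuilding (hypothetical)", 1667, 1720, True),
]

def model_for(year: int) -> Optional[HistoricalModel]:
    """Return the model covering the requested year, or None if there is a gap."""
    for m in catalogue:
        if m.year_from <= year <= m.year_to:
            return m
    return None

chosen = model_for(1650)
if chosen is None:
    print("No model for that date yet - a gap to be filled in later.")
else:
    flag = " (partly conjectural)" if chosen.conjectural else ""
    print(f"Showing: {chosen.name}{flag}")
```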

On your 'phone, the historical data could be superimposed on the modern world in augmented reality as you walked in it, Pokémon Go style, giving you details of superseded historical architecture in your current location.

And when there were enough data we could train a neural network to predict the likely buildings at a given location on a given date from the buildings preceding them in history.  Running that on the contemporary Street View would give us an idea of what our cities might look like in the future...


Wednesday, 31 August 2016

DashDot


Amazon now have their Dash button that allows you to buy a restricted range of goods from - surprise - Amazon, when something runs out.  So you put the button on your washing machine, press it when the powder gets low, the button automatically does a buy-with-one-click using your home wifi, and a new pack arrives a day later.

But you can't set the buttons up to buy anything you like from Amazon, let alone from other suppliers.  The button locks you in to products that may well not be the best deal, nor exactly what you want.

Clearly what's needed is a user-programmable button that you can set up to take any online action that you preset into it.  Thus pressing the button might indeed do an Amazon one-click, or it might add an item to your Tesco online order, or it might boost your web-controlled central heating in the room where you are sitting, or it might just tweet that you are having your breakfast (if you feel that the world needs to know that on a daily basis).

Electronically, such a device would be straightforward.  And - as a marketing opportunity - it is potentially huge.  It would allow people total control over what they buy and from whom, completely subsuming Amazon Dash within itself among a much wider range of possibilities.  And in addition it could be used to carry out a vast range of non-buying online actions that are amenable to your pressing a button when you feel like it.
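The software side is no harder than the electronics.  A sketch of the idea (the endpoints and payloads below are placeholders I have made up, not real APIs): each physical button is just a name mapped to whatever online action its owner has configured.

```python
import json
import urllib.request

# Placeholder configuration - the URLs and payloads are invented examples.
BUTTON_ACTIONS = {
    "washing-machine": {"url": "https://example.com/orders",
                        "payload": {"item": "washing powder", "qty": 1}},
    "sitting-room":    {"url": "https://example.com/heating/boost",
                        "payload": {"minutes": 30}},
}

def press(button_name: str) -> int:
    """Fire the configured action for a button and return the HTTP status."""
    action = BUTTON_ACTIONS[button_name]
    data = json.dumps(action["payload"]).encode()
    req = urllib.request.Request(
        action["url"], data=data,
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status

# On the device itself a GPIO interrupt would call press(); here we just
# simulate a press.
if __name__ == "__main__":
    print(press("sitting-room"))
```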

If I can find a spare afternoon, I might just design it and open-source the results...

Tuesday, 16 February 2016

GunAngel



Steven Pinker's famous book The Better Angels of Our Nature posits four reasons for the decline in violence that has happened and is happening in the world: empathy, self-control, our moral sense, and reason.  He explicitly (though only partially) rejects the idea that we are evolving to become more peaceful.

I am not sure (particularly given meme as well as gene copying) that evolution can be discounted as an explanation for the decline in violence.

Recall John Maynard Smith's hawks-and-doves example of an evolutionarily stable strategy. Suppose the payoff (or utility) matrix is


             hawk            dove
hawk     -1, -1          +0.5, -0.5
dove     -0.5, +0.5      +0.25, +0.25

What this says in English is that when two hawks meet they fight and each loses 1 unit of utility (the -1s top left) because of energy wastage, injury or death.   When a hawk meets a dove the hawk gains +0.5 units of utility because the hawk can easily steal from the dove (the +0.5 top right) and the dove loses 0.5 (the -0.5).  When a dove meets a hawk the reverse happens (bottom left).  And when two doves meet they each gain 0.25 units because they don't fight and can cooperate (bottom right).

The resulting utility graph looks like this:

The horizontal axis is the proportion of doves (the proportion of hawks is one minus the proportion of doves) and the vertical axis is utility.  The blue line is what hawks get for any given proportion of doves, and the orange line is what doves get.  To the left of the crossing point the orange line is higher, so there it makes more sense to be a dove than a hawk.  To the right the blue line is higher, so there it makes more sense to be a hawk than a dove.  This means that the crossing point is the point where the population is evolutionarily stable - at that point it makes no sense for either doves or hawks to change their behaviour.  The crossing point is where the population has 33% of hawks and 67% of doves.

(I have chosen numbers that make the Nash equilibrium occur at zero utility for simplicity; this is not necessary for the argument that follows.)
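The crossing point follows directly from the four payoffs, so you can check the 33%/67% split in a few lines (and, plugging in the deadlier payoffs I introduce below, you get the 25%/75% split the post reaches later):

```python
from fractions import Fraction

def stable_dove_fraction(hh, hd, dh, dd):
    """Proportion of doves at which hawks and doves do equally well.

    hh: hawk meets hawk, hd: hawk meets dove, dh: dove meets hawk,
    dd: dove meets dove (payoffs to the first-named player). Setting
    (1 - p)*hh + p*hd == (1 - p)*dh + p*dd and solving for p gives:
    """
    return (dh - hh) / ((dh - hh) + (hd - dd))

# The payoffs from the matrix above, as exact fractions.
hh, hd, dh, dd = Fraction(-1), Fraction(1, 2), Fraction(-1, 2), Fraction(1, 4)
p = stable_dove_fraction(hh, hd, dh, dd)

print(f"doves: {float(p):.0%}, hawks: {float(1 - p):.0%}")      # doves: 67%, hawks: 33%
print("utility at the crossing point:", (1 - p) * hh + p * hd)  # 0, by my choice of numbers
```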

Now suppose that one thing changes: technological advance makes weapons more deadly.

Note very carefully that better weapons is not the same thing as more weapons.  The number of weapons always goes as the proportion of hawks (33% above) and is an output from, not an input to, the model.

With better weapons, when a dove meets a dove nothing is different because they didn't fight before and they don't now.  When a hawk meets a dove the hawk gets the same profit as before because the dove surrendered all that it had before.  So the numbers in the right hand column stay the same except for...

When a dove meets a hawk the dove may lose more (maybe it dies instead of merely being injured: the -0.75s). And when a hawk meets a hawk both lose disastrously because their better weapons mean greater injury and more death (the -1.5s).  So the numbers in the left hand column get more negative:


             hawk             dove
hawk     -1.5, -1.5       +0.5, -0.75
dove     -0.75, +0.5      +0.25, +0.25

and the utility graph changes:

Now the population is stable when there are fewer hawks (25%) - and thus also fewer weapons - and more doves (75%).

Making weapons better at killing gives a society with fewer of them; a society that is more peaceful.

Monday, 1 February 2016

PhoneChat



It is quite entertaining to listen in when my daughter (who's not the woman in the photo above) gets a scam telephone call.  She sets herself two targets:
  1. To keep the scammer on the line as long as possible to waste their time and money, and
  2. To try to get the scammer's credit card or bank details.
So far she has failed on Target 2, but she does manage to keep some of them doggedly attempting to return to their scripts after she has led them up garden paths after wild geese and red herrings for a long time.

But the problem is that all this wastes her time too.

Chatterbots have been around since I was programming computers by punching little holes in rectangles of cardboard.   The first was Weizenbaum's ELIZA psychiatrist.  That mimicked a non-directive therapist.  It was completely brainless, but so strong is the human impulse to ascribe active agency to anything that talks to us, it was both interesting and fairly convincing to have a typed conversation with.
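For flavour, the whole trick fits in a few lines.  This is a hugely simplified ELIZA-style responder of my own, not Weizenbaum's actual script: match a pattern, reflect the pronouns, ask a non-directive question.

```python
import re

# A tiny ELIZA-style responder: a handful of patterns and pronoun reflections.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are", "you": "I"}
RULES = [
    (r"i feel (.*)", "Why do you feel {0}?"),
    (r"i am (.*)", "How long have you been {0}?"),
    (r"my (.*)", "Tell me more about your {0}."),
    (r"(.*)", "Please go on."),
]

def reflect(fragment: str) -> str:
    return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

def respond(sentence: str) -> str:
    for pattern, template in RULES:
        match = re.match(pattern, sentence.lower().strip())
        if match:
            return template.format(*(reflect(g) for g in match.groups()))
    return "Please go on."

print(respond("I feel that my work is pointless"))
# -> Why do you feel that your work is pointless?
```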

And these days chatterbots are much more sophisticated.  With near real-time speech recognition, voice synthesis that sounds like a proper human, and recursive sentence writers that never make a grammatical mistake, they can just about hold a real 'phone conversation.   Just listen to the second recording here - the appropriate laughter from the robot is stunning.

So how about a 'phone app that you tap when you get a scam call?  This app takes over the conversation, wasting the scammer's time for as long as possible and allowing you to get on with your life.

But it needn't end there.  The app could transcribe the entire scam conversation and upload it.  This would automatically compile a reference collection of scammers' scripts that anyone could google while they had someone on the line that they were suspicious of.  Also the app could evolve: conversational gambits that led to longer calls could be strengthened, and the new weights could be incorporated in upgrades, so the app would get better and better at keeping the hapless scammer hanging on the line.  Finally, the app could take the record of the things that the scammers themselves say and add variations on that to its repertoire of responses.
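The evolving part is the simplest kind of reinforcement.  A sketch (the gambits are invented, and the update rule is just one plausible choice): keep a weight per gambit, bump it in proportion to how long the resulting call lasted, and pick gambits in proportion to their weights.

```python
import random

# Weight per conversational gambit; longer resulting calls strengthen a gambit.
# The gambits themselves are invented examples.
gambits = {
    "Oh hold on, someone's at the door...": 1.0,
    "Which computer? I have nine.": 1.0,
    "Can you speak up? I'm on a tractor.": 1.0,
}

def pick_gambit() -> str:
    """Choose a gambit with probability proportional to its current weight."""
    return random.choices(list(gambits), weights=list(gambits.values()), k=1)[0]

def record_call(gambit: str, call_seconds: float, rate: float = 0.01) -> None:
    """Strengthen a gambit in proportion to how long it kept the scammer talking."""
    gambits[gambit] += rate * call_seconds

# One simulated call: the chosen opener kept the scammer busy for ten minutes.
opener = pick_gambit()
record_call(opener, call_seconds=600)
print(opener, gambits[opener])
```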

Already there are online lists of source numbers for scammers (though most disguise their origins, of course).  When the app found that your 'phone's account was coming to the end of the month and that you had unused free minutes, it could dial up those scammer numbers at three in the morning and see how many crooks' credit card and bank details it could gather and post online...


Tuesday, 29 December 2015

SprogWort



The very wonderful Ben Goldacre rightly has it in for (among others) journalists who misrepresent scientific research to generate a completely artificial scare story.  Think no further than the MMR scandal in journalism, for example, in which newspapers killed many children.  (And in which one journalist, Brian Deer, exposed the original fraud.)

Often the problem is not the original paper describing the research, but the press release put out by the authors' institution (whose PR departments are usually staffed by more journalists).  Of course the authors are at fault here - the PR department will always give them a draft of any release to check, and they should be savage in removing anything that they think may cause distortions if reported.  But authors are not disinterested in fame and publicity.

It seems to me that there is a simple solution.  The normal sequence of events is this:

  1. Do experiments,
  2. Write a paper on the results,
  3. Submit it to a journal,
  4. Correct the paper according to criticisms from the journal's reviewers,
  5. See the paper published,
  6. Have the PR people write a press release based on the paper,
  7. Check it,
  8. Send it out, and
  9. See the research - or a distortion of it - appear in the press.

But simply by moving Item 6 to between Items 2 and 3 - that is, by having the press release sent out with the paper to the journal's reviewers - a lot of trouble could be avoided.  The reviewers have no interest in getting fame and publicity (unlike the authors and their institution), but they are concerned with accuracy and truth.  If they were to correct the press release along with the paper itself, and in particular were compelled to add a list at the start of the press release stating in plain terms what the paper does and does not say, then most of the distortions would never get off the ground.

The list would look something like:

  1. This paper shows that rats eating sprogwort reduced their serum LDL (cholesterol) levels by a statistically significant amount.
  2. This paper does not show that sprogwort reduces cardiovascular disease in rats.
  3. This paper does not show that sprogwort reduces cardiovascular disease in humans.
  4. Sprogwort is known to be neurotoxic in large doses; the research in this paper did not study that at all.


Then everyone would quickly discover that the following morning's headline in the Daily Beast that screams

SPROGWORT CURE FOR HEART ATTACKS

was nonsense.  In particular other journalists would know, and - of course - there's nothing one journalist loves more than being able to show that a second journalist is lying...