[LINK] Machine Learning Was: Re: Robot cars and the fear gap

Tue Jul 26 06:41:20 AEST 2016

At 4:57 +1000 26/7/16, Andy Farkas wrote:
>A really nice and simple overview of machine learning:
>https://medium.com/@ageitgey/machine-learning-is-fun-80ea3ec3c471#.pigplooz9

To the sceptics among us, it provides a nice, easy way to develop the critique that machine learning badly needs.

For example:

>"Machine learning" is an umbrella term covering lots of ... generic algorithms [that can tell you something interesting about a set of data without you having to write any custom code specific to the problem]

This raises such questions as:
-   how big is the set?
-   what parts of the real world do the data correspond to sufficiently
    accurately to be useful?
-   how is the choice made as to which algorithm to use in which context?
-   how is the size and content of a sufficient training-set determined?
-   how are 'answers' tested against the real-world?
-   how is an audit performed?

>In supervised learning, you are letting the computer work out that relationship for you. And once you know what math was required to solve this specific set of problems, you could answer to any other problem of the same type!

If that meant "you can generate *an* answer", all would be well.

But the presumption is far too easily made that it's *the* answer.

> ... unsupervised learning is becoming increasingly important as the algorithms get better because it can be used without having to label the data with the correct answer

Oh dear, now the risk of blind presumption of 'truth' is no longer just a sceptic's fantasy.  Geitgey now believes his own mythology.

It's particularly hilarious given that his chosen example is property valuation, and 'value' is highly context-dependent and not a topic to which any notion of 'truth' / 'correct answer' applies.

And his aside addressed to sceptics like me missed the point entirely:
>Side note for pedants: There are lots of other types of machine learning algorithms. But this is a pretty good place to start.

>Of course if you are reading this 50 years in the future and we've figured out the algorithm for Strong AI, then this whole post will all seem a little quaint.

It's quaint, but not for the reason he thinks.  The quaintness lies in the naive belief that there's such a thing as "the algorithm for Strong AI".

> ... you've just written a function that you don't really understand but that you can prove will work

The notion that you can 'prove that something doesn't work' is tenable. 

But the notion that you can 'prove that something *does* work' is delusional.

What works on any particular training-set (no matter how large) may or may not work for the next instance.
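A minimal sketch of that point, under loudly stated assumptions: the "real world" function true_price below is invented purely for illustration (price rises with floor area up to a cap, then goes flat), and the training sizes are made up. A least-squares line fits every training example essentially perfectly, yet is badly wrong on a new instance that lies outside the range it was trained on.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def true_price(size):
    # Hypothetical "real world" (an assumption for this sketch only):
    # price grows with size up to 200 square metres, then stops growing.
    return 1000.0 * min(size, 200)


# Training set: every example happens to lie below the cap.
train_sizes = [50, 80, 100, 120, 150, 180]
train_prices = [true_price(s) for s in train_sizes]

slope, intercept = fit_line(train_sizes, train_prices)

# Error on the examples the model was trained on.
max_train_error = max(
    abs(slope * s + intercept - p) for s, p in zip(train_sizes, train_prices)
)

# Error on the "next instance", which the training set never covered.
new_size = 400
new_error = abs(slope * new_size + intercept - true_price(new_size))

print("largest error on the training set:", max_train_error)
print("error on the new instance:", new_error)
```

Testing against the training-set alone would report this model as working. Only an instance from outside the sampled range exposes the failure, and nothing in the training data says where that range ends.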

And his very next example suggests that the algorithm that supports guesses about a house's market-value (a basically harmless application) is the one you want to embed in your autonomous car (an essentially dangerous application).

>Then you are using that equation to guess the sales price of houses you've never seen before based where that house would appear on your line. It's a really powerful idea and you can solve "real" problems with it.

The leap from 'guess' to 'solve' is breathtaking.

> ... it's important to remember that machine learning only works if the problem is actually solvable with the data that you have

Ah, could he be about to become wise?

>For example, if you build a model that predicts home prices based on the type of potted plants in each house, it's never going to work. There just isn't any kind of relationship between the potted plants in each house and the home's sale price. So no matter how hard it tries, the computer can never deduce a relationship between the two.

But he's completely missed the point that:
(a)  there are many, many circumstances in which correlations can be found
(b)  the scheme implicitly confuses correlation with causality ("predicts")
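Point (a) can be sketched in a few lines. This is an illustration under stated assumptions, not anything from Geitgey's article: the house prices and all 1000 candidate features (think 1000 different potted-plant measurements) are pure random noise, so by construction no feature has any relationship with price. The counts NUM_HOUSES and NUM_FEATURES and the seed are arbitrary choices of mine.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

NUM_HOUSES = 10      # a small sample, as is common in practice
NUM_FEATURES = 1000  # many candidate features, none of them meaningful

# Prices and features are independent random numbers: there is
# no real relationship to find.
prices = [rng.random() for _ in range(NUM_HOUSES)]
features = [
    [rng.random() for _ in range(NUM_HOUSES)] for _ in range(NUM_FEATURES)
]

# Search for the feature that "best predicts" price.
correlations = [abs(pearson(column, prices)) for column in features]
best = max(correlations)

print("strongest correlation found in pure noise:", round(best, 3))
```

With this few houses and this many candidate features, a strong-looking correlation turns up by chance alone. A scheme that simply searches for whatever correlates has no way of telling such an accident from a real relationship, let alone a causal one.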

>In my mind, the biggest problem with machine learning right now is that it mostly lives in the world of academia and commercial research groups

And he's got that arse-about as well.  Keeping AI, sub-sets like ML, and naive people like him, in laboratories is the best thing we can do.

Nope, I haven't yet read Parts 2-4, but there are some rather more useful things I need to do this morning than deconstruct a silly belief system.

</tetchy>

-- 
Roger Clarke                                 http://www.rogerclarke.com/

Xamax Consultancy Pty Ltd      78 Sidaway St, Chapman ACT 2611 AUSTRALIA
Tel: +61 2 6288 6916                        http://about.me/roger.clarke
mailto:Roger.Clarke at xamax.com.au                http://www.xamax.com.au/

Visiting Professor in the Faculty of Law            University of N.S.W.
Visiting Professor in Computer Science    Australian National University